Packages:
www.python.org
www.python.org/pypi
http://py.vaults.ca/parnassus/
Text:
http://www.python.org/doc/
http://www.diveintopython.org/
http://ASPn.activestate.com/ASPN/Python/Cookbook/
http://gnosis.cx/TPiP/
Install Python 2.3 and also install win32all toolkit from Mark Hammond which
comes with PythonWin (IDE with integrated debugger). The debugger feature is
not obvious but the traditional shortcuts of F9, F10 and so on will work.
When you press F5 it will try to run the program using the default run mode.
If you want to change the run mode, you should click on the running man icon
and change the Debugging combo box value (Step-in the debugger will step the
code from the beginning, Run in the debugger requires you to set up a
breakpoint with F9 first).
Even better, get a copy of Komodo (Personal or Professional) and get the
power of auto-completion, more integrated debugger, remote debugging, cross-
platform development (Linux, Solaris, Windows).
How to add modules into a Python installation without setting $PYTHONPATH?
Locate the site-packages directory (run python, type "import sys", type
"sys.path") and create modulename.pth there. Inside this file put a relative
directory path (or absolute directory path) to the directory that holds the
new module you are trying to add. Inside that directory, you should have the
modulename.py that can be used via "import modulename" or "from modulename
import classname".
If the new module is within the same directory as your script, you can just
import it as is without using a silly pth file.
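A minimal sketch of the pth mechanism, assuming a throwaway temporary directory stands in for site-packages. The names demo_mod and extras are made up for illustration; site.addsitedir() is used because it processes .pth files the way the real site-packages directory is processed at startup.

```python
import os
import site
import tempfile

# Hypothetical layout: <base>/demo_mod.pth points at <base>/extras/demo_mod.py
base = tempfile.mkdtemp()
extras = os.path.join(base, "extras")
os.mkdir(extras)

f = open(os.path.join(extras, "demo_mod.py"), "w")
f.write("GREETING = 'hello from demo_mod'\n")
f.close()

# The .pth file holds one directory per line, relative to its own location.
f = open(os.path.join(base, "demo_mod.pth"), "w")
f.write("extras\n")
f.close()

# Treat base like a site-packages directory: its .pth files get processed.
site.addsitedir(base)

import demo_mod
print(demo_mod.GREETING)
```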
Change a file.py into an executable? C:\py\cx_freeze-3.0.beta2-win32-py23\cx_Freeze-3.0.beta2\FreezePython.exe --install-dir hello.py
There are six sequence types: strings, Unicode strings, lists, tuples,
buffers, and xrange objects.
There is currently only one standard mapping type, the dictionary.
tuple = immutable sequence
dictionary == hash
Accessing an element in a dictionary uses a similar construct as accessing a
sequence, so Python cannot tell which one you mean until the variable exists.
Initialize the variable like so:
x = {} or x = []
or more verbosely
x = dict() or x = list()
x = { 1:"one", 2:"two" } # dictionary
x = (1, 2) # tuple
x = [1, 2] # list
There is a difference between y = x and y = x[:] assuming x is a sequence. The
first one creates another reference to x. The second one copies the elements
of x into a new sequence, but it is only a shallow copy: nested objects are
still shared (use copy.deepcopy() for a deep copy).
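A small sketch of the three cases (plain reference, slice copy, deep copy). The variable names are made up for illustration, and print() is written as a call so the snippet also runs on Python 3.

```python
import copy

x = [[1, 2], [3, 4]]
alias = x                  # another name for the same list object
shallow = x[:]             # new outer list, but the inner lists are shared
deep = copy.deepcopy(x)    # new outer list and new inner lists

x.append([5, 6])           # seen through alias only
x[0].append(99)            # seen through alias and shallow, not through deep

print(alias)
print(shallow)
print(deep)
```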
Instead of using filter, use list comprehension.
def evennumber(x):
    if x % 2 == 0: return True
    else: return False
>>> filter(evennumber, [4, 5, 6])
[4, 6]
>>> [x for x in [4, 5, 6] if evennumber(x)]
[4, 6]
Instead of using map, use list comprehension.
>>> map(evennumber, [4, 5, 6])
[True, False, True]
>>> [evennumber(x) for x in [4, 5, 6]]
[True, False, True]
Remember starting in Python 2.2, built-in function open() is an alias to
file().
I am confused about setattr/getattr vs. property.
Built-in functions str vs. repr, both return string representation of an
object but str doesn't try to return a string that is acceptable to eval.
Check out built-in function range that creates a list holding an arithmetic
progression. Eg. range(0, 10) -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] or range(1,
5, 2) -> [1, 3] or range(-1, -8, -3) -> [-1, -4, -7].
Built-in function reduce is interesting. Definition: Apply function of two
arguments cumulatively to the items of sequence, from left to right, so as to
reduce the sequence to a single value. For example,
reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).
Another example. reduce(func, [1, 2, 3, 4, 5]) is the same as:
x = func(1, 2)
x = func(x, 3)
x = func(x, 4)
x = func(x, 5)
return x
Yet another example of reduce:
reduce(operator.add, range(n), m) # really is the same as the command: sum(range(n), m)
It seems that reduce is great for associative operations but gets confusing if
used for other purposes.
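To make the left-to-right grouping concrete, here is a sketch that spells reduce out by hand and compares it with the real thing. reduce_by_hand is a made-up helper, not part of Python; the try/except import is only there so the snippet also runs on Python 3, where reduce lives in functools.

```python
import operator
try:
    from functools import reduce   # Python 3 moved reduce here
except ImportError:
    pass                           # older Python 2: reduce is a built-in

def reduce_by_hand(func, items):
    """Hypothetical helper: what reduce(func, items) does, step by step."""
    iterator = iter(items)
    result = next(iterator)        # start with the first item
    for item in iterator:
        result = func(result, item)
    return result

numbers = [1, 2, 3, 4, 5]
total = reduce(operator.add, numbers)            # ((((1+2)+3)+4)+5)
by_hand = reduce_by_hand(operator.add, numbers)
# Subtraction is not associative, so the grouping decides the answer.
difference = reduce(operator.sub, numbers)       # ((((1-2)-3)-4)-5)
print(total)
print(difference)
```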
What is the difference of built-in functions eval and exec/execfile?
Built-in functions enumerate vs. iter? Similar but iter returns only the
value whereas enumerate returns the index position as well. From its
returned object, use its next() to go through each item.
>>> list = ['1', '2', '3']
>>> e = enumerate(list)
>>> e.next()
(0, '1')
>>> e.next()
(1, '2')
>>> e.next()
(2, '3')
>>> e.next()
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
StopIteration
---------
>>> i = iter(list)
>>> i.next()
'1'
>>> i.next()
'2'
>>> i.next()
'3'
>>> i.next()
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
StopIteration
How can an optional argument be the first and third position for built-in
function slice?
list[start:stop:step] # possible to specify "step"
Seems that module operator contains functions that implement Python operators
like +, -, and so on.
If you import a module that imports another module, the scope prevents you
from using the second module unless, of course, you import it yourself.
() is a tuple, an immutable (read-only) sequence
[] is a list, a mutable sequence
{} is a dictionary
Use built-in function type to query an object's type. Use it to do a
compare, and you should import module types to get the list of possible types.
What is built-in function slice, staticmethod?
What good is zip?
zip([[1, 2, 3], ['a', 'b', 'c']])
[([1, 2, 3],), (['a', 'b', 'c'],)]
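The call above passes one argument (a list of lists), which is why each result tuple has a single item. A sketch of the more usual use, with the sequences passed as separate arguments; the variable names are made up, and list() is wrapped around zip so the snippet also runs on Python 3.

```python
numbers = [1, 2, 3]
letters = ['a', 'b', 'c']

# Pair up items position by position.
pairs = list(zip(numbers, letters))

# zip(*pairs) undoes the pairing.
unzipped = list(zip(*pairs))

# Handy for building a dictionary from a list of keys and a list of values.
lookup = dict(zip(letters, numbers))

print(pairs)
print(unzipped)
```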
Boolean operations "or" and "and" always return one of their operands.
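A quick sketch of what "return one of their operands" means in practice; the values are arbitrary examples.

```python
# "or" returns the first true operand, or the last operand if none is true.
name = "" or "anonymous"

# "and" returns the first false operand, or the last operand if all are true.
first_false = [] and "never looked at"
last_true = "x" and "y"

print(name)
print(first_false)
print(last_true)
```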
"import foo" doesn't require a module object named foo to exist. Why?
A special member of every module is __dict__.
Functions have another special attribute f.__dict__ (a.k.a. f.func_dict)
which contains the namespace used to support function attributes.
Variables with double leading underscore are "mangled" to provide a simple
but effective way to define class private variables. Any identifier of the
form __spam (at least two leading underscores, at most one trailing
underscore) is textually replaced with _classname__spam, where classname is
the current class name with any leading underscores stripped.
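A sketch of the mangling rule at work, using a made-up Account class. Only identifiers written inside the class body are rewritten; a string passed to hasattr() is not.

```python
class Account:
    def __init__(self):
        self.__balance = 100        # actually stored as _Account__balance

    def balance(self):
        return self.__balance       # mangled the same way inside the class

acct = Account()
mangled = acct._Account__balance    # the name that really exists
has_plain = hasattr(acct, "__balance")

print(mangled)
print(has_plain)
```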
What is the difference between function and method?
Special attributes of several types of object: __dict__, __class__,
__bases__, and __name__.
When dealing with exception, remember that Python will traverse through the
list of handled exceptions and pick the first one that matches. Check
http://docs.python.org/lib/module-exceptions.html for a class hierarchy of
exceptions.
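Because the first matching handler wins, handlers should be listed from most specific to most general. A sketch with a made-up classify() function (KeyError and IndexError are both subclasses of LookupError):

```python
def classify(exc):
    """Hypothetical helper: raise exc and report which handler caught it."""
    try:
        raise exc
    except KeyError:
        return "key error"
    except LookupError:
        # Also catches IndexError, since it is a LookupError subclass.
        return "lookup error"
    except Exception:
        return "something else"

print(classify(KeyError("k")))
print(classify(IndexError("i")))
print(classify(ValueError("v")))
```

Had the LookupError handler come first, the KeyError handler would never run.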
Wanting to write an extension module (usually a module written in C which
Python can call)? You can use Pyrex or SWIG.
Missing your switch/case statement? Consider using a dictionary:
functions = { 'a':function_1, 'b':function_2, 'c':function_3 }
func = functions[value]
func()
Or to call a function by name (a la Java's reflection),
method = getattr(self, 'theMethod_' + str(value))
method()
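A runnable sketch of both tricks. All names here (start, stop, unknown, Robot, do_left, do_right) are made up for illustration; dict.get() supplies the "default:" branch of a switch.

```python
def start():
    return "starting"

def stop():
    return "stopping"

def unknown():
    return "unknown command"

handlers = {'start': start, 'stop': stop}

def dispatch(command):
    # Fall back to unknown() when the key is missing, like "default:".
    return handlers.get(command, unknown)()

class Robot:
    def do_left(self):
        return "turning left"

    def do_right(self):
        return "turning right"

    def run(self, direction):
        # Look the method up by name at runtime.
        method = getattr(self, 'do_' + str(direction))
        return method()

print(dispatch('start'))
print(Robot().run('left'))
```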
Want to speed up Python? See, for example, Psyco, Pyrex, PyInline, Py2Cmod,
and Weave.
Default values are created exactly once, when the function is defined. Thus,
it is usually a bad practice to use a mutable object as a default value since
subsequent calls will remember the previous value. Use an immutable default
instead. This feature can be useful though if you want to cache and to avoid
using global variables. Eg.
def foo(D=[]): # Danger: shared reference to one list for all calls
    D.append("duh")
    return D
foo() # you get ['duh']
foo() # here you get ['duh', 'duh']
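A sketch of the usual fix (default to None and create the list inside the function) next to the caching use of a shared default. The function names are made up for illustration.

```python
def append_shared(item, bucket=[]):
    # The same list object is reused on every call that omits bucket.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # None is immutable; a new list is created on each call instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

def fib(n, _cache={0: 0, 1: 1}):
    # Deliberate use of the shared default as a cache, no global needed.
    if n not in _cache:
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]

shared_first = list(append_shared("a"))
shared_second = list(append_shared("b"))
fresh_first = append_fresh("a")
fresh_second = append_fresh("b")
print(shared_second)
print(fresh_second)
```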
Never use relative package imports. If you're writing code that's in the
package.sub.m1 module and want to import package.sub.m2, do not just write
import m2, even though it's legal. Write from package.sub import m2 instead.
Relative imports can lead to a module being initialized twice, leading to
confusing bugs.
If only instances of a specific class use a module, then it is reasonable to
import the module in the class's __init__ method and then assign the module
to an instance variable so that the module is always available (via that
instance variable) during the life of the object.
What the heck is * and **? Optional arguments and keyword parameters? Used
to pass from one function to another?
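A short sketch that answers the question: in a definition, *args collects extra positional arguments into a tuple and **kwargs collects extra keyword arguments into a dictionary; in a call, the same syntax unpacks them again. The function names are made up.

```python
def report(*args, **kwargs):
    # args is a tuple, kwargs is a dictionary
    return args, kwargs

def wrapper(*args, **kwargs):
    # Pass along whatever we were given, unchanged.
    return report(*args, **kwargs)

positional, keyword = wrapper(1, 2, color="red")
print(positional)
print(keyword)
```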
Many ways to call a function by-reference, the best is to actually have the
function return a tuple. A trick is to by convention designate the first item in the
tuple to be the return code. If the first item says it is okay, then we can use
the rest of the tuple; otherwise don't. This is helpful if you want to return
(2,) to indicate 2 as an error code -> lets caller know not to check the rest
of the tuple.
In a class, several special methods (or are they called functions?):
__init__: the constructor
__call__: called when the class' instance is called as in instance().
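A sketch showing both special methods on a made-up Counter class:

```python
class Counter:
    def __init__(self, start=0):
        # Runs once, when Counter(...) creates the instance.
        self.count = start

    def __call__(self, step=1):
        # Runs every time the instance itself is called like a function.
        self.count += step
        return self.count

tick = Counter()
first = tick()
second = tick(5)
print(second)
```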
Want to copy an object? For sequences use [:], for dictionaries use copy(),
for others use copy.copy() or copy.deepcopy().
Why does -22 / 10 return -3?
Integer division floors the result, i.e. it always rounds down (toward
negative infinity), so -2.2 becomes -3 rather than -2.
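A sketch of the arithmetic, using the explicit floor-division operator // so it behaves the same on Python 2.2+ and Python 3. The point of flooring is that quotient and remainder always fit together again.

```python
quotient = -22 // 10            # floors toward negative infinity
remainder = -22 % 10            # takes the sign of the divisor
q, r = divmod(-22, 10)          # both at once

# Truncation toward zero, if that is what you wanted instead.
truncated = int(-22 / 10.0)

print(quotient)
print(remainder)
print(truncated)
```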
String to number? Use int(). Watch out if you are trying to convert a
string representation of a hexadecimal value (eg. "0x0f"): int("0x0f") raises
ValueError, so pass the base explicitly as in int("0x0f", 16).
Number to string? For decimal, use str(). For hexadecimal, use hex(). For
octal, use oct(). You can also use the % operator for formatting, eg.
"%06d" % 144 # '000144'
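A sketch of the conversions in both directions. The variable names are made up; oct() is left out of the output because its result is spelled differently on Python 2 and Python 3, and the % operator is used for octal instead.

```python
# String to number
plain = int("15")
from_hex = int("0x0f", 16)      # explicit base
guessed = int("0x0f", 0)        # base 0: guess the base from the prefix

try:
    int("0x0f")                 # base 10 by default
    default_base_ok = True
except ValueError:
    default_base_ok = False

# Number to string
as_str = str(144)
as_hex = hex(255)
padded = "%06d" % 144
hex_digits = "%x" % 255
octal_digits = "%o" % 8

print(from_hex)
print(padded)
```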
Interestingly, in order to insert some characters into a string, you need to
first convert the string into a list (a = list("hello there")), use a list
operation to change the characters (a[6:6] = "you " inserts, a[6:8] = "you"
replaces characters 6 and 7), and finally convert it back into a string
(''.join(a)).
Three techniques to call functions by names: dictionary, getattr(), and
locals()/eval().
Perl's chomp equivalent? Use s.rstrip() or s.splitlines()[0].
Reverse a sequence? For a list, just call l.reverse(). It reverses the list
in place and returns None, so copy the list first (l2 = l[:]) if you need to
keep the original order. For non-list, use either:
a) for i in range(len(sequence)-1, -1, -1): print sequence[i]
b) for i in sequence[::-1]: print i # works only on Python 2.3 and above
Replicating a list with * doesn't create copies, it only creates references
to the existing objects.
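The classic place this bites is building a two-dimensional list. A sketch, with made-up variable names:

```python
# Two references to ONE inner list.
grid = [[0] * 3] * 2
grid[0][0] = 1                  # shows up in "both" rows

# A list comprehension builds a separate inner list per row.
safe = [[0] * 3 for _ in range(2)]
safe[0][0] = 1

print(grid)
print(safe)
```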
Sort a list based on values in another list? First merge them using zip,
then sort the zipped tuple, next extract using list comprehension:
>>> list1 = ["what", "I'm", "sorting", "by"]
>>> list2 = ["something", "else", "to", "sort"]
>>> pairs = zip(list1, list2)
>>> pairs
[('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')]
>>> pairs.sort()
>>> result = [ x[1] for x in pairs ]
>>> result
['else', 'sort', 'to', 'something']
The del statement does not necessarily call __del__ -- it simply decrements
the object's reference count, and if this reaches zero __del__ is called.
Despite the cycle collector, it's still a good idea to define an explicit
close() method on objects to be called whenever you're done with them. The
close() method can then remove attributes that refer to subobjects. Don't call
__del__ directly -- __del__ should call close() and close() should make sure
that it can be called more than once for the same object.
What is weakref? If the only reference to an object is a weakref, then
garbage collector is free to destroy it. Primarily used to implement caches
or mappings holding large objects.
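A minimal sketch, assuming CPython, where dropping the last strong reference frees the object right away (gc.collect() is called as well to be safe on other implementations). BigObject is a made-up stand-in for something expensive.

```python
import gc
import weakref

class BigObject(object):
    """Hypothetical stand-in for a large object worth caching."""
    pass

obj = BigObject()
ref = weakref.ref(obj)          # does not keep obj alive

alive_before = ref() is obj     # calling the weakref returns the object

del obj                         # drop the only strong reference
gc.collect()

alive_after = ref() is not None # now the weakref returns None
print(alive_before)
print(alive_after)
```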
Threading? Use threading module, not the low-level thread module. Python's
threading support doesn't seem to be good.
Truncate a file? f = open(filename, "r+"), and use f.truncate(offset);
Or also, os.ftruncate(fd, offset) for fd opened by os.open().
Copy file? Use module shutil.
Read/write binary data? Use struct module. Then you can use its members
pack and unpack. Example:
import struct
f = open(filename, "rb") # Open in binary mode for portability
s = f.read(8)
x, y, z = struct.unpack(">hhl", s)
The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from
the string.
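A self-contained round trip with the same format string, packing three made-up values instead of reading them from a file:

```python
import struct

fmt = ">hhl"                             # big-endian: short, short, long
size = struct.calcsize(fmt)              # 2 + 2 + 4 bytes

packed = struct.pack(fmt, 1, -2, 70000)  # numbers to binary string
x, y, z = struct.unpack(fmt, packed)     # and back again

print(size)
print((x, y, z))
```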
For homogeneous list of ints or floats, you can use the array module.
Bi-directional pipe trying to avoid deadlocks? Use temporary file (not
really an elegant solution) or use expectpy or pexpect.
Random number generator? Use module random.
Want to redirect stdout/stderr to a file?
sys.stdout = file(logFile, "w")
sys.stderr = sys.stdout
pathlist = os.environ['PATH'].split(os.pathsep)
except can take a tuple of exception classes
DOTALL in re makes . match \n as well
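A sketch of the DOTALL difference on a made-up two-line string:

```python
import re

text = "first line\nsecond line"

# Without DOTALL, "." stops at the newline, so the search fails.
plain = re.search("first.*second", text)

# With DOTALL, "." also matches "\n".
dotall = re.search("first.*second", text, re.DOTALL)

print(plain)
print(dotall.group(0))
```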
import types, types can be used to tell what type of object reference you
have
There is no ternary operator (aka. conditional operator) aka ( a ? b : c).
There is a workaround but it makes the code even more confusing to read.
Sorry.
Locking file can be achieved using fcntl. On some system, like HP-UX,
the file needs to be opened with w+ or r+ (or something with +,
check documentation). Example:
import fcntl
...
file = open(filename, "w+")
fcntl.lockf(file.fileno(), fcntl.LOCK_EX)
...
file.close()
Silly way to retrieve current time in string:
import time, shutil
now = time.localtime()
timestr = ".%s%s%s.%s%s%s" % (now.tm_year, now.tm_mon, now.tm_mday, now.tm_hour, now.tm_min, now.tm_sec)
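A less silly alternative is time.strftime(), which zero-pads every field. The sketch below compares both on a fixed, made-up date (7 March 2004, 09:05:02) so the difference is visible.

```python
import time

now = time.localtime()
timestr = time.strftime(".%Y%m%d.%H%M%S", now)

# Fixed time tuple: year, month, day, hour, minute, second, weekday, yearday, dst
sample = (2004, 3, 7, 9, 5, 2, 6, 67, -1)
padded = time.strftime(".%Y%m%d.%H%M%S", sample)

# The %s approach drops leading zeros, so the fields run together.
naive = ".%s%s%s.%s%s%s" % sample[:6]

print(padded)
print(naive)
```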
Want to backup while preserving permission (like cp -p)?
shutil.copy2(src, dst)
Want to compare type or want to know what type of an object you have?
Built-in function type will do it. Use as follows:
if type(foo) == type([]): print "it is a list"
# You can also compare it against the constants in module types, eg. types.ListType.
Want to modify the sys.path at runtime without modifying source code?
Set environment PYTHONPATH and its content will be prefixed in front
of the default values of sys.path.
An example on regular expression, rex, re, regexp:
rex = re.compile("File (.*) is removed; (.*) not included in release tag (.*)")
sreMatch = rex.search(line)
if sreMatch: # or perhaps can be written as if sreMatch != None
    print "%s, %s, %s" % (sreMatch.group(1), sreMatch.group(2), sreMatch.group(3))
Best way to remove duplicates of items in a list:
foo = [1, 1, 2, 2, 3, 4]
set = {}
map(set.__setitem__, foo, [])
foo = set.keys()
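The dictionary trick above does not promise any particular order for the result. A sketch that keeps the first occurrence of each item in its original position; the variable names are made up, and the set() one-liner at the end assumes Python 2.4 or later.

```python
foo = [3, 1, 1, 2, 2, 3, 4]

seen = {}
unique = []
for item in foo:
    if item not in seen:        # first time we meet this item
        seen[item] = True
        unique.append(item)

# If order does not matter and set() is available:
unique_sorted = sorted(set(foo))

print(unique)
print(unique_sorted)
```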
Declare and initialize a dictionary?
foo = {
"CCU":("weight","SBP","DBP"),
"OE":("weight","HR","Temp","blood glucose")
}
import sys
input = sys.stdin.readline()
try:
    id = int(input)
except ValueError:
    print "Can't convert input to an integer."
    id = -1
input = sys.stdin.readline()
if input[0] in ("y", "Y"):
    print "it is a yes"
# check if python version is at least 2.3
if sys.version < '2.3': print "gotta run at least 2.3."
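Comparing version strings breaks once a component reaches two digits ('2.10' sorts before '2.3'). A sketch using sys.version_info, which is a tuple and compares numerically; version_ok is a made-up helper.

```python
import sys

def version_ok(version_info, minimum=(2, 3)):
    """Hypothetical helper: True if (major, minor) is at least minimum."""
    return tuple(version_info[:2]) >= minimum

# String comparison goes character by character, which is wrong here.
string_says_older = '2.10' < '2.3'

# Tuple comparison goes number by number.
tuple_says_older = (2, 10) < (2, 3)

if not version_ok(sys.version_info):
    print("gotta run at least 2.3.")
```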
# check if euid is root or 0
if os.geteuid() != 0: print "gotta run as root."
#environment variable CCSYSDIR
os.environ['CCSYSDIR']
# get password line entry for username from /etc/passwd
import pwd
userattr = pwd.getpwnam("username")
# userattr[2] == uid (numeric user id)
# userattr[3] == gid (numeric group id)
# set permission of a file so that it is rwx for the owner
import os, stat
os.chmod(file, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
# take one keystroke and return from function?
raw_input("press enter to continue...")
# quickly open and write
file = open("thefile.txt", "w")
file.write("junk goes into this file\n\
second line of junk file\n\
and third and final line\n")
file.close()
LIBRARY MODULES
# get reference count of an object, usually one higher than expected
sys.getrefcount(obj)
import os
from os.path import join, getsize
for root, dirs, files in os.walk('python/Lib/email'):
    print root, "consumes",
    print sum([getsize(join(root, name)) for name in files]),
    print "bytes in", len(files), "non-directory files"
    if 'CVS' in dirs:
        dirs.remove('CVS') # don't visit CVS directories
import os
from os.path import join
# Delete everything reachable from the directory named in 'top'.
# CAUTION: This is dangerous! For example, if top == '/', it
# could delete all your disk files.
for root, dirs, files in os.walk(top, topdown=False):
    for name in files:
        os.remove(join(root, name))
    for name in dirs:
        os.rmdir(join(root, name))
Instead of using os.listdir(), use dircache.
lst = dircache.listdir(path) # get a list of all directories/files under path
dircache.annotate(path, lst) # extra, add trailing / for directory name
Single file comparison? filecmp.cmp(fname1, fname2 [,shallow=1])
filecmp.cmp returns true (1) for match, false (0) for no match.
Multiple files comparison in two directories? filecmp.cmpfiles(dirname1, dirname2, fnamelist [,shallow=1])
filecmp.cmpfiles returns a tuple of three lists: (matches,mismatches,errors)
Single directory comparison? filecmp.dircmp(dirname1, dirname2 [,ignore=... [,hide=...]])
ignore defaults to '["RCS","CVS","tags"]' and hide defaults to '[os.curdir,os.pardir]' (i.e., '[".",".."]').
fileinput, something like cat?
glob: list pathnames matching pattern
pathnames = glob.glob('/Users/quilty/Book/chap[3-4].txt')
linecache: efficient random access to a file
linecache.getline('/etc/hosts', 15)
linecache.checkcache() # drop cached lines if the file has been modified since it was cached
os.path: path name manipulation
os.path.commonprefix
os.path.expanduser
os.path.expandvars
os.path.join
os.path.normpath: remove redundant path information
os.path.splitdrive: useful on Windows
os.path.walk ~= os.walk?
file.readline() # slow but memory-friendly
file.readlines() # fast but memory-hungry
xreadlines is better but is deprecated. Use idiom 'for line in open("bla"):' instead.
The [commands] module exists primarily as a convenience wrapper for calls to `os.popen*()`.
commands.getoutput can be implemented as such:
def getoutput(cmd):
    import os
    return os.popen('{ '+cmd+'; } 2>&1').read()
# '+cmd+' is plain string concatenation: the command is wrapped in { ...; } 2>&1
# so that the shell merges stderr into the stdout that popen reads.
Use dbm module to create a 'dictionary-on-disk'. This allows you to store to disk pairs of key/value where both key and value are strings. You work with the dbm as though it is an in-memory dictionary.
If you need to store a key/value pair where the value is not just a string, use shelve module. Of course, it still can't store objects that are not pickle-able like file objects.
If shelve is not powerful enough for your need, try ZODB.
Prefer cPickle over pickle module. Example usage:
import cPickle
from somewhere import my_complex_object
s = cPickle.dumps(my_complex_object)
new_obj = cPickle.loads(s)
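A self-contained version of the round trip, with a made-up dictionary standing in for my_complex_object. The try/except import is an assumption added so the snippet also runs on Python 3, where cPickle no longer exists and plain pickle is already fast.

```python
try:
    import cPickle as pickle    # Python 2: the fast C implementation
except ImportError:
    import pickle               # Python 3

# Hypothetical stand-in for my_complex_object
record = {"name": "CCU", "fields": ("weight", "SBP", "DBP"), "count": 3}

s = pickle.dumps(record)        # object to string of bytes
clone = pickle.loads(s)         # and back to an equal, separate object

print(clone == record)
print(clone is record)
```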
Module name collision? Use the keyword as.
import repr as _repr
from repr import repr as newrepr
datetime module giving you a headache since you don't know how to tell it to set the dst? datetime.datetime itself takes no dst flag. Pass -1 as the last field (tm_isdst) of the time tuple given to time.mktime() and Python will figure out the dst for you.