Python Tips

Packages:
    www.python.org
    www.python.org/pypi
    http://py.vaults.ca/parnassus/

Text:
    http://www.python.org/doc/
    http://www.diveintopython.org/
    http://ASPn.activestate.com/ASPN/Python/Cookbook/
    http://gnosis.cx/TPiP/

Install Python 2.3 and also install the win32all toolkit from Mark Hammond, which comes with PythonWin (an IDE with an integrated debugger). The debugger feature is not obvious, but the traditional shortcuts of F9, F10 and so on will work. When you press F5 it will try to run the program using the default run mode. If you want to change the run mode, click on the running man icon and change the Debugging combo box value ("Step-in the debugger" steps through the code from the beginning; "Run in the debugger" requires you to set a breakpoint with F9 first).

Even better, get a copy of Komodo (Personal or Professional) and get the power of auto-completion, a more integrated debugger, remote debugging, and cross-platform development (Linux, Solaris, Windows).

How to add modules to a Python installation without setting $PYTHONPATH? Locate the site-packages directory (run python, type "import sys", then type "sys.path") and create modulename.pth there. Inside this file put a relative or absolute directory path to the new module that you are trying to add. That directory should contain the modulename.py that can then be used via "import modulename" or "from modulename import classname". If the new module is in the same directory as the importing script, you can just import it as is, without a .pth file.

Change a file.py into an executable?

    C:\py\cx_freeze-3.0.beta2-win32-py23\cx_Freeze-3.0.beta2\FreezePython.exe --install-dir hello.py

There are six sequence types: strings, Unicode strings, lists, tuples, buffers, and xrange objects. There is currently only one standard mapping type, the dictionary. A tuple is an immutable sequence; a dictionary is what other languages call a hash.

Accessing an element in a dictionary uses a similar construct to accessing an element in a sequence, so Python cannot tell from x[key] alone which kind of object you intend. Initialize the variable first:

    x = {}        or    x = []
    x = dict()    or    x = list()    # more verbose

    x = { 1:"one", 2:"two" }    # dictionary

    x = (1, 2)    # tuple
    x = [1, 2]    # list

There is a difference between y = x and y = x[:], assuming x is a sequence. The first creates another reference to the same object. The second copies the elements of x into a new sequence; this is a shallow copy, not a deep copy, so nested objects are still shared.

Instead of using filter, use a list comprehension.

    def evennumber(x):
        if x % 2 == 0:
            return True
        else:
            return False

    >>> filter(evennumber, [4, 5, 6])
    [4, 6]
    >>> [x for x in [4, 5, 6] if evennumber(x)]
    [4, 6]

Instead of using map, use a list comprehension.

    >>> map(evennumber, [4, 5, 6])
    [True, False, True]
    >>> [evennumber(x) for x in [4, 5, 6]]
    [True, False, True]

Remember that starting in Python 2.2, the built-in function open() is an alias for file().

I am confused about setattr/getattr vs. property.

Built-in functions str vs. repr: both return a string representation of an object, but str doesn't try to return a string that is acceptable to eval.

Check out the built-in function range, which creates a list holding an arithmetic progression. E.g. range(0, 10) -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] or range(1, 5, 2) -> [1, 3] or range(-1, -8, -3) -> [-1, -4, -7].

Built-in function reduce is interesting. Definition: apply a function of two arguments cumulatively to the items of a sequence, from left to right, so as to reduce the sequence to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). Another example: reduce(func, [1, 2, 3, 4, 5]) is the same as:

    x = func(1, 2)
    x = func(x, 3)
    x = func(x, 4)
    x = func(x, 5)
    return x

Yet another example of reduce:

    reduce(operator.add, range(n), m)    # really is the same as: sum(range(n), m)

It seems that reduce is great for associative operations but gets confusing if used for other purposes.

What is the difference between the built-in functions eval and exec/execfile?

Built-in functions enumerate vs. iter? Similar, but iter returns only the value whereas enumerate returns the index position as well. From the returned object, use its next() method to go through each item.
    list = ['1', '2', '3']
    e = enumerate(list)
    >>> e.next()
    (0, '1')
    >>> e.next()
    (1, '2')
    >>> e.next()
    (2, '3')
    >>> e.next()
    Traceback (most recent call last):
      File "<interactive input>", line 1, in ?
    StopIteration

    >>> i = iter(list)
    >>> i.next()
    '1'
    >>> i.next()
    '2'
    >>> i.next()
    '3'
    >>> i.next()
    Traceback (most recent call last):
      File "<interactive input>", line 1, in ?
    StopIteration

How can an optional argument be in the first and third positions for the built-in function slice?

    list[start:stop:step]    # possible to specify "step"

It seems that the operator module contains the functions behind Python operators like +, -, and so on.

If you import a module that imports another module, the scope prevents you from using the second module unless, of course, you import it yourself.

    ()    is a tuple, an immutable (read-only) sequence
    []    is a list, a mutable sequence
    {}    is a dictionary

Use the built-in function type to query an object's type. To compare against it, import the types module to get the list of possible types.

What are the built-in functions slice and staticmethod?

What good is zip? Given a single list of lists it only wraps each item in a 1-tuple:

    >>> zip([[1, 2, 3], ['a', 'b', 'c']])
    [([1, 2, 3],), (['a', 'b', 'c'],)]

Pass the lists as separate arguments to pair up their items:

    >>> zip([1, 2, 3], ['a', 'b', 'c'])
    [(1, 'a'), (2, 'b'), (3, 'c')]

Boolean operations "or" and "and" always return one of their operands.

"import foo" doesn't require a module object named foo to exist. Why?

A special member of every module is __dict__. Functions have another special attribute f.__dict__ (a.k.a. f.func_dict) which contains the namespace used to support function attributes.

Variables with double leading underscores are "mangled" to provide a simple but effective way to define class private variables. Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with any leading underscores stripped.

What is the difference between a function and a method?

Special attributes of several types of object: __dict__, __class__, __bases__, and __name__.

When dealing with exceptions, remember that Python goes through the except clauses in order and picks the first one that matches. Check http://docs.python.org/lib/module-exceptions.html for a class hierarchy of exceptions.

Wanting to write an extension module (usually a module written in C which Python can call)? You can use Pyrex or SWIG.

Missing your switch/case statement?
Consider using a dictionary:

    functions = { 'a':function_1, 'b':function_2, 'c':function_3 }
    func = functions[value]
    func()

Or to call a function by name (a la Java's reflection):

    method = getattr(self, 'theMethod_' + str(value))
    method()

Want to speed up Python? See, for example, Psyco, Pyrex, PyInline, Py2Cmod, and Weave.

Default values are created exactly once, when the function is defined. Thus, it is usually bad practice to use a mutable object as a default value, since subsequent calls will remember the previous value. Use an immutable one. This feature can be useful, though, if you want to cache something and avoid using global variables. E.g.

    def foo(D=[]):    # Danger: shared reference to one list for all calls
        D.append("duh")
        return D

    foo()    # you get ['duh']
    foo()    # here you get ['duh', 'duh']

Never use relative package imports. If you're writing code that's in the package.sub.m1 module and want to import package.sub.m2, do not just write import m2, even though it's legal. Write from package.sub import m2 instead. Relative imports can lead to a module being initialized twice, leading to confusing bugs.

If only instances of a specific class use a module, then it is reasonable to import the module in the class's __init__ method and then assign the module to an instance variable, so that the module is always available (via that instance variable) during the life of the object.

What the heck is * and **? In a function definition, *args collects any extra positional arguments into a tuple and **kwargs collects any extra keyword arguments into a dictionary. In a call, they do the reverse and unpack a sequence or a dictionary into arguments, which is handy for passing arguments from one function to another.

There are many ways to emulate call-by-reference; the best is to have the function return a tuple. A trick is to designate, by convention, the first item in the tuple as the return code. If the first item says it is okay, then we can use the rest of the tuple; otherwise don't. This is helpful if you want to return (2,) to indicate 2 as an error code: it lets the caller know not to check the rest of the tuple.

In a class, several special methods (or are they called functions?):

    __init__: the constructor
    __call__: called when an instance of the class is called, as in instance()

Want to copy an object? For sequences use [:], for dictionaries use copy(), for others use copy.copy() or copy.deepcopy().

Why does -22 / 10 return -3? Integer division rounds down (toward negative infinity) whenever it has to round something.
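As a rough illustration of the * and ** forms mentioned above, the sketch below forwards arguments from one function to another. The names area and log_call are made up for this example; the code uses only syntax that exists in both Python 2 and Python 3.

```python
def area(width, height=1, scale=1):
    """Hypothetical target function with positional and keyword parameters."""
    return width * height * scale


def log_call(func, *args, **kwargs):
    """Collect any arguments and pass them through to func unchanged."""
    # Inside the function, args is a tuple of the extra positional
    # arguments and kwargs is a dictionary of the extra keyword arguments.
    seen = (args, kwargs)
    # In a call, * and ** unpack the tuple and the dictionary again.
    return seen, func(*args, **kwargs)


seen, result = log_call(area, 3, 4, scale=2)

# Unpacking also works with an ordinary sequence built elsewhere.
dims = (3, 4)
plain = area(*dims)
```

Here seen holds ((3, 4), {'scale': 2}), which shows how the arguments were collected before being passed on.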
String to number? Use int(). Watch out if you are trying to convert a string representation of a hexadecimal value (e.g. "0x0f"): pass the base explicitly, as in int("0x0f", 16).

Number to string? For decimal, use str(). For hexadecimal, use hex(). For octal, use oct(). You can also use the % operator for formatting, e.g.

    "%06d" % 144    # '000144'

Interestingly, strings are immutable, so in order to change characters inside a string you first convert the string into a list (a = list("hello there")), use a list operation to change the characters (a[6:8] = "you" replaces the two characters at positions 6 and 7; a[6:6] = "you" would insert without replacing anything), and finally convert it back into a string (''.join(a)).
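The conversions above can be tried out with a few lines. This is only a small sketch; the variable names are arbitrary, and it is written so that it runs unchanged on a current Python 3 interpreter (oct() is left out because its output format differs between Python 2 and Python 3).

```python
# String to number: give the base explicitly for hexadecimal text.
value = int("0x0f", 16)
# Base 0 asks int() to infer the base from the "0x" prefix.
auto = int("0x0f", 0)

# Number to string.
as_text = str(144)
as_hex = hex(255)
padded = "%06d" % 144   # zero-padded to a width of six

# Strings are immutable, so edit a list of characters and join it back.
chars = list("hello there")
chars[6:6] = "you "      # empty slice: insert without replacing anything
joined = "".join(chars)
```

Assigning to the empty slice chars[6:6] inserts the new characters; assigning to chars[6:8] would overwrite two existing characters instead.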

Three techniques to call functions by name: a dictionary, getattr(), and locals()/eval().

Perl's chomp equivalent? Use s.rstrip() or s.splitlines()[0]. Note that s.rstrip() removes all trailing whitespace, not just the newline.

Reverse a sequence? For a list, make a copy and call its reverse() method, which reverses in place and returns None (r = l[:] followed by r.reverse()). For a non-list, use either:

    a) for i in range(len(sequence)-1, -1, -1):
           print sequence[i]

    b) for i in sequence[::-1]:
           print i    # works only on Python 2.3 and above

Replicating a list with * doesn't create copies, it only creates references to the existing objects.

Sort a list based on values in another list? First merge them using zip, then sort the zipped list of tuples, and finally extract using a list comprehension:

    >>> list1 = ["what", "I'm", "sorting", "by"]
    >>> list2 = ["something", "else", "to", "sort"]
    >>> pairs = zip(list1, list2)
    >>> pairs
    [('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')]
    >>> pairs.sort()
    >>> result = [ x[1] for x in pairs ]
    >>> result
    ['else', 'sort', 'to', 'something']

The del statement does not necessarily call __del__; it simply decrements the object's reference count, and if this reaches zero __del__ is called.

Despite the cycle collector, it's still a good idea to define an explicit close() method on objects, to be called whenever you're done with them. The close() method can then remove attributes that refer to subobjects. Don't call __del__ directly: __del__ should call close(), and close() should make sure that it can be called more than once for the same object.

What is weakref? If the only reference to an object is a weakref, then the garbage collector is free to destroy it. Primarily used to implement caches or mappings holding large objects.

Threading? Use the threading module, not the low-level thread module. Python's threading support doesn't seem to be good.

Truncate a file? f = open(filename, "r+"), and use f.truncate(offset). Or also os.ftruncate(fd, offset) for an fd opened by os.open().

Copy file? Use module shutil.

Read/write binary data? Use struct module.
Then you can use its members pack and unpack. Example:

    import struct

    f = open(filename, "rb")    # Open in binary mode for portability
    s = f.read(8)
    x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one "short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the string.

For a homogeneous list of ints or floats, you can use the array module.

Bi-directional pipe while trying to avoid deadlocks? Use a temporary file (not really an elegant solution) or use expectpy or pexpect.

Random number generator? Use module random.

Want to redirect stdout/stderr to a file?

    sys.stdout = file(logFile, "w")
    sys.stderr = sys.stdout

Split PATH into a list of directories?

    pathlist = os.environ['PATH'].split(os.pathsep)

except can take a tuple of exception classes.

DOTALL in re makes . match \n as well.

import types; the types module can be used to tell what type of object a reference points to.

There is no ternary operator (a.k.a. conditional operator, a ? b : c). There is a workaround but it makes the code even more confusing to read. Sorry.

Locking a file can be achieved using fcntl. On some systems, like HP-UX, the file needs to be opened with w+ or r+ (or something with +, check the documentation). Example:

    import fcntl
    ...
    file = open(filename, "w+")
    fcntl.lockf(file.fileno(), fcntl.LOCK_EX)
    ...
    file.close()

Silly way to retrieve the current time as a string:

    import time
    now = time.localtime()
    timestr = ".%s%s%s.%s%s%s" % (now.tm_year, now.tm_mon, now.tm_mday, now.tm_hour, now.tm_min, now.tm_sec)

Want to back up a file while preserving permissions (like cp -p)?

    import shutil
    shutil.copy2(src, dst)

Want to compare types or want to know what type of object you have? The built-in function type will do it. Use it as follows:

    if type(foo) == type([]):
        print "it is a list"
    # You can also compare it against the constants in the types module, e.g. types.ListType.

Want to modify sys.path without modifying source code? Set the environment variable PYTHONPATH and its content will be prefixed in front of the default values of sys.path.
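To make the struct format string from earlier in this section more concrete, here is a small round trip using the same ">hhl" layout. The three numbers are arbitrary sample values chosen for this sketch; it packs them into bytes in memory instead of reading them from a file.

```python
import struct

FORMAT = ">hhl"  # big-endian: two 2-byte shorts followed by one 4-byte long

# With an explicit byte order such as ">", struct uses standard sizes,
# so this format always describes 2 + 2 + 4 = 8 bytes.
size = struct.calcsize(FORMAT)

# pack() turns Python numbers into a binary string of that size...
packed = struct.pack(FORMAT, 1, -2, 300000)

# ...and unpack() turns the same bytes back into a tuple of numbers.
x, y, z = struct.unpack(FORMAT, packed)
```

Because the byte order is forced to big-endian, the first short (the value 1) is stored as the two bytes 00 01 regardless of the machine the code runs on.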
An example of a regular expression (rex, re, regexp):

    import re
    rex = re.compile("File (.*) is removed; (.*) not included in release tag (.*)")
    sreMatch = rex.search(line)
    if sreMatch:    # or perhaps can be written as: if sreMatch is not None
        print "%s, %s, %s" % (sreMatch.group(1), sreMatch.group(2), sreMatch.group(3))

Best way to remove duplicate items from a list (the original order is not preserved):

    foo = [1, 1, 2, 2, 3, 4]
    seen = {}
    map(seen.__setitem__, foo, [])
    foo = seen.keys()

Declare and initialize a dictionary?

    foo = {
        "CCU":("weight","SBP","DBP"),
        "OE":("weight","HR","Temp","blood glucose")
    }

Assorted snippets:

    # read an integer from stdin
    input = sys.stdin.readline()
    try:
        id = int(input)
    except ValueError:
        print "Can't convert input to an integer."
        id = -1

    # check for a yes answer
    input = sys.stdin.readline()
    if input[:1] in ("y", "Y"):
        print "it is a yes"

    # check if python version is at least 2.3
    if sys.version_info < (2, 3):
        print "gotta run at least 2.3."

    # check if euid is root or 0
    if os.geteuid() != 0:
        print "gotta run as root."

    # environment variable CCSYSDIR
    os.environ['CCSYSDIR']

    # get password line entry for username from /etc/passwd
    import pwd
    userattr = pwd.getpwnam("username")
    # userattr[2] == user id
    # userattr[3] == group id

    # set permission of a file so that it is rwx for the owner
    os.chmod(file, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)

    # take one keystroke and return from function?
    raw_input("press enter to continue...")

    # quickly open and write
    f = open("thefile.txt", "w")
    f.write("junk goes into this file\n"
            "second line of junk file\n"
            "and third and final line\n")
    f.close()

LIBRARY MODULES

    # get reference count of an object, usually one higher than expected
    sys.getrefcount(obj)

    import os
    from os.path import join, getsize
    for root, dirs, files in os.walk('python/Lib/email'):
        print root, "consumes",
        print sum([getsize(join(root, name)) for name in files]),
        print "bytes in", len(files), "non-directory files"
        if 'CVS' in dirs:
            dirs.remove('CVS')    # don't visit CVS directories

    import os
    from os.path import join
    # Delete everything reachable from the directory named in 'top'.
    # CAUTION: This is dangerous! For example, if top == '/', it
    # could delete all your disk files.
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            os.remove(join(root, name))
        for name in dirs:
            os.rmdir(join(root, name))

Instead of using os.listdir(), use dircache.

    lst = dircache.listdir(path)    # get a list of all directories/files under path
    dircache.annotate(path, lst)    # extra, add trailing / for directory name

Single file comparison?

    filecmp.cmp(fname1, fname2 [,shallow=1])

filecmp.cmp returns a true value if the files match and a false value if they do not.

Multiple files comparison in two directories?

    filecmp.cmpfiles(dirname1, dirname2, fnamelist [,shallow=1])

filecmp.cmpfiles returns a tuple of three lists: (matches, mismatches, errors).

Single directory comparison?

    filecmp.dircmp(dirname1, dirname2 [,ignore=... [,hide=...]])

ignore defaults to '["RCS","CVS","tags"]' and hide defaults to '[os.curdir,os.pardir]' (i.e., '[".",".."]').
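The filecmp calls above can be exercised on a couple of throwaway files. The following is a minimal sketch for a current Python 3 interpreter; the directory layout, the file names, and the helper write_file are invented for the example.

```python
import filecmp
import os
import tempfile


def write_file(directory, name, text):
    """Hypothetical helper: create a small text file and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(text)
    return path


# Build two scratch directories to compare.
workdir = tempfile.mkdtemp()
left = os.path.join(workdir, "left")
right = os.path.join(workdir, "right")
os.mkdir(left)
os.mkdir(right)

same_l = write_file(left, "same.txt", "alpha\n")
same_r = write_file(right, "same.txt", "alpha\n")
diff_l = write_file(left, "diff.txt", "alpha\n")
diff_r = write_file(right, "diff.txt", "omega\n")
write_file(left, "only_left.txt", "x\n")   # missing on the right side

# shallow=False forces a comparison of the file contents.
same = filecmp.cmp(same_l, same_r, shallow=False)
different = filecmp.cmp(diff_l, diff_r, shallow=False)

# cmpfiles sorts the given names into three lists.
names = ["same.txt", "diff.txt", "only_left.txt"]
match, mismatch, errors = filecmp.cmpfiles(left, right, names, shallow=False)
```

A name that cannot be compared, such as a file present in only one of the two directories, ends up in the errors list instead of in mismatch.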

fileinput: something like cat?

glob: list pathnames matching a pattern

    pathnames = glob.glob('/Users/quilty/Book/chap[3-4].txt')

linecache: efficient random access to a file

    linecache.getline('/etc/hosts', 15)
    linecache.checkcache    # has file been modified since last cached

os.path: path name manipulation

    os.path.commonprefix
    os.path.expanduser
    os.path.expandvars
    os.path.join
    os.path.normpath: remove redundant path information
    os.path.splitdrive: useful on Windows
    os.path.walk ~= os.walk?

Reading lines from a file:

    file.readline()     # slow but memory-friendly
    file.readlines()    # fast but memory-hungry

xreadlines is better but is deprecated. Use the idiom 'for line in open("bla"):' instead.

The [commands] module exists primarily as a convenience wrapper for calls to `os.popen*()`. commands.getoutput can be implemented as such:

    def getoutput(cmd):
        import os
        return os.popen('{ '+cmd+'; } 2>&1').read()

Here '+cmd+' is plain string concatenation: the command is placed inside shell braces so that the 2>&1 redirection sends the standard error of the whole command to the same pipe as its standard output.

Use the dbm module to create a 'dictionary-on-disk'. This allows you to store key/value pairs on disk, where both key and value are strings. You work with the dbm as though it were an in-memory dictionary. If you need to store a key/value pair where the value is not just a string, use the shelve module. Of course, it still can't store objects that are not pickle-able, like file objects. If shelve is not powerful enough for your needs, try ZODB.

Prefer cPickle over the pickle module. Example usage:

    import cPickle
    from somewhere import my_complex_object
    s = cPickle.dumps(my_complex_object)
    new_obj = cPickle.loads(s)

Module name collision? Use the keyword as.

    import repr as _repr
    from repr import repr as newrepr

datetime module giving you a headache since you don't know how to tell it about DST? When you go through a time tuple (for example with time.mktime()), pass -1 as the last field (tm_isdst) and Python will figure out the DST for you.
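As an illustration of the shelve module mentioned above, the sketch below stores a value that is not a plain string and reads it back in a second session. The file name demo_shelf and the stored data are made up for the example, and the code targets a current Python 3 interpreter.

```python
import os
import shelve
import tempfile

# Keep the shelf in a scratch directory so the example leaves no clutter.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo_shelf")

# First session: store a nested structure under a string key.
db = shelve.open(path)
try:
    db["vitals"] = {"weight": 70.5, "HR": [62, 64, 61]}
finally:
    db.close()

# Second session: reopen the shelf and read the value back.
db = shelve.open(path)
try:
    loaded = db["vitals"]
    keys = list(db.keys())
finally:
    db.close()
```

Keys still have to be strings; only the values are pickled, which is why a dictionary containing a list can be stored here.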

