.. _tut-brieftourtwo:

*********************************************
Brief Tour of the Standard Library -- Part II
*********************************************

This second tour covers more advanced modules that support professional
programming needs.  These modules rarely occur in small scripts.


.. _tut-output-formatting:

Output Formatting
=================

The :mod:`repr` module provides a version of :func:`repr` customized for
abbreviated displays of large or deeply nested containers::

   >>> import repr
   >>> repr.repr(set('supercalifragilisticexpialidocious'))
   "set(['a', 'c', 'd', 'e', 'f', 'g', ...])"

The :mod:`pprint` module offers more sophisticated control over printing both
built-in and user-defined objects in a way that is readable by the interpreter.
When the result is longer than one line, the "pretty printer" adds line breaks
and indentation to more clearly reveal data structure::

   >>> import pprint
   >>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
   ...     'yellow'], 'blue']]]
   ...
   >>> pprint.pprint(t, width=30)
   [[[['black', 'cyan'],
      'white',
      ['green', 'red']],
     [['magenta', 'yellow'],
      'blue']]]

The :mod:`textwrap` module formats paragraphs of text to fit a given screen
width::

   >>> import textwrap
   >>> doc = """The wrap() method is just like fill() except that it returns
   ... a list of strings instead of one big string with newlines to separate
   ... the wrapped lines."""
   ...
   >>> print textwrap.fill(doc, width=40)
   The wrap() method is just like fill()
   except that it returns a list of strings
   instead of one big string with newlines
   to separate the wrapped lines.
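The relationship between ``wrap()`` and ``fill()`` described in that sample
text can be checked directly.  The following is a minimal sketch; the
single-argument ``print(...)`` form is used only so that the snippet also runs
on Python 3::

```python
import textwrap

doc = ("The wrap() method is just like fill() except that it returns "
       "a list of strings instead of one big string with newlines to "
       "separate the wrapped lines.")

# wrap() returns the wrapped lines as a list of strings ...
lines = textwrap.wrap(doc, width=40)

# ... while fill() returns the same lines joined with newlines.
filled = textwrap.fill(doc, width=40)

for line in lines:
    print(line)
```

Use ``wrap()`` when each line needs further processing, such as adding a
prefix, and ``fill()`` when the text is simply going to be printed.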

The :mod:`locale` module accesses a database of culture-specific data formats.
The ``grouping`` argument of locale's format function provides a direct way of
formatting numbers with group separators::

   >>> import locale
   >>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
   'English_United States.1252'
   >>> conv = locale.localeconv()          # get a mapping of conventions
   >>> x = 1234567.8
   >>> locale.format("%d", x, grouping=True)
   '1,234,567'
   >>> locale.format_string("%s%.*f", (conv['currency_symbol'],
   ...                      conv['frac_digits'], x), grouping=True)
   '$1,234,567.80'


.. _tut-templating:

Templating
==========

The :mod:`string` module includes a versatile :class:`Template` class with a
simplified syntax suitable for editing by end-users.  This allows users to
customize their applications without having to alter the application.

The format uses placeholder names formed by ``$`` with valid Python identifiers
(alphanumeric characters and underscores).  Surrounding the placeholder with
braces allows it to be followed by more alphanumeric characters with no
intervening spaces.  Writing ``$$`` creates a single escaped ``$``::

   >>> from string import Template
   >>> t = Template('${village}folk send $$10 to $cause.')
   >>> t.substitute(village='Nottingham', cause='the ditch fund')
   'Nottinghamfolk send $10 to the ditch fund.'

The :meth:`substitute` method raises a :exc:`KeyError` when a placeholder is not
supplied in a dictionary or a keyword argument.  For mail-merge style
applications, user-supplied data may be incomplete and the
:meth:`safe_substitute` method may be more appropriate, because it leaves
placeholders unchanged if data is missing::

   >>> t = Template('Return the $item to $owner.')
   >>> d = dict(item='unladen swallow')
   >>> t.substitute(d)
   Traceback (most recent call last):
     . . .
   KeyError: 'owner'
   >>> t.safe_substitute(d)
   'Return the unladen swallow to $owner.'

Template subclasses can specify a custom delimiter.  For example, a batch
renaming utility for a photo browser may elect to use percent signs for
placeholders such as the current date, image sequence number, or file format::

   >>> import time, os.path
   >>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
   >>> class BatchRename(Template):
   ...     delimiter = '%'
   >>> fmt = raw_input('Enter rename style (%d-date %n-seqnum %f-format): ')
   Enter rename style (%d-date %n-seqnum %f-format): Ashley_%n%f

   >>> t = BatchRename(fmt)
   >>> date = time.strftime('%d%b%y')
   >>> for i, filename in enumerate(photofiles):
   ...     base, ext = os.path.splitext(filename)
   ...     newname = t.substitute(d=date, n=i, f=ext)
   ...     print '{0} --> {1}'.format(filename, newname)

   img_1074.jpg --> Ashley_0.jpg
   img_1076.jpg --> Ashley_1.jpg
   img_1077.jpg --> Ashley_2.jpg

Another application for templating is separating program logic from the details
of multiple output formats.  This makes it possible to substitute custom
templates for XML files, plain text reports, and HTML web reports.
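One way to picture that separation is a table of templates keyed by output
format.  This is only a sketch: the ``record`` data, the ``templates`` mapping
and the ``render()`` helper are illustrative names, not part of the
:mod:`string` module, and no escaping of the substituted values is attempted::

```python
from string import Template

# Hypothetical report data; a real program would compute this elsewhere.
record = {'title': 'Disk usage', 'value': '87%'}

# One template per output format.  The program logic never needs to know
# what the markup looks like.
templates = {
    'text': Template('$title: $value'),
    'html': Template('<tr><td>$title</td><td>$value</td></tr>'),
    'xml': Template('<item name="$title">$value</item>'),
}

def render(fmt, data):
    """Fill in the template registered for *fmt* with *data*."""
    return templates[fmt].substitute(data)

for fmt in ('text', 'html', 'xml'):
    print(render(fmt, record))
```

Adding a new output format then means adding one entry to the mapping rather
than touching the code that produces ``record``.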


.. _tut-binary-formats:

Working with Binary Data Record Layouts
=======================================

The :mod:`struct` module provides :func:`pack` and :func:`unpack` functions for
working with variable length binary record formats.  The following example shows
how to loop through header information in a ZIP file without using the
:mod:`zipfile` module.  Pack codes ``"H"`` and ``"I"`` represent two- and
four-byte unsigned numbers respectively.  The ``"<"`` indicates that they are
standard size and in little-endian byte order::

   import struct

   # read the whole archive and close the file promptly
   with open('myfile.zip', 'rb') as f:
       data = f.read()

   start = 0
   for i in range(3):                      # show the first 3 file headers
       start += 14
       fields = struct.unpack('<IIIHH', data[start:start+16])
       crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

       start += 16
       filename = data[start:start+filenamesize]
       start += filenamesize
       extra = data[start:start+extra_size]
       print filename, hex(crc32), comp_size, uncomp_size

       start += extra_size + comp_size     # skip to the next header
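The same pattern of a fixed-size header followed by variable-length data can be
tried without a ZIP file at hand.  The record layout below (``'<IHH'`` for an
id, a flags field and a name length) is invented for illustration and does not
correspond to any real file format::

```python
import struct

# Hypothetical record header: a 4-byte unsigned id, a 2-byte unsigned
# flags field and a 2-byte unsigned name length, all little-endian.
HEADER = '<IHH'

name = b'example.txt'
packed = struct.pack(HEADER, 1074, 3, len(name)) + name

# calcsize() reports how many bytes the fixed part occupies (4 + 2 + 2).
header_size = struct.calcsize(HEADER)

# Unpack the fixed part, then use the stored length to slice out the name.
record_id, flags, name_len = struct.unpack(HEADER, packed[:header_size])
recovered = packed[header_size:header_size + name_len]

print(record_id)
print(recovered == name)
```

Storing the length in the header is what lets the reader step over a
variable-length field, just as ``filenamesize`` and ``extra_size`` do in the
ZIP example.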


.. _tut-multi-threading:

Multi-threading
===============

Threading is a technique for decoupling tasks which are not sequentially
dependent.  Threads can be used to improve the responsiveness of applications
that accept user input while other tasks run in the background.  A related use
case is running I/O in parallel with computations in another thread.

The following code shows how the high-level :mod:`threading` module can run
tasks in the background while the main program continues to run::

   import threading, zipfile

   class AsyncZip(threading.Thread):
       def __init__(self, infile, outfile):
           threading.Thread.__init__(self)
           self.infile = infile
           self.outfile = outfile
       def run(self):
           f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
           f.write(self.infile)
           f.close()
           print 'Finished background zip of: ', self.infile

   background = AsyncZip('mydata.txt', 'myarchive.zip')
   background.start()
   print 'The main program continues to run in foreground.'

   background.join()    # Wait for the background task to finish
   print 'Main program waited until background was done.'

The principal challenge of multi-threaded applications is coordinating threads
that share data or other resources.  To that end, the threading module provides
a number of synchronization primitives including locks, events, condition
variables, and semaphores.

While those tools are powerful, minor design errors can result in problems that
are difficult to reproduce.  So, the preferred approach to task coordination is
to concentrate all access to a resource in a single thread and then use the
:mod:`Queue` module to feed that thread with requests from other threads.
Applications using :class:`Queue.Queue` objects for inter-thread communication
and coordination are easier to design, more readable, and more reliable.
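A minimal sketch of that single-owner design follows.  The ``worker()``
function, the ``results`` list and the ``None`` sentinel are illustrative
choices rather than anything the library prescribes, and the guarded import is
there only because the module is spelled ``queue`` on Python 3::

```python
import threading

try:
    import queue                 # Python 3 spelling
except ImportError:
    import Queue as queue        # Python 2 spelling used in this tutorial

requests = queue.Queue()
results = []                     # modified only by the worker thread

def worker():
    """Own the shared resource and serve requests one at a time."""
    while True:
        item = requests.get()
        if item is None:         # sentinel: no more work
            requests.task_done()
            break
        results.append(item * item)
        requests.task_done()

t = threading.Thread(target=worker)
t.start()

# Any thread may submit work; here the main thread does.
for n in range(5):
    requests.put(n)
requests.put(None)               # ask the worker to stop

requests.join()                  # block until every request is processed
t.join()
print(results)
```

Because only the worker ever touches ``results``, no lock is needed; the queue
does all of the synchronization.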


.. _tut-logging:

Logging
=======

The :mod:`logging` module offers a full-featured and flexible logging system.
At its simplest, log messages are sent to a file or to ``sys.stderr``::

   import logging
   logging.debug('Debugging information')
   logging.info('Informational message')
   logging.warning('Warning:config file %s not found', 'server.conf')
   logging.error('Error occurred')
   logging.critical('Critical error -- shutting down')

This produces the following output::

   WARNING:root:Warning:config file server.conf not found
   ERROR:root:Error occurred
   CRITICAL:root:Critical error -- shutting down

By default, informational and debugging messages are suppressed and the output
is sent to standard error.  Other output options include routing messages
through email, datagrams, sockets, or to an HTTP server.  New filters can select
different routing based on message priority: :const:`DEBUG`, :const:`INFO`,
:const:`WARNING`, :const:`ERROR`, and :const:`CRITICAL`.

The logging system can be configured directly from Python or can be loaded from
a user-editable configuration file for customized logging without altering the
application.
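Configuring the system directly from Python might look like the sketch below.
The logger name ``tutorial.demo`` and the in-memory stream are assumptions made
for the sake of a self-contained example; a real application would more likely
attach a file or network handler::

```python
import logging

try:
    from StringIO import StringIO    # Python 2
except ImportError:
    from io import StringIO          # Python 3

# A dedicated logger so the root logger's configuration is left alone.
log = logging.getLogger('tutorial.demo')
log.setLevel(logging.DEBUG)          # let every priority through the logger
log.propagate = False                # do not also pass records to the root logger

# Route records to an in-memory stream instead of sys.stderr.
stream = StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.INFO)       # this handler drops DEBUG records
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))
log.addHandler(handler)

log.debug('Debugging information')   # filtered out by the handler level
log.info('Informational message')
log.warning('config file %s not found', 'server.conf')

print(stream.getvalue())
```

Levels can be set on both the logger and each handler, which is what allows
different priorities to be routed to different destinations.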


.. _tut-weak-references:

Weak References
===============

Python does automatic memory management (reference counting for most objects and
:term:`garbage collection` to eliminate cycles).  The memory for an object is
freed shortly after the last reference to it has been eliminated.

This approach works fine for most applications but occasionally there is a need
to track objects only as long as they are being used by something else.
Unfortunately, just tracking them creates a reference that makes them permanent.
The :mod:`weakref` module provides tools for tracking objects without creating a
reference.  When the object is no longer needed, it is automatically removed
from a weakref table and a callback is triggered for weakref objects.  Typical
applications include caching objects that are expensive to create::

   >>> import weakref, gc
   >>> class A:
   ...     def __init__(self, value):
   ...         self.value = value
   ...     def __repr__(self):
   ...         return str(self.value)
   ...
   >>> a = A(10)                   # create a reference
   >>> d = weakref.WeakValueDictionary()
   >>> d['primary'] = a            # does not create a reference
   >>> d['primary']                # fetch the object if it is still alive
   10
   >>> del a                       # remove the one reference
   >>> gc.collect()                # run garbage collection right away
   0
   >>> d['primary']                # entry was automatically removed
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
       d['primary']                # entry was automatically removed
     File "C:/python26/lib/weakref.py", line 46, in __getitem__
       o = self.data[key]()
   KeyError: 'primary'
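The callback mentioned above can be observed with :func:`weakref.ref`.  In this
sketch the ``Image`` class and the ``on_finalize()`` function are hypothetical
stand-ins, and the immediate cleanup after ``del`` relies on CPython's
reference counting::

```python
import gc
import weakref

class Image(object):
    """Stand-in for an object that is expensive to create (hypothetical)."""
    def __init__(self, name):
        self.name = name

removed = []                      # messages recorded by the callback

def on_finalize(ref):
    # Called with the (now dead) weak reference once the object is gone.
    removed.append('image freed')

img = Image('logo.png')
r = weakref.ref(img, on_finalize)

alive_before = r() is img         # calling the weakref returns the object

del img                           # drop the only strong reference
gc.collect()                      # make collection explicit for other implementations

alive_after = r() is not None     # a dead weakref returns None when called
print(alive_before)
print(alive_after)
print(removed)
```

A cache built this way never keeps an object alive on its own: once the rest of
the program lets go, the weak reference simply starts returning ``None``.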


.. _tut-list-tools:

Tools for Working with Lists
============================

Many data structure needs can be met with the built-in list type.  However,
sometimes there is a need for alternative implementations with different
performance trade-offs.

The :mod:`array` module provides an :class:`array()` object that is like a list
that stores only homogeneous data and stores it more compactly.  The following
example shows an array of numbers stored as two-byte unsigned binary numbers
(typecode ``"H"``) rather than the usual 16 bytes per entry for regular lists of
Python int objects::

   >>> from array import array
   >>> a = array('H', [4000, 10, 700, 22222])
   >>> sum(a)
   26932
   >>> a[1:3]
   array('H', [10, 700])
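The compact storage and the homogeneity restriction can both be inspected from
Python.  This is a small sketch; the exact ``itemsize`` is decided by the C
compiler, so the comment states the usual value rather than a guarantee::

```python
from array import array

values = [4000, 10, 700, 22222]
a = array('H', values)

# itemsize is the number of bytes used per element for this typecode;
# 'H' is at least 2 bytes and is exactly 2 on common platforms.
print(a.itemsize)

# Bytes of raw element storage, not counting the array object itself.
print(a.itemsize * len(a))

# Only values that fit the typecode are accepted.
try:
    a.append(70000)               # too large for a two-byte unsigned number
except OverflowError:
    rejected = True
else:
    rejected = False
print(rejected)
```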

The :mod:`collections` module provides a :class:`deque()` object that is like a
list with faster appends and pops from the left side but slower lookups in the
middle.  These objects are well suited for implementing queues and breadth-first
tree searches::

   >>> from collections import deque
   >>> d = deque(["task1", "task2", "task3"])
   >>> d.append("task4")
   >>> print "Handling", d.popleft()
   Handling task1

   # starting_node, gen_moves() and is_goal() are supplied by the application
   unsearched = deque([starting_node])
   def breadth_first_search(unsearched):
       while unsearched:                   # keep going until nothing is left
           node = unsearched.popleft()
           for m in gen_moves(node):
               if is_goal(m):
                   return m
               unsearched.append(m)
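For a version of the breadth-first idea that can be run as is, the sketch below
walks a small tree stored as a dictionary.  The ``tree`` data and the
``breadth_first_order()`` helper are made up for the example::

```python
from collections import deque

# Hypothetical tree given as a mapping: node -> list of children.
tree = {
    'root': ['a', 'b'],
    'a': ['c', 'd'],
    'b': ['e'],
    'c': [], 'd': [], 'e': [],
}

def breadth_first_order(start):
    """Return the nodes reachable from *start* in breadth-first order."""
    order = []
    unsearched = deque([start])
    while unsearched:
        node = unsearched.popleft()      # take the oldest pending node
        order.append(node)
        unsearched.extend(tree[node])    # children wait at the right end
    return order

print(breadth_first_order('root'))
```

Popping from the left and appending on the right is what makes the traversal
visit every node of one level before moving to the next.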

In addition to alternative list implementations, the library also offers other
tools such as the :mod:`bisect` module with functions for manipulating sorted
lists::

   >>> import bisect
   >>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
   >>> bisect.insort(scores, (300, 'ruby'))
   >>> scores
   [(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]
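Besides inserting, :mod:`bisect` can be used to look up which band of a sorted
list a value falls into.  The grade boundaries and the ``grade()`` helper below
are illustrative assumptions::

```python
import bisect

# Hypothetical grade boundaries, kept in ascending order.
breakpoints = [60, 70, 80, 90]
grades = 'FDCBA'

def grade(score):
    """Map a numeric score to a letter using binary search."""
    # bisect() returns the insertion point to the right of equal entries,
    # so a score of exactly 70 lands in the 'C' band.
    return grades[bisect.bisect(breakpoints, score)]

print([grade(s) for s in (33, 60, 70, 89, 90, 100)])
```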

The :mod:`heapq` module provides functions for implementing heaps based on
regular lists.  The lowest-valued entry is always kept at position zero.  This
is useful for applications which repeatedly access the smallest element but do
not want to run a full list sort::

   >>> from heapq import heapify, heappop, heappush
   >>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
   >>> heapify(data)                      # rearrange the list into heap order
   >>> heappush(data, -5)                 # add a new entry
   >>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
   [-5, 0, 1]
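A common use of this property is a simple priority queue.  The sketch below
stores ``(priority, description)`` tuples; the task names are invented, and
ties between equal priorities would be broken by comparing the descriptions::

```python
import heapq

# Hypothetical task list: lower priority numbers are handled first.
tasks = []
heapq.heappush(tasks, (3, 'write report'))
heapq.heappush(tasks, (1, 'fix build'))
heapq.heappush(tasks, (2, 'review patch'))

# The smallest entry is always at index 0, so it can be inspected cheaply.
next_up = tasks[0]

# Popping repeatedly yields the entries in ascending priority order.
done = []
while tasks:
    priority, description = heapq.heappop(tasks)
    done.append(description)

print(next_up)
print(done)
```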


.. _tut-decimal-fp:

Decimal Floating Point Arithmetic
=================================

The :mod:`decimal` module offers a :class:`Decimal` datatype for decimal
floating point arithmetic.  Compared to the built-in :class:`float`
implementation of binary floating point, the class is especially helpful for
financial applications and other uses which require exact decimal
representation, control over precision, control over rounding to meet legal or
regulatory requirements, tracking of significant decimal places, or for
applications where the user expects the results to match calculations done by
hand.

For example, calculating a 5% tax on a 70 cent phone charge gives different
results in decimal floating point and binary floating point.  The difference
becomes significant if the results are rounded to the nearest cent::

   >>> from decimal import *
   >>> Decimal('0.70') * Decimal('1.05')
   Decimal('0.7350')
   >>> .70 * 1.05
   0.73499999999999999

The :class:`Decimal` result keeps a trailing zero, automatically inferring
four-place significance from multiplicands with two-place significance.  Decimal
reproduces mathematics as done by hand and avoids issues that can arise when
binary floating point cannot exactly represent decimal quantities.
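The effect of rounding the phone-charge example to the nearest cent can be
sketched as follows.  ``ROUND_HALF_UP`` is chosen here as one plausible
commercial rounding rule, not as the only correct one, and the float result
shown in the comment assumes a correctly rounding :func:`round`, as in current
Python versions::

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal('0.01')

# Exact decimal arithmetic: 0.70 * 1.05 is exactly 0.7350.
exact = Decimal('0.70') * Decimal('1.05')
decimal_total = exact.quantize(CENT, rounding=ROUND_HALF_UP)

# Binary floating point holds a value slightly below 0.735, so rounding
# to two places goes down instead of up.
float_total = round(0.70 * 1.05, 2)

print(decimal_total)
print(float_total)
```

The two totals differ by a full cent, which is exactly the kind of discrepancy
that matters when results must agree with a calculation done by hand.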

Exact representation enables the :class:`Decimal` class to perform modulo
calculations and equality tests that are unsuitable for binary floating point::

   >>> Decimal('1.00') % Decimal('.10')
   Decimal('0.00')
   >>> 1.00 % 0.10
   0.09999999999999995

   >>> sum([Decimal('0.1')]*10) == Decimal('1.0')
   True
   >>> sum([0.1]*10) == 1.0
   False

The :mod:`decimal` module provides arithmetic with as much precision as needed::

   >>> getcontext().prec = 36
   >>> Decimal(1) / Decimal(7)
   Decimal('0.142857142857142857142857142857142857')
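Setting ``getcontext().prec`` changes the precision for the rest of the
thread.  When the higher precision is needed only for one calculation, a
temporary context is an alternative; the following is a minimal sketch using
:func:`decimal.localcontext`::

```python
from decimal import Decimal, getcontext, localcontext

default_prec = getcontext().prec          # 28 unless changed earlier

# localcontext() works on a copy of the current context; changes made
# inside the with-block are discarded when the block exits.
with localcontext() as ctx:
    ctx.prec = 36
    high = Decimal(1) / Decimal(7)

normal = Decimal(1) / Decimal(7)          # computed at the restored precision

print(high)
print(normal)
print(getcontext().prec == default_prec)
```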