.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :keyword:`print` statement. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself: using string slicing and
concatenation operations you can create any layout you can imagine. String
objects have some methods that perform useful operations for padding strings
to a given column width; these will be discussed shortly. The second way is to
use the :meth:`str.format` method.

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(0.1)
   '0.1'
   >>> repr(0.1)
   '0.10000000000000001'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print s
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print hellos
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print repr(x).rjust(2), repr(x*x).rjust(3),
   ...     # Note trailing comma on previous line
   ...     print repr(x*x*x).rjust(4)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1,11):
   ...     print '{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :keyword:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout, but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)
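
As a rough sketch of how these methods combine, the following pads each name to
a ten-character field, truncates names that are too long, and centers the count
in a six-character field. The ``rows`` data is made up for illustration, and the
code is written as a plain script rather than an interactive session::

   # Hypothetical sample data, chosen only for illustration.
   rows = [('spam', 3), ('eggs', 12), ('a-very-long-item-name', 7)]

   lines = []
   for name, count in rows:
       # ljust(10) pads on the right; the slice cuts the field to 10 characters.
       label = name.ljust(10)[:10]
       # center(6) pads on both sides; str() converts the number first.
       lines.append(label + '|' + str(count).center(6) + '|')

   # Every entry of lines now has the same width, so the columns line up
   # even though the last name was longer than its field.

Without the slice, the third name would be returned unchanged by :meth:`ljust`
and push its row out of alignment.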

There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print 'We are the {0} who say "{1}!"'.format('knights', 'Ni')
   We are the knights who say "Ni!"

The braces and characters within them (called format fields) are replaced with
the objects passed into the format method. The number in the braces refers to
the position of the object passed into the format method. ::

   >>> print '{0} and {1}'.format('spam', 'eggs')
   spam and eggs
   >>> print '{1} and {0}'.format('spam', 'eggs')
   eggs and spam

If keyword arguments are used in the format method, their values are referred to
by using the name of the argument. ::

   >>> print 'This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible')
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print 'The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg')
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal point::

   >>> import math
   >>> print 'The value of PI is approximately {0:.3f}.'.format(math.pi)
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print '{0:10} ==> {1:10d}'.format(name, phone)
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print ('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...        'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which returns a dictionary containing all local variables.
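
A minimal sketch of that combination follows. The function name ``describe``
and its arguments are hypothetical, and the code is a plain script rather than
an interactive session::

   def describe(name, phone):
       # Called with no argument inside a function, vars() returns the local
       # namespace as a dictionary: here {'name': ..., 'phone': ...}.
       return '{name:10} ==> {phone:10d}'.format(**vars())

   line = describe('Jack', 4098)
   # line has the same layout as the rows of the phone table shown earlier.

The format string names the local variables directly, so no separate
dictionary has to be built by hand.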

For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print 'The value of PI is approximately %5.3f.' % math.pi
   The value of PI is approximately 3.142.

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`string-formatting` section.
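
Since you will meet the ``%`` operator in existing code, here is a short sketch
of its two common forms next to the :meth:`str.format` equivalent. It reuses
the phone table from the earlier examples and is written as a plain script::

   table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}

   # A tuple on the right supplies several values; '-' left-justifies a field.
   rows = ['%-10s ==> %10d' % (name, phone)
           for name, phone in sorted(table.items())]

   # A dictionary on the right lets the format refer to values by key.
   summary = 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d' % table

   # The str.format() spelling of one row, for comparison.
   same = '{0:10} ==> {1:10d}'.format('Jack', 4098)

Here ``rows[1]`` and ``same`` are the same string; only the spelling of the
format differs.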


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print f
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), or
``'a'`` for appending; any data written to the file is then automatically added
to the end. ``'r+'`` opens the file for both reading and writing. The *mode*
argument is optional; ``'r'`` will be assumed if it's omitted.
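
The following sketch walks through the four modes in turn. It assumes a scratch
file in a fresh temporary directory (the name ``workfile`` is arbitrary) and is
written as a plain script rather than an interactive session::

   import os
   import tempfile

   # Hypothetical scratch file, created in a new temporary directory.
   path = os.path.join(tempfile.mkdtemp(), 'workfile')

   f = open(path, 'w')      # create the file, or erase an existing one
   f.write('first\n')
   f.close()

   f = open(path, 'a')      # everything written goes to the end
   f.write('second\n')
   f.close()

   f = open(path)           # mode omitted, so 'r' is assumed
   contents = f.read()      # 'first\nsecond\n'
   f.close()

   f = open(path, 'r+')     # read and write without erasing the file
   f.write('FIRST')         # overwrites the first five characters in place
   f.close()

   f = open(path)
   updated = f.read()       # 'FIRST\nsecond\n'
   f.close()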

On Windows, ``'b'`` appended to the mode opens the file in binary mode, so there
are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``. Windows makes a
distinction between text and binary files; the end-of-line characters in text
files are automatically altered slightly when data is read or written. This
behind-the-scenes modification to file data is fine for ASCII text files, but
it'll corrupt binary data like that in :file:`JPEG` or :file:`EXE` files. Be
very careful to use binary mode when reading and writing such files. On Unix,
it doesn't hurt to append a ``'b'`` to the mode, so you can use it
platform-independently for all binary files.
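
As a small sketch of the platform-independent habit, the script below writes
and reads back a few bytes that contain end-of-line characters. The file name
and the byte values are made up for illustration::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'data.bin')   # hypothetical name

   # Bytes that text mode could alter on Windows: they include '\r\n' and '\n'.
   payload = b'\x89PNG\r\n\x1a\n\x00\x01'

   f = open(path, 'wb')     # 'b' asks for binary mode on every platform
   f.write(payload)
   f.close()

   f = open(path, 'rb')
   data = f.read()          # comes back byte for byte
   f.close()

Because both calls use a ``'b'`` mode, ``data`` equals ``payload`` on Windows
as well as on Unix.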


.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string. *size* is an optional numeric argument. When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned. If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''
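
The unambiguous return value is what makes a loop like the following possible.
This is only a sketch: the scratch file and its contents are made up, and the
code is a plain script rather than an interactive session::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')   # hypothetical file
   f = open(path, 'w')
   f.write('first\n\nlast')    # a blank line in the middle, no final newline
   f.close()

   lines = []
   f = open(path)
   while True:
       line = f.readline()
       if line == '':          # empty string: the end of the file was reached
           break
       lines.append(line)      # a blank line arrives as '\n', so it is kept
   f.close()

After the loop, ``lines`` holds three strings: the blank line is kept as
``'\n'``, and the last line has no trailing newline because the file doesn't
end in one.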

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print line,
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
``None``. ::

   >>> f.write('This is a test\n')

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'r+')
   >>> f.write('0123456789abcdef')
   >>> f.seek(5)     # Go to the 6th byte in the file
   >>> f.read(1)
   '5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   >>> f.read(1)
   'd'
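
The session above does not use a *from_what* value of 1. The sketch below shows
all three reference points together with ``f.tell()``. It assumes a scratch file
opened in binary mode, a choice made here so that positions are plain byte
counts on every platform::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')   # hypothetical file

   f = open(path, 'wb')
   f.write(b'0123456789abcdef')
   f.close()

   f = open(path, 'rb')
   f.seek(4)                 # from_what defaults to 0: 4 bytes from the start
   first = f.read(2)         # the bytes '45'; the position is now 6
   f.seek(3, 1)              # from_what=1: 3 bytes past the current position
   second = f.read(1)        # the byte '9'
   position = f.tell()       # 10
   f.seek(-2, 2)             # from_what=2: 2 bytes before the end
   tail = f.read()           # the bytes 'ef'
   f.close()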

When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True
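
For comparison, here is roughly what the longer
:keyword:`try`\ -\ :keyword:`finally` spelling looks like. This is a sketch
only; the scratch file and its contents are made up so that the script runs on
its own::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')   # hypothetical file
   f = open(path, 'w')
   f.write('This is the entire file.\n')
   f.close()

   # Approximately what the with statement does for a file object.
   f = open(path, 'r')
   try:
       read_data = f.read()
   finally:
       # Runs whether or not f.read() raised an exception.
       f.close()

   was_closed = f.closed     # True

The :keyword:`with` form expresses the same guarantee in two lines.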

File objects have some additional methods, such as :meth:`isatty` and
:meth:`truncate`, which are less frequently used; consult the Library Reference
for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.
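
The number case can be sketched in a few lines. The scratch file is
hypothetical, and the code is a plain script rather than an interactive
session::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')   # hypothetical file

   f = open(path, 'w')
   f.write(str(123) + '\n')   # a number must be converted to a string first
   f.close()

   f = open(path)
   text = f.readline()        # '123\n', still a string
   f.close()

   number = int(text)         # int() ignores the surrounding whitespace

This works for a single number, but a list or dictionary would need a format
of your own and code to parse it back.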

Rather than have users constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or data, or
sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)
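
Put together, a complete round trip might look like the sketch below. The
object ``x``, the file name, and the choice of binary mode are assumptions made
for this example; ``pickle.dumps`` and ``pickle.loads`` are the variants that
work on a string instead of a file::

   import os
   import pickle
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'data.pickle')   # hypothetical name
   x = {'name': 'spam', 'sizes': [1, 2, 3], 'pair': ('the answer', 42)}

   f = open(path, 'wb')        # binary mode is the safe choice for pickles
   pickle.dump(x, f)
   f.close()

   f = open(path, 'rb')
   y = pickle.load(f)          # a new object, equal to x
   f.close()

   # The same round trip without a file.
   s = pickle.dumps(x)
   z = pickle.loads(s)

Both ``y`` and ``z`` compare equal to ``x`` but are separate objects, so
changing one does not affect the others.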
|
400 |
|
401 :mod:`pickle` is the standard way to make Python objects which can be stored and |
|
402 reused by other programs or by a future invocation of the same program; the |
|
403 technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is |
|
404 so widely used, many authors who write Python extensions take care to ensure |
|
405 that new data types such as matrices can be properly pickled and unpickled. |
|
406 |
|
407 |