|
1 |
|
2 :mod:`zipfile` --- Work with ZIP archives |
|
3 ========================================= |
|
4 |
|
5 .. module:: zipfile |
|
6 :synopsis: Read and write ZIP-format archive files. |
|
7 .. moduleauthor:: James C. Ahlstrom <jim@interet.com> |
|
8 .. sectionauthor:: James C. Ahlstrom <jim@interet.com> |
|
9 |
|
10 .. versionadded:: 1.6 |
|
11 |
|
12 The ZIP file format is a common archive and compression standard. This module |
|
13 provides tools to create, read, write, append, and list a ZIP file. Any |
|
14 advanced use of this module will require an understanding of the format, as |
|
15 defined in `PKZIP Application Note |
|
16 <http://www.pkware.com/documents/casestudies/APPNOTE.TXT>`_. |
|
17 |
|
This module does not currently handle multi-disk ZIP files, or ZIP files
which have appended comments (although it correctly handles comments
added to individual archive members; see the :ref:`zipinfo-objects`
documentation). It can handle ZIP files that use the ZIP64 extensions
(that is, ZIP files that are more than 4 GByte in size). It supports
decryption of encrypted files in ZIP archives, but it currently cannot
create an encrypted file. Decryption is extremely slow as it is
implemented in native Python rather than C.
|
26 |
|
27 For other archive formats, see the :mod:`bz2`, :mod:`gzip`, and |
|
28 :mod:`tarfile` modules. |
|
29 |
|
30 The module defines the following items: |
|
31 |
|
32 .. exception:: BadZipfile |
|
33 |
|
34 The error raised for bad ZIP files (old name: ``zipfile.error``). |
|
35 |
|
36 |
|
37 .. exception:: LargeZipFile |
|
38 |
|
39 The error raised when a ZIP file would require ZIP64 functionality but that has |
|
40 not been enabled. |
|
41 |
|
42 |
|
43 .. class:: ZipFile |
|
44 |
|
45 The class for reading and writing ZIP files. See section |
|
46 :ref:`zipfile-objects` for constructor details. |
|
47 |
|
48 |
|
49 .. class:: PyZipFile |
|
50 |
|
51 Class for creating ZIP archives containing Python libraries. |
|
52 |
|
53 |
|
54 .. class:: ZipInfo([filename[, date_time]]) |
|
55 |
|
56 Class used to represent information about a member of an archive. Instances |
|
57 of this class are returned by the :meth:`getinfo` and :meth:`infolist` |
|
58 methods of :class:`ZipFile` objects. Most users of the :mod:`zipfile` module |
|
59 will not need to create these, but only use those created by this |
|
60 module. *filename* should be the full name of the archive member, and |
|
61 *date_time* should be a tuple containing six fields which describe the time |
|
62 of the last modification to the file; the fields are described in section |
|
63 :ref:`zipinfo-objects`. |
|
64 |
|
65 |
|
66 .. function:: is_zipfile(filename) |
|
67 |
|
68 Returns ``True`` if *filename* is a valid ZIP file based on its magic number, |
|
69 otherwise returns ``False``. This module does not currently handle ZIP files |
|
70 which have appended comments. |
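
As a minimal sketch of this check, the following writes a small archive and a
plain text file into a temporary directory and tests each one. The names
``demo.zip`` and ``notes.txt`` are made up for illustration.

```python
import os
import shutil
import tempfile
import zipfile

# Hypothetical scratch directory and file names, used only for this sketch.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "demo.zip")
txt_path = os.path.join(workdir, "notes.txt")

# Build a one-member archive.
zf = zipfile.ZipFile(zip_path, "w")
try:
    zf.writestr("hello.txt", b"hello")
finally:
    zf.close()

# Write a file that is not a ZIP archive.
f = open(txt_path, "wb")
try:
    f.write(b"just some text")
finally:
    f.close()

zip_ok = zipfile.is_zipfile(zip_path)
txt_ok = zipfile.is_zipfile(txt_path)

shutil.rmtree(workdir)
```

Here ``zip_ok`` should be true and ``txt_ok`` false.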
|
71 |
|
72 |
|
73 .. data:: ZIP_STORED |
|
74 |
|
75 The numeric constant for an uncompressed archive member. |
|
76 |
|
77 |
|
78 .. data:: ZIP_DEFLATED |
|
79 |
|
   The numeric constant for the usual ZIP compression method. This requires the
   :mod:`zlib` module. No other compression methods are currently supported.
|
82 |
|
83 |
|
84 .. seealso:: |
|
85 |
|
86 `PKZIP Application Note <http://www.pkware.com/documents/casestudies/APPNOTE.TXT>`_ |
|
87 Documentation on the ZIP file format by Phil Katz, the creator of the format and |
|
88 algorithms used. |
|
89 |
|
90 `Info-ZIP Home Page <http://www.info-zip.org/>`_ |
|
91 Information about the Info-ZIP project's ZIP archive programs and development |
|
92 libraries. |
|
93 |
|
94 |
|
95 .. _zipfile-objects: |
|
96 |
|
97 ZipFile Objects |
|
98 --------------- |
|
99 |
|
100 |
|
101 .. class:: ZipFile(file[, mode[, compression[, allowZip64]]]) |
|
102 |
|
103 Open a ZIP file, where *file* can be either a path to a file (a string) or a |
|
104 file-like object. The *mode* parameter should be ``'r'`` to read an existing |
|
105 file, ``'w'`` to truncate and write a new file, or ``'a'`` to append to an |
|
106 existing file. If *mode* is ``'a'`` and *file* refers to an existing ZIP file, |
|
107 then additional files are added to it. If *file* does not refer to a ZIP file, |
|
108 then a new ZIP archive is appended to the file. This is meant for adding a ZIP |
|
109 archive to another file, such as :file:`python.exe`. Using :: |
|
110 |
|
111 cat myzip.zip >> python.exe |
|
112 |
|
   also works, and at least :program:`WinZip` can read such files. If *mode* is
   ``'a'`` and the file does not exist at all, it is created. *compression* is the
   ZIP compression method to use when writing the archive, and should be
   :const:`ZIP_STORED` or :const:`ZIP_DEFLATED`; unrecognized values will cause
   :exc:`RuntimeError` to be raised. If :const:`ZIP_DEFLATED` is specified but the
   :mod:`zlib` module is not available, :exc:`RuntimeError` is also raised. The
   default is :const:`ZIP_STORED`. If *allowZip64* is ``True``, :mod:`zipfile` will
   create ZIP files that use the ZIP64 extensions when the ZIP file is larger than
   2 GB. If it is false (the default), :mod:`zipfile` will raise an exception when
   the ZIP file would require ZIP64 extensions. ZIP64 extensions are disabled by
   default because the default :program:`zip` and :program:`unzip` commands on
   Unix (the InfoZIP utilities) don't support these extensions.
|
125 |
|
   .. versionchanged:: 2.6
      If the file does not exist, it is created if the mode is ``'a'``.
|
128 |
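The three modes can be illustrated with a short sketch. It assumes a temporary
directory and a made-up archive name, ``demo.zip``. It creates the archive,
appends a second member, and reads the member names back.

```python
import os
import shutil
import tempfile
import zipfile

workdir = tempfile.mkdtemp()            # hypothetical scratch directory
path = os.path.join(workdir, "demo.zip")

# 'w' truncates and writes a new archive.
zf = zipfile.ZipFile(path, "w", zipfile.ZIP_STORED)
try:
    zf.writestr("first.txt", b"one")
finally:
    zf.close()      # closing writes the records that complete the archive

# 'a' adds members to the existing archive.
zf = zipfile.ZipFile(path, "a")
try:
    zf.writestr("second.txt", b"two")
finally:
    zf.close()

# 'r' reads the archive back.
zf = zipfile.ZipFile(path, "r")
try:
    names = zf.namelist()
finally:
    zf.close()

shutil.rmtree(workdir)
```

After this runs, ``names`` holds ``first.txt`` followed by ``second.txt``.
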
|
129 |
|
130 .. method:: ZipFile.close() |
|
131 |
|
132 Close the archive file. You must call :meth:`close` before exiting your program |
|
133 or essential records will not be written. |
|
134 |
|
135 |
|
136 .. method:: ZipFile.getinfo(name) |
|
137 |
|
138 Return a :class:`ZipInfo` object with information about the archive member |
|
139 *name*. Calling :meth:`getinfo` for a name not currently contained in the |
|
140 archive will raise a :exc:`KeyError`. |
|
141 |
|
142 |
|
143 .. method:: ZipFile.infolist() |
|
144 |
|
145 Return a list containing a :class:`ZipInfo` object for each member of the |
|
146 archive. The objects are in the same order as their entries in the actual ZIP |
|
147 file on disk if an existing archive was opened. |
|
148 |
|
149 |
|
150 .. method:: ZipFile.namelist() |
|
151 |
|
152 Return a list of archive members by name. |
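
A small sketch of the three lookup methods follows. It builds a two-member
archive in an in-memory ``io.BytesIO`` buffer (an assumption made to keep the
example self-contained) and then queries it. The member names are invented.

```python
import io
import zipfile

buf = io.BytesIO()                      # file-like object instead of a path
zf = zipfile.ZipFile(buf, "w")
try:
    zf.writestr("a.txt", b"alpha")
    zf.writestr("b.txt", b"beta!")
finally:
    zf.close()

zf = zipfile.ZipFile(buf, "r")
try:
    names = zf.namelist()               # member names, in archive order
    infos = zf.infolist()               # one ZipInfo per member
    size_of_a = zf.getinfo("a.txt").file_size
    try:
        zf.getinfo("missing.txt")       # not in the archive
        missing_raises = False
    except KeyError:
        missing_raises = True
finally:
    zf.close()
```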
|
153 |
|
154 |
|
155 .. method:: ZipFile.open(name[, mode[, pwd]]) |
|
156 |
|
157 Extract a member from the archive as a file-like object (ZipExtFile). *name* is |
|
158 the name of the file in the archive, or a :class:`ZipInfo` object. The *mode* |
|
159 parameter, if included, must be one of the following: ``'r'`` (the default), |
|
160 ``'U'``, or ``'rU'``. Choosing ``'U'`` or ``'rU'`` will enable universal newline |
|
161 support in the read-only object. *pwd* is the password used for encrypted files. |
|
162 Calling :meth:`open` on a closed ZipFile will raise a :exc:`RuntimeError`. |
|
163 |
|
164 .. note:: |
|
165 |
|
166 The file-like object is read-only and provides the following methods: |
|
167 :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`__iter__`, |
|
168 :meth:`next`. |
|
169 |
|
170 .. note:: |
|
171 |
|
172 If the ZipFile was created by passing in a file-like object as the first |
|
173 argument to the constructor, then the object returned by :meth:`.open` shares the |
|
174 ZipFile's file pointer. Under these circumstances, the object returned by |
|
175 :meth:`.open` should not be used after any additional operations are performed |
|
176 on the ZipFile object. If the ZipFile was created by passing in a string (the |
|
177 filename) as the first argument to the constructor, then :meth:`.open` will |
|
178 create a new file object that will be held by the ZipExtFile, allowing it to |
|
179 operate independently of the ZipFile. |
|
180 |
|
181 .. note:: |
|
182 |
|
183 The :meth:`open`, :meth:`read` and :meth:`extract` methods can take a filename |
|
184 or a :class:`ZipInfo` object. You will appreciate this when trying to read a |
|
185 ZIP file that contains members with duplicate names. |
|
186 |
|
187 .. versionadded:: 2.6 |
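
As an illustration, the sketch below opens a member of an in-memory archive
and reads it with the file-like methods listed above. The member name
``log.txt`` and its contents are made up.

```python
import io
import zipfile

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w")
try:
    zf.writestr("log.txt", b"first line\nsecond line\n")
finally:
    zf.close()

zf = zipfile.ZipFile(buf, "r")
try:
    member = zf.open("log.txt")         # read-only file-like object
    first = member.readline()           # one line, including the newline
    rest = member.read()                # everything that is left
finally:
    zf.close()
```

Because the archive was opened from a file-like object, the member is read
completely before anything else is done with the :class:`ZipFile`.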
|
188 |
|
189 |
|
190 .. method:: ZipFile.extract(member[, path[, pwd]]) |
|
191 |
|
   Extract a member from the archive to the current working directory; *member*
   must be its full name or a :class:`ZipInfo` object. Its file information is
   extracted as accurately as possible. *path* specifies a different directory
   to extract to. *pwd* is the password used for encrypted files.
|
197 |
|
198 .. versionadded:: 2.6 |
|
199 |
|
200 |
|
201 .. method:: ZipFile.extractall([path[, members[, pwd]]]) |
|
202 |
|
203 Extract all members from the archive to the current working directory. *path* |
|
204 specifies a different directory to extract to. *members* is optional and must |
|
205 be a subset of the list returned by :meth:`namelist`. *pwd* is the password |
|
206 used for encrypted files. |
|
207 |
|
208 .. versionadded:: 2.6 |
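
The following sketch exercises both :meth:`extract` and :meth:`extractall`
against a temporary target directory. The archive contents and member names
are assumptions chosen for the example.

```python
import io
import os
import shutil
import tempfile
import zipfile

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w")
try:
    zf.writestr("docs/readme.txt", b"read me")
    zf.writestr("data.bin", b"\x00\x01")
finally:
    zf.close()

target = tempfile.mkdtemp()             # hypothetical extraction directory
zf = zipfile.ZipFile(buf, "r")
try:
    zf.extract("data.bin", target)                  # a single member
    zf.extractall(target, ["docs/readme.txt"])      # a subset of namelist()
finally:
    zf.close()

# Collect the extracted files as archive-style relative paths.
extracted = sorted(
    os.path.relpath(os.path.join(root, name), target).replace(os.sep, "/")
    for root, dirs, files in os.walk(target)
    for name in files
)
shutil.rmtree(target)
```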
|
209 |
|
210 |
|
211 .. method:: ZipFile.printdir() |
|
212 |
|
213 Print a table of contents for the archive to ``sys.stdout``. |
|
214 |
|
215 |
|
216 .. method:: ZipFile.setpassword(pwd) |
|
217 |
|
218 Set *pwd* as default password to extract encrypted files. |
|
219 |
|
220 .. versionadded:: 2.6 |
|
221 |
|
222 |
|
223 .. method:: ZipFile.read(name[, pwd]) |
|
224 |
|
225 Return the bytes of the file *name* in the archive. *name* is the name of the |
|
226 file in the archive, or a :class:`ZipInfo` object. The archive must be open for |
|
227 read or append. *pwd* is the password used for encrypted files and, if specified, |
|
228 it will override the default password set with :meth:`setpassword`. Calling |
|
229 :meth:`read` on a closed ZipFile will raise a :exc:`RuntimeError`. |
|
230 |
|
231 .. versionchanged:: 2.6 |
|
232 *pwd* was added, and *name* can now be a :class:`ZipInfo` object. |
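
Passing a :class:`ZipInfo` object matters when an archive contains duplicate
names. The sketch below is an illustration under that assumption: it writes
two members called ``config.ini`` (a made-up name) and reads each one through
its own :class:`ZipInfo`. The warnings filter is there only because newer
Python versions may warn about the duplicate name on writing.

```python
import io
import warnings
import zipfile

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w")
try:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")     # silence any duplicate-name warning
        zf.writestr("config.ini", b"old settings")
        zf.writestr("config.ini", b"new settings")
finally:
    zf.close()

zf = zipfile.ZipFile(buf, "r")
try:
    # Each ZipInfo identifies one specific member, even with equal names.
    by_info = [zf.read(info) for info in zf.infolist()]
    # By name, only one of the duplicates is reachable.
    by_name = zf.read("config.ini")
finally:
    zf.close()
```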
|
233 |
|
234 |
|
235 .. method:: ZipFile.testzip() |
|
236 |
|
   Read all the files in the archive and check their CRCs and file headers.
   Return the name of the first bad file, or else return ``None``. Calling
   :meth:`testzip` on a closed ZipFile will raise a :exc:`RuntimeError`.
|
240 |
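A rough sketch of :meth:`testzip` on a good and a damaged archive follows. The
damage is simulated by changing the stored bytes of an uncompressed member in
an in-memory copy, which is an assumption made purely for demonstration.

```python
import io
import zipfile

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED)
try:
    zf.writestr("a.txt", b"payload")
finally:
    zf.close()

# An intact archive has no bad members.
zf = zipfile.ZipFile(buf, "r")
try:
    good_result = zf.testzip()
finally:
    zf.close()

# Alter the stored data so that it no longer matches the recorded CRC.
damaged = io.BytesIO(buf.getvalue().replace(b"payload", b"PAYLOAD"))
zf = zipfile.ZipFile(damaged, "r")
try:
    bad_result = zf.testzip()
finally:
    zf.close()
```
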
|
241 |
|
242 .. method:: ZipFile.write(filename[, arcname[, compress_type]]) |
|
243 |
|
244 Write the file named *filename* to the archive, giving it the archive name |
|
245 *arcname* (by default, this will be the same as *filename*, but without a drive |
|
246 letter and with leading path separators removed). If given, *compress_type* |
|
247 overrides the value given for the *compression* parameter to the constructor for |
|
248 the new entry. The archive must be open with mode ``'w'`` or ``'a'`` -- calling |
|
249 :meth:`write` on a ZipFile created with mode ``'r'`` will raise a |
|
250 :exc:`RuntimeError`. Calling :meth:`write` on a closed ZipFile will raise a |
|
251 :exc:`RuntimeError`. |
|
252 |
|
253 .. note:: |
|
254 |
|
      There is no official file name encoding for ZIP files. If you have Unicode
      file names, you must convert them to byte strings in your desired encoding
      before passing them to :meth:`write`. WinZip interprets all file names as
      encoded in CP437, also known as DOS Latin.
|
259 |
|
260 .. note:: |
|
261 |
|
262 Archive names should be relative to the archive root, that is, they should not |
|
263 start with a path separator. |
|
264 |
|
265 .. note:: |
|
266 |
|
267 If ``arcname`` (or ``filename``, if ``arcname`` is not given) contains a null |
|
268 byte, the name of the file in the archive will be truncated at the null byte. |
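
To illustrate *arcname* and *compress_type*, the sketch below writes a
temporary file into an archive under a different, relative archive name and
compresses just that entry. The file name ``report.txt`` is invented, and the
example assumes the :mod:`zlib` module is available.

```python
import io
import os
import shutil
import tempfile
import zipfile

workdir = tempfile.mkdtemp()            # hypothetical scratch directory
src = os.path.join(workdir, "report.txt")
f = open(src, "wb")
try:
    f.write(b"x" * 1000)                # highly compressible content
finally:
    f.close()

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w")          # constructor default is ZIP_STORED
try:
    # Store under a relative archive name and override the compression
    # method for this entry only.
    zf.write(src, "docs/report.txt", zipfile.ZIP_DEFLATED)
    info = zf.getinfo("docs/report.txt")
    names = zf.namelist()
finally:
    zf.close()

shutil.rmtree(workdir)
```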
|
269 |
|
270 |
|
271 .. method:: ZipFile.writestr(zinfo_or_arcname, bytes) |
|
272 |
|
   Write the string *bytes* to the archive; *zinfo_or_arcname* is either the file
   name it will be given in the archive, or a :class:`ZipInfo` instance. If it's
   an instance, at least the filename, date, and time must be given. If it's a
   name, the date and time are set to the current date and time. The archive must
   be opened with mode ``'w'`` or ``'a'``; calling :meth:`writestr` on a ZipFile
   created with mode ``'r'`` will raise a :exc:`RuntimeError`. Calling
   :meth:`writestr` on a closed ZipFile will raise a :exc:`RuntimeError`.
|
280 |
|
281 .. note:: |
|
282 |
|
      When passing a :class:`ZipInfo` instance as the *zinfo_or_arcname* parameter,
      the compression method used will be that specified in the *compress_type*
      member of the given :class:`ZipInfo` instance. By default, the
      :class:`ZipInfo` constructor sets this member to :const:`ZIP_STORED`.
|
287 |
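The two ways of calling :meth:`writestr` can be sketched as follows. The
member names and the modification time are made up, and the example assumes
:mod:`zlib` is available for :const:`ZIP_DEFLATED`.

```python
import io
import zipfile

# A ZipInfo carries the name, the modification time and the compression type.
info = zipfile.ZipInfo("notes/todo.txt", (2009, 1, 31, 12, 30, 0))
default_type = info.compress_type       # what the constructor chose
info.compress_type = zipfile.ZIP_DEFLATED

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w")
try:
    zf.writestr(info, b"buy milk\n" * 50)           # ZipInfo instance
    zf.writestr("plain.txt", b"named by string")    # plain archive name
finally:
    zf.close()

zf = zipfile.ZipFile(buf, "r")
try:
    todo = zf.getinfo("notes/todo.txt")
    todo_type = todo.compress_type
    todo_date = todo.date_time
    plain_type = zf.getinfo("plain.txt").compress_type
finally:
    zf.close()
```
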
|
288 The following data attributes are also available: |
|
289 |
|
290 |
|
291 .. attribute:: ZipFile.debug |
|
292 |
|
293 The level of debug output to use. This may be set from ``0`` (the default, no |
|
294 output) to ``3`` (the most output). Debugging information is written to |
|
295 ``sys.stdout``. |
|
296 |
|
297 .. attribute:: ZipFile.comment |
|
298 |
|
   The comment text associated with the ZIP file. If assigning a comment to a
   :class:`ZipFile` instance created with mode ``'a'`` or ``'w'``, this should be
   a string no longer than 65535 bytes. Comments longer than this will be
   truncated in the written archive when :meth:`ZipFile.close` is called.
|
303 |
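A brief sketch of setting and reading back the archive comment follows. The
comment text is invented, and a byte string is used so that the example also
runs on newer Python versions.

```python
import io
import zipfile

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w")
try:
    zf.writestr("a.txt", b"alpha")
    zf.comment = b"nightly build"       # written out when close() is called
finally:
    zf.close()

zf = zipfile.ZipFile(buf, "r")
try:
    comment_read = zf.comment
finally:
    zf.close()
```
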
|
304 .. _pyzipfile-objects: |
|
305 |
|
306 PyZipFile Objects |
|
307 ----------------- |
|
308 |
|
309 The :class:`PyZipFile` constructor takes the same parameters as the |
|
310 :class:`ZipFile` constructor. Instances have one method in addition to those of |
|
311 :class:`ZipFile` objects. |
|
312 |
|
313 |
|
314 .. method:: PyZipFile.writepy(pathname[, basename]) |
|
315 |
|
316 Search for files :file:`\*.py` and add the corresponding file to the archive. |
|
317 The corresponding file is a :file:`\*.pyo` file if available, else a |
|
318 :file:`\*.pyc` file, compiling if necessary. If the pathname is a file, the |
|
319 filename must end with :file:`.py`, and just the (corresponding |
|
320 :file:`\*.py[co]`) file is added at the top level (no path information). If the |
|
321 pathname is a file that does not end with :file:`.py`, a :exc:`RuntimeError` |
|
322 will be raised. If it is a directory, and the directory is not a package |
|
323 directory, then all the files :file:`\*.py[co]` are added at the top level. If |
|
324 the directory is a package directory, then all :file:`\*.py[co]` are added under |
|
325 the package name as a file path, and if any subdirectories are package |
|
326 directories, all of these are added recursively. *basename* is intended for |
|
327 internal use only. The :meth:`writepy` method makes archives with file names |
|
328 like this:: |
|
329 |
|
330 string.pyc # Top level name |
|
331 test/__init__.pyc # Package directory |
|
332 test/test_support.pyc # Module test.test_support |
|
333 test/bogus/__init__.pyc # Subpackage directory |
|
334 test/bogus/myfile.pyc # Submodule test.bogus.myfile |
|
335 |
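As a rough sketch of the single-file case, the following writes a made-up
module ``hello.py`` to a temporary directory and adds it with
:meth:`writepy`. Which compiled suffix ends up in the archive depends on the
Python version and on how the interpreter was started, so the example only
records the resulting name.

```python
import io
import os
import shutil
import tempfile
import zipfile

workdir = tempfile.mkdtemp()            # hypothetical scratch directory
module_path = os.path.join(workdir, "hello.py")
f = open(module_path, "w")
try:
    f.write("GREETING = 'hi'\n")
finally:
    f.close()

buf = io.BytesIO()
pzf = zipfile.PyZipFile(buf, "w")
try:
    pzf.writepy(module_path)            # adds the compiled file at top level
    names = pzf.namelist()
finally:
    pzf.close()

shutil.rmtree(workdir)
```
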
|
336 |
|
337 .. _zipinfo-objects: |
|
338 |
|
339 ZipInfo Objects |
|
340 --------------- |
|
341 |
|
342 Instances of the :class:`ZipInfo` class are returned by the :meth:`getinfo` and |
|
343 :meth:`infolist` methods of :class:`ZipFile` objects. Each object stores |
|
344 information about a single member of the ZIP archive. |
|
345 |
|
346 Instances have the following attributes: |
|
347 |
|
348 |
|
349 .. attribute:: ZipInfo.filename |
|
350 |
|
351 Name of the file in the archive. |
|
352 |
|
353 |
|
354 .. attribute:: ZipInfo.date_time |
|
355 |
|
356 The time and date of the last modification to the archive member. This is a |
|
357 tuple of six values: |
|
358 |
|
359 +-------+--------------------------+ |
|
360 | Index | Value | |
|
361 +=======+==========================+ |
|
362 | ``0`` | Year | |
|
363 +-------+--------------------------+ |
|
364 | ``1`` | Month (one-based) | |
|
365 +-------+--------------------------+ |
|
366 | ``2`` | Day of month (one-based) | |
|
367 +-------+--------------------------+ |
|
368 | ``3`` | Hours (zero-based) | |
|
369 +-------+--------------------------+ |
|
370 | ``4`` | Minutes (zero-based) | |
|
371 +-------+--------------------------+ |
|
372 | ``5`` | Seconds (zero-based) | |
|
373 +-------+--------------------------+ |
|
374 |
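For example, the six fields unpack directly and can be turned into a
``datetime`` object. This is a small sketch with a made-up member name and
timestamp.

```python
import datetime
import zipfile

info = zipfile.ZipInfo("a.txt", (2008, 10, 1, 9, 5, 30))

# The tuple order follows the table above.
year, month, day, hours, minutes, seconds = info.date_time

# The same six values are accepted by the datetime constructor.
modified = datetime.datetime(*info.date_time)
```
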
|
375 |
|
376 .. attribute:: ZipInfo.compress_type |
|
377 |
|
378 Type of compression for the archive member. |
|
379 |
|
380 |
|
381 .. attribute:: ZipInfo.comment |
|
382 |
|
383 Comment for the individual archive member. |
|
384 |
|
385 |
|
386 .. attribute:: ZipInfo.extra |
|
387 |
|
388 Expansion field data. The `PKZIP Application Note |
|
389 <http://www.pkware.com/documents/casestudies/APPNOTE.TXT>`_ contains |
|
390 some comments on the internal structure of the data contained in this string. |
|
391 |
|
392 |
|
393 .. attribute:: ZipInfo.create_system |
|
394 |
|
   System that created the ZIP archive.
|
396 |
|
397 |
|
398 .. attribute:: ZipInfo.create_version |
|
399 |
|
   PKZIP version that created the ZIP archive.
|
401 |
|
402 |
|
403 .. attribute:: ZipInfo.extract_version |
|
404 |
|
   PKZIP version needed to extract the archive.
|
406 |
|
407 |
|
408 .. attribute:: ZipInfo.reserved |
|
409 |
|
410 Must be zero. |
|
411 |
|
412 |
|
413 .. attribute:: ZipInfo.flag_bits |
|
414 |
|
415 ZIP flag bits. |
|
416 |
|
417 |
|
418 .. attribute:: ZipInfo.volume |
|
419 |
|
420 Volume number of file header. |
|
421 |
|
422 |
|
423 .. attribute:: ZipInfo.internal_attr |
|
424 |
|
425 Internal attributes. |
|
426 |
|
427 |
|
428 .. attribute:: ZipInfo.external_attr |
|
429 |
|
430 External file attributes. |
|
431 |
|
432 |
|
433 .. attribute:: ZipInfo.header_offset |
|
434 |
|
435 Byte offset to the file header. |
|
436 |
|
437 |
|
438 .. attribute:: ZipInfo.CRC |
|
439 |
|
440 CRC-32 of the uncompressed file. |
|
441 |
|
442 |
|
443 .. attribute:: ZipInfo.compress_size |
|
444 |
|
445 Size of the compressed data. |
|
446 |
|
447 |
|
448 .. attribute:: ZipInfo.file_size |
|
449 |
|
450 Size of the uncompressed file. |
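
The size and checksum attributes can be inspected as in the sketch below. It
assumes :mod:`zlib` is available, both for :const:`ZIP_DEFLATED` and for
recomputing the CRC-32 of the original data. The member name is made up.

```python
import io
import zipfile
import zlib

data = b"spam and eggs " * 100          # 1400 bytes, repetitive on purpose

buf = io.BytesIO()
zf = zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED)
try:
    zf.writestr("menu.txt", data)
finally:
    zf.close()

zf = zipfile.ZipFile(buf, "r")
try:
    info = zf.getinfo("menu.txt")
finally:
    zf.close()

# CRC is computed over the uncompressed bytes; mask to an unsigned value.
crc_matches = info.CRC == (zlib.crc32(data) & 0xffffffff)
sizes = (info.file_size, info.compress_size)
```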
|
451 |