diff -r 509e4801c378 -r 22878952f6e2 srcanamdw/codescanner/pyinstaller/doc/Manual.html --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/srcanamdw/codescanner/pyinstaller/doc/Manual.html Thu Feb 18 12:29:02 2010 +0530 @@ -0,0 +1,1561 @@
Author: William Caban (based on Gordon McMillan's manual)
Contact: william@hpcf.upr.edu
Revision: 257
Source URL: svn://pyinstaller/trunk/doc/source/Manual.rst
Copyright: This document has been placed in the public domain.
First, unpack the archive on your path of choice. Installer is not a Python package, so it doesn't need to go in site-packages, or have a .pth file. For the purpose of this documentation we will assume /your/path/to/pyinstaller/. You will be using a couple of scripts in the /your/path/to/pyinstaller/ directory, and these will find everything they need from their own location. For convenience, keep the paths to these scripts short (don't install in a deeply nested subdirectory).
PyInstaller is dependent on the version of Python you configure it for. In other words, you will need a separate copy of PyInstaller for each Python version you wish to work with (or you'll need to rerun Configure.py every time you switch the Python version).
Note: Windows users can skip this step, because all of Python is contained in pythonXX.dll, and PyInstaller will use your pythonXX.dll.
+On Linux the first thing to do is build the runtime executables.
+Change to the /your/path/to/pyinstaller/ source/linux subdirectory. Run Make.py +[-n|-e] and then make. This will produce support/loader/run and +support/loader/run_d, which are the bootloaders.
Note: If you have multiple versions of Python, the Python you use to run Make.py is the one whose configuration is used.
The -n and -e options set a non-elf or elf flag in your config.dat. As of v1.0, the executable will try both strategies, and this flag just sets how you want your executables built. In the elf strategy, the archive is concatenated to the executable. In the non-elf strategy, the executable expects an archive with the same name as itself in the executable's directory. Note that the executable chases down symbolic links before determining its name and directory, so putting the archive in the same directory as the symbolic link will not work.
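The symbolic-link chasing described above can be sketched in Python: os.path.realpath resolves links the same way, which is why the archive must sit next to the real executable, not next to the link. The function name archive_path_for is ours, for illustration only:

```python
import os

def archive_path_for(executable):
    """Mimic the non-elf lookup: resolve symbolic links first, then look
    for an archive with the executable's own name in its real directory."""
    real = os.path.realpath(executable)      # chases down symbolic links
    return os.path.join(os.path.dirname(real), os.path.basename(real))

# e.g. archive_path_for(sys.argv[0]) points beside the real binary,
# so placing the archive next to a symlink to it does not work.
```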
+Windows distributions come with several executables in the support/loader +directory: run_*.exe (bootloader for regular programs), and +inprocsrvr_*.dll (bootloader for in-process COM servers). To rebuild this, +you need to install Scons, and then just run scons from the /your/path/to/pyinstaller/ +directory.
+ +In the /your/path/to/pyinstaller/ directory, run Configure.py. This saves some +information into config.dat that would otherwise be recomputed every time. +It can be rerun at any time if your configuration changes. It must be run before +trying to build anything.
+ +[For Windows COM server support, see section Windows COM Server Support]
+The root directory has a script Makespec.py for this purpose:
++python Makespec.py [opts] <scriptname> [<scriptname> ...] ++
Where allowed OPTIONS are:
-F, --onefile          produce a single file deployment (see below).
-D, --onedir           produce a single directory deployment (default).
-K, --tk               include TCL/TK in the deployment.
-a, --ascii            do not include encodings. The default (on Python versions with unicode support) is now to include all encodings.
-d, --debug            use debug (verbose) versions of the executables.
-w, --windowed, --noconsole
                       use the Windows subsystem executable, which does not open the console when the program is launched. (Windows only)
-c, --nowindowed, --console
                       use the console subsystem executable. This is the default. (Windows only)
-s, --strip            the executable and all shared libraries will be run through strip. Note that cygwin's strip tends to render normal Win32 dlls unusable.
-X, --upx              if you have UPX installed (detected by Configure), this will use it to compress your executable (and, on Windows, your dlls). See note below.
-o DIR, --out=DIR      create the spec file in DIR. If not specified, and the current directory is Installer's root directory, an output subdirectory will be created. Otherwise the current directory is used.
-p DIR, --paths=DIR    set the base path for imports (like using PYTHONPATH). Multiple directories are allowed, separated by the path separator (';' under Windows, ':' under Linux), or by using this option multiple times.
--icon=<FILE.ICO>      add FILE.ICO to the executable's resources. (Windows only)
--icon=<FILE.EXE,N>    add the n-th icon in FILE.EXE to the executable's resources. (Windows only)
-v FILE, --version=FILE
                       add FILE as a version resource to the executable. (Windows only)
-n NAME, --name=NAME   optional name to assign to the project (from which the spec file name is generated). If omitted, the basename of the (first) script is used.
[For building with optimization on (like Python -O), see section +Building Optimized]
For simple projects, the generated spec file will probably be sufficient. For more complex projects, it should be regarded as a template. The spec file is actually Python code, and modifying it should be easy. See Spec Files for details.
+ ++python Build.py specfile ++
A buildproject subdirectory will be created in the specfile's directory. This is a private workspace so that Build.py can act like a makefile. Any named targets will appear in the specfile's directory. For --onedir configurations, it will also create distproject, which is the directory you're interested in. For a --onefile, the executable will be in the specfile's directory.
+In most cases, this will be all you have to do. If not, see When things go +wrong and be sure to read the introduction to Spec Files.
+ +For Windows COM support execute:
++python MakeCOMServer.py [OPTION] script... ++
This will generate a new script drivescript.py and a spec file for the script.
+These options are allowed:
--debug      Use the verbose version of the executable.
--verbose    Register the COM server(s) with the quiet flag off.
--ascii      do not include encodings (this is passed through to Makespec).
--out <dir>  Generate the driver script and spec file in <dir>.
Now Build your project on the generated spec file.
+If you have the win32dbg package installed, you can use it with the generated +COM server. In the driver script, set debug=1 in the registration line.
Warning: the in-process COM server support will not work when the client process already has Python loaded. It would be rather tricky to non-obtrusively hook into an already running Python, but the show-stopper is that the Python/C API won't let us find out which interpreter instance we should hook into. (If this is important to you, you might experiment with using apartment threading, which seems the best possibility to get this to work). To use a "frozen" COM server from a Python process, you'll have to load it as an exe:
++o = win32com.client.Dispatch(progid, + clsctx=pythoncom.CLSCTX_LOCAL_SERVER) ++
MakeCOMServer also assumes that your top level code (registration etc.) is +"normal". If it's not, you will have to edit the generated script.
+ +There are two facets to running optimized: gathering .pyo's, and setting the +Py_OptimizeFlag. Installer will gather .pyo's if it is run optimized:
++python -O Build.py ... ++
The Py_OptimizeFlag will be set if you use a ('O','','OPTION') in one of +the TOCs building the EXE:
++exe = EXE(pyz, + a.scripts + [('O','','OPTION')], + ... ++
See Spec Files for details.
+ +On both Windows and Linux, UPX can give truly startling compression - the days +of fitting something useful on a diskette are not gone forever! Installer has +been tested with many UPX versions without problems. Just get it and install it +on your PATH, then rerun configure.
+For Windows, there is a problem of compatibility between UPX and executables +generated by Microsoft Visual Studio .NET 2003 (or the equivalent free +toolkit available for download). This is especially worrisome for users of +Python 2.4+, where most extensions (and Python itself) are compiled with that +compiler. This issue has been fixed in later beta versions of UPX, so you +will need at least UPX 1.92 beta. Configure.py will check this for you +and complain if you have an older version of UPX and you are using Python 2.4.
+ +For Linux, a bit more discussion is in order. First, UPX is only useful on +executables, not shared libs. Installer accounts for that, but to get the full +benefit, you might rebuild Python with more things statically linked.
+More importantly, when run finds that its sys.argv[0] does not contain a path, +it will use /proc/pid/exe to find itself (if it can). This happens, for +example, when executed by Apache. If it has been upx-ed, this symbolic link +points to the tempfile created by the upx stub and PyInstaller will fail (please +see the UPX docs for more information). So for now, at least, you can't use upx +for CGI's executed by Apache. Otherwise, you can ignore the warnings in the UPX +docs, since what PyInstaller opens is the executable Installer created, not the +temporary upx-created executable.
+ +A --onefile works by packing all the shared libs / dlls into the archive +attached to the bootloader executable (or next to the executable in a non-elf +configuration). When first started, it finds that it needs to extract these +files before it can run "for real". That's because locating and loading a +shared lib or linked-in dll is a system level action, not user-level. With +PyInstaller v1.0 it always uses a temporary directory (_MEIpid) in the +user's temp directory. It then executes itself again, setting things up so +the system will be able to load the shared libs / dlls. When executing is +complete, it recursively removes the entire directory it created.
+This has a number of implications:
While we are not security experts, we believe the scheme is good enough for most users.

Notes for *nix users: take notice that if the executable is setuid root, a determined hacker could possibly (given enough tries) introduce a malicious lookalike of one of the shared libraries during the window between when the library is extracted and when it gets loaded by the execvp'd process. So maybe you shouldn't build setuid root programs using --onefile. In fact, we do not recommend the use of --onefile for setuid programs.
setuptools is a distutils extension which provides many benefits, including the ability to distribute extensions as egg files. Together with the nifty easy_install (a tool which automatically locates, downloads and installs Python extensions), egg files are becoming a more and more widespread way of distributing Python extensions.
+egg files are actually ZIP files under the hood, and they rely on the fact +that Python 2.4 is able to transparently import modules stored within ZIP +files. PyInstaller is currently not able to import and extract modules +within ZIP files, so code which uses extensions packaged as egg files +cannot be packaged with PyInstaller.
The workaround is pretty easy: you can use easy_install -Z at installation time to ask easy_install to always decompress egg files. This will allow PyInstaller to see the files and make the package correctly. If you have already installed the modules, you can simply decompress them into a directory with the same name as the egg file (including the extension).
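Since an egg is just a ZIP file, decompressing an already-installed one by hand can be sketched with the standard zipfile module. This is a hedged sketch (the helper name unpack_egg is ours); the key point from above is that the resulting directory must take the egg file's exact name, extension included:

```python
import os
import zipfile

def unpack_egg(egg_path):
    """Replace a zipped egg with a directory of the same name,
    so PyInstaller can see the individual module files."""
    target = egg_path + '.unpacked'          # extract to a side directory first
    with zipfile.ZipFile(egg_path) as zf:
        zf.extractall(target)
    os.remove(egg_path)                      # the directory must take the egg's
    os.rename(target, egg_path)              # exact name, extension included

# usage (hypothetical path): unpack_egg('site-packages/SomePkg-1.0-py2.4.egg')
```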
+Support for egg files is planned for a future release of PyInstaller.
+ ++python ArchiveViewer.py <archivefile> ++
ArchiveViewer lets you examine the contents of any archive built with PyInstaller, or of an executable (PYZ, PKG or exe). Invoke it with the target as the first arg (it has been set up as a Send-To, so it shows on the context menu in Explorer). The archive can be navigated using these commands:
++python bindepend.py <executable_or_dynamic_library> ++
bindepend will analyze the executable you pass to it, and write to stdout all +its binary dependencies. This is handy to find out which DLLs are required by +an executable or another DLL. This module is used by PyInstaller itself to +follow the chain of dependencies of binary extensions and make sure that all +of them get included in the final package.
++python GrabVersion.py <executable_with_version_resource> ++
GrabVersion outputs text which can be eval'ed by versionInfo.py to reproduce a version resource. Invoke it with the full path name of a Windows executable (with a version resource) as the first argument. If you cut & paste (or redirect to a file), you can then edit the version information. The edited text file can be used in a version = myversion.txt option on any executable in a PyInstaller spec file.

This was done this way because version resources are rather strange beasts, and fully understanding them is probably impossible. Some elements are optional, others required, but you could spend unbounded amounts of time figuring this out, because it's not well documented. When you view the version tab on a properties dialog, there's no straightforward relationship between how the data is displayed and the structure of the resource itself. So the easiest thing to do is find an executable that displays the kind of information you want, grab its resource and edit it. Certainly easier than the Version resource wizard in VC++.
+ +You can interactively track down dependencies, including getting +cross-references by using mf.py, documented in section mf.py: A modulefinder +Replacement
+ +Spec files are in Python syntax. They are evaluated by Build.py. A simplistic +spec file might look like this:
++a = Analysis(['myscript.py']) +pyz = PYZ(a.pure) +exe = EXE(pyz, a.scripts, a.binaries, name="myapp.exe") ++
This creates a single file deployment with all binaries (extension modules and +their dependencies) packed into the executable.
+A simplistic single directory deployment might look like this:
++a = Analysis(['myscript.py']) +pyz = PYZ(a.pure) +exe = EXE(a.scripts, pyz, name="myapp.exe", exclude_binaries=1) +dist = COLLECT(exe, a.binaries, name="dist") ++
Note that neither of these examples are realistic. Use Makespec.py (documented +in section Create a spec file for your project) to create your specfile, +and tweak it (if necessary) from there.
+All of the classes you see above are subclasses of Build.Target. A Target acts +like a rule in a makefile. It knows enough to cache its last inputs and +outputs. If its inputs haven't changed, it can assume its outputs wouldn't +change on recomputation. So a spec file acts much like a makefile, only +rebuilding as much as needs rebuilding. This means, for example, that if you +change an EXE from debug=1 to debug=0, the rebuild will be nearly +instantaneous.
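The makefile-like caching behaviour described above can be sketched as a class that remembers a digest of its last inputs and only recomputes when they change. The names here (CachedTarget, assemble) are ours for illustration, not Build.py's actual internals:

```python
import hashlib

class CachedTarget:
    """Rebuild only when the inputs differ from the previous run,
    the way a spec-file Target skips work whose inputs are unchanged."""
    def __init__(self):
        self._last_digest = None
        self._output = None
        self.builds = 0                          # how many real rebuilds happened

    def _digest(self, inputs):
        return hashlib.md5(repr(sorted(inputs)).encode()).hexdigest()

    def assemble(self, inputs):
        digest = self._digest(inputs)
        if digest != self._last_digest:          # inputs changed: really rebuild
            self.builds += 1
            self._output = list(inputs)
            self._last_digest = digest
        return self._output                      # unchanged: reuse cached output
```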
+The high level view is that an Analysis takes a list of scripts as input, +and generates three "outputs", held in attributes named scripts, pure +and binaries. A PYZ (a .pyz archive) is built from the modules in +pure. The EXE is built from the PYZ, the scripts and, in the case of a +single-file deployment, the binaries. In a single-directory deployment, a +directory is built containing a slim executable and the binaries.
+ +Before you can do much with a spec file, you need to understand the +TOC (Table Of Contents) class.
+A TOC appears to be a list of tuples of the form (name, path, typecode). +In fact, it's an ordered set, not a list. A TOC contains no duplicates, where +uniqueness is based on name only. Furthermore, within this constraint, a TOC +preserves order.
+Besides the normal list methods and operations, TOC supports taking differences +and intersections (and note that adding or extending is really equivalent to +union). Furthermore, the operations can take a real list of tuples on the right +hand side. This makes excluding modules quite easy. For a pure Python module:
++pyz = PYZ(a.pure - [('badmodule', '', '')]) ++
or for an extension module in a single-directory deployment:
++dist = COLLECT(..., a.binaries - [('badmodule', '', '')], ...) ++
or for a single-file deployment:
++exe = EXE(..., a.binaries - [('badmodule', '', '')], ...) ++
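The ordered-set behaviour behind these examples can be sketched in plain Python: uniqueness is by name only, order is preserved, and subtraction accepts a plain list of tuples on the right-hand side. This SimpleTOC is our illustration, not the real Build.TOC:

```python
class SimpleTOC:
    """An order-preserving set of (name, path, typecode) tuples, unique by name."""
    def __init__(self, entries=()):
        self.entries = []
        self.names = set()
        for e in entries:
            self.append(e)

    def append(self, entry):
        if entry[0] not in self.names:           # duplicates (by name) are dropped
            self.names.add(entry[0])
            self.entries.append(entry)

    def __sub__(self, other):                    # difference: drop the names that
        drop = set(name for name, _, _ in other) # appear on the right-hand side
        return SimpleTOC(e for e in self.entries if e[0] not in drop)

toc = SimpleTOC([('good', '/p/good.py', 'PYMODULE'),
                 ('badmodule', '/p/bad.py', 'PYMODULE'),
                 ('good', '/elsewhere/good.py', 'PYMODULE')])  # duplicate name: ignored
pruned = toc - [('badmodule', '', '')]
```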
To add files to a TOC, you need to know about the typecodes (or the step using +the TOC won't know what to do with the entry).
typecode     description                                         name                   path
'EXTENSION'  An extension module.                                Python internal name.  Full path name in build.
'PYSOURCE'   A script.                                           Python internal name.  Full path name in build.
'PYMODULE'   A pure Python module (including __init__ modules).  Python internal name.  Full path name in build.
'PYZ'        A .pyz archive (archive_rt.ZlibArchive).            Runtime name.          Full path name in build.
'PKG'        A pkg archive (carchive4.CArchive).                 Runtime name.          Full path name in build.
'BINARY'     A shared library.                                   Runtime name.          Full path name in build.
'DATA'       Arbitrary files.                                    Runtime name.          Full path name in build.
'OPTION'     A runtime option (frozen into the executable).      The option.            Unused.
You can force the include of any file in much the same way you do excludes:
++collect = COLLECT(a.binaries + + [('readme', '/my/project/readme', 'DATA')], ...) ++
or even:
++collect = COLLECT(a.binaries, + [('readme', '/my/project/readme', 'DATA')], ...) ++
(that is, you can use a list of tuples in place of a TOC in most cases).
+There's not much reason to use this technique for PYSOURCE, since an Analysis +takes a list of scripts as input. For PYMODULEs and EXTENSIONs, the hook +mechanism discussed here is better because you won't have to remember how you +got it working next time.
+This technique is most useful for data files (see the Tree class below for a +way to build a TOC from a directory tree), and for runtime options. The options +the run executables understand are:
Option  Description       Example                     Notes
v       Verbose imports   ('v', '', 'OPTION')         Same as Python -v ...
u       Unbuffered stdio  ('u', '', 'OPTION')         Same as Python -u ...
W spec  Warning option    ('W ignore', '', 'OPTION')  Python 2.1+ only.
s       Use site.py       ('s', '', 'OPTION')         The opposite of Python's -S flag. Note that site.py must be in the executable's directory to be used.
f       Force execvp      ('f', '', 'OPTION')         Linux/unix only. Ensures that LD_LIBRARY_PATH is set properly.
Advanced users should note that by using set differences and intersections, it +becomes possible to factor out common modules, and deploy a project containing +multiple executables with minimal redundancy. You'll need some top level code +in each executable to mount the common PYZ.
+ ++Analysis(scripts, pathex=None, hookspath=None, excludes=None) ++
An Analysis has three outputs, all TOCs accessed as attributes of the Analysis.
++PYZ(toc, name=None, level=9) ++
Generally, you will not need to create your own PKGs, as the EXE will do it for +you. This is one way to include read-only data in a single-file deployment, +however. A single-file deployment including TK support will use this technique.
++PKG(toc, name=None, cdict=None, exclude_binaries=0) ++
+EXE(*args, **kws) ++
Possible keyword arguments:
+There are actually two EXE classes - one for ELF platforms (where the +bootloader, that is the run executable, and the PKG are concatenated), +and one for non-ELF platforms (where the run executable is simply renamed, and +expects a exename.pkg in the same directory). Which class becomes available +as EXE is determined by a flag in config.dat. This flag is set to +non-ELF when using Make.py -n.
On Windows, this provides support for doing in-process COM servers. It is not generalized. However, embedders can follow the same model to build a special purpose DLL so the Python support in their app is hidden. You will need to write your own dll, but thanks to Allan Green for refactoring the C code and making that a manageable task.
+ ++COLLECT(*args, **kws) ++
Possible keyword arguments:
++Tree(root, prefix=None, excludes=None) ++
A list of names to exclude. Two forms are allowed:
+Since a Tree is a TOC, you can also use the exclude technique described above +in the section on TOCs.
+ +When an Analysis step runs, it produces a warnings file (named warnproject.txt) +in the spec file's directory. Generally, most of these warnings are harmless. +For example, os.py (which is cross-platform) works by figuring out what +platform it is on, then importing (and rebinding names from) the appropriate +platform-specific module. So analyzing os.py will produce a set of warnings +like:
++W: no module named dos (conditional import by os) +W: no module named ce (conditional import by os) +W: no module named os2 (conditional import by os) ++
Note that the analysis has detected that the import is within a conditional block (an if statement). The analysis also detects whether an import is within a function or class (delayed), or at the top level. A top-level, non-conditional import failure is really a hard error. There's at least a reasonable chance that a conditional and / or delayed import will be handled gracefully at runtime.
+Ignorable warnings may also be produced when a class or function is declared in +a package (an __init__.py module), and the import specifies +package.name. In this case, the analysis can't tell if name is supposed to +refer to a submodule of package.
+Warnings are also produced when an __import__, exec or eval statement is +encountered. The __import__ warnings should almost certainly be investigated. +Both exec and eval can be used to implement import hacks, but usually their use +is more benign.
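The kind of code that produces these ignorable warnings looks like os.py's platform switch; a minimal sketch (the module names here are only illustrative):

```python
import sys

# Analyzing this module on a platform where the first branch is not taken
# produces a harmless "no module named ... (conditional import)" warning,
# because the analysis cannot resolve the module for the other platform.
if sys.platform.startswith('os2'):
    import os2emxpath as path_impl   # only resolvable on OS/2 builds
else:
    import posixpath as path_impl

def join_parts(*parts):
    return path_impl.join(*parts)
```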
+Any problem detected here can be handled by hooking the analysis of the module. +See Listing Hidden Imports below for how to do it.
+ +Setting debug=1 on an EXE will cause the executable to put out progress +messages (for console apps, these go to stdout; for Windows apps, these show as +MessageBoxes). This can be useful if you are doing complex packaging, or your +app doesn't seem to be starting, or just to learn how the runtime works.
+ +You can also pass a -v (verbose imports) flag to the embedded Python. This can +be extremely useful. I usually try it even on apparently working apps, just to +make sure that I'm always getting my copies of the modules and no import has +leaked out to the installed Python.
You set this (like the other runtime options) by feeding a phony TOC entry to the EXE. The easiest way to do this is to change the EXE from:
++EXE(..., anal.scripts, ....) ++
to:
++EXE(..., anal.scripts + [('v', '', 'OPTION')], ...) ++
These messages will always go to stdout, so you won't see them on Windows if +console=0.
+ +When the analysis phase cannot find needed modules, it may be that the code is +manipulating sys.path. The easiest thing to do in this case is tell Analysis +about the new directory through the second arg to the constructor:
++anal = Analysis(['somedir/myscript.py'], + ['path/to/thisdir', 'path/to/thatdir']) ++
In this case, the Analysis will have a search path:
++['somedir', 'path/to/thisdir', 'path/to/thatdir'] + sys.path ++
You can do the same when running Makespec.py:
++Makespec.py --paths=path/to/thisdir;path/to/thatdir ... ++
(on *nix, use : as the path separator).
Hidden imports are fairly common. These can occur when the code is using __import__ (or, perhaps, exec or eval), in which case you will see a warning in the warnproject.txt file. They can also occur when an extension module uses the Python/C API to do an import, in which case Analysis can't detect anything. You can verify that a hidden import is the problem by using Python's verbose imports flag. If the import messages say "module not found", but the warnproject.txt file has no "no module named..." message for the same module, then the problem is a hidden import.
+ +Hidden imports are handled by hooking the module (the one doing the hidden +imports) at Analysis time. Do this by creating a file named hook-module.py +(where module is the fully-qualified Python name, eg, hook-xml.dom.py), and +placing it in the hooks package under PyInstaller's root directory, +(alternatively, you can save it elsewhere, and then use the hookspath arg to +Analysis so your private hooks directory will be searched). Normally, it will +have only one line:
++hiddenimports = ['module1', 'module2'] ++
When the Analysis finds this file, it will proceed exactly as though the module explicitly imported module1 and module2. (Full details on the analysis-time hook mechanism are in the Hooks section).
+If you successfully hook a publicly distributed module in this way, please send +us the hook so we can make it available to others.
+ +Python allows a package to extend the search path used to find modules and +sub-packages through the __path__ mechanism. Normally, a package's __path__ has +only one entry - the directory in which the __init__.py was found. But +__init__.py is free to extend its __path__ to include other directories. For +example, the win32com.shell.shell module actually resolves to +win32com/win32comext/shell/shell.pyd. This is because win32com/__init__.py +appends ../win32comext to its __path__.
+Because the __init__.py is not actually run during an analysis, we use the same +hook mechanism we use for hidden imports. A static list of names won't do, +however, because the new entry on __path__ may well require computation. So +hook-module.py should define a method hook(mod). The mod argument is an +instance of mf.Module which has (more or less) the same attributes as a real +module object. The hook function should return a mf.Module instance - perhaps +a brand new one, but more likely the same one used as an arg, but mutated. +See mf.py: A Modulefinder Replacement for details, and hooks/hook-win32com.py +for an example.
+Note that manipulations of __path__ hooked in this way apply to the analysis, +and only the analysis. That is, at runtime win32com.shell is resolved the same +way as win32com.anythingelse, and win32com.__path__ knows nothing of ../win32comext.
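A hook-module.py using the hook(mod) mechanism might look like the following sketch. A real hook receives an mf.Module instance (see hooks/hook-win32com.py); the attribute handling below mirrors the win32comext example from the text, but the details are our illustration:

```python
import os

def hook(mod):
    """Extend the package's __path__ at analysis time, mirroring what
    its __init__.py would do at runtime (e.g. appending ../win32comext)."""
    pkg_dir = os.path.dirname(mod.__file__)
    extra = os.path.normpath(os.path.join(pkg_dir, '..', 'win32comext'))
    mod.__path__.append(extra)       # mutate the module we were given...
    return mod                       # ...and hand the same instance back
```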
Once in a while, that's not enough.
More bizarre situations can be accommodated with runtime hooks. These are small scripts that manipulate the environment before your main script runs, effectively providing additional top-level code to your script.
+At the tail end of an analysis, the module list is examined for matches in +rthooks.dat, which is the string representation of a Python dictionary. The +key is the module name, and the value is a list of hook-script pathnames.
+So putting an entry:
++'somemodule': ['path/to/somescript.py'], ++
into rthooks.dat is almost the same thing as doing this:
++anal = Analysis(['path/to/somescript.py', 'main.py'], ... ++
except that, in using the hook, path/to/somescript.py will not be analyzed (that's not a feature - we just haven't found a sane way to fit the recursion into the persistence scheme).
+Hooks done in this way, while they need to be careful of what they import, are +free to do almost anything. One provided hook sets things up so that win32com +can generate modules at runtime (to disk), and the generated modules can be +found in the win32com package.
+ +In most sophisticated apps, it becomes necessary to figure out (at runtime) +whether you're running "live" or "frozen". For example, you might have a +configuration file that (running "live") you locate based on a module's +__file__ attribute. That won't work once the code is packaged up. You'll +probably want to look for it based on sys.executable instead.
+The bootloaders set sys.frozen=1 (and, for in-process COM servers, the +embedding DLL sets sys.frozen='dll').
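The live-versus-frozen check described above is commonly written like the following sketch (config.txt is a hypothetical data file name, and the function name is ours):

```python
import os
import sys

def config_location(module_file):
    """Locate a data file both in development and in a frozen deployment.

    module_file is the calling module's __file__ in the live case.
    """
    if getattr(sys, 'frozen', 0):                # set by the bootloader (1 or 'dll')
        base = os.path.dirname(sys.executable)   # look next to the frozen exe
    else:
        base = os.path.dirname(os.path.abspath(module_file))  # next to the module
    return os.path.join(base, 'config.txt')
```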
+For really advanced users, you can access the iu.ImportManager as +sys.importManager. See iu.py for how you might make use of this fact.
+ +In a --onedir distribution, this is easy: pass a list of your data files +(in TOC format) to the COLLECT, and they will show up in the distribution +directory tree. The name in the (name, path, 'DATA') tuple can be a relative +path name. Then, at runtime, you can use code like this to find the file:
++os.path.join(os.path.dirname(sys.executable), relativename)) ++
In a --onefile, it's a bit trickier. You can cheat, and add the files to the EXE as BINARY. They will then be extracted at runtime into the work directory by the C code (which does not create directories, so the name must be a plain name), and cleaned up on exit. The work directory is best found via os.environ['_MEIPASS2']. Be aware, though, that if you use --strip or --upx, strange things may happen to your data - BINARY is really for shared libs / dlls.
+If you add them as 'DATA' to the EXE, then it's up to you to extract them. Use +code like this:
++import sys, carchive +this = carchive.CArchive(sys.executable) +data = this.extract('mystuff')[1] ++
to get the contents as a binary string. See support/unpackTK.py for an advanced +example (the TCL and TK lib files are in a PKG which is opened in place, and +then extracted to the filesystem).
+ +Pmw comes with a script named bundlepmw in the bin directory. If you follow the +instructions in that script, you'll end up with a module named Pmw.py. Ensure +that Builder finds that module and not the development package.
+ +If you're using popen on Windows and want the code to work on Win9x, you'll +need to distribute win9xpopen.exe with your app. On older Pythons with +Win32all, this would apply to Win32pipe and win32popenWin9x.exe. (On yet older +Pythons, no form of popen worked on Win9x).
The ELF executable format (Linux and some other Unixes) - and likewise the PE/COFF format on Windows - allows arbitrary data to be concatenated to the end of the executable without disturbing its functionality. For this reason, a CArchive's Table of Contents is at the end of the archive. The executable can open itself as a binary file, seek to the end and 'open' the CArchive (see figure 3).
+On other platforms, the archive and the executable are separate, but the +archive is named executable.pkg, and expected to be in the same directory. +Other than that, the process is the same.
+ +In a single directory deployment (--onedir, which is the default), all of the +binaries are already in the file system. In that case, the embedding app:
+There are a couple situations which require two passes:
+The first pass:
+The child process executes as in One Pass Execution above (the magic +environment variable is what tells it that this is pass two).
+figure 3 - Self Extracting Executable
There are, of course, quite a few differences between the Windows and Unix/Linux versions. The major one is that, because all of Python on Windows is in pythonXX.dll and dynamic loading is so simple-minded, one binary can be used with any version of Python. There's much in common, though, and that C code can be found in source/common/launch.c.
+The Unix/Linux build process (which you need to run just once for any version +of Python) makes use of the config information in your install (if you +installed from RPM, you need the Python-development RPM). It also overrides +getpath.c since we don't want it hunting around the filesystem to build +sys.path.
+In both cases, while one PyInstaller download can be used with any Python +version, you need to have separate installations for each Python version.
+ +You know what an archive is: a .tar file, a .jar file, a .zip file. Two kinds +of archives are used here. One is equivalent to a Java .jar file - it allows +Python modules to be stored efficiently and, (with some import hooks) imported +directly. This is a ZlibArchive. The other (a CArchive) is equivalent to a +.zip file - a general way of packing up (and optionally compressing) arbitrary +blobs of data. It gets its name from the fact that it can be manipulated easily +from C, as well as from Python. Both of these derive from a common base class, +making it fairly easy to create new kinds of archives.
+ +A ZlibArchive contains compressed .pyc (or .pyo) files. The Table of Contents +is a marshalled dictionary, with the key (the module's name as given in an +import statement) associated with a seek position and length. Because it is +all marshalled Python, ZlibArchives are completely cross-platform.
+A ZlibArchive hooks in with iu.py so that, with a little setup, the archived +modules can be imported transparently. Even with compression at level 9, this +works out to being faster than the normal import. Instead of searching +sys.path, there's a lookup in the dictionary. There's no stat-ing of the .py +and .pyc and no file opens (the file is already open). There's just a seek, a +read and a decompress. A traceback will point to the source file the archive +entry was created from (the __file__ attribute from the time the .pyc was +compiled). On a user's box with no source installed, this is not terribly +useful, but if they send you the traceback, at least you can make sense of it.
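The import-by-dict-lookup described above can be sketched in a few lines of Python. The layout and function names below are invented for illustration; they are not PyInstaller's actual on-disk format.

```python
import io
import marshal
import zlib

def build_archive(modules):
    """Pack compressed marshalled code objects; the TOC is a dict
    mapping module name -> (seek position, length), as in the text.

    modules: dict mapping module name -> source text.
    """
    buf = io.BytesIO()
    toc = {}
    for name, source in modules.items():
        code = compile(source, "<%s>" % name, "exec")
        blob = zlib.compress(marshal.dumps(code), 9)
        toc[name] = (buf.tell(), len(blob))
        buf.write(blob)
    return buf.getvalue(), toc

def load_code(data, toc, name):
    """No sys.path search, no stat-ing of .py/.pyc, no file opens:
    just a dict lookup, a seek/read (here a slice) and a decompress."""
    pos, length = toc[name]
    return marshal.loads(zlib.decompress(data[pos:pos + length]))

data, toc = build_archive({"greet": "def hello():\n    return 'hi'\n"})
ns = {}
exec(load_code(data, toc, "greet"), ns)
print(ns["hello"]())   # -> hi
```

Because marshal output is platform independent, an archive built this way is as cross-platform as the ZlibArchive the text describes.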
+ + +A CArchive contains whatever you want to stuff into it. It's very much like a +.zip file. They are easy to create in Python and unpack from C code. CArchives +can be appended to other files (like ELF and COFF executables, for example). +To allow this, they are opened from the end, so the TOC for a CArchive is at +the back, followed only by a cookie that tells you where the TOC starts and +where the archive itself starts.
+CArchives can also be embedded within other CArchives. The inner archive can be +opened in place (without extraction).
+Each TOC entry is variable length. The first field in the entry tells you the +length of the entry. The last field is the name of the corresponding packed +file; the name is null-terminated. Compression is optional for each member.
+There is also a type code associated with each entry. If you're using a +CArchive as a .zip file, you don't need to worry about this. The type codes +are used by the self-extracting executables.
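The open-from-the-end trick can be illustrated with an invented binary layout. PyInstaller's real cookie and TOC carry more fields (a magic number, compression flags, and the type codes just mentioned); everything below is a simplified sketch.

```python
import struct

# Trailing fixed-size cookie: (TOC offset, archive start offset).
COOKIE = struct.Struct("!II")

def append_archive(base, members):
    """Append packed members, a TOC, and a cookie to arbitrary bytes
    (e.g. an ELF or COFF executable image)."""
    out = bytearray(base)
    archive_start = len(out)
    entries = []
    for name, blob in members:
        entries.append((len(out), len(blob), name))
        out += blob
    toc_offset = len(out)
    for pos, length, name in entries:
        # As in the text: entry length first, null-terminated name last.
        body = struct.pack("!II", pos, length) + name.encode() + b"\0"
        out += struct.pack("!I", 4 + len(body)) + body
    out += COOKIE.pack(toc_offset, archive_start)
    return bytes(out)

def read_cookie(data):
    # Opened from the end: the last COOKIE.size bytes say where to seek.
    return COOKIE.unpack(data[-COOKIE.size:])

exe = append_archive(b"...executable image...", [("spam.pyc", b"12345")])
toc_offset, archive_start = read_cookie(exe)
```

Reading the cookie first is what lets a CArchive be appended to an executable, or even nested inside another CArchive and opened in place.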
+ + +PyInstaller is mainly distributed under the +GPL License +but it has an exception such that you can use it to compile commercial products.
+In a nutshell, the license is GPL for the source code with the exception that:
++++
+- You may use PyInstaller to compile commercial applications out of your +source code.
+- The resulting binaries generated by PyInstaller from your source code can be +shipped with whatever license you want.
+- You may modify PyInstaller for your own needs, but these changes to the +PyInstaller source code fall under the terms of the GPL license. In other +words, any modifications will have to be distributed under the GPL.
+
For updated information or clarification, see the FAQ at the PyInstaller +home page: http://pyinstaller.hpcf.upr.edu
+ +Module mf is modelled after iu.
+It also uses ImportDirectors and Owners to partition the import name space. +Except for the fact that these return Module instances instead of real module +objects, they are identical.
+Instead of an ImportManager, mf has an ImportTracker managing things.
+ +ImportTracker can be called in two ways: analyze_one(name, importername=None) +or analyze_r(name, importername=None). The second method does what modulefinder +does - it recursively finds all the module names that importing name would +cause to appear in sys.modules. The first method is non-recursive. This is +useful, because it is the only way of answering the question "Who imports +name?" But since it is somewhat unrealistic (very few real imports do not +involve recursion), it deserves some explanation.
+ +When a name is imported, there are structural and dynamic effects. The dynamic +effects are due to the execution of the top-level code in the module (or +modules) that get imported. The structural effects have to do with whether the +import is relative or absolute, and whether the name is a dotted name (if there +are N dots in the name, then N+1 modules will be imported even without any code +running).
+The analyze_one method determines the structural effects, and defers the +dynamic effects. For example, analyze_one("B.C", "A") could return ["B", "B.C"] +or ["A.B", "A.B.C"] depending on whether the import turns out to be relative or +absolute. In addition, ImportTracker's modules dict will have Module instances +for them.
+ +There are Module subclasses for builtins, extensions, packages and (normal) +modules. Besides the normal module object attributes, they have an attribute +imports. For packages and normal modules, imports is a list populated by +scanning the code object (and therefore, the names in this list may be relative +or absolute names - we don't know until they have been analyzed).
+The highly astute will notice that there is a hole in analyze_one() here. The +first thing that happens when B.C is being imported is that B is imported and +its top-level code executed. That top-level code can do various things so that +when the import of B.C finally occurs, something completely different happens +(from what a structural analysis would predict). But mf can handle this through +its hooks mechanism.
+ +Like modulefinder, mf scans the byte code of a module, looking for imports. In +addition, mf will pick out a module's __all__ attribute, if it is built as a +list of constant names. This means that if a package declares an __all__ list +as a list of names, ImportTracker will track those names if asked to analyze +package.*. The code scan also notes the occurrence of __import__, exec and eval, +and can issue warnings when they're found.
+The code scanning also keeps track (as well as it can) of the context of an +import. It recognizes when imports are found at the top-level, and when they +are found inside definitions (deferred imports). Within that, it also tracks +whether the import is inside a condition (conditional imports).
+ +In modulefinder, scanning the code takes the place of executing the code +object. mf goes further and allows a module to be hooked (after it has been +scanned, but before analyze_one is done with it). A hook is a module named +hook-fullyqualifiedname in the hooks package. These modules should have one or +more of the following three global names defined:
+The first hook (hiddenimports) extends the list created by scanning the code. +ExtensionModules, of course, don't get scanned, so this is the only way of +recording any imports they do.
+The second hook (attrs) exists mainly so that ImportTracker won't issue +spurious warnings when the rightmost node in a dotted name turns out to be an +attribute in a package module, instead of a missing submodule.
+The callable hook exists for things like dynamic modification of a package's +__path__ or perverse situations, like xml.__init__ replacing itself in +sys.modules with _xmlplus.__init__. (It takes nine hook modules to properly +trace through PyXML-using code, and I can't believe that it's any easier for +the poor programmer using that package). The hook(mod) (if it exists) is +called before looking at the others - that way it can, for example, test +sys.version and adjust what's in hiddenimports.
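In this scheme, a hook module is an ordinary Python file defining some of those globals. A hypothetical sketch follows; the package name and values are invented, not taken from any real hook.

```python
# hook-some.package.py -- hypothetical hook module for a package
# named "some.package".  The three global names follow the text;
# the values are illustrative only.

# Imports the code scan cannot see (e.g. done inside an extension
# module, which never gets scanned).
hiddenimports = ['some.package._speedups']

# Names that are attributes of the package, not missing submodules,
# so ImportTracker won't issue spurious warnings about them.
attrs = [('VERSION', '1.0')]

def hook(mod):
    # Called before the other two are consulted, with the Module
    # instance; it may inspect sys.version and adjust its answers.
    import sys
    if sys.version_info[:2] >= (2, 0):
        mod.hiddenimports = hiddenimports
    return mod
```

The `hook(mod)` callable is the escape hatch for the dynamic effects that pure structural analysis cannot predict.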
+ +ImportTracker has a getwarnings() method that returns all the warnings +accumulated by the instance, and by the Module instances in its modules dict. +Generally, it is ImportTracker who will accumulate the warnings generated +during the structural phase, and Modules that will get the warnings generated +during the code scan.
+Note that by using a hook module, you can silence some particularly tiresome +warnings, but not all of them.
+ +Once a full analysis (that is, an analyze_r call) has been done, you can get a +cross reference by using getxref(). This returns a list of tuples. Each tuple +is (modulename, importers), where importers is a list of the (fully qualified) +names of the modules importing modulename. Both the returned list and the +importers list are sorted.
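The shape of that return value can be sketched with ordinary dicts. The input data and the helper name below are invented for illustration; the real method walks ImportTracker's modules dict.

```python
def make_xref(imports_by_module):
    """imports_by_module: dict mapping importer -> imported names.
    Returns sorted (modulename, importers) tuples, with each
    importers list itself sorted, as getxref() is described to do."""
    xref = {}
    for importer, imported in imports_by_module.items():
        for name in imported:
            xref.setdefault(name, []).append(importer)
    return sorted((name, sorted(who)) for name, who in xref.items())

print(make_xref({"os": ["sys", "stat"], "shutil": ["os", "sys"]}))
# -> [('os', ['shutil']), ('stat', ['os']), ('sys', ['os', 'shutil'])]
```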
+ +A simple example follows:
++++>>> import mf +>>> a = mf.ImportTracker() +>>> a.analyze_r("os") +['os', 'sys', 'posixpath', 'nt', 'stat', 'string', 'strop', +'re', 'pcre', 'ntpath', 'dospath', 'macpath', 'win32api', +'UserDict', 'copy', 'types', 'repr', 'tempfile'] +>>> a.analyze_one("os") +['os'] +>>> a.modules['string'].imports +[('strop', 0, 0), ('strop.*', 0, 0), ('re', 1, 1)] +>>> ++
The tuples in the imports list are (name, delayed, conditional).
+++ ++>>> for w in a.modules['string'].warnings: print w +... +W: delayed eval hack detected at line 359 +W: delayed eval hack detected at line 389 +W: delayed eval hack detected at line 418 +>>> for w in a.getwarnings(): print w +... +W: no module named pwd (delayed, conditional import by posixpath) +W: no module named dos (conditional import by os) +W: no module named os2 (conditional import by os) +W: no module named posix (conditional import by os) +W: no module named mac (conditional import by os) +W: no module named MACFS (delayed, conditional import by tempfile) +W: no module named macfs (delayed, conditional import by tempfile) +W: top-level conditional exec statment detected at line 47 + - os (C:\Program Files\Python\Lib\os.py) +W: delayed eval hack detected at line 359 + - string (C:\Program Files\Python\Lib\string.py) +W: delayed eval hack detected at line 389 + - string (C:\Program Files\Python\Lib\string.py) +W: delayed eval hack detected at line 418 + - string (C:\Program Files\Python\Lib\string.py) +>>> ++
Module iu grows out of the pioneering work that Greg Stein did with imputil +(actually, it includes some verbatim imputil code, but since Greg didn't +copyright it, we won't mention it). Both modules can take over Python's +builtin import and ease writing of at least certain kinds of import hooks.
+iu differs from imputil: +* faster +* better emulation of builtin import +* more manageable
+There is an ImportManager which provides the replacement for builtin import +and hides all the semantic complexities of a Python import request from its +delegates.
+ +ImportManager formalizes the concept of a metapath. This concept implicitly +exists in native Python in that builtins and frozen modules are searched +before sys.path (on Windows there's also a search of the registry, while on +Mac, resources may be searched). This metapath is a list populated with +ImportDirector instances. There are ImportDirector subclasses for builtins, +frozen modules, (on Windows) modules found through the registry and a +PathImportDirector for handling sys.path. For a top-level import (that is, not +an import of a module in a package), ImportManager tries each director on its +metapath until one succeeds.
+ImportManager hides the semantic complexity of an import from the directors. +It's up to the ImportManager to decide if an import is relative or absolute; +to see if the module has already been imported; to keep sys.modules up to +date; to handle the fromlist and return the correct module object.
+ +An ImportDirector just needs to respond to getmod(name) by returning a module +object or None. As you will see, an ImportDirector can consider name to be +atomic - it has no need to examine name to see if it is dotted.
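That division of labor can be sketched in a few lines. The class and method names follow the text; the bodies are invented for illustration and omit everything the real iu module handles.

```python
import sys
import types

class BuiltinImportDirector:
    def getmod(self, name):
        # name is atomic -- no need to check for dots.
        if name in sys.builtin_module_names:
            return __import__(name)
        return None

class DemoPathImportDirector:
    """Stands in for a real PathImportDirector; knows one module."""
    def getmod(self, name):
        return types.ModuleType(name) if name == "demo" else None

class ImportManager:
    def __init__(self):
        # The metapath: directors tried in order, as described above.
        self.metapath = [BuiltinImportDirector(),
                         DemoPathImportDirector()]

    def importmod(self, name):
        for director in self.metapath:
            mod = director.getmod(name)
            if mod is not None:
                return mod
        raise ImportError(name)
```

Everything a director does not have to worry about (relative vs. absolute, sys.modules bookkeeping, fromlist handling) stays in the ImportManager.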
+To see how this works, we need to examine the PathImportDirector.
+ +The PathImportDirector subclass manages a list of names - most notably, +sys.path. To do so, it maintains a shadowpath - a dictionary mapping the names +on its pathlist (e.g., sys.path) to their associated Owners. (It could do this +directly, but the assumption that sys.path is occupied solely by strings seems +ineradicable.) Owners of the appropriate kind are created as needed (if all +your imports are satisfied by the first two elements of sys.path, the +PathImportDirector's shadowpath will only have two entries).
+ +An Owner is much like an ImportDirector but manages a much more concrete piece +of turf. For example, a DirOwner manages one directory. Since there are no +other officially recognized filesystem-like namespaces for importing, that's +all that's included in iu, but it's easy to imagine Owners for zip files +(and I have one for my own .pyz archive format) or even URLs.
+As with ImportDirectors, an Owner just needs to respond to getmod(name) by +returning a module object or None, and it can consider name to be atomic.
+So structurally, we have a tree, rooted at the ImportManager. At the next +level, we have a set of ImportDirectors. At least one of those directors, the +PathImportDirector in charge of sys.path, has another level beneath it, +consisting of Owners. This much of the tree covers the entire top-level import +namespace.
+The rest of the import namespace is covered by treelets, each rooted in a +package module (an __init__.py).
+ +To make this work, Owners need to recognize when a module is a package. For a +DirOwner, this means that name is a subdirectory which contains an __init__.py. +The __init__ module is loaded and its __path__ is initialized with the +subdirectory. Then, a PathImportDirector is created to manage this __path__. +Finally the new PathImportDirector's getmod is assigned to the package's +__importsub__ function.
+When a module within the package is imported, the request is routed (by the +ImportManager) directly to the package's __importsub__. In a hierarchical +namespace (like a filesystem), this means that __importsub__ (which is really +the bound getmod method of a PathImportDirector instance) needs only the +module name, not the package name or the fully qualified name. And that's +exactly what it gets. (In a flat namespace - like most archives - it is +perfectly easy to route the request back up the package tree to the archive +Owner, qualifying the name at each step.)
+ +Let's say we want to import from zip files. So, we subclass Owner. The +__init__ method should take a filename, and raise a ValueError if the file is +not an acceptable .zip file (when a new name is encountered on sys.path or a +package's __path__, registered Owners are tried until one accepts the name). +The getmod method would check the zip file's contents and return None if the +name is not found. Otherwise, it would extract the marshalled code object from +the zip, create a new module object and perform a bit of initialization (12 +lines of code all told for my own archive format, including initializing a +package with its __importsub__).
+Once the new Owner class is registered with iu, you can put a zip file on +sys.path. A package could even put a zip file on its __path__.
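A sketch of such a zip-file Owner follows. The base-class shape and the registration step are assumed from the text, not copied from iu's real source, and members are stored as .py source rather than marshalled bytecode to keep the sketch version independent.

```python
import types
import zipfile

class Owner:                       # stand-in for iu's Owner base class
    def __init__(self, path):
        self.path = path

    def getmod(self, name):
        return None

class ZipOwner(Owner):
    def __init__(self, path):
        # Reject anything that isn't a zip, so the next registered
        # Owner class gets a chance at the name.
        if not zipfile.is_zipfile(path):
            raise ValueError("%s is not a zip file" % path)
        Owner.__init__(self, path)
        self.zf = zipfile.ZipFile(path)

    def getmod(self, name):
        # name is atomic; zipfile.read raises KeyError when missing.
        try:
            source = self.zf.read(name + ".py")
        except KeyError:
            return None
        mod = types.ModuleType(name)
        exec(compile(source, name + ".py", "exec"), mod.__dict__)
        return mod
```

Registered with iu, a class like this is what would let a zip file sit directly on sys.path, as the next paragraph describes.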
+ +This code has been tested with the PyXML, mxBase and Win32 packages, covering +over a dozen import hacks from manipulations of __path__ to replacing a module +in sys.modules with a different one. Emulation of Python's native import is +nearly exact, including the names recorded in sys.modules and module attributes +(packages imported through iu have an extra attribute - __importsub__).
+ +In most cases, iu is slower than builtin import (by 15 to 20%) but faster than +imputil (by 15 to 20%). By inserting archives at the front of sys.path +containing the standard lib and the package being tested, this can be reduced +to 5 to 10% slower (or, on my Python 1.5.2 box, 10% faster!) than builtin import. A bit +more can be shaved off by manipulating the ImportManager's metapath.
+ +This module makes no attempt to facilitate policy import hacks. It is easy to +implement certain kinds of policies within a particular domain, but +fundamentally iu works by dividing up the import namespace into independent +domains.
+Quite simply, I think cross-domain import hacks are a very bad idea. As author +of the original package on which PyInstaller is based, McMillan worked with +import hacks for many years. Many of them are highly fragile; they often rely +on undocumented (maybe even accidental) features of the implementation. +A cross-domain import hack is not likely to work with PyXML, for example.
+That rant aside, you can modify ImportManager to implement different policies. +For example, a version that implements three import primitives: absolute +import, relative import and recursive-relative import. No idea what the Python +syntax for those should be, but __aimport__, __rimport__ and __rrimport__ were +easy to implement.
+Here's a simple example of using iu as a builtin import replacement.
+++ ++>>> import iu +>>> iu.ImportManager().install() +>>> +>>> import DateTime +>>> DateTime.__importsub__ +<method PathImportDirector.getmod + of PathImportDirector instance at 825900> +>>> ++