FCL/sftools/ana/staticanamdw: srcanamdw/codescanner/pyinstaller/doc/source/Manual.rst@22878952f6e2


==================
PyInstaller Manual
==================
:Author: William Caban (based on Gordon McMillan's manual)
:Contact: william@hpcf.upr.edu
:Revision: $Rev: 257 $
:Source URL: $HeadURL: http://svn.pyinstaller.python-hosting.com/trunk/doc/source/Manual.rst $
:Copyright: This document has been placed in the public domain.

.. contents::


Getting Started
+++++++++++++++

Installing PyInstaller
----------------------

First, unpack the archive on you path of choice. Installer is **not** a Python
package, so it doesn't need to go in site-packages, or have a .pth file. For
the purpose of this documentation we will assume |install_path|. You will be
using a couple of scripts in the |install_path| directory, and these will find
everything they need from their own location. For convenience, keep the paths
to these scripts short (don't install in a deeply nested subdirectory).

|PyInstaller| is dependant to the version of python you configure it for. In
other words, you will need a separate copy of |PyInstaller| for each Python
version you wish to work with *or* you'll need to rerun ``Configure.py`` every
time you switch the Python version).

|GOBACK|


Building the runtime executables
--------------------------------

*Note:* Windows users can skip this step, because all of Python is contained in
pythonXX.dll, and |PyInstaller| will use your pythonXX.dll.

On Linux the first thing to do is build the runtime executables.

Change to the |install_path| ``source/linux`` subdirectory. Run ``Make.py
[-n|-e]`` and then make. This will produce ``support/loader/run`` and
``support/loader/run_d``, which are the bootloaders.

.. sidebar:: Bootloader

   The bootloader (also known as *stub* in literature) is the small program
   which starts up your packaged program. Usually, the archive containing the
   bytecoded modules of your program is simply attended to it. See
   `Self-extracting executables`_ for more details on the process.

*Note:* If you have multiple versions of Python, the Python you use to run
``Make.py`` is the one whose configuration is used.

The ``-n`` and ``-e`` options set a non-elf or elf flag in your ``config.dat``.
As of |InitialVersion|, the executable will try both strategies, and this flag
just sets how you want your executables built. In the elf strategy, the archive
is concatenated to the executable. In the non-elf strategy, the executable
expects an archive with the same name as itself in the executable's directory.
Note that the executable chases down symbolic links before determining it's name
and directory, so putting the archive in the same directory as the symbolic link
will not work.

Windows distributions come with several executables in the ``support/loader``
directory: ``run_*.exe`` (bootloader for regular programs), and
``inprocsrvr_*.dll`` (bootloader for in-process COM servers). To rebuild this,
you need to install Scons_, and then just run ``scons`` from the |install_path|
directory.

|GOBACK|

Configuring your PyInstaller setup
----------------------------------

In the |install_path| directory, run ``Configure.py``. This saves some
information into ``config.dat`` that would otherwise be recomputed every time.
It can be rerun at any time if your configuration changes. It must be run before
trying to build anything.

|GOBACK|


Create a spec file for your project
-----------------------------------

[For Windows COM server support, see section `Windows COM Server Support`_]

The root directory has a script Makespec.py for this purpose::

       python Makespec.py [opts] <scriptname> [<scriptname> ...]

Where allowed OPTIONS are:

-F, --onefile
    produce a single file deployment (see below).

-D, --onedir
    produce a single directory deployment (default).

-K, --tk
    include TCL/TK in the deployment.

-a, --ascii
    do not include encodings. The default (on Python versions with unicode
    support) is now to include all encodings.

-d, --debug
    use debug (verbose) versions of the executables.

-w, --windowed, --noconsole
    Use the Windows subsystem executable, which does not open
    the console when the program is launched. **(Windows only)**

-c, --nowindowed, --console
    Use the console subsystem executable. This is the default. **(Windows only)**

-s, --strip
    the executable and all shared libraries will be run through strip. Note
    that cygwin's strip tends to render normal Win32 dlls unusable.

-X, --upx
    if you have UPX installed (detected by Configure), this will use it to
    compress your executable (and, on Windows, your dlls). See note below.

-o DIR, --out=DIR
    create the spec file in *directory*. If not specified, and the current
    directory is Installer's root directory, an output subdirectory will be
    created. Otherwise the current directory is used.

-p DIR, --paths=DIR
    set base path for import (like using PYTHONPATH). Multiple directories are
    allowed, separating them with the path separator (';' under Windows, ':'
    under Linux), or using this option multiple times.

--icon=<FILE.ICO>
    add *file.ico* to the executable's resources. **(Windows only)**

--icon=<FILE.EXE,N>
    add the *n*-th incon in *file.exe* to the executable's resources. **(Windows
    only)**

-v FILE, --version=FILE
    add verfile as a version resource to the executable. **(Windows only)**

-n NAME, --name=NAME
    optional *name* to assign to the project (from which the spec file name is
    generated). If omitted, the basename of the (first) script is used.

[For building with optimization on (like ``Python -O``), see section
`Building Optimized`_]

For simple projects, the generated spec file will probably be sufficient. For
more complex projects, it should be regarded as a template. The spec file is
actually Python code, and modifying it should be ease. See `Spec Files`_ for
details.


|GOBACK|

Build your project
------------------

::

      python Build.py specfile


A ``buildproject`` subdirectory will be created in the specfile's directory. This
is a private workspace so that ``Build.py`` can act like a makefile. Any named
targets will appear in the specfile's directory. For ``--onedir``
configurations, it will create also ``distproject``, which is the directory you're
interested in. For a ``--onefile``, the executable will be in the specfile's
directory.

In most cases, this will be all you have to do. If not, see `When things go
wrong`_ and be sure to read the introduction to `Spec Files`_.

|GOBACK|

Windows COM Server support
--------------------------

For Windows COM support execute::

       python MakeCOMServer.py [OPTION] script...


This will generate a new script ``drivescript.py`` and a spec file for the script.

These options are allowed:

--debug
    Use the verbose version of the executable.

--verbose
    Register the COM server(s) with the quiet flag off.

--ascii
    do not include encodings (this is passed through to Makespec).

--out <dir>
    Generate the driver script and spec file in dir.

Now `Build your project`_ on the generated spec file.

If you have the win32dbg package installed, you can use it with the generated
COM server. In the driver script, set ``debug=1`` in the registration line.

**Warnings**: the inprocess COM server support will not work when the client
process already has Python loaded. It would be rather tricky to
non-obtrusively hook into an already running Python, but the show-stopper is
that the Python/C API won't let us find out which interpreter instance I should
hook into. (If this is important to you, you might experiment with using
apartment threading, which seems the best possibility to get this to work). To
use a "frozen" COM server from a Python process, you'll have to load it as an
exe::

      o = win32com.client.Dispatch(progid,
                       clsctx=pythoncom.CLSCTX_LOCAL_SERVER)


MakeCOMServer also assumes that your top level code (registration etc.) is
"normal". If it's not, you will have to edit the generated script.

|GOBACK|


Building Optimized
------------------

There are two facets to running optimized: gathering ``.pyo``'s, and setting the
``Py_OptimizeFlag``. Installer will gather ``.pyo``'s if it is run optimized::

       python -O Build.py ...


The ``Py_OptimizeFlag`` will be set if you use a ``('O','','OPTION')`` in one of
the ``TOCs`` building the ``EXE``::

      exe = EXE(pyz,
                a.scripts + [('O','','OPTION')],
                ...

See `Spec Files`_ for details.

|GOBACK|


A Note on using UPX
-------------------

On both Windows and Linux, UPX can give truly startling compression - the days
of fitting something useful on a diskette are not gone forever! Installer has
been tested with many UPX versions without problems. Just get it and install it
on your PATH, then rerun configure.

For Windows, there is a problem of compatibility between UPX and executables
generated by Microsoft Visual Studio .NET 2003 (or the equivalent free
toolkit available for download). This is especially worrisome for users of
Python 2.4+, where most extensions (and Python itself) are compiled with that
compiler. This issue has been fixed in later beta versions of UPX, so you
will need at least UPX 1.92 beta. `Configure.py`_ will check this for you
and complain if you have an older version of UPX and you are using Python 2.4.

.. sidebar:: UPX and Unix

    Under UNIX, old versions of UPX were not able to expand and execute the
    executable in memory, and they were extracting it into a temporary file
    in the filesystem, before spawning it. This is no longer valid under Linux,
    but the information in this paragraph still needs to be updated.

.. _`Configure.py`: `Configuring your PyInstaller setup`_

For Linux, a bit more discussion is in order. First, UPX is only useful on
executables, not shared libs. Installer accounts for that, but to get the full
benefit, you might rebuild Python with more things statically linked.

More importantly, when ``run`` finds that its ``sys.argv[0]`` does not contain a path,
it will use ``/proc/pid/exe`` to find itself (if it can). This happens, for
example, when executed by Apache. If it has been upx-ed, this symbolic link
points to the tempfile created by the upx stub and |PyInstaller| will fail (please
see the UPX docs for more information). So for now, at least, you can't use upx
for CGI's executed by Apache. Otherwise, you can ignore the warnings in the UPX
docs, since what PyInstaller opens is the executable Installer created, not the
temporary upx-created executable.

|GOBACK|

A Note on ``--onefile``
-----------------------

A ``--onefile`` works by packing all the shared libs / dlls into the archive
attached to the bootloader executable (or next to the executable in a non-elf
configuration). When first started, it finds that it needs to extract these
files before it can run "for real". That's because locating and loading a
shared lib or linked-in dll is a system level action, not user-level. With
|PyInstallerVersion| it always uses a temporary directory (``_MEIpid``) in the
user's temp directory. It then executes itself again, setting things up so
the system will be able to load the shared libs / dlls. When executing is
complete, it recursively removes the entire directory it created.

This has a number of implications:

* You can run multiple copies - they won't collide.

* Running multiple copies will be rather expensive to the system (nothing is
  shared).

* If you're using the cheat of adding user data as ``'BINARY'``, it will be in
  ``os.environ['_MEIPASS2']``, not in the executable's directory.

* On Windows, using Task Manager to kill the parent process will leave the
  directory behind.

* On \*nix, a kill -9 (or crash) will leave the directory behind.

* Otherwise, on both platforms, the directory will be recursively deleted.

* So any files you might create in ``os.environ['_MEIPASS2']`` will be deleted.

* The executable can be in a protected or read-only directory.

* If for some reason, the ``_MEIpid`` directory already exists, the executable
  will fail. It is created mode 0700, so only the one user can modify it
  (on \*nix, of course).

While we are not a security expert, we believe the scheme is good enough for
most of the users.

**Notes for \*nix users**: Take notice that if the executable does a setuid root,
a determined hacker could possibly (given enough tries) introduce a malicious
lookalike of one of the shared libraries during the hole between when the
library is extracted and when it gets loaded by the execvp'd process. So maybe
you shouldn't do setuid root programs using ``--onefile``. **In fact, we do not
recomend the use of --onefile on setuid programs.**

|GOBACK|

A Note on .egg files and setuptools
-----------------------------------
`setuptools`_ is a distutils extensions which provide many benefits, including
the ability to distribute the extension as ``egg`` files. Together with the
nifty `easy_install`_ (a tool which automatically locates, downloads and
installs Python extensions), ``egg`` files are becoming more and more
widespread as a way for distributing Python extensions.

``egg`` files are actually ZIP files under the hood, and they rely on the fact
that Python 2.4 is able to transparently import modules stored within ZIP
files. PyInstaller is currently *not* able to import and extract modules
within ZIP files, so code which uses extensions packaged as ``egg`` files
cannot be packaged with PyInstaller.

The workaround is pretty easy: you can use ``easy_install -Z`` at installation
time to ask ``easy_install`` to always decompress egg files. This will allow
PyInstaller to see the files and make the package correctly. If you have already
installed the modules, you can simply decompress them within a directory with
the same name of the ``egg`` file (including also the extension).

Support for ``egg`` files is planned for a future release of PyInstaller.

.. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools
.. _`easy_install`: http://peak.telecommunity.com/DevCenter/EasyInstall


|GOBACK|


PyInstaller Utilities
+++++++++++++++++++++

ArchiveViewer
-------------

::

      python ArchiveViewer.py <archivefile>


ArchiveViewer lets you examine the contents of any archive build with
|PyInstaller| or executable (PYZ, PKG or exe). Invoke it with the target as the
first arg (It has been set up as a Send-To so it shows on the context menu in
Explorer). The archive can be navigated using these commands:

O <nm>
    Open the embedded archive <nm> (will prompt if omitted).

U
    Go up one level (go back to viewing the embedding archive).

X <nm>
    Extract nm (will prompt if omitted). Prompts for output filename. If none
    given, extracted to stdout.

Q
    Quit.


|GOBACK|


bindepend
---------

::

    python bindepend.py <executable_or_dynamic_library>

bindepend will analyze the executable you pass to it, and write to stdout all
its binary dependencies. This is handy to find out which DLLs are required by
an executable or another DLL. This module is used by |PyInstaller| itself to
follow the chain of dependencies of binary extensions and make sure that all
of them get included in the final package.


GrabVersion (Windows)
---------------------

::

      python GrabVersion.py <executable_with_version_resource>


GrabVersion outputs text which can be eval'ed by ``versionInfo.py`` to reproduce
a version resource. Invoke it with the full path name of a Windows executable
(with a version resource) as the first argument. If you cut & paste (or
redirect to a file), you can then edit the version information. The edited
text file can be used in a ``version = myversion.txt`` option on any executable
in an |PyInstaller| spec file.

This was done in this way because version resources are rather strange beasts,
and fully understanding them is probably impossible. Some elements are
optional, others required, but you could spend unbounded amounts of time
figuring this out, because it's not well documented. When you view the version
tab on a properties dialog, there's no straightforward relationship between
how the data is displayed and the structure of the resource itself. So the
easiest thing to do is find an executable that displays the kind of
information you want, grab it's resource and edit it. Certainly easier than
the Version resource wizard in VC++.

|GOBACK|


Analyzing Dependencies
----------------------

You can interactively track down dependencies, including getting
cross-references by using ``mf.py``, documented in section `mf.py: A modulefinder
Replacement`_

|GOBACK|


Spec Files
++++++++++

Introduction
------------

Spec files are in Python syntax. They are evaluated by Build.py. A simplistic
spec file might look like this::

      a = Analysis(['myscript.py'])
      pyz = PYZ(a.pure)
      exe = EXE(pyz, a.scripts, a.binaries, name="myapp.exe")

This creates a single file deployment with all binaries (extension modules and
their dependencies) packed into the executable.

A simplistic single directory deployment might look like this::

      a = Analysis(['myscript.py'])
      pyz = PYZ(a.pure)
      exe = EXE(a.scripts, pyz, name="myapp.exe", exclude_binaries=1)
      dist = COLLECT(exe, a.binaries, name="dist")


Note that neither of these examples are realistic. Use ``Makespec.py`` (documented
in section `Create a spec file for your project`_) to create your specfile,
and tweak it (if necessary) from there.

All of the classes you see above are subclasses of ``Build.Target``. A Target acts
like a rule in a makefile. It knows enough to cache its last inputs and
outputs. If its inputs haven't changed, it can assume its outputs wouldn't
change on recomputation. So a spec file acts much like a makefile, only
rebuilding as much as needs rebuilding. This means, for example, that if you
change an ``EXE`` from ``debug=1`` to ``debug=0``, the rebuild will be nearly
instantaneous.

The high level view is that an ``Analysis`` takes a list of scripts as input,
and generates three "outputs", held in attributes named ``scripts``, ``pure``
and ``binaries``. A ``PYZ`` (a ``.pyz`` archive) is built from the modules in
pure. The ``EXE`` is built from the ``PYZ``, the scripts and, in the case of a
single-file deployment, the binaries. In a single-directory deployment, a
directory is built containing a slim executable and the binaries.

|GOBACK|

TOC Class (Table of Contents)
-----------------------------

Before you can do much with a spec file, you need to understand the
``TOC`` (Table Of Contents) class.

A ``TOC`` appears to be a list of tuples of the form (name, path, typecode).
In fact, it's an ordered set, not a list. A TOC contains no duplicates, where
uniqueness is based on name only. Furthermore, within this constraint, a TOC
preserves order.

Besides the normal list methods and operations, TOC supports taking differences
and intersections (and note that adding or extending is really equivalent to
union). Furthermore, the operations can take a real list of tuples on the right
hand side. This makes excluding modules quite easy. For a pure Python module::

      pyz = PYZ(a.pure - [('badmodule', '', '')])


or for an extension module in a single-directory deployment::

      dist = COLLECT(..., a.binaries - [('badmodule', '', '')], ...)


or for a single-file deployment::

      exe = EXE(..., a.binaries - [('badmodule', '', '')], ...)

To add files to a TOC, you need to know about the typecodes (or the step using
the TOC won't know what to do with the entry).

+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| **typecode** 	| **description**					| **name**		| **path**			|
+===============+=======================================================+=======================+===============================+
| 'EXTENSION' 	| An extension module.					| Python internal name.	| Full path name in build.	|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| 'PYSOURCE'	| A script.						| Python internal name.	| Full path name in build.	|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| 'PYMODULE'	| A pure Python module (including __init__ modules).	| Python internal name.	| Full path name in build.	|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| 'PYZ'		| A .pyz archive (archive_rt.ZlibArchive).		| Runtime name.		| Full path name in build.	|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| 'PKG'		| A pkg archive (carchive4.CArchive).			| Runtime name. 	| Full path name in build.	|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| 'BINARY' 	| A shared library. 					| Runtime name. 	| Full path name in build.	|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| 'DATA' 	| Aribitrary files. 					| Runtime name. 	| Full path name in build.	|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+
| 'OPTION' 	| A runtime runtime option (frozen into the executable).| The option.		| Unused.			|
+---------------+-------------------------------------------------------+-----------------------+-------------------------------+

You can force the include of any file in much the same way you do excludes::

      collect = COLLECT(a.binaries +
                [('readme', '/my/project/readme', 'DATA')], ...)


or even::

      collect = COLLECT(a.binaries,
                [('readme', '/my/project/readme', 'DATA')], ...)


(that is, you can use a list of tuples in place of a ``TOC`` in most cases).

There's not much reason to use this technique for ``PYSOURCE``, since an ``Analysis``
takes a list of scripts as input. For ``PYMODULEs`` and ``EXTENSIONs``, the hook
mechanism discussed here is better because you won't have to remember how you
got it working next time.

This technique is most useful for data files (see the ``Tree`` class below for a
way to build a ``TOC`` from a directory tree), and for runtime options. The options
the run executables understand are:

+---------------+-----------------------+-------------------------------+-------------------------------------------------------------------------------------------------------+
| **Option**	| **Description**	| **Example**			| **Notes**												|
+===============+=======================+===============================+=======================================================================================================+
| v 		| Verbose imports	| ('v', '', 'OPTION')		| Same as Python -v ... 										|
+---------------+-----------------------+-------------------------------+-------------------------------------------------------------------------------------------------------+
| u		| Unbuffered stdio	| ('u', '', 'OPTION')		| Same as Python -u ... 										|
+---------------+-----------------------+-------------------------------+-------------------------------------------------------------------------------------------------------+
| W spec	| Warning option	| ('W ignore', '', 'OPTION')	| Python 2.1+ only. 											|
+---------------+-----------------------+-------------------------------+-------------------------------------------------------------------------------------------------------+
| s		| Use site.py		| ('s', '', 'OPTION')		| The opposite of Python's -S flag. Note that site.py must be in the executable's directory to be used. |
+---------------+-----------------------+-------------------------------+-------------------------------------------------------------------------------------------------------+
| f		| Force execvp		| ('f', '', 'OPTION')		| Linux/unix only. Ensures that LD_LIBRARY_PATH is set properly.					|
+---------------+-----------------------+-------------------------------+-------------------------------------------------------------------------------------------------------+

Advanced users should note that by using set differences and intersections, it
becomes possible to factor out common modules, and deploy a project containing
multiple executables with minimal redundancy. You'll need some top level code
in each executable to mount the common ``PYZ``.

|GOBACK|

Target Subclasses
-----------------

Analysis
********

::

      Analysis(scripts, pathex=None, hookspath=None, excludes=None)


``scripts``
    a list of scripts specified as file names.

``pathex``
    an optional list of paths to be searched before sys.path.

``hookspath``
    an optional list of paths used to extend the hooks package.

``excludes``
    an optional list of module or package names (their Python names, not path
    names) that will be ignored (as though they were not found).

An Analysis has three outputs, all ``TOCs`` accessed as attributes of the ``Analysis``.

``scripts``
    The scripts you gave Analysis as input, with any runtime hook scripts
    prepended.

``pure``
    The pure Python modules.

``binaries``
    The extension modules and their dependencies. The secondary dependencies are
    filtered. On Windows, a long list of MS dlls are excluded. On Linux/Unix,
    any shared lib in ``/lib`` or ``/usr/lib`` is excluded.

|GOBACK|

PYZ
***

::

      PYZ(toc, name=None, level=9)


``toc``
    a ``TOC``, normally an ``Analysis.pure``.

``name``
    A filename for the ``.pyz``. Normally not needed, as the generated name will do fine.

``level``
    The Zlib compression level to use. If 0, the zlib module is not required.


|GOBACK|

PKG
***

Generally, you will not need to create your own ``PKGs``, as the ``EXE`` will do it for
you. This is one way to include read-only data in a single-file deployment,
however. A single-file deployment including TK support will use this technique.

::

      PKG(toc, name=None, cdict=None, exclude_binaries=0)


``toc``
    a ``TOC``.

``name``
    a filename for the ``PKG`` (optional).

``cdict``
    a dictionary that specifies compression by typecode. For example, ``PYZ`` is
    left uncompressed so that it can be accessed inside the ``PKG``. The default
    uses sensible values. If zlib is not available, no compression is used.

``exclude_binaries``
    If 1, ``EXTENSIONs`` and ``BINARYs`` will be left out of the ``PKG``, and
    forwarded to its container (usually a ``COLLECT``).

|GOBACK|

EXE
***

::

      EXE(*args, **kws)


``args``
    One or more arguments which are either ``TOCs`` or ``Targets``.

``kws``
    Possible keyword arguments:

    ``console``
        Always 1 on Linux/unix. On Windows, governs whether to use the console
        executable, or the Windows subsystem executable.

    ``debug``
        Setting to 1 gives you progress messages from the executable (for a
        ``console=0``, these will be annoying MessageBoxes).

    ``name``
        The filename for the executable.

    ``exclude_binaries``
        Forwarded to the ``PKG`` the ``EXE`` builds.

    ``icon``
        Windows NT family only. ``icon='myicon.ico'`` to use an icon file, or
        ``icon='notepad.exe,0'`` to grab an icon resource.

    ``version``
        Windows NT family only. ``version='myversion.txt'``. Use ``GrabVersion.py`` to
        steal a version resource from an executable, and then edit the ouput to
        create your own. (The syntax of version resources is so arcane that I
        wouldn't attempt to write one from scratch.)


There are actually two ``EXE`` classes - one for ELF platforms (where the
bootloader, that is the ``run`` executable, and the ``PKG`` are concatenated),
and one for non-ELF platforms (where the run executable is simply renamed, and
expects a ``exename.pkg`` in the same directory). Which class becomes available
as ``EXE`` is determined by a flag in ``config.dat``. This flag is set to
non-ELF when using ``Make.py -n``.

|GOBACK|

DLL
***

On Windows, this provides support for doing in-process COM servers. It is not
generalized. However, embedders can follow the same model to build a special
purpose DLL so the Python support in their app is hidden. You will need to
write your own dll, but thanks to Allan Green for refactoring the C code and
making that a managable task.

|GOBACK|

COLLECT
*******

::

      COLLECT(*args, **kws)


``args``
    One or more arguments which are either ``TOCs`` or ``Targets``.

``kws``
    Possible keyword arguments:

    ``name``
        The name of the directory to be built.

|GOBACK|

Tree
****

::

      Tree(root, prefix=None, excludes=None)


``root``
    The root of the tree (on the build system).

``prefix``
    Optional prefix to the names on the target system.

``excludes``
    A list of names to exclude. Two forms are allowed:

    ``name``
        files with this basename will be excluded (do not include the path).

    ``*.ext``
        any file with the given extension will be excluded.

Since a ``Tree`` is a ``TOC``, you can also use the exclude technique described above
in the section on ``TOCs``.


|GOBACK|

When Things Go Wrong
++++++++++++++++++++

Finding out What Went Wrong
---------------------------

Buildtime Warnings
******************

When an ``Analysis`` step runs, it produces a warnings file (named ``warnproject.txt``)
in the spec file's directory. Generally, most of these warnings are harmless.
For example, ``os.py`` (which is cross-platform) works by figuring out what
platform it is on, then importing (and rebinding names from) the appropriate
platform-specific module. So analyzing ``os.py`` will produce a set of warnings
like::

      W: no module named dos (conditional import by os)
      W: no module named ce (conditional import by os)
      W: no module named os2 (conditional import by os)


Note that the analysis has detected that the import is within a conditional
block (an if statement). The analysis also detects if an import within a
function or class, (delayed) or at the top level. A top-level, non-conditional
import failure is really a hard error. There's at least a reasonable chance
that conditional and / or delayed import will be handled gracefully at runtime.

Ignorable warnings may also be produced when a class or function is declared in
a package (an ``__init__.py`` module), and the import specifies
``package.name``. In this case, the analysis can't tell if name is supposed to
refer to a submodule of package.

Warnings are also produced when an ``__import__``, ``exec`` or ``eval`` statement is
encountered. The ``__import__`` warnings should almost certainly be investigated.
Both ``exec`` and ``eval`` can be used to implement import hacks, but usually their use
is more benign.

Any problem detected here can be handled by hooking the analysis of the module.
See `Listing Hidden Imports`_ below for how to do it.

|GOBACK|

Getting Debug Messages
**********************

Setting ``debug=1`` on an ``EXE`` will cause the executable to put out progress
messages (for console apps, these go to stdout; for Windows apps, these show as
MessageBoxes). This can be useful if you are doing complex packaging, or your
app doesn't seem to be starting, or just to learn how the runtime works.

|GOBACK|

Getting Python's Verbose Imports
********************************

You can also pass a ``-v`` (verbose imports) flag to the embedded Python. This can
be extremely useful. I usually try it even on apparently working apps, just to
make sure that I'm always getting my copies of the modules and no import has
leaked out to the installed Python.

You set this (like the other runtime options) by feeding a phone ``TOC`` entry to
the ``EXE``. The easiest way to do this is to change the ``EXE`` from::

       EXE(..., anal.scripts, ....)

to::

       EXE(..., anal.scripts + [('v', '', 'OPTION')], ...)

These messages will always go to ``stdout``, so you won't see them on Windows if
``console=0``.

|GOBACK|

Helping Installer Find Modules
------------------------------

Extending the Path
******************

When the analysis phase cannot find needed modules, it may be that the code is
manipulating ``sys.path``. The easiest thing to do in this case is tell ``Analysis``
about the new directory through the second arg to the constructor::

       anal = Analysis(['somedir/myscript.py'],
                       ['path/to/thisdir', 'path/to/thatdir'])


In this case, the ``Analysis`` will have a search path::

       ['somedir', 'path/to/thisdir', 'path/to/thatdir'] + sys.path


You can do the same when running ``Makespec.py``::

       Makespec.py --paths=path/to/thisdir;path/to/thatdir ...


(on \*nix, use ``:`` as the path separator).

|GOBACK|

Listing Hidden Imports
**********************

Hidden imports are fairly common. These can occur when the code is using
``__import__`` (or, perhaps ``exec`` or ``eval``), in which case you will see a warning in
the ``warnproject.txt`` file. They can also occur when an extension module uses the
Python/C API to do an import, in which case Analysis can't detect anything. You
can verify that hidden import is the problem by using Python's verbose imports
flag. If the import messages say "module not found", but the ``warnproject.txt``
file has no "no module named..." message for the same module, then the problem
is a hidden import.

.. sidebar:: Standard hidden imports are already included!

    If you are getting worried while reading this paragraph, do not worry:
    having hidden imports is the exception, not the norm! And anyway,
    PyInstaller already ships with a large set of hooks that take care of
    hidden imports for the most common packages out there. For instance,
    PIL_, PyWin32_, PyQt_ are already taken care of.

Hidden imports are handled by hooking the module (the one doing the hidden
imports) at ``Analysis`` time. Do this by creating a file named ``hook-module.py``
(where module is the fully-qualified Python name, eg, ``hook-xml.dom.py``), and
placing it in the ``hooks`` package under |PyInstaller|'s root directory,
(alternatively, you can save it elsewhere, and then use the ``hookspath`` arg to
``Analysis`` so your private hooks directory will be searched). Normally, it will
have only one line::

      hiddenimports = ['module1', 'module2']

When the ``Analysis`` finds this file, it will proceed exactly as though the module
explicitly imported ``module1`` and ``module2``. (Full details on the analysis-time
hook mechanism is in the `Hooks`_ section).

If you successfully hook a publicly distributed module in this way, please send
us the hook so we can make it available to others.

|GOBACK|

Extending a Package's ``__path__``
**********************************

Python allows a package to extend the search path used to find modules and
sub-packages through the ``__path__`` mechanism. Normally, a package's ``__path__`` has
only one entry - the directory in which the ``__init__.py`` was found. But
``__init__.py`` is free to extend its ``__path__`` to include other directories. For
example, the ``win32com.shell.shell`` module actually resolves to
``win32com/win32comext/shell/shell.pyd``. This is because ``win32com/__init__.py``
appends ``../win32comext`` to its ``__path__``.

Because the ``__init__.py`` is not actually run during an analysis, we use the same
hook mechanism we use for hidden imports. A static list of names won't do,
however, because the new entry on ``__path__`` may well require computation. So
``hook-module.py`` should define a method ``hook(mod)``. The mod argument is an
instance of ``mf.Module`` which has (more or less) the same attributes as a real
module object. The hook function should return a ``mf.Module`` instance - perhaps
a brand new one, but more likely the same one used as an arg, but mutated.
See `mf.py: A Modulefinder Replacement`_ for details, and `hooks\/hook-win32com.py`_
for an example.

Note that manipulations of ``__path__`` hooked in this way apply to the analysis,
and only the analysis. That is, at runtime ``win32com.shell`` is resolved the same
way as ``win32com.anythingelse``, and ``win32com.__path__`` knows nothing of ``../win32comext``.

Once in awhile, that's not enough.

|GOBACK|

Changing Runtime Behavior
*************************

More bizarre situations can be accomodated with runtime hooks. These are small
scripts that manipulate the environment before your main script runs,
effectively providing additional top-level code to your script.

At the tail end of an analysis, the module list is examined for matches in
``rthooks.dat``, which is the string representation of a Python dictionary. The
key is the module name, and the value is a list of hook-script pathnames.

So putting an entry::

       'somemodule': ['path/to/somescript.py'],

into ``rthooks.dat`` is almost the same thing as doing this::

       anal = Analysis(['path/to/somescript.py', 'main.py'], ...


except that in using the hook, ``path/to/somescript.py`` will not be analyzed,
(that's not a feature - we just haven't found a sane way fit the recursion into
my persistence scheme).

Hooks done in this way, while they need to be careful of what they import, are
free to do almost anything. One provided hook sets things up so that win32com
can generate modules at runtime (to disk), and the generated modules can be
found in the win32com package.

|GOBACK|

Adapting to being "frozen"
**************************

In most sophisticated apps, it becomes necessary to figure out (at runtime)
whether you're running "live" or "frozen". For example, you might have a
configuration file that (running "live") you locate based on a module's
``__file__`` attribute. That won't work once the code is packaged up. You'll
probably want to look for it based on ``sys.executable`` instead.

The bootloaders set ``sys.frozen=1`` (and, for in-process COM servers, the
embedding DLL sets ``sys.frozen='dll'``).

For really advanced users, you can access the ``iu.ImportManager`` as
``sys.importManager``. See `iu.py`_ for how you might make use of this fact.

|GOBACK|

Accessing Data Files
********************

In a ``--onedir`` distribution, this is easy: pass a list of your data files
(in ``TOC`` format) to the ``COLLECT``, and they will show up in the distribution
directory tree. The name in the ``(name, path, 'DATA')`` tuple can be a relative
path name. Then, at runtime, you can use code like this to find the file::

       os.path.join(os.path.dirname(sys.executable), relativename))


In a ``--onefile``, it's a bit trickier. You can cheat, and add the files to the
``EXE`` as ``BINARY``. They will then be extracted at runtime into the work directory
by the C code (which does not create directories, so the name must be a plain
name), and cleaned up on exit. The work directory is best found by
``os.environ['_MEIPASS2']``. Be awawre, though, that if you use ``--strip`` or ``--upx``,
strange things may happen to your data - ``BINARY`` is really for shared
libs / dlls.

If you add them as ``'DATA'`` to the ``EXE``, then it's up to you to extract them. Use
code like this::

       import sys, carchive
       this = carchive.CArchive(sys.executable)
       data = this.extract('mystuff')[1]


to get the contents as a binary string. See `support\/unpackTK.py`_ for an advanced
example (the TCL and TK lib files are in a PKG which is opened in place, and
then extracted to the filesystem).

|GOBACK|

Miscellaneous
+++++++++++++

Pmw -- Python Mega Widgets
--------------------------

`Pmw`_ comes with a script named ``bundlepmw`` in the bin directory. If you follow the
instructions in that script, you'll end up with a module named ``Pmw.py``. Ensure
that Builder finds that module and not the development package.

|GOBACK|

Win9xpopen
----------

If you're using popen on Windows and want the code to work on Win9x, you'll
need to distribute ``win9xpopen.exe`` with your app. On older Pythons with
Win32all, this would apply to Win32pipe and ``win32popenWin9x.exe``. (On yet older
Pythons, no form of popen worked on Win9x).

|GOBACK|

Self-extracting executables
---------------------------

The ELF executable format (Windows, Linux and some others) allows arbitrary
data to be concatenated to the end of the executable without disturbing its
functionality. For this reason, a ``CArchive``'s Table of Contents is at the end of
the archive. The executable can open itself as a binary file name, seek to the
end and 'open' the ``CArchive`` (see figure 3).

On other platforms, the archive and the executable are separate, but the
archive is named ``executable.pkg``, and expected to be in the same directory.
Other than that, the process is the same.

|GOBACK|

One Pass Execution
******************

In a single directory deployment (``--onedir``, which is the default), all of the
binaries are already in the file system. In that case, the embedding app:

* opens the archive

* starts Python (on Windows, this is done with dynamic loading so one embedding
  app binary can be used with any Python version)

* imports all the modules which are at the top level of the archive (basically,
  bootstraps the import hooks)

* mounts the ``ZlibArchive(s)`` in the outer archive

* runs all the scripts which are at the top level of the archive

* finalizes Python

|GOBACK|

Two Pass Execution
******************

There are a couple situations which require two passes:

* a ``--onefile`` deployment (on Windows, the files can't be cleaned up afterwards
  because Python does not call ``FreeLibrary``; on other platforms, Python won't
  find them if they're extracted in the same process that uses them)

* ``LD_LIBRARY_PATH`` needs to be set to find the binaries (not extension modules,
  but modules the extensions are linked to).

The first pass:

* opens the archive

* extracts all the binaries in the archive (in |PyInstallerVersion|, this is always to a
  temporary directory).

* sets a magic environment variable

* sets ``LD_LIBRARY_PATH`` (non-Windows)

* executes itself as a child process (letting the child use his stdin, stdout
  and stderr)

* waits for the child to exit (on \*nix, the child actually replaces the parent)

* cleans up the extracted binaries (so on \*nix, this is done by the child)

The child process executes as in `One Pass Execution`_ above (the magic
environment variable is what tells it that this is pass two).

|SE_exeImage| figure 3 - Self Extracting Executable

There are, of course, quite a few differences between the Windows and
Unix/Linux versions. The major one is that because all of Python on Windows is
in ``pythonXX.dll``, and dynamic loading is so simple-minded, that one binary can
be use with any version of Python. There's much in common, though, and that C
code can be found in `source/common/launch.c`_.

The Unix/Linux build process (which you need to run just once for any version
of Python) makes use of the config information in your install (if you
installed from RPM, you need the Python-development RPM). It also overrides
``getpath.c`` since we don't want it hunting around the filesystem to build
``sys.path``.

In both cases, while one |PyInstaller| download can be used with any Python
version, you need to have separate installations for each Python version.

|GOBACK|

PyInstaller Archives
++++++++++++++++++++

Archives Introduction
---------------------
You know what an archive is: a ``.tar`` file, a ``.jar`` file, a ``.zip`` file. Two kinds
of archives are used here. One is equivalent to a Java ``.jar`` file - it allows
Python modules to be stored efficiently and, (with some import hooks) imported
directly. This is a ``ZlibArchive``. The other (a ``CArchive``) is equivalent to a
``.zip`` file - a general way of packing up (and optionally compressing) arbitrary
blobs of data. It gets its name from the fact that it can be manipulated easily
from C, as well as from Python. Both of these derive from a common base class,
making it fairly easy to create new kinds of archives.

|GOBACK|

``ZlibArchive``
---------------
A ``ZlibArchive`` contains compressed ``.pyc`` (or ``.pyo``) files. The Table of Contents
is a marshalled dictionary, with the key (the module's name as given in an
``import`` statement) associated with a seek position and length. Because it is
all marshalled Python, ``ZlibArchives`` are completely cross-platform.

A ``ZlibArchive`` hooks in with `iu.py`_ so that, with a little setup, the archived
modules can be imported transparently. Even with compression at level 9, this
works out to being faster than the normal import. Instead of searching
``sys.path``, there's a lookup in the dictionary. There's no ``stat``-ing of the ``.py``
and ``.pyc`` and no file opens (the file is already open). There's just a seek, a
read and a decompress. A traceback will point to the source file the archive
entry was created from (the ``__file__`` attribute from the time the ``.pyc`` was
compiled). On a user's box with no source installed, this is not terribly
useful, but if they send you the traceback, at least you can make sense of it.

|ZlibArchiveImage|

|GOBACK|

``CArchive``
------------
A ``CArchive`` contains whatever you want to stuff into it. It's very much like a
``.zip`` file. They are easy to create in Python and unpack from C code. ``CArchives``
can be appended to other files (like ELF and COFF executables, for example).
To allow this, they are opened from the end, so the ``TOC`` for a ``CArchive`` is at
the back, followed only by a cookie that tells you where the ``TOC`` starts and
where the archive itself starts.

``CArchives`` can also be embedded within other ``CArchives``. The inner archive can be
opened in place (without extraction).

Each ``TOC`` entry is variable length. The first field in the entry tells you the
length of the entry. The last field is the name of the corresponding packed
file. The name is null terminated. Compression is optional by member.

There is also a type code associated with each entry. If you're using a
``CArchive`` as a ``.zip`` file, you don't need to worry about this. The type codes
are used by the self-extracting executables.

|CArchiveImage|

|GOBACK|


License
+++++++
PyInstaller is mainly distributed  under the
`GPL License <http://pyinstaller.hpcf.upr.edu/pyinstaller/browser/trunk/doc/LICENSE.GPL?rev=latest>`_
but it has an exception such that you can use it to compile commercial products.

In a nutshell, the license is GPL for the source code with the exception that:

 #. You may use PyInstaller to compile commercial applications out of your
    source code.

 #. The resulting binaries generated by PyInstaller from your source code can be
    shipped with whatever license you want.

 #. You may modify PyInstaller for your own needs but *these* changes to the
    PyInstaller source code falls under the terms of the GPL license. In other
    words, any modifications to will *have* to be distributed under GPL.

For updated information or clarification see our
`FAQ <http://pyinstaller.hpcf.upr.edu/pyinstaller/wiki/FAQ>`_ at `PyInstaller`_
home page: http://pyinstaller.hpcf.upr.edu



|GOBACK|

Appendix
++++++++

.. sidebar:: You can stop reading here...

    ... if you are not interested in technical details. This appendix contains
    insights of the internal workings of |PyInstaller|, and you do not need this
    information unless you plan to work on |PyInstaller| itself.


``mf.py``: A Modulefinder Replacement
-------------------------------------

Module ``mf`` is modelled after ``iu``.

It also uses ``ImportDirectors`` and ``Owners`` to partition the import name space.
Except for the fact that these return ``Module`` instances instead of real module
objects, they are identical.

Instead of an ``ImportManager``, ``mf`` has an ``ImportTracker`` managing things.

|GOBACK|

ImportTracker
*************

``ImportTracker`` can be called in two ways: ``analyze_one(name, importername=None)``
or ``analyze_r(name, importername=None)``. The second method does what modulefinder
does - it recursively finds all the module names that importing name would
cause to appear in ``sys.modules``. The first method is non-recursive. This is
useful, because it is the only way of answering the question "Who imports
name?" But since it is somewhat unrealistic (very few real imports do not
involve recursion), it deserves some explanation.

|GOBACK|

``analyze_one()``
*****************

When a name is imported, there are structural and dynamic effects. The dynamic
effects are due to the execution of the top-level code in the module (or
modules) that get imported. The structural effects have to do with whether the
import is relative or absolute, and whether the name is a dotted name (if there
are N dots in the name, then N+1 modules will be imported even without any code
running).

The analyze_one method determines the structural effects, and defers the
dynamic effects. For example, ``analyze_one("B.C", "A")`` could return ``["B", "B.C"]``
or ``["A.B", "A.B.C"]`` depending on whether the import turns out to be relative or
absolute. In addition, ImportTracker's modules dict will have Module instances
for them.

|GOBACK|

Module Classes
**************

There are Module subclasses for builtins, extensions, packages and (normal)
modules. Besides the normal module object attributes, they have an attribute
imports. For packages and normal modules, imports is a list populated by
scanning the code object (and therefor, the names in this list may be relative
or absolute names - we don't know until they have been analyzed).

The highly astute will notice that there is a hole in ``analyze_one()`` here. The
first thing that happens when ``B.C`` is being imported is that ``B`` is imported and
it's top-level code executed. That top-level code can do various things so that
when the import of ``B.C`` finally occurs, something completely different happens
(from what a structural analysis would predict). But mf can handle this through
it's hooks mechanism.

|GOBACK|

code scanning
*************

Like modulefinder, ``mf`` scans the byte code of a module, looking for imports. In
addition, ``mf`` will pick out a module's ``__all__`` attribute, if it is built as a
list of constant names. This means that if a package declares an ``__all__`` list
as a list of names, ImportTracker will track those names if asked to analyze
``package.*``. The code scan also notes the occurance of ``__import__``, ``exec`` and ``eval``,
and can issue warnings when they're found.

The code scanning also keeps track (as well as it can) of the context of an
import. It recognizes when imports are found at the top-level, and when they
are found inside definitions (deferred imports). Within that, it also tracks
whether the import is inside a condition (conditional imports).

|GOBACK|

Hooks
*****

In modulefinder, scanning the code takes the place of executing the code
object. ``mf`` goes further and allows a module to be hooked (after it has been
scanned, but before analyze_one is done with it). A hook is a module named
``hook-fullyqualifiedname`` in the ``hooks`` package. These modules should have one or
more of the following three global names defined:

``hiddenimports``
    a list of modules names (relative or absolute) that the module imports in some untrackable way.

``attrs``
    a list of ``(name, value)`` pairs (where value is normally meaningless).

``hook(mod)``
    a function taking a ``Module`` instance and returning a ``Module`` instance (so it can modify or replace).


The first hook (``hiddenimports``) extends the list created by scanning the code.
``ExtensionModules``, of course, don't get scanned, so this is the only way of
recording any imports they do.

The second hook (``attrs``) exists mainly so that ImportTracker won't issue
spurious warnings when the rightmost node in a dotted name turns out to be an
attribute in a package module, instead of a missing submodule.

The callable hook exists for things like dynamic modification of a package's
``__path__`` or perverse situations, like ``xml.__init__`` replacing itself in
``sys.modules`` with ``_xmlplus.__init__``. (It takes nine hook modules to properly
trace through PyXML-using code, and I can't believe that it's any easier for
the poor programmer using that package). The ``hook(mod)`` (if it exists) is
called before looking at the others - that way it can, for example, test
``sys.version`` and adjust what's in ``hiddenimports``.

|GOBACK|

Warnings
********

``ImportTracker`` has a ``getwarnings()`` method that returns all the warnings
accumulated by the instance, and by the ``Module`` instances in its modules dict.
Generally, it is ``ImportTracker`` who will accumulate the warnings generated
during the structural phase, and ``Modules`` that will get the warnings generated
during the code scan.

Note that by using a hook module, you can silence some particularly tiresome
warnings, but not all of them.

|GOBACK|

Cross Reference
***************

Once a full analysis (that is, an ``analyze_r`` call) has been done, you can get a
cross reference by using ``getxref()``. This returns a list of tuples. Each tuple
is ``(modulename, importers)``, where importers is a list of the (fully qualified)
names of the modules importing ``modulename``. Both the returned list and the
importers list are sorted.

|GOBACK|

Usage
*****

A simple example follows:

      >>> import mf
      >>> a = mf.ImportTracker()
      >>> a.analyze_r("os")
      ['os', 'sys', 'posixpath', 'nt', 'stat', 'string', 'strop',
      're', 'pcre', 'ntpath', 'dospath', 'macpath', 'win32api',
      'UserDict', 'copy', 'types', 'repr', 'tempfile']
      >>> a.analyze_one("os")
      ['os']
      >>> a.modules['string'].imports
      [('strop', 0, 0), ('strop.*', 0, 0), ('re', 1, 1)]
      >>>


The tuples in the imports list are (name, delayed, conditional).

      >>> for w in a.modules['string'].warnings: print w
      ...
      W: delayed  eval hack detected at line 359
      W: delayed  eval hack detected at line 389
      W: delayed  eval hack detected at line 418
      >>> for w in a.getwarnings(): print w
      ...
      W: no module named pwd (delayed, conditional import by posixpath)
      W: no module named dos (conditional import by os)
      W: no module named os2 (conditional import by os)
      W: no module named posix (conditional import by os)
      W: no module named mac (conditional import by os)
      W: no module named MACFS (delayed, conditional import by tempfile)
      W: no module named macfs (delayed, conditional import by tempfile)
      W: top-level conditional exec statment detected at line 47
         - os (C:\Program Files\Python\Lib\os.py)
      W: delayed  eval hack detected at line 359
         - string (C:\Program Files\Python\Lib\string.py)
      W: delayed  eval hack detected at line 389
         - string (C:\Program Files\Python\Lib\string.py)
      W: delayed  eval hack detected at line 418
         - string (C:\Program Files\Python\Lib\string.py)
      >>>


|GOBACK|


.. _iu.py:

``iu.py``: An *imputil* Replacement
-----------------------------------

Module ``iu`` grows out of the pioneering work that Greg Stein did with ``imputil``
(actually, it includes some verbatim ``imputil`` code, but since Greg didn't
copyright it, we won't mention it). Both modules can take over Python's
builtin import and ease writing of at least certain kinds of import hooks.

``iu`` differs from ``imputil``:
* faster
* better emulation of builtin import
* more managable

There is an ``ImportManager`` which provides the replacement for builtin import
and hides all the semantic complexities of a Python import request from it's
delegates.

|GOBACK|

``ImportManager``
*****************

``ImportManager`` formalizes the concept of a metapath. This concept implicitly
exists in native Python in that builtins and frozen modules are searched
before ``sys.path``, (on Windows there's also a search of the registry while on
Mac, resources may be searched). This metapath is a list populated with
``ImportDirector`` instances. There are ``ImportDirector`` subclasses for builtins,
frozen modules, (on Windows) modules found through the registry and a
``PathImportDirector`` for handling ``sys.path``. For a top-level import (that is, not
an import of a module in a package), ``ImportManager`` tries each director on it's
metapath until one succeeds.

``ImportManager`` hides the semantic complexity of an import from the directors.
It's up to the ``ImportManager`` to decide if an import is relative or absolute;
to see if the module has already been imported; to keep ``sys.modules`` up to
date; to handle the fromlist and return the correct module object.

|GOBACK|

``ImportDirector``
******************

An ``ImportDirector`` just needs to respond to ``getmod(name)`` by returning a module
object or ``None``. As you will see, an ``ImportDirector`` can consider name to be
atomic - it has no need to examine name to see if it is dotted.

To see how this works, we need to examine the ``PathImportDirector``.

|GOBACK|

``PathImportDirector``
**********************

The ``PathImportDirector`` subclass manages a list of names - most notably,
``sys.path``. To do so, it maintains a shadowpath - a dictionary mapping the names
on its pathlist (eg, ``sys.path``) to their associated ``Owners``. (It could do this
directly, but the assumption that sys.path is occupied solely by strings seems
ineradicable.) ``Owners`` of the appropriate kind are created as needed (if all
your imports are satisfied by the first two elements of ``sys.path``, the
``PathImportDirector``'s shadowpath will only have two entries).

|GOBACK|

``Owner``
*********

An ``Owner`` is much like an ``ImportDirector`` but manages a much more concrete piece
of turf. For example, a ``DirOwner`` manages one directory. Since there are no
other officially recognized filesystem-like namespaces for importing, that's
all that's included in iu, but it's easy to imagine ``Owners`` for zip files
(and I have one for my own ``.pyz`` archive format) or even URLs.

As with ``ImportDirectors``, an ``Owner`` just needs to respond to ``getmod(name)`` by
returning a module object or ``None``, and it can consider name to be atomic.

So structurally, we have a tree, rooted at the ``ImportManager``. At the next
level, we have a set of ``ImportDirectors``. At least one of those directors, the
``PathImportDirector`` in charge of ``sys.path``, has another level beneath it,
consisting of ``Owners``. This much of the tree covers the entire top-level import
namespace.

The rest of the import namespace is covered by treelets, each rooted in a
package module (an ``__init__.py``).

|GOBACK|

Packages
********

To make this work, ``Owners`` need to recognize when a module is a package. For a
``DirOwner``, this means that name is a subdirectory which contains an ``__init__.py``.
The ``__init__`` module is loaded and its ``__path__`` is initialized with the
subdirectory. Then, a ``PathImportDirector`` is created to manage this ``__path__``.
Finally the new ``PathImportDirector``'s ``getmod`` is assigned to the package's
``__importsub__`` function.

When a module within the package is imported, the request is routed (by the
``ImportManager``) diretly to the package's ``__importsub__``. In a hierarchical
namespace (like a filesystem), this means that ``__importsub__`` (which is really
the bound getmod method of a ``PathImportDirector`` instance) needs only the
module name, not the package name or the fully qualified name. And that's
exactly what it gets. (In a flat namespace - like most archives - it is
perfectly easy to route the request back up the package tree to the archive
``Owner``, qualifying the name at each step.)

|GOBACK|

Possibilities
*************

Let's say we want to import from zip files. So, we subclass ``Owner``. The
``__init__`` method should take a filename, and raise a ``ValueError`` if the file is
not an acceptable ``.zip`` file, (when a new name is encountered on ``sys.path`` or a
package's ``__path__``, registered Owners are tried until one accepts the name).
The ``getmod`` method would check the zip file's contents and return ``None`` if the
name is not found. Otherwise, it would extract the marshalled code object from
the zip, create a new module object and perform a bit of initialization (12
lines of code all told for my own archive format, including initializing a pack
age with it's ``__subimporter__``).

Once the new ``Owner`` class is registered with ``iu``, you can put a zip file on
``sys.path``. A package could even put a zip file on its ``__path__``.

|GOBACK|

Compatibility
*************

This code has been tested with the PyXML, mxBase and Win32 packages, covering
over a dozen import hacks from manipulations of ``__path__`` to replacing a module
in ``sys.modules`` with a different one. Emulation of Python's native import is
nearly exact, including the names recorded in ``sys.modules`` and module attributes
(packages imported through ``iu`` have an extra attribute - ``__importsub__``).

|GOBACK|

Performance
***********

In most cases, ``iu`` is slower than builtin import (by 15 to 20%) but faster than
``imputil`` (by 15 to 20%). By inserting archives at the front of ``sys.path``
containing the standard lib and the package being tested, this can be reduced
to 5 to 10% slower (or, on my 1.52 box, 10% faster!) than builtin import. A bit
more can be shaved off by manipulating the ``ImportManager``'s metapath.

|GOBACK|

Limitations
***********

This module makes no attempt to facilitate policy import hacks. It is easy to
implement certain kinds of policies within a particular domain, but
fundamentally iu works by dividing up the import namespace into independent
domains.

Quite simply, I think cross-domain import hacks are a very bad idea. As author
of the original package on which |PyInstaller| is based, McMillan worked with
import hacks for many years. Many of them are highly fragile; they often rely
on undocumented (maybe even accidental) features of implementation.
A cross-domain import hack is not likely to work with PyXML, for example.

That rant aside, you can modify ``ImportManger`` to implement different policies.
For example, a version that implements three import primitives: absolute
import, relative import and recursive-relative import. No idea what the Python
syntax for those should be, but ``__aimport__``, ``__rimport__`` and ``__rrimport__`` were
easy to implement.


Usage
*****

Here's a simple example of using ``iu`` as a builtin import replacement.

      >>> import iu
      >>> iu.ImportManager().install()
      >>>
      >>> import DateTime
      >>> DateTime.__importsub__
      <method PathImportDirector.getmod
        of PathImportDirector instance at 825900>
      >>>

|GOBACK|

.. _PyInstaller: http://pyinstaller.hpcf.upr.edu/pyinstaller
.. _Roadmap: http://pyinstaller.hpcf.upr.edu/pyinstaller/roadmap
.. _`Submit a Bug`: http://pyinstaller.hpcf.upr.edu/pyinstaller/newticket
.. _Scons: http://www.scons.org
.. _hooks\/hook-win32com.py: http://pyinstaller.hpcf.upr.edu/pyinstaller/browser/trunk/hooks/hook-win32com.py?rev=latest
.. _support\/unpackTK.py: http://pyinstaller.hpcf.upr.edu/pyinstaller/browser/trunk/support/unpackTK.py?rev=latest
.. _source/common/launch.c: http://pyinstaller.hpcf.upr.edu/pyinstaller/browser/trunk/source/common/launch.c?rev=latest
.. _Pmw: http://pmw.sourceforge.net/
.. _PIL: http://www.pythonware.com/products/pil/
.. _PyQt: http://www.riverbankcomputing.co.uk/pyqt/index.php
.. _PyWin32: http://starship.python.net/crew/mhammond/win32/
.. |ZlibArchiveImage| image:: images/ZlibArchive.png
.. |CArchiveImage| image:: images/CArchive.png
.. |SE_exeImage| image:: images/SE_exe.png
.. |PyInstaller| replace:: PyInstaller
.. |PyInstallerVersion| replace:: PyInstaller v1.0
.. |InitialVersion| replace:: v1.0
.. |install_path| replace:: /your/path/to/pyinstaller/
.. |GOBACK| replace:: `Back to Top`_
.. _`Back to Top`: `PyInstaller Manual`_
author	noe\swadi
	Thu, 18 Feb 2010 12:29:02 +0530
changeset 1	22878952f6e2
permissions	-rw-r--r--