src/3rdparty/ptmalloc/README
changeset 0 1918ee327afb
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/3rdparty/ptmalloc/README	Mon Jan 11 14:00:40 2010 +0000
@@ -0,0 +1,186 @@
+ptmalloc3 - a multi-thread malloc implementation
+================================================
+
+Wolfram Gloger (wg@malloc.de)
+
+Jan 2006
+
+
+Thanks
+======
+
+This release was partly funded by Pixar Animation Studios.  I would
+like to thank David Baraff of Pixar for his support and Doug Lea
+(dl@cs.oswego.edu) for the great original malloc implementation.
+
+
+Introduction
+============
+
+This package is a modified version of Doug Lea's malloc-2.8.3
+implementation (available seperately from ftp://g.oswego.edu/pub/misc)
+that I adapted for multiple threads, while trying to avoid lock
+contention as much as possible.
+
+As part of the GNU C library, the source files may be available under
+the GNU Library General Public License (see the comments in the
+files).  But as part of this stand-alone package, the code is also
+available under the (probably less restrictive) conditions described
+in the file 'COPYRIGHT'.  In any case, there is no warranty whatsoever
+for this package.
+
+The current distribution should be available from:
+
+http://www.malloc.de/malloc/ptmalloc3.tar.gz
+
+
+Compilation
+===========
+
+It should be possible to build ptmalloc3 on any UN*X-like system that
+implements the sbrk(), mmap(), munmap() and mprotect() calls.  Since
+there are now several source files, a library (libptmalloc3.a) is
+generated.  See the Makefile for examples of the compile-time options.
+
+Note that support for non-ANSI compilers is no longer there.
+
+Several example targets are provided in the Makefile:
+
+ o Posix threads (pthreads), compile with "make posix"
+
+ o Posix threads with explicit initialization, compile with
+   "make posix-explicit" (known to be required on HPUX)
+
+ o Posix threads without "tsd data hack" (see below), compile with
+   "make posix-with-tsd"
+
+ o Solaris threads, compile with "make solaris"
+
+ o SGI sproc() threads, compile with "make sproc"
+
+ o no threads, compile with "make nothreads" (currently out of order?)
+
+For Linux:
+
+ o make "linux-pthread" (almost the same as "make posix") or
+   make "linux-shared"
+
+Note that some compilers need special flags for multi-threaded code,
+e.g. with Solaris cc with Posix threads, one should use:
+
+% make posix SYS_FLAGS='-mt'
+
+Some additional targets, ending in `-libc', are also provided in the
+Makefile, to compare performance of the test programs to the case when
+linking with the standard malloc implementation in libc.
+
+A potential problem remains: If any of the system-specific functions
+for getting/setting thread-specific data or for locking a mutex call
+one of the malloc-related functions internally, the implementation
+cannot work at all due to infinite recursion.  One example seems to be
+Solaris 2.4.  I would like to hear if this problem occurs on other
+systems, and whether similar workarounds could be applied.
+
+For Posix threads, too, an optional hack like that has been integrated
+(activated when defining USE_TSD_DATA_HACK) which depends on
+`pthread_t' being convertible to an integral type (which is of course
+not generally guaranteed).  USE_TSD_DATA_HACK is now the default
+because I haven't yet found a non-glibc pthreads system where this
+hack is _not_ needed.
+
+*NEW* and _important_: In (currently) one place in the ptmalloc3
+source, a write memory barrier is needed, named
+atomic_write_barrier().  This macro needs to be defined at the end of
+malloc-machine.h.  For gcc, a fallback in the form of a full memory
+barrier is already defined, but you may need to add another definition
+if you don't use gcc.
+
+Usage
+=====
+
+Just link libptmalloc3 into your application.
+
+Some wicked systems (e.g. HPUX apparently) won't let malloc call _any_
+thread-related functions before main().  On these systems,
+USE_STARTER=2 must be defined during compilation (see "make
+posix-explicit" above) and the global initialization function
+ptmalloc_init() must be called explicitly, preferably at the start of
+main().
+
+Otherwise, when using ptmalloc3, no special precautions are necessary.
+
+Link order is important
+=======================
+
+On some systems, when overriding malloc and linking against shared
+libraries, the link order becomes very important.  E.g., when linking
+C++ programs on Solaris with Solaris threads [this is probably now
+obsolete], don't rely on libC being included by default, but instead
+put `-lthread' behind `-lC' on the command line:
+
+  CC ... libptmalloc3.a -lC -lthread
+
+This is because there are global constructors in libC that need
+malloc/ptmalloc, which in turn needs to have the thread library to be
+already initialized.
+
+Debugging hooks
+===============
+
+All calls to malloc(), realloc(), free() and memalign() are routed
+through the global function pointers __malloc_hook, __realloc_hook,
+__free_hook and __memalign_hook if they are not NULL (see the malloc.h
+header file for declarations of these pointers).  Therefore the malloc
+implementation can be changed at runtime, if care is taken not to call
+free() or realloc() on pointers obtained with a different
+implementation than the one currently in effect.  (The easiest way to
+guarantee this is to set up the hooks before any malloc call, e.g.
+with a function pointed to by the global variable
+__malloc_initialize_hook).
+
+You can now also tune other malloc parameters (normally adjused via
+mallopt() calls from the application) with environment variables:
+
+    MALLOC_TRIM_THRESHOLD_    for deciding to shrink the heap (in bytes)
+
+    MALLOC_GRANULARITY_       The unit for allocating and deallocating
+    MALLOC_TOP_PAD_           memory from the system.  The default
+                              is 64k and this parameter _must_ be a
+                              power of 2.
+
+    MALLOC_MMAP_THRESHOLD_    min. size for chunks allocated via
+                              mmap() (in bytes)
+
+Tests
+=====
+
+Two testing applications, t-test1 and t-test2, are included in this
+source distribution.  Both perform pseudo-random sequences of
+allocations/frees, and can be given numeric arguments (all arguments
+are optional):
+
+% t-test[12] <n-total> <n-parallel> <n-allocs> <size-max> <bins>
+
+    n-total = total number of threads executed (default 10)
+    n-parallel = number of threads running in parallel (2)
+    n-allocs = number of malloc()'s / free()'s per thread (10000)
+    size-max = max. size requested with malloc() in bytes (10000)
+    bins = number of bins to maintain
+
+The first test `t-test1' maintains a completely seperate pool of
+allocated bins for each thread, and should therefore show full
+parallelism.  On the other hand, `t-test2' creates only a single pool
+of bins, and each thread randomly allocates/frees any bin.  Some lock
+contention is to be expected in this case, as the threads frequently
+cross each others arena.
+
+Performance results from t-test1 should be quite repeatable, while the
+behaviour of t-test2 depends on scheduling variations.
+
+Conclusion
+==========
+
+I'm always interested in performance data and feedback, just send mail
+to ptmalloc@malloc.de.
+
+Good luck!