src/3rdparty/clucene/README
changeset 0 1918ee327afb
-1:000000000000 0:1918ee327afb
       
     1 CLucene README
       
     2 ==============
       
     3 
       
     4 ------------------------------------------------------
       
     5 CLucene is a C++ port of Lucene.
       
     6 It is a high-performance, full-featured text search 
       
     7 engine written in C++. CLucene is faster than Java Lucene
       
     8 because it is written in C++.
       
     9 ------------------------------------------------------
       
    10 
       
    11 CLucene has contributions from many people; see AUTHORS.
       
    12 
       
    13 CLucene is distributed under the GNU Lesser General Public License (LGPL) 
       
    14 	*or*
       
    15 the Apache License, Version 2.0
       
    16 See the LGPL.license and APACHE.license for the respective license information.
       
    17 Read COPYING for more about the license.
       
    18 
       
    19 Installation
       
    20 ------------
       
    21 * For Linux, Mac OS X, Cygwin and MinGW build information, read INSTALL.
       
    22 * Boost.Jam files are provided in the root directory and subdirectories.
       
    23 * Microsoft Visual Studio (6 and 7) project files are provided in the win32 folder.
       
    24 
       
    25 Mailing List
       
    26 ------------
       
    27 Questions and discussion should be directed to the CLucene mailing list
       
    28   at clucene-developers@lists.sourceforge.net  
       
    29 Find subscription instructions at 
       
    30   http://lists.sourceforge.net/lists/listinfo/clucene-developers
       
    31 Suggestions and bug reports can be made on our bug tracking database
       
    32   (http://sourceforge.net/tracker/?group_id=80013&atid=558446)
       
    33 
       
    34 The latest version
       
    35 ------------------
       
    36 Details of the latest version can be found on the CLucene SourceForge project
       
    37 web site: http://www.sourceforge.net/projects/clucene
       
    38 
       
    39 Documentation
       
    40 -------------
       
    41 Documentation is provided at http://clucene.sourceforge.net/doc/doxygen/html/
       
    42 You can also build your own documentation by running doxygen from the root directory
       
    43 of CLucene.
       
    44 CLucene is a very close port of Java Lucene, so you can also try looking at the
       
    45 Javadocs at http://lucene.apache.org/java/
       
    46 
       
    47 
       
    48 Performance
       
    49 -----------
       
    50 Very little benchmarking has been done on CLucene. Andi Vajda posted some 
       
    51 limited statistics on the CLucene list a while ago with the following results.
       
    52 
       
    53 The test set is the 250 HTML files under $JAVA_HOME/docs/api/java/util,
       
    54 about 6108 KB of HTML text.
       
    55 Timings for org.apache.lucene.demo.IndexFiles with java and gcj, and for the CLucene demo,
       
    56 on Mac OS X 10.3.1 (Panther), PowerBook G4 1 GHz, 1 GB:
       
    57     . running with java 1.4.1_01-99 : 20379 ms
       
    58     . running with gcj 3.3.2 -O2    : 17842 ms
       
    59     . running clucene 0.8.9's demo  :  9930 ms 
       
    60 
       
    61 I recently did some more tests and came up with these rough results:
       
    62 663 MB (797 files) of Gutenberg texts
       
    63 on a Pentium 4 running Windows XP with 1 GB of RAM. Indexing (max 100,000 fields):
       
    64 • Jlucene: 646453ms. peak mem usage ~72mb, avg ~14mb ram
       
    65 • Clucene: 232141ms. peak mem usage ~60mb, avg ~4mb ram
       
    66 
       
    67 Searching the index using 10,000 single-word queries:
       
    68 • Jlucene: ~60078ms and used ~13mb ram
       
    69 • Clucene: ~48359ms and used ~4.2mb ram
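
To make the timings above easier to compare, the ratios can be worked out
directly. The sketch below is only an illustration: speedup and
print_readme_speedups are hypothetical names, not part of CLucene, and the
inputs are the figures quoted above. It assumes that every timing, including
the 232141 indexing figure, is in milliseconds.

```cpp
#include <cassert>
#include <cstdio>

// Ratio of a baseline timing to a candidate timing, both in milliseconds.
// A result above 1.0 means the candidate finished faster than the baseline.
double speedup(double baseline_ms, double candidate_ms) {
    assert(baseline_ms > 0.0 && candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}

// Hypothetical helper: prints the ratios for the figures quoted in this README.
void print_readme_speedups() {
    std::printf("indexing, java 1.4.1 vs clucene 0.8.9: %.2fx\n",
                speedup(20379.0, 9930.0));
    std::printf("indexing, gcj 3.3.2 vs clucene 0.8.9:  %.2fx\n",
                speedup(17842.0, 9930.0));
    // Assumes the 232141 figure is in milliseconds like the others.
    std::printf("indexing, Jlucene vs Clucene:          %.2fx\n",
                speedup(646453.0, 232141.0));
    std::printf("searching, Jlucene vs Clucene:         %.2fx\n",
                speedup(60078.0, 48359.0));
}
```

On these numbers CLucene comes out roughly 2.1x and 1.8x faster than java and
gcj in the first test, and roughly 2.8x (indexing) and 1.2x (searching) faster
than Jlucene in the second. These are single rough runs, so treat the ratios
as indicative only.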
       
    70 
       
    71 Platform notes
       
    72 --------------
       
    73 
       
    74 'Too many open files'
       
    75 Some platforms don't provide enough file handles to run CLucene properly.
       
    76 To solve this, increase the open file limit:
       
    77 
       
    78 On Solaris:
       
    79 ulimit -n 1024            (in the shell, for the current session)
       
    80 set rlim_fd_cur=1024      (in /etc/system, applied at the next boot)
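
On POSIX platforms the limit can also be raised from inside the application
before it opens an index. The following is a minimal sketch using the standard
getrlimit/setrlimit calls; raise_open_file_limit is a hypothetical helper and
not part of CLucene. It assumes an unprivileged process, which can raise its
soft limit only as far as the hard limit.

```cpp
#include <sys/resource.h>
#include <cstdio>

// Try to raise the soft limit on open file descriptors to at least `wanted`,
// capped at the hard limit. Returns the resulting soft limit, or 0 on error.
// Hypothetical helper for illustration; not a CLucene function.
rlim_t raise_open_file_limit(rlim_t wanted) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        std::perror("getrlimit");
        return 0;
    }
    // Nothing to do if the current soft limit is already large enough.
    if (lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur >= wanted) {
        return lim.rlim_cur;
    }
    // An unprivileged process cannot exceed the hard limit, so clamp to it.
    rlim_t target = wanted;
    if (lim.rlim_max != RLIM_INFINITY && target > lim.rlim_max) {
        target = lim.rlim_max;
    }
    lim.rlim_cur = target;
    if (setrlimit(RLIMIT_NOFILE, &lim) != 0) {
        std::perror("setrlimit");
        return 0;
    }
    return lim.rlim_cur;
}
```

An application would call raise_open_file_limit(1024) once at startup and
check that the returned value is at least what it asked for. If the hard limit
is lower than that, the system-wide setting still has to be changed by an
administrator.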
       
    81 
       
    82 Acknowledgments
       
    83 ---------------
       
    84 
       
    85 The Apache Lucene project is the basis for this software, so the biggest
       
    86 acknowledgment goes to that project.
       
    87 
       
    88 We wish to acknowledge the following copyrighted works that
       
    89 make up portions of the CLucene software:
       
    90 
       
    91 CLucene relies heavily on the use of autoconf and libtool to provide
       
    92 a build environment.