diff -r 000000000000 -r 1918ee327afb src/3rdparty/clucene/README
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/src/3rdparty/clucene/README Mon Jan 11 14:00:40 2010 +0000
@@ -0,0 +1,92 @@
+CLucene README
+==============
+
+------------------------------------------------------
+CLucene is a C++ port of Lucene.
+It is a high-performance, full-featured text search
+engine written in C++. CLucene is faster than lucene
+as it is written in C++.
+------------------------------------------------------
+
+CLucene has contributions from many, see AUTHORS
+
+CLucene is distributed under the GNU Lesser General Public License (LGPL)
+ *or*
+the Apache License, Version 2.0
+See the LGPL.license and APACHE.license for the respective license information.
+Read COPYING for more about the license.
+
+Installation
+------------
+* For Linux, MacOSX, cygwin and MinGW build information, read INSTALL.
+* Boost.Jam files are provided in the root directory and subdirectories.
+* Microsoft Visual Studio (6&7) are provided in the win32 folder.
+
+Mailing List
+------------
+Questions and discussion should be directed to the CLucene mailing list
+ at clucene-developers@lists.sourceforge.net
+Find subscription instructions at
+ http://lists.sourceforge.net/lists/listinfo/clucene-developers
+Suggestions and bug reports can be made on our bug tracking database
+ (http://sourceforge.net/tracker/?group_id=80013&atid=558446)
+
+The latest version
+------------------
+Details of the latest version can be found on the CLucene sourceforge project
+web site: http://www.sourceforge.net/projects/clucene
+
+Documentation
+-------------
+Documentation is provided at http://clucene.sourceforge.net/doc/doxygen/html/
+You can also build your own documentation by running doxygen from the root directory
+of clucene.
+CLucene is a very close port of Java Lucene, so you can also try looking at the
+Java Docs on http://lucene.apache.org/java/
+
+
+Performance
+-----------
+Very little benchmarking has been done on clucene. Andi Vajda posted some
+limited statistics on the clucene list a while ago with the following results.
+
+There are 250 HTML files under $JAVA_HOME/docs/api/java/util for about
+6108kb of HTML text.
+org.apache.lucene.demo.IndexFiles with java and gcj:
+on mac os x 10.3.1 (panther) powerbook g4 1ghz 1gb:
+ . running with java 1.4.1_01-99 : 20379 ms
+ . running with gcj 3.3.2 -O2 : 17842 ms
+ . running clucene 0.8.9's demo : 9930 ms
+
+I recently did some more tests and came up with these rough tests:
+663mb (797 files) of Guttenberg texts
+on a Pentium 4 running Windows XP with 1 GB of RAM. Indexing max 100,000 fields
+• Jlucene: 646453ms. peak mem usage ~72mb, avg ~14mb ram
+• Clucene: 232141. peak mem usage ~60, avg ~4mb ram
+
+Searching indexing using 10,000 single word queries
+• Jlucene: ~60078ms and used ~13mb ram
+• Clucene: ~48359ms and used ~4.2mb ram
+
+Platform notes
+--------------
+
+'Too many open files'
+Some platforms don't provide enough file handles to run CLucene properly.
+To solve this, increase the open file limit:
+
+On Solaris:
+ulimit -n 1024
+set rlim_fd_cur=1024
+
+Acknowledgments
+----------------
+
+The Apache Lucene project is the basis for this software, so the biggest
+acknoledgment goes to that project.
+
+We wish to acknowledge the following copyrighted works that
+make up portions of the CLucene software:
+
+CLucene relies heavily on the use of autoconf and libtool to provide
+a build environment.