| Class Summary | |
|---|---|
| AccentFoldingFilter | Improves query results by converting accented characters to normal characters by removing diacritics. |
| CrimsonBugWorkaround | Works around a very nasty bug in the Apache Crimson XML parser. |
| CrimsonBugWorkaround.BlockEnum | Presents the input stream as a series of blocks of data. |
| DocSelCache | This class represents the contents of the Document Selector Cache maintained by the indexer. |
| DocSelCache.Entry | One entry in the Document Selector Cache. |
| FacetTokenizer | Performs special tokenization for facet fields. |
| HTMLIndexSource | Transforms an HTML file to a single-record XML file. |
| HTMLToString | This class provides a single static convert() method that converts an HTML file into an XML string that can be pre-filtered and added to a Lucene database by the XMLTextProcessor class. |
| IdxTreeCleaner | This class purges "incomplete" documents from a Lucene index. |
| IdxTreeCuller | This class provides a simple mechanism for removing documents from an index when the source text no longer exists in the document library. |
| IdxTreeDictMaker | This class provides a simple mechanism for generating a spelling correction dictionary after new documents have been added or updated. |
| IdxTreeOptimizer | This class provides a simple mechanism for optimizing Lucene indices after new documents have been added, updated, or removed. |
| IndexDump | This class dumps the contents of user-selected fields from an XTF text index. |
| IndexerConfig | This class records configuration information about the current state of the TextIndexer application. |
| IndexInfo | This class maintains configuration information about the current index that the TextIndexer program is processing. |
| IndexMerge | This class merges the contents of two or more XTF indexes, with certain caveats. |
| IndexMerge.DirInfo | |
| IndexRecord | A single record within an IndexSource. |
| IndexSource | Represents a single source of data for an XTF index. |
| IndexStats | This class calculates and prints out some useful statistics about an existing index, such as number of documents, size, etc. |
| IndexSync | Takes care of copying the differences between a source index and a dest index to make them exactly equal. |
| MARCIndexSource | Supplies MARC data to an XTF index, breaking it up into individual MARCXML records. |
| MSWordIndexSource | Transforms a Microsoft Word file to a single-record XML file. |
| PDFIndexSource | Transforms a PDF file to a single-record XML file. |
| PDFToString | This class provides a single static convert() method that converts the text in a PDF file into an XML string that can be pre-filtered and added to a Lucene database by the XMLTextProcessor class. |
| PluralFoldingFilter | Improves query results by converting plural words to their singular forms. |
| SectionInfo | This class maintains information about the current section in a text document that the TextIndexer program is processing. |
| SectionInfoStack | This class maintains information about the current nesting of sections in a text document that the TextIndexer program is processing. |
| SpellWritingFilter | Adds words from the token stream to a SpellWriter. |
| SrcTreeProcessor | This class is the main processing shell for files in the source text tree. |
| StartEndFilter | Ensures that the tokens at the start and end of the stream are indexed both with and without the special start-of-field/end-of-field markers. |
| StructuredFileProxy | Used to put off actually creating a structured store until it is needed. |
| TagFilter | Spots XML elements in a token stream and marks them specially. |
| TextIndexer | This class is the main class for the TextIndexer program. |
| TextIndexSource | Transforms a plain text file to a single-record XML file. |
| UnicodeNormalizingFilter | Apply Unicode Normalization to the tokens. |
| XMLConfigParser | This class parses TextIndexer configuration XML files. |
| XMLIndexSource | Supplies a single file containing a single record to the XMLTextProcessor. |
| XMLTextProcessor | This class performs the actual parsing of the XML source text files and generates index information for them. |
| XtfSpecialTokensFilter | The XtfSpecialTokensFilter class is used by the XTFTextAnalyzer class to convert special "bump" count values in text chunks to actual position increments for words prior to adding them to a Lucene index. |
| XTFTextAnalyzer | The XTFTextAnalyzer class performs the task of breaking up a contiguous chunk of text into a list of separate words (tokens, in Lucene parlance). |
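
Several of the filters listed above normalize tokens before they are indexed. As a rough illustration of the idea behind AccentFoldingFilter, the following sketch removes diacritics from a single string using the standard `java.text.Normalizer` class. It is not the XTF implementation: the real filter operates on a Lucene token stream and may use a different mapping. The class name `AccentFoldingSketch` and the method `fold` are hypothetical names chosen for this example.

```java
import java.text.Normalizer;

public class AccentFoldingSketch {
    // Hypothetical helper, not part of XTF: strips diacritics from one token.
    public static String fold(String token) {
        // NFD decomposition splits an accented character into its base
        // character followed by one or more combining marks.
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        // \p{M} matches combining marks; removing them leaves the base characters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("r\u00e9sum\u00e9")); // prints "resume"
        System.out.println(fold("na\u00efve"));       // prints "naive"
    }
}
```

Folding this way lets a query for "resume" match text containing "résumé", which is the improvement in query results that the class summary describes.
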
| Exception Summary | |
|---|---|
| TextIndexerException | This exception is thrown by classes related to the textIndexer tool. |
Contains all the classes that make up the textIndexer tool.
The TextIndexer class is the main command-line interface, while XMLTextProcessor does most of the heavy lifting: scanning documents, breaking them into chunks, and passing the chunks to Lucene.
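
To make the "breaking them into chunks" step more concrete, here is a minimal sketch of splitting text into fixed-size word chunks. It is a simplification under stated assumptions, not the XMLTextProcessor algorithm: the real processor works on parsed XML, and its chunking rules are not described on this page. The class `ChunkingSketch`, the method `chunk`, and the `chunkSize` parameter are hypothetical, and the sketch stops short of handing the chunks to Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkingSketch {
    // Hypothetical helper, not part of XTF: splits text into chunks of at most
    // chunkSize whitespace-separated words.
    public static List<String> chunk(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks; // nothing to index
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, words.length);
            // Rejoin the words in this window into a single chunk string.
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // With chunkSize 4 this prints three chunks; the last holds the leftover word.
        for (String c : chunk("the quick brown fox jumps over the lazy dog", 4)) {
            System.out.println(c);
        }
    }
}
```

In a real indexer each chunk would then be analyzed and added to the index as its own unit, so that search hits can be located within a large document.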