org.opencms.search.documents

Package org.opencms.search.documents

Handles indexing of the different document and resource types in the OpenCms VFS for full-text search.


Interface Summary
I_CmsDocumentFactory	Creates Lucene index Documents for OpenCms resources and controls the text extraction algorithm used for a specific OpenCms resource type / MIME type combination.
I_CmsSearchExtractor	Defines a text extractor for the integrated search engine.
I_CmsTermHighlighter	Highlights arbitrary terms, used for generation of search excerpts.
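
To illustrate what term highlighting for search excerpts involves, here is a minimal, self-contained sketch. It does not use or reproduce the `I_CmsTermHighlighter` interface: the class `ExcerptHighlighterSketch`, its `highlight` method, and the choice of `<b>` tags are all assumptions made for this example.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical stand-in for illustration; not the OpenCms highlighter API. */
final class ExcerptHighlighterSketch {

    /**
     * Wraps every case-insensitive occurrence of any given term in <b> tags.
     * All terms are combined into one pattern so that inserted markup is
     * never matched by a later term.
     */
    static String highlight(String text, List<String> terms) {
        String alternation = terms.stream()
            .filter(term -> !term.isEmpty())
            .map(Pattern::quote) // treat each term literally, not as a regex
            .collect(Collectors.joining("|"));
        if (alternation.isEmpty()) {
            return text; // nothing to highlight
        }
        Pattern pattern = Pattern.compile(alternation, Pattern.CASE_INSENSITIVE);
        // $0 refers to the matched text, so the original casing is preserved
        return pattern.matcher(text).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("Search the OpenCms VFS", List.of("opencms", "vfs")));
    }
}
```

The sketch matches substrings rather than whole words and does not escape HTML in the input; an excerpt generator for real pages would likely need both.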

Class Summary
A_CmsVfsDocument	Base document factory class for a VFS `CmsResource`. Subclasses only need to provide a specialized implementation of `I_CmsSearchExtractor.extractContent(CmsObject, CmsResource, CmsSearchIndex)` for text extraction from the binary document content.
CmsDocumentContainerPage	Lucene document factory class to extract index data from a resource of type `CmsResourceTypeContainerPage`.
CmsDocumentGeneric	Lucene document factory class for indexing data from a generic `CmsResource`.
CmsDocumentHtml	Lucene document factory class to extract index data from an OpenCms resource containing plain HTML data.
CmsDocumentMsOfficeOLE2	Lucene document factory class to extract text data from a VFS resource that is an OLE 2 MS Office document.
CmsDocumentMsOfficeOOXML	Lucene document factory class to extract text data from a VFS resource that is an OOXML MS Office document.
CmsDocumentOpenOffice	Lucene document factory class to extract index data from an OpenCms resource containing Open Document Format data.
CmsDocumentPdf	Lucene document factory class to extract index data from an OpenCms resource containing Adobe PDF data.
CmsDocumentPlainText	Lucene document factory class to extract index data from an OpenCms resource containing plain text data.
CmsDocumentRtf	Lucene document factory class to extract index data from an OpenCms resource containing RTF data.
CmsDocumentXmlContent	Lucene document factory class to extract index data from an OpenCms VFS resource of type `CmsResourceTypeXmlContent`.
CmsDocumentXmlPage	Lucene document factory class to extract index data from an OpenCms resource of type `CmsResourceTypeXmlPage`.
CmsExtractionResultCache	Implements a disk cache that stores text extraction results in the RFS.
CmsTermHighlighterHtml	Default highlighter implementation used for generation of search excerpts.
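
The classes above share one pattern: a base factory handles the common document fields, each subclass supplies only the text extraction, and a factory is selected per resource type / MIME type combination. The following is a simplified sketch of that pattern using plain Java types. None of the names (`TextExtractorSketch`, `VfsDocumentSketch`, `PlainTextDocumentSketch`, `DocumentFactoryRegistrySketch`) or signatures come from OpenCms or Lucene; a `Map` stands in for a Lucene document, and the key format is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a text extractor; not the OpenCms interface. */
interface TextExtractorSketch {
    String extractContent(byte[] rawContent);
}

/** Base factory: builds the shared fields and leaves extraction to subclasses. */
abstract class VfsDocumentSketch implements TextExtractorSketch {
    private final String name;

    VfsDocumentSketch(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    /** Template method: common field handling plus subclass-specific extraction. */
    Map<String, String> createDocument(String path, byte[] rawContent) {
        Map<String, String> fields = new HashMap<>();
        fields.put("path", path);
        fields.put("content", extractContent(rawContent));
        return fields;
    }
}

/** Example subclass: decodes the bytes as UTF-8 text (an assumption). */
final class PlainTextDocumentSketch extends VfsDocumentSketch {
    PlainTextDocumentSketch() {
        super("plaintext");
    }

    @Override
    public String extractContent(byte[] rawContent) {
        return new String(rawContent, StandardCharsets.UTF_8).trim();
    }
}

/** Maps a resource type / MIME type combination to a factory. */
final class DocumentFactoryRegistrySketch {
    private final Map<String, VfsDocumentSketch> factories = new HashMap<>();

    void register(String resourceType, String mimeType, VfsDocumentSketch factory) {
        factories.put(key(resourceType, mimeType), factory);
    }

    /** Returns the registered factory, or null if none matches. */
    VfsDocumentSketch lookup(String resourceType, String mimeType) {
        return factories.get(key(resourceType, mimeType));
    }

    private static String key(String resourceType, String mimeType) {
        return resourceType + ":" + mimeType; // hypothetical key format
    }
}
```

A caller would register one factory per supported combination, look up the factory for each resource, and pass the result of `createDocument` to the indexer.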

Exception Summary
CmsIndexNoContentException	Signals an error during content extraction of an empty document.
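
One way an indexer might react to an empty document is sketched below. This is an assumption about usage, not documented OpenCms behavior: `NoContentExceptionSketch` and `EmptyContentSketch` are hypothetical names, and falling back to an empty string so that only metadata is indexed is just one possible policy.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for an "empty document" signal; not the OpenCms class. */
final class NoContentExceptionSketch extends Exception {
    NoContentExceptionSketch(String message) {
        super(message);
    }
}

final class EmptyContentSketch {

    /** Extracts text and signals an empty document with a checked exception. */
    static String extract(String path, byte[] raw) throws NoContentExceptionSketch {
        if (raw == null || raw.length == 0) {
            throw new NoContentExceptionSketch("No content to extract: " + path);
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Falls back to an empty string so the caller can still index metadata. */
    static String extractOrEmpty(String path, byte[] raw) {
        try {
            return extract(path, raw);
        } catch (NoContentExceptionSketch e) {
            return "";
        }
    }
}
```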

Package org.opencms.search.documents Description

Handles indexing of the different document and resource types in the OpenCms VFS for full-text search.

Since: 6.0.0
