org.opencms.search.documents
Class CmsDocumentMsOfficeOOXML

java.lang.Object
  extended by org.opencms.search.documents.A_CmsVfsDocument
      extended by org.opencms.search.documents.CmsDocumentMsOfficeOOXML
All Implemented Interfaces:
I_CmsDocumentFactory, I_CmsSearchExtractor

public class CmsDocumentMsOfficeOOXML
extends A_CmsVfsDocument

Lucene document factory class to extract text data from a VFS resource that is an OOXML MS Office document.

Supported formats are MS Word (.docx), MS PowerPoint (.pptx) and MS Excel (.xlsx).

The older OLE 2 format was introduced in Microsoft Office 97 and remained the default until Office 2007 introduced the new XML-based OOXML format.
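
As a rough illustration of the difference between the two container formats, the sketch below inspects the leading bytes of a file. OLE 2 compound files start with a fixed 8-byte signature, while OOXML packages are ZIP archives and start with the ZIP local file header signature. The class OfficeContainerSniffer and its detect method are hypothetical names used for illustration only; they are not part of OpenCms, and this is not a description of how this factory selects documents. A ZIP signature alone does not prove that a file is OOXML.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of OpenCms): guesses the container format from the leading bytes. */
final class OfficeContainerSniffer {

    enum Container { OLE2, ZIP_BASED, UNKNOWN }

    // Signature of an OLE 2 compound file (legacy .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header; OOXML packages (.docx, .pptx, .xlsx) are ZIP archives.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies the content by its first bytes only; the rest of the file is not inspected. */
    static Container detect(byte[] content) {
        if (startsWith(content, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(content, ZIP_MAGIC)) {
            return Container.ZIP_BASED;
        }
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data != null
            && data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        byte[] zipHeader = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00 };
        System.out.println(detect(zipHeader));
    }
}
```

A check like this can only tell the containers apart; deciding whether a ZIP package really is a Word, PowerPoint or Excel document requires looking at the parts inside it.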

Since:
8.0.1

Field Summary
 
Fields inherited from class org.opencms.search.documents.A_CmsVfsDocument
m_name
 
Constructor Summary
CmsDocumentMsOfficeOOXML(String name)
          Creates a new instance of this Lucene document factory.
 
Method Summary
 I_CmsExtractionResult extractContent(CmsObject cms, CmsResource resource, CmsSearchIndex index)
          Returns the raw text content of a given VFS resource containing MS Office OOXML data.
 boolean isLocaleDependend()
          Returns true if this document factory is locale dependent.
 boolean isUsingCache()
          Returns true if result caching is supported for this factory.
 
Methods inherited from class org.opencms.search.documents.A_CmsVfsDocument
createDocument, getCache, getDocumentKey, getDocumentKeys, getName, logContentExtraction, readFile, setCache
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CmsDocumentMsOfficeOOXML

public CmsDocumentMsOfficeOOXML(String name)
Creates a new instance of this Lucene document factory.

Parameters:
name - name of the document type
Method Detail

extractContent

public I_CmsExtractionResult extractContent(CmsObject cms,
                                            CmsResource resource,
                                            CmsSearchIndex index)
                                     throws CmsIndexException,
                                            CmsException
Returns the raw text content of a given VFS resource containing MS Office OOXML data.

Parameters:
cms - the cms object
resource - the resource to extract the content from
index - the index to extract the content for
Returns:
the extracted content of the resource
Throws:
CmsException - if something goes wrong
CmsIndexException
See Also:
I_CmsSearchExtractor.extractContent(CmsObject, CmsResource, CmsSearchIndex)
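
To give an idea of what "raw text content" means for an OOXML resource, the following self-contained sketch reads the word/document.xml part from a ZIP package and strips the markup. It is a deliberately naive illustration that uses only the JDK, and it is not the extraction that this class performs. DocxTextSketch, extractText and zipWithPart are hypothetical names, and the in-memory archive built by zipWithPart is only a fixture, not a valid .docx file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch (not OpenCms code): naive text extraction from a .docx-style ZIP package. */
final class DocxTextSketch {

    /** Returns the text of word/document.xml with tags removed, or "" if that part is absent. */
    static String extractText(byte[] packageBytes) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(packageBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                // Read the current entry to its end.
                ByteArrayOutputStream xmlBytes = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    xmlBytes.write(buffer, 0, read);
                }
                String xml = new String(xmlBytes.toByteArray(), StandardCharsets.UTF_8);
                // Naive: end of paragraph becomes a newline, every tag is dropped.
                // XML entities are not decoded here.
                return xml.replace("</w:p>", "\n").replaceAll("<[^>]+>", "").trim();
            }
            return "";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds an in-memory ZIP with a single part; a fixture only, not a valid Office file. */
    static byte[] zipWithPart(String partName, String content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry(partName));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String xml = "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">"
            + "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
            + "<w:p><w:r><w:t>OpenCms</w:t></w:r></w:p></w:body></w:document>";
        System.out.println(extractText(zipWithPart("word/document.xml", xml)));
    }
}
```

Real extraction has to handle all three supported formats, document metadata and character entities, which is why a dedicated parsing library is normally used instead of tag stripping.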

isLocaleDependend

public boolean isLocaleDependend()
Description copied from interface: I_CmsDocumentFactory
Returns true if this document factory is locale dependent.

Returns:
true if this document factory is locale dependent
See Also:
I_CmsDocumentFactory.isLocaleDependend()

isUsingCache

public boolean isUsingCache()
Description copied from interface: I_CmsDocumentFactory
Returns true if result caching is supported for this factory.

Returns:
true if result caching is supported for this factory
See Also:
I_CmsDocumentFactory.isUsingCache()