This article explains how to get a list of indexed documents from an index, and how to get the text of indexed documents in HTML or plain text format.
Read this article to learn how to obtain the default convert options for a specific conversion format with the GroupDocs.Conversion for Java API.
This page contains a description of all index settings that can be specified in an instance of the IndexSettings class.
Learn how to extract and work with the table of contents from Word documents (.doc, .docx) using GroupDocs.Parser for .NET.
To extract the table of contents from a Microsoft Office Word document, use the getToc method. The table of contents is generated from paragraphs with the built-in H1-H9 styles.
Warning: The getToc method returns null if table of contents extraction isn’t supported for the document. For example, table of contents extraction isn’t supported for TXT files, so getToc returns null for a TXT file. If a Microsoft Office Word document has no table of contents, getToc returns an empty collection.
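The null-versus-empty distinction in the warning about getToc is easy to mishandle, so a minimal sketch of the calling-side check may help. It does not use the GroupDocs.Parser API: the `TocEntry` class, its `depth` and `text` fields (with depth assumed to start at 1), and the `describe` helper are hypothetical stand-ins introduced only for illustration. The `toc` argument stands for whatever a getToc-style call returned.

```java
import java.util.ArrayList;
import java.util.List;

public class TocResultSketch {

    /** Hypothetical stand-in for one table-of-contents entry (not a library type). */
    public static final class TocEntry {
        public final int depth;   // assumed 1-based nesting level
        public final String text; // heading text

        public TocEntry(int depth, String text) {
            this.depth = depth;
            this.text = text;
        }
    }

    /**
     * Interprets a getToc-style result:
     * null means extraction is unsupported for the format,
     * an empty collection means the document has no table of contents.
     */
    public static String describe(Iterable<TocEntry> toc) {
        if (toc == null) {
            return "unsupported";
        }
        StringBuilder sb = new StringBuilder();
        boolean any = false;
        for (TocEntry entry : toc) {
            any = true;
            // Indent two spaces per nesting level below the first.
            for (int i = 1; i < entry.depth; i++) {
                sb.append("  ");
            }
            sb.append(entry.text).append('\n');
        }
        return any ? sb.toString() : "empty";
    }

    public static void main(String[] args) {
        List<TocEntry> toc = new ArrayList<>();
        toc.add(new TocEntry(1, "Introduction"));
        toc.add(new TocEntry(2, "Scope"));

        System.out.println(describe(null));              // unsupported format, e.g. TXT
        System.out.println(describe(new ArrayList<>())); // Word document without a TOC
        System.out.print(describe(toc));                 // indented outline
    }
}
```

Checking for null before iterating keeps an unsupported format (such as TXT) from being reported as a document that simply has no headings.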
Extracted data is stored in an instance of the DocumentData class.
GroupDocs.Metadata for Java provides functionality for working with different kinds of WordProcessing documents such as DOC, DOCX, ODT, etc. For the full list of supported document formats, please refer to Supported document formats.
Detecting the exact type of a document
The following code sample helps you detect the exact type of a loaded document and extract some additional file format information.
1. Load a WordProcessing document.
2. Extract the root metadata package.
3. Use the getWordProcessingType method to obtain file format information.
advanced_usage...
Learn how to search for keywords and use regular expressions to find text in documents using GroupDocs.Parser for .NET. Search text with case sensitivity and whole word options in C#.