The following tables indicate the file formats from which GroupDocs.Parser for Java can extract data. You can use the input below to filter supported formats by extension.
Tip Can’t find your file format?
We’re here to help! Please post a request on our Free Support Forum, and our team will assist you. Word Processing Document Type Parse Document by Template Extract Text (Accurate) Extract Text (Raw) Extract Structured Text and Formatted Text Extract Text Areas Extract Metadata Extract Images Extract Containers and Attachments Parse Form Data Extract Table of Contents Scan Barcode DOC Microsoft Word Document DOT Microsoft Word Document Template DOCX Office Open XML Document DOCM Office Open XML Macro-Enabled Document DOTX Office Open XML Document Template DOTM Office Open XML Document Macro-Enabled Template TXT Plain text ODT Open Document Text OTT Open Document Text Template RTF Rich Text Format PDF Document Type Parse Document by Template Extract Text (Accurate) Extract Text (Raw) Extract Structured Text and Formatted Text Extract Text Areas Extract Metadata Extract Images Extract Containers and Attachments Parse Form Data Extract Table of Contents Scan Barcode PDF Portable Document Format File Markup Document Type Parse Document by Template Extract Text (Accurate) Extract Text (Raw) Extract Structured Text and Formatted Text Extract Text Areas Extract Metadata Extract Images Extract Containers and Attachments Parse Form Data Extract Table of Contents Scan Barcode XHTML Extensible Hypertext Markup Language File MHTML MIME HTML File MD Markdown (Formatted Text is Not supported) XML XML File Ebook Document Type Parse Document by Template Extract Text (Accurate) Extract Text (Raw) Extract Structured Text and Formatted Text Extract Text Areas Extract Metadata Extract Images Extract Containers and Attachments Parse Form Data Extract Table of Contents Scan Barcode CHM Compiled HTML Help File EPUB Digital E-Book File Format FB2 FictionBook 2....Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...
Insert images into documents from file paths, URLs, or binary data sources during document assembly....Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...
Build documents from JSON data sources using JsonDataSource class to load and bind JSON data to templates in .NET applications....Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...
This article shows how to delete Barcode electronic signatures different ways with GroupDocs.Signature Api....Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...
This article demonstrates how to save edited text documents, spreadsheets and presentations with GroupDocs.Editor for .NET Api....Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...
This page contains a description of all index settings that can be specified in an instance of the IndexSettings class....Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...
To extract a text from Microsoft Office Word documents getText and getText(int) methods are used. These methods allow to extract a text from the entire document or a text from the selected page. TextOptions parameter is ignored for Microsoft Office Words documents.
Here are the steps to extract a text from Microsoft Office Word document:
Instantiate Parser object for the initial document; Call getText method and obtain TextReader object; Read a text from reader....Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...
This article demonstrates how to load, edit, and read form fields in a Word document using GroupDocs.Editor for .NET...Navigation Products GroupDocs.Total Product Family GroupDocs.Viewer Product...Solution GroupDocs.Annotation Product Solution GroupDocs.Conversion...