To extract data from PDF documents, use the parseForm and parseByTemplate(Template) methods. Both methods return a DocumentData object. For details, see Working With Extracted Data.
Here are the steps to extract data from a PDF form:
1. Instantiate a Parser object for the initial document.
2. Call the parseForm method and obtain the DocumentData object.
3. Check that the data isn’t null (that is, form parsing is supported for the document).
4. Iterate over the field data to obtain the form data.
The following example shows the use case in which a user fills in a PDF form and sends it by email.
First, you need to create an index. An index can be created in memory or on disk. An index created in memory cannot be saved after your program exits. In contrast, an index created on disk can be loaded later to continue working.
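The difference between the two kinds of index can be illustrated with a toy example. This is a minimal sketch in plain Java, not the GroupDocs.Search API: the `ToyIndex` class, its methods, and the `postings.tsv` file name are all hypothetical and exist only to show why an on-disk index survives a restart while an in-memory one does not.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical toy inverted index for illustration; not a GroupDocs class. */
final class ToyIndex {
    private static final String FILE_NAME = "postings.tsv"; // assumed file name
    private final Map<String, Set<String>> postings = new TreeMap<>();
    private final Path folder; // null means the index lives in memory only

    /** Creates an in-memory index; its contents are lost when the program exits. */
    ToyIndex() {
        this.folder = null;
    }

    /** Creates an on-disk index, or reloads one previously saved in the folder. */
    ToyIndex(Path folder) {
        this.folder = folder;
        try {
            Files.createDirectories(folder);
            Path file = folder.resolve(FILE_NAME);
            if (Files.exists(file)) {
                for (String line : Files.readAllLines(file)) {
                    // Each line is: word <TAB> document <TAB> document ...
                    String[] parts = line.split("\t");
                    postings.put(parts[0],
                            new TreeSet<>(Arrays.asList(parts).subList(1, parts.length)));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Indexes the words of one document and persists the index if it is on disk. */
    void add(String documentName, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(documentName);
            }
        }
        if (folder != null) {
            save();
        }
    }

    /** Returns the names of the documents that contain the given word. */
    Set<String> search(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }

    private void save() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : postings.entrySet()) {
            lines.add(entry.getKey() + "\t" + String.join("\t", entry.getValue()));
        }
        try {
            Files.write(folder.resolve(FILE_NAME), lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, `new ToyIndex()` keeps everything in a map that disappears with the process, while `new ToyIndex(folder)` writes the same map to a file after every `add` and reads it back the next time the same folder is opened.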
This guide demonstrates how to edit the content of Markdown documents/files like common text documents using GroupDocs.Editor for Python via .NET.
This article explains how to detect the document's file type and calculate the number of pages when converting a file with GroupDocs.Conversion for .NET.
GroupDocs.Assembly for .NET is a document automation and report generation API designed to create custom documents from templates. It assembles the given data with the template document and generates an output document based on the data source, either in the template’s format or in a specified output format. Because it generates documents from a data source, GroupDocs.Assembly for .NET serves two purposes: document automation and report generation.
GroupDocs.Metadata for Java provides functionality for working with different kinds of presentations, such as PPT, PPTX, POTM, and POTX. For the full list of supported presentation formats, please refer to Supported document formats.
Detecting the exact type of a presentation
The following code sample helps you detect the exact type of a loaded presentation and extract some additional file format information.
1. Load a presentation.
2. Extract the root metadata package.
3. Use the getPresentationType method to obtain file format information.
This topic describes how to use the GroupDocs.Viewer .NET API to convert EBooks to HTML, PDF, PNG, and JPEG formats.
Follow this guide and learn how to convert between diagram formats (VSDX, VSD, VSS, etc.) and customize page fitting using GroupDocs.Conversion for .NET.
This article explains how the Python redaction API lets you redact sensitive or private data from your documents. You can apply text redaction using an exact phrase or a regular expression to documents in different formats, such as PDF, DOC, DOCX, PPT, PPTX, XLS, and XLSX.
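The two matching modes mentioned above, exact phrase and regular expression, can be illustrated on a plain string. This is a minimal sketch in plain Java using only `java.util.regex`; it is not the GroupDocs.Redaction API, and the `TextRedactor` class and its method names are hypothetical. It also assumes the exact-phrase match is case-insensitive, which is a choice made for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of GroupDocs.Redaction. */
final class TextRedactor {
    private TextRedactor() {
    }

    /** Replaces every literal occurrence of the phrase, ignoring case. */
    static String redactPhrase(String text, String phrase, String replacement) {
        // Pattern.quote treats the phrase as plain text, so characters such as
        // '.' or '(' in the phrase are not interpreted as regex syntax.
        Pattern literal = Pattern.compile(Pattern.quote(phrase), Pattern.CASE_INSENSITIVE);
        return literal.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    /** Replaces every match of the regular expression. */
    static String redactRegex(String text, String regex, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement from being
        // read as group references.
        return Pattern.compile(regex).matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }
}
```

For example, `TextRedactor.redactPhrase("Call John Doe.", "john doe", "[personal]")` replaces the name regardless of case, while `TextRedactor.redactRegex("Card 1234 5678", "\\d{4}", "####")` masks every group of four digits.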