This article shows how to extract data from documents of various formats, including PDF, Microsoft Word (DOC, DOCX), Excel (XLS, XLSX), LibreOffice formats, etc.
Why To Migrate?
Here are the key reasons to use the updated API provided by GroupDocs.Watermark for Java since version 20.1:
The Watermarker class is introduced as a single entry point for managing watermarks in a document (instead of the Document class from previous versions).
Adding watermarks is unified across all supported document formats.
The product architecture was redesigned from scratch to simplify passing options for managing watermarks.
Document information and preview generation procedures were simplified.
Create bulleted lists in documents by iterating over data collections and displaying items as bullets.
Learn how to run GroupDocs.Comparison for Node.js via Java inside Docker containers with examples, dependencies, and troubleshooting.
Reading Matroska format-specific properties
The GroupDocs.Metadata API supports extracting format-specific information from MKV files.
The following are the steps to read native MKV metadata.
Load an MKV video
Get the root metadata package
Extract the native metadata package using MatroskaRootPackage.MatroskaPackage
Read the Matroska metadata properties on different levels of the format structure
AdvancedUsage.ManagingMetadataForSpecificFormats.Video.Matroska.MatroskaReadNativeMetadataProperties
using (Metadata metadata = new Metadata(Constants.InputMkv))
{
    var root = metadata.GetRootPackage<MatroskaRootPackage>();
    // Read the EBML header
    Console....
Note: This feature is only compatible with GroupDocs.Assembly for Java 19.10 or later releases.
To access XML data while building a report, you can use the facilities of DataSet to read the XML into it and then pass it to the assembler as a data source. However, if your scenario does not permit specifying an XML schema while loading XML into DataSet, all attributes and text values of XML elements are then loaded as strings.
Learn how to search for keywords and use regular expressions to find text in documents using GroupDocs.Parser for Python via .NET. Search text with case sensitivity and whole word options.
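To make the search options above concrete, here is a minimal sketch of what case-sensitive, whole-word, and regular-expression matching mean on plain text. It uses only Python's standard re module and does not call the GroupDocs.Parser API; the function names find_keyword and find_regex, their parameters, and the sample string are hypothetical and chosen for illustration only.

```python
import re


def find_keyword(text, keyword, match_case=False, whole_word=False):
    """Return the start offsets of every occurrence of keyword in text.

    Hypothetical helper: illustrates the option semantics only,
    not how GroupDocs.Parser implements its search.
    """
    # Escape the keyword so it is matched literally, not as a pattern.
    pattern = re.escape(keyword)
    if whole_word:
        # \b anchors the match at word boundaries on both sides.
        pattern = r"\b" + pattern + r"\b"
    flags = 0 if match_case else re.IGNORECASE
    return [m.start() for m in re.finditer(pattern, text, flags)]


def find_regex(text, expression, match_case=False):
    """Return every substring of text that matches the regular expression."""
    flags = 0 if match_case else re.IGNORECASE
    return [m.group(0) for m in re.finditer(expression, text, flags)]


sample = "Invoice 2041: Total due. The subtotal and TOTAL differ."

print(find_keyword(sample, "total"))                   # any case, any position
print(find_keyword(sample, "total", match_case=True))  # exact case only
print(find_keyword(sample, "total", whole_word=True))  # skips "subtotal"
print(find_regex(sample, r"\d{4}"))                    # four-digit numbers
```

With whole_word enabled, the occurrence inside "subtotal" is skipped because no word boundary precedes it; with match_case enabled, only the lowercase occurrence is reported.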
To extract data from PDF documents, the parseForm and parseByTemplate(Template) methods are used. Both methods return a DocumentData object. For details, see Working With Extracted Data.
Here are the steps to extract data from a PDF form:
Instantiate a Parser object for the initial document;
Call the parseForm method and obtain the DocumentData object;
Check that the data isn't null (that is, form parsing is supported for the document);
Iterate over the field data to obtain the form data.
The following example shows the use case where a user fills in a PDF form and sends it by email (for example).
This article explains how to detect the document's file type and calculate the number of pages when converting a file with GroupDocs.Conversion for .NET.