Note: GroupDocs.Parser is a feature-rich document data parsing API. Here you can find descriptions of its most important features.
Parse Document by Template
GroupDocs.Parser allows you to parse documents by user-defined templates. It is easy to create a template with data field definitions and table definitions. To use the template, pass the Template object to the parseByTemplate(Template) method and extract data such as prices, invoices, and tables from your typical documents.
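To make the idea of a template of field definitions concrete, here is a minimal sketch that uses only the Java standard library. It does not call the GroupDocs.Parser API: the class `TemplateSketch`, its methods, the field names, and the sample invoice text are all hypothetical, and plain regular expressions stand in for the library's field definitions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: this class is NOT part of GroupDocs.Parser.
class TemplateSketch {

    // Applies each named field definition (a regex with one capturing group)
    // to the text and returns the first match per field, in definition order.
    // Fields whose pattern does not match are left out of the result.
    static Map<String, String> extract(Map<String, Pattern> fields, String text) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> field : fields.entrySet()) {
            Matcher m = field.getValue().matcher(text);
            if (m.find()) {
                result.put(field.getKey(), m.group(1));
            }
        }
        return result;
    }

    // The "template": field names mapped to the pattern that locates each value.
    // Both patterns are assumptions about what a typical invoice might look like.
    static Map<String, Pattern> invoiceTemplate() {
        Map<String, Pattern> fields = new LinkedHashMap<>();
        fields.put("invoiceNumber", Pattern.compile("Invoice\\s+No\\.?\\s*:?\\s*(\\S+)"));
        fields.put("total", Pattern.compile("Total\\s*:?\\s*\\$?([0-9]+(?:\\.[0-9]{2})?)"));
        return fields;
    }

    public static void main(String[] args) {
        String text = "Invoice No: INV-001\nDate: 2024-01-15\nTotal: $42.50\n";
        extract(invoiceTemplate(), text)
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The design point carried over from the text is that the template is defined once and then reused for every document with the same layout. With the real library, the Template object passed to parseByTemplate(Template) plays the role of the map in this sketch.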
GroupDocs.Parser provides the functionality to extract data from HTML documents and other markup formats.
The following table provides the list of supported formats:
Format   Description
HTML     Hypertext Markup Language File
XHTML    Extensible Hypertext Markup Language File
MHTML    MIME HTML File
MD       Markdown
XML      XML File

More resources
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples:
GroupDocs.Parser for .NET examples
GroupDocs...
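The list of supported markup formats can be used as a quick pre-check before a file is handed to the parser. The sketch below uses only the Java standard library; `MarkupFormats` and `describe` are hypothetical names that are not part of GroupDocs.Parser, and it assumes that a file's extension matches the format name from the table.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, NOT part of GroupDocs.Parser.
class MarkupFormats {

    // Extension -> description, copied from the supported formats table.
    // Assumption: the file extension equals the lowercase format name.
    private static final Map<String, String> FORMATS = new LinkedHashMap<>();
    static {
        FORMATS.put("html", "Hypertext Markup Language File");
        FORMATS.put("xhtml", "Extensible Hypertext Markup Language File");
        FORMATS.put("mhtml", "MIME HTML File");
        FORMATS.put("md", "Markdown");
        FORMATS.put("xml", "XML File");
    }

    // Returns the format description for a file name, or empty if the
    // extension is missing or is not one of the markup formats in the table.
    static Optional<String> describe(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(FORMATS.get(ext));
    }

    public static void main(String[] args) {
        System.out.println(describe("notes.MD").orElse("not a markup format in the table"));
        System.out.println(describe("archive.zip").orElse("not a markup format in the table"));
    }
}
```

An empty result only means the extension is not in the markup table above; it says nothing about other document types the library may handle.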
GroupDocs.Conversion for Node.js via Java supports DOCX, DOCM, DOC, DOT, DOTM, XLS, XLSX, PDF, PPT, JPG, PNG, HTML, EML and many more...