GroupDocs.Parser can extract data from Microsoft Office Word documents. Both the classic (DOC, DOT) and Open XML (DOCX, DOTX) formats are supported, as are LibreOffice Writer (OpenOffice.org Writer) formats and RTF.
The following table provides the list of supported formats:
Format  Description
DOC     Microsoft Office Word Document
DOT     Microsoft Office Word Document Template
DOCX    Microsoft Office Open XML Document
DOCM    Microsoft Office Open XML Macro-Enabled Document
DOTX    Microsoft Office Open XML Document Template
DOTM    Microsoft Office Open XML Document Macro-Enabled Template
TXT     Plain text
ODT     Open Document Text
OTT     Open Document Text Template
RTF     Rich Text Format

More resources
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples:...data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and more...
In the BitTorrent file distribution system, a torrent file or METAINFO is a computer file that contains metadata about files and folders to be distributed, and usually also a list of the network locations of trackers, which are computers that help participants in the system find each other and form efficient distribution groups called swarms. A torrent file does not contain the content to be distributed; it only contains information about those files, such as their names, sizes, folder structure, and cryptographic hash values for verifying file integrity....metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images...
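To make the description above concrete: torrent files are serialized with bencode, so the names, sizes, and tracker locations can be read with a small decoder. The following is a minimal sketch using only the Java standard library, not the GroupDocs.Metadata API. The class name `TorrentMetainfoSketch`, its helper methods, and the sample tracker URL are assumptions made for illustration. The sample omits the `pieces` key that real files use to carry the hash values.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Minimal bencode decoder (illustrative sketch; no input validation beyond the basics). */
class TorrentMetainfoSketch {
    private final byte[] data;
    private int pos = 0;

    private TorrentMetainfoSketch(byte[] data) {
        this.data = data;
    }

    /** Decodes one bencoded value into a Long, byte[], List<Object>, or Map<String, Object>. */
    static Object decode(byte[] bytes) {
        return new TorrentMetainfoSketch(bytes).readValue();
    }

    /** Converts a decoded byte string to text, assuming it is UTF-8. */
    static String text(Object value) {
        return new String((byte[]) value, StandardCharsets.UTF_8);
    }

    private Object readValue() {
        char c = (char) data[pos];
        if (c == 'i') {                          // integer: i<digits>e
            pos++;
            long value = Long.parseLong(readUntil('e'));
            return value;
        }
        if (c == 'l') {                          // list: l<values>e
            pos++;
            List<Object> list = new ArrayList<>();
            while (data[pos] != 'e') {
                list.add(readValue());
            }
            pos++;                               // skip the closing 'e'
            return list;
        }
        if (c == 'd') {                          // dictionary: d<key><value>...e
            pos++;
            Map<String, Object> map = new LinkedHashMap<>();
            while (data[pos] != 'e') {
                String key = text(readValue());  // keys are always byte strings
                map.put(key, readValue());
            }
            pos++;                               // skip the closing 'e'
            return map;
        }
        if (c >= '0' && c <= '9') {              // byte string: <length>:<bytes>
            int length = Integer.parseInt(readUntil(':'));
            byte[] out = Arrays.copyOfRange(data, pos, pos + length);
            pos += length;
            return out;
        }
        throw new IllegalArgumentException("Unexpected byte at offset " + pos);
    }

    /** Reads ASCII characters up to (and consumes) the given terminator. */
    private String readUntil(char end) {
        int start = pos;
        while (data[pos] != end) {
            pos++;
        }
        String s = new String(data, start, pos - start, StandardCharsets.US_ASCII);
        pos++;
        return s;
    }

    public static void main(String[] args) {
        // Hand-written sample with a made-up tracker URL and a single file entry.
        String sample = "d8:announce22:http://tracker.example"
                + "4:infod6:lengthi1024e4:name8:demo.bin12:piece lengthi16384eee";
        Map<?, ?> root = (Map<?, ?>) decode(sample.getBytes(StandardCharsets.US_ASCII));
        Map<?, ?> info = (Map<?, ?>) root.get("info");
        System.out.println("Tracker: " + text(root.get("announce")));
        System.out.println("Name:    " + text(info.get("name")));
        System.out.println("Size:    " + info.get("length"));
    }
}
```

Byte strings are kept as `byte[]` rather than `String` because fields such as piece hashes are binary; only values known to be textual should be passed to `text`.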
This article explains how to extract hyperlinks from Microsoft Office Word (.doc, .docx) documents.
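For readers who want to see where such hyperlinks live, here is a rough sketch that does not use GroupDocs.Parser at all. It relies on the fact that a .docx file is a ZIP archive whose external hyperlink targets are recorded as relationships in `word/_rels/document.xml.rels`. The class and method names are hypothetical, the code needs Java 9 or later, and it covers only the main document part of .docx files: legacy .doc files, header and footer links, and internal bookmark links are out of scope.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Reads external hyperlink targets from a .docx package (illustrative sketch). */
class DocxHyperlinkSketch {
    private static final String RELS_PART = "word/_rels/document.xml.rels";
    private static final String HYPERLINK_TYPE_SUFFIX = "/hyperlink";

    /** Returns the Target of every hyperlink relationship in the main document part. */
    static List<String> extractHyperlinkTargets(InputStream docx) throws Exception {
        List<String> targets = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!RELS_PART.equals(entry.getName())) {
                    continue;
                }
                // Buffer the entry first so the XML parser cannot close the ZIP stream.
                byte[] xml = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                // Reject DOCTYPE declarations as a precaution against XXE attacks.
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                NodeList rels = doc.getElementsByTagName("Relationship");
                for (int i = 0; i < rels.getLength(); i++) {
                    Element rel = (Element) rels.item(i);
                    if (rel.getAttribute("Type").endsWith(HYPERLINK_TYPE_SUFFIX)) {
                        targets.add(rel.getAttribute("Target"));
                    }
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("Usage: java DocxHyperlinkSketch <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Path.of(args[0]))) {
            for (String url : extractHyperlinkTargets(in)) {
                System.out.println(url);
            }
        }
    }
}
```

Unlike a parser library, this approach returns only the link targets, not the visible link text or the page on which each link appears.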
This code snippet demonstrates how to extract information about the known properties that can be encountered in a particular package.
1. Load a file to examine
2. Get a collection of PropertyDescriptor instances for any desired metadata package
3. Iterate through the extracted descriptors

advanced_usage.GettingKnownPropertyDescriptors
    import com.groupdocs.metadata.Metadata;
    import com.groupdocs.metadata.core.PropertyDescriptor;
    import com.groupdocs.metadata.core.WordProcessingRootPackage;
    import com.groupdocs.metadata.tagging.PropertyTag;

    // Constants.InputDoc is the sample file path defined in the examples project.
    try (Metadata metadata = new Metadata(Constants.InputDoc)) {
        WordProcessingRootPackage root = metadata.getRootPackageGeneric();

        // Print the name, type, access level, and tags of each known property.
        for (PropertyDescriptor descriptor : root.getDocumentProperties().getKnowPropertyDescriptors()) {
            System.out.println(descriptor.getName());
            System.out.println(descriptor.getType());
            System.out.println(descriptor.getAccessLevel());
            for (PropertyTag tag : descriptor.getTags()) {
                System.out.println(tag);
            }
            System.out.println();
        }
    }

Note: Not all possible properties are presented in the getKnowPropertyDescriptors collection.
demo-app/business-plan.docx HOME BASED professional services Business Plan HOME BASED PROFESSIONAL SERVICES Business Plan TABLE OF CONTENTS Introduction 3 1. Executive Summary 5 2. Company Overview......marketing such as social media, email marketing, or SEO Sales strategy:...25 18 demo-app/business-plan.pdf HOME BASED PROFESSIONAL SERVICES...
Java API to remove all or selected metadata properties from DOCX, XLSX, PPTX, and PDF documents, JPEG, PNG, and WebP images, emails, eBooks, Visio drawings, ZIP archives, etc....spreadsheets, presentations, PDF files, images, emails, eBooks, drawings...
This example demonstrates some advanced usage scenarios of the GroupDocs.Metadata search engine that allow you to remove metadata properties.
The easiest way to remove metadata properties from a file is to use the corresponding tags, which let you locate the desired properties across all metadata packages.
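The idea of tag-based removal can be illustrated with a small self-contained model. This is a conceptual sketch only, not the GroupDocs.Metadata API: the `TagRemovalSketch` class, its `Property` type, the package names, and the tag strings are all hypothetical. It shows why tags are convenient: one tag matches equivalent properties in every package, whatever each package calls them.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of tag-based metadata removal (hypothetical types, not a real library API). */
class TagRemovalSketch {

    /** A metadata property with a package-specific name and a set of format-neutral tags. */
    static final class Property {
        final String name;
        final Set<String> tags;

        Property(String name, Set<String> tags) {
            this.name = name;
            this.tags = tags;
        }
    }

    /** Removes every property carrying the given tag from all packages; returns how many were removed. */
    static int removeByTag(Map<String, List<Property>> packages, String tag) {
        int removed = 0;
        for (List<Property> properties : packages.values()) {
            Iterator<Property> it = properties.iterator();
            while (it.hasNext()) {
                if (it.next().tags.contains(tag)) {
                    it.remove();
                    removed++;
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Two made-up packages that store the author under different property names.
        Map<String, List<Property>> packages = new LinkedHashMap<>();
        packages.put("BuiltInProperties", new ArrayList<>(List.of(
                new Property("Author", Set.of("person.creator")),
                new Property("Title", Set.of("content.title")))));
        packages.put("DublinCore", new ArrayList<>(List.of(
                new Property("dc:creator", Set.of("person.creator")))));

        int removed = removeByTag(packages, "person.creator");
        System.out.println("Removed properties: " + removed);
    }
}
```

Matching on tags rather than names means the caller does not need to know how each package spells "author"; the same call would keep working if another package with its own naming were added to the map.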