Learn how to extract metadata from Word documents (.doc, .docx) using GroupDocs.Parser for .NET. Extract document properties like author, title, creation date, comments, and revision information from Word files....data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and more...
This article explains how to access EXIF metadata in a file of any supported format....metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images and...
Detecting the version of a PDF document

The following code sample detects the PDF version of a loaded document and extracts some additional file format information.
1. Load a PDF document
2. Extract the root metadata package
3. Use the getPdfType method to obtain file format information

advanced_usage.managing_metadata_for_specific_formats.document.pdf.PdfReadFileFormatProperties
    try (Metadata metadata = new Metadata(Constants.InputPdf)) {
        PdfRootPackage root = metadata.getRootPackageGeneric();
        System.out.println(root.getPdfType().getFileFormat());
        System.out.println(root.getPdfType().getVersion());
        System.out.println(root.getPdfType().getMimeType());
        System.out.println(root.getPdfType().getExtension());
    }

Reading built-in metadata properties

To access built-in metadata of a PDF document, please use the getDocumentProperties method defined in the DocumentRootPackage class.
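For comparison with the PDF version detection sample above, the following is a minimal sketch that reads the version straight from the file header using only the Java standard library. It is not the GroupDocs.Metadata API and does not reproduce what getPdfType does internally. The class name PdfHeaderVersion and its methods are hypothetical, and the sketch assumes the "%PDF-x.y" header comment appears within the first 1024 bytes of the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of GroupDocs.Metadata.
final class PdfHeaderVersion {
    // A PDF file begins with a header comment such as "%PDF-1.7".
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d+\\.\\d+)");

    // Returns the version found in the given leading bytes, or empty if there is no header.
    public static Optional<String> parseVersion(byte[] head) {
        // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Reads up to the first 1024 bytes of the file and looks for the header there (assumption).
    public static Optional<String> readVersion(Path file) throws IOException {
        byte[] buf = new byte[1024];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (n < buf.length) {
                int r = in.read(buf, n, buf.length - n);
                if (r < 0) {
                    break; // end of file reached
                }
                n += r;
            }
        }
        return parseVersion(Arrays.copyOf(buf, n));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfHeaderVersion <file.pdf>");
            return;
        }
        System.out.println(readVersion(Paths.get(args[0])).orElse("no PDF header found"));
    }
}
```

The header gives only a first approximation: since PDF 1.4, a document catalog may declare a later version that overrides the header, so a metadata library that parses the document structure can legitimately report a different value.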
This article shows how to access IPTC metadata in a file of any supported format.
First of all you need to create an index. An index can be created in memory or on disk. An index created in memory cannot be saved after exiting your program. In contrast, an index created on disk may be loaded in the future to continue working....over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more with our...
GroupDocs.Metadata for Java provides functionality that allows working with different kinds of WordProcessing documents such as DOC, DOCX, ODT, etc. For the full list of supported document formats please refer to Supported document formats.
Detecting the exact type of a document

The following code sample detects the exact type of a loaded document and extracts some additional file format information.
1. Load a WordProcessing document
2. Extract the root metadata package
3. Use the getWordProcessingType method to obtain file format information

advanced_usage...
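As a rough, library-independent illustration of what type detection involves, the sketch below inspects the leading bytes of a file to tell the two common container families apart. It is not the GroupDocs.Metadata API and is far coarser than getWordProcessingType: it only recognizes the OLE2 compound file signature (used by legacy binary formats such as DOC) and the ZIP local file header signature (used by packaged formats such as DOCX and ODT). The class name WordContainerSniffer and its return values are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of GroupDocs.Metadata.
final class WordContainerSniffer {
    // Signature of an OLE2 compound file (legacy binary formats such as DOC).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Signature of a ZIP local file header, "PK\3\4" (DOCX, ODT and other packages).
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    // Returns "OLE2", "ZIP" or "UNKNOWN" based on the leading bytes only.
    public static String sniff(byte[] head) {
        if (startsWith(head, OLE2)) {
            return "OLE2";
        }
        if (startsWith(head, ZIP)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordContainerSniffer <file>");
            return;
        }
        byte[] buf = new byte[8];
        int n = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            while (n < buf.length) {
                int r = in.read(buf, n, buf.length - n);
                if (r < 0) {
                    break; // file shorter than 8 bytes
                }
                n += r;
            }
        }
        System.out.println(sniff(Arrays.copyOf(buf, n)));
    }
}
```

A container match does not identify the exact format: a ZIP result could be DOCX, ODT, XLSX or any other archive, which is why a dedicated library has to look inside the package to report a precise type.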
It supports DOCX, DOCM, DOC, DOT, DOTM, XLS, XLSX, PDF, PPT, JPG, PNG, HTML, EML and many more...Try the to convert DOCX, XLSX, PPTX, and more. This topic lists...
This page contains a description of all index settings that can be specified in an instance of the IndexSettings class.
id: search-text-in-html-documents
url: parser/net/search-text-in-html-documents
title: Search text in HTML documents
weight: 2
description: "To search a keyword in HTML documents Search(String) method is used. This method returns the collection of SearchResult objects."
keywords: search a keyword, search a keyword in HTML
productName: GroupDocs.Parser for .NET
hideChildren: False
toc: true
To search for a keyword in HTML documents, use the Search(String) method. It returns a collection of SearchResult objects. For details, see Search Text.