This article shows how to extract the table of contents from Microsoft Word (DOC, DOCX, etc.), PDF documents, and eBooks (CHM, EPUB)....extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and...
To extract files from ZIP archives, use the getContainer method. It returns a collection of ContainerItem objects.
A Zip Entry can contain the following metadata:
Name    Description
date    The time and date at which the file indicated by the Zip Entry was last modified.
crc     The 32-bit CRC (Cyclic Redundancy Check) on the contents of the Zip Entry.
These metadata refer to the container element itself, not to a document.
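To make the two fields concrete, here is a minimal sketch that reads the last-modified date and the CRC of each entry with the JDK's own `java.util.zip` classes. It does not use GroupDocs.Parser, `getContainer`, or `ContainerItem`; the class `ZipEntryMetadataSketch`, its `EntryInfo` holder, and the helper names are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipEntryMetadataSketch {

    /** Hypothetical holder for the "date" and "crc" fields of one entry. */
    static final class EntryInfo {
        final String name;
        final long lastModifiedMillis; // "date": last modification time
        final long crc;                // "crc": 32-bit CRC of the entry contents

        EntryInfo(String name, long lastModifiedMillis, long crc) {
            this.name = name;
            this.lastModifiedMillis = lastModifiedMillis;
            this.crc = crc;
        }
    }

    /** Builds a one-entry ZIP in memory so the sketch needs no files on disk. */
    static byte[] buildSampleZip(String entryName, String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(text.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the container-level metadata of every entry in the archive. */
    static List<EntryInfo> readEntryInfo(byte[] zipBytes) {
        List<EntryInfo> result = new ArrayList<>();
        byte[] scratch = new byte[4096];
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // A streamed entry may store its CRC after the data,
                // so read the entry to its end before asking for it.
                while (zip.read(scratch) != -1) {
                    // discard the contents; only the metadata is needed here
                }
                result.add(new EntryInfo(entry.getName(), entry.getTime(), entry.getCrc()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] zip = buildSampleZip("mail.txt", "hello");
        for (EntryInfo info : readEntryInfo(zip)) {
            System.out.println(info.name + " crc=" + Long.toHexString(info.crc)
                    + " modified=" + info.lastModifiedMillis);
        }
    }
}
```

Both values describe the archive entry, not the document stored in it, which is the distinction the table draws.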
Here are the steps to extract email text from ZIP archives:...extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and...
Learn how to extract table of contents (TOC) from Word documents, PDF files, and eBooks using GroupDocs.Parser for .NET. Extract TOC items with page numbers and depth levels in C#....images from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and...
Get ZIP format metadata
The API can detect ZIP archives and read their format metadata. Follow these steps:
1. Load a ZIP archive
2. Get the root metadata package
3. Extract the native metadata package using ZipRootPackage.ZipPackage
4. Read the ZIP archive properties
5. Loop through ZipPackage.Files to extract information about archived files
The following code snippet shows how to get metadata from a ZIP archive.
AdvancedUsage.ManagingMetadataForSpecificFormats.Archive.ZipReadNativeMetadataProperties
Encoding encoding = Encoding....metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images...
Reading AVI header properties
The GroupDocs.Metadata API supports extracting format-specific information from AVI file headers.
The following are the steps to read the header of an AVI file.
1. Load an AVI video
2. Get the root metadata package
3. Extract the native metadata package using AviRootPackage.Header
4. Read the AVI header properties
AdvancedUsage.ManagingMetadataForSpecificFormats.Video.Avi.AviReadHeaderProperties
using (Metadata metadata = new Metadata(Constants.InputAvi))
{
    // Request the AVI-specific root package so that the Header property is available
    var root = metadata.GetRootPackage<AviRootPackage>();
    Console.WriteLine(root.Header.AviHeaderFlags);
    Console.WriteLine(root.Header.Height);
    Console.WriteLine(root.Header.Width);
    Console.WriteLine(root.Header.TotalFrames);
    Console.WriteLine(root.Header.InitialFrames);
    Console.WriteLine(root.Header.MaxBytesPerSec);
    Console.WriteLine(root.Header.PaddingGranularity);
    Console.WriteLine(root.Header.Streams);
    // ....metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images...
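For readers who want to see where these values live in the file, the following is a rough sketch that decodes the AVI main header (`avih` chunk) directly from bytes using only the JDK. It is not the GroupDocs.Metadata implementation. It assumes the common layout in which `avih` is the first chunk of the `hdrl` list at the start of the RIFF file, and the class and method names (`AviMainHeaderSketch`, `parse`, `buildSample`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class AviMainHeaderSketch {

    /** Hypothetical holder mirroring the header properties printed above. */
    static final class MainHeader {
        final int flags, width, height, totalFrames, initialFrames,
                maxBytesPerSec, paddingGranularity, streams;

        MainHeader(int flags, int width, int height, int totalFrames, int initialFrames,
                   int maxBytesPerSec, int paddingGranularity, int streams) {
            this.flags = flags;
            this.width = width;
            this.height = height;
            this.totalFrames = totalFrames;
            this.initialFrames = initialFrames;
            this.maxBytesPerSec = maxBytesPerSec;
            this.paddingGranularity = paddingGranularity;
            this.streams = streams;
        }
    }

    // Assumed layout: "RIFF" size "AVI " "LIST" size "hdrl" "avih" size, then the data.
    static final int AVIH_DATA_OFFSET = 32;
    static final int AVIH_DATA_SIZE = 56; // 14 little-endian 32-bit fields

    static MainHeader parse(byte[] file) {
        if (file.length < AVIH_DATA_OFFSET + AVIH_DATA_SIZE) {
            throw new IllegalArgumentException("input too short for an AVI main header");
        }
        requireTag(file, 0, "RIFF");
        requireTag(file, 8, "AVI ");
        requireTag(file, 12, "LIST");
        requireTag(file, 20, "hdrl");
        requireTag(file, 24, "avih");
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        int base = AVIH_DATA_OFFSET;
        return new MainHeader(
                buf.getInt(base + 12),  // dwFlags
                buf.getInt(base + 32),  // dwWidth
                buf.getInt(base + 36),  // dwHeight
                buf.getInt(base + 16),  // dwTotalFrames
                buf.getInt(base + 20),  // dwInitialFrames
                buf.getInt(base + 4),   // dwMaxBytesPerSec
                buf.getInt(base + 8),   // dwPaddingGranularity
                buf.getInt(base + 24)); // dwStreams
    }

    static void requireTag(byte[] file, int offset, String tag) {
        String found = new String(file, offset, 4, StandardCharsets.US_ASCII);
        if (!found.equals(tag)) {
            throw new IllegalArgumentException("expected '" + tag + "' at offset " + offset);
        }
    }

    /** Builds a synthetic 88-byte header so the sketch runs without a real video. */
    static byte[] buildSample(int width, int height, int totalFrames, int streams) {
        ByteBuffer buf = ByteBuffer.allocate(AVIH_DATA_OFFSET + AVIH_DATA_SIZE)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.put(ascii("RIFF")).putInt(80).put(ascii("AVI "));
        buf.put(ascii("LIST")).putInt(68).put(ascii("hdrl"));
        buf.put(ascii("avih")).putInt(AVIH_DATA_SIZE);
        buf.putInt(40000)       // dwMicroSecPerFrame (25 fps)
           .putInt(0)           // dwMaxBytesPerSec
           .putInt(0)           // dwPaddingGranularity
           .putInt(0x10)        // dwFlags (AVIF_HASINDEX)
           .putInt(totalFrames) // dwTotalFrames
           .putInt(0)           // dwInitialFrames
           .putInt(streams)     // dwStreams
           .putInt(0)           // dwSuggestedBufferSize
           .putInt(width)       // dwWidth
           .putInt(height);     // dwHeight; the four reserved fields stay zero
        return buf.array();
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        MainHeader h = parse(buildSample(640, 480, 250, 2));
        System.out.println(h.width + "x" + h.height + ", frames=" + h.totalFrames
                + ", streams=" + h.streams);
    }
}
```

Real files can carry extra chunks or a different chunk order, so a production reader should walk the RIFF chunk tree rather than rely on fixed offsets.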
This article shows how to redact the pages of a document as images, redacting entire areas of the page instead of, or in addition to, specific text....formats like PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and...
This article describes two ways to create a search query, in text form or in object form, using the Java search API....search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more with...
This article explains how to separately extract data from documents and add the extracted data to the index....search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more with...
Understand and extract human-readable (interpreted) values for metadata properties using GroupDocs.Metadata for Python via .NET....metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images...
This article provides the complete specification of the search query DSL used in text queries....search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more with...