This article shows how to extract text with GroupDocs.Parser from PDF, emails, e-books (EPUB, FB2, CHM), Microsoft Office formats such as Word (DOC, DOCX), PowerPoint (PPT, PPTX), and Excel (XLS, XLSX), LibreOffice formats, and many others.
This article shows how to extract images from PDF, emails, e-books, Microsoft Office formats such as Word (DOC, DOCX), PowerPoint (PPT, PPTX), and Excel (XLS, XLSX), LibreOffice formats, and many others.
Work with containers such as ZIP archives, email stores, and PDF portfolios using GroupDocs.Parser for Python via .NET.
GroupDocs.Parser can extract emails from remote servers. The following email protocols are supported:
Post Office Protocol (POP)
Internet Message Access Protocol (IMAP)
Exchange Web Services (EWS)
To create an instance of the Parser class that extracts emails from a remote server, use one of the following constructors:
Parser(EmailConnection connection);
Parser(EmailConnection connection, ParserSettings parserSettings)
The second constructor accepts a ParserSettings object that controls the process, for example by adding logging.
This article shows how to extract formatted text, represented as HTML or Markdown, with GroupDocs.Parser from documents in various formats: emails, e-books (EPUB, FB2, CHM), Microsoft Office formats such as Word (DOC, DOCX), PowerPoint (PPT, PPTX), and Excel (XLS, XLSX), LibreOffice formats, and many others.
The GroupDocs.Metadata API can read basic metadata from CAD files. The supported CAD formats are:
DWG
DXF
Reading CAD metadata
To access metadata in a CAD drawing, the GroupDocs.Metadata API provides the CadRootPackage.getCadPackage method.
The following code snippet reads metadata associated with a CAD file.
advanced_usage.managing_metadata_for_specific_formats.cad.CadReadNativeMetadataProperties
try (Metadata metadata = new Metadata(Constants.InputDxf)) {
    // Obtain the format-specific root package of the loaded drawing
    CadRootPackage root = metadata.getRootPackageGeneric();
    // Print the native CAD properties one by one
    System.out.println(root.getCadPackage().getAcadVersion());
    System.out.println(root.getCadPackage().getAuthor());
    System.out.println(root.getCadPackage().getComments());
    System.out.println(root.getCadPackage().getCreatedDateTime());
    System.out.println(root.getCadPackage().getHyperlinkBase());
    System.out.println(root.getCadPackage().getKeywords());
    System.out.println(root.getCadPackage().getLastSavedBy());
    System.out.println(root.getCadPackage().getTitle());
    // ...
}
In the BitTorrent file distribution system, a torrent file or METAINFO is a computer file that contains metadata about files and folders to be distributed, and usually also a list of the network locations of trackers, which are computers that help participants in the system find each other and form efficient distribution groups called swarms. A torrent file does not contain the content to be distributed; it only contains information about those files, such as their names, sizes, folder structure, and cryptographic hash values for verifying file integrity.
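To make the structure of that metadata more concrete, the sketch below decodes a tiny hand-built torrent-like dictionary using only the Java standard library. It relies on the fact that torrent files are stored in the bencode format; it does not use GroupDocs.Metadata, and the class name TorrentSketch, the helper bstr, the tracker URL, and the sample values are all hypothetical. It performs no validation of malformed input and should be read as an illustration rather than a complete parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of any GroupDocs API.
public class TorrentSketch {
    private final byte[] data;
    private int pos = 0;

    TorrentSketch(byte[] data) { this.data = data; }

    /** Decodes a complete bencoded buffer. */
    static Object parse(byte[] data) { return new TorrentSketch(data).decode(); }

    /** Decodes one value: Long, byte[], List<Object> or Map<String, Object>. */
    Object decode() {
        char c = (char) data[pos];
        if (c == 'i') {                                  // integer: i<digits>e
            int end = indexOf('e');
            long value = Long.parseLong(ascii(pos + 1, end));
            pos = end + 1;
            return value;
        }
        if (c == 'l') {                                  // list: l<values>e
            pos++;
            List<Object> list = new ArrayList<>();
            while (data[pos] != 'e') list.add(decode());
            pos++;
            return list;
        }
        if (c == 'd') {                                  // dictionary: d<key><value>...e
            pos++;
            Map<String, Object> map = new LinkedHashMap<>();
            while (data[pos] != 'e') {
                String key = text(decode());             // keys are byte strings
                map.put(key, decode());
            }
            pos++;
            return map;
        }
        int colon = indexOf(':');                        // byte string: <length>:<bytes>
        int length = Integer.parseInt(ascii(pos, colon));
        byte[] bytes = Arrays.copyOfRange(data, colon + 1, colon + 1 + length);
        pos = colon + 1 + length;
        return bytes;
    }

    private int indexOf(char target) {
        int i = pos;
        while (data[i] != target) i++;
        return i;
    }

    private String ascii(int from, int to) {
        return new String(data, from, to - from, StandardCharsets.US_ASCII);
    }

    /** Interprets a decoded byte string as UTF-8 text. */
    static String text(Object value) {
        return new String((byte[]) value, StandardCharsets.UTF_8);
    }

    /** Encodes a string as a bencoded byte string (used only to build the sample). */
    static String bstr(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + ":" + s;
    }

    public static void main(String[] args) {
        // Hand-built single-file sample; a real torrent also carries a "pieces"
        // entry holding the hash values mentioned above.
        String sample = "d" + bstr("announce") + bstr("http://tracker.example/announce")
                + bstr("info") + "d"
                + bstr("length") + "i1024e"
                + bstr("name") + bstr("demo.bin")
                + bstr("piece length") + "i262144e"
                + "e" + "e";
        Map<?, ?> root = (Map<?, ?>) parse(sample.getBytes(StandardCharsets.UTF_8));
        Map<?, ?> info = (Map<?, ?>) root.get("info");
        System.out.println("tracker: " + text(root.get("announce")));
        System.out.println("name:    " + text(info.get("name")));
        System.out.println("length:  " + info.get("length"));
    }
}
```

The top-level dictionary carries the tracker location under "announce", while the nested "info" dictionary holds the file name and size, which matches the split between tracker list and file description in the paragraph above.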