If your corporate policy for removing sensitive data is expressed as a list of redaction rules, you don't need to specify them in your code. Instead, you can supply an XML document with a list of pre-configured redactions.
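To make the idea of keeping redaction rules outside the code concrete, here is a minimal sketch that loads rules from an XML string and applies them to plain text. It uses only the JDK's DOM parser and regular expressions. The `<policy>`/`<rule>` schema, the `pattern` and `replacement` attributes, and the class name `RedactionPolicySketch` are assumptions made up for this illustration; they are not the policy format or API of any GroupDocs product.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical policy loader: the XML schema below is invented for illustration.
class RedactionPolicySketch {

    // One rule: a regular expression and the literal text that replaces each match.
    static final class Rule {
        final Pattern pattern;
        final String replacement;

        Rule(Pattern pattern, String replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    // Reads every <rule pattern="..." replacement="..."/> element from the XML text.
    static List<Rule> load(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("rule");
            List<Rule> rules = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                rules.add(new Rule(Pattern.compile(e.getAttribute("pattern")),
                                   e.getAttribute("replacement")));
            }
            return rules;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid policy XML", e);
        }
    }

    // Applies the rules in document order; replacements are treated as literal text.
    static String apply(List<Rule> rules, String text) {
        String result = text;
        for (Rule rule : rules) {
            result = rule.pattern.matcher(result)
                    .replaceAll(Matcher.quoteReplacement(rule.replacement));
        }
        return result;
    }
}
```

With this layout, changing the policy means editing the XML document rather than recompiling the application, which is the benefit the paragraph above describes.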
This article describes the main functions of GroupDocs.Parser for Python via .NET: extracting text, images, metadata, tables, and structured data from documents, with template-based parsing support. ...ZIP archives, PDF portfolios, emails, OST containers. You can simply...
To extract text from PDF documents, use the getText and getText(int) methods. They extract text from the entire document or from a selected page.
Here are the steps to extract text from a PDF document:
1. Instantiate a Parser object for the initial document.
2. Call the getText method and obtain a TextReader object.
3. Read the text from the reader.
Warning: the getText method returns null if text extraction isn’t supported for the document.
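The null-return warning is easy to overlook, so the following sketch shows one way to guard against it when reading extracted text. It does not use the GroupDocs.Parser library: `TextSource` is a hypothetical stand-in for a parser, and a plain `java.io.Reader` stands in for the TextReader object. Only the control flow (check for null, read to the end, close the reader) is meant to carry over.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Optional;

class TextReadingSketch {

    // Hypothetical stand-in for a parser; not the GroupDocs.Parser API.
    interface TextSource {
        // Assumed to return null when text extraction is not supported.
        Reader getText();
    }

    // Returns the full text, or an empty Optional when extraction is unsupported.
    static Optional<String> readAll(TextSource source) {
        Reader reader = source.getText();
        if (reader == null) {
            return Optional.empty();
        }
        // try-with-resources closes the reader even if reading fails.
        try (Reader r = reader) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[4096];
            int count;
            while ((count = r.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
            return Optional.of(text.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Returning an `Optional` is a design choice of this sketch: it forces the caller to handle the unsupported case explicitly instead of dereferencing a null reader.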
Learn how to extract tables from a specific document page using GroupDocs.Parser for .NET. Extract tables from Excel, Word, and PDF pages by page index in C#.
To extract text from EPUB e-books, use the getText and getText(pageIndex) methods. They extract text from the entire document or from a selected page. Raw mode is not supported for EPUB.
To extract text from Microsoft Office Word documents, use the getText and getText(int) methods. They extract text from the entire document or from a selected page. The TextOptions parameter is ignored for Microsoft Office Word documents.
Here are the steps to extract text from a Microsoft Office Word document:
1. Instantiate a Parser object for the initial document.
2. Call the getText method and obtain a TextReader object.
3. Read the text from the reader.
The GroupDocs.Metadata API can read basic metadata from CAD files. The supported CAD formats are:
DWG
DXF
Reading CAD metadata
To access metadata in a CAD drawing, the GroupDocs.Metadata API provides the CadRootPackage.getCadPackage method.
The following code snippet reads metadata associated with a CAD file.
advanced_usage.managing_metadata_for_specific_formats.cad.CadReadNativeMetadataProperties
try (Metadata metadata = new Metadata(Constants.InputDxf)) {
    CadRootPackage root = metadata.getRootPackageGeneric();
    System.out.println(root.getCadPackage().getAcadVersion());
    System.out.println(root.getCadPackage().getAuthor());
    System.out.println(root.getCadPackage().getComments());
    System.out.println(root.getCadPackage().getCreatedDateTime());
    System.out.println(root.getCadPackage().getHyperlinkBase());
    System.out.println(root.getCadPackage().getKeywords());
    System.out.println(root.getCadPackage().getLastSavedBy());
    System.out.println(root.getCadPackage().getTitle());
    // ...
}
Reading Photoshop Metadata properties
The GroupDocs Metadata API allows the user to read Adobe Photoshop metadata associated with a PSD image. For more information on the Photoshop file format and metadata blocks, please refer to the specification: https://www.adobe.com/devnet-apps/photoshop/fileformatashtml/.
The code sample below demonstrates how to extract image resource blocks (building blocks of the Photoshop file format) from a PSD image.
1. Load a PSD file.
2. Get the root metadata package.
3. Extract the ImageResourcePackage instance and obtain a list of ImageResourceBlock objects.
4. Iterate through the collection of resource blocks.
advanced_usage...
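To show what an image resource block contains at the byte level, the sketch below parses the body of a PSD image resources section by hand, following the block layout in the Adobe specification linked above: a 4-byte signature `8BIM`, a 2-byte resource ID, a Pascal-string name padded to an even length, a 4-byte data size, and the data padded to an even length. It is not the GroupDocs.Metadata API. The class name `ImageResourceBlockSketch`, the choice of ISO-8859-1 for decoding names, and the decision to reject any signature other than `8BIM` are assumptions of this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ImageResourceBlockSketch {

    // A parsed block: numeric resource ID, optional name, and raw payload.
    static final class Block {
        final int id;
        final String name;
        final byte[] data;

        Block(int id, String name, byte[] data) {
            this.id = id;
            this.name = name;
            this.data = data;
        }
    }

    // Parses the section body, i.e. the bytes after the section's 4-byte length prefix.
    static List<Block> parse(byte[] section) {
        ByteBuffer buf = ByteBuffer.wrap(section); // ByteBuffer is big-endian by default
        List<Block> blocks = new ArrayList<>();
        while (buf.hasRemaining()) {
            byte[] signature = new byte[4];
            buf.get(signature);
            if (!"8BIM".equals(new String(signature, StandardCharsets.US_ASCII))) {
                throw new IllegalArgumentException("Unexpected block signature");
            }
            int id = buf.getShort() & 0xFFFF;

            // Pascal string: one length byte, then the characters.
            int nameLength = buf.get() & 0xFF;
            byte[] nameBytes = new byte[nameLength];
            buf.get(nameBytes);
            if ((1 + nameLength) % 2 != 0) {
                buf.get(); // skip the pad byte that makes the name field even-sized
            }

            int size = buf.getInt();
            byte[] data = new byte[size];
            buf.get(data);
            if (size % 2 != 0 && buf.hasRemaining()) {
                buf.get(); // skip the pad byte that makes the data even-sized
            }

            blocks.add(new Block(id, new String(nameBytes, StandardCharsets.ISO_8859_1), data));
        }
        return blocks;
    }
}
```

Truncated input surfaces as a `BufferUnderflowException` from `ByteBuffer`; a production parser would likely validate lengths up front and report a clearer error.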