GroupDocs.Parser provides the functionality to extract data from HTML documents and other markup formats.
The following table provides the list of supported formats:
Format    Description
HTML      Hypertext Markup Language File
XHTML     Extensible Hypertext Markup Language File
MHTML     MIME HTML File
MD        Markdown
XML       XML File

More resources
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples:
GroupDocs.Parser for .NET examples GroupDocs....data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and more...
Working with XMP metadata
GroupDocs.Metadata for Java allows managing XMP metadata in TIFF images. For more details, please refer to the following guide: Working with XMP Metadata.
Working with EXIF metadata
The GroupDocs.Metadata API supports handling EXIF metadata in TIFF images. Please find appropriate code samples in the Working with EXIF Metadata section.
Working with IPTC metadata
GroupDocs.Metadata for Java is also able to work with IPTC metadata in TIFF images....metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images...
Working with XMP metadata
GroupDocs.Metadata for .NET allows managing XMP metadata in TIFF images. For more details, please refer to the following guide: Working with XMP metadata.
Working with EXIF metadata
The GroupDocs.Metadata API supports handling EXIF metadata in TIFF images. Please find appropriate code samples in the Working with EXIF metadata section.
Working with IPTC metadata
GroupDocs.Metadata for .NET is also able to work with IPTC metadata in TIFF images...
This article demonstrates how to save the redacted document, replacing the original file...formats like PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails and more...
This article shows you how to view and edit metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images and more with our free online...
This article shows how to extract data from documents of various formats, including PDF, Microsoft Word (DOC, DOCX), Excel (XLS, XLSX), LibreOffice formats, etc...
Sometimes you may need to remove or clean all metadata properties without applying any filters. The best way to do this is to use the Sanitize method....pdf" try ( Metadata metadata = new...
In some cases, the document format must be specified manually to guarantee that GroupDocs.Parser produces correct output. These cases are:
- Markdown documents
- MHTML documents
- OTP documents (OpenDocument Presentation Template)
- Databases
- Emails from remote servers

Here are the steps to specify the document format for a markup document:

1. Instantiate the LoadOptions object and pass the document format to the LoadOptions(FileFormat) constructor.
2. Create a Parser object and call any method...
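The two steps above can be sketched in C# roughly as follows. This is a minimal illustration, not the library's official sample: the file name sample.md is hypothetical, and the sketch assumes that FileFormat.Markup is the value to pass for markup documents and that GetText() returns null when text extraction is not supported. Check both against the GroupDocs.Parser API reference for your version.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Options;

class SpecifyFormatExample
{
    static void Main()
    {
        // Hypothetical input path; replace with a real Markdown or MHTML file.
        string filePath = "sample.md";

        // Step 1: pass the document format to the LoadOptions(FileFormat) constructor.
        // Assumption: FileFormat.Markup is the value that covers markup documents.
        LoadOptions loadOptions = new LoadOptions(FileFormat.Markup);

        // Step 2: create the Parser with these options and call any method.
        using (Parser parser = new Parser(filePath, loadOptions))
        using (TextReader reader = parser.GetText())
        {
            // Assumption: a null reader means text extraction is not supported.
            Console.WriteLine(reader == null
                ? "Text extraction isn't supported"
                : reader.ReadToEnd());
        }
    }
}
```

Passing the format explicitly skips automatic format detection, which is why it matters for the cases listed above.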
Reading DICOM metadata properties
The GroupDocs.Metadata API supports extracting format-specific information from DICOM images.
The following are the steps to read the native DICOM metadata:

1. Load a DICOM image
2. Get the root metadata package
3. Extract the native metadata package using DicomRootPackage.DicomPackage
4. Read the DICOM metadata properties

AdvancedUsage.ManagingMetadataForSpecificFormats.Image.Dicom.DicomReadNativeMetadataProperties
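    // Requires: using System; using GroupDocs.Metadata; using GroupDocs.Metadata.Formats.Image;
    // Constants.InputDicom is the path to the input DICOM image (defined in the examples project).
    using (Metadata metadata = new Metadata(Constants.InputDicom))
    {
        // Request the DICOM-specific root package; the type argument appears to have been lost in extraction.
        var root = metadata.GetRootPackage<DicomRootPackage>();
        if (root.DicomPackage != null)
        {
            Console.WriteLine(root.DicomPackage.BitsAllocated);
            Console.WriteLine(root.DicomPackage.LengthValue);
            Console.WriteLine(root.DicomPackage.DicomFound);
            Console.WriteLine(root.DicomPackage.HeaderOffset);
            Console.WriteLine(root.DicomPackage.NumberOfFrames);
            // ...
        }
    }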
Extract information about known properties available in a particular package using GroupDocs.Metadata for Python via .NET...