To extract hyperlinks from a Microsoft Office Word document, use the getStructure method. This method returns an XML representation of the document. Hyperlinks are represented by the “hyperlink” tag; the “link” attribute contains the hyperlink’s URL. For more details, see Extract Text structure. A hyperlink can contain text:
google.com
Warning: the getStructure method returns null if text structure extraction isn’t supported for the document. For example, text structure extraction isn’t supported for TXT files, so getStructure returns null for a TXT file.
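Once the XML is available, reading the hyperlinks is ordinary XML processing. The following is a minimal sketch that uses only the JDK's built-in XML parser, not GroupDocs.Parser itself. The class name HyperlinkSketch, the helper extractLinks, the sample XML string, and the enclosing "document" and "p" tags are assumptions made for illustration; only the "hyperlink" tag and its "link" attribute come from the description above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of GroupDocs.Parser.
class HyperlinkSketch {

    // Collects the "link" attribute of every <hyperlink> element in the XML text.
    static List<String> extractLinks(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("hyperlink");
            List<String> links = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element hyperlink = (Element) nodes.item(i);
                links.add(hyperlink.getAttribute("link"));
            }
            return links;
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not parse structure XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample input; a real document's structure XML will be larger.
        String xml = "<document><p>See <hyperlink link=\"https://example.com\">"
                + "example.com</hyperlink></p></document>";
        System.out.println(extractLinks(xml));
    }
}
```

In real code, the XML would come from the value returned by getStructure, and the null check described in the warning would come before any parsing.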
Learn how to extract data from documents on the local disk.
This article explains how to get the collection of changes between compared documents when using GroupDocs.Comparison for .NET.
Learn how to load a document from a stream.
This article explains how to extract hyperlinks from a document page.
This topic describes how to use the GroupDocs.Viewer Java API to convert PDF files to HTML, PNG, and JPEG formats.
Character replacement during indexing can be used, for example, to convert all text to lowercase characters or to remove diacritics from text.
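To make the idea concrete, here is a minimal sketch of the two replacements mentioned above (lowercasing and diacritic removal) using only the JDK's java.text.Normalizer. It is not the GroupDocs.Search API; the class name CharacterReplacementSketch and the method normalizeForIndex are hypothetical names chosen for illustration.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper class; illustrates the idea, not a GroupDocs API.
class CharacterReplacementSketch {

    // Returns the text lowercased and with combining diacritical marks removed.
    static String normalizeForIndex(String text) {
        // NFD splits an accented character into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, so removing them leaves the base letters.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Locale.ROOT keeps lowercasing independent of the machine's default locale.
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Caf\u00e9 \u00dcber" is "Café Über" written with Unicode escapes.
        System.out.println(normalizeForIndex("Caf\u00e9 \u00dcber")); // prints "cafe uber"
    }
}
```

Applying the same normalization to both indexed text and search queries is what lets a query such as "cafe" match "Café".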
There might be cases when the document is available only as a stream (without a copy on the local disk). To avoid the overhead of saving documents to disk, GroupDocs.Parser lets you extract data directly from streams.
The following example shows how to load the document from the stream:
// Create the stream
try (InputStream stream = new FileInputStream(Constants.SamplePdf)) {
    // Create an instance of Parser class with the stream
    try (Parser parser = new Parser(stream)) {
        // Extract a text into the reader
        try (TextReader reader = parser....
Extract attachments from Emails
To extract attachments from emails, use the getContainer method. This method returns a collection of ContainerItem objects.
An email attachment can contain the following metadata:

Name | Description
content-type | The MIME type of the attachment content

This metadata refers to the container element itself, not to a document.
Here are the steps to extract text from email attachments:
1. Instantiate a Parser object for the initial document;
2. Call the getContainer method and obtain a collection of ContainerItem objects;
3. Check that the collection isn’t null (container extraction is supported for the document);
4. Iterate through the collection and obtain a Parser object for each item to extract its text.