This article explains how to get a list of indexed documents from an index, and how to get the text of indexed documents in HTML or plain-text format....search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more...
This page describes how to use document filters for indexing and covers every filter type, with examples of how to create each one....pdf' ); const invertedFilter = groupdocs...extensions except of HTM, HTML, and PDF settings . setDocumentFilter...
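The idea behind an inverted filter can be sketched in plain JavaScript. This is only an illustration of the concept: `extensionOf`, `extensionFilter`, and `invert` are hypothetical helpers written for this example and are not the GroupDocs.Search API, whose actual calls are elided in the fragment above.

```javascript
// Hypothetical stand-ins for illustration; NOT the GroupDocs.Search API.

// Returns the lowercase extension of a file name, including the dot ('' if none).
function extensionOf(fileName) {
  const base = fileName.split(/[\\/]/).pop();
  const dot = base.lastIndexOf('.');
  return dot > 0 ? base.slice(dot).toLowerCase() : '';
}

// A filter is modeled as a predicate: true means "index this file".
function extensionFilter(...extensions) {
  const allowed = new Set(extensions.map((e) => e.toLowerCase()));
  return (fileName) => allowed.has(extensionOf(fileName));
}

// Logical NOT: accepts exactly the files the inner filter rejects.
function invert(filter) {
  return (fileName) => !filter(fileName);
}

// Accepts every extension except HTM, HTML, and PDF.
const invertedFilter = invert(extensionFilter('.htm', '.html', '.pdf'));

const files = ['report.pdf', 'notes.txt', 'index.HTML', 'budget.xlsx'];
const toIndex = files.filter((f) => invertedFilter(f));
console.log(toIndex); // [ 'notes.txt', 'budget.xlsx' ]
```

Modeling a filter as a predicate keeps inversion a one-line wrapper, and the same wrapper would work for filters based on other criteria such as file size or date.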
Working with metadata in ASF files
Reading ASF format-specific properties
The GroupDocs.Metadata API supports extracting format-specific information from ASF files.
The following are the steps to read native ASF metadata.
1. Load an ASF video
2. Get the root metadata package
3. Extract the native metadata package using AsfRootPackage.AsfPackage
4. Read the ASF metadata properties
AdvancedUsage.ManagingMetadataForSpecificFormats.Video.Asf.AsfReadNativeMetadataProperties
using (Metadata metadata = new Metadata(Constants.InputAsf))
{
    // The typed root package exposes the ASF-specific AsfPackage property
    var root = metadata.GetRootPackage<AsfRootPackage>();
    var package = root.AsfPackage;

    // Display basic properties
    Console....
...edit metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails...
GroupDocs.Parser Product Family on GroupDocs Blog | Document Automation Solutions for .NET & Java Developers
... Extract Images from PDF Documents using C#
Learn how...how to extract images from PDF files using C# within your .NET applications...
To extract text from Microsoft Office Word documents, use the getText and getText(int) methods. These methods extract text from the entire document or from a selected page. The TextOptions parameter is ignored for Microsoft Office Word documents.
Here are the steps to extract text from a Microsoft Office Word document:
1. Instantiate a Parser object for the initial document
2. Call the getText method and obtain a TextReader object
3. Read the text from the reader
...extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails...
Learn how to check which features are supported for a document using GroupDocs.Parser for .NET. Check text extraction, metadata, images, tables, and other feature support in C#....extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails...
This article demonstrates how to connect an external module (library) for optical character recognition (OCR) of printed text in images, whether the images are separate files or embedded in documents...search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more...
Using the GroupDocs.Metadata search engine, you can extract the metadata properties you need from files of different types. You don’t need to worry about the exact file format or the metadata standards it uses: the same code works the same way for all supported formats. The most commonly used metadata properties are marked with tags, which make it possible to search for them across all supported files and in various metadata packages. All tags defined in GroupDocs....edit metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails...
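To make the tag idea concrete, here is a minimal, self-contained sketch in plain JavaScript. It assumes a simplified data model in which packages nest and each property carries a list of tags. The package names, property names, tag strings, and the `findByTag` helper are all hypothetical and are not the GroupDocs.Metadata API.

```javascript
// Hypothetical data model for illustration; NOT the GroupDocs.Metadata API.
// A package holds properties and may nest further packages.
// Each property has a name, a value, and format-independent tags.
const pdfMetadata = {
  name: 'PdfRoot',
  properties: [],
  packages: [
    {
      name: 'DocumentInfo',
      properties: [
        { name: 'Author', value: 'Jane Roe', tags: ['creator'] },
        { name: 'CreationDate', value: '2020-01-15', tags: ['created'] },
      ],
      packages: [],
    },
  ],
};

const jpegMetadata = {
  name: 'JpegRoot',
  properties: [],
  packages: [
    {
      name: 'Exif',
      properties: [
        { name: 'Artist', value: 'John Doe', tags: ['creator'] },
        { name: 'DateTimeOriginal', value: '2021-06-01', tags: ['created'] },
      ],
      packages: [],
    },
  ],
};

// Recursively collects every property marked with the given tag,
// regardless of which package it lives in or what it is called there.
function findByTag(pkg, tag) {
  const own = pkg.properties.filter((p) => p.tags.includes(tag));
  const nested = pkg.packages.flatMap((child) => findByTag(child, tag));
  return [...own, ...nested];
}

// The same call works for both formats, even though the property names differ.
for (const root of [pdfMetadata, jpegMetadata]) {
  for (const p of findByTag(root, 'creator')) {
    console.log(`${root.name}: ${p.name} = ${p.value}`);
  }
}
```

In this sketch the caller searches by tag rather than by property name, so the PDF `Author` and the JPEG `Artist` are found by the same query.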