This page describes how to use document filters for indexing and covers all filter types, with examples of creating them....pdf'); const invertedFilter = groupdocs...extensions except HTM, HTML, and PDF settings.setDocumentFilter...
This article explains how to get a list of indexed documents from an index, and how to get the text of indexed documents in HTML or plain text format....search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more...
This page contains descriptions of all character types. Character types differ in how characters of these types are indexed....search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more...
Getting metadata and binary content of all fonts used in a document loaded into GroupDocs.Viewer for .NET... For example, the PDF, DOCX, XLSX, PPTX, and other similar document...from 4 main format families: PDF, WordProcessing, Spreadsheet...
OpenType is a format for scalable computer fonts. It was built on its predecessor TrueType, retaining TrueType’s basic structure and adding many intricate data structures for prescribing typographic behavior.
Note: for more information on the OpenType format, see https://en.wikipedia.org/wiki/OpenType
Reading OpenType metadata
The GroupDocs.Metadata API supports extracting format-specific information from OpenType font files.
The following are the steps to read the header of an OpenType file:
Load an OpenType font file; get the root metadata package; extract the native metadata package using the OpenTypeRootPackage....edit metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails...
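To make concrete what "the header of an OpenType file" contains, here is a minimal sketch that parses the first 12 bytes of a font file directly. It does not use the GroupDocs.Metadata API; it is a plain byte-level read based on the OpenType table directory layout (`sfntVersion`, `numTables`, `searchRange`, `entrySelector`, `rangeShift`). The function name `read_opentype_header` and the returned dictionary keys are hypothetical, chosen only for illustration, and the sample bytes are synthetic rather than taken from a real font.

```python
import struct


def read_opentype_header(data: bytes) -> dict:
    """Parse the 12-byte table directory header at the start of an OpenType font.

    Illustrative helper only (hypothetical name); not part of any GroupDocs library.
    """
    if len(data) < 12:
        raise ValueError("need at least 12 bytes for the OpenType header")

    # Big-endian layout: 4-byte version tag followed by four unsigned 16-bit fields.
    sfnt_version, num_tables, search_range, entry_selector, range_shift = struct.unpack(
        ">4sHHHH", data[:12]
    )

    # 0x00010000 marks TrueType outlines; the tag 'OTTO' marks CFF outlines.
    if sfnt_version == b"\x00\x01\x00\x00":
        outlines = "TrueType"
    elif sfnt_version == b"OTTO":
        outlines = "CFF"
    else:
        outlines = "unknown"

    return {
        "sfnt_version": sfnt_version,
        "outlines": outlines,
        "num_tables": num_tables,
        "search_range": search_range,
        "entry_selector": entry_selector,
        "range_shift": range_shift,
    }


# Synthetic header for a CFF-flavoured font with 10 tables (not a real font file).
sample = struct.pack(">4sHHHH", b"OTTO", 10, 128, 3, 32)
print(read_opentype_header(sample))
```

For a real font, the same function could be called on the first bytes read from the file in binary mode; a library such as GroupDocs.Metadata exposes the same kind of information through its own package objects instead.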
This article provides the syntax of all elements allowed in text search queries....search over your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and more...
To extract text from Microsoft Office Word documents, the getText and getText(int) methods are used. These methods extract text from the entire document or from a selected page. The TextOptions parameter is ignored for Microsoft Office Word documents.
Here are the steps to extract text from a Microsoft Office Word document:
Instantiate a Parser object for the initial document; call the getText method and obtain a TextReader object; read the text from the reader....extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, Emails...
Learn how to get a list of used fonts, specify or replace missing fonts, exclude fonts...the document file itself, like PDF. But others, like Office family...Spreadsheet, Presentation, and PDF, have specialized implementations...