OCR support is the ability to connect an external module (library) that recognizes printed text (optical character recognition, OCR) in images, whether they are separate files or embedded in documents.
To connect OCR, you need to implement the IOcrConnector interface in the client code.
The following example demonstrates how to implement the OCR connector using the com.aspose.ocr library for text recognition in images.
    String indexFolder = "c:\\MyIndex";
    String documentFolder = "c:\\MyDocuments";

    // Creating an index
    Index index = new Index(indexFolder);

    // Setting the OCR indexing options
    IndexingOptions options = new IndexingOptions();
    options....

DOC, DOCX, PPT, PPTX, XLS, XLSX and more with our free online...
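The connector idea described above can be sketched without the library itself. The following is a minimal, self-contained illustration of the plug-in pattern only: `OcrConnectorSketch`, `StubOcrConnector`, and `ImageTextCollector` are hypothetical names invented for this sketch, and the real `IOcrConnector` signature is not shown in this text and may differ. The stub returns a placeholder string where a real connector would call an OCR engine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the library's connector interface.
// The real IOcrConnector signature is not shown here and may differ.
interface OcrConnectorSketch {
    String recognize(byte[] imageBytes);
}

// Hypothetical connector. A real one would pass the image to an OCR engine
// (such as the com.aspose.ocr library) and return the recognized text.
final class StubOcrConnector implements OcrConnectorSketch {
    @Override
    public String recognize(byte[] imageBytes) {
        // Placeholder result so the sketch runs without an OCR engine.
        return "image of " + imageBytes.length + " bytes";
    }
}

// Hypothetical indexer showing where client code hands images to the connector.
final class ImageTextCollector {
    private final OcrConnectorSketch connector;
    private final List<String> indexedText = new ArrayList<>();

    ImageTextCollector(OcrConnectorSketch connector) {
        this.connector = connector;
    }

    // Each image is converted to text by the connector, then stored for indexing.
    void addImage(byte[] imageBytes) {
        indexedText.add(connector.recognize(imageBytes));
    }

    List<String> indexedText() {
        return indexedText;
    }
}

final class OcrSketchDemo {
    public static void main(String[] args) {
        ImageTextCollector collector = new ImageTextCollector(new StubOcrConnector());
        collector.addImage(new byte[] {1, 2, 3});
        System.out.println(collector.indexedText());
    }
}
```

The point of the design is that the indexing side depends only on the interface, so the OCR engine can be swapped without touching the indexing code.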
Getting metadata and binary content of all fonts used in the document loaded into GroupDocs.Viewer for Java... Spreadsheet format family (like XLS, XLSX, etc.) cannot hold embedded fonts... For example, the PDF, DOCX, XLSX, PPTX document formats support...
Remove selected metadata or clean all metadata properties using C# from DOCX, XLSX, PPTX, PDF, JPG/JPEG, PNG, WebP images, emails, eBooks, Visio, ZIP files... the documents like DOCX, PDF, XLSX, etc. using GroupDocs.Metadata...
GroupDocs’ online document viewer add-on provides a convenient interface for quickly accessing, viewing and managing documents directly in the Firefox browser, without going to the GroupDocs website. GroupDocs Online Document Viewer is a web-based application that lets you open Microsoft Office files and images directly in a web browser, whether or not you have the software that was used to create them. This universal document viewer reduces the need to install separate software for each individual file format... (PPT, PPTX), spreadsheets (XLS, XLSX), portable files (PDF), and...
The life cycle of an index begins at the moment of creating an instance of the Index class and first saving the index files to disk. The index life cycle ends when a folder containing index files is deleted. Below is a diagram of the recommended sequence of index life cycle states.
Please note that the index life cycle does not consider the events of loading and unloading the index from RAM.
The easiest way to remove metadata properties from a file is to use corresponding tags that allow you to locate the desired properties across all metadata packages.
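To illustrate the idea of locating properties by tag across all metadata packages, here is a minimal sketch using a toy in-memory model. It does not use the GroupDocs.Metadata API: `TaggedProperty`, `MetadataPackageSketch`, `TagRemovalSketch`, and the package and tag names are all hypothetical, chosen only to show the approach.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of a metadata property that carries descriptive tags.
final class TaggedProperty {
    final String name;
    final Set<String> tags;

    TaggedProperty(String name, Set<String> tags) {
        this.name = name;
        this.tags = tags;
    }
}

// Hypothetical model of one metadata package holding several properties.
final class MetadataPackageSketch {
    final String name;
    final List<TaggedProperty> properties = new ArrayList<>();

    MetadataPackageSketch(String name) {
        this.name = name;
    }
}

final class TagRemovalSketch {
    // Removes every property carrying the given tag, in every package,
    // and returns how many properties were removed in total.
    static int removeByTag(List<MetadataPackageSketch> packages, String tag) {
        int removed = 0;
        for (MetadataPackageSketch pkg : packages) {
            int before = pkg.properties.size();
            pkg.properties.removeIf(p -> p.tags.contains(tag));
            removed += before - pkg.properties.size();
        }
        return removed;
    }

    public static void main(String[] args) {
        MetadataPackageSketch doc = new MetadataPackageSketch("document");
        doc.properties.add(new TaggedProperty("Author", Set.of("person", "creator")));
        doc.properties.add(new TaggedProperty("Title", Set.of("content")));

        MetadataPackageSketch image = new MetadataPackageSketch("image");
        image.properties.add(new TaggedProperty("Artist", Set.of("person", "creator")));

        int removed = removeByTag(List.of(doc, image), "creator");
        System.out.println("Removed " + removed + " properties");
    }
}
```

Because the tag is independent of the property name, one call reaches both "Author" and "Artist" even though they live in different packages under different names.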
This article demonstrates how to create a logger and assign it to an index, and how to implement a custom logger using the search API.
This article shows how to extract the table of contents from Microsoft Word (DOC, DOCX, etc.), PDF documents and eBooks (CHM, EPUB).