OCR support is the ability to connect an external module (library) that recognizes printed text (optical character recognition, OCR) in images, whether the images are separate files or embedded in documents.
To connect OCR, you need to implement the IOcrConnector interface in the client code.
The following example demonstrates how to implement the OCR connector using the com.aspose.ocr library for text recognition in images.
String indexFolder = "c:\\MyIndex";
String documentFolder = "c:\\MyDocuments";

// Creating an index
Index index = new Index(indexFolder);

// Setting the OCR indexing options
IndexingOptions options = new IndexingOptions();
options...
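Because the example above is cut off before the connector is attached, here is a minimal sketch of the connector pattern itself. It does not use the GroupDocs or Aspose APIs: `TextRecognizer`, `FakeOcrEngine` and `TinyIndexer` are hypothetical stand-ins invented for illustration, and the real `IOcrConnector` interface may have a different method signature. The sketch only shows the idea that the indexer calls back into client code whenever it meets an image.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class OcrConnectorSketch {

    // Hypothetical stand-in for a connector interface (NOT the real IOcrConnector signature).
    interface TextRecognizer {
        String recognize(byte[] imageBytes);
    }

    // Hypothetical engine; a real connector would delegate to an OCR library here.
    static class FakeOcrEngine implements TextRecognizer {
        @Override
        public String recognize(byte[] imageBytes) {
            // Placeholder logic: treat the image bytes as if they were the recognized text.
            return new String(imageBytes, StandardCharsets.UTF_8);
        }
    }

    // Hypothetical indexer that asks the connector for text each time an image is added.
    static class TinyIndexer {
        private final TextRecognizer recognizer;
        private final List<String> indexedText = new ArrayList<>();

        TinyIndexer(TextRecognizer recognizer) {
            this.recognizer = recognizer;
        }

        void addImage(byte[] imageBytes) {
            String text = recognizer.recognize(imageBytes);
            // Skip images for which the connector produced no text.
            if (text != null && !text.isEmpty()) {
                indexedText.add(text);
            }
        }

        boolean contains(String word) {
            return indexedText.stream().anyMatch(t -> t.contains(word));
        }
    }

    public static void main(String[] args) {
        TinyIndexer indexer = new TinyIndexer(new FakeOcrEngine());
        indexer.addImage("invoice total".getBytes(StandardCharsets.UTF_8));
        System.out.println(indexer.contains("invoice")); // prints true
    }
}
```

Keeping recognition behind a small interface means the OCR library can be swapped without touching the indexing code.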
The GroupDocs online document viewer add-on provides a convenient interface for quickly accessing, viewing and managing documents directly in the Firefox browser, without going to the GroupDocs website. GroupDocs Online Document Viewer is a web-based application that opens Microsoft Office files and images directly in a web browser, whether or not you have the software that was used to create them. This universal document viewer reduces the need to install separate software for each individual file format.
The life cycle of an index begins at the moment of creating an instance of the Index class and first saving the index files to disk. The index life cycle ends when a folder containing index files is deleted. Below is a diagram of the recommended sequence of index life cycle states.
Please note that the index life cycle does not consider the events of loading and unloading the index from RAM.
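The final step of the life cycle, deleting the folder that contains the index files, can be sketched with standard `java.nio.file` calls. This is only an illustration under assumptions: the class and method names are hypothetical, a temporary folder stands in for a real index folder, and the sketch assumes no index instance still has the files open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class IndexFolderCleanup {

    // Creates a temporary folder with one file in it, standing in for an index folder.
    static Path createSampleFolder() {
        try {
            Path folder = Files.createTempDirectory("MyIndex");
            Files.write(folder.resolve("segment.bin"), new byte[] {1, 2, 3});
            return folder;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes the folder and everything inside it; does nothing if it is already gone.
    static void deleteFolder(Path folder) {
        if (!Files.exists(folder)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(folder)) {
            // Reverse order visits children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path folder = createSampleFolder();
        deleteFolder(folder);
        System.out.println(Files.exists(folder)); // prints false
    }
}
```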
This article shows how to add metadata properties, which is the most sophisticated feature of the GroupDocs.Metadata Python via .NET search engine.
To extract text from emails, use the getText method. This method extracts text from the entire document. Pagination and raw mode are not supported for emails.
Here are the steps to extract text from an email:
1. Instantiate a Parser object for the initial email;
2. Call the getText method and obtain a TextReader object;
3. Read the text from the reader.
Warning: the getText method returns null if text extraction isn't supported for the document.
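The steps and the null warning above can be sketched as follows. This is not the GroupDocs API: `TextSource` is a hypothetical stand-in for the parser, and `java.io.Reader` stands in for TextReader, so that the sketch runs without the library. It only illustrates checking for null before reading.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class EmailTextSketch {

    // Hypothetical stand-in for a parser (NOT the GroupDocs Parser class).
    interface TextSource {
        // Returns a reader over the whole document, or null if extraction is unsupported.
        Reader getText();
    }

    // Reads the entire text, or returns the fallback when getText() yields null.
    static String extractText(TextSource source, String fallback) {
        // try-with-resources skips close() when the resource is null.
        try (Reader reader = source.getText()) {
            if (reader == null) {
                return fallback;
            }
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[4096];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TextSource email = () -> new StringReader("Hello from the message body");
        TextSource unsupported = () -> null;
        System.out.println(extractText(email, "n/a"));
        System.out.println(extractText(unsupported, "Text extraction isn't supported"));
    }
}
```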
This article explains how to extract container items and iterate through them.