OCR support is the ability to connect an external module (library) that recognizes printed text (optical character recognition, OCR) in images, whether they are separate files or embedded in documents.
To connect OCR, you need to implement the IOcrConnector interface in the client code.
The following example demonstrates how to implement the OCR connector using the com.aspose.ocr library for text recognition in images.
String indexFolder = "c:\\MyIndex";
String documentFolder = "c:\\MyDocuments";

// Creating an index
Index index = new Index(indexFolder);

// Setting the OCR indexing options
IndexingOptions options = new IndexingOptions();
options....
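The connector idea described above can be illustrated with a small standalone sketch. It does not use GroupDocs.Search or com.aspose.ocr: the `TextRecognizer` interface and the `OcrPluginSketch` class are hypothetical stand-ins, and the real `IOcrConnector` signature is not shown in the text and may differ. The sketch only shows the general shape, in which the indexing side calls a client-supplied recognizer for each image and treats a missing recognizer as "no text".

```java
// Hypothetical stand-in for an OCR connector contract.
// Assumption: the real IOcrConnector interface may have a different signature.
interface TextRecognizer {
    // Returns the text recognized in the given image bytes.
    String recognize(byte[] imageBytes);
}

// Hypothetical indexing-side helper that delegates image text extraction
// to whatever recognizer the client code plugged in.
class OcrPluginSketch {

    static String extractImageText(byte[] imageBytes, TextRecognizer recognizer) {
        // Without a recognizer or image data there is nothing to index.
        if (recognizer == null || imageBytes == null || imageBytes.length == 0) {
            return "";
        }
        String text = recognizer.recognize(imageBytes);
        // Normalize a null result to empty text and drop surrounding whitespace.
        return text == null ? "" : text.trim();
    }

    public static void main(String[] args) {
        // A fake recognizer used only for demonstration; a real one would call an OCR library.
        TextRecognizer fake = bytes -> "  recognized text  ";
        System.out.println(extractImageText(new byte[] {1, 2, 3}, fake));
    }
}
```

Keeping the recognizer behind a one-method interface means the indexing code does not depend on any particular OCR library.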
Stop words are frequently used words that do not carry a semantic meaning and can be removed from an index to reduce its size.
You can enable or disable the use of stop words by calling the setUseStopWords method of the IndexSettings class. The default value is true, meaning that stop words are filtered during indexing and not added to the index.
A list of stop words to use during indexing can be specified in the stop word dictionary.
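The effect of stop word filtering can be sketched in plain Java as follows. This is a minimal standalone illustration, not the GroupDocs.Search implementation: the `StopWordFilterSketch` class, its `filter` method, and the sample stop word set are all hypothetical, and the `useStopWords` flag only mirrors the behavior described for `setUseStopWords`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical standalone sketch; not part of the GroupDocs.Search API.
class StopWordFilterSketch {

    // Returns the terms that would be added to the index.
    // When useStopWords is true, terms found in the stop word set are skipped.
    static List<String> filter(String text, Set<String> stopWords, boolean useStopWords) {
        List<String> kept = new ArrayList<>();
        // Split on anything that is not a letter or a decimal digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty()) {
                continue; // leading separators produce an empty first token
            }
            if (useStopWords && stopWords.contains(token)) {
                continue; // stop word: filtered out, never reaches the index
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopWords = Set.of("the", "a", "of", "and"); // sample dictionary
        System.out.println(filter("The size of the index", stopWords, true));
        System.out.println(filter("The size of the index", stopWords, false));
    }
}
```

With filtering enabled, only the terms that carry meaning are kept, which is why the index becomes smaller.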
Document attributes are a special feature for marking indexed documents with text labels without the need for re-indexing.
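One way to picture why labels do not require re-indexing is to store them in a separate structure keyed by document path. The sketch below assumes exactly that; the `DocumentAttributesSketch` class and its method names are hypothetical and are not taken from the GroupDocs.Search API, whose actual storage model is not described here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical standalone sketch; names are not taken from GroupDocs.Search.
class DocumentAttributesSketch {

    // Labels live beside the indexed content, keyed by document path,
    // so changing them never requires parsing the document again.
    private final Map<String, Set<String>> labelsByPath = new HashMap<>();

    void addAttributes(String documentPath, String... labels) {
        Set<String> set = labelsByPath.computeIfAbsent(documentPath, p -> new TreeSet<>());
        for (String label : labels) {
            set.add(label.toLowerCase(Locale.ROOT)); // labels are compared case-insensitively
        }
    }

    void removeAttribute(String documentPath, String label) {
        Set<String> set = labelsByPath.get(documentPath);
        if (set != null) {
            set.remove(label.toLowerCase(Locale.ROOT));
        }
    }

    // Read-only view of the labels attached to a document (empty if none).
    Set<String> getAttributes(String documentPath) {
        return Collections.unmodifiableSet(
                labelsByPath.getOrDefault(documentPath, Collections.emptySet()));
    }

    boolean hasAttribute(String documentPath, String label) {
        return getAttributes(documentPath).contains(label.toLowerCase(Locale.ROOT));
    }
}
```

A search could then be narrowed to labeled documents by checking `hasAttribute` for each candidate path, without touching the indexed text.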
It supports DOCX, DOCM, DOC, DOT, DOTM, XLS, XLSX, PDF, PPT, JPG, PNG, HTML, EML and many more.
This article gives the complete specification of the search query DSL used in text queries with the Java Search API.
GroupDocs.Search allows indexing documents from various sources.
Central documentation index for GroupDocs on-premise document processing SDKs. Explore developer documentation and guides for all product families.