GroupDocs.Redaction supports Optical Character Recognition (OCR) for both types of image documents:
image files, such as printed document scans (PNG, JPG, etc.), and embedded images within office documents (PDF, DOCX, etc.). To use OCR, you have to implement the IOcrConnector interface and pass the instance to the RedactorSettings constructor.
For more details, see the OCR Usage Basics article.
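The connector pattern described above can be sketched roughly as follows. This is a minimal illustration only, not the GroupDocs API: `OcrConnectorSketch`, `RecognizedFragment`, `RedactorSettingsSketch`, and `CannedOcrConnector` are hypothetical stand-ins defined here so the example runs without the library. The real types are the IOcrConnector interface and the RedactorSettings class, whose exact signatures should be taken from the library documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for one recognized text fragment and its bounding box. */
final class RecognizedFragment {
    final String text;
    final int x, y, width, height;

    RecognizedFragment(String text, int x, int y, int width, int height) {
        this.text = text;
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }
}

/** Hypothetical stand-in for the connector contract (assumption, not the GroupDocs interface). */
interface OcrConnectorSketch {
    List<RecognizedFragment> recognize(InputStream imageStream);
}

/** Fake OCR engine: ignores the image bytes and returns canned fragments. */
final class CannedOcrConnector implements OcrConnectorSketch {
    private final List<RecognizedFragment> fragments;

    CannedOcrConnector(List<RecognizedFragment> fragments) {
        this.fragments = new ArrayList<>(fragments);
    }

    @Override
    public List<RecognizedFragment> recognize(InputStream imageStream) {
        return new ArrayList<>(fragments);
    }
}

/** Hypothetical stand-in for a settings object that receives the connector in its constructor. */
final class RedactorSettingsSketch {
    private final OcrConnectorSketch ocrConnector;

    RedactorSettingsSketch(OcrConnectorSketch ocrConnector) {
        if (ocrConnector == null) {
            throw new IllegalArgumentException("ocrConnector must not be null");
        }
        this.ocrConnector = ocrConnector;
    }

    OcrConnectorSketch getOcrConnector() {
        return ocrConnector;
    }
}

final class OcrConnectorDemo {
    /** Returns the recognized fragments whose text contains the phrase, ignoring case. */
    static List<RecognizedFragment> findMatches(
            RedactorSettingsSketch settings, InputStream image, String phrase) {
        List<RecognizedFragment> matches = new ArrayList<>();
        String needle = phrase.toLowerCase(Locale.ROOT);
        for (RecognizedFragment f : settings.getOcrConnector().recognize(image)) {
            if (f.text.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(f);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        OcrConnectorSketch connector = new CannedOcrConnector(Arrays.asList(
                new RecognizedFragment("Invoice", 10, 10, 80, 20),
                new RecognizedFragment("John Doe", 10, 40, 90, 20)));
        RedactorSettingsSketch settings = new RedactorSettingsSketch(connector);
        InputStream image = new ByteArrayInputStream(new byte[0]); // placeholder image bytes

        for (RecognizedFragment f : findMatches(settings, image, "john")) {
            System.out.println(f.text + " at " + f.x + "," + f.y
                    + " size " + f.width + "x" + f.height);
        }
    }
}
```

Keeping the OCR engine behind a small interface means the engine can be swapped (a cloud service, a local library, or a canned fake for tests) without touching the code that consumes the recognized regions.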
OCR usage limitations
OCR with GroupDocs.Redaction for Java v21.6 has the following limitations:
Textual replacements are not supported, so you have to use color box replacements to redact text in images.... Spreadsheets, HTML and Markdown document...
It supports DOCX, DOCM, DOC, DOT, DOTM, XLS, XLSX, PDF, PPT, JPG, PNG, HTML, EML and many more....Worksheet * xlsx OpenDocument Spreadsheet Microsoft PowerPoint 97-2003...Portable Document Format (PDF) * html HyperText Markup Language (HTM)...
We are pleased to announce that the first version of GroupDocs.Parser for Java has been released. GroupDocs.Parser for Java allows Java developers to extract raw and formatted text from popular document formats. The API also supports working with containers such as ZIP archives and email containers. You can also access the metadata attached to documents using a few lines of code. Please read on to learn more about the features and file formats supported by the API.... Plain text, Markdown, and HTML formatters are present Extract...pps/.pptm/.ppsm/.ppsx/.odp) Spreadsheet Document Formats (.xls/.xlsx/...
GroupDocs.Viewer for Java allows you to render documents in various formats as HTML, PDF, JPEG, and PNG files. You do not need third-party software to view files within your Java application.... Load text documents, spreadsheets, presentations, PDF files...images and render/display them in HTML, PDF, PNG, and JPEG formats...