Supported File Formats

The following table lists the input and output file formats supported by GroupDocs.Assembly for Python via .NET.
Format  Description  Load  Save  Populate  Remarks
DOC     Microsoft Word 97 - 2007 Document.
DOT     Microsoft Word 97 - 2007 Template.
DOCX    Office Open XML WordprocessingML Document (macro-free).
DOCM    Office Open XML WordprocessingML Macro-Enabled Document.
DOTX    Office Open XML WordprocessingML Template (macro-free).
DOTM    Office Open XML WordprocessingML Macro-Enabled Template.
RTF     RTF format....
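As a small illustration of how the format list above might be used, the following Python sketch maps a file extension to the description given in the table. It does not call GroupDocs.Assembly; the names `WORD_FORMATS` and `describe_format` are hypothetical helpers introduced only for this example, and the Load, Save, and Populate columns are not modeled.

```python
from pathlib import PurePath

# Formats and descriptions copied from the table above. This is a plain
# lookup table for illustration, not part of any GroupDocs API.
# The RTF entry is truncated in the source, so only its visible text is kept.
WORD_FORMATS = {
    "DOC": "Microsoft Word 97 - 2007 Document",
    "DOT": "Microsoft Word 97 - 2007 Template",
    "DOCX": "Office Open XML WordprocessingML Document (macro-free)",
    "DOCM": "Office Open XML WordprocessingML Macro-Enabled Document",
    "DOTX": "Office Open XML WordprocessingML Template (macro-free)",
    "DOTM": "Office Open XML WordprocessingML Macro-Enabled Template",
    "RTF": "RTF format",
}


def describe_format(filename):
    """Return the table description for a file name, or None if not listed."""
    # Take the last extension, drop the leading dot, and compare in upper case.
    extension = PurePath(filename).suffix.lstrip(".").upper()
    return WORD_FORMATS.get(extension)
```

A check like this can run before a file is handed to the library, so that unlisted extensions are rejected early with a clear message.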
This article explains how to create an instance of the EditableDocument class from HTML files on disk or from HTML markup with resources using the GroupDocs.Editor for Java API.
Learn how to extract text from Word documents (.doc, .docx) using GroupDocs.Parser for .NET. Extract text from entire documents or specific pages with error handling in C#.
Follow this guide to learn how to edit text documents, spreadsheets, and presentations using the features of the GroupDocs.Editor for .NET API.
This tutorial provides all the necessary procedures to convert Outlook email to Word in C# and a sample working application for C# Email to Word conversion. ...snippet shows how to provide the license, input files, and store the...
To extract text from EPUB e-books, use the getText and getText(pageIndex) methods. These methods extract text from the entire document or from a selected page. Raw mode is not supported for EPUB.
Let's convert RTF to MHtml using Node.js seamlessly. Follow a step-by-step guide to export RTF to MHtml in Node.js with accurate formatting and high quality. MHTML, or MIME HTML, allows for the packaging of HTML content with...
This article shows how to save output to a stream when rendering a document.
GroupDocs.Redaction supports both types of image documents for Optical Character Recognition (OCR):
- image files, such as printed document scans (PNG, JPG, etc.)
- embedded images within office documents (PDF, DOCX, etc.)

You have to implement the IOcrConnector interface and pass the instance to the RedactorSettings constructor.
For more details, see the OCR Usage Basics article.
OCR usage limitations

OCR with GroupDocs.Redaction for Java v21.6 has the following limitations:
- textual replacements are not supported, so you have to use color box replacements to redact text in images.
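To make the idea of a color box replacement concrete, here is a minimal Python sketch of the general technique: find the recognized text fragments that match a search term, then paint a solid rectangle over their bounding boxes. It does not use GroupDocs.Redaction or the IOcrConnector interface. The `TextFragment` type, the `boxes_to_cover` and `fill_boxes` functions, and the list-of-lists pixel grid are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextFragment:
    """Hypothetical OCR result: recognized text and its bounding box in pixels."""
    text: str
    x: int
    y: int
    width: int
    height: int


def boxes_to_cover(fragments, needle):
    """Return (x, y, width, height) boxes for fragments that contain `needle`."""
    needle = needle.lower()
    return [
        (f.x, f.y, f.width, f.height)
        for f in fragments
        if needle in f.text.lower()
    ]


def fill_boxes(pixels, boxes, color):
    """Paint each box onto a row-major pixel grid (list of lists) in place."""
    for x, y, width, height in boxes:
        # Clamp the box to the grid so out-of-range boxes do not raise errors.
        for row in range(max(y, 0), min(y + height, len(pixels))):
            for col in range(max(x, 0), min(x + width, len(pixels[row]))):
                pixels[row][col] = color
    return pixels
```

Because the matched region is overwritten rather than rewritten, nothing of the original text remains in the image, which is why a box is used where a textual replacement is not available.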