Java document parser API to extract text, images, metadata, and encoding from databases, Word, Excel, presentations, PDF, email, EPUB, and ZIP files. ... Word Processing: DOC, DOCX, DOCM ... Markup: HTML, XHTML, MHTML, MD, XML. Portable Formats: PDF. Email ...