Java document parser API to extract text, images, metadata and encoding from databases, Word, Excel, presentations, PDF, email, EPUB and ZIP files. ... Markup: HTML, XHTML, MHTML, MD, XML; Portable Formats: PDF; Email ... MHTML, XLTX, POTX, XLA, POTM, OTP, RAR, XML, DOTX, TAR, PPTM, ZIP, FB2, PPT. Extract ...