Java document parser API to extract text, images, metadata, and encoding from databases, Word, Excel, presentations, PDF, email, EPUB, and ZIP files. Formats include XLTM, HTML, CHM, PPSM, PPTX, PDF, BZ2, EML, XLSX, OST, XHTML, MHTML, XLTX, FB2, RTF, CSV, EPUB, TIF, JPEG, DOT, BMP...