Java document parser API to extract text, images, metadata, and encoding from databases, Word, Excel, presentations, PDF, email, EPUB, and ZIP files.
... XLA, XLAM
Presentations: PPT, PPTX, PPTM, PPS, PPSX, PPSM, POT ...
... GIF, ONE, JPG, XLTM, HTML, CHM, PPSM, PPTX, PDF, BZ2, EML, XLSX, OST, XHTML, MHTML ...