Java document parser API to extract text, images, metadata, and encoding from databases, Word, Excel, presentations, PDF, email, EPUB, and ZIP files ... document file formats: Text: DOC, DOCX, DOT, DOTM, DOTX, DOCM ... UTF-16 BE, UTF-8, and ANSI Text: DOC, DOCX, DOT, DOTX, DOTM, OTT, ...