Java-based library that extracts text from Microsoft Word for Windows binary documents, including Word 1.0/2.0/4.0/6.0/95/97/2000/XP/2003.
Extracts text from fast-saved files as well.
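
A minimal usage sketch, not taken from the project itself. It assumes the extractor exposes a method that takes an `InputStream` and returns the text as a `String`, as the legacy project's `org.textmining.text.extraction.WordExtractor#extractText(InputStream)` did; check the class and package names in this version before relying on them. `WordTextExample` and `TextExtractor` are hypothetical names introduced only for illustration.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class WordTextExample {

    /**
     * Hypothetical adapter (not part of the library): anything that turns
     * a Word document stream into plain text.
     */
    @FunctionalInterface
    public interface TextExtractor {
        String extractText(InputStream in) throws Exception;
    }

    /**
     * Opens the document, hands the stream to the extractor, and closes
     * the stream afterwards, even if extraction fails.
     */
    public static String extract(Path doc, TextExtractor extractor) throws Exception {
        try (InputStream in = Files.newInputStream(doc)) {
            return extractor.extractText(in);
        }
    }
}
```

With the library on the classpath, and assuming the legacy class name still applies, a call might look like `WordTextExample.extract(Paths.get("report.doc"), new org.textmining.text.extraction.WordExtractor()::extractText)`.
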
Initially imported from: https://code.google.com/archive/p/text-mining/source/default/source
This version has the following improvements over the legacy project:
- compatible with Apache POI (version 3.17)
- mavenized project
- requires Java 8
- uses generics
This version is provided AS IS and is NOT actively maintained by Jalios.