Java-based library that extracts text from Microsoft Word for Windows binary documents, including Word 1.0/2.0/4.0/6.0/95/97/2000/XP/2003. It also extracts text from fast-saved files.

jalios/text-mining

text-mining

Java-based library that extracts text from Microsoft Word for Windows binary documents, including Word 1.0/2.0/4.0/6.0/95/97/2000/XP/2003.

It also extracts text from fast-saved files.

Initially imported from: https://code.google.com/archive/p/text-mining/source/default/source

This version has the following improvements over the legacy project:

  • compatible with Apache POI (version 3.17)
  • mavenized project
  • requires Java 8
  • use of generics
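
As a rough illustration of what the build-related points above imply, a pom.xml could contain settings along these lines. This is only a sketch: the choice of the core `poi` artifact and the use of the `maven.compiler.*` properties are assumptions, and the project's actual pom.xml may declare different or additional artifacts and settings.

```xml
<!-- Hypothetical pom.xml fragment; not copied from this repository. -->
<properties>
  <!-- Compile for Java 8 (source level and bytecode target). -->
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
</properties>

<dependencies>
  <!-- Apache POI 3.17, the version this README names as compatible. -->
  <!-- Assumption: only the core "poi" artifact is shown here. -->
  <dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.17</version>
  </dependency>
</dependencies>
```

Pinning the POI version explicitly matters because the README states compatibility with 3.17 only; other POI releases are not mentioned and may not work.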

This version is provided AS IS and is NOT actively maintained by Jalios.
