Skip to content

Creación de un archivo para clusterización en Carrot Workbench a partir de archivos en un directorio

License

Notifications You must be signed in to change notification settings

cayorodriguez/temucluster

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TEMUcluster

Descripcion

Creación de un archivo para clusterización en Carrot Workbench a partir de archivos en un directorio

The output is created as an XML file that can be loaded and visualized in the Carrot Workbench platform (https://doc.carrot2.org/):

1. First it reads and opens text files from specified directory.

2. It convert eachdocument into an entry in an xml document formatted for Carrot, where the "snippet" is the text of the file, and its title and url are the filenames

3. The xml should be opened in Carrot WorkBench with the XML source option, and selecting a specific algorithm and the "read all documents from input" options. Then press "Process"

4. For each algorithm selected, it will open a tab, so be careful with the memory usage.

5. You can also select a keyword in the query field to create a clusterization with ONLY docs that contain that keyword

Resultado sobre un directorio de casos clínicos de oncología:

Image of Clusterization

About

Creación de un archivo para clusterización en Carrot Workbench a partir de archivos en un directorio

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages