dabeat edited this page Jan 31, 2013 · 16 revisions

DeSR (DS)

UIMA wrapper for the DeSR dependency parser. This analysis engine creates dependency annotations.

The annotator descriptor lets you configure the five attributes (token, lemma, pos, cpos, feats) used by the DeSR parser, along with a flag for each attribute indicating whether it is used in the process.

Descriptor: DeSR.xml

C++ class: DeSR.cpp

Typesystem:

Input:

org.barcelonamedia.uima.ts.Sentence
org.barcelonamedia.uima.ts.Token 

Output:

Dependencies:

DeSR version: SVN revision 306. Follow the compilation steps in the README. Among other files, the build generates DeSR_HOME/src/_desr.so; rename this file to lib_desr.so.
Annotator shared library (DeSR.so), generated by the annotator makefile.

The annotator requires both shared libraries, lib_desr.so and DeSR.so, in order to run.
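The rename step can be scripted. This is a minimal sketch, assuming a POSIX shell; `rename_desr_lib` is a hypothetical helper name, not part of DeSR or UIMA, and it only wraps `mv` with a check that the build output exists.

```shell
# Hypothetical helper: renames _desr.so to lib_desr.so inside the given directory.
# $1 = directory containing _desr.so (DESR_HOME/src according to the text above).
rename_desr_lib() {
    src_dir="$1"
    if [ ! -f "$src_dir/_desr.so" ]; then
        echo "error: $src_dir/_desr.so not found; build DeSR first" >&2
        return 1
    fi
    mv "$src_dir/_desr.so" "$src_dir/lib_desr.so"
}
```

With DESR_HOME set, it would be called as `rename_desr_lib "$DESR_HOME/src"`.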

PEARs using annotator:

Technical description:

The UIMA DeSR analysis engine is a UIMA C++ annotator, developed with the C++ SDK provided by UIMA. Version 2.3.0 of the SDK was used to develop the DeSR annotator, although the latest uimacpp version (2.4.0) has also been successfully tested for compiling and executing. The DeSR annotator source code has been compiled with gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9).

The UIMA C++ SDK is used to develop C++ annotators, each built as a shared library that is called from the pipeline. The annotator descriptor must be set with the C++ implementation option. The UIMA pipeline looks for a shared library with the same name as the annotator in the dynamic library path specified in the $LD_LIBRARY_PATH environment variable.
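Whether the library would be found can be checked by hand before running a pipeline. The following is a minimal diagnostic sketch, assuming a POSIX shell on Linux. `find_annotator_lib` is a hypothetical helper, not part of UIMA; it does not reproduce the loader's behaviour exactly, it only walks $LD_LIBRARY_PATH looking for a file named after the annotator.

```shell
# Hypothetical helper: prints the first <name>.so found in LD_LIBRARY_PATH.
# $1 = annotator name, e.g. DeSR. Returns 1 if no directory contains the file.
find_annotator_lib() {
    name="$1"
    old_ifs="$IFS"
    IFS=:                       # split LD_LIBRARY_PATH on colons
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -f "$dir/$name.so" ]; then
            IFS="$old_ifs"
            echo "$dir/$name.so"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}
```

For this annotator, `find_annotator_lib DeSR` would print the path of DeSR.so if one of the listed directories contains it, and return a non-zero status otherwise.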

The DeSR analysis engine was developed on top of the UIMA C++ SDK. A Java + SWIG option was also evaluated, but the UIMA C++ SDK was eventually chosen.

Some environmental variables are needed in order to compile and execute UIMA C++ SDK examples:

Environment Variables

The following environmental variables are needed for UIMA C++ to function properly.

* UIMACPP_HOME should point to the uimacpp directory of your unpacked Apache UIMA C++
  distribution. UIMACPP_HOME is used when compiling & linking UIMA C++ components.
* Append $UIMACPP_HOME/bin to your PATH to pick up the runAECpp test driver
  and shared libraries (Windows)
* Append $UIMACPP_HOME/lib to your LD_LIBRARY_PATH (Linux) or DYLD_LIBRARY_PATH (MacOSX)
  so that the necessary shared libraries can be found.

Also note that UIMA C++ annotators are built as shared libraries, so they must be in a directory in the LD_LIBRARY_PATH, DYLD_LIBRARY_PATH or PATH (as appropriate to your platform) as well. An example of this is given in the next section.

For better runtime integration between Java and C++, the Apache UIMA Java SDK command line utilities and Eclipse run configurations automatically add $UIMA_HOME/uimacpp/lib to LD_LIBRARY_PATH and DYLD_LIBRARY_PATH, and add $UIMA_HOME/uimacpp/bin to PATH.
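When the variables are set by hand, the setup on Linux might look like the sketch below. The install path /opt/apache-uima/uimacpp is an assumption used only for illustration; replace it with the directory where the distribution was actually unpacked.

```shell
# Assumed install location (placeholder); keeps an existing value if already set.
export UIMACPP_HOME="${UIMACPP_HOME:-/opt/apache-uima/uimacpp}"

# Make the runAECpp test driver available on the command line.
export PATH="$PATH:$UIMACPP_HOME/bin"

# Linux: let the dynamic loader find the UIMA C++ shared libraries.
# The ${VAR:+...} form avoids a leading colon when LD_LIBRARY_PATH is empty.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$UIMACPP_HOME/lib"
```

On MacOSX the last line would set DYLD_LIBRARY_PATH instead, as described in the list above.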

To run the examples successfully, the following C++ line must also be added to their source code:

#include

before the line:

#include "uima/api.hpp"

This change was made in the DaveDetector.cpp example, the C++ annotator example from which DeSR was developed.

DeSR requires Python and Python-devel 2.6. It might be necessary to install doxygen in /usr/local/bin (Ubuntu installs doxygen in /usr/bin).

Warning: Unlike the UIMA Java SDK, the UIMA C++ SDK doesn't require an implementation of the annotator Type System. It automatically generates the necessary classes from the Type System specification in the descriptor and Type System XML files.

Creating the shared library

Run the makefile located in UIMA_SVN/trunk/code/DeSR-cpp/:

make -f DeSR.mak

Explanatory note: the environment variables required by the UIMA C++ SDK must be set in order to run this command.
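A small pre-build check can catch a missing variable before make fails with a less obvious error. This is a minimal sketch, assuming a POSIX shell; `require_env` is a hypothetical helper name, and the variables checked in the usage comment are UIMACPP_HOME plus DESR_HOME, which the annotator build also requires.

```shell
# Hypothetical helper: succeeds only if every named variable is set and non-empty.
# Prints one error line per missing variable.
require_env() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"    # indirect lookup of the variable called $name
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: run the build only when both variables are present.
# require_env UIMACPP_HOME DESR_HOME && make -f DeSR.mak
```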

Explanatory note: the $DESR_HOME environment variable is required to compile the DeSR annotator. The DeSR source code and the shared libraries generated by the compilation process are located in varovani /NAS_Backup/proyectos/uima/DeSR_parser/r306. $DESR_HOME should point to this path.

Using DeSR annotator from a pipeline

The path to the DeSR.so shared library must be included in the corresponding environment variable (LD_LIBRARY_PATH on Linux), as must the path to the DeSR parser shared library lib_desr.so.
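On Linux this could be done as in the sketch below, assuming a POSIX shell. `add_lib_dir` is a hypothetical helper, and both directories are placeholders for wherever DeSR.so and lib_desr.so actually live on the machine.

```shell
# Hypothetical helper: appends a directory to LD_LIBRARY_PATH unless already listed.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;    # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1" ;;
    esac
    export LD_LIBRARY_PATH
}

# Placeholder locations (assumptions); adjust to the real directories.
add_lib_dir "/path/to/DeSR-cpp"     # directory containing DeSR.so
add_lib_dir "/path/to/desr/src"     # directory containing lib_desr.so
```

The duplicate check keeps the variable from growing each time the script is sourced in the same session.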

The annotator descriptor file can be found in the annotator Eclipse project, and the required resources in /resources. Generate the PEAR and install it; the annotator will then use the shared library DeSR.so specified in the annotator descriptor file as well as in the environment variable.

The necessary shared libraries are included in the annotator PEAR. Once the annotator has been installed, the shared libraries lib_desr.so and DeSR.so will be located in the /lib directory.
