Evaluator updates #103

omri374 · 2024-10-12T08:56:39Z

Revamp of the evaluation process, including:

- Improved evaluation notebooks: notebook 4 shows a vanilla Presidio evaluation, and notebook 4.5 shows a more customized Presidio setup with improved accuracy
- Removed the pseudonymization notebook, as there is a more advanced approach within Presidio
- Added the ability to use generic entities and skip words
- Added the ability to run faster batch prediction
- Added sample_id to be able to reproduce the full sample
- Fixed issue with hospital provider networking

- minor bug fixes in evaluation
- added the ability to use generic entities and skip words
- added the ability to do faster batch predict
- removed the old CRF implementation
- added sample_id to be able to reproduce the full sample from error

- minor bug fixes in evaluation
- added the ability to use generic entities and skip words
- added the ability to do faster batch predict
- removed the old CRF implementation
- added sample_id to be able to reproduce the full sample from error
- fixed issue with hospital provider networking

NOTICE

presidio_evaluator/data_generator/faker_extensions/data_objects.py

presidio_evaluator/data_generator/faker_extensions/providers.py

presidio_evaluator/data_generator/faker_extensions/span_generator.py

presidio_evaluator/data_generator/presidio_data_generator.py

presidio_evaluator/data_generator/raw_data/templates.txt

presidio_evaluator/data_objects.py

presidio_evaluator/evaluation/evaluation_result.py

presidio_evaluator/evaluation/evaluator.py

presidio_evaluator/evaluation/model_error.py

presidio_evaluator/models/flair_model.py

presidio_evaluator/models/presidio_analyzer_wrapper.py

presidio_evaluator/models/presidio_recognizer_wrapper.py

presidio_evaluator/models/spacy_model.py

presidio_evaluator/models/text_analytics_wrapper.py

omri374 · 2024-10-24T21:12:20Z

/azp run

azure-pipelines · 2024-10-24T21:12:28Z

Azure Pipelines successfully started running 1 pipeline(s).

tranguyen221

Hello Omri, great PR! Thanks for working on it. I’ve run it locally, and it works perfectly. I just have a few minor comments.

notebooks/4_Evaluate_Presidio_Analyzer.ipynb

tranguyen221 · 2024-10-31T06:27:31Z

presidio_evaluator/evaluation/evaluator.py

            """Graph most common false positive and false negative tokens for each entity."""
-            ModelError.most_common_fp_tokens(self.errors)
            fps_frames = []
            fns_frames = []
            for entity in self.model.entity_mapping.values():


Now that you have identified the 'Wrong-Entity' error type, would it make sense to include a plot for it? Should we add a plot for the most common 'Wrong-Entity' errors?

Right now it's part of the FPs. Would you suggest separating the two, or keeping it under both FPs and wrong entity?

The plot_most_common_tokens function calls a different function depending on the type of error it is analyzing. For false positive errors, it calls the get_fps_dataframe function with the parameter error_type="FP". For false negative errors, it calls the get_fns_dataframe function with the parameter error_type="FN". Additionally, to analyze errors related to the wrong entity, it calls the get_wrong_entity_dataframe function with the parameter error_type="Wrong entity".
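A rough sketch of that dispatch, for discussion only. The three get_*_dataframe names come from the comment above; their bodies here are stand-ins that work on plain dictionaries and return simple lists, not the real ModelError methods. The DISPATCH table and the most_common_tokens wrapper are hypothetical names.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Error = Dict[str, str]  # stand-in record with keys: error_type, entity, token
Rows = List[Tuple[str, int]]


def _top_tokens(errors: List[Error], error_type: str, entity: str, n: int = 5) -> Rows:
    """Count the most frequent tokens for one error type and one entity."""
    counts = Counter(
        e["token"]
        for e in errors
        if e["error_type"] == error_type and e["entity"] == entity
    )
    return counts.most_common(n)


# Stand-ins for the functions named in the review comment (assumed signatures).
def get_fps_dataframe(errors: List[Error], entity: str) -> Rows:
    return _top_tokens(errors, error_type="FP", entity=entity)


def get_fns_dataframe(errors: List[Error], entity: str) -> Rows:
    return _top_tokens(errors, error_type="FN", entity=entity)


def get_wrong_entity_dataframe(errors: List[Error], entity: str) -> Rows:
    return _top_tokens(errors, error_type="Wrong entity", entity=entity)


# Hypothetical dispatch table: one entry per error type to plot.
DISPATCH: Dict[str, Callable[[List[Error], str], Rows]] = {
    "FP": get_fps_dataframe,
    "FN": get_fns_dataframe,
    "Wrong entity": get_wrong_entity_dataframe,
}


def most_common_tokens(errors: List[Error], entity: str) -> Dict[str, Rows]:
    """Collect the per-error-type token counts that a plot would consume."""
    return {name: fn(errors, entity) for name, fn in DISPATCH.items()}
```

With a table like this, wrong-entity errors get their own series instead of being folded into the FPs, and adding another error type means adding one entry.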

tranguyen221 · 2024-10-31T06:42:00Z

presidio_evaluator/evaluation/evaluator.py

-        return ((1 + beta**2) * precision * recall) / (
-            ((beta**2) * precision) + recall
-        )
+        return ((1 + beta**2) * precision * recall) / (((beta**2) * precision) + recall)

    class Plotter:


I think we could separate the Evaluation and Plotter classes. That would improve maintainability and readability, especially since the file is lengthy at 727 lines. It would also make each class easier to test and reuse.

tranguyen221 · 2024-10-31T08:33:18Z

presidio_evaluator/evaluation/evaluator.py

+            fig.update_traces(textfont=dict(size=10))
+            fig.update_layout(width=800, height=800)
+
+            fig.show(save_as)


Maybe add error handling here for unexpected values of the save_as parameter?
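One possible shape for that check, as a minimal sketch. It assumes save_as is forwarded to Plotly as the renderer name, which is what the positional fig.show(save_as) call does. The validate_renderer name and the way the list of available renderers is passed in are hypothetical; with Plotly, that list would presumably come from plotly.io.renderers.

```python
from typing import Iterable, Optional


def validate_renderer(save_as: Optional[str], available: Iterable[str]) -> Optional[str]:
    """Return save_as unchanged if it is usable as a renderer name, else raise.

    None is allowed and means "use the default renderer".
    """
    if save_as is None:
        return None
    if not isinstance(save_as, str):
        raise TypeError(f"save_as must be a string or None, got {type(save_as).__name__}")

    known = set(available)
    # Plotly lets renderers be combined with "+", e.g. "notebook+pdf",
    # so each part is checked on its own.
    unknown = [name for name in save_as.split("+") if name not in known]
    if unknown:
        raise ValueError(
            f"Unknown renderer(s) {unknown}; expected one of {sorted(known)}"
        )
    return save_as
```

Calling this before fig.show(save_as) would turn a typo such as "pngg" into an error message that lists the accepted values.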

tranguyen221 · 2024-10-31T08:33:37Z

presidio_evaluator/evaluation/evaluator.py

                )
                fig.update_layout(yaxis={"categoryorder": "total ascending"})
-                fig.show()
+                fig.show(save_as)


Maybe add error handling here for unexpected values of the save_as parameter?

Also, in an automated pipeline, the end user might prefer to save the plot to a file rather than display it directly. It could therefore be useful to include an option such as fig.write_image('filename.png').

One way to do this is to just return the fig object, which lets the user take it and save it. Another is to add a parameter with a path to save to. Which one would be better from your perspective?

I prefer to return the fig object.
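If the plotting methods return the figure, the caller could save it with a small helper along these lines. This is only a sketch: export_figure and SupportsExport are hypothetical names, and the helper relies only on the figure exposing write_html and write_image, which Plotly figures do (static image export in Plotly needs an extra package such as kaleido).

```python
from pathlib import Path
from typing import Protocol, Union


class SupportsExport(Protocol):
    """The two export methods the helper needs; Plotly figures provide both."""

    def write_html(self, file, *args, **kwargs): ...

    def write_image(self, file, *args, **kwargs): ...


def export_figure(fig: SupportsExport, path: Union[str, Path]) -> Path:
    """Save a returned figure, choosing the writer from the file extension."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.suffix.lower() in {".html", ".htm"}:
        # Interactive output, no extra dependency needed.
        fig.write_html(str(target))
    else:
        # Static output such as .png, .svg or .pdf.
        fig.write_image(str(target))
    return target
```

This keeps the evaluator free of file-handling parameters: a notebook user can call fig.show() on the returned object, while a pipeline can pass it to a helper like this one.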

omri374 added 2 commits October 12, 2024 11:53

- improved evaluation notebooks

c9984e7

- minor bug fixes in evaluation
- added the ability to use generic entities and skip words
- added the ability to do faster batch predict
- removed the old CRF implementation
- added sample_id to be able to reproduce the full sample from error