-
Hi @gotyca, it is unfortunate that in the first example the Reader only picks out "Return" as the answer! The answer span that the model picks is influenced by many factors, and the slightest difference in the context around the answer, or in the query itself, can cause the model to choose a different span. I expect that the differences in context account for the different predictions in your first and second examples. One thing I would suggest is rephrasing the query and seeing whether the model changes its prediction. Could you try asking "When will the risk status change?" or "When is the risk status expected to change?"
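To make that experiment systematic, you could loop over several phrasings of the same question and compare the spans the Reader picks for the same document. A minimal sketch of that idea (the `fake_predict` stub below only mimics the shape of a Haystack Reader's prediction output, with hypothetical canned answers, so the snippet is self-contained; in practice you would pass a real `reader.predict` instead):

```python
def probe_paraphrases(predict_fn, paraphrases, documents):
    """Run the same document(s) through a Reader with several query
    phrasings and collect the top answer span for each phrasing."""
    spans = {}
    for query in paraphrases:
        prediction = predict_fn(query=query, documents=documents)
        spans[query] = prediction["answers"][0]["answer"]
    return spans

# Stand-in for a real Reader's predict method. The answers below are
# hypothetical and exist only so this example runs without a model.
def fake_predict(query, documents):
    canned = {
        "When the risk status will change?":
            "Return",
        "When will the risk status change?":
            "Return to expected targeted for Q1 2022",
        "When is the risk status expected to change?":
            "Return to expected targeted for Q1 2022",
    }
    return {"answers": [{"answer": canned[query]}]}

queries = [
    "When the risk status will change?",
    "When will the risk status change?",
    "When is the risk status expected to change?",
]
results = probe_paraphrases(fake_predict, queries, documents=None)

# If the phrasings disagree, the query wording is part of the problem.
print(len(set(results.values())))  # 2 distinct spans -> inconsistent
```

If all paraphrases agree on the truncated span, the issue is more likely in the context itself (or the fine-tuning data) than in the query wording.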
-
Hello.
I'm using the Haystack framework and some of the models developed by deepset to assess data quality.
The idea is: I pass a small context to the Reader and ask a specific question in order to check whether the answer is inside the context or not.
For that, I fine-tuned roberta-base-squad2 (using Haystack) with some domain-specific data.
The problem: in general the model works pretty well, but sometimes it returns just the first word of the answer (with a high score), and it does so inconsistently. That is, sometimes the model returns the whole answer and sometimes not, for very similar contexts.
Here is an example where the model failed to answer:
{'query': 'When the risk status will change?', 'no_ans_gap': 19.94709014892578, 'answers': [{'answer': 'Return', 'score': 0.9999945163726807, 'context': 'ty (product idea through change released to customers) with recommendations made to address any improvements. Return to expected targeted for Q1 2022.', 'offset_start': 110, 'offset_end': 116, 'offset_start_in_doc': 458, 'offset_end_in_doc': 464, 'document_id': 'bebf00db8e1fb6846705afbbfcba4d14'}]}
The expected answer is "Return to expected targeted for Q1 2022", but the model returns just the word "Return".
In this other case the model succeeded:
{'query': 'When the risk status will change?', 'no_ans_gap': 33.04367637634277, 'answers': [{'answer': 'Return to Expected in February 2023', 'score': 0.9999982118606567, 'context': ' automation where possible as well as accurate identification of in scope population requiring reviews/approvals. Return to Expected in February 2023.', 'offset_start': 114, 'offset_end': 149, 'offset_start_in_doc': 341, 'offset_end_in_doc': 376, 'document_id': '4d6c5b8b2838c5d8abf5f0433e06eeaa'}]}
The expected answer is: "Return to Expected in February 2023"
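For what it's worth, the truncation is visible directly in the prediction's offsets: slicing the returned context with `offset_start`/`offset_end` reproduces exactly the span the Reader chose, which makes it easy to flag answers that stop short of an expected string. A small check using the failing prediction above (the `stops_short` helper is just an illustration, not part of Haystack):

```python
def extract_span(prediction):
    """Recover the predicted answer text from an answer dict by
    slicing the returned context with the reported offsets."""
    return prediction["context"][prediction["offset_start"]:prediction["offset_end"]]

def stops_short(prediction, expected_answer):
    """True if the expected answer is present in the context but the
    predicted span only covers a prefix of it."""
    span = extract_span(prediction)
    return (expected_answer in prediction["context"]
            and span != expected_answer
            and expected_answer.startswith(span))

# The failing prediction from the first example above.
failing = {
    "answer": "Return",
    "offset_start": 110,
    "offset_end": 116,
    "context": (
        "ty (product idea through change released to customers) with "
        "recommendations made to address any improvements. "
        "Return to expected targeted for Q1 2022."
    ),
}

print(extract_span(failing))  # Return
print(stops_short(failing, "Return to expected targeted for Q1 2022"))  # True
```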
Have you seen this behavior before? Any ideas on how to make the model answer consistently?