Revamp of the date pipeline #22
The test that fails is I'm removing this single test
Remove unneeded patterns
Remove modifiers
Update key parsing
Modify the axes. Types: absolute, relative (with direction), duration. Mode: SINCE, UNTIL, etc.
Drum roll please 🥁
@Thomzoy & @percevalw, I'm eager to have your opinion on this one.
Codecov Report
@@ Coverage Diff @@
## master #22 +/- ##
==========================================
+ Coverage 95.88% 96.22% +0.34%
==========================================
Files 131 139 +8
Lines 3107 3151 +44
==========================================
+ Hits 2979 3032 +53
+ Misses 128 119 -9
I did not review all files, but what I've seen is terrific.
Great work! Benchmark on 10000 clinical docs:
It is a bit slower, but the new features are definitely worth it, and we can still optimise in the future.
Thanks for the benchmark! I am a bit disappointed by the results though :) I just pushed a fix for the code block checks; I'll merge the changes as soon as the CI passes. Thank you for your input!
Description
This PR proposes a new `eds.dates` pipeline. It addresses the many limitations suffered by the current component.

The proposed pipeline drops the `dateparser` dependency. Albeit powerful, `dateparser` was not a perfect fit, and incurred a significant toll on the pipeline's overall complexity to work around the library's limitations.

The new date matcher handles:
pendulum
| Example | Parsed value | Duration |
|---|---|---|
| cette année | `{'year': 0, 'direction': <Direction.CURRENT: 'CURRENT'>}` | 0 microseconds |
| pendant trois semaines | `{'mode': <Mode.DURATION: 'DURATION'>, 'week': 3}` | 3 weeks |
| cette année pendant trois semaines | `{<Mode.FROM: 'FROM'>: cette année, <Mode.DURATION: 'DURATION'>: pendant trois semaines}` | |
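To make the parsed values above more concrete, here is a minimal sketch of how such dictionaries could be turned into a duration. It is not the PR's implementation: the `Direction` and `Mode` enums are rebuilt from the reprs shown above (the `PAST`, `FUTURE` and `UNTIL` members are assumptions), `to_timedelta` is a hypothetical helper, and the standard library's `timedelta` stands in for the `pendulum` objects the pipeline produces, with years and months approximated as 365 and 30 days.

```python
from datetime import timedelta
from enum import Enum


class Direction(Enum):
    PAST = "PAST"        # assumed member, not shown in the PR description
    CURRENT = "CURRENT"  # appears in the example output
    FUTURE = "FUTURE"    # assumed member, not shown in the PR description


class Mode(Enum):
    FROM = "FROM"          # appears in the example output
    UNTIL = "UNTIL"        # mentioned in the commit messages
    DURATION = "DURATION"  # appears in the example output


def to_timedelta(parsed: dict) -> timedelta:
    """Hypothetical helper: convert a parsed relative date into an offset."""
    # timedelta has no year or month unit, so both are approximated here.
    days = (
        365 * parsed.get("year", 0)
        + 30 * parsed.get("month", 0)
        + 7 * parsed.get("week", 0)
        + parsed.get("day", 0)
    )
    delta = timedelta(days=days)
    # A date pointing to the past yields a negative offset.
    if parsed.get("direction") is Direction.PAST:
        return -delta
    return delta


this_year = {"year": 0, "direction": Direction.CURRENT}  # "cette année"
three_weeks = {"mode": Mode.DURATION, "week": 3}         # "pendant trois semaines"

print(to_timedelta(this_year))
print(to_timedelta(three_weeks))
```

Keeping the parsed value as a plain mapping of units makes the conversion a simple sum, which is presumably part of what dropping `dateparser` buys in terms of complexity.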
The experimental period detection algorithm uses pre-detected dates.
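One way to picture period detection on top of already detected dates is to pair neighbouring dates that carry different modes. The sketch below is an illustration under assumptions, not the algorithm from this PR: `DetectedDate`, `detect_periods` and the `max_gap` threshold are hypothetical names, and a date without an explicit mode is assumed to act as the `FROM` bound.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Mode(Enum):
    FROM = "FROM"
    UNTIL = "UNTIL"
    DURATION = "DURATION"


@dataclass
class DetectedDate:
    """Hypothetical container for a date found by the matcher."""
    text: str
    start: int  # index of the first token
    end: int    # index one past the last token
    mode: Optional[Mode] = None


def detect_periods(
    dates: List[DetectedDate], max_gap: int = 3
) -> List[Dict[Mode, DetectedDate]]:
    """Pair adjacent dates with different modes into periods."""
    ordered = sorted(dates, key=lambda d: d.start)
    periods = []
    i = 0
    while i < len(ordered) - 1:
        first, second = ordered[i], ordered[i + 1]
        # Assumption: a date without a mode is treated as the FROM bound.
        first_mode = first.mode if first.mode is not None else Mode.FROM
        second_mode = second.mode if second.mode is not None else Mode.FROM
        close_enough = second.start - first.end <= max_gap
        if close_enough and first_mode is not second_mode:
            periods.append({first_mode: first, second_mode: second})
            i += 2  # both dates are consumed by the period
        else:
            i += 1
    return periods


dates = [
    DetectedDate("cette année", start=0, end=2),
    DetectedDate("pendant trois semaines", start=2, end=5, mode=Mode.DURATION),
]
periods = detect_periods(dates)
print({mode.value: date.text for mode, date in periods[0].items()})
```

Because the pairing only consumes the matcher's output, the period step can stay experimental without affecting date detection itself.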
Checklist