Only parse Schedule A itemizations #45
Comments
Hi @NickCrews, thanks for the question. I agree this is an important and useful feature to add. I'll see how easy it is to add a flag to pass a regex form filter. Would something like |
I think that will mostly work but I have observed out-of-order forms in the past (very rare). @chriszs may have more insight |
Dylan's correct. Order is not guaranteed, though files are often ordered that way. For multi-gigabyte files, the limiting factor tends to be download speed. Filtering form types in FastFEC would speed up parsing, but it wouldn't bail halfway through the download as this does, so it wouldn't have much impact on the overall time. Speeding up the download using |
Thank you for the responses. It makes sense that we can't rely on order, darn. And I can see how, if we need to download the whole file anyway, skipping parsing won't gain much speed. I guess it would save some disk space. So this isn't a must-have for me; if you aren't interested in supporting it, I wouldn't be heartbroken. I would say that I might prefer explicit table names instead of a regex, since there aren't that many options. (Unless I'm wrong and there are a lot?)
Hi! Thanks for this great utility.
I only care about the Schedule A itemizations. In some multi-gigabyte .fec files, the non-Schedule A entries take up more than half of the file and so really slow down parsing.
Can we add some options to only parse particular itemizations?
In the meantime, I do this. Do you see any problems with it? For example, are Schedule A itemizations always going to come before other schedules?
and use it as
curl https://docquery.fec.gov/dcdev/posted/13360.fec | filter_fec.sh | fastfec 13360
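For anyone who wants a filter that does not depend on row order, here is a minimal sketch. It is not the script from this issue, and `keep_schedule_a` is a made-up name. It assumes that the first two lines of a filing are the header and the cover form and should be passed through, that fields are separated by the ASCII 28 (file separator) character, and that Schedule A rows have a form type starting with `SA`. It still reads the whole stream, so it saves parsing time and disk space but not download time.

```shell
# Hypothetical helper, not part of FastFEC.
# Keeps the first two lines (assumed: header and cover form) and every
# row whose first field starts with "SA" (assumed: Schedule A rows),
# wherever those rows appear in the file.
keep_schedule_a() {
  awk 'BEGIN { FS = "\034" }            # ASCII 28 field separator (assumed)
       NR <= 2 || toupper($1) ~ /^SA/   # pass header lines and SA* rows
  '
}
```

It would slot into the same pipeline: `curl https://docquery.fec.gov/dcdev/posted/13360.fec | keep_schedule_a | fastfec 13360`.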