BUG: (maybe?) Missing trailing commas from output #47

Closed
NickCrews opened this issue Aug 23, 2022 · 1 comment

NickCrews commented Aug 23, 2022

Not sure if this is a bug or not. If I run fastfec 878160 and look at the resulting output/878160/SA11D.csv, I see this:

form_type,filer_committee_id_number,transaction_id,back_reference_tran_id_number,back_reference_sched_name,entity_type,contributor_organization_name,contributor_last_name,contributor_first_name,contributor_middle_name,contributor_prefix,contributor_suffix,contributor_street_1,contributor_street_2,contributor_city,contributor_state,contributor_zip_code,election_code,election_other_description,contribution_date,contribution_amount,contribution_aggregate,contribution_purpose_descrip,contributor_employer,contributor_occupation,donor_committee_fec_id,donor_committee_name,donor_candidate_fec_id,donor_candidate_last_name,donor_candidate_first_name,donor_candidate_middle_name,donor_candidate_prefix,donor_candidate_suffix,donor_candidate_office,donor_candidate_state,donor_candidate_district,conduit_name,conduit_street1,conduit_street2,conduit_city,conduit_state,conduit_zip_code,memo_code,memo_text_description,reference_code
SA11D,C00477828,C7168136,,,CAN,,Clarke,Hansen,,,,2900 E Jefferson Ave,Apt C4,Detroit,MI,482074242,P2012,,2013-06-30,565.73,565.73,,,,,,H0MI13398,Clarke,Hansen,,,,H,MI,13,,,,,,,,"* In-Kind: In-kind, web hosting and phone services, to be reimbursed"

It looks to me like this row is missing the trailing comma required to separate memo_text_description from the (empty) reference_code value. If I try to load this with the pyarrow CSV reader using the given 45 column names, it fails because it sees only 44 values in the row. You can reproduce this with pd.read_csv(path, engine="pyarrow"). Other CSV parsers, such as vanilla pandas (pd.read_csv(path)) and vaex, are more forgiving and just fill in NA for the missing reference_code values, so perhaps that is why this hasn't been caught before.
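To check which rows are affected without depending on one parser's error message, something like the following could work. It is a minimal sketch using only Python's standard csv module; find_short_rows and the miniature sample are made up for illustration and are not part of fastfec or of the real SA11D.csv.

```python
import csv
import io


def find_short_rows(text):
    """Return (line_number, n_fields, n_expected) for rows shorter than the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    expected = len(header)
    short = []
    for row in reader:
        if len(row) < expected:
            # reader.line_num is the number of source lines read so far
            short.append((reader.line_num, len(row), expected))
    return short


# Hypothetical miniature of the pattern described above: the last column is
# empty and the comma that should precede it is missing on the first data row.
sample = (
    "form_type,memo_text_description,reference_code\n"
    'SA11D,"* In-Kind: In-kind, web hosting"\n'
    "SA11D,some memo,\n"
)

short_rows = find_short_rows(sample)
print(short_rows)
```

The quoted memo field contains a comma, so counting commas in the raw line would give the wrong answer; letting csv.reader split the fields avoids that.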

If I look at the resulting output/878160/SB17.csv, it's a similar story: there is one fewer trailing comma than there should be to separate the missing last value.

However, if I look at output/878160/F3S.csv, then this looks correct. I'd guess this is because the last value in that row is non-missing:

form_type,filer_committee_id_number,date_general_election,date_day_after_general_election,a_total_contributions_no_loans,b_total_contribution_refunds,c_net_contributions,a_total_operating_expenditures,b_total_offsets_to_operating_expenditures,c_net_operating_expenditures,a_i_individuals_itemized,a_ii_individuals_unitemized,a_iii_individuals_total,b_political_party_committees,c_all_other_political_committees_pacs,d_the_candidate,e_total_contributions,transfers_from_other_auth_committees,a_loans_made_or_guarn_by_the_candidate,b_all_other_loans,c_total_loans,offsets_to_operating_expenditures,other_receipts,total_receipts,operating_expenditures,transfers_to_other_auth_committees,a_loan_repayment_by_candidate,b_loan_repayments_all_other_loans,c_total_loan_repayments,a_refund_individuals_other_than_pol_cmtes,b_refund_political_party_committees,c_refund_other_political_committees,d_total_contributions_refunds,other_disbursements,total_disbursements
F3S,C00477828,2012-11-06,2012-11-07,3120.73,0.00,3120.73,2153.17,3340.65,-1187.48,1500.00,55.00,1555.00,0.00,1000.00,565.73,3120.73,0.00,0.00,0.00,0.00,3340.65,0.00,6461.38,2153.17,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,2153.17
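
As a possible stopgap for strict readers, short rows could be padded with empty fields before loading. This is only a sketch under the assumption that the sole problem is missing trailing empty values; pad_short_rows is a hypothetical helper, not something fastfec provides, and the sample data is invented.

```python
import csv
import io


def pad_short_rows(src, dst):
    """Copy CSV from src to dst, padding rows that are shorter than the header."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    writer.writerow(header)
    for row in reader:
        # Append one empty string per missing trailing field (no-op if none missing)
        row += [""] * (len(header) - len(row))
        writer.writerow(row)


# Invented example: the first data row lacks its final (empty) field.
src = io.StringIO("a,b,c\n1,2\n4,5,6\n")
out = io.StringIO()
pad_short_rows(src, out)
print(out.getvalue())
```

With real files, the same function could be given file objects opened with newline="" as the csv module documentation recommends.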

NickCrews commented

Dupe of #24

NickCrews closed this as not planned on Dec 16, 2022