database vs. document cloud data mismatch #65

Open
jywsn opened this issue Oct 14, 2015 · 6 comments

@jywsn
Contributor

jywsn commented Oct 14, 2015

Joe got this link to a document in his search alert email:
https://cityhallmonitor.knightlab.com/documents/160595

The 160595 in that URL is the MatterAttachment ID.

This record does not exist in the staging database.

When I look up that record in the production database, the matter_id is 79730 and hyperlink is
http://ord.legistar.com/Chicago/attachments/bf8b9908-d560-4e8a-95a2-a071f4b5d64d.pdf

This matches what is on legistar:
http://webapi.legistar.com/v1/chicago/Matters/79730/Attachments

When I look up that document on DocumentCloud (DC) by MatterAttachmentID, the Source is http://ord.legistar.com/Chicago/attachments/eb5113a9-9ad5-4582-ac2a-082fc3dd1fe2.pdf

When I search DC for documents with MatterId: 79730, I see only the one above, so the DC copy points at a different PDF than the production record does.
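The manual check above (same MatterAttachment ID, different URL on each side) could be automated roughly like this. This is only a sketch: `find_source_mismatches` and the two dicts are hypothetical, and it assumes the production hyperlinks and the DC Source values have already been exported into plain dicts keyed by MatterAttachment ID.

```python
def find_source_mismatches(db_hyperlinks, dc_sources):
    """Return {attachment_id: (db_url, dc_url)} where the two URLs differ.

    Both arguments are hypothetical dicts mapping MatterAttachment ID -> URL.
    IDs that are missing on the DC side are skipped here; they are a
    separate kind of problem from a URL mismatch.
    """
    mismatches = {}
    for attachment_id, db_url in db_hyperlinks.items():
        dc_url = dc_sources.get(attachment_id)
        if dc_url is not None and dc_url != db_url:
            mismatches[attachment_id] = (db_url, dc_url)
    return mismatches


# The one record from this issue, entered by hand.
db_hyperlinks = {
    160595: "http://ord.legistar.com/Chicago/attachments/bf8b9908-d560-4e8a-95a2-a071f4b5d64d.pdf",
}
dc_sources = {
    160595: "http://ord.legistar.com/Chicago/attachments/eb5113a9-9ad5-4582-ac2a-082fc3dd1fe2.pdf",
}

mismatches = find_source_mismatches(db_hyperlinks, dc_sources)
for attachment_id, (db_url, dc_url) in mismatches.items():
    print(attachment_id, "db:", db_url, "dc:", dc_url)
```

How the two dicts get filled (a database query, a DC search) is left out on purpose, since that depends on code not shown in this thread.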

@jywsn jywsn added the bug label Oct 14, 2015
@jywsn jywsn added this to the Release 0.1 milestone Oct 14, 2015
@jywsn
Contributor Author

jywsn commented Oct 16, 2015

More weirdness. Or maybe this is not weird. I'm not sure:

The prd database has 61264 MatterAttachment records.

According to DC, there are 61288 documents in the `Chicago City Hall Monitor` project, which is 24 more than the prd database.
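A raw count comparison cannot say whether the gap comes from records missing on one side, extra records on the other, or duplicates. A rough sketch of a per-ID comparison that separates those cases is below. `audit_ids` is a hypothetical helper, and it assumes each DC document carries its MatterAttachment ID so both sides can be reduced to lists of IDs.

```python
from collections import Counter


def audit_ids(db_ids, dc_ids):
    """Compare MatterAttachment IDs from the database with IDs found on DC.

    db_ids: iterable of IDs from the database (assumed unique).
    dc_ids: iterable of IDs read from DC documents (may contain repeats).
    """
    db_set = set(db_ids)
    dc_counts = Counter(dc_ids)
    dc_set = set(dc_counts)
    return {
        "only_in_db": sorted(db_set - dc_set),
        "only_in_dc": sorted(dc_set - db_set),
        "duplicated_in_dc": sorted(i for i, n in dc_counts.items() if n > 1),
    }


# Toy IDs for illustration only, not real data.
report = audit_ids(db_ids=[1, 2, 3], dc_ids=[2, 3, 3, 4])
print(report)
```

With the toy input, ID 1 is only in the database, ID 4 is only on DC, and ID 3 appears twice on DC.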

@jywsn jywsn self-assigned this Oct 19, 2015
@jywsn
Contributor Author

jywsn commented Oct 19, 2015

working on a data audit...

@jywsn
Contributor Author

jywsn commented Oct 22, 2015

Details from 10/19 audit here: https://basecamp.com/1937751/projects/9758052/documents/9934054

The audit exposed some issues that should be rectified before another audit is done.

@jywsn jywsn removed the in-progress label Oct 22, 2015
@jywsn jywsn removed their assignment Oct 22, 2015
@jywsn jywsn modified the milestones: Iteration 0.1b, Iteration 0.1a Oct 22, 2015
@jywsn
Contributor Author

jywsn commented Oct 23, 2015

Also see the `Data Audit` wiki page in this repo.

@hbillings hbillings modified the milestones: Iteration 0.1b, Iteration 0.2, Iteration 1.0 Jan 13, 2016
@JoeGermuska JoeGermuska self-assigned this Jan 19, 2016
@JoeGermuska
Member

May no longer be an issue?

@JoeGermuska
Member

There is a very small number of cases (3 at this moment) where we have a MatterAttachment without a corresponding Document. These are all cases where there are multiple attachments per matter, and two of the three are for non-PDFs.
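For anyone repeating this check later, here is a rough sketch of how the leftover cases could be listed together with the two traits mentioned above (attachment count on the matter, PDF or not). `describe_missing` and the tuple layout are hypothetical, the example URLs are made up, and the PDF test only looks at the file extension in the hyperlink.

```python
from collections import Counter
from urllib.parse import urlparse


def describe_missing(attachments, ids_with_document):
    """List attachments that have no Document, with some context for each.

    attachments: list of (attachment_id, matter_id, hyperlink) tuples.
    ids_with_document: set of attachment IDs that do have a Document.
    """
    per_matter = Counter(matter_id for _, matter_id, _ in attachments)
    rows = []
    for attachment_id, matter_id, hyperlink in attachments:
        if attachment_id in ids_with_document:
            continue
        path = urlparse(hyperlink).path.lower()
        rows.append({
            "attachment_id": attachment_id,
            "matter_id": matter_id,
            "is_pdf": path.endswith(".pdf"),
            "attachments_on_matter": per_matter[matter_id],
        })
    return rows


# Made-up rows for illustration only.
attachments = [
    (10, 500, "http://example.com/a.pdf"),
    (11, 500, "http://example.com/b.docx"),
    (12, 501, "http://example.com/c.pdf"),
]
missing = describe_missing(attachments, ids_with_document={10, 12})
print(missing)
```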
