konklone edited this page Oct 29, 2012 · 43 revisions

This project gathers and produces public domain data about legislation in Congress. Right now, data is collected from [THOMAS.gov](http://thomas.gov), an official source for legislative information, run by the Library of Congress, and covers 1973 to the present.

Download in bulk

Current and archival data is available for public download, hosted here on GitHub. You can download ZIP files containing a JSON file for every bill in each Congress since 1973.

Data for the current Congress is refreshed nightly. There is no need to download it more frequently than this; the official sources are not updated in real time.

112th Congress - 2011-2012 | 107th Congress - 2001-2002 | 102nd Congress - 1991-1992 | 97th Congress - 1981-1982
111th Congress - 2009-2010 | 106th Congress - 1999-2000 | 101st Congress - 1989-1990 | 96th Congress - 1979-1980
110th Congress - 2007-2008 | 105th Congress - 1997-1998 | 100th Congress - 1987-1988 | 95th Congress - 1977-1978
109th Congress - 2005-2006 | 104th Congress - 1995-1996 | 99th Congress - 1985-1986 | 94th Congress - 1975-1976
108th Congress - 2003-2004 | 103rd Congress - 1993-1994 | 98th Congress - 1983-1984 | 93rd Congress - 1973-1974

Documentation

Every bill has a JSON file, data.json, with fields related to a bill's ID, status, names, sponsorship, amendments, and history.
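As a rough illustration of working with a downloaded archive, the sketch below walks a ZIP file and parses every data.json it finds. The function name iter_bills is hypothetical, and the code assumes only that each bill's file name ends in data.json; the directory layout inside the archive is not documented here, so nothing else about the paths is relied on.

```python
import json
import zipfile


def iter_bills(zip_source):
    """Yield one parsed bill dict per data.json entry in the archive.

    zip_source may be a path or a file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Assumption: every bill file is named data.json, at any depth.
            if name.endswith("data.json"):
                with archive.open(name) as handle:
                    yield json.load(handle)
```

With a downloaded archive saved under a name of your choosing, iterating over iter_bills(path) gives one dictionary per bill, with the fields described in the sections below.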

These examples use data excerpts from H.R. 3590 from the 111th Congress - the Patient Protection and Affordable Care Act, also known as Obamacare.

Basic information

{
  "bill_id": "hr3590-111", 
  "bill_type": "hr", 
  "number": "3590", 
  "congress": "111", 
  "introduced_at": "2009-09-17"
}

Bill IDs are of the form [bill_type][number]-[congress]. Bill numbering restarts at the beginning of each newly elected Congress. A "Congress" is the two-year term that follows each nationwide election of new members. The 111th Congress ran from January 2009 to December 2010. In modern times, sessions of Congress begin in January and end in December, but this has not always been the case, and it is not required.

Bill types can be: hr, hres, hjres, hconres, s, sres, sjres, sconres.
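The ID format can be taken apart with a regular expression. This is a minimal sketch, not part of the project itself; parse_bill_id is a hypothetical helper, and it keeps number and congress as strings because that is how they appear in data.json.

```python
import re

# The eight bill types listed above.
BILL_TYPES = ("hr", "hres", "hjres", "hconres", "s", "sres", "sjres", "sconres")
BILL_ID_PATTERN = re.compile(r"^(%s)(\d+)-(\d+)$" % "|".join(BILL_TYPES))


def parse_bill_id(bill_id):
    """Split an ID such as 'hr3590-111' into its three parts."""
    match = BILL_ID_PATTERN.match(bill_id)
    if match is None:
        raise ValueError("not a bill ID: %r" % bill_id)
    bill_type, number, congress = match.groups()
    return {"bill_type": bill_type, "number": number, "congress": congress}


print(parse_bill_id("hr3590-111"))
# {'bill_type': 'hr', 'number': '3590', 'congress': '111'}
```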

All introduction dates are dates, not specific times.

Titles

{
  "official_title": "An act entitled The Patient Protection and Affordable Care Act.", 
  "popular_title": "Health care reform bill", 
  "short_title": "Patient Protection and Affordable Care Act", 
  "titles": [
    {
      "as": null, 
      "title": "Health care reform bill", 
      "type": "popular"
    }, 
    {
      "as": "enacted", 
      "title": "Patient Protection and Affordable Care Act", 
      "type": "short"
    }, 
    {
      "as": "amended by senate", 
      "title": "An act entitled The Patient Protection and Affordable Care Act.", 
      "type": "official"
    }
  ]
}

Bills can have "official" descriptive titles (almost always), "short" catchy titles (sometimes), and "popular" nickname titles (rarely). A bill can have many titles of each kind, given at various stages of its life. The current official, short, and popular titles are kept in the top-level official_title, short_title, and popular_title fields.

Popular titles are assigned by the Library of Congress, and can be added at any time.
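To look beyond the top-level fields, the titles array can be filtered by type. The helper below is a hypothetical sketch that assumes every entry carries the title and type keys shown in the excerpt above.

```python
def titles_of_type(bill, title_type):
    """Return every title string of the given type, in file order."""
    return [
        entry["title"]
        for entry in bill.get("titles", [])
        if entry["type"] == title_type
    ]


# Trimmed copy of the excerpt above.
bill = {
    "titles": [
        {"as": None, "title": "Health care reform bill", "type": "popular"},
        {"as": "enacted", "title": "Patient Protection and Affordable Care Act", "type": "short"},
    ]
}
print(titles_of_type(bill, "short"))
# ['Patient Protection and Affordable Care Act']
```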

Summary and keywords

{
  "subjects": [
    "Abortion", 
    "Administrative law and regulatory procedures", 
    "Adoption and foster care"
  ], 

  "summary": "Patient Protection and Affordable Care Act - Title I: Quality, Affordable Health Care for All Americans..."
}

The Library of Congress assigns official summaries to some bills, and official keywords to most bills. These values are written by a human in the Library of Congress, and are not usually present when information on the bill is first published. They can be added at any time.

Bill history

{
  "history": {
    "house_passage_result": "pass", 
    "house_passage_result_at": "2010-03-21T22:48:00-05:00", 
    "senate_passage_result": "pass", 
    "senate_passage_result_at": "2009-12-24", 
    "vetoed": false,
    "awaiting_signature": false, 
    "enacted": true, 
    "enacted_at": "2010-03-23"
  }
}

The history section contains a number of useful flags and timestamps for documenting the life cycle of a bill. Timestamps can be either dates or times - as of now, the House provides time information and the Senate does not.

Not shown above: the house_override_result, house_override_result_at, senate_override_result, and senate_override_result_at fields document the results and times of attempts to override a veto.

vetoed_at and awaiting_signature_since will be present as dates or timestamps if their corresponding flag fields are set to true.
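Because a history value may be either a plain date or a full timestamp, a consumer has to handle both. One possible approach, sketched here with a hypothetical parse_when helper, is to look for the "T" separator and use the standard library's ISO 8601 parsers (Python 3.7 or later).

```python
from datetime import date, datetime


def parse_when(value):
    """Return a datetime for full timestamps and a date for date-only values."""
    if "T" in value:
        # e.g. "2010-03-21T22:48:00-05:00" (House actions carry a time)
        return datetime.fromisoformat(value)
    # e.g. "2009-12-24" (Senate actions carry only a date)
    return date.fromisoformat(value)
```

Callers that need to sort mixed values can compare on the calendar date, since a date and a timezone-aware datetime cannot be compared directly.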

Sponsorships

{
  "sponsor": {
    "district": "15", 
    "name": "Rangel, Charles B.", 
    "state": "NY", 
    "thomas_id": "944", 
    "title": "Rep"
  },
  "cosponsors": [
    {
      "district": "31", 
      "name": "Becerra, Xavier", 
      "sponsored_at": "2009-09-17", 
      "state": "CA", 
      "thomas_id": "70", 
      "title": "Rep", 
      "withdrawn_at": null
    }, 
    {
      "district": "1", 
      "name": "Berkley, Shelley", 
      "sponsored_at": "2009-09-17", 
      "state": "NV", 
      "thomas_id": "1576", 
      "title": "Rep", 
      "withdrawn_at": null
    }
  ]
}

Information on the sponsor (almost always present) and cosponsors (sometimes present) includes their name, state, district, and title. Each cosponsor also has a sponsored_at date for when they joined the bill. Sometimes cosponsors withdraw their cosponsorship; in the excerpt above, withdrawn_at is null for cosponsors who have not withdrawn.

The most useful field here is the thomas_id. This can be used in conjunction with the dataset at congress-legislators to find much more information about the legislator - including their IDs in other useful systems.
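As a starting point for such a lookup, the sketch below collects the thomas_id values of the sponsor and of every cosponsor who has not withdrawn. The function name is hypothetical, and the code assumes that a null withdrawn_at means the cosponsorship is still in effect.

```python
def active_thomas_ids(bill):
    """Return thomas_id values for the sponsor and non-withdrawn cosponsors."""
    ids = []
    sponsor = bill.get("sponsor")
    if sponsor:  # the sponsor is almost always, but not always, present
        ids.append(sponsor["thomas_id"])
    for cosponsor in bill.get("cosponsors", []):
        # Assumption: withdrawn_at is None unless the cosponsor withdrew.
        if cosponsor.get("withdrawn_at") is None:
            ids.append(cosponsor["thomas_id"])
    return ids
```

How these IDs are keyed in the congress-legislators dataset is not covered here, so check that dataset's own documentation before joining against it.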

Committees

{
  "committees": [
    {
      "activity": [
        "referral"
      ], 
      "committee": "House Ways and Means"
    }
  ]
}

Bills, upon introduction, are typically referred to one or more committees. The committees field lists each committee's relation to the bill, using the name under which the committee is referenced on THOMAS.gov.

To normalize committee names into common unique identifiers, use GovTrack's historical committees.xml. Committee names can change over time.

Amendments

{
  "amendments": [
    {
      "amendment_id": "s2786-111", 
      "amendment_type": "s",
      "chamber": "s", 
      "number": "2786"
    }, 
    {
      "amendment_id": "s2787-111", 
      "amendment_type": "s",
      "chamber": "s", 
      "number": "2787"
    }
  ]
}

Any amendments introduced in relation to this bill. Amendment IDs are of the form [amendment_type][number]-[congress].

The amendment_type is almost always the same as the chamber. The exception is the "Senate Unprinted Amendments" that appear in the 97th and 98th Congresses, for which the amendment_type is "su".
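Since the type prefix can be longer than one letter, a parser should not assume the ID starts with a single chamber character. The following is a minimal sketch with a hypothetical helper name; it accepts any run of lowercase letters as the type rather than a fixed list, and the "su" ID in the example is made up for illustration.

```python
import re

AMENDMENT_ID_PATTERN = re.compile(r"^([a-z]+)(\d+)-(\d+)$")


def parse_amendment_id(amendment_id):
    """Split an ID such as 's2786-111' into type, number, and congress."""
    match = AMENDMENT_ID_PATTERN.match(amendment_id)
    if match is None:
        raise ValueError("not an amendment ID: %r" % amendment_id)
    amendment_type, number, congress = match.groups()
    return {"amendment_type": amendment_type, "number": number, "congress": congress}


print(parse_amendment_id("s2786-111")["amendment_type"])  # s
print(parse_amendment_id("su12-97")["amendment_type"])    # su (illustrative ID)
```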

More detailed amendment information is not yet part of this project, but we plan to include it soon.

Related bills

{
  "related_bills": [
    {
      "bill_id": "hconres254-111", 
      "reason": "related"
    },
    {
      "bill_id": "hr4872-111", 
      "reason": "related"
    }
  ]
}

The IDs and relationships of related bills. The reason field is not very informative: its values include related, unknown, and identical.

Official activity

{
  "actions": [
    {
      "acted_at": "2009-12-23", 
      "references": [
        {
          "reference": "CR S13796-13866", 
          "type": "consideration"
        }
      ], 
      "text": "Considered by Senate.", 
      "type": "action"
    }, 
    {
      "acted_at": "2009-12-24", 
      "how": "roll", 
      "references": [
        {
          "reference": "CR S13890-142124", 
          "type": "text"
        }
      ], 
      "result": "pass", 
      "roll": "396", 
      "state": "PASS_BACK:SENATE", 
      "text": "Passed Senate with an amendment and an amendment to the Title by Yea-Nay Vote. 60 - 39. Record Vote Number: 396.", 
      "type": "vote", 
      "vote_type": "vote2", 
      "where": "s"
    }, 
    {
      "acted_at": "2010-03-21T22:48:00-05:00", 
      "how": "roll", 
      "references": [
        {
          "reference": "CR H1920-2152", 
          "type": "text as House agreed to Senate amendments"
        }
      ], 
      "result": "pass", 
      "roll": "165", 
      "state": "PASSED:BILL", 
      "text": "On motion that the House agree to the Senate amendments Agreed to by recorded vote: 219 - 212 (Roll no. 165).", 
      "type": "vote", 
      "vote_type": "pingpong", 
      "where": "h"
    }, 
    {
      "acted_at": "2010-03-23", 
      "references": [], 
      "text": "Signed by President.", 
      "type": "signed"
    }, 
    {
      "acted_at": "2010-03-23", 
      "references": [], 
      "state": "ENACTED:SIGNED", 
      "text": "Became Public Law No: 111-148.", 
      "type": "enacted"
    }
  ]
}

Many actions can occur to a bill over its life, and there will always be at least one (its referral to a committee). Every action has its text, a date or timestamp, and a list of any references to the Congressional Record where this action can be found.

Where possible, metadata is parsed out of the text of an action to infer more information about it. Some actions will have a more specific type, and any action that effects a change in the bill's state will have a state field. There are other specific fields for vote actions, such as the mechanism of the vote, and its number (if its mechanism was a roll call).
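For example, the roll call votes on a bill can be pulled out of the actions list by checking the type and how fields. This is a sketch with a hypothetical function name, relying only on the fields shown in the excerpt above.

```python
def roll_call_votes(bill):
    """Return (chamber, roll number, result) for each roll call vote action."""
    votes = []
    for action in bill.get("actions", []):
        # Only vote actions carry "how"; "roll" marks a recorded roll call.
        if action.get("type") == "vote" and action.get("how") == "roll":
            votes.append((action["where"], action["roll"], action["result"]))
    return votes


# Trimmed copy of the excerpt above.
bill = {
    "actions": [
        {"acted_at": "2009-12-23", "type": "action", "text": "Considered by Senate."},
        {"acted_at": "2009-12-24", "type": "vote", "how": "roll",
         "roll": "396", "result": "pass", "where": "s"},
        {"acted_at": "2010-03-21T22:48:00-05:00", "type": "vote", "how": "roll",
         "roll": "165", "result": "pass", "where": "h"},
        {"acted_at": "2010-03-23", "type": "signed", "text": "Signed by President."},
    ]
}
print(roll_call_votes(bill))
# [('s', '396', 'pass'), ('h', '165', 'pass')]
```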

Becoming Law

{
  "enacted_as": {
    "congress": "111", 
    "law_type": "public", 
    "number": "148"
  }
}

If a bill has been enacted as law, the enacted_as field will be present, giving the number of the public or private law that the bill became. Laws are typically cited in the form "Public Law 111-148" or "Private Law 111-1".

law_type can be either "public" or "private". Most laws are public. Private laws are laws that affect a particular person or group. For example, individuals are sometimes granted citizenship directly through private laws, as with S. 4010 in the 111th Congress.
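The citation form can be assembled from the three enacted_as fields. The helper below is a hypothetical sketch; it returns None when the field is absent, which is assumed to mean the bill has not been enacted.

```python
def law_citation(bill):
    """Return a citation such as 'Public Law 111-148', or None if not enacted."""
    enacted_as = bill.get("enacted_as")
    if enacted_as is None:
        return None
    return "%s Law %s-%s" % (
        enacted_as["law_type"].capitalize(),  # "public" -> "Public"
        enacted_as["congress"],
        enacted_as["number"],
    )


bill = {"enacted_as": {"congress": "111", "law_type": "public", "number": "148"}}
print(law_citation(bill))  # Public Law 111-148
```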
