SunlightLabs API
SunlightLabs Cross-Reference Database
A number of non-profits are working together to promote accountability and transparency in elected officials by tracking data such as campaign contributions, votes, lobbying and contracts. The problem is that different organizations use different IDs for the same person.
For example, Senator Obama is N00009638 to the Center for Responsive Politics, 400629 to GovTrack.us and o000167 to the Washington Post. So if we want to match contributions to Senator Obama (from one site) with his votes (from a different site), there has been no easy way to match the IDs. SunlightLabs has created a database to cross-reference all the different IDs for members of Congress.
We have compiled basic biographical and clerical information about representatives and senators of the 110th Congress, as well as their IDs, from a number of non-profit and government data sources (see Data Sources below).
As well as tying together the IDs, SunlightLabs has also created APIs (application programming interfaces, i.e. a way to convert between IDs programmatically) for these data. Thus, a developer can more easily create a mashup that pulls data from one site, gets the ID for that person, cross-references it to the ID for the same person on a different site, and brings those two pieces of data together, with confidence that they refer to the same person. No more matching names across middle-name and nickname variations. We sincerely hope that this database and API will stimulate transparency-related mashup innovation.
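The cross-referencing step described above can be sketched in a few lines of Python. This is only an illustration over in-memory data, not the SunlightLabs API itself: the `RECORDS` list, the `build_index` and `translate` helpers, and the entity ID `fakeopenID1` are assumptions made for the example. The three IDs for Senator Obama are the ones quoted earlier, and the names `CRPcandID`, `GovTrack_ID` and `WashPost_ID` are name fields from the database.

```python
# Hypothetical (entity_id, name, value) records; "fakeopenID1" is made up.
RECORDS = [
    ("fakeopenID1", "CRPcandID", "N00009638"),
    ("fakeopenID1", "GovTrack_ID", "400629"),
    ("fakeopenID1", "WashPost_ID", "o000167"),
]


def build_index(records):
    """Map (name, value) -> entity_id and (entity_id, name) -> value."""
    to_entity = {}
    to_value = {}
    for entity_id, name, value in records:
        to_entity[(name, value)] = entity_id
        to_value[(entity_id, name)] = value
    return to_entity, to_value


def translate(records, from_name, from_value, to_name):
    """Convert one organization's ID into another's; None if unknown."""
    to_entity, to_value = build_index(records)
    entity_id = to_entity.get((from_name, from_value))
    if entity_id is None:
        return None
    return to_value.get((entity_id, to_name))


govtrack_id = translate(RECORDS, "CRPcandID", "N00009638", "GovTrack_ID")
print(govtrack_id)  # 400629
```

Going through the shared entity ID, rather than mapping each pair of organizations directly, means a new organization needs only one new set of records to become convertible to all the others.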
A Common ID
The primary key of the data is currently called "entity_id". The final form of this ID has yet to be set, but it is currently "fakeopenID" followed by an integer, e.g., "fakeopenID23". Ultimately, we would like to promote a common ID from the get-go. That way, cross-referencing will not be needed and everybody's life will be easier. Think of it as a "social security number" for members of Congress. If everyone uses this ID, we as a community can track their political activity far more easily.
We are looking to discuss with the community at large what common, organizationally-neutral ID form makes the most sense. A fake social security number, an OpenID-like ID (say something like "cardin.ben.senate.gov") or something else?
What's in the database?
12304 records covering 27 attributes about 539 distinct entities.
The database is organized around entity IDs, each associated with name/value pairs. This allows us to add new columns (IDs or fields from other organizations) very easily.
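As a rough sketch of why this layout makes new columns cheap, the snippet below groups name/value rows into one record per entity. The rows, the `pivot` helper and the `Example_ID` field are invented for illustration and are not part of the actual database; `lastname`, `party` and `state_abbreviation` are real name fields.

```python
from collections import defaultdict

# Hypothetical rows; the values are invented for illustration.
ROWS = [
    ("fakeopenID23", "lastname", "Doe"),
    ("fakeopenID23", "party", "I"),
    ("fakeopenID23", "state_abbreviation", "CA"),
]


def pivot(rows):
    """Group name/value rows into one dict of attributes per entity."""
    entities = defaultdict(dict)
    for entity_id, name, value in rows:
        entities[entity_id][name] = value
    return dict(entities)


# Adding a new "column" is just one more row; no schema change is needed.
# "Example_ID" is a made-up field name.
ROWS.append(("fakeopenID23", "Example_ID", "12345"))

people = pivot(ROWS)
print(people["fakeopenID23"]["Example_ID"])  # 12345
```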
These are the name fields used in the database:
| Name | Description |
|---|---|
| CRPcandID | The candidate ID used by the Center for Responsive Politics. |
| URL | The person's or entity's website |
| state_full_name | The full name of a state, e.g. "North Dakota" |
| state_abbreviation | The abbreviation of a state's name, e.g. "CA" |
| district | The integer value of a representative's district, e.g., "6" |
| WashPost_ID | Washington Post's ID for this person |
| party | The person's party, i.e. "D", "R" or "I" |
| VoteSmart_ID | The ID used by Project Vote Smart (http://votesmart.org/index.htm) |
| title | A person's title, e.g. "Representative" or "Senator" |
| member110congress | Whether the person is or was a member of the 110th Congress, i.e., "yes" or "no" |
| FEC_ID | FEC's ID for this person |
| webform | A contact webform for this person |
| email | Email for this person |
| senator_class | Class for this person (Senator): I, II, or III |
| phone | Phone number |
| name_suffix | Name Suffix, e.g. "Jr." or "II" |
| congress_office | Address of their Congressional Office, e.g. "1502 Longworth HOB, Washington, DC 20515-1101" |
| gender | Gender: "M" or "F" |
| photo | The filename for their photo at http://sunlightlabs.com/widgets/popuppoliticians/resources/images/ |
| congresspedia | The URL for their Congresspedia page |
| BioGuide_ID | ID for their BioGuide entry at http://bioguide.congress.gov/biosearch/biosearch.asp |
| Eventful_ID | The performer ID for an Eventful politician |
| lastname | Last Name(s) |
| firstname | First Name(s) |
| middlename | Middle Name(s) |
| nickname | Nickname |
| GovTrack_ID | The ID used by GovTrack.us |
| entity_id | The common ID being promoted by SunlightLabs |
Data Sources
| source_id | Source | Date Accessed |
|---|---|---|
| Feb2007_house_dot_gov_list | http://www.house.gov/house/MemberWWW.shtml | Feb 5, 2007 |
| Feb2007_senate_dot_gov_list | http://senate.gov/general/contact_information/senators_cfm.cfm | Feb 5, 2007 |
| Feb2007_house_member_list | http://clerk.house.gov/member_info/mcapdir.html | Feb 5, 2007 |
| Feb2007_110_member_mailing_labels | http://clerk.house.gov/member_info/excelmemberlabels_110.xls | Feb 5, 2007 |
| Feb2007_WashPost_senator_list | http://projects.washingtonpost.com/congress/110/senate/members/ | Feb 6, 2007 |
| govtracksite | http://www.govtrack.us/congress/findyourreps.xpd | Feb 6, 2007 |
| csaction_senator_email_list | http://csaction.org/resources/senate.html | Feb 6, 2007 |
| contactcongress | ftp://ftp.visi.com/users/juan/ContactingCongress.db.txt | Feb 6, 2007 |
| actup | http://www.chicagoabc.org/take_action.htm | Feb 6, 2007 |
| votesmart_member_list | http://votesmart.org/official_congress_search.php?type=all&criteria=none | Feb 8, 2007 |
| eventful | List of performer IDs sent by Jed at Eventful | Feb 12, 2007 |

