Sunlight Labs
Project of The Sunlight Foundation    

SunlightLabs API

SunlightLabs Cross-Reference Database

A number of non-profits are working together to promote accountability and transparency in elected officials by tracking data such as campaign contributions, votes, lobbying and contracts. The problem is that different organizations use different IDs for the same person.

For example, Senator Obama is N00009638 to the Center for Responsive Politics, 400629 to GovTrack.us and o000167 to the Washington Post. So, if we want to match contributions to Senator Obama (from one site) with Senator Obama's votes (from a different site), there has been no easy way to match IDs. SunlightLabs has created a database to cross-reference all the different IDs for members of Congress.
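
The idea can be sketched as a small lookup table. The snippet below is a minimal illustration, not the Labs' actual implementation: the record layout and the `convert_id` helper are hypothetical, the field names are borrowed from the name fields listed later on this page, and the only data used are the three IDs quoted above.

```python
# Hypothetical in-memory cross-reference table: one dict of IDs per person.
# The three IDs are the ones quoted above for Senator Obama.
CROSSREF = [
    {"CRPcandID": "N00009638", "GovTrack_ID": "400629", "WashPost_ID": "o000167"},
]


def convert_id(value, from_scheme, to_scheme, table=CROSSREF):
    """Return the ID in to_scheme for the person whose from_scheme ID is value.

    Returns None when no record matches or the target ID is missing.
    """
    for record in table:
        if record.get(from_scheme) == value:
            return record.get(to_scheme)
    return None


# Convert a Center for Responsive Politics ID into a GovTrack.us ID.
govtrack_id = convert_id("N00009638", "CRPcandID", "GovTrack_ID")
print(govtrack_id)
```

A linear scan is enough for a few hundred people; a larger table would probably want one index per ID scheme.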

Show me the data!

See all the data

See the representative data only

See the senator data only

We have compiled basic biographical/clerical information about representatives and senators of the 110th Congress as well as their IDs from a number of non-profits and government data sources (see Data Sources below).

As well as tying together the IDs, SunlightLabs has also created APIs (application programming interfaces, i.e. a way to convert between IDs programmatically) for this data. A developer can therefore more easily create a mashup that pulls data from one site, gets the ID for that person, cross-references it to the ID for the same person on a different site and brings those two pieces of data together, with confidence that they refer to the same person. No more matching names across middle-name and nickname variations. We sincerely hope that this database and API will stimulate transparency-related mashup innovation.
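
The mashup flow described above might look roughly like the following sketch. It does not call the real SunlightLabs API, whose endpoints are not documented on this page; `join_on_crossref` is a hypothetical helper, and the "contribution" and "vote" values are placeholder strings standing in for whatever the two sites would return. Only the two IDs come from this page.

```python
# Placeholder data standing in for two different sites. The values are
# invented labels for illustration, not real contribution or vote figures.
contributions_by_crp_id = {"N00009638": "contribution records from site A"}
votes_by_govtrack_id = {"400629": "vote records from site B"}

# Cross-reference rows: one dict of IDs per person (IDs quoted earlier on this page).
crossref = [{"CRPcandID": "N00009638", "GovTrack_ID": "400629"}]


def join_on_crossref(left, left_scheme, right, right_scheme, rows):
    """Pair up left and right values that belong to the same person."""
    merged = []
    for row in rows:
        left_id, right_id = row.get(left_scheme), row.get(right_scheme)
        # Keep only people who appear in both datasets.
        if left_id in left and right_id in right:
            merged.append((left[left_id], right[right_id]))
    return merged


mashup = join_on_crossref(
    contributions_by_crp_id, "CRPcandID",
    votes_by_govtrack_id, "GovTrack_ID",
    crossref,
)
```

Because the join goes through the cross-reference rows rather than through names, spelling and nickname differences between the two sites never enter the picture.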

A Common ID

The primary key of the data is currently called "entity_id". The final form of this ID has yet to be set but is currently "fakeopenID"+integer, e.g., "fakeopenID23". Ultimately, we would like to promote a common ID from the get-go. That way, cross-referencing will not be needed and everybody's life will be easier. Think of it as a "social-security number" for members of Congress. If everyone uses this ID, we as a community can track their political activity far more easily.
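
As a small sketch of the provisional format described above ("fakeopenID" followed by an integer), the helpers below build and parse such IDs. The function names are hypothetical, and since the final form of the ID has yet to be set, this only reflects the current placeholder scheme.

```python
import re

# Current provisional format: "fakeopenID" followed by an integer.
ENTITY_ID_PATTERN = re.compile(r"fakeopenID(\d+)")


def make_entity_id(number):
    """Build an entity_id string from a non-negative integer."""
    if not isinstance(number, int) or number < 0:
        raise ValueError("entity number must be a non-negative integer")
    return f"fakeopenID{number}"


def parse_entity_id(entity_id):
    """Return the integer part of an entity_id, or None if it does not match."""
    match = ENTITY_ID_PATTERN.fullmatch(entity_id)
    return int(match.group(1)) if match else None
```

For example, `make_entity_id(23)` gives `"fakeopenID23"`, and `parse_entity_id` returns `None` for an ID from another scheme such as `"N00009638"`.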

We are looking to discuss with the community at large what common, organizationally-neutral ID form makes the most sense. A fake social security number, an OpenID-like ID (say something like "cardin.ben.senate.gov") or something else?

What's in the database?

12304 records covering 27 attributes about 539 distinct entities.

The database is organized around entity IDs, each associated with name/value pairs. This allows us to add new columns (IDs or fields from other organizations) very easily.
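
To illustrate the name/value layout, here is a minimal sketch that stores each fact as an (entity_id, name, value) row and pivots the rows into one record per entity. The `pivot` and `summarize` helpers are hypothetical. The three ID values are the ones quoted earlier for Senator Obama, but pairing them with "fakeopenID23" is an assumption made purely for illustration.

```python
from collections import defaultdict

# Each row is (entity_id, name, value). Attaching these values to
# "fakeopenID23" is an assumption for the example, not real data.
rows = [
    ("fakeopenID23", "CRPcandID", "N00009638"),
    ("fakeopenID23", "GovTrack_ID", "400629"),
    ("fakeopenID23", "WashPost_ID", "o000167"),
]


def pivot(triples):
    """Group name/value pairs by entity_id into one dict per entity."""
    entities = defaultdict(dict)
    for entity_id, name, value in triples:
        entities[entity_id][name] = value
    return dict(entities)


def summarize(triples):
    """Return (record count, distinct attribute count, distinct entity count)."""
    names = {name for _, name, _ in triples}
    entity_ids = {entity_id for entity_id, _, _ in triples}
    return len(triples), len(names), len(entity_ids)


# Adding a "new column" is just appending rows that use a new name.
rows.append(("fakeopenID23", "title", "Senator"))
```

`summarize` computes the same three kinds of counts quoted above (records, attributes, distinct entities), and no schema change is needed when another organization's ID is added.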

These are the name fields used in the database:

Name                Description
CRPcandID           The candidate ID used by the Center for Responsive Politics.
URL                 The person's or entity's website
state_full_name     The full name of a state, e.g. "North Dakota"
state_abbreviation  The abbreviation of a state's name, e.g. "CA"
district            The integer value of a representative's district, e.g., "6"
WashPost_ID         Washington Post's ID for this person
party               The person's party, i.e. "D", "R" or "I"
VoteSmart_ID        The ID used by Project Vote Smart (http://votesmart.org/index.htm)
title               A person's title, e.g. "Representative" or "Senator"
member110congress   Whether the person is or was a member of the 110th Congress, i.e., "yes" or "no"
FEC_ID              FEC's ID for this person
webform             A contact webform for this person
email               Email for this person
senator_class       Class for this person (Senator): I, II, or III
phone               Phone number
name_suffix         Name Suffix, e.g. "Jr." or "II"
congress_office     Address of their Congressional Office, e.g. "1502 Longworth HOB, Washington, DC 20515-1101"
gender              Gender: "M" or "F"
photo               The filename for their photo at http://sunlightlabs.com/widgets/popuppoliticians/resources/images/
congresspedia       The URL for their Congresspedia page
BioGuide_ID         ID for their BioGuide entry at http://bioguide.congress.gov/biosearch/biosearch.asp
Eventful_ID         The performer ID for an Eventful politician
lastname            Last Name(s)
firstname           First Name(s)
middlename          Middle Name(s)
nickname            Nickname
GovTrack_ID         The ID used by GovTrack.us
entity_id           The common ID being promoted by the labs
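
A few of the field descriptions above imply a fixed set of values or a simple rule, which can be checked mechanically. The sketch below is a hedged illustration: `validate` and `photo_url` are hypothetical helpers, the allowed values are read directly from the descriptions above, and treating "district" as strictly numeric is an assumption that the real data may not always satisfy.

```python
PHOTO_BASE_URL = "http://sunlightlabs.com/widgets/popuppoliticians/resources/images/"

# Allowed values taken from the field descriptions above.
ALLOWED_VALUES = {
    "party": {"D", "R", "I"},
    "gender": {"M", "F"},
    "senator_class": {"I", "II", "III"},
    "member110congress": {"yes", "no"},
}


def validate(record):
    """Return a list of human-readable problems found in a record dict."""
    problems = []
    for name, allowed in ALLOWED_VALUES.items():
        if name in record and record[name] not in allowed:
            problems.append(f"{name}: unexpected value {record[name]!r}")
    # Assumption: a district, when present, is written as a plain integer.
    if "district" in record and not str(record["district"]).isdigit():
        problems.append(f"district: expected an integer, got {record['district']!r}")
    return problems


def photo_url(record):
    """Build the full photo URL from the 'photo' filename, or None if absent."""
    filename = record.get("photo")
    return PHOTO_BASE_URL + filename if filename else None
```

A record such as `{"party": "D", "district": "6"}` passes with an empty problem list, while `{"party": "X"}` is reported. The filename passed to `photo_url` would come from the "photo" field; any filename used to try this out is a placeholder.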

Data Sources

source_id                          Source                                                                    Date Accessed
Feb2007_house_dot_gov_list         http://www.house.gov/house/MemberWWW.shtml                                Feb 5, 2007
Feb2007_senate_dot_gov_list        http://senate.gov/general/contact_information/senators_cfm.cfm            Feb 5, 2007
Feb2007_house_member_list          http://clerk.house.gov/member_info/mcapdir.html                           Feb 5, 2007
Feb2007_110_member_mailing_labels  http://clerk.house.gov/member_info/excelmemberlabels_110.xls              Feb 5, 2007
Feb2007_WashPost_senator_list      http://projects.washingtonpost.com/congress/110/senate/members/           Feb 6, 2007
govtracksite                       http://www.govtrack.us/congress/findyourreps.xpd                          Feb 6, 2007
csaction_senator_email_list        http://csaction.org/resources/senate.html                                 Feb 6, 2007
contactcongress                    ftp://ftp.visi.com/users/juan/ContactingCongress.db.txt                   Feb 6, 2007
actup                              http://www.chicagoabc.org/take_action.htm                                 Feb 6, 2007
votesmart_member_list              http://votesmart.org/official_congress_search.php?type=all&criteria=none  Feb 8, 2007
eventful                           list of performer IDs sent from jed at eventful                           Feb 12, 2007