MORE 'DATA-SETS' POSTS
From a U.S. Census Announcement: We are pleased to announce the release of the 2010 American Community Survey (ACS) 1-Year Public Use Microdata Sample (PUMS) files. PUMS files from the ACS show the full range of population and housing unit responses collected on individual ACS questionnaires. These files enable users to design tabulations that aggregate […]
Six Provocations for Big Data The era of Big Data has begun. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists, and many others are clamoring for access to the massive quantities of information produced by and about people, things, and their interactions. Diverse groups argue about the potential benefits and costs of analyzing information […]
You can access the new web site at: http://www.opengovpartnership.org/ From Alex Howard (O’Reilly Radar): Open government is about to assume a higher profile in foreign affairs. On July 12, 2011, the State Department hosted an historic gathering in Washington to announce the Open Government Partnership (OGP) with Brazil and six other nations. Today in New York City, this […]
Geographic Databases for 2011 Released The U.S. Department of Transportation’s Bureau of Transportation Statistics (BTS), a part of the Research and Innovative Technology Administration, released the 2011 edition of the National Transportation Atlas Databases (NTAD) this week. The 2011 edition features updated datasets from last year’s NTAD and premieres the Customs and Border Protection’s Border […]
If you’re not a developer, here’s some news that might be of interest to those who are. Perhaps it’s time for developers and non-developers to brainstorm new ways to use Dept. of Labor data. The team at the wonderful ProgrammableWeb.com, home to a massive directory of APIs and mashups, pointed out that the U.S. Department of […]
Civilian Casualties in Afghanistan: Data and Documents This page contains data and documents relating to civilian casualties of the conflict in Afghanistan. They were obtained by Science correspondent John Bohannon after embedding with military forces in Kabul and Kandahar in October 2010. All military data and documents were released voluntarily by the International Security Assistance […]
Company Names Matching in the Large Patents Dataset This paper addresses the name matching (duplicate detection) problem in the US patent dataset. It contains more than 400K unique company name spellings. In order to solve the matching problem we choose an appropriate string similarity measure and clustering approach and estimate their parameters. Finally we apply them […]
Data rescue initiatives: bringing historical climate data into the 21st century The currently limited availability of long and high-quality surface instrumental climate records continues to hamper our ability to carry out more robust assessments of the climate. Such assessments are needed to better understand, detect, predict and respond to global climate variability and change. Despite […]
From a Blog Post by Jonathan Gray on the Open Knowledge Foundation Blog: We’re very pleased to announce an alpha version of datacatalogs.org, a website to help keep track of open data catalogues from around the world. The project is being launched to coincide with our annual conference, OKCon 2011. The project was born out […]
USDA’s Animal Care Program Enhances Searchable Database to Provide the Public with Greater Access to Animal Welfare Information The U.S. Department of Agriculture’s Animal and Plant Health Inspection Service (APHIS) has developed an expanded and improved search engine that provides greater access to information about USDA licensees and registrants regulated under the Animal Welfare Act […]