To find out when whooping cough started making a comeback in Ohio, or how often measles kills in America, we turn to historical records. But those records aren’t very useful when they’re squirreled away in a distant office basement. The same goes for records embedded in a report: you can look at them the way you might admire a painting, but you cannot drop the data into a spreadsheet and hunt for statistical significance. If you are looking at only a couple of years’ worth of information, that formatting dilemma is not such a big deal. You can scour the data and manually punch it into your analysis. It becomes a huge problem only when you are looking at hundreds or thousands of data points.
Such is the problem that public health experts at the University of Pittsburgh encountered when they were exploring old medical data and developing models that predict future outbreaks. “We found ourselves going back and pulling out historical datasets repeatedly. We kept doing it over and over and finally got to the point where we thought it would be not only a service to ourselves but everybody if all the data was made digital and open access,” says Donald Burke, the dean of Pittsburgh’s graduate school of public health.
Four years ago, buoyed by funds from the National Institutes of Health and the Gates Foundation, they started the process of digitizing 125 years’ worth of medical records. The endeavor was dubbed Project Tycho, named for the Danish nobleman Tycho Brahe, who made the voluminous astronomical observations that Kepler later tapped to develop the laws of planetary motion. (But no pressure, right?)
The online, open-access resource now features accounts of 47 diseases between 1888 and today. It includes data from the weekly National Notifiable Diseases Surveillance System reports for the United States, standardized in such a way that the data can be immediately analyzed.
Written By: Dina Fine Maron
continue to source article at blogs.scientificamerican.com