Computers could generate many useful hypotheses with little help from humans

Scientists today cannot hope to manually track all of the published science relevant to their work. A cancer biologist, for instance, can find more than 2 million relevant papers in the PubMed archive, more than 200 million web pages with a Google search, and databases holding results from experiments that produce millions of gigabytes of data.

This explosion of knowledge is changing the landscape of science. Computers already play an important role in helping scientists store, manipulate, and analyze data. New capabilities, however, are extending the reach of computers from analysis to hypothesis. Drawing on approaches from artificial intelligence, computer programs are increasingly able to integrate published knowledge with experimental data, search for patterns and logical relations, and enable new hypotheses to emerge with little human intervention. Scientists have used such computational approaches to repurpose drugs, functionally characterize genes, identify elements of cellular biochemical pathways, and highlight essential breaches of logic and inconsistency in scientific understanding. We predict that within a decade, even more powerful tools will enable automated, high-volume hypothesis generation to guide high-throughput experiments in biomedicine, chemistry, physics, and even the social sciences.

Proponents of data-driven science conjecture that hypotheses are obsolete: new knowledge will simply emerge from mechanical application of algorithms that mine data for plausible patterns. This approach is attractive, but there are potential pitfalls. The discovery of patterns from data alone is similar to the task faced by an explorer in an unfamiliar jungle, without a guide. With no sense of what is already known about the environment or its perils, she is likely to misclassify what she sees, fearing the intimidating but harmless snake while ignoring the tiny lethal frog.


Swanson pioneered the ABC model of hypothesis generation, which focuses on hypotheses that cross boundaries between distinct scientific literatures. If concepts A and B are studied in one literature, and B and C in another, Swanson assumed transitivity to hypothesize that A implies C. He then demonstrated that novel A-to-C inferences were likely to be true, although unlikely to be arrived at via other means. Through this approach, Swanson hypothesized that fish oil could lessen the symptoms of Raynaud's syndrome. [...] Analysis of biomolecules common to several fields of biomedicine, for instance, suggests that many communities could profit from generating predictions that bridge field boundaries and link disparate properties of these molecules or other scientific concepts.
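
The ABC pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not Swanson's actual procedure or any published tool: the function name `abc_hypotheses`, the representation of a literature as a list of concept pairs, and the placeholder concept labels (`A1`, `B1`, `C1`, and so on) are all assumptions made for the example.

```python
from collections import defaultdict


def abc_hypotheses(literature_ab, literature_bc):
    """Propose A-to-C links that are bridged by a shared B concept.

    Each literature is a list of (concept, concept) pairs, an assumed
    stand-in for "these two concepts are studied together".
    Returns {(a, c): sorted list of bridging B concepts}.
    """
    # Index the second literature by its B concept.
    c_by_b = defaultdict(set)
    for b, c in literature_bc:
        c_by_b[b].add(c)

    # Links already documented in either literature are not novel.
    known = set(literature_ab) | set(literature_bc)

    bridges = defaultdict(set)
    for a, b in literature_ab:
        for c in c_by_b.get(b, ()):
            if a != c and (a, c) not in known and (c, a) not in known:
                bridges[(a, c)].add(b)
    return {pair: sorted(bs) for pair, bs in bridges.items()}


# Hypothetical placeholder concepts, not drawn from any real corpus.
literature_ab = [("A1", "B1"), ("A1", "B2"), ("A2", "B2")]
literature_bc = [("B1", "C1"), ("B2", "C1"), ("B2", "C2")]

for (a, c), via in abc_hypotheses(literature_ab, literature_bc).items():
    print(f"{a} -> {c} via {', '.join(via)}")
```

Each returned pair is only a candidate hypothesis. A pair supported by several bridging concepts might deserve attention first, but that ranking rule is an assumption of this sketch, and any candidate would still need expert review and experimental testing.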

For the full article, please see Science Magazine, July 23, 2010.

Based on the article: Machine Science (Philosophy of science), by J. Evans and A. Rzhetsky, Science Magazine.

