In new statistical approach, data decide model

University of Illinois at Urbana-Champaign

CHAMPAIGN, Ill. -- A data-driven computational approach developed by a University of Illinois statistician is revealing secrets about inner Earth and discovering unique gene expressions in fruit flies, zebra fish and other living organisms.

"Using mathematical concepts from inverse scattering and modern statistics, we let the data 'speak,' and automatically generate an appropriate model," said Ping Ma, a professor of statistics at the U. of I. and lead author of a paper describing the technique. The paper has been accepted for publication in the Journal of Geophysical Research.

To study features deep within Earth, for example, Ma and colleagues first process the seismic data with a numeric technique called inverse scattering. Instead of beginning with a geophysical structure and calculating the scattering, the researchers use the scattered seismic waves to reconstruct the scattering structures.
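The difference between the forward and inverse directions can be shown with a toy calculation. The Python sketch below is not the inverse scattering method itself. It assumes a single flat reflector, a constant wave speed and noise-free travel times, and its function names and numerical values are hypothetical, chosen only for illustration.

```python
import math


def forward_travel_time(depth_km, offset_km, speed_km_s):
    """Forward problem: start from a known structure (a flat reflector at
    depth_km) and compute the two-way travel time of the reflected wave
    recorded offset_km away from the source."""
    half_offset = offset_km / 2.0
    return 2.0 * math.hypot(depth_km, half_offset) / speed_km_s


def invert_depth(travel_time_s, offset_km, speed_km_s):
    """Inverse problem: start from an observed travel time and recover the
    depth of the reflector that produced it."""
    half_path = speed_km_s * travel_time_s / 2.0
    half_offset = offset_km / 2.0
    return math.sqrt(half_path ** 2 - half_offset ** 2)


# Hypothetical values, for illustration only.
true_depth_km = 30.0
speed_km_s = 6.0
offsets_km = [0.0, 20.0, 40.0, 80.0]

# Forward: structure -> data.
observed_times = [
    forward_travel_time(true_depth_km, x, speed_km_s) for x in offsets_km
]

# Inverse: data -> structure. Every source-receiver offset yields its own
# estimate of the same reflector depth.
recovered_depths = [
    invert_depth(t, x, speed_km_s) for t, x in zip(observed_times, offsets_km)
]
print(recovered_depths)
```

In this idealized setting every offset returns the same depth. With real, noisy seismograms the separate estimates would disagree, which is where a statistical treatment becomes useful.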

In that initial step, the researchers develop a generalized Radon transform of global seismic network data to map thousands of seismograms to a set of multiple images of the same target structure.

"These 'common image-point gathers' reveal common structure among the messy seismic waves, and are the key notion that we exploit in the statistical development of the generalized Radon transform," said Ma, who also is affiliated with the university's Institute for Genomic Biology.

In the second step, the researchers use "mixed-effects" statistical models to analyze the common image-point gathers and enhance the generalized Radon transform images.

The combined use of the generalized Radon transform and mixed-effects statistical inference exploits the redundancy in the data and makes it possible to turn vast volumes of network data into statistical estimates and quantitative analysis, Ma said.
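How redundant images can be turned into statistical estimates can be sketched with the simplest kind of mixed-effects model. The example below is a stand-in, not the researchers' model: it assumes a balanced one-way random-effects model, y[i][j] = mu + b[i] + e[i][j], fitted with textbook method-of-moments variance estimates. The function name, the data layout (one row of repeated image values per image point) and the numbers are all hypothetical.

```python
def fit_one_way_random_effects(groups):
    """Fit y[i][j] = mu + b[i] + e[i][j] to balanced data.

    groups[i] holds the repeated values observed for unit i (assumed here
    to be the redundant images of one image point). Requires at least two
    units and at least two values per unit.
    """
    m = len(groups)        # number of units
    n = len(groups[0])     # repeated values per unit (balanced design)
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / m

    # Within-unit and between-unit mean squares (one-way ANOVA).
    ss_within = sum(
        (y - gm) ** 2 for g, gm in zip(groups, group_means) for y in g
    )
    ms_within = ss_within / (m * (n - 1))
    ms_between = n * sum((gm - grand_mean) ** 2 for gm in group_means) / (m - 1)

    # Method-of-moments variance components, truncated at zero.
    var_noise = ms_within
    var_unit = max(0.0, (ms_between - ms_within) / n)

    # Shrinkage factor: how far each unit mean is trusted over the grand mean.
    denom = var_unit + var_noise / n
    shrink = var_unit / denom if denom > 0 else 0.0

    # Shrunken (BLUP-style) estimate for each unit.
    estimates = [grand_mean + shrink * (gm - grand_mean) for gm in group_means]
    return {
        "grand_mean": grand_mean,
        "var_noise": var_noise,
        "var_unit": var_unit,
        "shrink": shrink,
        "estimates": estimates,
    }


# Hypothetical data: three image points, two redundant images of each.
gathers = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
fit = fit_one_way_random_effects(gathers)
print(fit["shrink"], fit["estimates"])
```

The more the repeated values for a unit disagree with one another, the more strongly that unit's estimate is pulled toward the overall mean. This is one way redundancy in the data can be converted into a stabilized estimate with quantified uncertainty.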

In one recent application, Ma and colleagues at the Massachusetts Institute of Technology and Purdue University used the numeric technique to analyze seismic waves and infer the shape and temperature of Earth's core-mantle boundary region. The researchers reported their findings in the March 30, 2007, issue of the journal Science.

The data-driven statistical methodology is not limited to analyzing seismic data. In computational biology, for example, Ma and colleagues have used the technique to discover unique patterns of gene expression in fruit flies and roundworms, to study differential gene expression during retinal development in zebra fish, and to explore the effect of histone modifications on gene transcription rates in yeast.


The work was funded by the National Science Foundation.