Who says data is boring? Andy Houseman, associate professor in Biostatistics, makes it come alive on the August cover of Cancer Epidemiology, Biomarkers & Prevention, a monthly journal of the American Association for Cancer Research.
The article presents an analysis of three separate population-based data sets collected over 10 years, using an algorithm he has developed over the last 5 years [1-2] to help analyze DNA methylation data. The recent work featured in CEBP applies the algorithm to a subset of the data related to white blood cell differentiation, and shows that these strictly immune-related markers distinguish cancer cases from healthy controls as well as or better than previously published DNA methylation-based models. The article cover showcases a figure produced by his algorithm.
In another article, recently published by his group in BMC Bioinformatics, he shows how immune profiles can be precisely quantified using DNA methylation assays alone.
Houseman, part statistician, part biologist, says both recently published articles tell us that epigenetics is a bit more complicated than first thought and that researchers may not be measuring what they think they’re measuring when assessing DNA methylation obtained from whole blood.
“People want early-detection biomarkers for cancer, but we’re finding we’re not necessarily measuring a strictly epigenetic response but also possibly an immune response,” he says. “Both are potentially important, but they are distinct processes. My work addresses that complexity.”
- Peripheral blood immune cell methylation profiles are associated with nonhematopoietic cancers. [featured CEBP article]
- Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions. 
- Semi-supervised recursively partitioned mixture models for identifying cancer subtypes. 
- DNA methylation arrays as surrogate measures of cell mixture distribution.