While genome-wide gene expression data are generated at an increasing rate, the repertoire of approaches for pattern discovery in these data is still limited. Identifying subtle patterns of interest in large amounts of data (tens of thousands of profiles) associated with a certain level of noise remains a challenge. A microarray time series was recently generated to study the transcriptional program of the mouse segmentation clock, a biological oscillator associated with the periodic formation of the segments of the body axis. A method related to Fourier analysis, the Lomb-Scargle periodogram, was used to detect periodic profiles in the dataset, leading to the identification of a novel set of cyclic genes associated with the segmentation clock. Here, we applied four distinct mathematical methods to the same microarray time series dataset to identify significant patterns in gene expression profiles. These methods, called Phase consistency, Address reduction, Cyclohedron test and Stable persistence, are based on different conceptual frameworks that are either hypothesis- or data-driven. Some of the methods, unlike Fourier transforms, do not depend on the assumption that the pattern of interest is periodic. Remarkably, these methods blindly identified the expression profiles of known cyclic genes as the most significant patterns in the dataset. Many candidate genes predicted by more than one approach appeared to be true positive cyclic genes and will be of particular interest for future research. In addition, these methods predicted novel candidate cyclic genes that were consistent with previous biological knowledge and experimental validation in mouse embryos.
Our results demonstrate the utility of these novel pattern detection strategies, notably for detection of periodic profiles, and suggest that combining several distinct mathematical approaches to analyze microarray datasets is a valuable strategy for identifying genes that exhibit novel, interesting transcriptional patterns.
This research was partially supported by DARPA grant HR 0011-05-1-0057. HE and YM mathematical work was supported by DARPA grant HR0011-05-1-0007. AS research was supported by a Lucent Technologies Bell Labs Graduate Research Fellowship; AK and MR research was supported by NIH grant GM U54 GM74942; and SA research was supported by Association pour la Recherche sur le Cancer (ARC), France. OP, AM, MLD, EG and GH research was supported by the Stowers Institute for Medical Research. OP is a Howard Hughes Medical Institute Investigator.
Dequéant M, Ahnert S, Edelsbrunner H, et al. Comparison of pattern detection methods in microarray time series of the segmentation clock. PLoS One. 2008;3(8). doi:10.1371/journal.pone.0002856
Dequéant, M., Ahnert, S., Edelsbrunner, H., Fink, T., Glynn, E., Hattem, G., … Pourquie, O. (2008). Comparison of pattern detection methods in microarray time series of the segmentation clock. PLoS One, 3(8). https://doi.org/10.1371/journal.pone.0002856
Dequéant, Mary, Sebastian Ahnert, Herbert Edelsbrunner, Thomas Fink, Earl Glynn, Gaye Hattem, Andrzej Kudlicki, et al. “Comparison of Pattern Detection Methods in Microarray Time Series of the Segmentation Clock.” PLoS One 3, no. 8 (2008). https://doi.org/10.1371/journal.pone.0002856.
M. Dequéant et al., “Comparison of pattern detection methods in microarray time series of the segmentation clock,” PLoS One, vol. 3, no. 8, 2008.
Dequéant M, Ahnert S, Edelsbrunner H, Fink T, Glynn E, Hattem G, Kudlicki A, Mileyko Y, Morton J, Mushegian A, Pachter L, Rowicka M, Shiu A, Sturmfels B, Pourquie O. 2008. Comparison of pattern detection methods in microarray time series of the segmentation clock. PLoS One. 3(8).
Dequéant, Mary, et al. “Comparison of Pattern Detection Methods in Microarray Time Series of the Segmentation Clock.” PLoS One, vol. 3, no. 8, Public Library of Science, 2008, doi:10.1371/journal.pone.0002856.