DOI: 10.1186/1752-0509-7-119
Decision tree-based method for integrating gene expression, demographic, and clinical data to determine disease endotypes
Williams-DeVane, ClarLynda R. [1]; Reif, David M. [2]; Hubal, Elaine Cohen [2]; Bushel, Pierre R. [3]; Hudgens, Edward E. [4]; Gallagher, Jane E. [4]; Edwards, Stephen W. [1]
Publication date: 2013-11-04
ISSN: 1752-0509
Volume: 7
Abstract

Background: Complex diseases are often difficult to diagnose, treat and study due to the multi-factorial nature of the underlying etiology. Large data sets are now widely available that can be used to define novel, mechanistically distinct disease subtypes (endotypes) in a completely data-driven manner. However, significant challenges exist with regard to how to segregate individuals into suitable subtypes of the disease and understand the distinct biological mechanisms of each when the goal is to maximize the discovery potential of these data sets.


Results: A multi-step decision tree-based method is described for defining endotypes based on gene expression, clinical covariates, and disease indicators, using childhood asthma as a case study. We also tried alternative approaches, including the Student's t-test, single data domain clustering, and the Modk-prototypes algorithm, which incorporates multiple data domains into a single analysis; none performed as well as the novel multi-step decision tree method. This new method gave the best segregation of asthmatics and non-asthmatics, and it provides easy access to all genes and clinical covariates that distinguish the groups.


Conclusions: The multi-step decision tree method described here will lead to better understanding of complex disease in general by allowing purely data-driven disease endotypes to facilitate the discovery of new mechanisms underlying these diseases. This application should be considered a complement to ongoing efforts to better define and diagnose known endotypes. When coupled with existing methods developed to determine the genetics of gene expression, these methods provide a mechanism for linking genetics and exposomics data and thereby accounting for both major determinants of disease.


Keywords: Asthma; Endotypes; Gene Expression; Integrated analysis
Language: English
WOS accession number: WOS:000327504500001
Source journal: BMC SYSTEMS BIOLOGY
Source institution: US Environmental Protection Agency
Document type: Journal article
Item identifier: http://gcip.llas.ac.cn/handle/2XKMVOVA/59513
Author affiliations: 1. US EPA, Natl Hlth & Environm Effects Res Lab, Integrated Syst Toxicol Div, Durham, NC 27711 USA;
2. US EPA, Natl Ctr Computat Toxicol, Durham, NC 27711 USA;
3. NIEHS, Biostat Branch, Durham, NC 27709 USA;
4. US EPA, Natl Hlth & Environm Effects Res Lab, Environm Publ Hlth Div, Durham, NC 27711 USA
Recommended citation
GB/T 7714: Williams-DeVane, ClarLynda R., Reif, David M., Hubal, Elaine Cohen, et al. Decision tree-based method for integrating gene expression, demographic, and clinical data to determine disease endotypes[J]. US Environmental Protection Agency, 2013, 7.
APA: Williams-DeVane, ClarLynda R., Reif, David M., Hubal, Elaine Cohen, Bushel, Pierre R., Hudgens, Edward E., ... & Edwards, Stephen W. (2013). Decision tree-based method for integrating gene expression, demographic, and clinical data to determine disease endotypes. BMC SYSTEMS BIOLOGY, 7.
MLA: Williams-DeVane, ClarLynda R., et al. "Decision tree-based method for integrating gene expression, demographic, and clinical data to determine disease endotypes". BMC SYSTEMS BIOLOGY 7 (2013).
