Comparison of parametric and machine methods for variable selection in simulated Genetic Analysis Workshop 19 data.

Title: Comparison of parametric and machine methods for variable selection in simulated Genetic Analysis Workshop 19 data.
Authors:
Holzinger ER; Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Szymczak S; Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA; Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Christian-Albrechts-Platz 4, 24118 Kiel, Germany.
Malley J; Division of Computational Bioscience, Center for Information Technology, National Institutes of Health, 9000 Rockville Pike, Building 12A, Bethesda, MD 20892 USA.
Pugh EW; Center for Inherited Disease Research, IGM, Johns Hopkins School of Medicine, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Ling H; Center for Inherited Disease Research, IGM, Johns Hopkins School of Medicine, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Griffith S; Center for Inherited Disease Research, IGM, Johns Hopkins School of Medicine, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Zhang P; Center for Inherited Disease Research, IGM, Johns Hopkins School of Medicine, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Li Q; Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Cropp CD; Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Bailey-Wilson JE; Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 333 Cassell Drive, Suite 1200, Baltimore, MD 21224 USA.
Source: BMC proceedings [BMC Proc] 2016 Oct 18; Vol. 10 (Suppl 7), pp. 147-152. Date of Electronic Publication: 2016 Oct 18 (Print Publication: 2016).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: BioMed Central; Country of Publication: England; NLM ID: 101316936; Publication Model: eCollection; Cited Medium: Print; ISSN: 1753-6561 (Print); Linking ISSN: 17536561; NLM ISO Abbreviation: BMC Proc; Subsets: PubMed not MEDLINE
Imprint Name(s): Original Publication: [London] : BioMed Central
Abstract: Current findings from genetic studies of complex human traits often do not explain a large proportion of the estimated variation of these traits due to genetic factors. This could be, in part, due to overly stringent significance thresholds in traditional statistical methods, such as linear and logistic regression. Machine learning methods, such as Random Forests (RF), are an alternative approach to identify potentially interesting variants. One major issue with these methods is that there is no clear way to distinguish between probable true hits and noise variables based on the importance metric calculated. To this end, we are developing a method called the Relative Recurrency Variable Importance Metric (r2VIM), a RF-based variable selection method. Here, we apply r2VIM to the unrelated Genetic Analysis Workshop 19 data with simulated systolic blood pressure as the phenotype. We compare the number of "true" functional variants identified by r2VIM with those identified by linear regression analyses that use a Bonferroni correction to calculate a significance threshold. Our results show that r2VIM performed comparably to linear regression. Our findings are proof-of-concept for r2VIM, as it identifies a similar number of functional and nonfunctional variants as a more commonly used technique when the optimal importance score threshold is used.
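Illustrative sketch (not part of the record): the abstract contrasts two selection rules, a Bonferroni-corrected significance threshold for per-variant regression p-values and an importance-score threshold for the Random Forest based r2VIM. The abstract does not spell out either computation, so the Python below is only a minimal sketch under stated assumptions. In particular, it assumes that "relative" importance means each score divided by the absolute value of the most negative score in the same run, and that a variant is kept only if it passes the threshold in every run. All function names, variant names, and numbers are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


def select_bonferroni(p_values, alpha=0.05):
    """Return variants whose p-value is below alpha / (number of tests)."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]


def relative_importance(importances):
    """Scale importance scores by |minimum score| of the run.

    Assumption: the most negative score estimates the noise level,
    so a relative score of 3 means "3x the largest noise magnitude".
    """
    min_score = min(importances.values())
    if min_score >= 0:
        raise ValueError("need at least one negative score to estimate noise")
    scale = abs(min_score)
    return {name: score / scale for name, score in importances.items()}


def select_recurrent(runs, factor):
    """Keep variants whose relative importance is >= factor in every run."""
    selected = None
    for run in runs:
        rel = relative_importance(run)
        passing = {name for name, value in rel.items() if value >= factor}
        selected = passing if selected is None else selected & passing
    return sorted(selected or [])


# Hypothetical toy inputs, not data from the study.
toy_p_values = {"snp1": 1e-6, "snp2": 0.01, "snp3": 0.2, "snp4": 0.04}
toy_runs = [
    {"snp1": 0.9, "snp2": 0.25, "snp3": -0.1, "snp4": 0.05},
    {"snp1": 0.8, "snp2": 0.15, "snp3": -0.05, "snp4": -0.1},
]

print(select_bonferroni(toy_p_values))   # ['snp1', 'snp2']
print(select_recurrent(toy_runs, 2.0))   # ['snp1']
```

In this toy case the Bonferroni rule keeps two variants at a per-test level of 0.05 / 4 = 0.0125, while the importance rule with a factor of 2 keeps only the one variant that clears the threshold in both runs. Lowering the factor to 1 would keep both, which mirrors the abstract's point that the outcome depends on the importance threshold chosen.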
References: Curr Med Chem. 2012;19(25):4289-97. (PMID: 22830342); BioData Min. 2016 Feb 01;9:7. (PMID: 26839594); Bioinformatics. 2010 Jul 15;26(14):1752-8. (PMID: 20505004); Psychol Methods. 2009 Dec;14(4):323-48. (PMID: 19968396); Nucleic Acids Res. 2014 Jan;42(Database issue):D1001-6. (PMID: 24316577); BMC Proc. 2016 Oct 18;10(Suppl 7):71-77. (PMID: 27980614); Nature. 2009 Oct 8;461(7265):747-53. (PMID: 19812666)
Grant Information: HHSN268201200008C United States HL NHLBI NIH HHS; HHSN268201200008I United States HL NHLBI NIH HHS; R01 GM031575 United States GM NIGMS NIH HHS
Entry Date(s): Date Created: 20161217; Latest Revision: 20240603
Update Code: 20260130
PubMed Central ID: PMC5133476
DOI: 10.1186/s12919-016-0021-1
PMID: 27980627
Database: MEDLINE
