Chemometrics with R
Multivariate Data Analysis in the Natural Sciences and Life Sciences
"Chemometrics with R" offers readers an accessible introduction to the world of multivariate statistics in the life sciences, providing a complete description of the general data analysis paradigm, from exploratory analysis to modeling to validation. Several more specific topics from the area of chemometrics are included in a special section. The corresponding R code is provided for all the examples in the book; scripts, functions and data are available in a separate, publicly available R package. For researchers working in the life sciences, the book can also serve as an easy-to-use primer on R.
- Preface
- Contents
- 1 Introduction
- Part I Preliminaries
  - 2 Data
  - 3 Preprocessing
    - 3.1 Dealing with Noise
    - 3.2 Baseline Removal
    - 3.3 Aligning Peaks - Warping
      - 3.3.1 Parametric Time Warping
      - 3.3.2 Dynamic Time Warping
      - 3.3.3 Practicalities
    - 3.4 Peak Picking
    - 3.5 Scaling
    - 3.6 Missing Data
    - 3.7 Conclusion
- Part II Exploratory Analysis
  - 4 Principal Component Analysis
    - 4.1 The Machinery
    - 4.2 Doing It Yourself
    - 4.3 Choosing the Number of PCs
      - 4.3.1 Statistical Tests
    - 4.4 Projections
    - 4.5 R Functions for PCA
    - 4.6 Related Methods
      - 4.6.1 Multidimensional Scaling
      - 4.6.2 Independent Component Analysis and Projection Pursuit
      - 4.6.3 Factor Analysis
      - 4.6.4 Discussion
  - 5 Self-Organizing Maps
    - 5.1 Training SOMs
    - 5.2 Visualization
    - 5.3 Application
    - 5.4 R Packages for SOMs
    - 5.5 Discussion
  - 6 Clustering
    - 6.1 Hierarchical Clustering
    - 6.2 Partitional Clustering
      - 6.2.1 K-Means
      - 6.2.2 K-Medoids
    - 6.3 Probabilistic Clustering
    - 6.4 Comparing Clusterings
    - 6.5 Discussion
- Part III Modelling
  - 7 Classification
    - 7.1 Discriminant Analysis
      - 7.1.1 Linear Discriminant Analysis
      - 7.1.2 Crossvalidation
      - 7.1.3 Fisher LDA
      - 7.1.4 Quadratic Discriminant Analysis
      - 7.1.5 Model-Based Discriminant Analysis
      - 7.1.6 Regularized Forms of Discriminant Analysis
        - Diagonal Discriminant Analysis
        - Shrunken Centroid Discriminant Analysis
    - 7.2 Nearest-Neighbour Approaches
    - 7.3 Tree-Based Approaches
      - 7.3.1 Recursive Partitioning and Regression Trees
        - Constructing the Tree
      - 7.3.2 Discussion
    - 7.4 More Complicated Techniques
      - 7.4.1 Support Vector Machines
        - Extensions to More than Two Classes
        - Finding the Right Parameters
      - 7.4.2 Artificial Neural Networks
  - 8 Multivariate Regression
    - 8.1 Multiple Regression
      - 8.1.1 Limits of Multiple Regression
    - 8.2 PCR
      - 8.2.1 The Algorithm
      - 8.2.2 Selecting the Optimal Number of Components
    - 8.3 Partial Least Squares (PLS) Regression
      - 8.3.1 The Algorithm(s)
      - 8.3.2 Interpretation
        - PLS Packages for R
    - 8.4 Ridge Regression
    - 8.5 Continuum Methods
    - 8.6 Some Non-Linear Regression Techniques
      - 8.6.1 SVMs for Regression
      - 8.6.2 ANNs for Regression
    - 8.7 Classification as a Regression Problem
      - 8.7.1 Regression for LDA
      - 8.7.2 Discussion
- Part IV Model Inspection
  - 9 Validation
    - 9.1 Representativity and Independence
    - 9.2 Error Measures
    - 9.3 Model Selection
    - 9.4 Crossvalidation Revisited
      - 9.4.1 LOO Crossvalidation
      - 9.4.2 Leave-Multiple-Out Crossvalidation
      - 9.4.3 Double Crossvalidation
    - 9.5 The Jackknife
    - 9.6 The Bootstrap
      - 9.6.1 Error Estimation with the Bootstrap
      - 9.6.2 Confidence Intervals for Regression Coefficients
      - 9.6.3 Other R Packages for Bootstrapping
    - 9.7 Integrated Modelling and Validation
      - 9.7.1 Bagging
      - 9.7.2 Random Forests
      - 9.7.3 Boosting
  - 10 Variable Selection
    - 10.1 Tests for Coefficient Significance
      - 10.1.1 Confidence Intervals for Individual Coefficients
      - 10.1.2 Tests Based on Overall Error Contribu
Wehrens, Ron
ISBN | 9783642178412 |
---|---|
Item number | 9783642178412 |
Media type | E-book (PDF) |
Edition | 2nd ed. |
Copyright year | 2011 |
Publisher | Springer-Verlag |
Length | 286 pages |
Language | English |
Copy protection | Adobe DRM |