TY - JOUR
T1 - PARAFASCA: ASCA combined with PARAFAC for the analysis of metabolic fingerprinting data
AU - Jansen, J.J.
AU - Bro, R.
AU - Hoefsloot, H.C.J.
AU - van den Berg, F.W.J.
AU - Westerhuis, J.A.
AU - Smilde, A.K.
N1 - Reporting year: 2008
PY - 2008
Y1 - 2008
N2 - Novel post-genomics experiments such as metabolomics provide datasets that are highly multivariate and often reflect an underlying experimental design, developed with a specific experimental question in mind. ANOVA-simultaneous component analysis (ASCA) can be used for the analysis of multivariate data obtained from an experimental design instead of the widely used principal component analysis (PCA). This increases the interpretability of the model in terms of the experimental question. Aside from the levels of individual factors, variation that can be described by the experimental design may also depend on levels of multiple (crossed) factors simultaneously, i.e. on their interactions. ASCA describes each contribution with a PCA model, but a contribution depending on crossed factors may be described more parsimoniously by multiway models like parallel factor analysis (PARAFAC). The combination of PARAFAC and ASCA, named PARAFASCA, provides a view on the data that is both parsimonious and focused on the experimental question. The novel method is used to analyze a dataset in which the effect of two doses of hydrazine on the urinary chemical composition of rats is investigated by time-resolved metabolic fingerprinting with nuclear magnetic resonance (NMR) spectroscopy. This experiment has been conducted to monitor the dose-specific urine composition changes over time upon hydrazine administration. Comparison of the PCA, the ASCA and the PARAFASCA models shows that ASCA and PARAFASCA describe the data in a way that is more focused on the experimental question than PCA does, but that PARAFASCA is more parsimonious than ASCA and better separates the variation underlying different effects.
U2 - 10.1002/cem.1105
DO - 10.1002/cem.1105
M3 - Article
SN - 0886-9383
VL - 22
SP - 114
EP - 121
JO - Journal of Chemometrics
JF - Journal of Chemometrics
IS - 1-2
ER -