lmm2met: Linear mixed-effects modeling to improve clinical metabolomics information

Kwanjeera Wanichthanarak, Saharuetai Jeamsripong, Natapol Pornputtapong, Sakda Khoomrung

2019-04-15

Contents

1 INSTALLATION

2 TUTORIAL

2.1 Input format

2.2 Processing clinical metabolomics data with lmm2met

2.3 Multivariate analysis

1 INSTALLATION

  1. Require R software 3.3.1 or higher (https://www.r-project.org/)
  2. Install lmm2met using the following commands in R terminal:
## Install devtools package, if not exist
install.packages("devtools")

## Install lmm2met package
devtools::install_github("kwanjeeraw/lmm2met")

2 TUTORIAL

Lipid profiles of adipose tissue samples [1] are used in this tutorial. The data set contains subcutaneous (SC) and paired visceral (VS) tissues from 59 patients. Patient metadata is in Table 1. Lipid profiling is performed using liquid chromatography qTOF MS (UPLC-QTOF MS), resulting in 158 lipids.

Table 1. Characteristics of colorectal carcinoma patients

Table 1. Characteristics of colorectal carcinoma patients

2.1 Input format

lmm2met accepts a dataframe of patient metadata and metabolite profiles as shown in Table 2.

Label Id Stage Age Sex BMI ICD10 Location Tissue X1 X2 X3 X4 X5
SAT_1 1 SII 65 F 30.44983 187 Colon SC 13.18725 9.410631 12.84474 10.386373 10.61844
SAT_10 2 SIII 70 F 22.30815 180 Colon SC 14.37285 9.744867 14.33294 10.701784 10.94010
SAT_11 3 SIV 70 F 17.63000 20 Rectum SC 15.34028 10.518095 16.42373 12.111801 12.99843
SAT_12 4 SII 60 M 28.39373 180 Colon SC 15.48707 9.811469 14.47447 11.968216 12.40126
SAT_13 5 SII 66 M 26.36560 183 Colon SC 14.12675 9.527597 12.98986 10.876466 11.58416
SAT_14 6 SIII 72 M 29.37758 20 Rectum SC 13.63333 9.753116 13.15815 10.495233 10.92507
SAT_15 7 SII 66 F 17.90886 19 Rectum SC 15.07497 10.332314 15.20835 11.227977 11.64551
SAT_16 8 SIII 68 M 30.44983 20 Rectum SC 14.65777 9.800992 13.39161 11.238705 11.03547
SAT_17 9 SI 55 M 27.46530 186 Colon SC 14.92884 9.592731 13.99792 10.528907 11.69798
SAT_18 10 SII 48 M 38.15512 180 Colon SC 13.20325 9.411735 12.24911 9.814709 10.05035

Table 2. Part of adipose tissue data set is illustrated including patient metadata (red) and metabolite profiles (blue).

2.2 Processing clinical metabolomics data with lmm2met

Patient characteristics including age, BMI, sex, tumor location, and tumor stage formulate the fixed-effects part of the LMM. Patient identifier (Id) is included as a random intercept term of the LMM.

Use the following commands in R terminal to process the data set:

## Load lmm2met package
library(lmm2met)

## Examine structure of adipose data
str(adipose)

## LMM fitting
fitMet = fitLmm(fix=c('Sex','Age','BMI','Stage','Location','Tissue'), random='(1|Id)', data=adipose, start=10)

## Observe LMM of the first metabolite
summary(fitMet$completeMod[[1]])

## Observe processed data
fitMet$fittedDat[1:10,1:14]

## Heatmaps of coefficients and significance levels of fixed effects
plot(fitMet, type='fixeff')

## Render coefficient table
plot(fitMet, type='coeff')

## Plot metabolites before and after LMM fitting (Figure 1)
plot(fitMet, type='fitted', fix='Tissue')
Figure 1. Boxplots comparing metabolite X2 before and after LMM fitting. Tissue type (SC, subcutaneous and VS, visceral) is the factor of interest.

Figure 1. Boxplots comparing metabolite X2 before and after LMM fitting. Tissue type (SC, subcutaneous and VS, visceral) is the factor of interest.

2.3 Multivariate analysis

Use the following commands in R terminal to perform PCA:

library(scales)
library(ggplot2)
mycolors = RColorBrewer::brewer.pal(name="Set1", n = 9)

## PCA before LMM fitting
pca = prcomp(adipose[,-1:-9], center = TRUE, scale. = TRUE)
prop.pca = summary(pca)
pcadata = cbind(adipose[,1:9], pca = pca$x)
ggplot(pcadata, aes(pca.PC1, pca.PC2)) +
  geom_point(aes(colour = Tissue), size = 1) +
  scale_color_manual(labels=c("non-tumor","tumor"), values = mycolors) +
  stat_ellipse(type = "norm", color='grey70', size=0.3) +
  theme_minimal() +
  labs(title = 'Before LMM fitting', x = paste("PC1 [", percent(prop.pca$importance[2,1]), "]", sep=""), y = paste("PC2 [", percent(prop.pca$importance[2,2]), "]", sep=""))

## PCA after LMM fitting
pca = prcomp(fitMet$fittedDat[,-1:-9], center = TRUE, scale. = TRUE)
prop.pca = summary(pca)
pcadata = cbind(fitMet$fittedDat[,1:9], pca = pca$x)
ggplot(pcadata, aes(pca.PC1, pca.PC2)) +
  geom_point(aes(colour = Tissue), size = 1) +
  scale_color_manual(labels=c("non-tumor","tumor"), values = mycolors) +
  stat_ellipse(type = "norm", color='grey70', size=0.3) +
  theme_minimal() +
  labs(title = 'After LMM fitting', x = paste("PC1 [", percent(prop.pca$importance[2,1]), "]", sep=""), y = paste("PC2 [", percent(prop.pca$importance[2,2]), "]", sep=""))
Figure 2. PCA score plot of adipose tissue samples. Before LMM fitting (top) and after LMM fitting (bottom).

Figure 2. PCA score plot of adipose tissue samples. Before LMM fitting (top) and after LMM fitting (bottom).

Use the following commands in R terminal to perform PLS-DA:

## Perform PLS-DA using mixOmics package
dat = as.matrix(fitMet$fittedDat[,-1:-9])
vac.plsda <- mixOmics::plsda(dat, Y=fitMet$fittedDat$Tissue, ncomp=3)

## Score plot (Figure 3)
mixOmics::plotIndiv(vac.plsda)

## Feature selections based on VIP
vp = mixOmics::vip(vac.plsda)
df = data.frame(vp,row.names = dimnames(vp)[[1]])

## Discriminants on the 1st component with VIP > 1
ft = row.names(df)[which(df$comp.1 > 1)] 
Figure 3. PLS-DA score plot of adipose tissue samples. SC, subcutaneous tissue and VS, visceral tissue).

Figure 3. PLS-DA score plot of adipose tissue samples. SC, subcutaneous tissue and VS, visceral tissue).

Use the following commands in R terminal to perform OPLS-DA:

## Perform OPLS-DA using ropls package
dat = as.matrix(fitMet$fittedDat[,-1:-9])
ropls.oplsda <- ropls::opls(dat, y=fitMet$fittedDat$Tissue, plotL=F, predI = 1, orthoI = NA)

## Feature selections based on VIP
vp = ropls::getVipVn(ropls.oplsda)

## Discriminants on the 1st component with VIP > 1
ft = names(vp[vp > 1])
Figure 4. OPLS-DA outputs.

Figure 4. OPLS-DA outputs.

REFERENCES

  1. Liesenfeld, D.B., et al., Metabolomics and transcriptomics identify pathway differences between visceral and subcutaneous adipose tissue in colorectal cancer patients: the ColoCare study. Am J Clin Nutr, 2015. 102(2): p. 433-43.

  2. Rohart, F., et al., mixOmics: An R package for ’omics feature selection and multiple data integration. PLoS Comput Biol, 2017. 13(11): p. e1005752.

  3. Thevenot, E.A., et al., Analysis of the Human Adult Urinary Metabolome Variations with Age, Body Mass Index, and Gender by Implementing a Comprehensive Workflow for Univariate and OPLS Statistical Analyses. J Proteome Res, 2015. 14(8): p. 3322-35.