9.5 Try it Question 2 - Do diet, age, gender and levels of metabolites correlate with microbe variation between individuals?

Principal coordinate analysis (PCoA) is used to represent similarity between samples (sample microbiomes in our case) in a plot. Using the “MISO” study dataset, we will use a PCoA plot to summarize individuals based on their ASVs and plot the resulting relationships between individuals. By color-coding the samples by different variables and checking how well each coloring matches the sample distribution on the PCoA plot, we will try to explain potential sources of sample similarity.

Approach: Perform multidimensional scaling (classical multidimensional scaling is also known as principal coordinate analysis, PCoA) to establish relationships between the samples given multivariate data (ASV abundances). Using a PCoA plot (via the commands ordinate() and plot_ordination() in phyloseq), you will condense the original high-dimensional data into a low-dimensional representation: the data are first converted to a distance matrix between samples, which is then projected onto 2 dimensions, x and y, that best explain the variability in your data. From your PCoA plot you will assess the contribution of different study variables to sample diversity and identify variables that help explain it. In a PCoA plot, samples with similar microbial profiles are plotted close together and may appear as “clusters”.
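
The two stages described above (distance matrix first, then projection onto 2 axes) can be sketched in base R without phyloseq. This is only an illustration under stated assumptions: the count table is made up, the helper bray_curtis() is a hypothetical name written for this sketch, and cmdscale() from the stats package stands in for what ordinate(..., method="PCoA") does internally.

```r
# Toy ASV count table (made-up numbers): rows = samples, columns = ASVs
counts <- matrix(
  c(10, 0, 5,
     8, 1, 6,
     0, 9, 2,
     1, 7, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), paste0("ASV", 1:3))
)

# Bray-Curtis dissimilarity between two samples (hypothetical helper):
# sum of absolute count differences divided by the sum of all counts
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

# Stage 1: build the sample-by-sample distance matrix
n <- nrow(counts)
d <- matrix(0, n, n, dimnames = list(rownames(counts), rownames(counts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray_curtis(counts[i, ], counts[j, ])
  }
}

# Stage 2: classical multidimensional scaling (PCoA) on the distances,
# keeping the 2 axes that capture the most variation
pcoa <- cmdscale(as.dist(d), k = 2, eig = TRUE)
coords <- pcoa$points  # one row per sample, columns = axis 1 and axis 2

print(round(d, 2))
print(coords)
```

In this toy table S1 and S2 have similar profiles, as do S3 and S4, so each pair ends up close together in the 2 coordinates, which is what a “cluster” on the plot reflects.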

9.5.1 Step 1. Make a PCoA plot, ordinating on the entire miso dataset, and coloring by individual. Investigate resulting plot shape.

  • Ordination summarizes a multidimensional dataset by projecting it onto a low-dimensional space (such as X and Y axes) so that any pattern in the data can be seen by visual inspection.

  • Subsequent coloring of the pattern with metadata variables can reveal underlying relationships between data and experiment variables.

  • Color - by subject

  • Include a title for your PCoA plot

  • Use the following code as a template:

#Ordinate 
miso.pcoa <- ordinate(miso, method="PCoA", distance="bray") 

#Plot PCoA
plot_ordination(miso, miso.pcoa, 
        color = "fill in the blank",  
        title="fill in the blank")
1A-1. Paste your plot below:

1A-2. Does the PCoA organize samples into groups? Is there any pattern?

1A-3. Do you observe any sample clustering by individual? This is represented by points of the same color clustering together. If so, give an example of an individual who shows this pattern.

1A-4. Offer one question you want to ask about this PCoA plot.

9.5.2 Step 2. Using the PCoA plot from Step 1 above, look for any correlations between the shape of your PCoA plot and diet, gender and age

Step 2A. Using the PCoA plot from Step 1, look for any correlations between the shape of your PCoA plot and diet.

How well is your data explained by variation in diet? Look for signs of correlation between the PCoA shape and the variable “diet”, which represents the BD, HD and WO diets. Note that because we are recoloring the same ordination, we do not need to re-ordinate (no new ordinate() call is needed).
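
Why recoloring needs no new ordination can be sketched in base R: the sample positions come from the ordination alone, and the metadata only decides each point's color. Everything here is made up for illustration (the coordinates, the metadata values, and the helper name color_index()); none of it comes from the miso dataset.

```r
# Made-up coordinates standing in for the Step 1 ordination result
coords <- data.frame(
  Axis.1 = c(-0.40, -0.35, 0.30, 0.45),
  Axis.2 = c(0.05, -0.10, 0.20, -0.15),
  row.names = paste0("S", 1:4)
)

# Made-up sample metadata, one row per sample
meta <- data.frame(
  subject = c("A", "A", "B", "B"),
  diet    = c("BD", "HD", "BD", "HD"),
  row.names = paste0("S", 1:4)
)

# Hypothetical helper: turn a metadata column into one color index per sample
color_index <- function(variable) {
  as.integer(factor(meta[rownames(coords), variable]))
}

# Same coordinates, two different colorings
plot(coords$Axis.1, coords$Axis.2, col = color_index("subject"),
     pch = 19, main = "Colored by subject")
plot(coords$Axis.1, coords$Axis.2, col = color_index("diet"),
     pch = 19, main = "Colored by diet")
```

Both plots place the four points identically; only the grouping shown by color changes.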

  • Color - by diet
  • Include a title for your PCoA plot
  • Use the following code as a template:
#Ordinate (optional here: miso.pcoa from Step 1 can be reused)
miso.pcoa <- ordinate(miso, method="PCoA", distance="bray") 

plot_ordination(miso, miso.pcoa, 
        color = "fill in the blank",  
        title="fill in the blank")
2A-1. Paste your plot below:

2A-2. Do samples of the same diet treatment (color) cluster together?

2A-3. Based on what you wrote above, are samples from the same diet treatment similar to each other?

Step 2B. Using the PCoA plot from Step 1, look for any correlations between the shape of your PCoA plot and gender.

How well is your data explained by variation in gender? Look for signs of correlation between the PCoA shape and the variable “gender”.

  • Color - by gender
  • Include a title for your PCoA plot
  • Use the following code as a template:
#Ordinate (optional here: miso.pcoa from Step 1 can be reused)
miso.pcoa <- ordinate(miso, method="PCoA", distance="bray") 

plot_ordination(miso, miso.pcoa, 
        color = "fill in the blank",  
        title="fill in the blank")
2B-1. Paste your plot below:

2B-2. Are samples from the same gender similar to each other?

Step 2C. Using the PCoA plot from Step 1, look for any correlations between the shape of your PCoA plot and age.

How well is your data explained by variation in age? Look for signs of correlation between the PCoA shape and the variable “age”.

  • Color - by age
  • Include a title for your PCoA plot
  • Use the following code as a template. To enhance the color range of the age variable, the template also includes the scale_colour_gradient() command, which specifies a blue-to-yellow color gradient.
  • Try running the code with and without the scale_colour_gradient() command to appreciate the difference.
#Ordinate (optional here: miso.pcoa from Step 1 can be reused)
miso.pcoa <- ordinate(miso, method="PCoA", distance="bray") 

plot_ordination(miso, miso.pcoa, 
        color = "fill in the blank",  
        title="fill in the blank") + 
scale_colour_gradient(low = "blue", high = "yellow") 
2C-1. Paste your plot below:

2C-2. Are samples of a similar age similar to each other?
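
To see what a low = "blue", high = "yellow" gradient does with a continuous variable such as age, here is a rough base R sketch. The ages are made up. colorRamp() interpolates in RGB space, whereas ggplot2 interpolates in a different color space by default, so the in-between shades are only an approximation of what scale_colour_gradient() draws; the two endpoints are the same.

```r
# Made-up ages for four samples
age <- c(21, 35, 48, 62)

# Rescale ages to the range 0..1 (youngest = 0, oldest = 1)
scaled <- (age - min(age)) / (max(age) - min(age))

# Interpolate between blue (low) and yellow (high)
ramp <- colorRamp(c("blue", "yellow"))
rgb_values <- ramp(scaled)                      # one row of R, G, B per sample
cols <- rgb(rgb_values, maxColorValue = 255)    # convert to hex color codes

print(data.frame(age = age, color = cols))
```

The youngest sample gets pure blue and the oldest pure yellow, so a smooth blue-to-yellow trend across the PCoA plot would suggest that age tracks with sample position.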


9.5.3 Step 3. Using the PCoA plot from Step 1, look for any correlations between the shape of your PCoA plot and the level of metabolites.

How well is your data correlated with variation in the 5 metabolites: Creatinine, PCS, IS, HIPP, and PAG? Test each metabolite by coloring the plot by one metabolite at a time. Then choose your favorite metabolite (e.g. the one showing the largest difference) and show its plot below.

  • Color - by metabolite, one at a time: Creatinine, PCS, IS, HIPP, PAG
  • Include a title for your PCoA plot
  • Use the following code as a template:
#Ordinate (optional here: miso.pcoa from Step 1 can be reused)
miso.pcoa <- ordinate(miso, method="PCoA", distance="bray") 

plot_ordination(miso, miso.pcoa, 
        color = "fill in the blank",  
        title="fill in the blank") + 
scale_colour_gradient(low = "blue", high = "yellow") 
3A-1. Paste your metabolite plot(s) below:


3A-2. Which metabolite did you choose to show and why?


3A-3. Are samples of the same metabolite level similar to each other?
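
As an optional numeric complement to judging the colors by eye, one could check whether a metabolite increases or decreases along a PCoA axis with a rank correlation. The sketch below uses made-up axis coordinates and metabolite levels. In a real analysis the coordinates would have to be taken from the ordination result (assumed here to be available as miso.pcoa$vectors) and matched to samples in the same order as the metadata.

```r
# Made-up PCoA axis 1 coordinates for six samples
axis1 <- c(-0.42, -0.30, -0.05, 0.10, 0.28, 0.39)

# Made-up metabolite levels for the same six samples, in the same order
metabolite <- c(1.2, 1.5, 2.1, 2.0, 3.4, 3.9)

# Spearman rank correlation: +1 or -1 means a perfectly monotonic
# relationship, 0 means no monotonic relationship
rho <- cor(axis1, metabolite, method = "spearman")
print(rho)
```

A value near +1 or -1 would match a plot where the color gradient runs steadily from one side of the axis to the other. A value near 0 would match a plot where colors look scattered along that axis.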


9.5.4 Step 4. One way to check whether the metabolite levels really correlate with your data is to see whether the metabolite-data relationship holds or breaks when you analyze smaller, specific subsets of the data.

Using the subset_samples() command, subset out e.g. the HD diet specifically, and then the BD diet. Then generate new PCoA plots and see if the correlation with your metabolite of interest still holds.
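
Why each subset needs its own ordination can be sketched in base R. The counts and diet labels below are made up, bray_dist() is a hypothetical helper written for this sketch, and plain row subsetting stands in for subset_samples(). The Bray-Curtis distance between any two retained samples does not change, but the axes are recomputed from the retained samples only, so the plot can take a different shape.

```r
# Made-up ASV counts for six samples, with made-up diet labels
counts <- matrix(
  c(10, 0, 5,
     8, 1, 6,
     0, 9, 2,
     1, 7, 3,
     4, 4, 4,
     9, 2, 1),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("S", 1:6), paste0("ASV", 1:3))
)
diet <- c("HD", "BD", "HD", "BD", "HD", "BD")

# Hypothetical helper: Bray-Curtis distances between all rows of a table
bray_dist <- function(m) {
  n <- nrow(m)
  d <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(m[i, ] - m[j, ])) / sum(m[i, ] + m[j, ])
    }
  }
  as.dist(d)
}

# Keep only the HD samples, then ordinate the full set and the subset
counts_hd <- counts[diet == "HD", , drop = FALSE]
pcoa_all <- cmdscale(bray_dist(counts), k = 2)
pcoa_hd  <- cmdscale(bray_dist(counts_hd), k = 2)

print(pcoa_all[rownames(pcoa_hd), ])  # HD samples within the full ordination
print(pcoa_hd)                        # HD samples ordinated on their own
```

Comparing the two printed tables shows that the same HD samples get different coordinates once the other samples are removed, which is why a pattern can appear or disappear after subsetting.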

Step 4A. Subset “HD”, ordinate on “HD”, and make a new PCoA plot.

  • Subset data - by the diet “HD”
  • Ordinate - misoHD
  • Color - by your selected metabolite from Step 3
  • Include a title for your PCoA plot
  • Use the following code as a template:
#Subset only the “HD” diet samples from your miso data, creating a new phyloseq object called “misoHD”
misoHD <- subset_samples(miso, diet == "fill in the blank")

#Ordinate using only the “HD” subset, creating a new ordination object called “pcoa.misoHD”
pcoa.misoHD <- ordinate(misoHD, method="PCoA", distance="bray")

#Make a PCoA plot using your “HD” subset.
plot_ordination(misoHD, pcoa.misoHD, 
        color = "fill in the blank", 
        title="fill in the blank")
4A-1. Paste your plot below for HD subset:

4A-2. Are samples of the same metabolite level similar to each other considering only samples from the HD diet?

4A-3. Has the relationship between the metabolite and samples changed within the HD diet compared to the entire dataset? (e.g. Have any new patterns emerged or disappeared?)

Step 4B. Repeat the analysis on only the BD diet samples. Subset “BD”, ordinate on “BD”, and make a new PCoA plot.

  • Subset data - by the diet “BD”
  • Ordinate - misoBD
  • Color - by your selected metabolite from Step 3
  • Include a title for your PCoA plot
  • Use the following code as a template:
misoBD <- subset_samples(miso, diet == "fill in the blank")

pcoa.misoBD <- ordinate(misoBD, method="PCoA", distance="bray") 

plot_ordination(misoBD, pcoa.misoBD, 
        color = "fill in the blank", 
        title="fill in the blank")
4B-1. Paste your plot below for BD subset:

4B-2. Are samples of the same metabolite level similar to each other considering only samples from the BD diet?

4B-3. Has the relationship between the metabolite and samples changed within the BD diet compared to the entire dataset? (e.g. Have any new patterns emerged or disappeared?)

9.5.5 Footnotes

9.5.5.2 Contributions and affiliations

  • Valeriya Gaysinskaya, Johns Hopkins University
  • Gauri Paul, Clovis Community College
  • Frederick Tan, Johns Hopkins University
  • Sayumi York, Notre Dame of Maryland University