A PCoA (principal coordinates analysis) plot represents the similarity between samples (sample microbiomes in our case). Using the “MISO” study dataset, we will use a PCoA plot to summarize individuals based on their ASVs and plot the resulting relationships between individuals. By color-coding the samples by different study variables and checking how well each coloring matches the distribution of samples on the PCoA plot, we will try to identify potential sources of sample similarity.
Approach: Perform classical multidimensional scaling (also known as principal coordinates analysis, PCoA; not to be confused with principal component analysis, PCA) to establish the relationships between samples from multivariate data (here, ASV abundances). Using a PCoA plot (via the commands ordinate() and plot_ordination() in phyloseq), you will condense the original high-dimensional data into a low-dimensional representation: the data are first converted to a matrix of pairwise distances between samples, and the samples are then placed on 2 axes, x and y, that capture as much of the variability in those distances as possible. From your PCoA plot you will assess the contribution of different study variables to sample diversity and identify variables that help explain it. In a PCoA plot, samples with similar microbial profiles are plotted close together and may appear as “clusters”.
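To make the approach above concrete, here is a minimal base-R sketch of the two stages (distance matrix, then classical scaling). It does not use phyloseq or the MISO data: the count table is made up, bray_curtis() is a helper written only for this illustration, and cmdscale() from base R stands in for the PCoA step, so treat it as an approximation of what ordinate() does rather than its actual implementation.

```r
# Toy count table: rows are samples, columns are ASVs (made-up numbers).
counts <- matrix(
  c(10, 0, 5, 1,
     8, 1, 6, 0,
     0, 9, 1, 7,
     1, 7, 0, 9),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), paste0("ASV", 1:4))
)

# Illustrative helper: Bray-Curtis dissimilarity between two samples,
# sum(|x - y|) / sum(x + y); 0 = identical, 1 = no shared ASVs.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Fill the sample-by-sample dissimilarity matrix.
n <- nrow(counts)
d <- matrix(0, n, n, dimnames = list(rownames(counts), rownames(counts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray_curtis(counts[i, ], counts[j, ])
  }
}

# Classical multidimensional scaling (PCoA) on the distance matrix,
# keeping the first two axes.
pcoa <- cmdscale(as.dist(d), k = 2, eig = TRUE)
coords <- pcoa$points  # one row per sample; columns are axis 1 and axis 2

# Share of the positive eigenvalues captured by each of the two axes.
pos_eig <- pcoa$eig[pcoa$eig > 0]
explained <- pos_eig[1:2] / sum(pos_eig)

print(round(d, 3))
print(coords)
print(explained)
```

In this toy table S1 and S2 share most of their counts, as do S3 and S4, so the two pairs end up on opposite sides of axis 1. This is the kind of "cluster" the text describes.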
Step 2. Using the PCoA plot from Step 1 above, look for any correlations between the shape of your PCoA plot and diet, gender, and age.
Step 2A. Using the PCoA plot from Step 1, look for any correlation between the shape of your PCoA plot and diet.
How well is your data explained by the variation in diet? Look for signs of correlation between the shape of the PCoA plot and the variable “diet”, which represents the BD, HD and WO diets. Note that the ordination is the same as in Step 1, so you do not need to re-run ordinate() if miso.pcoa is still in your workspace; the call is repeated in the template only so that the code runs on its own.
- Color - by diet
- Include a title for your PCoA plot
- Use the following code as a template:
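# Ordinate (can be skipped if miso.pcoa from Step 1 is still in your workspace)
miso.pcoa <- ordinate(miso, method = "PCoA", distance = "bray")
# Plot the ordination, coloring the samples by the chosen variable
plot_ordination(miso, miso.pcoa,
                color = "fill in the blank",
                title = "fill in the blank")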
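As a numeric companion to the visual check in Step 2A, the sketch below computes the average position (centroid) of each diet group on the two axes. The coordinates are invented for illustration and are not MISO results. With real data you would take the axis values from your own ordination. Centroids that sit far apart, relative to the spread within each group, are what a visible separation by color looks like in numbers.

```r
# Hypothetical ordination coordinates for six samples (made-up numbers),
# standing in for the axis 1 / axis 2 values behind a PCoA plot.
coords <- data.frame(
  axis1 = c(-0.40, -0.35, -0.05, 0.02, 0.38, 0.41),
  axis2 = c( 0.10, -0.08,  0.30, 0.25, -0.20, -0.15),
  diet  = c("BD", "BD", "HD", "HD", "WO", "WO")
)

# Mean position of each diet group on each axis (the group centroids).
centroids <- aggregate(cbind(axis1, axis2) ~ diet, data = coords, FUN = mean)
print(centroids)

# Spread of each group along axis 1, for comparison with the gaps
# between the centroids.
spread_axis1 <- tapply(coords$axis1, coords$diet, function(v) diff(range(v)))
print(spread_axis1)
```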
Step 2B. Using the PCoA plot from Step 1, look for any correlation between the shape of your PCoA plot and gender.
How well is your data explained by the variation in gender? Look for signs of correlation between the shape of the PCoA plot and the variable “gender”.
- Color - by gender
- Include a title for your PCoA plot
- Use the following code as a template.
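# Ordinate (can be skipped if miso.pcoa from Step 1 is still in your workspace)
miso.pcoa <- ordinate(miso, method = "PCoA", distance = "bray")
# Plot the ordination, coloring the samples by the chosen variable
plot_ordination(miso, miso.pcoa,
                color = "fill in the blank",
                title = "fill in the blank")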
Step 2C. Using the PCoA plot from Step 1, look for any correlation between the shape of your PCoA plot and age.
How well is your data explained by the variation in age? Look for signs of correlation between the shape of the PCoA plot and the variable “age”.
- Color - by age
- Include a title for your PCoA plot
- Use the following code as a template. To enhance the color range of the age variable, the template also includes the scale_colour_gradient() command, which specifies a color gradient from blue (low values) to yellow (high values).
- Try running the code with and without the scale_colour_gradient() command to appreciate the difference.
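# Ordinate (can be skipped if miso.pcoa from Step 1 is still in your workspace)
miso.pcoa <- ordinate(miso, method = "PCoA", distance = "bray")
# Plot the ordination, coloring the samples by the chosen variable,
# then map low values to blue and high values to yellow
plot_ordination(miso, miso.pcoa,
                color = "fill in the blank",
                title = "fill in the blank") +
  scale_colour_gradient(low = "blue", high = "yellow")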
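To see what a two-color gradient does to a continuous variable such as age, here is a small base-R sketch. The ages are made up, and colorRamp() is used only as a stand-in: the intermediate shades it produces may differ from the ones ggplot2 draws, but the end points (lowest value blue, highest value yellow) follow the same idea as the template above.

```r
# Hypothetical ages for five samples (made-up numbers).
age <- c(22, 35, 41, 58, 67)

# Rescale age to the 0-1 range: youngest -> 0, oldest -> 1.
age01 <- (age - min(age)) / (max(age) - min(age))

# colorRamp() returns a function that interpolates between the two end
# colors; values near 0 map towards blue, values near 1 towards yellow.
ramp <- colorRamp(c("blue", "yellow"))
rgb_values <- ramp(age01)                       # one row of R, G, B per sample
point_colors <- rgb(rgb_values, maxColorValue = 255)

# Each age paired with the hex color it would be drawn in.
print(data.frame(age, point_colors))
```

Samples of similar age receive similar colors, so a smooth color trend along one axis of the plot suggests that age varies along that axis.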
Step 4. One way to check whether the metabolite levels really correlate with your data is to see whether the relationship holds or breaks when you look only at smaller, specific subsets of the data.
Using the subset_samples() command, subset out one diet at a time, e.g. first the HD diet and then the BD diet. Then generate new PCoA plots and see if the correlation with your metabolite of interest still holds.
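The reasoning behind this check can be sketched on toy data in base R. Everything below is hypothetical: the counts, the diet labels, the metabolite values, and the helper bray_matrix() are invented for illustration, and cmdscale() stands in for the PCoA step. The point is only that a correlation seen across all samples can be driven by the difference between diets, which is why it is worth re-ordinating within a single diet.

```r
# Toy count table (made-up numbers): six samples, four ASVs.
counts <- matrix(
  c(12,  1, 4, 0,
     9,  2, 6, 1,
    11,  0, 3, 2,
     1, 10, 0, 6,
     0,  8, 2, 9,
     2, 11, 1, 5),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("S", 1:6), paste0("ASV", 1:4))
)
diet <- c("HD", "HD", "HD", "BD", "BD", "BD")
metabolite <- c(5.1, 4.8, 5.5, 2.0, 1.7, 2.4)  # hypothetical levels

# Illustrative helper: Bray-Curtis dissimilarities between all rows.
bray_matrix <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) for (j in seq_len(n)) {
    d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
  }
  as.dist(d)
}

# Ordinate all samples, then only the "HD" samples.
pcoa_all <- cmdscale(bray_matrix(counts), k = 2)
hd <- diet == "HD"
pcoa_hd <- cmdscale(bray_matrix(counts[hd, , drop = FALSE]), k = 2)

# Correlation of the metabolite with axis 1, overall and within HD only.
# The sign of a PCoA axis is arbitrary, so compare absolute values.
cor_all <- abs(cor(pcoa_all[, 1], metabolite))
cor_hd <- abs(cor(pcoa_hd[, 1], metabolite[hd]))
print(c(all_samples = cor_all, HD_only = cor_hd))
```

With only three samples in the subset, the within-HD value here is not meaningful on its own. In the real exercise the comparison is made by eye on the new PCoA plots.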
Step 4A. Subset “HD”, ordinate on “HD”, and make a new PCoA plot.
- Subset data - by the diet “HD”
- Ordinate - misoHD
- Color - by your selected metabolite from part 3
- Include a title for your PCoA plot
- Use the following code as a template:
# Subset only the "HD" diet samples from your miso data, creating a new phyloseq object called "misoHD"
misoHD <- subset_samples(miso, diet == "fill in the blank")
# Ordinate using only the "HD" subset, creating a new ordination object called "pcoa.misoHD"
pcoa.misoHD <- ordinate(misoHD, method = "PCoA", distance = "bray")
# Make a PCoA plot using your "HD" subset
plot_ordination(misoHD, pcoa.misoHD,
                color = "fill in the blank",
                title = "fill in the blank")
Step 4B. Repeat the analysis on only the BD diet samples. Subset “BD”, ordinate on “BD”, and make a new PCoA plot.
- Subset data - by the diet “BD”
- Ordinate - misoBD
- Color - by your selected metabolite from part 3
- Include a title for your PCoA plot
- Use the code below as a template.
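# Subset only the "BD" diet samples from your miso data, creating a new phyloseq object called "misoBD"
misoBD <- subset_samples(miso, diet == "fill in the blank")
# Ordinate using only the "BD" subset, creating a new ordination object called "pcoa.misoBD"
pcoa.misoBD <- ordinate(misoBD, method = "PCoA", distance = "bray")
# Make a PCoA plot using your "BD" subset
plot_ordination(misoBD, pcoa.misoBD,
                color = "fill in the blank",
                title = "fill in the blank")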