4.4 Running a module on AnVIL

Title card for Running modules on AnVIL including running the modules and then closing the session. A prerequisite is listed: Your instructor (if you are a student) has added you to an AnVIL workspace.

4.4.0.1 Purpose

This section covers how to run C-MOOR modules on AnVIL: how to create an RStudio environment in your workspace to run a module, and how to properly end a session to prevent runaway costs.

4.4.0.2 Learning Objectives

  1. Launch a module through the cloned workspace
  2. Close out a session on AnVIL properly to prevent runaway costs

4.4.1 What is a workspace?

The workspace is the heart of AnVIL. Here are some key points about workspaces:

  • Every workspace comes with its own Google Bucket (our cloud storage). Your bucket will be empty.
  • Every workspace has its own billing project. Students who are not yet associated with a billing project will not be able to compute on their workspace.
  • Each user's access level can be set to owner, writer, or reader. Students will be writers with compute access.

4.4.2 Running modules on AnVIL

4.4.2.1 Starting a module on AnVIL

When you open the workspace, you will be on the dashboard tab by default. The dashboard contains the instructions on how to use the workspace, links to C-MOOR websites, and the startup script. Let’s try running a module.

An image titled Running modules on AnVIL showing a C-MOOR workspace and red boxes showing the container image and start up script lines needed to create an environment.

  1. Take note of the container image for the custom environment. We recommend copying this to a word document or notepad. Make sure there are no spaces before or after what you copy. You will need to input this URL soon.

  2. Take note of the startup script. Make sure there are no spaces before or after what you copy. This script is held in the original workspace everyone cloned. It does not have to be in your own workspace for it to work. You will need to input this URL soon.

  3. Click the Environment Configuration button (the cloud with a lightning bolt).
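
Stray whitespace in a copied value is easy to miss by eye. As a minimal sketch, the Python snippet below shows one way to check a copied value before pasting it into AnVIL. The function name clean_pasted_value and the example string are hypothetical placeholders, not the actual container image or startup script values shown on the dashboard.

```python
def clean_pasted_value(raw: str) -> str:
    """Return the value with leading and trailing whitespace removed.

    Raises ValueError if nothing was copied, or if whitespace remains
    inside the value, which usually means extra text was picked up.
    """
    cleaned = raw.strip()
    if not cleaned:
        raise ValueError("Nothing was copied.")
    if any(ch.isspace() for ch in cleaned):
        raise ValueError(f"Unexpected whitespace inside value: {cleaned!r}")
    return cleaned


# Hypothetical placeholder, NOT the real container image from the dashboard.
copied = "  example.org/placeholder/image:tag \n"
print(repr(clean_pasted_value(copied)))
```

The same check applies to the startup script URL. A plain text editor works just as well; the point is only that nothing should come before or after the value.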

Image showing how to access the cloud environment and highlighting what settings with red boxes to adjust as per list below.

  1. In the RStudio section, click Settings.

  2. Make sure your settings match the following. Under Application configuration, choose “Custom environment”. In the container image field that appears, paste the container image URL you copied earlier from the workspace. The URL should end with Bioconductor 3.19.1. In the startup script field, paste the startup script URL, which contains the words C-MOOR Startup Script. Set the creation timeout limit to 15 minutes.

  3. Select 4 CPUs and 15 gigabytes of memory.

  4. Confirm that the cloud compute cost is 20 cents per hour. If it is not, reselect the CPUs and memory as described in the previous step. This is a known bug in AnVIL at the time of writing.

  5. Scroll to the bottom of the window and click “Create”.
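
The settings in the steps above can be summarized as a short checklist. The Python sketch below is only an illustration for comparing what you see on screen against this guide. It does not communicate with AnVIL, and the dictionary keys, the function name, and the mismatched cost of 0.27 are hypothetical, chosen for this example. The expected values come from the steps above.

```python
# Expected values are taken from the steps above.
# The key names are made up for this sketch.
EXPECTED_SETTINGS = {
    "application_configuration": "Custom environment",
    "creation_timeout_minutes": 15,
    "cpus": 4,
    "memory_gb": 15,
    "cost_per_hour_usd": 0.20,
}


def find_mismatches(observed: dict) -> list[str]:
    """List every setting whose observed value differs from the guide."""
    problems = []
    for key, expected in EXPECTED_SETTINGS.items():
        actual = observed.get(key)
        if actual != expected:
            problems.append(f"{key}: expected {expected!r}, saw {actual!r}")
    return problems


# Hypothetical case: the cost shown on screen does not match the guide,
# so the CPUs and memory should be reselected before clicking Create.
on_screen = dict(EXPECTED_SETTINGS, cost_per_hour_usd=0.27)
for line in find_mismatches(on_screen):
    print(line)
```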

Image showing the RStudio environment lifecycle at different stages. Blue for busy, green for ready, and orange for paused.

It will take some time for the RStudio Environment to be created. You can keep track of the status of the environment based on the colored dot next to the RStudio icon. The dot will turn green when the environment is ready. While it is loading (blue), you cannot interact with it.

Image with a red box around the pop-up that appears when the RStudio environment is ready

  1. When the environment is ready, use the Open RStudio button that will pop up. You can also access RStudio through the Analyses tab. If you hold down Ctrl as you click, you can open RStudio in a new window.

Image showing the RStudio interface open on AnVIL, with red boxes showing how to use the file explorer and navigating to C-MOOR modules

  1. Use the file explorer in RStudio to navigate to your module of choice. First, enter the folder of the curriculum you are using, either rnaseq (not cure-rnaseq) or 16s. Then enter the folder of the module you want to run.

Image showing a .Rmd file in a C-MOOR folder surrounded by a red box.

  1. In the module’s directory, open the .Rmd file by double-clicking its name.

Image showing the RStudio interface and a red box around the Run Document button, distinct from the run button.

  1. Click Run Document in the open .Rmd file.

When you are finished, make sure you close out your session properly to prevent runaway costs!
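
To see why this matters, here is a rough back-of-the-envelope sketch in Python. It uses the compute rate of 20 cents per hour from the setup steps. The 72-hour duration and the class size of 25 are hypothetical numbers chosen for illustration, the sketch assumes the environment keeps running the whole time, and storage charges are not included.

```python
RATE_USD_PER_HOUR = 0.20  # compute rate quoted in the setup steps


def compute_cost(hours: float, environments: int = 1) -> float:
    """Estimated compute cost in USD for environments left running."""
    return round(RATE_USD_PER_HOUR * hours * environments, 2)


# One environment used for a 2-hour session and then deleted.
print(compute_cost(2))

# Hypothetical: one environment left running for 72 hours.
print(compute_cost(72))

# Hypothetical: 25 students who each leave an environment running for 72 hours.
print(compute_cost(72, environments=25))
```

A single session costs well under a dollar, but forgotten environments add up quickly across a whole class, and all of it is charged to the instructor's billing project.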

4.4.3 Closing out a session on AnVIL

Image showing the steps needed to close out an interactive session on AnVIL with each step in the list below shown in a red box.

  1. On the right side of the screen, click the Cloud Environment button. This is the cloud with the lightning bolt.
  2. Under the RStudio section, click Settings.
  3. Scroll to the bottom of the new window and click Delete environment.
  4. Check the option to delete everything, including the persistent disk. Otherwise, your instructor’s billing account will continue to incur storage costs.