1.4 AnVIL Onboarding
1.4.1 Join AnVIL
1.4.1.1 Purpose
In this course, we will use the online cloud platform AnVIL to analyze data for your research project. The purpose of this assignment is to register for an AnVIL account and then tell your instructor which email address you used to sign up, so that you can be added to the AnVIL workspace for this course and access course materials.
1.4.1.2 Learning Objectives
- Create an account on AnVIL
- Share the email you used to sign up for AnVIL with your instructor
1.4.1.3 Introduction
AnVIL (the Genomic Data Science Analysis, Visualization, and Informatics Lab-space) is a platform created by the National Human Genome Research Institute (NHGRI) in collaboration with cloud computing platform providers like Google and Microsoft. Using AnVIL, you can access computing resources on the cloud through your browser without needing any specialized physical equipment. Through AnVIL you will also have access to all the software and data necessary to complete your research project.
Please note that Google Chrome is the only supported browser for AnVIL. If you choose a different browser, you may run into technical errors. Additionally, AnVIL works only on laptops. If you have a tablet, pair up with someone else or ask to borrow a laptop from your school.
1.4.1.4 Part 1 – Create an account on AnVIL
Follow the written steps below or refer to the slides or video guide.

- Open anvil.terra.bio in Google Chrome. Google Chrome is the only officially supported web browser for AnVIL. While you may be able to run AnVIL in other browsers, we strongly suggest using Chrome.
- Tip: bookmark this page so that you can easily access it throughout the course.
- Click the hamburger icon (3 lines) in the top left corner of the screen
- Click “Sign in”

- Click “Sign in with Google”.
- Sign in with a Google-associated email address, such as an institutional email that uses Gmail or a personal Gmail account. You must use a Google-associated email address to gain access to Google Cloud computing resources.
- If you are a student, share the email you used to sign up for AnVIL with your instructor following their instructions.
1.4.2 First AnVIL Tutorial
Interactive tutorials introducing various data science concepts
1.4.2.1 Purpose
The purpose of this assignment is to (1) confirm you have been added to the class AnVIL workspace so you can access course materials, and (2) learn how to access the tutorials for this course on AnVIL.
1.4.2.2 Learning Objectives
- Confirm you have been added to the class workspace on AnVIL
- Start up an RStudio environment on AnVIL
- Complete your first LearnR tutorial
- Delete your RStudio environment
1.4.2.3 Introduction
Before beginning this assignment, you should have already created an AnVIL account and submitted the email you used to sign up for AnVIL to your instructor. In this assignment you will learn how to set up an RStudio environment on AnVIL. Setting up this environment is analogous to preparing a lab space for a physical lab: you need the right equipment and reagents to be able to do the activity.
This assignment shows you how to set up the RStudio environment and start up your first C-MOOR tutorial.
1.4.2.4 Part 1 – Confirm you have access to your class workspace
The workspace is the heart of AnVIL. To be able to run modules, you need to have access to the class workspace.
- Open AnVIL in Google Chrome
- Click on the hamburger icon in the top left corner
- Log in to your AnVIL account
- Click on the hamburger icon in the top left corner again
- Click “Workspaces”
- Confirm you see your class workspace
- Click on the workspace name to enter the workspace.
1.4.2.5 Part 2 – Start up an RStudio environment
When you open the workspace, you will be on the dashboard tab by default. The dashboard contains the instructions on how to use the workspace, links to C-MOOR websites, and the startup script. Let’s try running a module.

- Take note of the container image for the custom environment. We recommend copying this to a word document or notepad. Make sure there are no spaces before or after what you copy. You will need to input this URL soon.
- Take note of the startup script. Make sure there are no spaces before or after what you copy. This script is held in the original workspace everyone cloned. It does not have to be in your own workspace for it to work. You will need to input this URL soon.
- Click on the Environment Configuration button, the cloud with a lightning bolt.

- In the RStudio section, click Settings.
- Make sure your settings match the following. Under Application configuration, choose “Custom environment”. In the container image field that appears, paste the container image URL that you copied earlier from the workspace. The URL should end with Bioconductor 3.19.1. In the startup script field, paste the URL for the startup script. This URL contains the words C-MOOR Startup Script. Set the creation timeout limit to 15 minutes.
- Select 4 CPUs and 15 gigabytes of memory.
- Confirm that the cloud compute cost is 20 cents per hour. If it is not 20 cents per hour, reselect the CPUs and memory allocation as described in the previous step. This is a known bug in AnVIL at the time of writing.
- Scroll to the bottom of the window and click “Create”.
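
The checks in the steps above (no stray spaces around what you paste, 4 CPUs, 15 gigabytes of memory, a 15 minute timeout, and 20 cents per hour) can be written down as a short Python sketch. This is only an illustration of the checklist, not part of AnVIL: the function names, dictionary keys, and the example URL are all hypothetical, and you do not need to run it to complete the assignment.

```python
# Illustrative checklist only. These names are hypothetical and are not
# part of AnVIL; the expected values are the ones given in this guide.
EXPECTED = {
    "cpus": 4,
    "memory_gb": 15,
    "creation_timeout_minutes": 15,
    "cost_per_hour_cents": 20,
}


def clean_pasted_value(text):
    """Remove spaces, tabs, and newlines from both ends of a copied value."""
    return text.strip()


def settings_problems(settings):
    """Return one message for each setting that differs from the guide."""
    problems = []
    for key, expected in EXPECTED.items():
        actual = settings.get(key)
        if actual != expected:
            problems.append(f"{key}: expected {expected}, got {actual}")
    return problems


if __name__ == "__main__":
    # Placeholder URL, not the real startup script location.
    pasted = "  https://example.org/placeholder-startup-script \n"
    print(repr(clean_pasted_value(pasted)))

    # Example of the cost mismatch described in the steps above.
    my_settings = dict(EXPECTED, cost_per_hour_cents=40)
    for problem in settings_problems(my_settings):
        print(problem)
```

If the cost check reports a mismatch, that corresponds to the situation described above where the CPUs and memory need to be reselected.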

It will take some time for the RStudio Environment to be created. You can keep track of the status of the environment based on the colored dot next to the RStudio icon. The dot will turn green when the environment is ready. While it is loading (blue), you cannot interact with it.

- When the environment is ready, use the Open RStudio button that will pop up. You can also access RStudio through the Analyses tab. If you hold down Ctrl as you click, you can open RStudio in a new window.
1.4.2.6 Part 3 – Complete your first LearnR tutorial
- Use the file explorer in RStudio to navigate to your module of choice. Go to rnaseq > 1-intro-model-org
- In the module’s directory, open the .Rmd file by double clicking its name.
- Click Run Document in the top left area in the open .Rmd file to run it. Note that the Run Document button is different from the Run button!
1.4.2.7 Part 4 – Closing out a session on AnVIL
Delete your RStudio environment each time you finish running a module. Do not leave it running overnight, because it will continue to run up charges on your class workspace (note that these are paid by your instructor or university)!

- On the right side of the screen, click the Cloud Environment button. This is the cloud with the lightning bolt.
- Under the RStudio section, click Settings.
- Scroll to the bottom of the new window and click “Delete environment”.
- Select “Delete everything, including the persistent disk”. Otherwise, your instructor’s billing account will continue to incur costs for storage.
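
To see why deleting the environment matters, here is a rough Python sketch of the compute charges at the 20 cents per hour rate used in this guide. The durations are made-up examples, the function name is hypothetical, and the estimate leaves out persistent disk storage, whose price is not given here.

```python
# Rough estimate only. The rate is the one quoted in this guide for
# 4 CPUs and 15 gigabytes of memory; disk storage costs are not included.
RATE_CENTS_PER_HOUR = 20


def compute_cost_usd(hours, rate_cents_per_hour=RATE_CENTS_PER_HOUR):
    """Return the compute cost in US dollars for an environment left running."""
    return hours * rate_cents_per_hour / 100


if __name__ == "__main__":
    # Assumed example durations, chosen for illustration.
    examples = [
        ("2-hour work session", 2),
        ("left running overnight (12 hours)", 12),
        ("left running for a week (168 hours)", 168),
    ]
    for label, hours in examples:
        print(f"{label}: ${compute_cost_usd(hours):.2f}")
```

At this rate a single environment left on for 12 hours costs $2.40, and the total grows with every student who forgets to delete theirs.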