Optimizing data analysis ecosystems is an important frontier in data analytics. Access to compute resources with sufficient RAM and the technology needed to analyze genomic datasets is often a rate-limiting step in deploying analysis workflows. Furthermore, as datasets become larger and more complex, file sizes grow rapidly and storage costs rise with them. The increased availability of compute resources and the multiple storage tiers available on the cloud have the potential to democratize complex data analytic algorithms that require state-of-the-art technology, while reducing data storage costs by using archival storage for raw files.
The NIGMS has supported the creation of a series of self-guided data science training modules that use cloud compute resources and cover a range of data types and analytic procedures. Each module is built around a Jupyter notebook interface and trains users to conduct computational analyses of biological data using the Google Cloud Platform. These training materials are accessible through the NIGMS GitHub site, but running them on the Google Cloud Platform costs money. The NIGMS and NH-INBRE are providing cloud credits to 10 people per cohort, which will give access to all NIGMS data science training modules for 6 weeks at a time. The cohort dates are as follows:
Cohort | Sign up by | Cohort starts | Credits expire |
Cohort 2 | August 14 | August 15 | September 29 |
Cohort 3 | September 29 | September 30 | November 13 |
Here is a list of the training modules available:
- Fundamentals of Bioinformatics – Dartmouth College
- DNA Methylation Sequencing Analysis with WGBS – University of Hawaii at Manoa
- Transcriptome Assembly Refinement and Applications – MDI Biological Laboratory
- RNAseq Differential Expression Analysis – University of Maine
- Proteome Quantification – University of Arkansas for Medical Sciences
- ATAC-Seq and Single Cell ATAC-Seq Analysis – University of Nebraska Medical Center
- Consensus Pathway Analysis in the Cloud – University of Nevada Reno
- Integrating Multi-Omics Datasets – University of North Dakota
- Metagenomics Analysis of Biofilm-Microbiome – University of South Dakota
- Introduction to Data Science for Biology – San Francisco State University
- Analysis of Biomedical Data for Biomarker Discovery – University of Rhode Island
- Biomedical Imaging Analysis using AI/ML approaches – University of Arkansas
To sign up for access to these modules, please contact Shannon Soucy (Shannon.Soucy@dartmouth.edu) and indicate which cohort you would like to join.