The biomedical informatics core of the CTSC is offering the NextflowWorkbench language to help write data analysis pipelines/workflows. NextflowWorkbench takes advantage of the Nextflow middleware and makes it possible for beginners in bioinformatics to quickly assemble efficient, parallel, portable, and reproducible workflows.
Developed workflows are portable. They can run on personal computers as well as on compute clusters or commercial clouds.
The training session (1.5 hours) introduces the development of workflows with NextflowWorkbench. In this session, trainees will create a workflow for analyzing RNA-Seq data that includes the following steps:
- Download read files from the Sequence Read Archive (SRA)
- Estimate quality control measurements (with FastQC)
- Estimate counts against the human transcriptome (with Kallisto and an Ensembl Transcript sequence database)
- Combine these counts into one matrix, a prerequisite for using them in differential expression analysis (e.g., with the MetaR Limma Voom analysis protocol)
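To give a sense of what the last step involves, here is a minimal Python sketch of combining per-sample counts into one matrix. It is an illustration only and not the code used in the training workflow. It assumes each sample's counts arrive as Kallisto-style tab-separated text with `target_id` and `est_counts` columns. The function names, sample names, transcript identifiers, and the choice to fill missing transcripts with 0.0 are all hypothetical.

```python
import csv
import io


def read_est_counts(tsv_text):
    """Parse Kallisto-style abundance text into {target_id: est_counts}."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["target_id"]: float(row["est_counts"]) for row in reader}


def combine_counts(samples):
    """Merge per-sample counts into (transcript_ids, sample_names, matrix).

    `samples` maps a sample name to its abundance text. Rows are transcripts,
    columns are samples. A transcript absent from a sample gets 0.0
    (an assumption made for this sketch).
    """
    per_sample = {name: read_est_counts(text) for name, text in samples.items()}
    sample_names = sorted(per_sample)
    transcript_ids = sorted({t for counts in per_sample.values() for t in counts})
    matrix = [
        [per_sample[s].get(t, 0.0) for s in sample_names]
        for t in transcript_ids
    ]
    return transcript_ids, sample_names, matrix


# Made-up input for illustration; identifiers and values are not real data.
HEADER = "target_id\tlength\teff_length\test_counts\ttpm\n"
EXAMPLE = {
    "sampleA": HEADER + "ENST0001\t1000\t850\t12\t3.5\n" + "ENST0002\t500\t350\t0\t0\n",
    "sampleB": HEADER + "ENST0001\t1000\t850\t7.5\t2.1\n" + "ENST0003\t800\t650\t4\t1.2\n",
}

if __name__ == "__main__":
    ids, names, matrix = combine_counts(EXAMPLE)
    print("transcript\t" + "\t".join(names))
    for transcript, row in zip(ids, matrix):
        print(transcript + "\t" + "\t".join(str(value) for value in row))
```

The resulting table has one row per transcript and one column per sample, which is the layout that differential expression tools typically expect as input.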
Here is a diagram of the pipeline that we will develop during the training:
NextflowWorkbench is part of the Data Analysis Workbench and is being developed to facilitate data analysis for biomedical scientists with minimal computational skills. The software is fully functional, open-source and provided free of charge.
The software runs as a desktop application with an interactive user interface on Mac OS X (10.8.3+) and Linux (with support for Docker).
Training and assistance in the use of the software are offered to investigators who hold an appointment in one of our CTSC institutions (i.e., Weill Cornell, MSKCC, HSS, and Hunter College).
See http://workflow.campagnelab.org for software and video tutorials.
Users interested in learning how to use the software are encouraged to attend one of the monthly training sessions. Training sessions are held on select Tuesdays at 10:30 AM.
The sessions are limited to a small number of participants, and pre-registration is required. Please use the registration form (http://goo.gl/forms/tW13LBjjkr) to reserve a seat. The first training session will be held on February 2, 2016.
Some knowledge of the Linux/UNIX command line is needed to develop new workflows, but the training does not require strong proficiency. Some familiarity with programming may also be beneficial. Trainees are asked to follow the installation instructions precisely and to download and install the software on their laptops before attending the training session.
This software is provided by the Biomedical Informatics Core of the Clinical and Translational Science Center and by the Campagne laboratory. If you have any questions or comments, please contact Dr. Fabien Campagne at 646-962-5613.