QIIME 1.5.0 is live!

9 05 2012

Hello QIIME users,
We’re very excited to announce the 1.5.0 release of QIIME, which is available for download here. As always, you can find the latest QIIME AMI ID here, and we’ll be releasing the new VirtualBox images in one week. This release is packed with way too many exciting new features to mention all of them here, but here are some of the ones we’re most excited about.

* The biggest change in this release of QIIME is the switch to the BIOM format for representing OTU tables on disk and the biom-format objects for representing OTU tables in memory. You can find a discussion of the motivations for the switch here, but briefly it will support interoperability of related tools (e.g., QIIME, MG-RAST, mothur, and VAMPS), it provides a more efficient representation of sparse matrix data than tab-separated text, and it allows for storage of OTU counts, OTU metadata (e.g., taxonomy), and sample metadata (e.g., environmental parameters) in a single file. A manuscript describing the BIOM format is currently in press at GigaScience. You can find information about converting between BIOM-formatted and “classic”-formatted OTU tables here.

* Our AWS AMI now supports use with StarCluster and the IPython Notebook. StarCluster provides an extremely convenient way to boot virtual clusters on the Amazon Cloud, and we think it will be key toward making very large analyses (e.g., based on several Illumina runs) accessible to groups without large compute clusters. Using StarCluster you can now easily run your QIIME analyses across multiple AWS instances: for example, you can boot 20 eight-processor instances to create a virtual cluster with 160 processors. The IPython Notebook provides a web-based interface for developing API and/or command line based workflows. These are easy to share with others as .ipynb files, or to publish with your journal articles. Using the IPython Notebook with the QIIME AWS images enables truly reproducible computation. You can find information on how to use these new features here.

* We’ve added a number of new statistical approaches via the compare_distance_matrices.py and compare_categories.py scripts. These include Adonis, Anosim, BEST, Moran’s I, MRPP, PERMANOVA, PERMDISP, RDA, Partial Mantel, and Mantel Correlogram. Two new tutorials illustrate how and when to use these methods – you can find these here and here. This code was all developed for an undergraduate Computer Science capstone project at Northern Arizona University – their project website is here.

* We’ve added support for the RTAX method for performing taxonomy assignment in assign_taxonomy.py. RTAX is specifically designed for assigning taxonomy to paired-end reads, but additionally works on single-end reads. You can find a paper on RTAX here, and a tutorial describing how to use this new code here.

* Along with the switch to BIOM format for OTU tables, we’ve updated and cleaned up the interfaces, usage examples, and help text associated with many of the scripts in QIIME. Notable examples are the replacement of filter_otu_table.py with filter_otus_from_otu_table.py, and the replacement of filter_by_metadata.py with filter_samples_from_otu_table.py.

* Support for inserting sequences into trees has been added via the new insert_seqs_into_tree.py script. This wraps the pplacer, RAxML, and ParsInsert applications.

* We’ve added pick_subsampled_reference_otus_through_otu_tables.py, a more efficient open-reference OTU picking workflow script for processing very large Illumina (or other) data sets. It is being used to process the Earth Microbiome Project data, so it is designed to scale to tens of HiSeq runs. A new tutorial has been added that describes this process.

* The check_id_map.py code was completely refactored. It now creates HTML output to display the locations of errors and warnings in the mapping file, so it should provide a very convenient way to detect errors in your metadata mapping files.

* Added the start_parallel_jobs_sc.py script to support parallel jobs on SGE queueing systems, which is the default queueing system on StarCluster. This has only been tested on StarCluster at this point (hence ‘sc’ in the name), but we expect that it will work on other systems using SGE.
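
To make the sparse-representation point from the BIOM item above a bit more concrete, here is a minimal sketch in plain Python. It is not the biom-format API: the function name dense_to_sparse_table is hypothetical, and the dictionary it builds carries only a simplified subset of the fields in a BIOM 1.0 file (several fields that the real format requires are left out). It just illustrates how OTU counts, OTU metadata, and sample metadata can live together in one structure where only the non-zero cells are stored.

```python
import json


def dense_to_sparse_table(otu_ids, sample_ids, counts, taxonomy=None):
    """Build a simplified, BIOM-1.0-style dict from a dense count matrix.

    counts is a list of rows (one row per OTU, one column per sample).
    Only non-zero cells are kept, as [row_index, column_index, value].
    NOTE: hypothetical helper for illustration, not part of biom-format.
    """
    taxonomy = taxonomy or {}
    data = [
        [r, c, value]
        for r, row in enumerate(counts)
        for c, value in enumerate(row)
        if value != 0
    ]
    rows = []
    for otu in otu_ids:
        # OTU metadata (e.g., taxonomy) is stored next to the OTU identifier.
        meta = {"taxonomy": taxonomy[otu]} if otu in taxonomy else None
        rows.append({"id": otu, "metadata": meta})
    return {
        "format": "Biological Observation Matrix 1.0.0",
        "type": "OTU table",
        "matrix_type": "sparse",
        "matrix_element_type": "int",
        "shape": [len(otu_ids), len(sample_ids)],
        "rows": rows,
        "columns": [{"id": s, "metadata": None} for s in sample_ids],
        "data": data,
    }


if __name__ == "__main__":
    table = dense_to_sparse_table(
        ["OTU1", "OTU2", "OTU3"],
        ["S1", "S2", "S3"],
        [[0, 5, 0], [0, 0, 0], [2, 0, 1]],
        taxonomy={"OTU1": ["k__Bacteria"]},
    )
    # Serialize to JSON text, the on-disk encoding used by BIOM 1.0.
    print(json.dumps(table, indent=2))
```

In this toy table only three of the nine cells are non-zero, so only three triples are written. For real data, use the conversion tools referenced in the BIOM item above rather than a hand-rolled writer like this one.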

QIIME releases are massive collaborative efforts. Thanks to all of the developers for their hard work in making this release happen, and to our users for the suggestions, support, feature requests and bug reports. A lot of the QIIME developers will be at ISME this summer, so come find us and say hello!

Enjoy QIIME 1.5.0!

Greg





Announcing the ISME 14 Bioinformatics Workshop: Using QIIME and MG-RAST to study microbial communities.

6 03 2012

Core members of the QIIME and MG-RAST development groups will present a one-day workshop on 18 August, 2012, the day before the ISME meeting begins, at Copenhagen University. If you’d like to interact with QIIME and MG-RAST developers to learn these tools in a hands-on setting, this will be a great opportunity. You can find details here:

http://www.qiime.org/workshops/isme14_bioinformatics.pdf

Hope to see you there!

Greg





QIIME 1.4.0 is live!

14 12 2011

We’re very excited to announce the 1.4.0 release of QIIME. You can find the new version here. We’ll be posting the new EC2 images (release and development versions) tomorrow, and you’ll be able to find the AMI identifiers on the “Resources” page of the new QIIME website.

This release contains a fix for the make_distance_histograms.py bug that we announced last week. Again, we’re sorry for any inconvenience that that may have caused. This release is additionally packed with a lot of new features – some key ones are:

* A lot of new tutorials including retraining of the RDP classifier, working with Amazon Web Services, coverage of basic unix/linux commands, and others.

* Addition of the OTUPIPE workflow for chimera detection, quality filtering, and OTU picking. This is now available via the pick_otus.py module, and will require you to install the usearch software (even in EC2 and VirtualBox, due to licensing restrictions).

* Addition of code to support plotting comparisons of raw distance data in QIIME. This is available in the new scripts make_distance_comparison_plots.py and make_distance_boxplots.py, and covered in a new tutorial which includes some examples of the plots that can be generated.

* Added new script nmds.py to support Non-Metric Multidimensional Scaling analysis.

* Support in the pick_otus_through_otu_table.py script for running uclust_ref in parallel with creation of new clusters (i.e., open-reference OTU picking with the reference step running in parallel and the de novo step running serially).

* assign_taxonomy_reference_seqs_fp and assign_taxonomy_id_to_taxonomy_fp are new qiime_config values, allowing users to set defaults for the dataset they’d like to perform taxonomy assignment against. This works for the serial and parallel versions of assign_taxonomy for both BLAST and RDP.

* Added option (-e/--max_rare_depth) to the command line of alpha_rarefaction.py. This provides a convenient way for users to specify the maximum rarefaction depth on the command line, and is useful for when it needs to be set to something other than the median rarefaction depth. Also added option to control minimum rarefaction depth from the alpha_rarefaction.py command line.

* Added support for 5- and 10-fold and leave-one-out cross-validation to supervised_learning.py.

* Added subsample_fasta.py module for randomly subsampling fasta files.
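
As a rough illustration of what randomly subsampling a fasta file involves (see the subsample_fasta.py item above), here is a minimal sketch. It is not the code in subsample_fasta.py, and its sampling strategy is an assumption: each record is kept independently with probability `fraction`, so the number of records returned varies from run to run unless a seed is fixed. The function names and parameters are hypothetical.

```python
import random


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample_fasta(lines, fraction, seed=None):
    """Keep each record with probability `fraction` (assumed strategy)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    return [
        (header, seq)
        for header, seq in parse_fasta(lines)
        if rng.random() < fraction
    ]


if __name__ == "__main__":
    example = [">seq1", "ACGT", "AC", ">seq2", "GGTA", ">seq3", "TTTT"]
    for header, seq in subsample_fasta(example, 0.5, seed=0):
        print(">%s\n%s" % (header, seq))
```

Passing a seed makes a subsample repeatable, which is handy if you want to share exactly the same reduced data set with a collaborator.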

Plus lots of additional new features: the list continues in the ChangeLog for this release.

Thanks, and have fun!

Greg





Bug identified in make_distance_histograms.py affecting all versions of QIIME

3 12 2011

Hi all,

We discovered a serious bug last night in the make_distance_histograms.py script. This bug results in incorrect t statistics and p values being generated under some circumstances. We’re still investigating this now, and we’re working on a fix.

This bug only affects the output from the make_distance_histograms.py script. If you’re using this script through the beta_diversity_through_plots.py workflow, these will be the files in the “<metric>_histograms” directory where <metric> refers to the beta diversity metric that you’re using. It does look like this bug affects all versions of QIIME.

We’re very sorry for any inconvenience this has caused. We take these issues very seriously, and we’d like to thank the QIIME Forum user “Poom” for bringing this to our attention. We’re going to follow up in the QIIME Forum with additional details on this bug. Our plan for getting a fix to users is to bump up the priority of the QIIME 1.4.0 release, which will incorporate this fix (as well as a lot of new features). Since we are already close to a release, we decided that this will be a more efficient solution than patching the 1.3.0 release. We’ll keep you posted with details on QIIME 1.4.0, but we expect that this will come out within the next two weeks.

Greg





New QIIME homepage

13 10 2011

Hello all,

We posted a new home page at www.qiime.org today. This is in response to users who told us that there are too many QIIME sites to keep track of (the documentation, the blog, the forums, our YouTube help videos page). All of these independent sites are still in the same locations, but you can now easily access them all (plus some new stuff that will be going up soon) from the home page at www.qiime.org. The previous homepage is now the top-level of the QIIME documentation. Enjoy!

Greg





QIIME 1.3.0 is live!

29 06 2011

We’re proud to announce the release of QIIME 1.3.0 today! Some important things to know about the upgrade are listed on the new “Upgrading …” page. Go here for instructions on how to upgrade existing installations of the QIIME VB. For AWS users, the QIIME 1.3.0 EC2 image identifier is ami-dce51eb5.

This is a major release of QIIME, packed with too many new features to cover them all here. Some highlights are:

* Added support for the RDP Classifier version 2.2 in response to several feature requests. QIIME now supports RDP 2.0 and RDP 2.2.

* Overhauled support for Illumina data in QIIME with a new script, split_libraries_fastq.py. fastq is now the default format for incoming Illumina data to QIIME, and several scripts have been provided to convert other file formats to fastq (process_qseq.py and process_iseq.py). Check out the “Processing Illumina Data” tutorial for an overview of how to work with Illumina data in QIIME.

* Full integration of the QIIME Denoiser (Reeder and Knight, 2010) into QIIME: Denoiser is no longer a stand-alone package. Some enhancements were made to the Denoiser in the process, including check-pointing for failure recovery.

* Added support for AmpliconNoise via the ampliconnoise.py script. We’ve also dropped support for PyroNoise as AmpliconNoise is the successor to that package.

* Added the core_qiime_analyses.py script. This plugs together many components of QIIME (split_libraries.py, pick_otus_through_otu_table.py, beta_diversity_through_plots.py, alpha_rarefaction.py, and several others) in an effort to combine the core functionality in a single command. The output of this script includes an HTML page that serves as an index into the results, to facilitate sharing of results. This script should be considered to be in BETA status. One known issue is that there is no failure recovery – if an analysis fails partway through, it’s not possible to continue where the failed run left off, possibly resulting in wasted compute time.

* Parameter files have been made optional for all workflow scripts. This greatly simplifies the interface for the workflow scripts when users want to work with default parameters.

* Added the plot_taxa_summary.py workflow script, which includes (optionally) summarizing the OTU table by category, then summarizing taxa for an OTU table, and generating area, bar, and pie charts.

* A lot of updated and new documentation: the QIIME Overview tutorial, the Illumina data processing tutorial, and the Denoiser tutorial have all been overhauled, and a new tutorial has been added that covers processing 18S data with QIIME. A lot of minor documentation changes have been made throughout the code base.

* The beta_diversity_through_3d_plots.py script has been renamed to beta_diversity_through_plots.py to reflect additional functionality. It now also generates 2d plots and distance histograms, and any of the plots can be disabled by passing the options --suppress_distance_histograms, --suppress_2d_plots, and --suppress_3d_plots.

* Added inflate_denoiser_output.py script to simplify the integration of denoiser results into the QIIME pipeline. See the Denoiser tutorial for details. To reduce the number of possible pathways through QIIME with denoising (which were difficult to support), support for denoising was removed from pick_otus_through_otu_table.py in favor of working with the pipeline presented in the tutorial.

* Reorganized and renamed output from the workflow scripts for clarity. You’ll notice changes in your results from pick_otus_through_otu_tables.py, pick_reference_otus_through_otu_tables.py, and beta_diversity_through_plots.py. The same files are still created, but with less complex naming of files and a simpler directory structure.

* Added plot_semivariogram.py to plot semivariograms using two distance matrices.

* Added filter_tree.py which prunes a list of tips from a tree. Lists of tips can be provided in similar ways as to filter_fasta.py.

* Added make_tep.py, which creates a TopiaryExplorer project file (.tep) from an otu table, a metadata mapping file, and a tree.

* We’ve posted help videos for getting setup with the QIIME VirtualBox and EC2 which you can find here.

Check out the ChangeLog for a more comprehensive list of the changes in QIIME 1.3.0.

A few additional notes:

* For QIIME VB users: we now recommend making at least 2GB of RAM available to your VirtualBox.

* We’d like to thank Paul Marshall for his work on the app_deploy.py script. This is a very useful tool for installing/upgrading QIIME (and other software) in Linux environments, and was used for building the QIIME 1.3.0 VirtualBox and EC2 instances. It is available for our users to upgrade/install QIIME, but should be considered to be in BETA status – this has been tested in the QIIME VirtualBox, but is not recommended for use on production systems at this stage.

* Due to popular demand, we’ll be adding a page soon that lists courses and workshops that will cover QIIME. We’ll post to the blog when that page goes live.

Thanks to all of the QIIME developers for the hard work that went into QIIME 1.3.0, and to our users for all the feedback that helps us improve QIIME. As always, get in touch on the forums if you run into questions. Have fun!

-Greg





QIIME 1.2.1 is live!

23 02 2011

We are happy to announce a minor QIIME release today: QIIME 1.2.1.

Some of the notable new features are:

  • In response to NCBI’s announcement to drop support for SRA, we no longer support submission of data to the SRA using QIIME. We now support submission of data to MG-RAST using the new submit_to_mgrast.py script.
  • The make_pie_charts.py script has been replaced with a new script, plot_taxa_summary.py, which in addition to creating pie charts will also create area charts and bar charts. These new plots are particularly useful for looking at how taxonomy changes across time/space gradients. We’ve created a new taxa summary tutorial showing how to use this new script. You can find some example bar plots there.
  • Added binary_otu_gain as a new beta diversity metric to compute non-phylogenetic gain (G), or the number of new OTUs in a sample (or samples) with respect to another sample (or samples). This complements the phylogenetic variant of this metric, unifrac_g.
  • Added a reference-based OTU picking workflow script, pick_reference_otus_through_otu_table.py, which performs strict reference-based OTU picking where a pre-existing tree and taxonomy will be used (allowing users to bypass the slow steps where these are created in the pick_otus_through_otu_table.py workflow). This can also be used for applying the Shotgun UniFrac pipeline.
  • Changed defaults for uclust and uclust_ref OTU pickers, as described in this blog post.
  • Added support for generating inVUE plots in make_3d_plots.py.
  • Changed the method for p-value calculation in Procrustes analysis Monte Carlo in response to SF bug # 3189200.
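
To illustrate the idea behind the binary_otu_gain item above, here is a minimal sketch that counts the OTUs observed in one sample but absent from another. It follows the description in this post rather than QIIME's source code. The function names are hypothetical, and reporting a raw count (rather than a normalized value) is an assumption.

```python
def otu_gain(sample_a, sample_b):
    """Count OTUs present in sample_a (count > 0) but absent from sample_b.

    Both arguments are lists of counts over the same OTUs, in the same order.
    NOTE: hypothetical helper for illustration, not QIIME's implementation.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same OTUs")
    return sum(1 for a, b in zip(sample_a, sample_b) if a > 0 and b == 0)


def otu_gain_matrix(samples):
    """Pairwise gain for a list of samples; entry [i][j] is gain of i over j."""
    return [[otu_gain(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    s1 = [3, 4, 1, 0]  # counts for four OTUs in sample 1
    s2 = [0, 0, 1, 5]  # counts for the same four OTUs in sample 2
    print(otu_gain_matrix([s1, s2]))
```

Note that the result is not symmetric: in the toy data, sample 1 has two OTUs that sample 2 lacks, while sample 2 has only one OTU that sample 1 lacks. Only presence or absence matters here, which is why the metric is called binary.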

As of QIIME 1.2.1, we moved the EC2 image out of beta testing status, and have now released it as an EBS-backed image, which means that it is possible to save the state of the machine and (manually) pause it to save money when not actively running jobs. We encourage users to try this out, as our initial uses of this have been very successful. This is a great way to get a lot of compute power, relatively cheaply, for your QIIME runs. Information on running QIIME on EC2 is here. You can find the official image by searching for AMI ami-0c12e165. After booting the image in your own Amazon account, you’ll then log in as user ‘ubuntu’.

Updated versions of the QIIME VirtualBox and the EC2 image are now available, as is a new version of the QIIME VirtualBox update script. We recommend updating to the latest version of the Oracle VirtualBox software as they seem to have fixed some issues that were present in the previous versions.

Also note that we’ll be maintaining links to the most recent versions of dependencies, as well as our most recent build of the Greengenes reference OTUs, in the top right corner of the blog homepage, so you can always find that information there.

Have fun!

Greg







