Ergatis is a web-based utility for creating, running, and monitoring reusable computational analysis pipelines. It contains pre-built components for common bioinformatics analysis tasks, which can be arranged graphically to form highly configurable pipelines. Each analysis component supports multiple output formats, including the Bioinformatic Sequence Markup Language (BSML). The current implementation also supports loading data into project databases that follow the Chado schema, a highly normalized, community-supported schema for storing biological annotation data.
Ergatis uses the Workflow engine to process its work on a compute grid. Workflow provides an XML language and a processing engine for specifying the steps of a computational pipeline. It reports detailed execution status and logging for process auditing, supports error recovery from the point of failure, and scales to distributed computing environments. Its XML format allows commands to be run serially, in parallel, or in any combination of the two at any level of nesting.
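To make the idea of nested serial and parallel execution concrete, here is a minimal Python sketch. It is not Workflow's XML schema or its engine: the `Command` and `CommandSet` classes, the `run` function, and the step names are all hypothetical, chosen only to show how serial and parallel groups can be nested to any depth. It uses threads on a single machine rather than a compute grid.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Command:
    """A single unit of work (hypothetical stand-in for one pipeline command)."""
    name: str
    action: Callable[[], None]


@dataclass
class CommandSet:
    """A group of steps that run one after another or concurrently."""
    mode: str  # "serial" or "parallel"
    steps: List[Union["CommandSet", Command]] = field(default_factory=list)


def run(step):
    """Run a command or, recursively, a nested command set."""
    if isinstance(step, Command):
        step.action()
    elif step.mode == "serial":
        # Each child finishes before the next one starts.
        for child in step.steps:
            run(child)
    elif step.mode == "parallel":
        # Each child gets its own thread; leaving the block waits for all of them.
        with ThreadPoolExecutor(max_workers=max(1, len(step.steps))) as pool:
            futures = [pool.submit(run, child) for child in step.steps]
            for future in futures:
                future.result()  # re-raise any failure from a child
    else:
        raise ValueError(f"unknown mode: {step.mode!r}")


log = []


def record(name):
    """Build a command that only records its own name (illustrative only)."""
    return Command(name, lambda: log.append(name))


# Serial outer pipeline with a parallel stage made of two serial branches.
pipeline = CommandSet("serial", [
    record("split_input"),
    CommandSet("parallel", [
        CommandSet("serial", [record("search_a"), record("parse_a")]),
        CommandSet("serial", [record("search_b"), record("parse_b")]),
    ]),
    record("merge_results"),
])

run(pipeline)
```

In this sketch the two branches may interleave with each other, but each branch keeps its internal order, and `merge_results` runs only after both branches have finished.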
This framework has been used to annotate several large eukaryotic genomes, including those of Aedes aegypti and Trichomonas vaginalis.
CloVR is a virtual appliance for automated sequence analysis that supports cloud computing platforms, including Amazon EC2. The CloVR VM includes a dynamically scalable Ergatis installation with Sun Grid Engine.
Orvis J, Crabtree J, Galens K, Gussman A, Inman JM, Lee E, Nampally S, Riley D, Sundaram JP, Felix V, Whitty B, Mahurkar A, Wortman J, White O, Angiuoli SV. Ergatis: A web interface and scalable software system for bioinformatics workflows. Bioinformatics. 2010 Jun 15;26(12).