Australian mirror of the UCSC Genome browser

The UCSC genome browser lets users visualise and query the genomic annotations of over 50 species. Our team at QFAB, with the assistance of project partners from the University of Queensland, Griffith University, Queensland University of Technology and CSIRO Livestock Industries, has implemented a full mirror of the UCSC genome browser that is now available for use by the Australian research community.

The public QFAB mirror of the UCSC genome browser provides:

  • An alternative access point for Australian researchers
  • A convenient location for fast bulk downloads of large genomic datasets, which can otherwise be problematic over slow international network connections
  • Research groups with the ability to easily add their own annotation tracks to the browser, where they can be integrated with the existing public data tracks


Hosting a full mirror and keeping it updated as new data is released requires a significant commitment of both time and resources. The QFAB system has been designed to give each project partner all of the benefits of a private and secure mirror, while taking advantage of the economies of scale that come from co-locating all of the mirrors and sharing resources.

In addition to the data visualisation and querying tools available through the web interface of the UCSC genome browser, the underlying databases and framework are a useful environment for computational analysis of genomic data. Project partners have access to a dedicated computational cluster that calls on the genome browser database and associated utilities.

How it works


The system runs on a dedicated 10-node cluster (4 x 2.6 GHz dual-core Opteron CPUs and 32 GB of memory per node) hosted by the Institute for Molecular Bioscience at the University of Queensland. The cluster runs VMware ESX virtualisation, allowing the computational resources of the physical nodes to be efficiently divided into as many virtual machines as required.

A full mirror of the genome browser database and files currently requires approximately 3.5 TB of storage space. We maintain two independent copies of this data: a live copy that is in use by the mirrors, and a second off-line copy that is updated in the background to the latest UCSC data releases. When a full update is complete, the new data is mounted for use by the mirrors and the update cycle begins again on the other copy.
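The two-copy update cycle described above can be sketched in a few lines of Python. This is only an illustration, not the QFAB implementation: the text says the new data is mounted for the mirrors, whereas this sketch stands in an atomic symlink swap for the mount step. The directory names `copy_a`, `copy_b` and `live`, the `RELEASE` marker file, and all function names are hypothetical.

```python
import os
from pathlib import Path


def offline_copy(root: Path) -> Path:
    """Return the data copy that the 'live' symlink does NOT point to."""
    live_target = (root / "live").resolve()
    copies = [root / "copy_a", root / "copy_b"]  # hypothetical directory names
    return next(c for c in copies if c.resolve() != live_target)


def promote(root: Path, new_live: Path) -> None:
    """Repoint 'live' at new_live by renaming a fresh symlink over the old one."""
    tmp = root / "live.tmp"
    if tmp.is_symlink():
        tmp.unlink()
    tmp.symlink_to(new_live.name)      # relative link inside root
    os.replace(tmp, root / "live")     # rename replaces the old symlink itself


def run_update_cycle(root: Path, release: str) -> Path:
    """Update the off-line copy, then make it the live copy."""
    target = offline_copy(root)
    # Placeholder for the real background sync of UCSC data (assumption):
    # here we only record which release the copy holds.
    (target / "RELEASE").write_text(release)
    promote(root, target)
    return target
```

After each call to `run_update_cycle`, the previously live copy becomes the off-line copy, so the next cycle updates it in turn while the mirrors keep reading from `live`.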

Each mirror runs from an independent virtual machine (4 CPUs, 16 GB of memory) that can access UCSC data via a shared read-only file system, while user tracks are stored on a locally attached 1 TB file system. Partners have HTTP and SSH access to their mirror via a secure VPN connection. In addition to the processing power available on each head node, partners also have access to a virtual 38-node Linux cluster (1 x 2.6 GHz CPU and 4 GB of RAM per node) that utilises the remaining computing resources not in use by the head nodes.
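As a rough check on how these figures fit together, the sketch below adds up the physical capacity and the virtual machine demand. The hardware numbers come from the text; everything else is an assumption made for illustration: one virtual CPU is counted as one physical core, hypervisor overhead is ignored, and the number of mirrors is a hypothetical parameter because the text does not state it.

```python
# Figures stated in the text.
NODES = 10
CPUS_PER_NODE = 4
CORES_PER_CPU = 2           # dual-core Opterons
MEM_PER_NODE_GB = 32
MIRROR_VM = (4, 16)         # (CPUs, GB of memory) per mirror head node
COMPUTE_VM = (1, 4)         # (CPUs, GB of memory) per virtual cluster node
COMPUTE_VMS = 38


def physical_capacity():
    """Total (cores, GB of memory) across the physical cluster."""
    cores = NODES * CPUS_PER_NODE * CORES_PER_CPU
    mem_gb = NODES * MEM_PER_NODE_GB
    return cores, mem_gb


def vm_demand(mirrors):
    """(cores, GB) requested by `mirrors` head nodes plus the compute cluster.

    Assumes one virtual CPU maps to one physical core (not stated in the text).
    """
    cores = mirrors * MIRROR_VM[0] + COMPUTE_VMS * COMPUTE_VM[0]
    mem_gb = mirrors * MIRROR_VM[1] + COMPUTE_VMS * COMPUTE_VM[1]
    return cores, mem_gb


def headroom(mirrors):
    """Capacity left over after the VMs, ignoring hypervisor overhead."""
    cap_cores, cap_mem = physical_capacity()
    used_cores, used_mem = vm_demand(mirrors)
    return cap_cores - used_cores, cap_mem - used_mem


print(physical_capacity())  # (80, 320): 80 cores and 320 GB in total
```

Calling `headroom(n)` for a candidate mirror count `n` shows whether that many head nodes and the 38-node virtual cluster fit on the hardware under these assumptions; a negative value would mean the resources are oversubscribed.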

 
