EMBL

Last update: 09 Feb, 2005

Bioinformatics

One major advantage of our approach is that our starting material is not a random cDNA library but a collection of pre-sequenced cDNAs. We therefore have access to the ORF sequence before we carry out the cloning and localisation experiments, and can use bioinformatic tools to make predictions regarding the localisation and function of the encoded protein.

To do this rapidly for our extensive ORF collection, we have written software, termed Harvester, that automatically gathers this information from a number of independent bioinformatic resources. Harvester passes an accession number (or protein sequence) to the servers shown below and saves the output (text or screenshot) on a single local HTML page, one HTML page per sequence. This provides the convenience of instant access (no waiting times) and allows us to compare the predictions of the various bioinformatic tools at a glance. Harvester re-gathers this information approximately every three weeks to ensure that it is up to date.
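The gather-and-save step can be illustrated with a minimal Python sketch. This is not Harvester's actual code: the names `Fetcher`, `build_page` and `harvest` are hypothetical, and each bioinformatic resource is represented by a plain function that takes an accession number and returns text, so that no real server interface is assumed.

```python
import html
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for one remote resource: accession number in, text out.
Fetcher = Callable[[str], str]


def build_page(accession: str, fetchers: Dict[str, Fetcher]) -> str:
    """Query every resource for one accession and merge the output into one HTML page."""
    sections = []
    for name, fetch in fetchers.items():
        try:
            output = fetch(accession)
        except Exception as exc:
            # A single failing resource should not discard the other results.
            output = f"(no result: {exc})"
        sections.append(
            f"<h2>{html.escape(name)}</h2>\n<pre>{html.escape(output)}</pre>"
        )
    body = "\n".join(sections)
    title = html.escape(accession)
    return (
        f"<html><head><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n{body}\n</body></html>"
    )


def harvest(
    accessions: Iterable[str], fetchers: Dict[str, Fetcher], out_dir: Path
) -> List[Path]:
    """Write one local HTML page per accession and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for accession in accessions:
        path = out_dir / f"{accession}.html"
        path.write_text(build_page(accession, fetchers), encoding="utf-8")
        written.append(path)
    return written
```

Under these assumptions, calling `harvest(accessions, fetchers, Path("pages"))` on a schedule (for example every three weeks) would refresh the local pages. Viewing a page then requires no remote queries, which is the "no waiting times" benefit described above.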

On every page that shows the localisation information for a GFP-ORF, a link (Swissprot ID) is provided to the corresponding Harvester bioinformatics page. If this link is not active, we are still awaiting a Swissprot ID for the corresponding cDNA sequence.

Harvester collects bioinformatic data from a number of sources