Last update: 09 Feb, 2005


The GFP-cDNA localisation project is a central platform for a number of associated projects that aim to identify and characterise the protein products of large numbers of newly identified human open reading frames (ORFs). Although subcellular localisation is one important piece of information, the results must fit into the context of large-scale cloning, bioinformatic analysis and functional assays. These components are summarised below, and are described in greater detail on the linked pages.

1. Large-scale cloning

Gateway Cloning Strategy

2. Localisation

Examples of Subcellular Localisations

3. Bioinformatics

Harvester Bioinformatics

4. Functional assays

Functional Assays

The ORFs of novel cDNAs identified by projects such as those of the German cDNA Consortium and the Mammalian Gene Collection are PCR-amplified using primers that add specific recombination sites to the 5' and 3' ends. Recombination cloning using Gateway (Invitrogen) or Creator (BD Biosciences) allows the ORFs to be shuttled rapidly and conveniently between functionally useful vectors without the need for conventional restriction cloning. Our strategy places each ORF into GFP fusion vectors (cyan [CFP] or yellow [YFP] spectral variants) as both N-terminal and C-terminal fusions. These constructs are then transfected into cells and the subcellular localisations of the fusion proteins are recorded.
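The primer step can be illustrated with a minimal sketch. It assumes the attB1 and attB2 tail sequences commonly quoted for Gateway cloning, a fixed 20-nucleotide gene-specific region, and removal of the stop codon so that C-terminal fusions read through. The function names and defaults are hypothetical, and any spacer bases needed to keep the fusion in frame in a particular vector are not handled; this is not the project's actual primer design procedure.

```python
# Sketch only: attB-tailed primers for recombination cloning of an ORF.
# ASSUMPTION: these are the commonly quoted Gateway attB tails; check them
# against the vendor manual before ordering any oligonucleotides.
ATTB1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCT"  # 5' tail of the forward primer
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCTGGGT"  # 5' tail of the reverse primer

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, anneal_len: int = 20, keep_stop: bool = False):
    """Return (forward, reverse) primers carrying attB tails.

    anneal_len is a hypothetical fixed length for the gene-specific part;
    real designs usually tune it to reach a target melting temperature.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    # Drop the stop codon so a C-terminal tag can be translated in fusion.
    if not keep_stop and orf[-3:] in STOP_CODONS:
        orf = orf[:-3]
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing region")
    forward = ATTB1 + orf[:anneal_len]
    reverse = ATTB2 + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```

For example, `design_primers("ATG" + "GCT" * 10 + "TAA")` returns a forward primer that ends with the first 20 nucleotides of the ORF and a reverse primer whose gene-specific part is the reverse complement of the last 20 nucleotides before the stop codon.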

The first step in the collection of functional information about the novel proteins is the determination of their subcellular localisation. For every ORF this is performed with N- and C-terminally tagged constructs. Expression clones are individually transfected into Vero cells and the subcellular localisations of the expressed proteins are recorded from the living cells at various time points. Results from the N- and C-terminal fusions are assessed and in turn these data are compared to the bioinformatic predictions. A final subcellular localisation (from approximately 20 categories) is then assigned for each ORF. This 'pooling' of the ORFs into subcellular groups determines the next functional assays to which they are subjected.
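One way to picture the assignment and pooling steps is the sketch below. The decision rule (agreement between the two fusions wins, otherwise the bioinformatic prediction breaks the tie, otherwise the ORF is flagged for manual review) is an illustrative assumption, as are the function names and the "unresolved" label; the project's actual assessment is made by people examining the images and is not reduced to such a rule here.

```python
from typing import Dict, List, Optional


def assign_localisation(n_term: str, c_term: str,
                        predicted: Optional[str] = None) -> str:
    """Combine N- and C-terminal fusion results into one category.

    Hypothetical rule: if both fusions agree, use that category; if they
    disagree, accept whichever one matches the bioinformatic prediction;
    otherwise mark the ORF as unresolved for manual inspection.
    """
    if n_term == c_term:
        return n_term
    if predicted is not None and predicted in (n_term, c_term):
        return predicted
    return "unresolved"


def pool_by_localisation(calls: Dict[str, str]) -> Dict[str, List[str]]:
    """Group ORF identifiers by their assigned localisation category."""
    pools: Dict[str, List[str]] = {}
    for orf_id, category in calls.items():
        pools.setdefault(category, []).append(orf_id)
    return pools
```

The resulting pools could then be used to select which functional assays each group of ORFs enters, for example sending a "Golgi" pool to membrane traffic assays.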

Storing up-to-date bioinformatic information for each of our ORFs is vital for the project. To do this we have written software that automatically queries public databases and tools such as SMART, BLAST, MapView, SOSUI and ProtParam. The software retrieves database entries on a monthly basis and takes screenshots of the output. These data are collected and published on our local web server, which avoids the waiting times of database queries and calculations. This ‘Harvester’ software allows rapid correlation of bioinformatic and localisation data.
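The idea of retrieving results periodically and serving them locally can be sketched as a small cache with an expiry time. This is not the Harvester code: the class name, the 30-day refresh interval and the `fetch` callback (which stands in for whatever actually queries a given database) are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple

# ASSUMPTION: "monthly" is approximated as 30 days.
REFRESH_SECONDS = 30 * 24 * 3600


class HarvestCache:
    """Serve stored results locally and re-fetch them once they are stale."""

    def __init__(self, fetch: Callable[[str, str], str],
                 max_age: float = REFRESH_SECONDS,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch      # hypothetical callback: (database, orf_id) -> result
        self._max_age = max_age  # seconds before a stored entry is refreshed
        self._clock = clock      # injectable clock, which simplifies testing
        self._store: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get(self, database: str, orf_id: str) -> str:
        """Return the stored result, querying the source only when needed."""
        key = (database, orf_id)
        now = self._clock()
        cached = self._store.get(key)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]     # fresh enough: no remote waiting time
        result = self._fetch(database, orf_id)
        self._store[key] = (now, result)
        return result
```

A caller would construct `HarvestCache(fetch)` with its own retrieval function and call `cache.get("SMART", orf_id)`; repeated requests within the refresh interval are answered from the local store.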

Novel ORFs grouped according to their subcellular localisation are then used in functional assays to characterise their function further. We have developed automated systems for image acquisition, processing and quantification so that these experiments can be performed on such a large set of molecules. Our own assays are particularly focussed on intracellular membrane traffic, but further assays can easily be developed.