Special Report: The impact of a process-centric approach on ontology development
providing full GO annotation to genes associated with cardiovascular processes
The UCL-based curation team is funded by the British Heart Foundation to provide manual Gene Ontology (GO) annotation for human proteins involved in cardiovascular (CV) processes and diseases. Part of this annotation effort is the ongoing development of the GO to fully capture the CV-specific processes and functions that are being curated.
At the start of the project (Nov 2007) the UCL team worked in a protein-centric manner, annotating each protein to completion (Reference Genome Project standard) before moving on to the next protein in the CV gene list. Ontology development therefore occurred on a small scale, with one or two terms being created at a time. This approach worked well, but many papers contain information about multiple proteins, and often describe a group of proteins acting together in a process. It was therefore decided to trial process-centric curation, where the curator selects papers describing a particular process and annotates those fully. One major benefit of the process-centric curation approach is that the curator gains a deeper knowledge of the process through reading the literature, and thus their input into ontology development can be significant.
One of the areas selected for process-centric curation was heart development. However, the GO had not been comprehensively developed in this area, and it was not possible to fully capture the process with the existing terms. In consultation with the GO editorial team, the UCL curators organized a workshop to generate heart development ontology terms.
Pre-workshop tasks included the identification of heart development experts who would be willing to devote time to the development of the GO, and the collation of relevant resources onto a wiki. The wiki proved invaluable both prior to the workshop (for meeting logistics) and during the meeting (as a repository of resources relevant to the ongoing discussion).
An important part of the workshop was the initial GO presentation given for the benefit of the experts, to introduce GO concepts and ontology standards. This presentation, along with the live GO editing demonstration, meant that the experts quickly became comfortable with GO and were happy to suggest possible hierarchies for the new terms being generated. As a result, the generation of new terms and their placement into the evolving ontology was very efficient.
As a result of this workshop and the intense effort of the GO editorial team in the weeks after the workshop, there are now 250 new terms in the biological process ontology describing various aspects of heart development, and 12 existing GO terms that were modified as a direct result of the workshop. The figure (below) shows part of the heart development ontology, with newly developed terms highlighted.
We conclude that the process-centric approach for annotation and ontology development is a productive and efficient method of curation. With the aim of increasing the use of GO by experts in heart development, this work will be presented at the 8th London Heart Development Meeting in Dec 2009 and will also be described in a report for publication.
by Varsha Khodiyar, Cardiovascular Gene Ontology Annotator, University College London