Improving annotation quality in gene ontology by mining cross-ontology weighted association rules
Agapito G;Milano M;Guzzi PH;Cannataro M
2014-01-01
Abstract
The Gene Ontology (GO) is the major resource of annotations for genes and proteins. Despite considerable efforts to avoid errors and inconsistencies, some unreliable annotations remain. In particular, electronically inferred annotations are less reliable than manually curated ones, and their number is growing. Hence there is a need for accurate, automatic evaluation of annotations. Previous approaches to improving annotation consistency have used association rule mining to discover hidden relationships among GO terms. However, such approaches treat all GO terms equally, even though GO terms differ in Information Content (IC), i.e. in relevance. We therefore designed a novel algorithm, GO-WAR (Mining Weighted Association Rules from GO), which extracts weighted association rules that take the IC of terms into account. We evaluated the algorithm on seven different species and all the GO ontologies. In all experiments GO-WAR outperformed state-of-the-art approaches.
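To make the idea of IC-weighted rules more concrete, the following is a minimal sketch and not the GO-WAR implementation, whose details the abstract does not give. It assumes a corpus-based IC, IC(t) = -log2(p(t)), where p(t) is the fraction of genes annotated with term t, and one common weighted-support formulation in which each gene (transaction) is weighted by the mean IC of its terms. The term names, the toy annotations, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: gene -> set of annotating terms (not real GO IDs).
annotations = {
    "g1": {"term_a", "term_b", "term_c"},
    "g2": {"term_a", "term_b"},
    "g3": {"term_a", "term_c"},
    "g4": {"term_a"},
}


def information_content(annotations):
    """Assumed IC definition: -log2 of the fraction of genes carrying the term."""
    counts = Counter(t for terms in annotations.values() for t in terms)
    n_genes = len(annotations)
    return {t: -math.log2(c / n_genes) for t, c in counts.items()}


def transaction_weight(terms, ic):
    """Assumed weighting: mean IC of the terms annotating one gene."""
    return sum(ic[t] for t in terms) / len(terms) if terms else 0.0


def weighted_support(itemset, annotations, ic):
    """Weight of genes containing every term in itemset, over the total weight."""
    itemset = set(itemset)
    total = sum(transaction_weight(terms, ic) for terms in annotations.values())
    covered = sum(
        transaction_weight(terms, ic)
        for terms in annotations.values()
        if itemset <= terms
    )
    return covered / total if total else 0.0


def weighted_confidence(antecedent, consequent, annotations, ic):
    """Weighted support of the whole rule divided by that of its antecedent."""
    ws_antecedent = weighted_support(antecedent, annotations, ic)
    if ws_antecedent == 0:
        return 0.0
    ws_rule = weighted_support(set(antecedent) | set(consequent), annotations, ic)
    return ws_rule / ws_antecedent


ic = information_content(annotations)
print("IC per term:", ic)
print("weighted support of {term_b}:", weighted_support({"term_b"}, annotations, ic))
print(
    "weighted confidence of term_b -> term_c:",
    weighted_confidence({"term_b"}, {"term_c"}, annotations, ic),
)
```

In this toy corpus term_a annotates every gene, so its IC is zero and it contributes nothing to the gene weights, whereas in unweighted rule mining it would count as much as the rarer, more specific terms.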