GO-WAR: A Tool for Mining Weighted Association Rules from Gene Ontology Annotations
Giuseppe Agapito;Cannataro M;Pietro H. Guzzi;Marianna Milano
2015-01-01
Abstract
The Gene Ontology (GO) is a controlled vocabulary of concepts (called GO Terms) organized into three main ontologies. Each GO Term describes a biological concept and is associated with one or more gene products through a process known as annotation. Annotations may be derived using different methods, and an Evidence Code (EC) records the method used. The importance and specificity of both GO terms and annotations are often measured by their Information Content (IC). Mining annotations and annotated data may yield knowledge that is meaningful from a biological standpoint. For instance, analyzing annotated data with association rules provides evidence for the co-occurrence of annotations. Nevertheless, classical association rule algorithms take into account neither the source nor the importance of an annotation, which leads to the generation of candidate rules with low IC. This paper presents a methodology for extracting Weighted Association Rules from GO, implemented in a tool named GO-WAR (Gene Ontology-based Weighted Association Rules). GO-WAR extracts association rules with a high level of IC, without loss of Support and Confidence, from a dataset of annotated data. A case study applying GO-WAR to publicly available GO annotation data demonstrates that our method outperforms current state-of-the-art approaches.
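To make the notions of Information Content, Support, and Confidence concrete, the following minimal Python sketch computes them on a toy dataset. It is not the GO-WAR implementation. The gene names, the annotation sets, the frequency-based definition IC(t) = -log2 p(t), and the IC-scaled `weighted_support` function are all illustrative assumptions, since the abstract does not specify the weighting scheme.

```python
import math

# Hypothetical toy dataset: gene product -> set of GO term annotations.
# GO:0008150 is the biological_process root, so it annotates every gene here.
ANNOTATIONS = {
    "geneA": {"GO:0008150", "GO:0006915"},
    "geneB": {"GO:0008150", "GO:0006915", "GO:0012501"},
    "geneC": {"GO:0008150", "GO:0012501"},
    "geneD": {"GO:0008150", "GO:0006915", "GO:0012501"},
}


def information_content(term, annotations):
    """IC(t) = -log2(p(t)), where p(t) is the fraction of genes annotated with t."""
    count = sum(1 for terms in annotations.values() if term in terms)
    if count == 0:
        raise ValueError(f"term {term} never occurs in the dataset")
    return -math.log2(count / len(annotations))


def support(itemset, annotations):
    """Fraction of genes whose annotations contain every term in the itemset."""
    hits = sum(1 for terms in annotations.values() if itemset <= terms)
    return hits / len(annotations)


def confidence(antecedent, consequent, annotations):
    """Support of (antecedent union consequent) divided by support of antecedent."""
    joint = support(antecedent | consequent, annotations)
    return joint / support(antecedent, annotations)


def weighted_support(itemset, annotations):
    """Assumed weighting: support scaled by the mean IC of the itemset's terms."""
    mean_ic = sum(information_content(t, annotations) for t in itemset) / len(itemset)
    return mean_ic * support(itemset, annotations)


if __name__ == "__main__":
    rule_lhs, rule_rhs = {"GO:0006915"}, {"GO:0012501"}
    print("support:", support(rule_lhs | rule_rhs, ANNOTATIONS))
    print("confidence:", confidence(rule_lhs, rule_rhs, ANNOTATIONS))
    print("weighted support (root term):", weighted_support({"GO:0008150"}, ANNOTATIONS))
    print("weighted support (rule terms):", weighted_support(rule_lhs | rule_rhs, ANNOTATIONS))
```

In this toy setting the root term has the highest plain support but an IC of zero, so its weighted support vanishes, while the more specific pair of terms keeps a positive weight. This illustrates the kind of low-IC candidate that an unweighted approach would still generate.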