Functional Analysis of Gene Networks

Gene networks are a common and meaningful way of representing the associations between genes inferred from experimental data. In a graph-theoretic setting, nodes represent genes and edges represent the associations (commonalities) between them. However, networks built from experimental data can be highly biased due to small sample sizes and the presence of noise.

As a consequence, attempts to predict the function of uncharacterized genes depend substantially on the quality of the underlying experimental data. A major challenge in bioinformatics and computational data analysis lies in identifying artifacts in biased data and separating them from biologically meaningful information. With the bioinformatics toolbox EGAD, we provide a package of highly efficient methods for calculating functional properties of networks based on the "guilt-by-association" principle, allowing rapid gene function prediction, performance evaluation, and determination of optimal priors. Two of the most useful methods are (1) a fully vectorized function prediction algorithm, which allows networks to be characterized in cross-validation across thousands of functional groups within minutes, and (2) an analytic determination of the optimal prior for guessing candidate genes across multiple functional sets. We demonstrate the methods by tracing the effects of selection biases that arise when function predictions for orthologous genes are transferred from humans to model organisms, focusing on autism candidate genes.
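The guilt-by-association idea behind the prediction step can be illustrated with a small neighbor-voting sketch in Python. This is not EGAD's implementation or API (EGAD is an R package); the function names `auroc` and `neighbor_voting_cv`, the gene-wise fold assignment, and the toy data are all assumptions made for illustration. The point of the sketch is that the scores for every functional group come from a single matrix product per fold, which is what makes this kind of analysis fast.

```python
import numpy as np


def auroc(scores, is_positive):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = scores[is_positive]
    neg = scores[~is_positive]
    if pos.size == 0 or neg.size == 0:
        return np.nan  # undefined when one class is missing
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def neighbor_voting_cv(network, labels, n_folds=3, seed=0):
    """Hypothetical helper: mean cross-validated AUROC per functional group.

    network: (n_genes, n_genes) symmetric matrix of non-negative edge weights.
    labels:  (n_genes, n_groups) boolean matrix, True if a gene is in a group.
    """
    network = np.asarray(network, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_genes, n_groups = labels.shape

    rng = np.random.default_rng(seed)
    fold_of_gene = rng.permutation(n_genes) % n_folds  # balanced random folds

    degree = network.sum(axis=1)
    degree[degree == 0] = 1.0  # isolated genes score 0 instead of NaN

    results = np.full((n_folds, n_groups), np.nan)
    for fold in range(n_folds):
        held_out = fold_of_gene == fold
        train = labels.astype(float)  # astype returns a copy
        train[held_out, :] = 0.0      # hide annotations of held-out genes
        # One matrix product scores every gene for every group at once:
        # the weighted fraction of a gene's neighbors annotated to the group.
        scores = (network @ train) / degree[:, None]
        for g in range(n_groups):
            results[fold, g] = auroc(scores[held_out, g], labels[held_out, g])
    return np.nanmean(results, axis=0)


# Toy data (hypothetical): two fully connected modules of six genes each,
# with one functional group per module.
block = np.ones((6, 6)) - np.eye(6)
network = np.block([[block, np.zeros((6, 6))], [np.zeros((6, 6)), block]])
labels = np.zeros((12, 2), dtype=bool)
labels[:6, 0] = True
labels[6:, 1] = True
performance = neighbor_voting_cv(network, labels)
print(performance)
```

In this toy network the modules and the functional groups coincide, so held-out genes are separated perfectly; real networks are noisier, and the degree normalization is one of several possible design choices.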

In a second project, we attempt to describe a novel null model for co-expression networks by evaluating the transitivity of correlations from a mathematical perspective. The model yields insights into methodological biases in co-expression networks or, more generally, correlation-based biological networks: by analyzing the network topology of modules around so-called hub genes, we determine constraints on connectivity that are not typically accounted for when nulls are constructed through link permutation.
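To illustrate what "necessary transitivity" can mean, the sketch below uses a standard result rather than the null model itself: because a correlation matrix must be positive semidefinite, knowing corr(a, b) and corr(b, c) restricts corr(a, c) to the interval r_ab·r_bc ± sqrt((1 − r_ab²)(1 − r_bc²)). The function name `correlation_bounds` is hypothetical, and whether the model described above is built on exactly this bound is an assumption.

```python
import math


def correlation_bounds(r_ab, r_bc):
    """Range of corr(a, c) compatible with corr(a, b) = r_ab and corr(b, c) = r_bc.

    Follows from requiring the 3x3 correlation matrix to be positive semidefinite.
    """
    slack = math.sqrt((1.0 - r_ab ** 2) * (1.0 - r_bc ** 2))
    return r_ab * r_bc - slack, r_ab * r_bc + slack


# Two genes that each correlate strongly with the same hub gene
# must also correlate with each other.
lower, upper = correlation_bounds(0.9, 0.9)
print(lower, upper)
```

Two genes that each correlate at 0.9 with a hub are therefore forced to correlate at roughly 0.62 or more with each other, a dependency between edges that permuting links independently does not preserve.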

Bioinformatics Toolbox

  • S. Ballouz*, M. Weber*, P. Pavlidis and J. Gillis (2016): "EGAD: Extending guilt by association by degree." R package version 1.2.0. [Bioconductor][User Guide] (*: contributed equally)

References

  • S. Ballouz*, M. Weber*, P. Pavlidis and J. Gillis (2016): “EGAD - Ultrafast Analysis of Genetic Networks”. Bioinformatics, vol. 33 (4), 612-614. [bioRxiv][Journal] (*: contributed equally)
  • M. Weber and J. Gillis: “Quantifying necessary transitivity in gene networks”. In preparation.