Infobiotics

Infobiotics is the synergy of executable biology, evolutionary and machine learning methods, mesoscopic simulation techniques and experimental data for a more principled practice of origins of life, bioinformatics, computational systems and synthetic biology research.

As part of our infobiotics efforts we have developed a series of bioinformatics servers:

ProCKSI is a decision support system for protein structure comparison that computes structural similarities using a variety of measures to produce a consensus. It contains tools for visualising, analysing and easily comparing all results, and links to external resources for further information and literature about protein structures. These services are part of a wider investigation into a suitable framework and architecture for very large scale protein structure comparison, clustering and analysis in parallel and distributed environments, involving the evaluation and selection of optimal middleware, database models, programming libraries, tools and algorithms.

The PSP server contains a collection of web services that use Learning Classifier Systems to predict structural features that arise as sub-problems of Protein Structure Prediction (PSP), such as coordination number, solvent accessibility and recursive convex hull. These sub-problems concern structural features of protein residues that carry information about the end product of the folding process. The features relate to how densely different parts of a protein are packed, or to how buried or exposed (that is, how far from or close to the surface) individual residues are. The server also includes calculation services, which compute the actual values of all these features from a PDB file.
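As an illustration of the kind of feature these calculation services compute, the sketch below counts neighbouring residues within a distance cutoff. It is a minimal example under stated assumptions, not the server's own definition: the hard 10 angstrom cutoff, the use of one representative atom per residue, the minimum sequence separation and the function name `coordination_numbers` are all hypothetical choices made for this example.

```python
import math

def coordination_numbers(coords, cutoff=10.0, min_separation=2):
    """Count, for each residue, the other residues within `cutoff` angstroms.

    coords: list of (x, y, z) tuples, one representative atom per residue
            (for example the C-alpha atom; an assumption of this sketch).
    min_separation: residue pairs closer than this in sequence are ignored,
            so that immediate chain neighbours do not dominate the count
            (also an assumed convention).
    """
    counts = [0] * len(coords)
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts

# Toy chain: four residues spaced 3.8 angstroms apart along the x axis.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
print(coordination_numbers(toy_chain))
```

A residue buried in a densely packed core would receive a high count under this definition, while a residue on an exposed loop would receive a low one.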

PSPbenchmarks repository: The Infobiotics PSP benchmarks repository contains an adjustable real-world family of benchmarks suitable for testing the scalability of classification/regression methods. When we test a machine learning method we usually choose a test suite containing datasets with a broad set of characteristics, as we are interested in knowing how the learning method reacts to a variety of scenarios. The PSP field provides us with a whole family of real-world classification/regression problems that can be adjusted almost arbitrarily in terms of number of variables, number of classes, class balance, etc. Thus, these datasets are an ideal benchmark suite for data mining methods.

Array Mining: ArrayMining.net is a server for automatic analysis of DNA-microarray data. It provides modular combinations of five common tasks in gene expression analysis: Cross-study normalisation, Feature selection, Clustering, Prediction and Gene Set analysis. Array Mining uses ensemble and consensus techniques and performs automatic parameter selection. Integration of functional annotation data, a rule-based classification method and 3D VRML visualisation of clustering results further simplify the analysis.

VRMLGen is a free R software package to generate 3D representations of biological data in the Virtual Reality Markup Language (VRML). Annotated charts, bar plots, height maps, density and scatter plots as well as parametric functions can be easily visualised and viewed from different perspectives.

The GP challenge data set, used to evolve the energy function for protein structure prediction, contains I-TASSER generated decoys and energy terms for 54 proteins. It also includes high-resolution versions of scatter plots visualising the correlation between the I-TASSER/evolved energy function and the similarity to the native structure measured with RMSD, as well as the diversity dynamics throughout the generations for individual GP runs.

The ultimate goal of systems biology is the development of executable in silico models of cells and organisms, while synthetic biology aims to implement, in vitro or in vivo, organisms whose behaviour is engineered. P systems (computing with membranes) abstract the structure and function of the living cell into a formalism upon which we are building a multi-scale modelling environment. By applying discrete, numerical simulation algorithms to P system models of quorum sensing in the bacterial pathogen Pseudomonas aeruginosa and root development in Arabidopsis thaliana, we aim to understand the stochastic processes governing these model organisms and to guide laboratory experiments. In addition, we use evolutionary optimisation and machine learning (Genetic Programming, Learning Classifier Systems, Support Vector Machines, etc.) to estimate parameters and discover structures that match observed behaviour in cellular networks, with the intention of isolating these modules for use in the design of synthetic organisms. Dissipative Particle Dynamics is used for simulating (proto-)membranes. These activities are complemented by other groups in the SynBioNT Synthetic Biology Network for Modelling and Programming Cell-Cell Interactions.
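To make the idea of stochastically simulating rewriting rules concrete, here is a minimal sketch of Gillespie's direct method applied to multiset rewriting rules in a single compartment. The page does not state which simulation algorithm the group uses, so the choice of Gillespie's method is an assumption, as are the function name `simulate`, the species name `signal` and the rate constants; nested membranes and transport between them are left out entirely.

```python
import math
import random

def simulate(initial, rules, t_end, seed=0):
    """Run Gillespie's direct method on a set of multiset rewriting rules.

    initial: dict mapping species name -> molecule count.
    rules:   list of (reactants, products, rate) tuples, where reactants and
             products are dicts mapping species name -> stoichiometry.
    Returns a list of (time, state snapshot) pairs.
    """
    rng = random.Random(seed)
    state = dict(initial)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        # Propensity of a rule: rate constant times the number of distinct
        # ways of choosing its reactants from the current multiset.
        propensities = []
        for reactants, _products, rate in rules:
            a = rate
            for species, n in reactants.items():
                a *= math.comb(state.get(species, 0), n)
            propensities.append(a)
        total = sum(propensities)
        if total <= 0:
            break  # no rule is applicable
        # Waiting time to the next rule application is exponential.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose one rule with probability proportional to its propensity.
        reactants, products, _rate = rng.choices(rules, weights=propensities)[0]
        for species, n in reactants.items():
            state[species] -= n
        for species, n in products.items():
            state[species] = state.get(species, 0) + n
        trajectory.append((t, dict(state)))
    return trajectory

# Hypothetical example: constant production and first-order decay of a signal.
rules = [
    ({}, {"signal": 1}, 5.0),   # production at a constant rate
    ({"signal": 1}, {}, 0.1),   # degradation, proportional to the count
]
trajectory = simulate({"signal": 0}, rules, t_end=10.0)
print(len(trajectory), trajectory[-1])
```

Because each run is a single random trajectory, repeated runs with different seeds are needed to estimate distributions over molecule counts.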


Recent Publications

Systems/Synthetic Biology:

  1. Blakes J, Romero-Campero FJ, Twycross J, Cao H, Krasnogor N. "An Integrated Development Environment for Synthetic Biology Models" presented at European Conference on Synthetic Biology (ECSB) II: Design, Programming and Optimisation of Biological Systems (ESF-UB Conference in Biomedicine), Sant Feliu de Guixols, Spain, 29 March - 03 April 2009. [pptx poster, pdf poster]

  2. Romero-Campero FJ, Twycross J, Cao H, Blakes J, Krasnogor N. "A Multiscale Modelling Framework Based On P Systems" Workshop on Membrane Computing 2008 (WMC9) Edinburgh, UK, July 28-31, Revised Selected and Invited Papers in Lecture Notes in Computer Science 5391: 63-77. [pdf chapter]

  3. Blakes J, Krasnogor N, Romero-Campero FJ, Twycross J. "An Executable Biology Methodology for Systems and Synthetic Biology" Proceedings of the ECCB Satellite Meeting on Probabilistic Modelling in Computational Biology, Cagliari, Sardinia, September 2008. [pdf extended abstract]

  4. Romero-Campero FJ, Twycross J, Bennett M, Camara M, Krasnogor N. "Modular Assembly of Cell Systems Biology Models Using P Systems" Prague International Workshop on Membrane Computing, Prague, Czech Republic, 2nd June 2008. [pdf paper, pdf presentation]

  5. Romero-Campero FJ, Blakes J, Camara M, Krasnogor N. "A systems analysis of the AHL Quorum Sensing system in Pseudomonas aeruginosa" presented at Systems Biology (ESF-UB Conference in Biomedicine), Sant Feliu de Guixols, Spain, 12-17 April 2008. [ppt poster]

  6. Twycross J, Ubeda-Tomas S, Kramer E, Bennett M, Krasnogor N. "A Tissue-level Model of Auxin Transport in the Arabidopsis thaliana Root" ESF-UB Conference in Biomedicine - Systems Biology 12-17 April 2008, Sant Feliu de Guixols, Spain. [pdf poster]

  7. Romero-Campero FJ, Blakes J, Cao H, Camara M, Krasnogor N. "A modular and stochastic approach to the study of gene circuits using P systems" presented at Genomes to Systems Manchester, UK March 17-19 2008. [pdf poster]

  8. Smaldon J, Krasnogor N. "A New Method for Prototyping Systems Biology Designs with P systems" Genomes to Systems 2008 Manchester, UK. [pdf]

  9. Smaldon J, Blakes J, Lancet D, Krasnogor N. "A Multi-scaled Approach to Artificial Life Simulation With P Systems and Dissipative Particle Dynamics" Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2008 Atlanta, USA), 249-256, ACM Publisher, 2008. [pdf paper, bibtex]

  10. Romero-Campero FJ, Cao H, Camara M, Krasnogor N. "Structure and Parameter Estimation for Cell Systems Biology Models" Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2008 Atlanta, USA), 331-338, ACM Publisher, 2008. [pdf paper]

  11. Romero-Campero FJ, Blakes J, Camara M, Williams P, Perez-Jimenez MJ, Krasnogor N. "Formal informatics and machine learning for more principled systems and synthetic biology" presented at European Conference on Synthetic Biology (ESF-UB Conference in Biomedicine), Sant Feliu de Guixols, Spain, 24-29 November 2007. [pdf poster]

Protein Structure Comparison:

Protein Structure Prediction:

  1. Stout M, Bacardit J, Hirst J D, Krasnogor N. "Prediction of Recursive Convex Hull Class Assignments for Protein Residues" [paper] in Bioinformatics 2008 24(7): 916-923.

  2. Bacardit J, Stout M, Hirst J D, Krasnogor N. "Data Mining in Proteomics with Learning Classifier Systems" [paper] in Bull L, Bernado Mansilla E, Holmes J. (eds) Learning Classifier Systems in Data Mining 2008 Springer in press.

  3. Stout M, Bacardit J, Hirst J D, Smith R E, Krasnogor N. "Prediction of Topological Contacts in Proteins Using Learning Classifier Systems" [paper] in Special Issue on Evolutionary and Metaheuristic-based Data Mining, Soft Computing Journal, in press.

  4. Bacardit J, Stout M, Hirst J D, Sastry K, Llorà X, Krasnogor N. "Automated Alphabet Reduction Method with Evolutionary Algorithms for Protein Structure Prediction" [paper] in Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO2007) 2007 346-353 ACM Press.

  5. Bacardit J, Stout M, Hirst J D, Krasnogor N, Blazewicz J. "Coordination number prediction using Learning Classifier Systems: Performance and interpretability" [paper] in Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation (GECCO2006) 2006 247-254 ACM Press.


Software

Models