Infobiotics is the synergy of executable biology, evolutionary and machine learning methods, mesoscopic simulation techniques and experimental data for a more principled practice of origins of life, bioinformatics, computational systems and synthetic biology research.
As part of our infobiotics efforts we have developed a series of bioinformatics servers:
ProCKSI is a
decision support system for protein structure comparison that
computes structural similarities using a variety of measures to
produce a consensus. It contains tools for visualising, analysing
and easily comparing all results, linking to external resources
for further information and literature about protein structures.
These services are part of a greater investigation of a suitable
framework/architecture for very large scale protein structure
comparison, clustering and analysis in parallel and distributed
environments, involving the evaluation and selection of optimal
middleware software, database model, programming libraries, tools
and algorithms.
The PSP server contains a
collection of web services that use Learning Classifier Systems to address
Protein Structure Prediction (PSP) sub-problems, such as predicting
coordination number, solvent accessibility or recursive convex hull.
These sub-problems are structural features of protein residues
that contain information about the end product of the folding
process. These features are related to the density of packing
of different parts of a protein, or to how buried or exposed
(far from or close to the surface) different residues within a protein are. The
server also includes calculation services, where the actual values
for all these features are computed from a PDB file.
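As an illustration of the kind of value such a calculation service might produce, the sketch below derives a simple coordination number for each residue from the text of a PDB file. It assumes one common definition, the number of other C-alpha atoms within a distance cutoff. The 10 Å cutoff, the choice of C-alpha atoms and the function names `ca_coordinates` and `coordination_numbers` are illustrative assumptions, not the server's actual implementation.

```python
import math


def ca_coordinates(pdb_text):
    """Extract C-alpha coordinates from ATOM records of a PDB file.

    Relies on the fixed-column PDB layout: atom name in columns 13-16,
    x, y, z in columns 31-38, 39-46 and 47-54.
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def coordination_numbers(coords, cutoff=10.0):
    """Count, for each residue, the other residues within `cutoff` angstroms.

    The cutoff value is an assumption made for this sketch.
    """
    counts = []
    for i, a in enumerate(coords):
        neighbours = sum(
            1
            for j, b in enumerate(coords)
            if j != i and math.dist(a, b) <= cutoff
        )
        counts.append(neighbours)
    return counts
```

Buried residues would be expected to receive higher counts than exposed ones under this definition, which is why the feature relates to packing density.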
The Infobiotics PSP benchmarks repository
contains an adjustable real-world family of benchmarks suitable
for testing the scalability of classification/regression methods.
When we test a machine learning method we usually choose a test
suite containing datasets with a broad set of characteristics,
as we are interested in knowing how the learning method reacts to
a variety of scenarios. The PSP field provides us with a whole
family of real-world classification/regression problems that can be
adjusted almost arbitrarily in terms of number of variables,
number of classes, class balance, etc. Thus, these datasets are
an ideal benchmark suite for data mining methods.
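To make the idea of an adjustable benchmark concrete, here is a minimal sketch of how a real-valued residue feature could be turned into a classification problem with a chosen number of classes and a chosen class balance. The two cut-point strategies (equal frequency and equal width) and all function names are assumptions made for illustration; the repository's own procedure may differ.

```python
from bisect import bisect_right


def equal_frequency_cuts(values, n_classes):
    """Cut points that give each class roughly the same number of examples."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_classes] for k in range(1, n_classes)]


def equal_width_cuts(values, n_classes):
    """Cut points that split the value range into intervals of equal length."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_classes
    return [lo + step * k for k in range(1, n_classes)]


def to_classes(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical feature values, skewed by one large outlier.
feature = [1, 2, 3, 4, 5, 6, 7, 8, 100]
balanced = to_classes(feature, equal_frequency_cuts(feature, 3))
skewed = to_classes(feature, equal_width_cuts(feature, 3))
```

With these hypothetical values, the equal-frequency labels are evenly split across three classes, while the equal-width labels put almost every example in the first class. Varying `n_classes` and the cut strategy changes problem difficulty without changing the underlying data.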
ArrayMining.net is a server for automatic
analysis of DNA-microarray data. It provides modular combinations of five common
tasks in gene expression analysis: Cross-study normalisation, Feature selection,
Clustering, Prediction and Gene Set analysis. ArrayMining.net uses
ensemble and consensus techniques and performs automatic parameter selection. Integration
of functional annotation data, a rule-based classification method and 3D VRML visualisation
of clustering results further simplify the analysis.
VRMLGen is a free R software
package to generate 3D representations of biological data in the
Virtual Reality Markup Language (VRML). Annotated charts, bar plots, height maps,
density and scatter plots as well as parametric functions can be easily visualised
and viewed from different perspectives.
The GP challenge data set
used to evolve the energy function for protein structure prediction
contains I-TASSER generated decoys and energy terms for 54 proteins.
It also includes high-resolution versions of scatter plots visualising
the correlation between the I-TASSER/evolved energy function and the similarity
to the native structure measured with RMSD, as well as the diversity
dynamics throughout the generations for individual GP runs.
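The relationship shown in such scatter plots can also be summarised as a single number. The sketch below computes a Pearson correlation coefficient between decoy energies and their RMSD to the native structure. The choice of Pearson correlation and the function name `pearson` are assumptions for illustration; the data set itself may have been analysed with a different statistic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)
```

A useful energy function would be expected to give a clearly positive value here, since decoys closer to the native structure (lower RMSD) should score lower energy.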
The ultimate goal of systems biology is the development of executable in silico models of cells and organisms, while synthetic biology aims to implement, in vitro or in vivo, organisms whose behaviour is engineered. P systems (computing with membranes) abstract the structure and function of the living cell into a formalism upon which we are building a multi-scale modelling environment. By applying discrete, numerical simulation algorithms to P system models of quorum sensing in the bacterial pathogen Pseudomonas aeruginosa and root development in Arabidopsis thaliana, we aim to understand the stochastic processes governing these model organisms and to guide laboratory experiments. In addition, we use evolutionary optimisation and machine learning (Genetic Programming, Learning Classifier Systems, Support Vector Machines, etc.) to estimate parameters and discover structures that match observed behaviour in cellular networks, with the intention of isolating these modules for use in the design of synthetic organisms. Dissipative Particle Dynamics is used for simulating (proto-)membranes. These activities are complemented by other groups in the SynBioNT Synthetic Biology Network for Modelling and Programming Cell-Cell Interactions.
Systems/Synthetic Biology:
Blakes J, Romero-Campero FJ, Twycross J, Cao H, Krasnogor N. "An Integrated Development Environment for Synthetic Biology Models" presented at European Conference on Synthetic Biology (ECSB) II: Design, Programming and Optimisation of Biological Systems (ESF-UB Conference in Biomedicine), Sant Feliu de Guixols, Spain, 29 March - 03 April 2009. [pptx poster,pdf poster]
Romero-Campero FJ, Twycross J, Cao H, Blakes J, Krasnogor N. "A Multiscale Modelling Framework Based On P Systems" Workshop on Membrane Computing 2008 (WMC9) Edinburgh, UK, July 28-31, Revised Selected and Invited Papers in Lecture Notes in Computer Science 5391: 63-77. [pdf chapter]
Blakes J, Krasnogor N, Romero-Campero FJ, Twycross J. "An Executable Biology Methodology for Systems and Synthetic Biology" Proceedings of the ECCB Satellite Meeting on Probabilistic Modelling in Computational Biology, Cagliari, Sardinia, September 2008. [pdf extended abstract]
Romero-Campero FJ, Twycross J, Bennett M, Camara M, Krasnogor N. "Modular Assembly of Cell Systems Biology Models Using P Systems" Prague International Workshop on Membrane Computing, Prague, Czech Republic, 2nd June 2008. [pdf paper, pdf presentation]
Romero-Campero FJ, Blakes J, Camara M, Krasnogor N. "A systems analysis of the AHL Quorum Sensing system in Pseudomonas aeruginosa" presented at Systems Biology (ESF-UB Conference in Biomedicine), Sant Feliu de Guixols, Spain, 12-17 April 2008. [ppt poster]
Twycross J, Ubeda-Tomas S, Kramer E, Bennett M, Krasnogor N. "A Tissue-level Model of Auxin Transport in the Arabidopsis thaliana Root" ESF-UB Conference in Biomedicine - Systems Biology 12-17 April 2008, Sant Feliu de Guixols, Spain. [pdf poster]
Romero-Campero FJ, Blakes J, Cao H, Camara M, Krasnogor N. "A modular and stochastic approach to the study of gene circuits using P systems" presented at Genomes to Systems Manchester, UK March 17-19 2008. [pdf poster]
Smaldon J, Krasnogor N. "A New Method for Prototyping Systems Biology Designs with P systems" Genomes to Systems 2008 Manchester, UK. [pdf]
Smaldon J, Blakes J, Lancet D, Krasnogor N. "A Multi-scaled Approach to Artificial Life Simulation With P Systems and Dissipative Particle Dynamics" Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2008 Atlanta, USA), 249-256, ACM Publisher, 2008. [pdf paper, bibtex]
Romero-Campero FJ, Cao H, Camara M, Krasnogor N. "Structure and Parameter Estimation for Cell Systems Biology Models" Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2008 Atlanta, USA), 331-338, ACM Publisher, 2008. [pdf paper]
Romero-Campero FJ, Blakes J, Camara M, Williams P, Perez-Jimenez MJ, Krasnogor N. "Formal informatics and machine learning for more principled systems and synthetic biology" presented at European Conference on Synthetic Biology (ESF-UB Conference in Biomedicine), Sant Feliu de Guixols, Spain, 24-29 November 2007. [pdf poster]
ProCKSI related publications can be found here.
Stout M, Bacardit J, Hirst J D, Krasnogor N. "Prediction of Recursive Convex Hull Class Assignments for Protein Residues" [paper] in Bioinformatics 2008 24(7): 916-923.
Bacardit J, Stout M, Hirst J D, Krasnogor N. "Data Mining in Proteomics with Learning Classifier Systems" [paper] in Bull L, Bernado Mansilla E, Holmes J. (eds) Learning Classifier Systems in Data Mining 2008 Springer in press.
Stout M, Bacardit J, Hirst J D, Smith R E, Krasnogor N. "Prediction of Topological Contacts in Proteins Using Learning Classifier Systems" [paper] in Special Issue on Evolutionary and Metaheuristic-based Data Mining, Soft Computing Journal in press.
Bacardit J, Stout M, Hirst J D, Sastry K, Llorà X, Krasnogor N. "Automated Alphabet Reduction Method with Evolutionary Algorithms for Protein Structure Prediction" [paper] in Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO2007) 2007 346-353 ACM Press.
Bacardit J, Stout M, Hirst J D, Krasnogor N, Blazewicz J. "Coordination number prediction using Learning Classifier Systems: Performance and interpretability" [paper] in Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation (GECCO2006) 2006 247-254 ACM Press.
The full list of PSP publications (and the project website) can be found here.