Proof of Concept funding from BioProNET has enabled Jim Warwicker and colleagues from the University to build a webtool that predicts protein solubility. Recombinant biologics often have low solubility, due to their high concentrations, sequence and three-dimensional structure. The accumulation of insoluble protein agglomerates can lead to the formation of aggregates, which can impact biological activity and immunogenicity of a biologic.
Therefore, determining the solubility of a protein and its propensity to aggregate would be of great use to the biopharmaceutical industry and researchers.
The funding from BioProNET enabled Jim and colleagues to develop existing code into a user-friendly web format. Users (anyone!) can paste a single sequence of amino acids into the tool; the software compares this sequence to a benchmark dataset of proteins with known solubility, and then returns a set of calculations that predict solubility of the protein based on its sequence.
The programme calculates a variety of properties — such as amino acid composition, net predicted charge, predicted pI value, ratio of conservative amino acids, propensity for disorder, propensity for forming beta strands and sheets — that indicate how soluble the entered amino acid sequence is likely to be.
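To make a few of these sequence-derived properties concrete, the following is a minimal Python sketch of how amino acid composition, net charge at a given pH, and pI could be estimated from a sequence alone. It is an illustration under stated assumptions, not the Protein-Sol implementation: the function names are hypothetical, the pKa values are approximate textbook figures (published scales differ), and the charge model is a simple Henderson-Hasselbalch sum that ignores three-dimensional structure.

```python
from collections import Counter

# Approximate textbook pKa values (an assumption; published scales differ,
# and these are not taken from the Protein-Sol code).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0


def composition(sequence):
    """Return the fraction of each amino acid present in the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def net_charge(sequence, ph=7.0):
    """Estimate net charge at the given pH with Henderson-Hasselbalch terms."""
    counts = Counter(sequence.upper())
    # Terminal groups: protonated N-terminus (+), deprotonated C-terminus (-).
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    # Basic side chains contribute positive charge when protonated.
    for aa, pka in POSITIVE_PKA.items():
        charge += counts[aa] / (1.0 + 10 ** (ph - pka))
    # Acidic side chains contribute negative charge when deprotonated.
    for aa, pka in NEGATIVE_PKA.items():
        charge -= counts[aa] / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence, tol=1e-4):
    """Find the pH where net charge is zero by bisection over pH 0 to 14."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        # Net charge falls as pH rises, so a positive charge means pI is higher.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2
```

Calling `isoelectric_point("MKKDEHR")`, for example, returns the pH at which the modelled net charge of that short peptide crosses zero. Bisection works here because the modelled charge decreases monotonically with pH.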
The webtool is available here:
http://www.protein-sol.manchester.ac.uk/
The project is already bearing fruit: it has been used as part of a successful proposal to the EPSRC formulation call. The software is still under development, and further improvements, including those based on user feedback, will be added.
Protein-Sol: A web tool for predicting protein solubility from sequence
Hebditch M, Carballo-Amador MA, Charonis S, Curtis R, Warwicker J.
Bioinformatics, 29 May 2017. doi: 10.1093/bioinformatics/btx345