New Protein Solubility Predictor Funded by PoC Award

Proof of Concept funding from BioProNET has enabled Jim Warwicker and colleagues from the University to build a webtool that predicts protein solubility. Recombinant biologics often have low solubility, a consequence of their sequence, their three-dimensional structure and the high concentrations at which they are used. Insoluble protein can accumulate and form aggregates, which can affect both the biological activity and the immunogenicity of a biologic.

A way to determine the solubility of a protein, and its propensity to aggregate, would therefore be of great use to the biopharmaceutical industry and to researchers.

The funding from BioProNET enabled Jim and colleagues to develop existing code into a user-friendly web format. Users (anyone!) can paste a single sequence of amino acids into the tool; the software compares this sequence to a benchmark dataset of proteins with known solubility, and then returns a set of calculations that predict solubility of the protein based on its sequence.

The programme calculates a variety of properties — such as amino acid composition, net predicted charge, predicted pI value, ratio of conservative amino acids, propensity for disorder, propensity for forming beta strands and sheets — that indicate how soluble the entered amino acid sequence is likely to be.
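To give a feel for what sequence-based properties of this kind look like, here is a minimal Python sketch that computes three of them: amino acid composition, net predicted charge and a predicted pI. It is an illustration only and not the Protein-Sol code. The function names are hypothetical, and the pKa values are assumed textbook-style approximations rather than anything taken from the tool.

```python
from collections import Counter

# Assumed, approximate pKa values for illustration only (not from Protein-Sol).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def clean(seq):
    """Upper-case the sequence and reject empty input or non-standard letters."""
    seq = seq.strip().upper()
    if not seq or set(seq) - STANDARD:
        raise ValueError("expected a non-empty sequence of the 20 standard amino acids")
    return seq


def composition(seq):
    """Fraction of the sequence made up by each of the 20 standard amino acids."""
    seq = clean(seq)
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(STANDARD)}


def net_charge(seq, ph=7.0):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    counts = Counter(clean(seq))
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # N-terminal amine
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # C-terminal carboxyl
    for aa, pka in POSITIVE_PKA.items():
        charge += counts[aa] / (1.0 + 10 ** (ph - pka))
    for aa, pka in NEGATIVE_PKA.items():
        charge -= counts[aa] / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(seq, tol=1e-4):
    """Find the pH of zero net charge by bisection between pH 0 and pH 14."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged, so the pI is at a higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Calling `composition("AAKD")` gives a fraction of 0.5 for alanine, and a lysine-rich sequence gets a higher predicted pI than an aspartate-rich one. A real predictor would combine many such features and weight them against proteins of known solubility, which this sketch does not attempt.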

The webtool is available here:

The project is already bearing fruit: it has been used as part of a successful proposal to the EPSRC formulation call. The software is still under development, and further improvements, including those based on user feedback, will be added.

Protein-Sol: A web tool for predicting protein solubility from sequence
Hebditch M, Alejandro Carballo-Amador M, Charonis S, Curtis R, Warwicker J.
Bioinformatics doi: 10.1093/bioinformatics/btx345 May 29 2017