Bioinformatics
Institute of Computer Science
University Freiburg

Constraint-based Protein Structure Prediction (CPSP)

CPSP is a collection of tools for predicting optimal structures in simple lattice-protein models such as the HP-model. Both the standard backbone models and side chain models are supported.

It is therefore embedded in our research field Simplified Protein Models.

Constraint solving is done with the Gecode library, which is required to build and run the tools.

For direct use, see our CPSP web tools server.

Available tools

Main Publications

Documentation

Dependencies

Downloads

Contributing group members

HPstruct - Optimal structure prediction and counting

HPstruct predicts optimal structures for simple 3D-lattice proteins (HP-model). It implements the final step of the CPSP-approach of Rolf Backofen and Sebastian Will.
For a given HP-sequence HPstruct computes a list of optimal structures (in absolute moves on the lattice) or counts them.

Within the latest extension of the CPSP-package (v2.2.*) we support the prediction of optimal structures in the HP side chain model.

HPstruct can generate a single optimal structure, all optimal structures, or all available structures (limited by the size of the H-core database).
For further H-core files (sizes 3-10 are included) please use the download links provided or mail me.

To get a good sample set for highly degenerate sequences, the solution structures can be constrained to differ from each other in at least x positions of the absolute move string or in at least x lattice positions.
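The move-string variant of this distance constraint can be illustrated with a small filter. This is only a sketch of the idea, not HPstruct's implementation (which enforces the distance inside the constraint solver); the function names `move_distance` and `spread_sample` are hypothetical.

```python
def move_distance(a, b):
    """Number of positions at which two absolute move strings differ."""
    if len(a) != len(b):
        raise ValueError("move strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def spread_sample(structures, min_diff):
    """Greedily keep structures that differ from every kept one
    in at least min_diff move string positions."""
    kept = []
    for s in structures:
        if all(move_distance(s, k) >= min_diff for k in kept):
            kept.append(s)
    return kept
```

For example, `spread_sample(["FLU", "FLD", "BRD"], 2)` keeps `"FLU"` and `"BRD"` and drops `"FLD"`, which differs from `"FLU"` in only one position.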

To see the full parameter list run the tool using '-help'.

Current status

  • Support of side chain models
  • Minimal distance for generated structures can be constrained in terms of minimal differences in absolute positions or moves (see documentation)
  • Symmetry breaking - no generation or counting of symmetric structures
  • Support of cubic and face centered cubic lattice
  • Binary neighboring constraints on the lattice
  • Global Alldifferent constraint
  • Minimal domain initialisation (hulls, P-singlet positions)
  • H-core access via file based database
  • H-core skipping due to insufficient P-singlet positions
  • Cubic H-core skipping due to wrong even/odd position ratio
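The even/odd criterion in the last item rests on a parity argument: on the cubic lattice, every move changes the parity of x+y+z, so monomers at even sequence indices all sit on points of one parity and monomers at odd indices on the other. An H-core can therefore only fit a sequence if its even/odd point counts match the even/odd H-monomer counts of the sequence (in either order). The following sketch shows such a test; the function names are hypothetical and the actual check in HPstruct may be organised differently.

```python
def parity_counts_sequence(seq):
    """(#H at even sequence indices, #H at odd sequence indices)."""
    even = sum(1 for i, ch in enumerate(seq) if ch == "H" and i % 2 == 0)
    return even, seq.count("H") - even


def parity_counts_core(core):
    """(#core points with even x+y+z, #core points with odd x+y+z)."""
    even = sum(1 for x, y, z in core if (x + y + z) % 2 == 0)
    return even, len(core) - even


def core_parity_compatible(seq, core):
    """True if the core's even/odd ratio can match the sequence's H-monomers."""
    s = parity_counts_sequence(seq)
    c = parity_counts_core(core)
    return c == s or c == s[::-1]
```

For the sequence `HPHP` both H-monomers have even indices, so a core made of two neighbouring lattice points (one even, one odd) can be skipped immediately.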

HPrep - Equivalence class representatives with minimal energy

HPrep enables the enumeration of equivalence class representatives of optimal structures. It implements the definitions and methods introduced in "Equivalence classes of optimal structures in HP protein models including side chains".
Here, two structures are defined to be equivalent if they do not differ in their H-monomer placement. Thus, the equivalence definition follows the HP energy function that does not constrain P-monomers.
HPrep enumerates one representative structure for each equivalence class among all optimal structures for a given HP-sequence. The maximal number of structures to enumerate can be restricted.
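The equivalence definition can be sketched in a few lines. The code below assumes that structures are given as lists of lattice coordinates in a common reference frame and that "H-monomer placement" means the position of each individual H-monomer; both points are assumptions made for illustration, and the helper names are hypothetical. HPrep itself enumerates representatives during the search rather than filtering a finished list.

```python
def h_placement(sequence, positions):
    """Positions of the H-monomers, in sequence order."""
    return tuple(p for s, p in zip(sequence, positions) if s == "H")


def representatives(sequence, structures):
    """Keep the first structure seen for each distinct H-monomer placement."""
    reps = {}
    for positions in structures:
        reps.setdefault(h_placement(sequence, positions), positions)
    return list(reps.values())
```

Two structures of `HPH` that place the P-monomer differently but the H-monomers identically fall into the same class, so only one of them is returned.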

To see the full parameter list run the tool using '-help'.

HPdeg - Degeneracy of HP-sequences

HPdeg calculates the degeneracy of a given HP-sequence. This is the number of optimal structures the sequence can adopt in a specific lattice.
For the calculation, the final step of the CPSP-approach of Rolf Backofen and Sebastian Will is used as done for HPstruct.
To handle highly degenerate sequences, and to allow testing against a maximal degeneracy, the count can be limited to an upper bound.
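To make the notion of degeneracy concrete, the following sketch counts optimal structures by brute-force enumeration of self-avoiding walks on the cubic lattice. It is not the CPSP approach and is only feasible for very short sequences. It assumes the usual HP energy (-1 for each pair of non-consecutive H-monomers on neighbouring lattice points) and fixes only the first move, so most symmetric copies are still counted, unlike in HPdeg. All names are hypothetical.

```python
# The six unit steps of the cubic lattice.
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def energy(sequence, positions):
    """HP energy: -1 per H-H contact between non-consecutive monomers."""
    index = {p: i for i, p in enumerate(positions)}
    contacts = 0
    for i, (x, y, z) in enumerate(positions):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in MOVES:
            j = index.get((x + dx, y + dy, z + dz))
            # j > i + 1 counts each non-consecutive pair exactly once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return -contacts


def brute_force_degeneracy(sequence):
    """Return (optimal energy, number of structures reaching it)."""
    n = len(sequence)
    best_energy, count = 1, 0  # 1 is above any reachable energy

    def extend(positions):
        nonlocal best_energy, count
        if len(positions) == n:
            e = energy(sequence, positions)
            if e < best_energy:
                best_energy, count = e, 1
            elif e == best_energy:
                count += 1
            return
        x, y, z = positions[-1]
        for dx, dy, dz in MOVES:
            nxt = (x + dx, y + dy, z + dz)
            if nxt not in positions:  # keep the walk self-avoiding
                positions.append(nxt)
                extend(positions)
                positions.pop()

    # Fix the first monomer and the first move to remove some symmetry.
    extend([(0, 0, 0), (1, 0, 0)][:n])
    return best_energy, count
```

For `HPPH` the only possible contact is between the two ends, which is reached by the four U-shaped walks that start with the fixed first move, so `brute_force_degeneracy("HPPH")` returns `(-1, 4)`.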

To see the full parameter list run the tool using '-help'.

HPoptdeg - Search for HP-sequences with low degeneracy

The degeneracy of HP-sequences forms funnel-like structures in the sequence space. Local search algorithms are therefore one way to find local minima.

HPoptdeg performs a Monte-Carlo search in the sequence space to find HP-sequences with low degeneracy.
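The general shape of such a search can be sketched as follows. The move set (single H/P flips), the Metropolis-style acceptance rule, and all parameter names are assumptions chosen for illustration; the page does not specify how HPoptdeg proposes or accepts sequences. The degeneracy is supplied as a function, which in practice would wrap a call to a tool such as HPdeg.

```python
import math
import random


def mc_search(start, degeneracy, steps=200, temperature=1.0, seed=0):
    """Monte-Carlo walk over HP-sequences; returns the best sequence seen.

    degeneracy: callable mapping an HP-sequence to its degeneracy
                (hypothetical stand-in for a real degeneracy computation).
    """
    rng = random.Random(seed)
    current, cur_deg = start, degeneracy(start)
    best, best_deg = current, cur_deg
    for _ in range(steps):
        # Propose a neighbour by flipping one position H <-> P.
        i = rng.randrange(len(current))
        flipped = "P" if current[i] == "H" else "H"
        candidate = current[:i] + flipped + current[i + 1:]
        cand_deg = degeneracy(candidate)
        delta = cand_deg - cur_deg
        # Always accept improvements; accept worse moves with small probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, cur_deg = candidate, cand_deg
            if cur_deg < best_deg:
                best, best_deg = current, cur_deg
    return best, best_deg
```

Because the best sequence seen so far is tracked separately, the returned degeneracy is never worse than that of the start sequence.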

To see the full parameter list run the tool using '-help'.

HPdesign - HP-sequence design for given structure

HPdesign addresses the design of HP-sequences that fold optimally into a given structure and have a degeneracy below a given upper bound.
The approach first uses a precalculated database of H-cores to detect sequences that can adopt the structure as an optimal one. Afterwards the degeneracy of the sequences is checked using the CPSP approach of R. Backofen and S. Will.
The level of suboptimal H-cores taken into account can be restricted to speed up the search. If no sequence is found, increase this level so that more candidate sequences are tested.
Additionally, the H-content of the sequence can be constrained in order to restrict the enumerated sequences.

To see the full parameter list run the tool using '-help'.

HPnnet - Neutral nets of HP-sequences

A neutral net for a given sequence S and its only optimal structure X includes all sequences S' that adopt X as their only optimal structure too. Additionally, each sequence S' has to be a direct or indirect neighbor of S. Two sequences are neighbors if they differ in exactly one sequence position.
HPnnet uses the CPSP approach of R. Backofen and S. Will to check the degeneracy of each sequence neighbor and, if the degeneracy is 1, to compare its optimal structure to X. By default symmetric structures are excluded, but they can be included on demand.
To weaken the degeneracy criterion, the maximal degeneracy allowed can be increased.
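The definition of a neutral net translates directly into a breadth-first search over single-position neighbors. In the sketch below the degeneracy check is abstracted into a function `unique_structure` that returns the only optimal structure of a sequence, or `None` if the sequence has more than one; this function and all other names are hypothetical stand-ins for the CPSP-based check.

```python
from collections import deque


def neighbors(seq):
    """All sequences that differ from seq in exactly one position."""
    for i, ch in enumerate(seq):
        yield seq[:i] + ("P" if ch == "H" else "H") + seq[i + 1:]


def neutral_net(start, unique_structure):
    """Sequences connected to start that share its only optimal structure."""
    target = unique_structure(start)
    if target is None:
        raise ValueError("start sequence has no unique optimal structure")
    net = {start}
    queue = deque([start])
    while queue:
        seq = queue.popleft()
        for cand in neighbors(seq):
            if cand not in net and unique_structure(cand) == target:
                net.add(cand)
                queue.append(cand)
    return net
```

Sequences that share the structure X but are reachable only through sequences outside the net are not included, which matches the requirement of direct or indirect neighborhood.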

To see the full parameter list run the tool using '-help'.

HPrand - Random HP-sequence generator

HPrand generates random HP-sequences of a given length that can be constrained in terms of H-monomer content.
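A minimal sketch of such a generator is shown below. It assumes the H-monomer content is given as an exact count; how HPrand itself expresses the constraint is listed in its '-help' output. The function name `random_hp` is hypothetical.

```python
import random


def random_hp(length, h_count, rng=None):
    """Random HP-sequence of the given length with exactly h_count H-monomers."""
    if not 0 <= h_count <= length:
        raise ValueError("h_count must be between 0 and length")
    rng = rng or random.Random()
    monomers = ["H"] * h_count + ["P"] * (length - h_count)
    rng.shuffle(monomers)  # uniform random arrangement of the monomers
    return "".join(monomers)
```

Passing a seeded `random.Random` instance makes the output reproducible.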

To see the full parameter list run the tool using '-help'.

HPview - HP-model lattice structure viewer

HPview writes a sequence/structure pair in CML file format, which can be viewed with molecule viewers like Chime or Jmol. HPview can call such an external viewer directly.

The structure is NOT validated, i.e. it is not checked whether it is connected and self-avoiding. For invalid structures normal execution cannot be guaranteed.

The move string representation follows the encoding:
  • F/B = +- x
  • L/R = +- y
  • U/D = +- z
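Since the tool does not validate its input, it can be useful to check a move string beforehand. The sketch below decodes an absolute move string on the cubic lattice and tests for self-avoidance. It reads the encoding above as "first letter = positive direction, second letter = negative direction", which is an assumption; the function names are hypothetical.

```python
# Unit steps of the absolute moves (assumed sign convention, see above).
MOVES = {
    "F": (1, 0, 0), "B": (-1, 0, 0),
    "L": (0, 1, 0), "R": (0, -1, 0),
    "U": (0, 0, 1), "D": (0, 0, -1),
}


def moves_to_positions(moves, start=(0, 0, 0)):
    """Lattice positions visited by an absolute move string."""
    positions = [start]
    for m in moves:
        dx, dy, dz = MOVES[m]  # raises KeyError for unknown move letters
        x, y, z = positions[-1]
        positions.append((x + dx, y + dy, z + dz))
    return positions


def is_self_avoiding(positions):
    """True if no lattice point is visited twice."""
    return len(set(positions)) == len(positions)
```

A move string always yields a connected chain, so only self-avoidance needs to be tested here; `"FLBR"` for instance returns to its start point and is rejected.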
Currently supported viewers for direct visualization are:

To see the full parameter list run the tool using '-help'.

HPcompress - HP-sequence (de-)compression

HPcompress allows the conversion of HP-sequences between normal/expanded representation and a compressed one.

e.g. HHHHPPPPPH <--> 4H5PH
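The conversion in this example is a run-length encoding and can be sketched as below. The sketch covers only the notation shown in the example (a count followed by a single H or P, with the count omitted for single monomers); whether HPcompress supports further notation is not stated here. The function names are hypothetical.

```python
import re
from itertools import groupby


def compress(seq):
    """Run-length encode an HP-sequence, e.g. HHHHPPPPPH -> 4H5PH."""
    parts = []
    for monomer, run in groupby(seq):
        n = len(list(run))
        parts.append(f"{n}{monomer}" if n > 1 else monomer)
    return "".join(parts)


def expand(compressed):
    """Inverse of compress, e.g. 4H5PH -> HHHHPPPPPH."""
    return "".join(
        monomer * int(count or 1)
        for count, monomer in re.findall(r"(\d*)([HP])", compressed)
    )
```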

To see the full parameter list run the tool using '-help'.

HPconvert - Lattice structure representation conversion

HPconvert converts lattice structures between different formats.

Currently supported representation formats are:
  • Absolute move string
  • Relative move string
  • Absolute monomer positions given in XYZ-file format
The move string representation follows the encoding:
  • F/B = +- x
  • L/R = +- y
  • U/D = +- z
The given structure is not validated, i.e. it is not checked whether it is connected and self-avoiding. For invalid structures normal tool execution cannot be guaranteed.

The XYZ-file format looks like this:
# Beginning with '#' marks a comment line
# The lattice positions x,y and z of each point are given 
# in integer coding e.g.
0 1 0
1 1 0
1 1 -1 
# EOF #
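One of the conversions, from XYZ positions to an absolute move string on the cubic lattice, can be sketched as follows. The parser follows the file layout shown above; the sign convention of the moves (first letter positive, second negative) is an assumption, and the function names are hypothetical.

```python
# Map unit steps to absolute moves (assumed sign convention).
STEP_TO_MOVE = {
    (1, 0, 0): "F", (-1, 0, 0): "B",
    (0, 1, 0): "L", (0, -1, 0): "R",
    (0, 0, 1): "U", (0, 0, -1): "D",
}


def parse_xyz(text):
    """Read integer lattice positions, skipping blank and '#' comment lines."""
    positions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        x, y, z = (int(v) for v in line.split())
        positions.append((x, y, z))
    return positions


def positions_to_moves(positions):
    """Absolute move string for consecutive positions on the cubic lattice."""
    moves = []
    for (x0, y0, z0), (x1, y1, z1) in zip(positions, positions[1:]):
        step = (x1 - x0, y1 - y0, z1 - z0)
        if step not in STEP_TO_MOVE:
            raise ValueError(f"positions are not lattice neighbors: step {step}")
        moves.append(STEP_TO_MOVE[step])
    return "".join(moves)
```

Applied to the three positions of the example file, this yields the move string `FD`. Unlike the tool, the sketch rejects disconnected input with an error.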


To see the full parameter list run the tool using '-help'.

HPseq - Converts amino acid into HP sequences

HPseq implements the method introduced by Kyte and Doolittle (1982) to derive an HP sequence from a 20 letter code amino acid sequence using hydrophobicity tables. At each position the average hydrophobicity over a given span is calculated. Based on a threshold the position is then classified as (H)ydrophobic or (P)olar.

We implemented several different hydrophobicity tables as listed in CLC bio.
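The sliding-window classification can be sketched as follows, using the Kyte-Doolittle hydropathy scale. The default span and threshold, the truncation of the window at the sequence ends, and the function name `to_hp` are assumptions made for illustration and need not match HPseq.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def to_hp(protein, span=3, threshold=0.0):
    """Classify each residue as H or P by its windowed average hydropathy."""
    half = span // 2
    result = []
    for i in range(len(protein)):
        # Window centred on position i, truncated at the sequence ends.
        window = protein[max(0, i - half): i + half + 1]
        average = sum(KD[aa] for aa in window) / len(window)
        result.append("H" if average > threshold else "P")
    return "".join(result)
```

For example, `to_hp("IIIKKK")` gives `HHHPPP`: the isoleucine-rich windows average well above zero and the lysine-rich windows well below.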