Sequence Data - HP sequence classification via folding properties
We used a classification procedure based on thermodynamic and kinetic features
to identify protein-like sequences in the 3D-cubic HP-model. The following
properties are tested:
- non-degenerate ground state (unique minimum free energy structure)
- good folder (ability to adopt this structure in short-time folding simulations)
- sequential folding accessibility of this structure along low-barrier paths
These properties ensure a thermodynamically stable native structure (the
unique mfe) and the ability to fold into this functional conformation within
a short time interval, as required by the short life cycles of biomolecules.
Furthermore, the sequential assembly of proteins is considered. There is
evidence for co-translational folding during elongation, which should restrict
the accessible folding space. Thus we are only interested in sequences that
are able to form their native structure via sequential folding without high
energy barriers in the traversed energy landscape.
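The first criterion, a non-degenerate ground state, can be illustrated for very short chains by exhaustive enumeration. The sketch below is not the procedure used to produce this data set. It assumes the standard HP energy function (-1 for each pair of H monomers that are lattice neighbours but not chain neighbours), which this page does not state explicitly, and it counts structures up to rotation and reflection by fixing the first move to +x, the first off-axis move to +y and the first out-of-plane move to +z. All function names are hypothetical.

```python
# The six unit moves on the 3D cubic lattice.
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def hp_energy(seq, coords):
    """Assumed HP energy: -1 per H-H lattice contact of non-consecutive monomers."""
    index_at = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y, z) in enumerate(coords):
        if seq[i] != "H":
            continue
        # Look only in the positive directions so each contact is counted once.
        for dx, dy, dz in MOVES[::2]:
            j = index_at.get((x + dx, y + dy, z + dz))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


def canonical_walks(n):
    """Yield self-avoiding walks with n >= 2 monomers, one per symmetry class."""
    def extend(coords, occupied, left_axis, left_plane):
        if len(coords) == n:
            yield tuple(coords)
            return
        x, y, z = coords[-1]
        for dx, dy, dz in MOVES:
            # Symmetry breaking: first off-axis move must be +y,
            # first out-of-plane move must be +z.
            if not left_axis and (dy or dz) and (dx, dy, dz) != (0, 1, 0):
                continue
            if not left_plane and dz == -1:
                continue
            nxt = (x + dx, y + dy, z + dz)
            if nxt in occupied:
                continue
            coords.append(nxt)
            occupied.add(nxt)
            yield from extend(coords, occupied,
                              left_axis or bool(dy or dz),
                              left_plane or bool(dz))
            occupied.remove(nxt)
            coords.pop()

    start = [(0, 0, 0), (1, 0, 0)]
    yield from extend(start, set(start), False, False)


def ground_state(seq):
    """Return (minimum energy, number of symmetry-distinct structures reaching it)."""
    best, count = None, 0
    for coords in canonical_walks(len(seq)):
        e = hp_energy(seq, coords)
        if best is None or e < best:
            best, count = e, 1
        elif e == best:
            count += 1
    return best, count
```

A sequence is non-degenerate in this sense when the returned count is 1. The number of self-avoiding walks grows exponentially with chain length, so plain enumeration like this is only practical for short toy sequences.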
A sequence fulfilling all criteria is called protein-like. If the ground state
is not reachable sequentially but via global folding at a high rate, it is
classified as a good folder. The remaining sequences are not able to adopt the
native structure in a short time interval. All checked sequences are
non-degenerate, i.e. they have a unique ground state.
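The class assignment described above can be written as a small decision function. This is a minimal sketch: the three boolean inputs stand for the outcomes of the tests listed earlier, the function name is hypothetical, and the label "slow folder" for the third class is a placeholder chosen for this example, not a name taken from this page.

```python
def classify(non_degenerate, fast_folding, sequentially_accessible):
    """Map the three test outcomes to a class label.

    Returns None for degenerate sequences, which are not part of the
    classified set (all checked sequences have a unique ground state).
    """
    if not non_degenerate:
        return None
    if fast_folding and sequentially_accessible:
        return "protein-like"
    if fast_folding:
        # Native structure reached quickly, but only via global folding.
        return "good folder"
    # Placeholder label (assumption): native structure not reached quickly.
    return "slow folder"
```

For example, `classify(True, True, False)` yields `"good folder"`, since the ground state is reached at a high rate but not along a sequential pathway.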
HP in unrestricted 3D-cubic
Benchmark set for Protein Chain Lattice Fitting (PCLF) Problem
This is the benchmark set of high-resolution protein structures
used to benchmark tools that solve the Protein Chain Lattice Fitting (PCLF)
problem (see publication below).
The test set was taken from the PISCES
web server (Wang and Dunbrack, 2005). We enforced a 40% sequence identity cutoff,
chain length 50–300, R-factor ≤ 0.3, and resolution ≤ 1.5 Å
to derive a high-quality set of proteins to model. Given our
requirement for side chains, Cα-only chains were ignored.
The resulting benchmark set contains 1198 proteins exhibiting
a mean length of 160.
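The per-chain selection criteria above can be expressed as a simple filter. The sketch below assumes a hypothetical record type `Chain` with made-up field names; it is not the PISCES interface. The 40% sequence identity cutoff is a property of the whole set (the server culls redundant chains) and is therefore not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    """Hypothetical per-chain record; field names are assumptions."""
    chain_id: str
    length: int          # number of residues
    r_factor: float
    resolution: float    # in Angstrom
    ca_only: bool        # True if only C-alpha atoms are resolved


def passes_filters(chain):
    """Apply the length, R-factor, resolution and side-chain criteria."""
    return (
        50 <= chain.length <= 300
        and chain.r_factor <= 0.3
        and chain.resolution <= 1.5
        and not chain.ca_only
    )


def select(chains):
    """Keep only the chains that satisfy all per-chain criteria."""
    return [c for c in chains if passes_filters(c)]
```

Applied to a list of such records, `select` returns the chains that would remain candidates for the benchmark set before redundancy culling.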