Protein Folding
Introduction
Proteins are the most versatile macromolecules in biological
systems. They fulfill a multitude of different
tasks, from storing other molecules such as oxygen in human blood
to catalysis. They are responsible for the stability of
macro-complexes, such as hair, and facilitate membrane transport or
the transmission of electric signals in the human brain.
Despite this importance, their properties and function are still
far from being fully understood. Proteins are linear polymers
composed of sequences of amino acids. This amino acid sequence
is encoded by DNA. It
already contains all necessary information about the unique
three-dimensional structure of a protein, which is directly
correlated to its function. The formation of this three-dimensional
structure from the linear amino acid sequence, in the protein's
physiological environment (mostly water), takes place in a complex process
called protein folding. Regardless of the starting conformation, and even
after being unfolded by a change in environmental conditions, many
proteins will eventually assume the same structure again. Very
few proteins, prions for example, adopt an alternative native state under minimal changes in their environment.
This unique three-dimensional structure is called the
native state of a protein. For many proteins it can be determined to
atomic resolution by X-ray scattering or NMR.
For lack of suitable experimental techniques, time-resolved information
on the actual folding process is currently not available in
similar detail.
In order to understand the process of protein folding
theoretically, scientists in different fields, such as biology,
information science, mathematics, biochemistry and physics, have
pursued various, often interdisciplinary, approaches. Knowledge-based
approaches use databases of experimentally determined structural
information for proteins to predict structures for other
proteins. In recent years these
methods have made steady progress towards de-novo protein
structure prediction, although they still require substantial sequence
similarity to yield usable results. Unfortunately, they give only
indirect evidence regarding the mechanisms by which proteins assume
their unique three-dimensional structure.
The challenge for more sophisticated models motivated by physical ideas
is the size and complexity of the system. Proteins consist of
numerous different amino acids comprising hundreds of atoms, but have no
exploitable higher symmetries. In order to gain more insight into
the folding process, simple and tractable lattice models
representing protein structure have been developed.
The necessary simplifications, however, leave a wide gap between these
models and actual protein structure. The simulation of the protein
folding process by molecular dynamics using existing biomolecular
forcefields like AMBER or CHARMM has yielded increasingly valuable
insights into the folding mechanisms. Interpreting the resulting data
allows the folding process to be understood, but such simulations
are presently limited to very small proteins
by their extremely high computational demands.
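As an illustration of what such a lattice model can look like, the following minimal Python sketch evaluates the energy of a chain on a two-dimensional square lattice in the spirit of the widely used HP (hydrophobic-polar) model. The choice of the HP model, the function name hp_energy and the example conformations are assumptions made for illustration; the text above does not name a particular lattice model.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """Energy of a 2D lattice conformation in an HP-type model.

    Each pair of 'H' residues that are lattice neighbours but not
    neighbours along the chain contributes -1 to the energy.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    # Map each occupied lattice site to the index of the residue on it.
    position = {site: index for index, site in enumerate(coords)}

    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only to the right and upwards so each contact is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy


# Hypothetical four-residue chain: folded into a square versus fully extended.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(hp_energy("HPPH", square), hp_energy("HPPH", straight))
```

In this toy example the folded square brings the two hydrophobic end residues into contact and is therefore lower in energy than the extended chain. The sketch also makes the gap to real proteins visible: side chains, solvent and atomic detail are all absent.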
One major source of complexity arises from the strong influence the
environment has on the folding process and protein structure. The
appropriate inclusion of solvent effects has therefore led to controversial
debates. It is also presently unclear
which families of proteins are adequately folded using established
forcefields in molecular dynamics simulations. In some
simulations the forcefields fail to stabilize the native state.
My contribution to this field
In my PhD thesis I investigated the validity of an alternative, atomically resolved
approach to the folding of proteins. It is based on models of the underlying
physical interactions. The thermodynamic hypothesis
postulates that most proteins are in thermodynamic equilibrium with
their environment. It should therefore be possible to represent the
unique native state as the global minimum of an appropriate
free-energy model. I applied an all-atom free-energy forcefield called PFF01,
which is based on physically motivated interactions,
to represent the underlying interactions governing protein structure
formation. Starting from random initial conditions we predicted
protein structure de-novo by identifying the global optimum of the
forcefield. The computational demands of this approach are significantly
lower than those of molecular dynamics simulations, while it still allows
insight into the forces responsible for stabilizing the native state
or driving the folding process.
By developing and testing new stochastic optimization schedules,
we successfully validated the forcefield against experimental data,
i.e. the experimentally determined native state corresponded
to the global minimum of PFF01.
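The idea of locating a native state as the global minimum of an energy function by stochastic optimization can be sketched on a toy problem. The following Python example uses simulated annealing on the Rastrigin function as a stand-in for a rugged energy landscape. It is neither PFF01 nor the optimization method developed in the thesis; the function names, the geometric cooling schedule and all parameter values are illustrative assumptions.

```python
import math
import random
from typing import Callable, List, Tuple


def rugged_energy(x: List[float]) -> float:
    """Toy rugged landscape (Rastrigin function); global minimum 0 at x = 0."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


def simulated_annealing(
    energy: Callable[[List[float]], float],
    start: List[float],
    steps: int = 20000,
    t_start: float = 5.0,
    t_end: float = 0.01,
    step_size: float = 0.5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Return the lowest-energy point visited and its energy."""
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current

    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))

        # Propose a random displacement of one coordinate.
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-step_size, step_size)
        e_trial = energy(trial)

        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with Boltzmann probability.
        delta = e_trial - e_current
        if delta <= 0.0 or rng.random() < math.exp(-delta / t):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current

    return best, e_best


start = [3.3, -2.7, 4.1]  # arbitrary starting point
best, e_best = simulated_annealing(rugged_energy, start)
print("start energy:", rugged_energy(start))
print("best energy found:", e_best)
```

Accepting occasional uphill moves at high temperature is what lets the search escape local minima. In a real forcefield the coordinates would be the protein's internal degrees of freedom rather than three abstract numbers, and the search space would be far larger.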
Accordingly, I could address two central questions of protein folding.
First, protein folding is ultimately governed by complicated
quantum-mechanical effects, such as the formation of hydrogen bonds,
Fermi repulsion of electron
clouds and the interaction of the protein surface with a complex environment.
Can the folding of a protein nevertheless be understood and
represented by a classical free-energy forcefield and, if so, how
can this be done in a computationally tractable way?
Second, proteins have many degrees of freedom and no exploitable symmetries.
It is known that global minimization of rough, high-dimensional
energy landscapes, like those in spin-glass theory, is very
difficult. Given the complexity of
such a forcefield, are there optimization methods that can
find the global minimum, and how efficient are they?
Short answer (for more details
I must refer to my publications): The global minima found in PFF01
for different proteins, ranging from 20 to 60 amino acids, were in agreement
with experimental data, underlining that this approach can
successfully address the problem of protein structure prediction. With
present-day computational resources it is therefore possible both to describe the native
state of a protein as the global minimum of an appropriate free-energy
forcefield and to find it reliably.