Title: Abstract
Abstract Voronoi tessellation has proved to be a
useful tool in protein structure analysis. However, a
versatile, public-domain tool for calculating and
visualizing tessellations at various levels of
granularity is not available. To meet this
requirement, we developed PROVAT, a set of Python
scripts that integrate freely available
specialized software (Qhull, Gromacs, Pymol etc.)
into a pipeline that can be run easily from the
command line or through a web server. A major feature of
the tool is flexible definition of sites required
as input to tessellation calculation. With
PROVAT, it is easy to specify one site per amino
acid residue or one site each for mainchain and
sidechain, or a site for any other arbitrary
atom-group. For each site, it is possible to
specify a physicochemical character which is
later used for coloring Voronoi faces. If three atoms
are specified to define a local reference
frame for a site, PROVAT can compute the orientation
of each Voronoi neighbour in that frame.
Site-specific information is read from an XML
file, hence it is easy to experiment with
different tessellation strategies by using
different XML specifications. Solvation of the
system, which is vital for reasonable tessellation at the
solvent-exposed surface, can be done with Gromacs
or with a cubic grid parameterized on
protein-solvent and solvent-solvent interatomic
distances. The calculation component extracts
sites according to the XML specification, computes
Voronoi polyhedra and neighbour lists, and stores
these as a text file and a Python pickle file.
Various styles of text files are provided. The
visualization component, a Pymol plug-in, offers
a GUI to render the pickle file and enables
visual exploration of tessellation. It is
possible to visualize individual polyhedra
colored according to their neighbours,
solvent-exposed surfaces and interfaces between
protein and other protein/ligand/DNA. PROVAT
source code can be downloaded from
http://raven.bioc.cam.ac.uk/swanand/Provat1,
which also provides a webserver for its
calculation component, documentation and
examples.
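The flexible site definition described above can be pictured with a short sketch. It is not PROVAT's XML format or code: the atom-tuple layout, the `SCHEMES` table and the use of the group centroid as the site position are all assumptions made only for illustration.

```python
from collections import defaultdict

MAINCHAIN = {"N", "CA", "C", "O"}

# Hypothetical grouping rules: each one maps an atom to a site label.
SCHEMES = {
    # One site per amino acid residue.
    "residue": lambda chain, resnum, atom: (chain, resnum, "res"),
    # One site each for mainchain and sidechain.
    "mc_sc": lambda chain, resnum, atom: (
        chain, resnum, "mc" if atom in MAINCHAIN else "sc"),
}


def build_sites(atoms, scheme):
    """Group atoms into sites and return {site label: centroid}.

    atoms: iterable of (chain, resnum, atom_name, (x, y, z)) tuples.
    The centroid is an assumed choice of site position.
    """
    label = SCHEMES[scheme]
    groups = defaultdict(list)
    for chain, resnum, name, xyz in atoms:
        groups[label(chain, resnum, name)].append(xyz)
    return {
        site: tuple(sum(axis) / len(coords) for axis in zip(*coords))
        for site, coords in groups.items()
    }


# Toy coordinates, not taken from a real structure.
atoms = [
    ("A", 1, "N", (0.0, 0.0, 0.0)),
    ("A", 1, "CA", (2.0, 0.0, 0.0)),
    ("A", 1, "CB", (2.0, 2.0, 0.0)),
    ("A", 1, "CG", (2.0, 4.0, 0.0)),
]
per_residue = build_sites(atoms, "residue")
mc_sc = build_sites(atoms, "mc_sc")
```

Switching between tessellation strategies then amounts to choosing a different grouping rule, which is the role the XML specification plays in PROVAT.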
Introduction Voronoi tessellation has proved to
be a useful tool (Poupon, 2004) for estimating
atomic volumes, detecting cavities, analysing
secondary structure and fold, deriving
statistical potentials, studying interactions and
so on. Tessellation assigns a convex polyhedron
to each of the given sites in a space-filling
manner. For biological macromolecules, site
definition and solvation are two major issues.
Visualization of tessellation is also useful for
inspecting local environments, interactions etc.
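As a minimal sketch of how neighbour lists follow from a tessellation, the example below uses SciPy's `scipy.spatial.Voronoi`, which wraps Qhull. How PROVAT itself calls Qhull is not described here, so the function name `voronoi_neighbours` and the toy coordinates are assumptions for illustration only.

```python
from collections import defaultdict

import numpy as np
from scipy.spatial import Voronoi


def voronoi_neighbours(points):
    """Return {site index: set of indices of sites sharing a Voronoi face}."""
    vor = Voronoi(np.asarray(points, dtype=float))
    neighbours = defaultdict(set)
    # Each row of ridge_points holds the two input sites separated by one face.
    for i, j in vor.ridge_points:
        neighbours[int(i)].add(int(j))
        neighbours[int(j)].add(int(i))
    return dict(neighbours)


# Toy input: a central site surrounded by four sites at tetrahedral positions.
sites = [
    (0.0, 0.0, 0.0),
    (1.0, 1.0, 1.0),
    (1.0, -1.0, -1.0),
    (-1.0, 1.0, -1.0),
    (-1.0, -1.0, 1.0),
]
neighbours = voronoi_neighbours(sites)
```

The central site is enclosed by a closed polyhedron, while the outer sites have unbounded cells. This is the situation that solvation is meant to avoid for surface residues.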
Residue Environments PROVAT can be used to
analyze microenvironments of interesting sites in
the structure. For example, His-153 (PDB 1a1f) in a zinc
finger binding protein interacts with protein
sidechains, DNA bases and zinc. Residue
environments can be quantified using interaction
areas between a given residue and its neighbours,
which can be used to derive statistical
potentials to assist protein structure prediction
and substitution tables to assist homology
recognition. Orientation information for the
neighbourhood can be derived by defining a local
reference frame from the coordinates of three
atoms. This could be useful in enhancing pairwise
statistical potentials.
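A local reference frame built from three atoms can be sketched as follows. The axis convention (origin at the second atom, x axis towards the first, z axis normal to the plane of the three) is an assumption chosen for illustration; the convention PROVAT uses is not stated here, and the helper names are hypothetical.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    norm = math.sqrt(dot(u, u))
    return tuple(a / norm for a in u)


def local_frame(a, b, c):
    """Right-handed orthonormal frame from three non-collinear atoms.

    Assumed convention: origin at b, x along b->a, z normal to the
    plane through a, b and c, and y = z x x.
    """
    x = unit(sub(a, b))
    z = unit(cross(x, sub(c, b)))  # fails if the atoms are collinear
    y = cross(z, x)
    return b, (x, y, z)


def to_local(point, origin, axes):
    """Express a neighbour position in the local frame of a site."""
    d = sub(point, origin)
    return tuple(dot(d, axis) for axis in axes)


# Toy coordinates: the frame atoms and one neighbouring site.
origin, axes = local_frame((1.0, 2.0, 1.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0))
above = to_local((1.0, 1.0, 4.0), origin, axes)
beside = to_local((0.0, 1.0, 1.0), origin, axes)
```

Because the frame moves with the residue, the same neighbour direction is reported regardless of how the whole structure is rotated or translated.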
Solvation In order to achieve a reasonable
molecular surface and atomic volumes near the
surface, it is essential to place waters
correctly around the macromolecule. PROVAT
provides two options for solvating the
macromolecule. The first uses Gromacs, a popular
molecular mechanics package, to create a solvent
box, followed by equilibration and
position-restrained MD. The second is a simpler
heuristic (Zimmer et al., 1998) that places solvent
molecules such that there is a minimum
distance between any two solvent molecules (Dll)
and between any solvent and protein atoms (Dpl).
Gromacs solvation is much slower than the heuristic.
The Dpl and Dll values that produce Voronoi
polyhedral volumes similar to those of core
polyhedra are 4 and 3 Å respectively, but they
can be altered by the user.
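The heuristic solvation can be sketched as a cubic grid filter. This is an illustration under stated assumptions, not PROVAT's implementation: the grid spacing is taken to be Dll, grid points closer than Dpl to any protein atom are discarded, and the `shell` padding parameter is hypothetical.

```python
import itertools
import math


def grid_solvate(protein_atoms, d_pl=4.0, d_ll=3.0, shell=6.0):
    """Place solvent sites on a cubic grid around the protein atoms.

    d_pl: minimum protein-solvent distance (Dpl), in angstroms.
    d_ll: grid spacing, hence minimum solvent-solvent distance (Dll).
    shell: hypothetical padding added around the bounding box.
    """
    lo = [min(a[i] for a in protein_atoms) - shell for i in range(3)]
    hi = [max(a[i] for a in protein_atoms) + shell for i in range(3)]
    counts = [int(math.floor((hi[i] - lo[i]) / d_ll)) + 1 for i in range(3)]
    solvent = []
    for ix, iy, iz in itertools.product(*(range(n) for n in counts)):
        p = (lo[0] + ix * d_ll, lo[1] + iy * d_ll, lo[2] + iz * d_ll)
        # Keep the grid point only if it is far enough from every atom.
        if all(math.dist(p, a) >= d_pl for a in protein_atoms):
            solvent.append(p)
    return solvent


# Toy "protein" of two atoms, not a real structure.
protein = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
solvent = grid_solvate(protein)
```

The brute-force distance check is quadratic in the number of atoms and grid points. A spatial index would be the obvious refinement for real structures.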
Interaction networks within proteins Bioinformaticians
are increasingly taking a network
perspective on proteins. With careful definition
of sites, it is possible to analyze and visualize
such networks. For example, the hydrogen bond
network formed by mainchain carboxyl and amide
groups is shown for a portion of a zinc finger protein.
Interaction interfaces PROVAT can help in the
analysis of protein-protein, protein-drug and
protein-DNA interactions by supporting the
derivation of tessellation-based statistical
potentials. Interesting patterns in interaction
networks at interfaces can be used to design
inhibitors.
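In tessellation terms, an interface can be read as the set of Voronoi faces shared by sites that belong to different molecules. The sketch below assumes a neighbour list is already available. The site labels, the `molecule_of` mapping and the function name are hypothetical and do not come from a real structure or from PROVAT's code.

```python
def interface_pairs(neighbour_pairs, molecule_of):
    """Keep neighbour pairs whose two sites belong to different molecules."""
    return sorted(
        (a, b) for a, b in neighbour_pairs
        if molecule_of[a] != molecule_of[b]
    )


# Hypothetical sites: two protein sites, one DNA site, one ligand site.
molecule_of = {"P1": "protein", "P2": "protein", "D1": "DNA", "L1": "ligand"}
neighbour_pairs = [("P1", "P2"), ("P1", "D1"), ("P2", "L1"), ("D1", "L1")]

contacts = interface_pairs(neighbour_pairs, molecule_of)
```

Pairs within the same molecule, such as `("P1", "P2")`, are dropped, leaving only the contacts that cross an interface.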
Conclusion PROVAT will be used for deriving
neighbour relationships from protein families for
detecting neighbour-specific substitution
patterns, which will be used in
sequence-structure alignment and homology
recognition. It is possible to use this tool for
identifying shells of neighbourhood around
functional residues and studying the effect of
functional restraints in those shells. Protein
interfaces can also be studied with this tool.
From a software point of view, this tool
demonstrates the utility of re-using existing
tools in a flexible manner.
Webserver and command line options
Checking the quality of structure It is important
to check the structure for missing atoms in order
to obtain expected results from tessellation. A
missing atom may cause adjacent atoms to have
unusually large volumes. PROVAT provides options
for checking mainchain continuity and detecting
missing atoms. Missing sidechain atoms are rebuilt
with SCWRL (Canutescu et al., 2003).
Structure filtering Waters and het-atoms present
in the given structure can be used or ignored.
Covalently bonded pairs of metasites can be
ignored when reporting neighbour relationships.
For Gromacs-based solvation, the thickness of the
water box and the durations of energy minimization
and position-restrained MD can be controlled. For
heuristic solvation, the grid parameters, Dpl and
Dll, can be controlled.
Output Various data styles are provided for text
output. The orientation and area of each neighbour
of every metasite are reported. Neighbours can be
reported individually or grouped by their
physicochemical nature. For viewing the output in
Pymol, tessellation information is pickled using
Python's object serialization protocol.
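The pickling step can be illustrated with the standard `pickle` module. The dictionary layout below is a hypothetical stand-in, since the actual contents of PROVAT's pickle file are not described here.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical tessellation record: site positions and neighbour lists.
tessellation = {
    "sites": {"A:1:mc": (0.0, 0.0, 0.0), "A:1:sc": (1.5, 0.0, 0.0)},
    "neighbours": {"A:1:mc": ["A:1:sc"], "A:1:sc": ["A:1:mc"]},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "tessellation.pkl"
    # Write the record, as a calculation step might.
    with open(path, "wb") as handle:
        pickle.dump(tessellation, handle, protocol=pickle.HIGHEST_PROTOCOL)
    # Read it back, as a visualization plug-in might.
    with open(path, "rb") as handle:
        restored = pickle.load(handle)
```

Pickle files can execute code when loaded, so a viewer should only open files produced by a trusted calculation run.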
References
Poupon,A. (2004) Voronoi and Voronoi-related tessellations in studies of protein structure and interaction. Curr. Op. Struct. Biol., 14, 233-241.
Barber,C.B., Dobkin,D.P. and Huhdanpaa,H. (1996) The quickhull algorithm for convex hulls. ACM Trans. Math. Softw., 22 (4), 469-483.
Canutescu,A.A., Shelenkov,A.A. and Dunbrack,R.L.,Jr. (2003) A graph theory algorithm for protein side-chain prediction. Protein Science, 12, 2001-2014.
Lindahl,E., Hess,B. and van der Spoel,D. (2001) Gromacs 3.0: a package for molecular simulation and trajectory analysis. J. Mol. Mod., 7, 306-317.
Zimmer,R., Wohler,M. and Thiele,R. (1998) New scoring schemes for protein fold recognition based on Voronoi contacts. Bioinformatics, 14 (3), 295-308.
DeLano,W. (2002) The PyMOL User's Manual. DeLano Scientific, San Carlos, CA, USA.
Gore,S.P., Burke,D.F. and Blundell,T.L. (2005) PROVAT: a tool for Voronoi tessellation analysis of protein structures and complexes. Bioinformatics, doi:10.1093/bioinformatics/bti523
Pymol plugin for Visualization Tessellation
information can be rendered using a Pymol plugin,
which can select metasites, render surfaces,
interfaces, polyhedra and show network
representation of neighbour relationships.
Acknowledgements SG thanks Cambridge Commonwealth
Trust and Universities UK for financial support.