[Molecularmechanics] Recycling CML
David
molecularmechanics@tddft.org
07 Nov 2003 20:57:10 +0100
On Fri, 2003-11-07 at 18:41, Konrad Hinsen wrote:
> On Friday 07 November 2003 17:25, Pengyu Ren wrote:
>
> > I just took a quick look at the two proposals by you and David. David's
> > certainly more specific to protein. In your proposal two however each
> > residue/fragment is a "molecule" (why can't we just call it fragment?).
> > It's then easy to make a new molecule by reusing fragment libraries.
>
I object to calling my proposal protein specific, as it could do any
polymer. However, Konrad's proposal allows for more reuse of fragments.
> They are indeed fragments, and the "templates" part of the file is a fragment
> library that could also reside in a separate file. The term "molecule" is
> used because that's what CML uses.
>
> > I noticed your concern about definition of fragment, which may be force
> > field dependent. For protein, my suggestion is to use the whole residue as
> > a fragment, which makes interchanging force field a _lot_ easier. For any
>
> The whole residue is a fragment in my file, the fact that it is itself divided
> into two fragments is not really a problem since expansion is easy - the
> tricky part is to know which fragments are residues and which aren't. BTW,
> the subdivision is quite useful in practice, e.g. for identifying the
> backbone. If you don't put the division into the definition file, it will end
> up as explicitly coded chemical knowledge inside many programs, which is a
> less flexible approach.
Indeed, this is what has to be avoided. However, the information should
still be easily accessible, e.g. to deduce which atoms form the backbone of
your protein if you want to compute phi/psi angles. Would this be
possible using XSLT? That is, I have my XML file of a protein and use XSLT
to transform it into another XML file containing the quadruplets that
make up the torsion angles...
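
To make the idea concrete, here is a minimal sketch of that transformation.
It is written in Python rather than XSLT, and it assumes a hypothetical XML
layout (the protein/residue/atom elements and the "role" attribute are
invented for illustration and come from neither proposal). It reads the
backbone atoms of each residue and emits a second XML tree listing the
phi and psi quadruplets.

```python
import xml.etree.ElementTree as ET

# Hypothetical input layout: the "role" attribute marks backbone atoms, so
# the program needs no built-in chemical knowledge of what a backbone is.
PROTEIN_XML = """
<protein>
  <residue id="1" name="ALA">
    <atom id="n1" name="N" role="backbone"/>
    <atom id="ca1" name="CA" role="backbone"/>
    <atom id="c1" name="C" role="backbone"/>
    <atom id="cb1" name="CB" role="sidechain"/>
  </residue>
  <residue id="2" name="GLY">
    <atom id="n2" name="N" role="backbone"/>
    <atom id="ca2" name="CA" role="backbone"/>
    <atom id="c2" name="C" role="backbone"/>
  </residue>
  <residue id="3" name="ALA">
    <atom id="n3" name="N" role="backbone"/>
    <atom id="ca3" name="CA" role="backbone"/>
    <atom id="c3" name="C" role="backbone"/>
    <atom id="cb3" name="CB" role="sidechain"/>
  </residue>
</protein>
"""


def backbone_torsions(xml_text):
    """Return a <torsions> element holding the phi/psi atom quadruplets."""
    root = ET.fromstring(xml_text)

    # Collect (residue id, {atom name: atom id}) for backbone atoms only.
    residues = []
    for res in root.findall("residue"):
        atoms = {a.get("name"): a.get("id")
                 for a in res.findall("atom") if a.get("role") == "backbone"}
        residues.append((res.get("id"), atoms))

    out = ET.Element("torsions")
    for i, (res_id, cur) in enumerate(residues):
        if i > 0:
            # phi(i): C(i-1) - N(i) - CA(i) - C(i)
            prev = residues[i - 1][1]
            quad = [prev["C"], cur["N"], cur["CA"], cur["C"]]
            ET.SubElement(out, "torsion", name="phi", residue=res_id,
                          atoms=" ".join(quad))
        if i < len(residues) - 1:
            # psi(i): N(i) - CA(i) - C(i) - N(i+1)
            nxt = residues[i + 1][1]
            quad = [cur["N"], cur["CA"], cur["C"], nxt["N"]]
            ET.SubElement(out, "torsion", name="psi", residue=res_id,
                          atoms=" ".join(quad))
    return out


torsions = backbone_torsions(PROTEIN_XML)
print(ET.tostring(torsions, encoding="unicode"))
```

The sketch treats the residues as a single unbroken chain, so the first
residue has no phi and the last has no psi. Chain breaks and multiple
chains would need extra markup.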
> That sounds like a reasonable compromise: standardize names and layouts for
> frequently used items, and leave the rest open, possibly to be standardized
> in the future if and when a need arises.
>
One would indeed still have to provide complete amino acids; whether
they are built up out of one or more fragments does not matter for the
user.
Another feature that I had built into my proposal was information about
adding or removing hydrogens (protein crystal structures usually come
without them; NMR structures come with all hydrogens in any of four
different formats). That might also be possible using XSLT.
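
As a rough illustration of the simpler half of this (removing hydrogens),
here is a small Python sketch. It assumes a hypothetical residue layout in
which every atom carries an "element" attribute; the element and attribute
names are invented for the example. Adding hydrogens is harder, since it
needs geometry and naming rules, and is not attempted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical residue layout; "element" holds the chemical element symbol.
RESIDUE_XML = """
<residue id="1" name="GLY">
  <atom id="n1" name="N" element="N"/>
  <atom id="h1" name="H" element="H"/>
  <atom id="ca1" name="CA" element="C"/>
  <atom id="ha1" name="HA1" element="H"/>
  <atom id="ha2" name="HA2" element="H"/>
  <atom id="c1" name="C" element="C"/>
  <atom id="o1" name="O" element="O"/>
</residue>
"""


def strip_hydrogens(residue):
    """Remove all hydrogen atoms from a residue element, in place."""
    # Iterate over a copy of the list, because we modify the parent.
    for atom in list(residue.findall("atom")):
        if atom.get("element") == "H":
            residue.remove(atom)
    return residue


heavy = strip_hydrogens(ET.fromstring(RESIDUE_XML))
print([a.get("name") for a in heavy.findall("atom")])
```

A real file would also list bonds, and any bond that refers to a removed
hydrogen would have to be dropped in the same pass.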
> > For protein modeler, I propose we also define a format for parameter file.
> > We only need one copy of it. It should contain all the amber, charmm, ...
> > parameters, functions in the format corresponding to the structure file
>
> We have discussed this a bit already. I would prefer to postpone this for now,
> because of the compatibility issues in the force field definitions. For
> example, a "partial charge" does not mean exactly the same in Charmm and
> Amber, because Amber applies a 1-4 factor on electrostatic interactions which affects
> the total electrostatic energy. I expect the definition of a universal set of
> parameters to be a lengthy task.
Yes, but the AMBER fudge factor (also in OPLS by the way) is just a
single number which could be stored along with the force field
information, alongside a description of the bonded potentials etc. I
still do not see any fundamental problem with describing force fields
using a generic XML format. If you specify a bond, you will have to
specify which bonded potential you use of course (e.g. GROMACS
implements four different ones).
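
A minimal sketch of what I mean, in Python, under stated assumptions: the
XML layout below (forcefield, electrostatics, bondtype, scale14 and the
parameter names) is hypothetical, the numbers are placeholders rather than
real AMBER, OPLS or GROMACS parameters, and only two well-known bonded
forms (harmonic and Morse) are shown. The point is that the 1-4 factor is
one stored number and the bonded potential is selected by name.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical force field description; all values are placeholders.
FORCEFIELD_XML = """
<forcefield name="example">
  <electrostatics scale14="0.5"/>
  <bondtype atoms="A B" potential="harmonic" b0="0.15" k="1000.0"/>
  <bondtype atoms="A C" potential="morse" b0="0.10" depth="400.0" beta="20.0"/>
</forcefield>
"""


def harmonic(r, p):
    """Harmonic bond: 0.5 * k * (r - b0)^2."""
    return 0.5 * p["k"] * (r - p["b0"]) ** 2


def morse(r, p):
    """Morse bond: depth * (1 - exp(-beta * (r - b0)))^2."""
    return p["depth"] * (1.0 - math.exp(-p["beta"] * (r - p["b0"]))) ** 2


# The file names the functional form; the program maps it to a function.
POTENTIALS = {"harmonic": harmonic, "morse": morse}


def load_forcefield(xml_text):
    """Return (1-4 scale factor, {atom type pair: (function, parameters)})."""
    root = ET.fromstring(xml_text)
    scale14 = float(root.find("electrostatics").get("scale14"))
    bonds = {}
    for b in root.findall("bondtype"):
        params = {key: float(val) for key, val in b.attrib.items()
                  if key not in ("atoms", "potential")}
        pair = tuple(b.get("atoms").split())
        bonds[pair] = (POTENTIALS[b.get("potential")], params)
    return scale14, bonds


def bond_energy(bonds, pair, r):
    """Evaluate the bonded potential registered for an atom type pair."""
    func, params = bonds[pair]
    return func(r, params)


def coulomb_14(scale14, qi, qj, r):
    """Scaled 1-4 electrostatic term, arbitrary units (no conversion constant)."""
    return scale14 * qi * qj / r


scale14, bonds = load_forcefield(FORCEFIELD_XML)
print(bond_energy(bonds, ("A", "B"), 0.16))
print(coulomb_14(scale14, 1.0, -1.0, 0.5))
```

A program reading such a file applies whatever 1-4 factor the force field
declares, so the charges remain meaningful together with that factor.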
--
David.
________________________________________________________________________
David van der Spoel, PhD, Assist. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596, 75124 Uppsala, Sweden
phone: 46 18 471 4205 fax: 46 18 511 755
spoel@xray.bmc.uu.se spoel@gromacs.org http://xray.bmc.uu.se/~spoel
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++