[Molecularmechanics] Recycling CML

Konrad Hinsen molecularmechanics@tddft.org
Fri, 7 Nov 2003 19:46:10 +0100


On Friday 07 November 2003 20:57, David wrote:

> I object to calling my proposal protein specific, as it could do any
> polymer. However, Konrad's proposal allows for more reuse of fragments.

Go ahead and change the description on the Wiki!

> Indeed, this is what has to be avoided. However, the information should
> still be easily accessible, i.e. deduce which atoms form the backbone of
> your protein if you want to compute phi/psi angles. Would this be
> possible using XSLT? i.e. I have my XML file of a protein and use XSLT
> to transfer it into another XML file containing the quadruplets that
> make up the torsion angles...

That should be possible. On the other hand, I am not sure if XSLT is the right 
tool for such things. In a typical setting, I would read a system description 
into my code and then do things with it, including the calculation of phi/psi 
angles. I don't want to run an XSLT transformation first and then read 
another file into a more specialized program that calculates just the angles. 
If there were programs around that needed such input, then XSLT would be a 
good way to generate those files.
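
As a rough illustration of the "read it into my code and compute there" route, here is a minimal sketch in Python. It assumes the system description has already been parsed into a list of residues, each a dict mapping the backbone atom names "N", "CA" and "C" to coordinates; that data layout and the function names are hypothetical, not part of any proposal on this list.

```python
import math

def dihedral(p1, p2, p3, p4):
    """Signed torsion angle in degrees for four points (0 = cis, 180 = trans)."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def backbone_torsions(residues):
    """Return one (phi, psi) pair per residue; None where a neighbour is missing.

    residues: list of dicts with keys "N", "CA", "C" (hypothetical layout).
    phi uses C(i-1), N(i), CA(i), C(i); psi uses N(i), CA(i), C(i), N(i+1).
    """
    result = []
    for i, res in enumerate(residues):
        phi = psi = None
        if i > 0:
            phi = dihedral(residues[i - 1]["C"], res["N"], res["CA"], res["C"])
        if i < len(residues) - 1:
            psi = dihedral(res["N"], res["CA"], res["C"], residues[i + 1]["N"])
        result.append((phi, psi))
    return result
```

The only thing this needs from the file format is a reliable way to tell which atoms are the backbone N, CA and C of each residue, which is the accessibility requirement raised in the quoted text.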

> One would indeed still have to provide complete amino-acids, whether
> they are built up out of one or more fragments does not matter for the
> user.

Right. Amino acids, nucleic acids, lipids, etc.

> Another feature that I had built into my proposal was information about
> adding or removing hydrogens (protein crystal structures usually come
> without them, NMR structures come with all hydrogens in either of four
> different formats). That might also be possible using XSLT.

Indeed. This kind of specification is most useful for a generic fragment 
database from which variants at different levels of hydrogenation could then 
be constructed. For a list of fragments inside a file that specifies a 
complete system, such information would have to be optional (a program might 
not have it and certainly couldn't generate it). It could then be used to 
reduce the size even more by defining certain fragments as modifications of 
others. It could also be used to apply (de)hydrogenation to a whole system, 
as long as modification rules are supplied for every fragment.
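
To make "fragments as modifications of others" concrete, here is a small Python sketch. The representation (a fragment as a list of atom names, a modification as atoms to remove and atoms to add) and the atom names are assumptions chosen for illustration only; a real scheme would also have to carry bonds and parameters.

```python
def apply_modification(base_atoms, remove=(), add=()):
    """Derive a fragment variant from a base fragment.

    base_atoms: list of atom names in the base fragment (hypothetical layout).
    remove:     atom names to delete, e.g. hydrogens for a dehydrogenated variant.
    add:        atom names to append, e.g. hydrogens for a fully protonated one.
    """
    missing = [name for name in remove if name not in base_atoms]
    if missing:
        # A rule that refers to atoms the base does not have is an error,
        # not something to skip silently.
        raise KeyError(f"atoms not in base fragment: {missing}")
    atoms = [name for name in base_atoms if name not in remove]
    atoms.extend(add)
    return atoms

# Illustrative only: a methyl group and its hydrogen-free variant.
methyl = ["C", "H1", "H2", "H3"]
methyl_no_h = apply_modification(methyl, remove=("H1", "H2", "H3"))
```

Storing only the `remove`/`add` lists for the variant is what would shrink the file, at the cost of requiring every reader to implement the derivation step.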

> Yes, but the AMBER fudge factor (also in OPLS by the way) is just a
> single number which could be stored along with the force field

Sure, no problem. The difficulty is not to store the number, but to convey the 
significance of the number. A program that just reads the file and throws 
away unknown elements would end up doing electrostatics without that factor 
and give wrong results, without even a warning to the user.
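
One possible way around the silent-failure problem is a reader that refuses to continue when it meets an element it does not understand. The sketch below uses Python's standard xml.etree.ElementTree; the element names (`forcefield`, `bond`, `angle`, `scale14`) and the attribute value are made up for the example and do not come from CML or any existing schema.

```python
import xml.etree.ElementTree as ET

# Tags this hypothetical reader knows how to interpret.
KNOWN_TAGS = {"forcefield", "bond", "angle"}

class UnknownElementError(ValueError):
    """Raised when the file contains an element the reader cannot interpret."""

def read_force_field(xml_text, known_tags=KNOWN_TAGS):
    """Parse a force field description, failing loudly on unknown elements."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():          # iter() visits root and all descendants
        if elem.tag not in known_tags:
            raise UnknownElementError(
                f"element <{elem.tag}> is not understood by this reader; "
                "results computed without it could be wrong"
            )
    return root

# Illustrative input: a scaling factor this reader has never heard of.
example = '<forcefield><bond/><scale14 electrostatic="0.8333"/></forcefield>'
```

Calling `read_force_field(example)` raises `UnknownElementError` instead of quietly running electrostatics without the factor. The price is that every extension breaks older readers, so a real format would probably need a way to mark elements as safe to ignore.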

> information, alongside a description of the bonded potentials etc. I
> still do not see any fundamental problem with describing force fields
> using a generic XML format. If you specify a bond, you will have to

Me neither, it's just a LOT of work.

> specify which bonded potential you use of course (e.g. GROMACS
> implements four different ones).

And program XYZ has three other ones - that makes seven. So we define those 
seven, and after that a new force field comes along which needs yet another 
one - that's where I see the main difficulty.
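
The open-ended nature of the problem can be seen in a small Python sketch of a name-to-function table for bond potentials. The two functional forms (harmonic and Morse) are standard textbook ones; the names, parameter names and the table itself are assumptions for illustration and do not describe what GROMACS or any other program implements.

```python
import math

def harmonic(r, r0, k):
    # Note: the 1/2 prefactor is a convention; some force fields fold it
    # into k, which is exactly the kind of detail a format must pin down.
    return 0.5 * k * (r - r0) ** 2

def morse(r, r0, depth, width):
    return depth * (1.0 - math.exp(-width * (r - r0))) ** 2

# Hypothetical registry: potential name as it might appear in a file.
POTENTIALS = {"harmonic": harmonic, "morse": morse}

def bond_energy(kind, r, **params):
    """Evaluate the named bond potential at distance r."""
    try:
        func = POTENTIALS[kind]
    except KeyError:
        raise NotImplementedError(f"unknown bond potential: {kind!r}") from None
    return func(r, **params)
```

Any file that names a potential outside the table ends in `NotImplementedError`, so each new force field with a new functional form means either extending every program's table or finding a way to describe the form itself in the file.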

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------