[Molecularmechanics] Re: Some general remarks.
Martin Field
molecularmechanics@tddft.org
Fri, 21 Nov 2003 15:00:52 +1300
Hi,
I'll try and put some ideas up on the Wiki concerning
force-fields next week as David suggested - just to keep
the ball rolling.
In reply to Konrad:
> > Clearly both problems are important, but
> > it would seem to me that tackling the force-field
> > problem is in some ways easier than the composition
> > problem and that the latter could have a more immediate
> > impact in the MM area.
>
> I guess that depends on what everyone is doing. Those who are into
> developing and comparing force fields would benefit most from a unified
> force field representation. Those who are (like me) more into analyzing
> simulations would benefit most from being able to read other people's
> system descriptions and trajectories.
>
> Anyway, there is no opposition between the two, and those who are
> interested in force field descriptions should certainly go ahead and work
> on that.
Agreed. Both need doing.
> > 1. Specifying composition is not a problem unique
> > to MM simulations. Conventions to do so already exist
> > and no doubt there are ongoing efforts as well.
> > How much overlap is there with these?
>
> The biggest effort is certainly CML, which we are trying to build on.
> Otherwise, all I know of is conventions from crystallography. Perhaps
> Peter knows what the current state of affairs is in that community.
>
> > 2. Is a unique convention desirable? Thus, for example,
> > a PDB or a CIF-like convention may be most appropriate
> > for biomolecule people but other conventions may be
> > more suitable in other areas.
>=20
> I do think that a unique convention is desirable, provided that it is
> realistic. Again, I think CML goes a long way already.
>
> I used to be a CML skeptic, but that was CML1. CML2 looks like a very
> good basis to me.
Agreed, with "realistic" being the key word. I think it's important
for the fsatom/MM effort to come up with, or borrow, something that
is accepted by the wider chemical community. Building upon
CML would seem a logical approach.
However, a common approach must be flexible enough to handle a
whole range of systems sensibly. For example, calculations on small
molecules often need only a list of atoms - it is unnecessary (and
bothersome) to have to partition them into fragments. The reverse
holds for macromolecule work, where fragments are natural.
Likewise, it would be a pity not to be able to make use of the large
amount of existing data and the ways in which it is arranged. Take the
PDB conventions as an example. Is this data to be "CMLized" directly,
or is another convention to be adopted and rules then devised
for translating between the two formats?
> > 3. Most simulation programs that I am familiar
> > with generate explicit lists of atoms, force-field
> > terms, force-field parameters, etc. when they
> > generate a force-field for a particular system.
> > Most of these can also write and reread this
> > information so that the force-field generation
> > step does not have to be repeated.
>
> Indeed, and a description at that level might be easier to unify than a
> description at the level of an abstract force field specification ("the
> CHARMM force field"), because what really differs greatly among force
> fields is the algorithm for assigning parameters in a particular system.
My point was that many existing programs already make use of force-field
files which have little or no fragment information. A common format for
these files would, I suggest, be relatively straightforward to devise and
would require little modification of the programs. Being able to use common
files between programs, even if they are "flat", would already be a step
forward.
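To make the "flat" idea concrete, here is a minimal sketch of what such a common file might hold and how a program could write and re-read it. Everything in it is an assumption made for illustration: the class names (Atom, BondTerm, FlatSystem), the field names, the use of JSON rather than CML, and the parameter values, which are not taken from any real force field.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Atom:
    element: str
    charge: float


@dataclass
class BondTerm:
    i: int                 # index of the first atom in FlatSystem.atoms
    j: int                 # index of the second atom
    force_constant: float  # illustrative value, arbitrary units
    length: float          # illustrative equilibrium length


@dataclass
class FlatSystem:
    """Flat description: explicit atoms and terms, no fragment hierarchy."""
    atoms: list = field(default_factory=list)
    bonds: list = field(default_factory=list)


def dumps(system):
    """Serialise the flat system to a JSON string."""
    return json.dumps(asdict(system), indent=2)


def loads(text):
    """Rebuild a FlatSystem from the JSON produced by dumps()."""
    raw = json.loads(text)
    return FlatSystem(
        atoms=[Atom(**a) for a in raw["atoms"]],
        bonds=[BondTerm(**b) for b in raw["bonds"]],
    )


# A three-atom example with made-up parameters.
water = FlatSystem(
    atoms=[Atom("O", -0.8), Atom("H", 0.4), Atom("H", 0.4)],
    bonds=[BondTerm(0, 1, 450.0, 0.96), BondTerm(0, 2, 450.0, 0.96)],
)
restored = loads(dumps(water))
```

The point of the sketch is only that a program which already generates explicit lists needs little more than a writer and a reader for them; the real format would have to cover angles, dihedrals, non-bonded terms and units as well.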
As a slight digression, I guess that most force-fields in the biomolecule
area have fragment libraries with force-field parameter and term
definitions, and then have rules for joining fragments together to generate
the force-field for the simulation system (as opposed to the approach based
solely upon element type and bond connectivity - i.e. the local chemical
environment). Thus, once a fragment notation has been developed, I don't
see that it would be so difficult to come up with a representation for such
fragment libraries and the rules by which fragments could be combined.
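As a rough illustration of a fragment library plus a joining rule, the sketch below expands a sequence of fragments into flat atom and bond lists. It is not how any particular program does it: the fragment names "X" and "Y", the dictionary layout, the "head"/"tail" link atoms and the single rule "bond the tail of one fragment to the head of the next" are all hypothetical.

```python
# Hypothetical fragment library: atom names, intra-fragment bonds, and the
# atoms used to link a fragment to its neighbours in a linear chain.
FRAGMENTS = {
    "X": {
        "atoms": ["N", "CA", "C", "O"],
        "bonds": [("N", "CA"), ("CA", "C"), ("C", "O")],
        "head": "N",
        "tail": "C",
    },
    "Y": {
        "atoms": ["N", "CA", "CB", "C", "O"],
        "bonds": [("N", "CA"), ("CA", "CB"), ("CA", "C"), ("C", "O")],
        "head": "N",
        "tail": "C",
    },
}


def build_chain(sequence, library):
    """Expand a fragment sequence into flat atom and bond lists.

    Joining rule (an assumption): the tail atom of each fragment is bonded
    to the head atom of the following fragment.
    """
    atoms = []  # (fragment index, fragment name, atom name)
    bonds = []  # pairs of indices into atoms
    prev_tail = None
    for frag_index, name in enumerate(sequence):
        frag = library[name]
        offset = len(atoms)
        # Map fragment-local atom names to global indices.
        local = {atom: offset + k for k, atom in enumerate(frag["atoms"])}
        atoms.extend((frag_index, name, atom) for atom in frag["atoms"])
        bonds.extend((local[a], local[b]) for a, b in frag["bonds"])
        if prev_tail is not None:
            bonds.append((prev_tail, local[frag["head"]]))
        prev_tail = local[frag["tail"]]
    return atoms, bonds


atoms, bonds = build_chain(["X", "Y"], FRAGMENTS)
```

Each flat atom still carries its fragment index and name, so the flat lists can be used for a simulation while the fragment information remains available for analysis.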
> However, my preference would be to have such information as an add-on
> (either in the same file or in a separate one) to a system specification.
I think the choice of how to combine the data should be flexible - in the
same files, as an add-on or in separate files.
> In case you worry about the complexity of the fragment stuff, note that
> it is entirely optional. The proposal I made on the Wiki concentrates on
> fragments because that's where the problems are that need to be discussed.
> Nothing stops any program from producing an empty "template" section and
> writing out all atoms and bonds explicitly, which may indeed be simpler to
> implement.
I don't think it is too complicated, but I think it should be optional. A
hierarchical structure should not be imposed as it may be inappropriate or
unavailable.
> However, a file that does contain a well-defined fragment list is both
> more compact and richer in information, so I think it's worth working on
> that approach.
It is important to have several representations of a system. A richer
representation that includes fragment data is perhaps more appropriate for
analysis, whereas a flat representation is perhaps more useful for
performing a simulation.
Cheers,
Martin.