[Molecularmechanics] coincidence?
Konrad Hinsen
molecularmechanics@tddft.org
Mon, 9 Feb 2004 18:20:35 +0100
On Monday 09 February 2004 17:37, Kaihsu Tai wrote:
> Are there any 'deliverables' that we can use on topology
> (beyond the MMTK API)?
Nothing from this discussion so far.
> This sounds excellent. I have only been able to locate MMTK
> API documentation on the website you cited, but not a 'raw'
> description of the format. It appears our team may want to
There is none yet, but it is sufficient to take a look at a trajectory with
ncdump to see the structure. There are only two non-obvious points:
1) There are two ways of handling the time coordinate. The old/simple
style uses time as the first dimension of all arrays. The new/complex
one uses two dimensions for time, the first and the last of each array,
with the first "unlimited" and the second (varying faster) defined with
a fixed length.
The reason for this added complication is efficiency in reading
a trajectory for all time steps but only one atom, as needed e.g.
in the computation of time correlation functions. This operation
becomes rather slow with the netCDF library when files get big.
The new layout speeds this up significantly. The length of the
last dimension is typically set to 100 - 500, depending on the
expected length of the trajectory.
2) The specification of the system topology. It is stored in a string
(attribute "description") that uses a compact but not very readable
format. Moreover, it avoids repeating group definitions by making
references (by name) to entries in the MMTK database. Turning
this into a standard file format would thus require a convention
about these names for standard groups.
One could perhaps replace this specification by some XML format,
but that would come at a huge cost in size. The current format
also has the advantage of being easy to parse - all the more for
Python programs, as it is a valid Python expression.
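As a rough illustration of point 1, the sketch below mimics the two-dimensional time layout with plain nested lists standing in for netCDF arrays. The function names, the nested-list layout, and the fill value are assumptions made for this example and are not part of MMTK or the netCDF library. A flat step number is split into a slow "major" index (the unlimited first dimension) and a fast "minor" index (the fixed-length last dimension).

```python
def to_block_index(step, block_length):
    """Split a flat step number into (major, minor) indices."""
    return divmod(step, block_length)


def to_blocked(flat, block_length, fill=None):
    """Rearrange flat[step][atom] into blocks[major][atom][minor]."""
    n_atoms = len(flat[0])
    n_blocks = -(-len(flat) // block_length)  # ceiling division
    blocks = [
        [[fill] * block_length for _ in range(n_atoms)]
        for _ in range(n_blocks)
    ]
    for step, frame in enumerate(flat):
        major, minor = to_block_index(step, block_length)
        for atom, value in enumerate(frame):
            blocks[major][atom][minor] = value
    return blocks


def atom_time_series(blocks, atom, n_steps):
    """Collect all time steps for one atom from the blocked layout.

    Each block contributes block_length consecutive values for the atom,
    so far fewer separate pieces are touched than with one value per step.
    """
    series = []
    for block in blocks:
        series.extend(block[atom])
    return series[:n_steps]  # drop padding in the last, partly filled block
```

With a block length of 100, for instance, the time series of one atom over 10000 steps is assembled from 100 contiguous runs instead of 10000 scattered values, which is the kind of access pattern the new layout is meant to favour.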
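To illustrate the parsing remark in point 2: because the description is a valid Python expression, Python's own parser can be used to inspect it without executing anything. The description string and the constructor names `group` and `ref` used below are invented for this sketch and do not reproduce the actual MMTK syntax. Only the standard `ast` module is used.

```python
import ast


def database_references(description, ref_name="ref"):
    """Return the sorted names passed to calls of `ref_name`.

    `ref_name` is a hypothetical constructor marking a reference
    (by name) to a database entry; it is an assumption of this sketch.
    Raises SyntaxError if `description` is not a valid Python expression.
    """
    tree = ast.parse(description, mode="eval")
    found = set()
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == ref_name
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.add(node.args[0].value)
    return sorted(found)
```

A list like the one returned here is what a naming convention for standard groups would have to cover if the format were used outside MMTK.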
> convert 'any' trajectory format into this MMTK format, then
> use the resulting MMTK objects to write both the topology
> and the trajectory into the BioSimGrid database.
That would be rather straightforward to do.
> I suppose this means we are on our own with the 'input file'
> issue, and will have to write our own schema. Any advice?
I think there is no way around providing the original input file for whatever
program was used as the ultimate reference. A unified format looks close to
impossible (or at least not worth the effort), considering that some
codes (CHARMM, MMTK, perhaps others) use a full scripting language for input
definitions.
What do you expect those files to be used for? Wouldn't it be better to store
any machine-readable information in the trajectory file?
Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------