[Molecularmechanics] CML update

Peter Murray-Rust molecularmechanics@tddft.org
Mon, 22 Dec 2003 12:23:28 +0000


[Crossposted to FSATOM and eminerals lists.]

I have been compiling an inventory of CML components and developing 
software to manage them. I hope the strategy will be useful for FSATOM 
and/or eminerals and I'd be grateful for feedback. I had a very valuable 
visit to UCL and some of this mail reflects ideas from that visit.

CML is now designed as a "menu" of about 100 components which are
assembled for particular applications. A designer picks from the list, and
an XMLSchema is then created automatically and software generated. [Some of
the CML elements are only relevant to macroscopic concepts and I have
omitted them here.] I have also omitted infrastructure (containers such as
fooList, dictionaries, metadata, etc.), although these will be important.
The approximate set of components of interest to these lists is:

constituents
CMLAtom
CMLAtomParity
CMLAtomSet
CMLAtomType
CMLBond
CMLBondSet
CMLBondStereo
CMLBondType
CMLCrystal
CMLElectron
CMLFormula
CMLIsotope
CMLMolecule
CMLParticle

geometry of constituents
CMLAngle
CMLLength
CMLTorsion
CMLZMatrix

computational structure
CMLModule

identifiers
CMLIdentifier
CMLLabel
CMLName

STM datatypes
CMLArray
CMLEigen
CMLGradient
CMLList
CMLMatrix
CMLParameter
CMLScalar
CMLTable

units
CMLUnit
CMLUnitType

QM
CMLAtomicBasisFunction
CMLBasisSet

forcefields
CMLArg
CMLExpression
CMLOperator
CMLPotential
CMLPotentialForm
CMLPotentialList

extended structures
CMLLattice
CMLLatticeVector
CMLRegion
CMLSystem

geometrical objects
CMLLine3
CMLPlane3
CMLPoint3
CMLSphere3
CMLVector3
CMLSymmetry

properties
CMLBand
CMLConditionList
CMLProperty

reactions
CMLMechanism
CMLMechanismComponent
CMLProduct
CMLReactant
CMLReaction
CMLReactionScheme
CMLReactionStepList
CMLReactiveCentre
CMLSpectator
CMLTransitionState

spectra
CMLPeak
CMLPeakGroup
CMLPeakList
CMLSpectrum
CMLSpectrumData
CMLXaxis
CMLYaxis
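
As a rough illustration of the "menu" idea (a designer selects components and a schema is derived from the selection), here is a minimal sketch. It is not the actual CML tooling: the functions `element_name` and `build_schema`, the reduced `MENU`, and the rule that maps a component name such as CMLAtomParity to an element name by dropping the "CML" prefix are all assumptions made for this example. The output is only a skeleton with one bare xs:element declaration per selected component.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A small subset of the component names from the inventory above.
MENU = {"CMLAtom", "CMLAtomParity", "CMLBond", "CMLMolecule",
        "CMLCrystal", "CMLLattice", "CMLScalar"}


def element_name(component: str) -> str:
    """Hypothetical mapping: drop the 'CML' prefix, lower-case the first letter."""
    if not component.startswith("CML") or len(component) <= 3:
        raise ValueError(f"not a CML component name: {component!r}")
    stem = component[3:]
    return stem[0].lower() + stem[1:]


def build_schema(selection):
    """Return an xs:schema element with one xs:element per selected component."""
    unknown = sorted(set(selection) - MENU)
    if unknown:
        raise KeyError(f"not on the menu: {unknown}")
    ET.register_namespace("xs", XSD_NS)
    schema = ET.Element(f"{{{XSD_NS}}}schema")
    # Sort so that the generated schema does not depend on selection order.
    for component in sorted(set(selection)):
        ET.SubElement(schema, f"{{{XSD_NS}}}element",
                      name=element_name(component))
    return schema


if __name__ == "__main__":
    picked = ["CMLMolecule", "CMLAtom", "CMLBond"]
    print(ET.tostring(build_schema(picked), encoding="unicode"))
```

A real generator would also have to emit types, attributes and content models for each component; the sketch only shows the selection step driving the output.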

The approach requires that all objects be context-free, and I would
therefore like to know what additional components are in regular use and
would be of interest. These objects should be standalone, i.e. it should
make sense to write software for them independently of where they occur. At
present my list includes:

TODO:

transformations and reference frames
multipoles
3D surfaces (e.g. convex hulls, molecular surfaces)
molecular fragments (partially done)
chemical queries and bitscreens

possible:

character tables
scripting and job control

out of scope:

pseudopotentials (being done by FSATOM)
graphical display and rendering (not primary content)
subatomic particles, nuclear transformations ("not chemistry")
macromolecular structure, e.g. mmCIF (CML supports a PDB-like approach but 
leaves hierarchies to the macromolecular community)
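
The context-free requirement mentioned earlier can be pictured with a small sketch. The `Length` class below is a hypothetical stand-in for a CMLLength-like component, and the element name `length` and the attribute names `atomRefs` and `units` are assumptions chosen for the example rather than a statement of the CML schema. The point is only that the object reads and writes itself from its own element, so the same code works wherever the element occurs.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Length:
    """Hypothetical standalone component: a distance between two atoms."""
    atom_refs: tuple  # ids of the two atoms, carried by the object itself
    value: float
    units: str

    def to_xml(self) -> ET.Element:
        # Everything written here comes from the object's own fields.
        elem = ET.Element("length",
                          atomRefs=" ".join(self.atom_refs),
                          units=self.units)
        elem.text = repr(self.value)
        return elem

    @classmethod
    def from_xml(cls, elem: ET.Element) -> "Length":
        # Only the element itself is inspected, never its parent.
        return cls(tuple(elem.get("atomRefs").split()),
                   float(elem.text),
                   elem.get("units"))


if __name__ == "__main__":
    bond_length = Length(("a1", "a2"), 1.54, "angstrom")

    # The same element parses identically on its own or inside a container.
    alone = bond_length.to_xml()
    parent = ET.Element("molecule")
    parent.append(bond_length.to_xml())

    print(Length.from_xml(alone) == Length.from_xml(parent[0]))
```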

Even without details, an overall design defining the scope would be valuable.

P.


Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069