From baaden at smplinux.de Tue Apr 4 09:08:29 2006
From: baaden at smplinux.de (Marc Baaden)
Date: Tue Apr 4 09:07:12 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
Message-ID: <200604040808.k3488TEw032432@apex.ibpc.fr>

Hi,

I hope that this message is not off-topic and that this list is not dead
(last message from February 2005?).

I was wondering whether there is any kind of standard for treating molecular
systems and molecules when it comes to object oriented approaches. I was
thinking along the lines of a generic UML model which would be applicable to
any kind of object oriented language.
As I am new to these approaches (object oriented in general and UML in
particular), I wonder whether it is better to use (if it exists) such a model
or alternatively to develop a new one more specifically tailored to a given
application.

The aim of the application(s) I have in mind is to manipulate molecular data
as well as simulation output (trajectories) in various phases of
preparation&setup, production and post-production/analysis, which is why I
am looking for a rather general model.

Thanks in advance for any hints,
Marc Baaden

From pm286 at cam.ac.uk Tue Apr 4 09:46:08 2006
From: pm286 at cam.ac.uk (peter murray-rust)
Date: Tue Apr 4 09:44:56 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <200604040808.k3488TEw032432@apex.ibpc.fr>
References: <200604040808.k3488TEw032432@apex.ibpc.fr>
Message-ID: <7.0.1.0.0.20060404094115.02087e28@cam.ac.uk>

At 09:08 04/04/2006, Marc Baaden wrote:
>Hi,
>
>I hope that this message is not off-topic and that this list is not dead
>(last message from February 2005?).
>
>I was wondering whether there is any kind of standard for treating molecular
>systems and molecules when it comes to object oriented approaches. I was
>thinking along the lines of a generic UML model which would be applicable to
>any kind of object oriented language.
>As I am new to these approaches (object oriented in general and UML in
>particular), I wonder whether it is better to use (if it exists) such a model
>or alternatively to develop a new one more specifically tailored to a given
>application.
>
>The aim of the application(s) I have in mind is to manipulate molecular data
>as well as simulation output (trajectories) in various phases of
>preparation&setup, production and post-production/analysis, which is why I
>am looking for a rather general model.

I don't believe that anyone has managed to address all the points you
mention - it is much more ambitious than it appears.

We have developed an OO approach to many of the static aspects of small
molecular systems with CML and by translating the XSD schema automatically to
Java. There are ca. 100 significant object classes. See cml.sf.net including
the Wiki and www.sf.net/projects/cml for the CVS repository.

My personal feeling is that extending OO to dynamics and trajectories may be
more effort than it is worth unless we can standardise our communal
representation of the systems - I don't know whether this is possible. There
are many hidden semantics in these systems.

P.

Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road, Cambridge CB2 1EW, UK
+44-1223-763069

From spoel at xray.bmc.uu.se Tue Apr 4 10:00:22 2006
From: spoel at xray.bmc.uu.se (David van der Spoel)
Date: Tue Apr 4 09:59:13 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <7.0.1.0.0.20060404094115.02087e28@cam.ac.uk>
References: <200604040808.k3488TEw032432@apex.ibpc.fr>
    <7.0.1.0.0.20060404094115.02087e28@cam.ac.uk>
Message-ID: <443235A6.90701@xray.bmc.uu.se>

peter murray-rust wrote:
> At 09:08 04/04/2006, Marc Baaden wrote:
>
>> Hi,
>>
>> I hope that this message is not off-topic and that this list is not dead
>> (last message from February 2005?).
>>
>> I was wondering whether there is any kind of standard for treating
>> molecular systems and molecules when it comes to object oriented
>> approaches. I was thinking along the lines of a generic UML model which
>> would be applicable to any kind of object oriented language.
>> As I am new to these approaches (object oriented in general and UML in
>> particular), I wonder whether it is better to use (if it exists) such a
>> model or alternatively to develop a new one more specifically tailored to
>> a given application.
>>
>> The aim of the application(s) I have in mind is to manipulate molecular
>> data as well as simulation output (trajectories) in various phases of
>> preparation&setup, production and post-production/analysis, which is why
>> I am looking for a rather general model.
>
> I don't believe that anyone has managed to address all the points you
> mention - it is much more ambitious than it appears.
>
> We have developed an OO approach to many of the static aspects of small
> molecular systems with CML and by translating the XSD schema
> automatically to Java. There are ca 100 significant object classes. See
> cml.sf.net including the Wiki and www.sf.net/projects/cml for the CVS
> repository
>
> My personal feeling is that extending OO to dynamics and trajectories
> may be more effort than it is worth unless we can standardise our
> communal representation of the systems - I don't know whether this is
> possible. There are many hidden semantics in these systems.
>

We are still considering doing something like this for GROMACS, but in a
minimum-effort kind of way. Design criteria:
- simplicity: for developers and users
- compatibility: we cannot expect our users to write XSLT files, so we have
  to build in compatibility with older (GROMACS) formats and other "Industry
  Standard" formats such as pdb.

Although we should be able to use some of the CML stuff, much of the MD
trajectories etc. is too specific to add into CML itself.
I welcome a discussion just before we start coding...

> P.
>
> Peter Murray-Rust
> Unilever Centre for Molecular Sciences Informatics
> University of Cambridge,
> Lensfield Road, Cambridge CB2 1EW, UK
> +44-1223-763069
> _______________________________________________
> Molecularmechanics mailing list
> Molecularmechanics@tddft.org
> http://www.tddft.org/mailman/listinfo/molecularmechanics

--
David.
________________________________________________________________________
David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596, 75124 Uppsala, Sweden
phone: 46 18 471 4205 fax: 46 18 511 755
spoel@xray.bmc.uu.se spoel@gromacs.org http://folding.bmc.uu.se
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From baaden at smplinux.de Tue Apr 4 10:59:07 2006
From: baaden at smplinux.de (Marc Baaden)
Date: Tue Apr 4 10:57:50 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: Your message of "Tue, 04 Apr 2006 11:00:22 +0200."
    <443235A6.90701@xray.bmc.uu.se>
Message-ID: <200604040959.k349x7Ba025008@apex.ibpc.fr>

Dear Peter and David,

thanks for your comments and suggestions. I was aware of CML and of the
Gromacs developers' endeavour to use XML-based data formats in the future.
However it is not clear in my mind how an XML-based description like CML and
a more application/implementation oriented UML model are related, or how
equivalent they are. I had the impression that although a CML description of
eg trajectory data might be very difficult (and maybe overkill), a UML
description would be more straightforward (somehow abstracting from existing
classes in MD code for example).

>> Peter Murray-Rust [..]
>> We have developed an OO approach to many of the static aspects of
>> small molecular systems with CML and by translating the XSD schema
>> automatically to Java. There are ca 100 significant object classes. [..]
>> My personal feeling is that extending OO to dynamics and trajectories
>> may be more effort than it is worth unless we can standardise our
>> communal representation of the systems [..]

Could this be addressed via a minimum common denominator? Eg extending
static structures by saying that there might be various sets (possibly time
dependent) of their coordinates. (Maybe that's already implemented, I haven't
checked the CML schema yet, sorry).

>> David van der Spoel
>> We are still considering doing something like this for GROMACS, but
>> in a minimum-effort kind of way. Design criteria
>> - simplicity: for developers and users
>> - compatibility: we cannot expect our users to write XSLT files so we
>> have to build in compatibility with older (GROMACS) formats and other
>> "Industry Standard" formats such as pdb. [..]

I agree. That's why I was thinking that UML could be more "user friendly" in
the sense that it uses graphical views of the object model that might be
easier for the common user to approach and comprehend (ok, some very
optimistic speculation here).

>> Although we should be able to use some of the CML stuff, much of the MD
>> trajectories etc. is too specific to add into CML itself.

I also think UML would leave an option to use binary formats or high-level
data formats like HDF or netCDF for the data that is better suited to these
models. Thank you very much for your input.

>> I welcome a discussion just before we start coding...
In the (again optimistic) case that I'll get up to speed with UML by then,
I'd be glad to participate in the discussion :)

Cheers,
Marc

From konrad.hinsen at laposte.net Tue Apr 4 12:08:30 2006
From: konrad.hinsen at laposte.net (Konrad Hinsen)
Date: Tue Apr 4 12:07:32 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <200604040808.k3488TEw032432@apex.ibpc.fr>
References: <200604040808.k3488TEw032432@apex.ibpc.fr>
Message-ID: <0A93DA8B-F071-44AC-8CA4-3B5C0A2F37AF@laposte.net>

On Apr 4, 2006, at 10:08, Marc Baaden wrote:

> I hope that this message is not off-topic and that this list is not
> dead (last message from February 2005?).

You just started a revival!

> I was wondering whether there is any kind of standard for treating
> molecular systems and molecules when it comes to object oriented
> approaches. I was thinking along the lines of a generic UML model which
> would be applicable to any kind of object oriented language.

I am not aware of any such standard. The closest attempt was probably the
workshop "Computational Representation of Biomolecules" held in 2003. Its goal
was to work towards a common object-oriented API for working with
biomolecules. Although UML was not mentioned, it would be one way to define a
specification for such an API. For more information, look at the workshop
report:

    https://mgldev.scripps.edu/CRBM/report

Unfortunately, there has been no follow-up until now. A second workshop was
planned for March 2006, but apparently hasn't taken place.

> As I am new to these approaches (object oriented in general and UML in
> particular), I wonder whether it is better to use (if it exists) such a
> model or alternatively to develop a new one more specifically tailored to
> a given application.
>
> The aim of the application(s) I have in mind is to manipulate molecular
> data as well as simulation output (trajectories) in various phases of
> preparation&setup, production and post-production/analysis, which is why
> I am looking for a rather general model.

What exactly is your goal? Do you want to write software, find software
ready to use, glue together different software, define an interchange
standard, or yet something else?

I agree with Peter that a full formal description of molecular simulation
would be a very ambitious project, considering the many different approaches
already published and the even more numerous techniques and models still
under development.

I am not sure that formal approaches such as UML, which were developed for
big software projects, are well adapted to scientific computing, which is
characterized by fast change and small development groups.

If you are new to OO techniques, you might want to start by looking at
existing OO approaches to molecular simulation. The ones I am aware of are
(in alphabetical order):

- Adun: http://diana.imim.es/Adun
- mmLib: http://pymmlib.sourceforge.net/
- NAMD: http://www.ks.uiuc.edu/Research/namd/
- OOMPAA: http://mccammon.ucsd.edu/~oompaa/
- OOPSE: http://oopse.org/

and my very own one:

- MMTK: http://dirac.cnrs-orleans.fr/MMTK/

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Laboratoire Léon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: konrad.hinsen@cea.fr
---------------------------------------------------------------------

From pm286 at cam.ac.uk Tue Apr 4 13:40:28 2006
From: pm286 at cam.ac.uk (peter murray-rust)
Date: Tue Apr 4 13:39:26 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <200604040959.k349x7Ba025008@apex.ibpc.fr>
References: <200604040959.k349x7Ba025008@apex.ibpc.fr>
Message-ID: <7.0.1.0.0.20060404132037.02093430@cam.ac.uk>

At 10:59 04/04/2006, Marc Baaden wrote:
>Dear Peter and David,
>
>thanks for your comments and suggestions. I was aware of CML and of the
>Gromacs developers' endeavour to use XML-based data formats in the future.

There is considerable overlap with the forthcoming CECAM meeting on:

Data representation and code interoperability for computational materials
physics and chemistry
http://www.cecam.fr/index.php?content=activities/workshop&action=details&wid=50

in which at least KonradH and myself are participants. There is a strong
emphasis on XML and currently a considerable number of the codes are XMLised
in some way. The meeting will act to try to find commonalities.

I shall be able to comment more after the meeting but the prejudices I take
to it are:

* it is important to have running code. Designs without code are of limited
  value. They are usually overcomplex.
* in physical science we can probably create context-independent primitives.
  Thus , , , etc. can have semantics which is independent of the application.
  Of course the *data* are application-dependent.
* the next level up - for example describing molecular dynamics
  trajectories - is currently too complex and fluid to be described formally.
  There are a number of approaches which can be used to add semantics - RDF,
  AgentX, scripting languages etc. - to bind the primitives. These must be
  regarded as experimental at present and it will take real experimentation
  with real users to define their scope and success.
* Complicated systems will only be adopted if they can be shown to produce
  real value and involve little effort. This is very hard to achieve.
* many of the initial systems will be code-dependent. Thus there is a
  GAMESS-US markup language, while a number of the other codes (GULP,
  SIESTA, DL_POLY, CASTEP) are being CMLised. This is probably a matter of
  taste and convenience - I expect these will anneal to a common set of
  semantics where possible and this will be represented in a community-wide
  set of specifications.

IMO the first thing to solve is the representation of the data objects. Thus
how do you represent "melting point"? It requires units, error estimates, and
for many scientists knowledge of the pressure. That requires careful semantic
support. Even "atomic velocities" require units - we cannot force the
community to adopt SI everywhere.

So as part of the solid state work we are extending CML to address primitive
structures that may be useful in - say - molecular dynamics. The atoms, their
positions, delta positions, velocities, accelerations, constraints, etc. all
have to be carefully represented.

After that we can then construct something - probably not in CML - which
describes a trajectory. And it should be done fairly succinctly - people are
frightened by data structures that do not look like FORTRAN tables. For that
reason we introduced a syntax in CML that can hold exactly that and could
hold a giant molecule or a simple trajectory.

P.
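
The "FORTRAN table" idea above (named columns, one row per atom, values
separated by whitespace, strictly rectangular) can be sketched in a few lines
of Python. This is only an illustration under stated assumptions, not CML or
JUMBO code: the `parse_table` helper, the column names and the sample
coordinates are all hypothetical and chosen for the example.

```python
def parse_table(text):
    """Parse a whitespace-separated table whose first line names the columns.

    Returns one dict per row. Raises ValueError if a row does not have
    exactly one value per column, i.e. if the table is not rectangular.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    columns = lines[0].split()
    rows = []
    for number, line in enumerate(lines[1:], start=1):
        values = line.split()
        if len(values) != len(columns):
            raise ValueError(
                f"row {number} has {len(values)} values, "
                f"expected {len(columns)}"
            )
        rows.append(dict(zip(columns, values)))
    return rows


# Hypothetical sample data: three atoms with made-up coordinates.
SAMPLE = """
id  element  x3      y3      z3     formalCharge
a1  O        0.0000  0.0000  0.000  0
a2  H        0.0757  0.0586  0.000  0
a3  H       -0.0757  0.0586  0.000  0
"""

atoms = parse_table(SAMPLE)
print(atoms[0]["element"], float(atoms[1]["x3"]))
```

Adding a new per-atom quantity then only means adding a column; what the
column means (units, definition) would have to come from elsewhere, which is
the semantic problem discussed above.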
Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road, Cambridge CB2 1EW, UK
+44-1223-763069

From konrad.hinsen at laposte.net Tue Apr 4 14:35:29 2006
From: konrad.hinsen at laposte.net (Konrad Hinsen)
Date: Tue Apr 4 14:33:56 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <7.0.1.0.0.20060404132037.02093430@cam.ac.uk>
References: <200604040959.k349x7Ba025008@apex.ibpc.fr>
    <7.0.1.0.0.20060404132037.02093430@cam.ac.uk>
Message-ID: <6C7841D5-BEC0-47F2-89B6-29A48C817DEF@laposte.net>

On Apr 4, 2006, at 14:40, peter murray-rust wrote:

> * it is important to have running code. Designs without code are of
> limited value. They are usually overcomplex.

I think this is a very important point. Experience has shown that people are
much more willing to debate (and thus complexify) specifications than to
implement them. That has also been a major weakness of this list: in spite of
lengthy discussions about a common format, no proposal has ever been
implemented in more than one program, and therefore no test of usability in
data exchange has ever been made.

> * Complicated systems will only be adopted if they can be shown to
> produce real value and involve little effort. This is very hard to
> achieve.

Indeed!

> CML - which describes a trajectory. And it should be done fairly
> succinctly - people are frightened by data structures that do not
> look like FORTRAN tables. For that

There is also the argument of human readability, a weak spot of many
XML-based formats. For an illustration, try to find some specific information
in a PDBML file. Good old PDB is much more user-friendly there - though it
remains very deficient in many other respects.

Konrad.
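
To make the readability argument concrete: a PDB ATOM record keeps each field
in fixed columns, so a value can be found by eye or by slicing the line. The
following Python sketch assumes the standard ATOM column layout (serial in
columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue
number in 23-26, x/y/z in 31-54); the helper names and the sample values are
hypothetical, and only the first 54 columns are handled.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Write columns 1-54 of a PDB ATOM record (coordinates in Angstrom)."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain:1s}{res_seq:4d}"
        f"    {x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_atom(line):
    """Read the same fields back by slicing fixed column ranges."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }


# Made-up example atom, not taken from any real structure.
record = format_atom(1, " CA ", "ALA", "A", 12, 11.104, 6.134, -6.504)
print(record)
print(parse_atom(record))
```

The same fixed layout is also the source of the deficiencies mentioned above:
field widths cap the number of atoms and the coordinate range, and there is no
room for new per-atom quantities.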
--
---------------------------------------------------------------------
Konrad Hinsen
Laboratoire Léon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: konrad.hinsen@cea.fr
---------------------------------------------------------------------

From jensj at fysik.dtu.dk Tue Apr 4 15:12:18 2006
From: jensj at fysik.dtu.dk (Jens Jorgen Mortensen)
Date: Tue Apr 4 15:11:01 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <0A93DA8B-F071-44AC-8CA4-3B5C0A2F37AF@laposte.net>
References: <200604040808.k3488TEw032432@apex.ibpc.fr>
    <0A93DA8B-F071-44AC-8CA4-3B5C0A2F37AF@laposte.net>
Message-ID: <44327EC2.3070009@fysik.dtu.dk>

Konrad Hinsen wrote:

> If you are new to OO techniques, you might want to start by looking at
> existing OO approaches to molecular simulation. The ones I am aware of
> are (in alphabetical order):
>
> - Adun: http://diana.imim.es/Adun
> - mmLib: http://pymmlib.sourceforge.net/
> - NAMD: http://www.ks.uiuc.edu/Research/namd/
> - OOMPAA: http://mccammon.ucsd.edu/~oompaa/
> - OOPSE: http://oopse.org/
>
> and my very own one:
>
> - MMTK: http://dirac.cnrs-orleans.fr/MMTK/

Another Python framework:

- ASE: http://wiki.fysik.dtu.dk/ase

--
Jens Jørgen

From pm286 at cam.ac.uk Tue Apr 4 15:22:20 2006
From: pm286 at cam.ac.uk (peter murray-rust)
Date: Tue Apr 4 15:21:10 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <6C7841D5-BEC0-47F2-89B6-29A48C817DEF@laposte.net>
References: <200604040959.k349x7Ba025008@apex.ibpc.fr>
    <7.0.1.0.0.20060404132037.02093430@cam.ac.uk>
    <6C7841D5-BEC0-47F2-89B6-29A48C817DEF@laposte.net>
Message-ID: <7.0.1.0.0.20060404151633.03310a70@cam.ac.uk>

At 14:35 04/04/2006, Konrad Hinsen wrote:
>On Apr 4, 2006, at 14:40, peter murray-rust wrote:
>
>>CML - which describes a trajectory.
>>And it should be done fairly succinctly - people are frightened by data
>>structures that do not look like FORTRAN tables. For that
>
>There is also the argument of human readability, a weak spot of many
>XML-based formats. For an illustration, try to find some specific
>information in a PDBML file. Good old PDB is much more user-friendly
>there - though it remains very deficient in many other respects.

addresses these. It allows you to create tables with any number of columns
and for those columns to be any or other hardcoded CML value (e.g. x3,
formalCharge, etc.). In this way it is trivial to emulate a PDB but with
completely extensible semantics. You can design any new concept and add it
through the dictionary mechanism. The primary requirement is that the table
is rectangular and that values are whitespace (or other delimiter) separated.
It can even be pretty printed.

For more complex data structures you have to resort to nested XML (which is
at least more human friendly than relational tables). It is implemented in
JUMBO.

P.

>Konrad.
>--
>---------------------------------------------------------------------
>Konrad Hinsen
>Laboratoire Léon Brillouin, CEA Saclay,
>91191 Gif-sur-Yvette Cedex, France
>Tel.: +33-1 69 08 79 25
>Fax: +33-1 69 08 82 61
>E-Mail: konrad.hinsen@cea.fr
>---------------------------------------------------------------------

Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road, Cambridge CB2 1EW, UK
+44-1223-763069

From baaden at smplinux.de Tue Apr 4 15:39:50 2006
From: baaden at smplinux.de (Marc Baaden)
Date: Tue Apr 4 15:38:35 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: Your message of "Tue, 04 Apr 2006 15:22:20 BST."
    <7.0.1.0.0.20060404151633.03310a70@cam.ac.uk>
Message-ID: <200604041439.k34EdpQE024312@apex.ibpc.fr>

Thanks for the wealth of comments!

Konrad Hinsen said:
>> What exactly is your goal? Do you want to write software, find
>> software ready to use, glue together different software, define an
>> interchange standard, or yet something else?

The initial goal is to glue together different software (some widely used
packages as well as in-house code) in a modular way by defining and
implementing an interchange standard. This should happen at a fairly low
(computationally efficient) level, so the intended implementation is likely
to be in C++ (for several reasons that I won't go into right now).

In a second phase we might want to add new modules/software. Some of this
might be "lighter" code and the OO framework should be accessible from a
language of choice (eg python, java, ..). This was one of the aspects that
made UML appear very appealing to me: use one model and then adapt it for
the different implementations.

>> I am not sure that formal approaches such as UML, which were developed
>> for big software projects, are well adapted to scientific computing,
>> which is characterized by fast change and small development groups.

I was hoping that there could be a common "subset" of primitives (as
mentioned by Peter, see below) where no fast change happens as the physics
won't change. But you are right, there is also the application specific part
of the model which has to be included. I guess in my particular case this
still makes sense as I am aiming at a general framework rather than a
specific implementation.

Peter Murray-Rust:
>> it is important to have running code. Designs without code are of
>> limited value. They are usually overcomplex.

I agree.
But I'm coming from a slightly different direction: trying to set up new
code, I hope to find out whether I can build on something existing rather
than add another incarnation of a "Yet Another Personal View on How Molecules
Can be Represented on a Computer". I am sort of looking for an RFC or
"Current Best Practice" but maybe the field is just not there yet.

It would also be very encouraging if there were approaches that allow one to
gradually adapt the complexity of the underlying model. What I mean is that
at the start I'll only need to manipulate basic things like coordinates and
forces (ignoring eg atom types and such details as this is already handled by
the applications that I try to glue together) but at some point there will
certainly be a need to extend this. It would be ideal to only have to adopt
the relevant subset of objects/classes/primitives and be able to add
complexity later on. I am not sure if this is straightforward with eg CML
(which intrinsically seems a very appealing approach to me so far).

>> in physical science we can probably create context-independent
>> primitives. Thus , , , etc. can have
>> semantics which is independent of the application.

Right, and my feeling is that having a UML model for these primitives could
provide some generic standards for any OO implementations.

>> Even "atomic velocities" require units - we cannot force the community
>> to adopt SI everywhere.

Hm, why not -- officially the community is supposed to already have accepted
these units... If one decided to store SI units as default, one could always
write a "GetVelocityInAtomicUnits" public method to access the actual
velocity in the desired units. Or am I missing an important point here?

>> So as part of the solid state work we are extending CML to address
>> primitive structures that may be useful in - say - molecular
>> dynamics. The atoms, their positions, delta positions, velocities,
>> accelerations, constraints, etc.
>> all have to be carefully represented. After that we can then construct
>> something - probably not in CML - which describes a trajectory

That sounds very promising. I wish you a successful meeting!

Marc

Marc Baaden

From konrad.hinsen at laposte.net Tue Apr 4 17:01:03 2006
From: konrad.hinsen at laposte.net (Konrad Hinsen)
Date: Tue Apr 4 16:59:27 2006
Subject: [Molecularmechanics] Recommendations for object oriented approaches / UML models
In-Reply-To: <200604041439.k34EdpQE024312@apex.ibpc.fr>
References: <200604041439.k34EdpQE024312@apex.ibpc.fr>
Message-ID: <7741AB57-7005-4990-B35D-45343C25A4F6@laposte.net>

On Apr 4, 2006, at 16:39, Marc Baaden wrote:

> The initial goal is to glue together different software (some widely
> used packages as well as in-house code) in a modular way by defining
> and implementing an interchange standard. This should happen at a

That sounds like a good project, but already a major one, considering that
it implies extending someone else's code.

> fairly low (computationally efficient) level, so the intended
> implementation is likely to be in C++ (for several reasons that I

In other words, you want to use a common in-memory representation? As soon
as you exchange data in files, the efficiency of the programming language
becomes secondary.

> java, ..). This was one of the aspects that made UML appear very
> appealing to me: use one model and then adapt it for the different
> implementations.

In my experience, language-independence of OO interfaces is not always a
useful goal. OO languages vary widely in type checking, inheritance rules,
attribute access control etc. A class hierarchy that is reasonable in one
language is often clumsy to absurd in another one.

One example is XML handling in Python. Initially, people implemented DOM (the
Document Object Model, originally designed for Java) in Python. But every
Python programmer who got interested in XML and looked at DOM quickly found
it "unpythonic".
As a consequence, more Python-oriented interfaces were developed and are now
rapidly gaining popularity (see for example ElementTree,
http://effbot.org/zone/element-index.htm, which will be included in the
Python standard library starting with Python 2.5).

> I was hoping that there could be a common "subset" of primitives (like
> mentioned by Peter, see below) where no fast change happens as the
> physics won't change. But you are right, there is also the application

The physics won't change, but simulations deal with models, not with
physics. Even within a rather narrow field such as molecular mechanics,
models evolve. In a force field like CHARMM or AMBER, an atom has two
parameters: an atom type for bonded and LJ interactions, and a charge for
electrostatics. But in force fields that are now under development, we see
polarizabilities and electronegativities in an attempt to improve the
description of electrostatics beyond the old constant-point-charge model. Any
object model that provides a "charge" attribute for an atom is therefore
already too restrictive.

> Personal View on How Molecules Can be Represented on a Computer". I am
> sort of looking for an RFC or "Current Best Practice" but maybe the
> field is just not there yet.

It isn't, unfortunately. But you are welcome to join those of us who want to
get there!

> It would also be very encouraging if there were approaches that allow
> one to gradually adapt the complexity of the underlying model. What I mean

That is not particularly difficult in an OO approach. In fact, this is one
of the strong points of OO.

> am not sure if this is straightforward with eg CML (which
> intrinsically seems a very appealing approach to me so far).

It is my understanding that nearly any information item in CML is optional,
so this should be straightforward.

> Hm, why not -- officially the community is supposed to already have
> accepted these units...
> If one decided to store SI units as default,

SI units are not very practical at the atomic scale, as you end up with huge
powers of ten everywhere (don't forget human readability!). But I agree that
one could well define a single unit (ideally SI-derived) for each quantity as
part of the specification for a file format. That is easier to handle for
everyone than a flexible system in which any unit is permissible and every
code needs to implement extensive unit conversion for reading files.

My personal favourite unit system is what I call "atomic-scale SI": nm, ps,
g/mol (=amu), and their combinations (e.g. kJ/mol for energy). All these
units are derived from SI units, and they can be used in SI-conforming
formulas without the introduction of conversion factors.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Laboratoire Léon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: konrad.hinsen@cea.fr
---------------------------------------------------------------------
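
The claim that nm, ps and g/mol combine to kJ/mol without conversion factors
can be checked numerically. In these units the kinetic energy is
1 g/mol * (1 nm/ps)^2 = 1e-3 kg/mol * (1e3 m/s)^2 = 1e3 J/mol = 1 kJ/mol.
The Python sketch below compares the direct formula with a detour through
base SI units; the function names and the sample mass and speed are
hypothetical, chosen only for this illustration.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mol


def kinetic_energy(mass_g_per_mol, speed_nm_per_ps):
    """Kinetic energy in kJ/mol, computed directly in nm, ps, g/mol."""
    return 0.5 * mass_g_per_mol * speed_nm_per_ps ** 2


def kinetic_energy_via_si(mass_g_per_mol, speed_nm_per_ps):
    """The same quantity, converting to kg, m/s and J along the way."""
    mass_kg = mass_g_per_mol * 1e-3 / AVOGADRO          # kg per particle
    speed_m_per_s = speed_nm_per_ps * 1e-9 / 1e-12      # m/s
    energy_joule = 0.5 * mass_kg * speed_m_per_s ** 2   # J per particle
    return energy_joule * AVOGADRO * 1e-3               # kJ/mol


# Made-up example: mass 15.999 g/mol moving at 0.5 nm/ps.
direct = kinetic_energy(15.999, 0.5)
via_si = kinetic_energy_via_si(15.999, 0.5)
print(direct, via_si, math.isclose(direct, via_si, rel_tol=1e-12))
```

The direct version needs no constants at all, while the SI detour carries
factors of 1e-3, 1e-9, 1e-12 and Avogadro's number, which is the practical
point of choosing such a unit set for a file format.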