The STAR format
The Self-Defining Text Archival and Retrieval (STAR) format has
become a standard in structural biology. Several
scientific databases (e.g. PDB, CCDC, ICDD, BioMagResBank) use the
STAR format to store structural, crystallographic diffraction and NMR data.
A growing number of programs (e.g. CNS, NMRView, MODELFREE)
can utilize the STAR format for their respective data output.
References
STAR specification publications:
- The STAR File: A New Format for Electronic Data Transfer and Archiving
S. R. Hall
J. Chem. Inf. Comput. Sci. 31, 326-333 (1991)
- The STAR File: Detailed Specifications
S. R. Hall and N. Spadaccini
J. Chem. Inf. Comput. Sci. 34, 505-508 (1994)
- STAR Dictionary Definition Language: Initial Specification
S. R. Hall and A. P. F. Cook
J. Chem. Inf. Comput. Sci. 35, 819-825 (1995)
CIF and mmCIF specification publications:
- The Crystallographic Information File (CIF): a New Standard Archive File for Crystallography
S. R. Hall, F. H. Allen and I. D. Brown
Acta Cryst. A47, 655-685 (1991)
- Macromolecular Crystallographic Information File
P. E. Bourne, H. M. Berman, B. McMahon, K. D. Watenpaugh, J. D. Westbrook and P. M. D. Fitzgerald
Methods in Enzymology 277, 571-590 (1997)
XML
The eXtensible Markup Language (XML) is a standard for
semantic markup of data independent of a particular application
domain. Because of this independence, many parties develop parsers,
and that software is tested across many different problem
domains. In addition, parser implementations exist in a wealth of
programming languages (including scripting languages), which gives
more freedom to the scientist wishing to analyze certain data.
Here's a list of the advantages of XML compared to STAR:
- Standard XML parsers:
As XML is used in a broad spectrum of application
domains, many different parser implementations are available.
This enables the programmer to choose a well-tested parser for
a given problem.
- Standard XML viewers/editors:
With the advent of the next generation of Web
browsers, XML will be supported as a standard format for
data exchange over the web. Hierarchical information contained
in a web file can be displayed and edited in general-purpose
XML viewers (such as Microsoft's Internet Explorer 5) or
editors (such as IBM's Xeena).
- XML query languages:
Currently, various query languages to extract
information from XML sources have been proposed. These query
languages enable the users to formulate ad-hoc queries in a
structured way.
Two proposed standards are XQL and XML-QL.
For both proposals, working prototype implementations
are available.
- Validity of documents:
XML documents can be validated against a Document Type
Definition (DTD). In this way, the integrity of the data can be
checked as the document is generated.
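As a rough illustration of the first and third points above, the following sketch parses a small XML document with the parser from Python's standard library and runs an ad-hoc query against it. It is only a sketch under stated assumptions: the element and attribute names (structure, atom, shift, name, residue) and the helper shifts_for are invented for this example, and the query uses the limited XPath subset built into xml.etree.ElementTree rather than XQL or XML-QL.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are invented
# for this sketch and do not follow any real schema.
DOC = """\
<structure id="example">
  <atom name="N" residue="1"><shift>120.5</shift></atom>
  <atom name="CA" residue="1"><shift>55.2</shift></atom>
  <atom name="N" residue="2"><shift>118.3</shift></atom>
</structure>
"""


def shifts_for(root, atom_name):
    """Return (residue, shift) pairs for all atoms with the given name."""
    # ElementTree understands a small XPath subset, including
    # attribute predicates of the form [@attr='value'].
    query = f"./atom[@name='{atom_name}']"
    return [
        (int(atom.get("residue")), float(atom.findtext("shift")))
        for atom in root.findall(query)
    ]


root = ET.fromstring(DOC)
nitrogen_shifts = shifts_for(root, "N")
print(nitrogen_shifts)  # [(1, 120.5), (2, 118.3)]
```

The same document could be handed to any other conforming XML parser without changes, which is the practical benefit of a domain-independent format.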
Further information
Here's a collection of links relevant to XML:
- W3C XML site is where you'll
find all the standards and official activities.
- XML.com is a commercial portal
containing good articles and news about XML.
- Cafe con Leche is a
weblog-style portal for updates on unofficial (read: where
the work is done) XML activities.
- The Python XML topic guide
is the starting point for XML processing in Python.