[OpenMadrigal-developers] Creating/editing Cedar files with python

William Rideout brideout at haystack.mit.edu
Mon Apr 4 14:17:34 EDT 2005


As the final feature of Madrigal 2.4, I plan to add a class that will allow easy
creation and editing of Cedar files from python.  This python class is meant to
make working with Cedar files easier by hiding the following details of the
Cedar format from the user:

1. The underlying use of 16 bit integers as a storage format, along with the
underlying use of "additional increment" parameters.  Users will store and
retrieve all data as doubles, and the python API will be responsible for
converting to 16 bit integers, and using any "additional increment" parameters
if they exist.  Trying to input a value that overflows the limited dynamic range
set by the Cedar format will raise an exception.

2. All parameters may be referred to through either Cedar mnemonics or Cedar codes.

3. Ordering of parameters (and other details) within a Cedar file will be hidden
from users.

4. Warnings will be raised if a user tries to include any time parameters in a
Cedar record that directly conflict with prolog time parameters.  While strictly
legal according to the Cedar standard, including time information in two
independent fields leads to an "Alice in Wonderland" data format of "data means
just what I intend it to mean", and should be avoided.
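As a rough illustration of point 1, the sketch below shows how a double might
be packed into a 16 bit integer plus an optional "additional increment"
integer, with an exception on overflow.  The function names and the scale
factors are hypothetical and chosen only for illustration; the real scale
factors come from the Cedar parameter definitions, and the real format also
reserves some integer values for special flags, which this sketch does not model.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def encode_value(value, scale, incr_scale=None):
    """Pack a double into one or two 16 bit integers.

    scale and incr_scale are hypothetical scale factors for the main
    parameter and for its "additional increment" parameter.
    Returns (main, incr); incr is None when there is no increment parameter.
    """
    main = int(round(value / scale))
    if not INT16_MIN <= main <= INT16_MAX:
        raise ValueError("%r overflows 16 bit storage at scale %r"
                         % (value, scale))
    if incr_scale is None:
        return main, None
    # The increment holds what is left over after the main value is rounded.
    incr = int(round((value - main * scale) / incr_scale))
    if not INT16_MIN <= incr <= INT16_MAX:
        raise ValueError("increment for %r overflows 16 bit storage" % value)
    return main, incr


def decode_value(main, scale, incr=None, incr_scale=None):
    """Rebuild the double from the stored integers."""
    value = main * scale
    if incr is not None:
        value += incr * incr_scale
    return value
```

The user only ever sees the doubles going in and coming out; the integers and
the split between the two parameters stay inside the API.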

Here's my suggested API:

***********************************

MadrigalCedarFile(fullFilename, createFlag=False)

The class initializer takes a fullFilename as argument.  This is either the
existing Cedar file (in any allowed Cedar format), or a file to be created.  The
second argument, createFlag, tells whether this is a file to be created.  If
False and fullFilename cannot be read, an error is raised.  If True and
fullFilename already exists, or fullFilename cannot be created, an error is raised.
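The createFlag rules above could be checked along the following lines.  This
is only a sketch: check_cedar_path is a hypothetical helper name, and the use
of IOError is an assumption, since the proposal only says "an error is raised".

```python
import os


def check_cedar_path(fullFilename, createFlag=False):
    """Raise IOError if fullFilename does not fit the createFlag setting."""
    if createFlag:
        # Creating: the file must not exist yet, and its directory
        # must be writable.
        if os.path.exists(fullFilename):
            raise IOError("%s already exists" % fullFilename)
        dirname = os.path.dirname(os.path.abspath(fullFilename))
        if not os.access(dirname, os.W_OK):
            raise IOError("cannot create %s" % fullFilename)
    else:
        # Opening: the file must exist and be readable.
        if not os.access(fullFilename, os.R_OK):
            raise IOError("cannot read %s" % fullFilename)
```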

If createFlag == False, then the initializer reads the entire Cedar file into
memory, and creates a list of MadrigalCatalogRecords, MadrigalHeaderRecords, and
MadrigalDataRecords (described below). The MadrigalCedarFile will be derived
from the python list class, so its public methods are exactly those of a python
list.  The only limitation will be the natural one - any object added to the
list must be either a MadrigalCatalogRecord, a MadrigalHeaderRecord, or a
MadrigalDataRecord.
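A minimal sketch of a list subclass with that restriction is shown below.
The three record classes are empty placeholders, and RecordList is a
hypothetical name standing in for MadrigalCedarFile; the choice of TypeError
is also an assumption.

```python
class CatalogRecord(object):
    """Placeholder for MadrigalCatalogRecord."""


class HeaderRecord(object):
    """Placeholder for MadrigalHeaderRecord."""


class DataRecord(object):
    """Placeholder for MadrigalDataRecord."""


ALLOWED = (CatalogRecord, HeaderRecord, DataRecord)


class RecordList(list):
    """A python list that only accepts the three record types."""

    def __init__(self, items=()):
        list.__init__(self)
        self.extend(items)

    def _check(self, item):
        if not isinstance(item, ALLOWED):
            raise TypeError("not a Cedar record: %r" % (item,))
        return item

    def append(self, item):
        list.append(self, self._check(item))

    def insert(self, index, item):
        list.insert(self, index, self._check(item))

    def extend(self, items):
        # Check everything first so a bad item leaves the list unchanged.
        list.extend(self, [self._check(i) for i in items])

    def __iadd__(self, items):
        self.extend(items)
        return self

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            value = [self._check(i) for i in value]
        else:
            self._check(value)
        list.__setitem__(self, index, value)
```

Everything else (indexing, slicing, len, iteration, del) is inherited from
list unchanged.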

MadrigalCedarFile will have one additional public method:

	write(format="Madrigal", newFilename=None)

which will persist the object to file in a Cedar format.  The default format is
"Madrigal", but "BlockedBinary", "UnblockedBinary", "Cbf", and "Ascii" will
also be allowed.  The default newFilename is None, which means write to the
file originally opened or created; if newFilename is given, write to that file
instead.

***********************************

MadrigalCatalogRecord(kinst, modexp, startTimestamp, endTimestamp)

The MadrigalCatalogRecord initializer takes the following arguments:

	kinst - the kind of instrument code
	modexp - the mode of the experiment identifier
	startTimestamp - start of experiment in seconds since 1/1/1970 (int or double)
	endTimestamp - end of experiment in seconds since 1/1/1970 (int or double)

A warning will be raised if kinst is not an instrument listed in instTab.txt.
An exception will be raised if startTimestamp > endTimestamp.
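The timestamp and kinst checks might look something like the sketch below.
The names to_timestamp, check_record_args and knownKinsts are hypothetical;
knownKinsts stands in for the set of instrument codes read from instTab.txt,
and treating the input date as UT is an assumption.

```python
import calendar
import warnings


def to_timestamp(year, month, day, hour=0, minute=0, second=0):
    """Seconds since 1/1/1970, treating the inputs as UT (an assumption)."""
    return calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))


def check_record_args(kinst, startTimestamp, endTimestamp, knownKinsts):
    """Apply the two checks described above.

    knownKinsts is a hypothetical set of codes parsed from instTab.txt.
    """
    if startTimestamp > endTimestamp:
        raise ValueError("startTimestamp is later than endTimestamp")
    if kinst not in knownKinsts:
        # A warning only: the record can still be created.
        warnings.warn("kinst %i is not listed in the instrument table" % kinst)
```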

MadrigalCatalogRecord has the following public methods/attributes:

getKinst()
getModexp()
getStartTimestamp()
getEndTimestamp()
lines

lines is a list of lines of ascii text, each 80 characters or less.  This list
may be changed to modify or create a MadrigalCatalogRecord.  The only
limitation is that each new line added must be ascii text of 80 characters or
less.
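The per-line restriction could be enforced with a check like the one below.
check_line is a hypothetical helper, and raising ValueError is an assumption.

```python
def check_line(line, maxLength=80):
    """Return line unchanged if it is ascii text of maxLength or less."""
    try:
        line.encode("ascii")
    except UnicodeEncodeError:
        raise ValueError("line is not ascii text: %r" % line)
    if len(line) > maxLength:
        raise ValueError("line is longer than %i characters" % maxLength)
    return line
```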

***********************************

MadrigalHeaderRecord(kinst, kindat, startTimestamp, endTimestamp, jpar, mpar)

The MadrigalHeaderRecord initializer takes the following arguments:

	kinst - the kind of instrument code
	kindat - the kind of data code
	startTimestamp - start of experiment in seconds since 1/1/1970 (int or double)
	endTimestamp - end of experiment in seconds since 1/1/1970 (int or double)
	jpar - number of 1D parameters in following records
	mpar - number of 2D parameters in following records

A warning will be raised if kinst is not an instrument listed in instTab.txt.
An exception will be raised if startTimestamp > endTimestamp.

MadrigalHeaderRecord has the following public methods/attributes:

getKinst()
getKindat()
getStartTimestamp()
getEndTimestamp()
getJpar()
getMpar()
lines

lines is a list of lines of ascii text, each 80 characters or less.  This list
may be changed to modify or create a MadrigalHeaderRecord.  The only
limitation is that each new line added must be ascii text of 80 characters or
less.

***********************************

MadrigalDataRecord(kinst, kindat, startTimestamp, endTimestamp, oneDList,
twoDList, nrow)


The MadrigalDataRecord initializer takes the following arguments:

	kinst - the kind of instrument code
	kindat - the kind of data code
	startTimestamp - start of record in seconds since 1/1/1970 (int or double)
	endTimestamp - end of record in seconds since 1/1/1970 (int or double)
	oneDList - list of one-dimensional parameters in record (mnemonic or code)
	twoDList - list of two-dimensional parameters in record (mnemonic or code)
	nrow - number of rows of 2D data to create.  Until set, all values default to
missing.

MadrigalDataRecord has the following public attributes/methods:

getKinst()
getKindat()
getStartTimestamp()
getEndTimestamp()
getOneDParmCodes()
getOneDParmMnemonics()
getNrow()
set(parm, row, value) - parm is mnemonic or code, row starts at 0, value is
double. Value may also be 'missing'.  If parm is an error parameter, value may
also be 'assumed' or 'knownbad'.
get(parm, row) - parm is mnemonic or code, row starts at 0 - returns a double.
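
To make the set/get rules concrete, here is a small sketch of the 2D part of
a data record.  TwoDTable is a hypothetical name, not part of the proposed
API.  The sketch accepts integer codes only (no mnemonics), assumes that a
negative code marks an error parameter, and hands back the special strings
unchanged from get; how the real get reports special values is not specified
above.

```python
MISSING = "missing"
ERROR_ONLY_VALUES = ("assumed", "knownbad")


def is_error_parm(code):
    # Assumption for this sketch: error parameters have negative codes.
    return code < 0


class TwoDTable(object):
    """nrow rows of 2D values, all 'missing' until set."""

    def __init__(self, codes, nrow):
        self._nrow = nrow
        self._data = dict((code, [MISSING] * nrow) for code in codes)

    def _check(self, parm, row):
        if parm not in self._data:
            raise KeyError("parameter %r is not in this record" % (parm,))
        if not 0 <= row < self._nrow:
            raise IndexError("row %r is out of range" % (row,))

    def set(self, parm, row, value):
        self._check(parm, row)
        if isinstance(value, str):
            allowed = (MISSING,)
            if is_error_parm(parm):
                allowed += ERROR_ONLY_VALUES
            if value not in allowed:
                raise ValueError("%r not allowed for parameter %r"
                                 % (value, parm))
        else:
            value = float(value)
        self._data[parm][row] = value

    def get(self, parm, row):
        self._check(parm, row)
        return self._data[parm][row]
```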


***********************************

In fact this API is meant to abstract the Cedar Data Model away from the Cedar
database format.  The essence of the Cedar Data Model is simply that a file is
an ordered list of records, where each record has the required fields
(startTimestamp, endTimestamp, kinst, kindat, and lists of 1D and 2D
parameters), along with values for the 1D and 2D parameters.  It is possible a
future version of this API would allow writing and reading from a non-Cedar format.
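
To illustrate that separation, the sketch below holds the data model as plain
python objects and round-trips it through JSON, standing in for some
non-Cedar format.  The Record class, its field layout, and the choice of JSON
are all assumptions made for this example; none of it is part of the proposal.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Record:
    """One record of the data model, with no Cedar storage details."""
    kinst: int
    kindat: int
    startTimestamp: float
    endTimestamp: float
    oneD: dict = field(default_factory=dict)  # parameter -> value
    twoD: dict = field(default_factory=dict)  # parameter -> list of row values


def dump_records(records):
    """Serialize an ordered list of records to a JSON string."""
    return json.dumps([asdict(r) for r in records])


def load_records(text):
    """Rebuild the ordered list of records from a JSON string."""
    return [Record(**d) for d in json.loads(text)]
```

A file is then just a python list of Record objects, and the order of the
list is the order of the records.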

Bill

-- 
Bill Rideout
MIT Haystack Observatory
Email: brideout at haystack.mit.edu
Phone: 781 981-5624


