Introducing obsinfo
Philosophy and comparison to other systems
obsinfo is a syntax and system for creating standard seismological metadata
files (currently StationXML) as well as processing flows specific to ocean
bottom seismometer (OBS) data. Its basic philosophy is:
Break down every component of the system into “atomic”, non-repetitive units.
Follow StationXML structure where possible, but add entities missing from StationXML where necessary and use appropriate units for each component (for example, specifying the offset for a digital filter, not the delay, which depends on the sampling rate).
Allow full specification of a deployment using text files, for repeatability and provenance.
File formats
Compared to StationXML files
Minimizes repeated information. For example, in StationXML each channel could have the same datalogger, but all of the datalogger specifications are repeated for each channel. Within a channel’s response itself, several of the stages may be identical (except for the offset).
Eliminates fields that can be calculated from other fields, such as:
The <InstrumentSensitivity> field, which depends on the Stages that follow
The <Delay> for a digital filter stage, which can be calculated from <Offset> * <Factor> / <InputSampleRate>
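To illustrate why a field such as <InstrumentSensitivity> is redundant, the overall sensitivity can be sketched as the product of the individual stage gains. This is a minimal Python sketch, assuming every gain is given at the same reference frequency; the three gain values are hypothetical and the function name is not part of obsinfo.

```python
from math import prod


def instrument_sensitivity(stage_gains):
    """Overall sensitivity as the product of the individual stage gains.

    Assumes all gains are specified at the same reference frequency.
    """
    return prod(stage_gains)


# Hypothetical three-stage chain: sensor, preamplifier, digitizer.
gains = [750.0, 2.0, 400000.0]
sensitivity = instrument_sensitivity(gains)
print(sensitivity)
```

Because the value follows from the stages, storing it separately only creates a second place where it can go out of date.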
Compared to RESP files
RESP files (mostly used in the Nominal Response Library) are just text representations of the Dataless SEED files that preceded the StationXML standard, so they share the repetitive nature of StationXML files and add the complexity of a non-standard text format.
Compared to AROL
The Atomic Response Objects Library (AROL) replaces the RESP-based Nominal
Response Library in the new YASMINE system.
Its files use the same atomic concept and YAML structure as obsinfo; in fact,
the AROL format was based on a previous version of obsinfo and we try to keep
the two compatible.
AROL lacks the subnetwork, station and instrumentation levels, as these are
assembled by YASMINE.
Metadata creation systems
Compared to PDCC
PDCC is a graphical user interface allowing one to assemble different components (sensors, dataloggers, amplifiers) and then add in deployment information. Components can be added from the Nominal Response Library (NRL), which combines RESP files with textual configuration files which allow the user to select the exact component and configuration they used. obsinfo uses a fully textual description of instruments and deployments rather than a graphical user interface.
Compared to IRIS DMC IRISWS
I don’t know much about this. It looks like a web service for obtaining component responses, but I’m not sure how you’re supposed to assemble them. It might just be a more modern way to access the NRL components, intended for use by newer systems.
Compared to YASMINE
YASMINE is a new StationXML metadata creation tool.
Its major difference from PDCC is its use of atomic response files,
which should be compatible with obsinfo files.
It provides a graphical user interface (YASMINE-EDITOR) and a command-line
interface (YASMINE-CLI).
The major differences from obsinfo are the lack of instrumentation, station
and subnetwork levels, as well as of processing information such as
instrument clock drift.
File formats
All information files can be written in YAML or JSON format.
Use whichever you prefer.
YAML is generally easier for humans to write and read, whereas JSON is easier
for computers.
The tutorial includes a section describing YAML files as used in obsinfo
(tutorial:tutorial-1).
There are many sites for converting from one format to the other and for
validating either format, including this json-to-yaml-convertor and this
yaml-validator.
The Tutorial
This training course is meant to accompany an instructor. The tutorial provides a more detailed step-by-step explanation and we refer to sections of the Tutorial throughout this training course.
Structural units
A full obsinfo subnetwork description consists of the following entities
(starred fields are optional):
format_version: {}
*revision: {}
*notes: []
subnetwork:
operators: []
*restricted_status: <string>
*comments: []
*extras: {}
campaign_ref_name: <string>
network: {}
stations:
<STATIONNAME1>:
site: string
start_date: string
end_date: string
location_code: <string>
*serial_number: <string>
*operators: []
instrumentation:
equipment: {}
channels:
default:
*orientation: <string or {}>
datalogger:
<< GENERIC_COMPONENT
sample_rate: <number>
*correction: <number>
*preamplifier:
<< GENERIC_COMPONENT
sensor:
<< GENERIC_COMPONENT
seed_codes:
*preamplifier_configuration: <string>
*sensor_configuration: <string>
*datalogger_configuration: <string>
*location_code: <string> # otherwise inherits from station
*comments: []
*extras: {}
<SPECIFIC-CHANNEL1>: {}
<SPECIFIC-CHANNEL2>: {}
...
*channel_modifications: {}
locations: {}
*notes: []
*comments: []
*extras: {}
*processing:
- *clock_correction_linear: {}
- *clock_correction_leapsecond: {}
<STATIONNAME2>:
...
Where GENERIC_COMPONENT is:
equipment: {}
*configuration_default: <string>
*configuration_definitions: {}
*stage_modifications: {}
*notes: []
*stages:
- stage:
input_units: <string>
output_units: <string>
gain: <number>
*name: <string>
*description: <string>
*decimation_factor: <integer>
*delay: <number>
*calibration_date: <string>
*polarity: '+' or '-' # default is '+'
*input_sample_rate: <number>
*filter:
type: <string>
<fields depending on type>
- stage:
- ...
And FILTER is:
type: <string> # one of "PolesZeros", "FIR", "Coefficients",
# "ResponseList", "Polynomial", "ADConversion",
# "Analog", "Digital"
*description: <string>
*delay.samples: <number> # for all except "Analog" and "PolesZeros"
*delay.seconds: <number> # for "Analog" and "PolesZeros"
# other parameters specific to the specified type
This could all be in one file, in which case there would be little benefit over
StationXML. The power of obsinfo comes from the ability to put any
sub-entity into a separate file, which is called from the parent file
using the $ref field.
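The $ref mechanism described above might be sketched as follows. This is a minimal Python illustration, not obsinfo's actual implementation: the `resolve_refs` function and the in-memory `FILES` dictionary (standing in for files on disk) are hypothetical, and the part after `#` is assumed to name a single top-level field.

```python
# Hypothetical in-memory "files", standing in for YAML/JSON files on disk.
FILES = {
    "sensors/xxx.sensor.yaml": {
        "format_version": {},
        "sensor": {"equipment": {}, "seed_codes": {}},
    },
    "instrumentation/xxx.instrumentation.yaml": {
        "format_version": {},
        "instrumentation": {
            "equipment": {},
            "channels": {
                "default": {
                    "sensor": {"$ref": "sensors/xxx.sensor.yaml#sensor"},
                },
            },
        },
    },
}


def resolve_refs(node, files):
    """Return a copy of node with each {"$ref": "path#field"} replaced by its target."""
    if isinstance(node, dict):
        if set(node) == {"$ref"}:
            path, _, field = node["$ref"].partition("#")
            target = files[path]
            if field:
                # Keep only the named top-level field of the referenced file
                target = target[field]
            # The target may itself contain further references
            return resolve_refs(target, files)
        return {key: resolve_refs(value, files) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, files) for item in node]
    return node


resolved = resolve_refs(FILES["instrumentation/xxx.instrumentation.yaml"], FILES)
print(resolved["instrumentation"]["channels"]["default"]["sensor"])
```

The same sensor file can then be referenced from any number of instrumentation files without its contents being repeated.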
Standard file levels are: subnetwork, instrumentation,
{datalogger, preamplifier, sensor}, stage and filter.
The schema files are defined at these same levels, allowing the command-line
tool obsinfo-validate to validate any file ending with
{one of the above}.{yaml,json}.
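The filename convention above could be checked along these lines. This is only a sketch of the naming rule as stated here, assuming the level is the second-to-last dot-separated part of the filename; `file_level` is a hypothetical helper, not a function provided by obsinfo.

```python
from pathlib import PurePosixPath

# Levels and formats as listed in the text above
LEVELS = ("subnetwork", "instrumentation", "datalogger", "preamplifier",
          "sensor", "stage", "filter")
FORMATS = ("yaml", "json")


def file_level(filename):
    """Return the level named in a '<...>.<level>.<yaml|json>' filename, else None."""
    parts = PurePosixPath(filename).name.split(".")
    if len(parts) >= 2 and parts[-1] in FORMATS and parts[-2] in LEVELS:
        return parts[-2]
    return None


print(file_level("sensors/xxx.sensor.yaml"))
print(file_level("notes.txt"))
```

A validator can use the returned level to decide which schema file to apply.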
Other elements often put into separate files are author, location_base,
network_info and operator.
A common file structure is then (this time showing only the required fields):
a subnetwork file:
    format_version: {}
    subnetwork:
        operators: []
        network: {$ref: networks/xxx.network.yaml#network}
        stations:
            <STATIONNAME1>:
                site: string
                start_date: string
                end_date: string
                location_code: string
                instrumentation:
                    base: {$ref: instrumentation/xxx.instrumentation.yaml#instrumentation}
                locations: {}
            <STATIONNAME2>:
                ...
            <STATIONNAME3>:
                ...
            ...
instrumentation_base files:
    format_version: {}
    instrumentation_base:
        equipment: {}
        channels:
            default:
                datalogger: {$ref: dataloggers/xxx.datalogger.yaml#datalogger}
                sensor: {$ref: sensors/xxx.sensor.yaml#sensor}
            <SPECIFIC-CHANNEL1>: {}
            <SPECIFIC-CHANNEL2>: {}
            ...
datalogger_base files:
    format_version: {}
    datalogger_base:
        << GENERIC_COMPONENT
        sample_rate: float
sensor_base files:
    format_version: {}
    sensor_base:
        << GENERIC_COMPONENT
        seed_codes:
stage_base files: (see examples in classes/stage)
filter files: There are 4 filter types corresponding directly to their StationXML analogues: PolesZeros, FIR, Coefficients and ResponseList (the Polynomial filter type has not yet been implemented). 3 other filter types allow simpler information entry:
Analog: An analog stage with no filtering (translated to a StationXML PolesZeros stage without any poles or zeros)
Digital: A digital stage with no filtering (translated to a StationXML Coefficients stage without any coefficients)
ADConversion: like an analog stage, plus information about input voltage and output counts limits
(see examples in classes/filter)
You don’t actually need to put the information in each file under a field with the filetype name: in fact if you didn’t you would save a little typing, as you could specify, for example,
{$ref: xxx.datalogger_base.yaml}
instead of:
{$ref: xxx.datalogger_base.yaml#datalogger_base}
But the second style is preferred as it allows the files to contain useful
provenance and version information at the base level.
To encourage you to use the second style, obsinfo-validate only accepts this
style.
Configurations, channel modifications and shortcuts
Components can have pre-defined configurations, and their internal values can be modified from higher levels.
The simplest and most common example is specifying each station’s sampling rate, which is done as follows:
modifications:
    datalogger: {configuration: "125sps"}
Configurations
Configurations modify parameters in a given component according
to an existing configuration_definition in the component’s information file.
Allowed fields are:
datalogger_configuration
sensor_configuration
preamplifier_configuration
Configurations can be specified at the following levels, in order of priority:
station:channel_modifications
instrumentation:channels:{CHNAME}
instrumentation:channels:default
Configurations are defined in the component information files under the
configuration_definitions field.
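The priority order listed above could be sketched as a simple lookup. This minimal Python sketch assumes the list runs from highest to lowest priority and flattens each level into a plain dictionary of shortcut fields; `select_configuration` and the sample values are hypothetical and do not reproduce obsinfo's own code.

```python
def select_configuration(field, station_mods, channel, default_channel):
    """Return the first value found for field, searching in priority order.

    Order searched: station:channel_modifications, then
    instrumentation:channels:{CHNAME}, then instrumentation:channels:default.
    """
    for level in (station_mods, channel, default_channel):
        if field in level:
            return level[field]
    return None


# Hypothetical values: the station-level entry overrides the default channel
default_channel = {"datalogger_configuration": "250sps"}
channel = {}
station_mods = {"datalogger_configuration": "125sps"}

chosen = select_configuration(
    "datalogger_configuration", station_mods, channel, default_channel
)
print(chosen)  # 125sps
```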
Channel Modifications
channel_modifications directly modify one or more parameters in a given element.
This gives complete control to the user but assumes knowledge of the obsinfo
hierarchy.
Details of channel_modifications are provided in the Advanced Topics
section advanced/chan_mods.
Shortcuts
datalogger_configuration, preamplifier_configuration and
sensor_configuration are actually shortcuts for common channel_modifications.
Shortcuts are hard-coded into obsinfo to allow simpler representation of
common configurations or modifications.
Other ones may be added, including XX_serial_number, where XX could be
datalogger, sensor, preamplifier or instrumentation.
Other sources
Channel modifications are described briefly in /tutorial/tutorial-3:channel modifications and in detail in Channel modifications
Component configurations are described in /tutorial/tutorial-4:configurations and /tutorial/tutorial-5:configuration definitions and /tutorial/tutorial-6:datalogger configuration definitions
Details
Referenced files are searched for starting at the paths given in the
~/.obsinforc file.
delay, offset, and correction
One area where obsinfo differs from StationXML is in its handling of delays in digital filters. StationXML (and RESP) have three parameters in each stage, relating to the time delay created by the stage, in each Stage’s Decimation section:
- offset:
Sample offset chosen for use. If the first sample is used, set this field to zero. If the second sample, set it to 1, and so forth.
- delay:
The estimated pure delay for the stage (in seconds). This value will almost always be positive to indicate a delayed signal.
- correction:
The time shift, if any, applied to correct for the delay at this stage. The sign convention used is opposite the <Delay> value; a positive sign here indicates that the trace was corrected to an earlier time to cancel the delay caused by the stage and indicated in the <Delay> element.
StationXML specifies the delay for each stage, leaving the offset equal to zero. A digital filter’s true delay is in samples, not seconds, meaning that the delay will depend on the sampling rate.
obsinfo’s atomic philosophy does not allow a variable delay (in
seconds) when there is a constant delay (in samples).
obsinfo puts delay in the stage level but offset in the filter level.
For digital filters, offset should be filled with the delay in samples and
delay should not be provided.
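The point above, that a fixed delay in samples gives a different delay in seconds at each sampling rate, can be shown with a short calculation. This is a minimal Python sketch assuming the sample count is expressed at the sampling rate passed in; the offset of 25 samples and the three rates are hypothetical values, and `delay_seconds` is not an obsinfo function.

```python
def delay_seconds(delay_samples, sample_rate):
    """Convert a delay counted in samples to seconds at the given rate (in Hz)."""
    return delay_samples / sample_rate


offset = 25  # hypothetical filter offset, in samples
delays = {rate: delay_seconds(offset, rate) for rate in (125.0, 250.0, 500.0)}
for rate, delay in delays.items():
    print(f"{rate} sps: {delay} s")
```

Storing only the offset keeps a single filter description valid for every sampling rate at which the filter is used.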
Command-line files
All of the command-line files start with obsinfo-, so if you have a decent shell you should be able to see them by typing obsinfo<TAB>
obsinfo-makeStationXML: makes StationXML files from an obsinfo subnetwork file and its dependencies
obsinfo-validate: validates subnetwork, instrumentation, datalogger, sensor, preamplifier, stage and filter files
obsinfo-print
obsinfo-print_version
obsinfo-setup: creates the .obsinforc file and can also create an example database
obsinfo-test: runs a series of validation tests
The different obsinfo-makescripts-* command-line scripts are used for making IPGP-specific data processing flows, as described below. They could be used as a basis for creating your own data processing flows.
The directory obsinfo/obsinfo/addons/ contains programs to create
processing scripts using the information in the subnetwork files.
This is addressed in more detail in the training_course/4_advanced module.
Comments, notes and extras
Comments and notes are both lists of text.
comments can be entered at the subnetwork, station and channel levels and will be transformed into StationXML comments at the same level.
notes will not go into the StationXML file; they are for your information only. They can be entered at the base, station and component levels.
extras is a free object-based field. It can be used to add fields that may be useful in a future version of obsinfo. Nothing there is put into the StationXML code unless the obsinfo software is specifically updated to do so (which allows new fields without breaking compatibility or schema rules). It can be entered at the subnetwork, station or channel level.