Table Definitions

The following SQLAlchemy ORM classes represent the MSNoise database schema. Column attributes are described in each class docstring.

Job

class msnoise.msnoise_table_def.Job(day=None, pair=None, flag=None, step_id=None, jobtype=None, priority=0, lastmod=None, **kwargs)

Enhanced Job Object with workflow support

Workflow-aware job record linking one day of a station pair to a WorkflowStep. Each job is tied to a specific workflow step and to that step's configuration set.

Parameters:
  • ref (int) – The Job ID in the database

  • day (str) – The day in YYYY-MM-DD format

  • pair (str) – The name of the pair (STATION1:STATION2)

  • flag (str) – Status of the Job: “T”odo, “I”n Progress, “D”one.

  • step_id (int) – Foreign key to WorkflowStep table

  • jobtype (str) – Job type string (= step_name, used as join key to WorkflowStep)

  • priority (int) – Job priority (higher number = higher priority)

property config_category

Get the config category for this job

property config_set_number

Get the config set number for this job

property lineage

The lineage string, resolved through the Lineage FK.

property step_name

Get the step name for this job
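As a minimal sketch of how the flag, jobtype and priority columns could be used together, the snippet below picks the to-do jobs of one step and orders them so that higher priority numbers come first. JobRecord and next_jobs are hypothetical stand-ins written for illustration; they are not part of MSNoise, and the sample values are invented.

```python
from dataclasses import dataclass


# Hypothetical stand-in mirroring the documented Job columns.
# This is NOT the MSNoise ORM class, only a plain-Python illustration.
@dataclass
class JobRecord:
    day: str       # "YYYY-MM-DD"
    pair: str      # "STATION1:STATION2"
    flag: str      # "T"odo, "I"n Progress, "D"one
    jobtype: str   # step name, used as join key to the workflow step
    priority: int = 0


def next_jobs(jobs, jobtype):
    """Return the to-do jobs of one type, highest priority first."""
    todo = [j for j in jobs if j.flag == "T" and j.jobtype == jobtype]
    # Higher number = higher priority; ties are broken by day.
    return sorted(todo, key=lambda j: (-j.priority, j.day))


jobs = [
    JobRecord("2024-01-01", "STA1:STA2", "T", "cc_1", priority=0),
    JobRecord("2024-01-02", "STA1:STA2", "T", "cc_1", priority=5),
    JobRecord("2024-01-03", "STA1:STA2", "D", "cc_1", priority=9),
]
queue = next_jobs(jobs, "cc_1")
print([j.day for j in queue])
```

The job already flagged "D" is skipped regardless of its priority, since only "T" jobs remain to be processed.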

Station

class msnoise.msnoise_table_def.Station(net=None, sta=None, X=None, Y=None, altitude=None, coordinates=None, used=True, data_source_id=None, **kwargs)

Station Object

Parameters:
  • ref (int) – The Station ID in the database

  • net (str) – The network code of the Station

  • sta (str) – The station code

  • X (float) – The X coordinate of the station

  • Y (float) – The Y coordinate of the station

  • altitude (float) – The altitude of the station

  • coordinates (str) – The coordinate system. “DEG” is WGS84 latitude/longitude in degrees. “UTM” is expressed in meters.

  • used (bool) – Whether this station must be used in the computations.

chans()

Get list of channel names

locs()

Get list of location codes
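The coordinates field determines how X and Y should be interpreted. The sketch below illustrates one possible consequence, an inter-station distance in meters. It rests on several assumptions: X is longitude and Y is latitude for “DEG”, both stations share the same system, and the haversine formula on a spherical Earth is an acceptable approximation. The function name is hypothetical and this is not MSNoise's own distance routine.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def interstation_distance_m(x1, y1, x2, y2, coordinates="DEG"):
    """Distance in meters between two stations (illustrative only)."""
    if coordinates == "UTM":
        # UTM coordinates are already in meters: plain Euclidean distance.
        return math.hypot(x2 - x1, y2 - y1)
    if coordinates == "DEG":
        # Assumption: X = longitude, Y = latitude, both in degrees.
        lon1, lat1, lon2, lat2 = map(math.radians, (x1, y1, x2, y2))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2)
             * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    raise ValueError(f"unknown coordinate system: {coordinates!r}")


d_utm = interstation_distance_m(0.0, 0.0, 3000.0, 4000.0, "UTM")
d_deg = interstation_distance_m(0.0, 0.0, 1.0, 0.0, "DEG")
print(d_utm, round(d_deg))
```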

Config

class msnoise.msnoise_table_def.Config(name=None, value=None, category='global', set_number=None, param_type='str', description=None, **kwargs)

Unified configuration parameter storage for both global and workflow-specific settings.

This class replaces the old separate Config and ConfigSets tables, providing a unified approach to configuration management.

Parameters:
  • ref (int) – The configuration parameter ID in the database

  • name (str) – The parameter name (e.g., ‘maxlag’, ‘dtt_minlag’)

  • category (str) – The parameter category (‘global’, ‘mwcs’, ‘mwcs_dtt’, ‘stretching’, etc.)

  • set_number (int) – Configuration set number (NULL for global parameters, an integer for workflow sets)

  • value (str) – The parameter value as string

  • param_type (str) – The parameter type (‘str’, ‘int’, ‘float’, ‘bool’)

  • default_value (str) – The default value for this parameter

  • description (str) – Description of the parameter

  • units (str) – Units of measurement for the parameter

  • possible_values (str) – Slash-separated list of possible values

classmethod get_config_dict(session, category, set_number=None)

Get configuration as a dictionary

classmethod get_global_config(session, name)

Get a global configuration parameter

classmethod get_workflow_config(session, category, set_number)

Get all parameters for a workflow configuration set

get_typed_value()

Convert string value to the appropriate type

is_global_config()

Check if this is a global configuration parameter

is_workflow_config()

Check if this is a workflow-specific configuration parameter

validate_value()

Validate the value against possible_values if defined
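The typed-value conversion and the possible_values check described above can be sketched in plain Python. ConfigParam is a hypothetical stand-in, not the MSNoise class. The set of strings treated as true for ‘bool’ and the use of the category to detect global parameters are assumptions, and the sample parameter values are invented.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for a Config row; not the MSNoise ORM class.
@dataclass
class ConfigParam:
    name: str
    value: str                        # always stored as a string
    category: str = "global"
    set_number: Optional[int] = None  # None (NULL) for global parameters
    param_type: str = "str"           # 'str', 'int', 'float' or 'bool'
    possible_values: Optional[str] = None  # slash-separated, e.g. "A/B/C"

    def is_global(self):
        # Assumption: a parameter is global when its category is 'global'.
        return self.category == "global"

    def typed_value(self):
        """Convert the stored string according to param_type."""
        if self.param_type == "int":
            return int(self.value)
        if self.param_type == "float":
            return float(self.value)
        if self.param_type == "bool":
            # Assumption: these spellings count as true; the real list may differ.
            return self.value.strip().lower() in ("y", "yes", "true", "1")
        return self.value

    def is_valid(self):
        """Check the value against possible_values when they are defined."""
        if not self.possible_values:
            return True
        return self.value in self.possible_values.split("/")


minlag = ConfigParam("dtt_minlag", "5.0", category="mwcs_dtt",
                     set_number=1, param_type="float")
choice = ConfigParam("example_choice", "B", possible_values="A/B/C")
print(minlag.typed_value(), minlag.is_global(), choice.is_valid())
```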

DataAvailability

class msnoise.msnoise_table_def.DataAvailability(net=None, sta=None, loc=None, chan=None, path=None, file=None, starttime=None, endtime=None, data_duration=None, gaps_duration=None, samplerate=None, flag=None, data_source_id=None, **kwargs)

DataAvailability Object

Parameters:
  • ref (int) – The DataAvailability entry ID in the database

  • net (str) – The network code of the Station

  • sta (str) – The station code

  • chan (str) – The component (channel)

  • path (str) – The full path to the folder containing the file

  • file (str) – The name of the file

  • starttime (datetime) – Start time of the file

  • endtime (datetime) – End time of the file

  • data_duration (float) – Cumulative duration of available data in the file

  • gaps_duration (float) – Cumulative duration of gaps in the file

  • samplerate (float) – Sample rate of the data in the file (in Hz)

  • flag (str) – The status of the entry: “N”ew, “M”odified or “A”rchive
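A small sketch of how the path, file and flag columns might be combined: the full file location is the folder joined with the file name, and entries flagged “N” or “M” are selected for further work. AvailabilityRow and needs_processing are hypothetical names, the sample paths are invented, and treating “N” and “M” as the entries still to be processed is an assumption about the workflow.

```python
import os
from dataclasses import dataclass


# Hypothetical stand-in for a few DataAvailability columns.
@dataclass
class AvailabilityRow:
    net: str
    sta: str
    chan: str
    path: str   # folder containing the file
    file: str   # file name only
    flag: str   # "N"ew, "M"odified or "A"rchive

    def full_path(self):
        """Folder and file name joined into one location."""
        return os.path.join(self.path, self.file)


def needs_processing(rows):
    # Assumption: new and modified entries still need work, archived ones do not.
    return [r for r in rows if r.flag in ("N", "M")]


rows = [
    AvailabilityRow("XX", "STA1", "HHZ", "/data/2024", "day001.mseed", "N"),
    AvailabilityRow("XX", "STA1", "HHZ", "/data/2024", "day002.mseed", "A"),
    AvailabilityRow("XX", "STA2", "HHZ", "/data/2024", "day001b.mseed", "M"),
]
pending = needs_processing(rows)
print([r.full_path() for r in pending])
```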

DataSource

class msnoise.msnoise_table_def.DataSource(name=None, uri='', data_structure='SDS', auth_env='MSNOISE', archive_format=None, network_code='*', channels='*', **kwargs)

DataSource Object — defines where raw waveform data comes from.

Parameters:
  • ref (int) – Primary key

  • name (str) – Human label e.g. “local”, “IRIS”, “GEOFON”, “EIDA”

  • uri (str) – Data location. Schemes:
      - bare path or sds:///path → local SDS archive
      - fdsn://http://... → FDSN web service
      - eida://http://... → EIDA routing client

  • data_structure (str) – SDS sub-path format for local/SDS sources (e.g. “SDS”, “BUD”). Ignored for FDSN/EIDA.

  • auth_env (str) – Environment variable prefix for credentials. The worker looks up {auth_env}_FDSN_USER, {auth_env}_FDSN_PASSWORD and {auth_env}_FDSN_TOKEN. Defaults to “MSNOISE”.
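The URI schemes and the credential lookup described above could be handled roughly as follows. This is only a sketch: classify_uri and lookup_credentials are hypothetical helpers, not MSNoise functions, and the example URL and the "MYNET" prefix are placeholders.

```python
import os


def classify_uri(uri):
    """Return (kind, location) for a DataSource-style URI (illustrative)."""
    for scheme in ("fdsn", "eida", "sds"):
        prefix = scheme + "://"
        if uri.startswith(prefix):
            # Strip the scheme; the remainder is the service URL or the path.
            return scheme, uri[len(prefix):]
    # A bare path is treated like a local SDS archive.
    return "sds", uri


def lookup_credentials(auth_env="MSNOISE", environ=None):
    """Read {auth_env}_FDSN_USER / _PASSWORD / _TOKEN from the environment."""
    environ = os.environ if environ is None else environ
    return {
        key: environ.get(f"{auth_env}_FDSN_{key.upper()}")
        for key in ("user", "password", "token")
    }


print(classify_uri("sds:///data/archive"))
print(classify_uri("fdsn://http://example.org"))
print(lookup_credentials("MYNET", {"MYNET_FDSN_USER": "alice"}))
```

Passing a dictionary as environ keeps the lookup testable without touching the real process environment. Missing variables come back as None.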

Lineage

class msnoise.msnoise_table_def.Lineage(lineage_str)

Normalised lineage-string table.

Each distinct lineage path, e.g. "preprocess_1/cc_1/filter_1/stack_1/refstack_1/mwcs_1/mwcs_dtt_1", is stored exactly once (the mwcs lineage encodes both stack and refstack parents). Job rows reference it through a small integer foreign key instead of repeating the full string across potentially millions of rows, which saves roughly 20× storage for that column.

Attributes:
  • lineage_id (int) – Auto-incremented primary key.

  • lineage_str (str) – Slash-separated step-name path, unique.

WorkflowStep

class msnoise.msnoise_table_def.WorkflowStep(**kwargs)

Simple workflow step container that wraps a config category and set_number.

get_config_params(session)

Get configuration parameters for this step
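Since a step wraps a config category and a set number, a step name can be split back into those two parts. The sketch below assumes step names follow the pattern category_setnumber, which is inferred from names such as mwcs_dtt_1 in the lineage example and is not confirmed by the class documentation. Both helper functions are hypothetical.

```python
def split_step_name(step_name):
    """Split 'mwcs_dtt_1' into ('mwcs_dtt', 1).

    Assumption: the set number is whatever follows the last underscore.
    """
    category, _, number = step_name.rpartition("_")
    return category, int(number)


def parse_lineage(lineage_str):
    """Turn a slash-separated lineage into (category, set_number) pairs."""
    return [split_step_name(step) for step in lineage_str.split("/")]


steps = parse_lineage(
    "preprocess_1/cc_1/filter_1/stack_1/refstack_1/mwcs_1/mwcs_dtt_1"
)
print(steps[0], steps[-1])
```

Each resulting pair is what a lookup of the step's parameters would need: the category and the set number identifying one configuration set.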