Reading outputs with MSNoiseResult
MSNoiseResult — user-facing class for loading computed results.
MSNoiseResult is the recommended entry point for reading any output
produced by the MSNoise pipeline from a notebook or script. You do not need
to know the on-disk path layout or the low-level core.io functions: just
build a result object for your lineage and call the appropriate get_*
method.
Two construction paths exist depending on whether a database is available:
DB-connected (active project): use
from_ids(),from_names(), orlist().DB-free (portable bundle): use
from_bundle()— no database, nomsnoise db init, works anywhere the msnoise package is installed.
For the full reference see msnoise.results.MSNoiseResult.
Dynamic method gating
Methods are dynamically gated: only methods whose required step category is
present in the lineage are exposed. Accessing an unavailable method raises
AttributeError with a clear message, and the method is absent from
dir() and tab-completion, so Jupyter’s autocomplete only shows what is
actually readable.
Constructing a result object (DB-connected)
Use MSNoiseResult.from_ids() with the integer config-set numbers for
each step you want to include. You only need to go as far down the pipeline
as the data you want to read:
from msnoise.results import MSNoiseResult
from msnoise.core.db import connect
db = connect()
# ----- read stacked CCFs and everything downstream -----
r = MSNoiseResult.from_ids(db, preprocess=1, cc=1, filter=1,
stack=1, refstack=1)
da = r.get_ccf("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ", ("1D", "1D"))
# ----- read raw (pre-stack) CC outputs only -----
# Initialise only to filter level — get_ccf_raw is still available.
r_cc = MSNoiseResult.from_ids(db, preprocess=1, cc=1, filter=1)
da = r_cc.get_ccf_raw("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ",
date="2023-01-01", kind="all")
# ----- read all the way to dv/v -----
r_dvv = MSNoiseResult.from_ids(db, preprocess=1, cc=1, filter=1,
stack=1, refstack=1,
mwcs=1, mwcs_dtt=1, mwcs_dtt_dvv=1)
ds = r_dvv.get_dvv(pair_type="CC", components="ZZ", mov_stack=("1D", "1D"))
# Only methods valid for this lineage are visible
r_cc.get_ccf(...) # AttributeError — 'stack' not in lineage
r_cc.get_ccf_raw(...) # works — 'cc' is in lineage
Available get_* methods
Method |
Requires in lineage |
Returns |
|---|---|---|
|
|
Raw per-window or daily-stacked CCFs written by |
|
|
Moving-stacked CCFs written by |
|
|
Reference stacks written by |
|
|
MWCS results written by |
|
|
MWCS dt/t results written by |
|
|
Stretching results written by |
|
|
Wavelet coherence results written by |
|
|
WCT dt/t results written by |
|
any dvv step |
Aggregated dv/v written by |
|
|
Daily PSD written by |
|
|
PSD RMS written by |
Reading raw CC outputs (pre-stack)
get_ccf_raw reads the files written directly by s03_compute_no_rotation
before any stacking. Two storage layouts exist depending on the config:
kind="all"— per-window CCFs under_output/all/with dims(times, taxis)wheretimesis the window start time.kind="daily"— daily-stacked CCFs under_output/daily/with dim(taxis,)per file.
The method only requires cc in the lineage, so you can call it on a
filter-level result without needing stack or anything further:
r = MSNoiseResult.from_ids(db, preprocess=1, cc=1, filter=1)
# Single day — per-window DataArray with dims (times, taxis)
da = r.get_ccf_raw("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ",
date="2023-01-01", kind="all")
# All days for one pair, daily stacks — dict keyed by date string
d = r.get_ccf_raw("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ", kind="daily")
# Concatenate all days into a single DataArray (times, taxis)
import xarray as xr
da_all = xr.concat(d.values(), dim="times").sortby("times")
Iterating over all completed results
list() returns every MSNoiseResult for which at least one Done job
exists in the given category:
for r in MSNoiseResult.list(db, "mwcs_dtt"):
ds = r.get_mwcs_dtt("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ", ("1D", "1D"))
Exporting dv/v with provenance
export_dvv() writes dv/v NetCDF files that embed full provenance
(lineage string, all config parameters, station metadata):
r = MSNoiseResult.list(db, "mwcs_dtt_dvv")[0]
written = r.export_dvv("exports/")
for f in written:
print(f) # dvv_CC_ZZ__pre1-cc1-f1-stk1-ref1-mwcs1-dtt1-dvv1__m1D-1D.nc
# Reload and inspect the embedded params
import xarray as xr
from msnoise.params import MSNoiseParams
ds = xr.open_dataset(written[0])
params = MSNoiseParams.from_yaml_string(ds.attrs["msnoise_params"])
print(params.mwcs.mwcs_wlen)
Portable bundles — HPC to laptop, or journal supplementary data
export_bundle() packages a selectable slice of the computed _output/
tree into a self-describing directory (or .zip), together with
params.yaml (full lineage config) and MANIFEST.json (sha256 per file,
station list, original output path). The bundle can be read on any machine
with MSNoise installed — no database required.
Selecting what to include with the from_step= argument:
|
What is copied |
|---|---|
|
Moving-stacked CCFs + all downstream steps (large) |
|
Reference stacks + all dv/v branches (moderate; default for CC workflows) |
|
Only the final MWCS dv/v aggregates (small; ideal for paper supplements) |
|
PSD daily files + RMS (moderate; default for PSD-only workflows) |
|
Raw per-window CCFs + everything downstream (very large; opt-in) |
|
Earliest step with |
The full lineage directory nesting is preserved verbatim inside the bundle
root, so from_bundle() can set output_folder = bundle_root and all
xr_get_*() calls resolve paths identically — nothing
in core/io.py needed to change.
Typical HPC → laptop workflow:
# ── On the HPC (has DB) ───────────────────────────────────────────────
from msnoise.results import MSNoiseResult
from msnoise.core.db import connect
db = connect()
r = MSNoiseResult.list(db, "mwcs_dtt_dvv")[0]
# Export reference stacks + all dv/v branches as a zip:
bundle = r.export_bundle(
"belgium_2023/",
from_step="refstack",
compress=True, # → belgium_2023.zip (~50 MB typical network)
)
# Export only the final dv/v for a journal data-availability statement:
r.export_bundle("paper_SI/", from_step="mwcs_dtt_dvv", compress=True)
# ── rsync to local machine ────────────────────────────────────────────
# rsync -avz hpc:/scratch/user/msnoise/belgium_2023.zip ./
# ── On the laptop / at the reviewer's machine (no DB needed) ─────────
from msnoise.results import MSNoiseResult
r2 = MSNoiseResult.from_bundle("belgium_2023.zip")
# Optional integrity check (recommended for published data):
r2.verify()
# OK — 347 files verified.
# Full MSNoiseResult API works immediately:
ds = r2.get_dvv("CC", "ZZ", ("1D", "1D"))
da = r2.get_ccf("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ", ("1D", "1D"))
ref = r2.get_ref("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ")
# Navigate branches (folder scan — no DB):
for branch in r2.branches():
print(branch)
# MSNoiseResult(category='mwcs_dtt_dvv', lineage='...', available_methods=[...])
# MSNoiseResult(category='stretching_dvv', lineage='...', available_methods=[...])
# Inspect the exact parameters that produced the result:
print(r2.params.mwcs.mwcs_wlen) # → 10.0
print(r2.params.cc.cc_sampling_rate) # → 20.0
# params.yaml is also preserved in MANIFEST.json alongside the original
# HPC output_folder path, msnoise version, and generation timestamp —
# sufficient for a data-availability or reproducibility statement.
- class msnoise.results.MSNoiseResult(db, lineage_names: list, _params=None)
A resolved pipeline branch with dynamically gated data-loading methods.
Only
get_*methods whose required step category appears inlineage_namesare accessible. Accessing a gated method raisesAttributeErrorwith a helpful message and the method does not appear indir()or tab-completion.Do not instantiate directly — use
from_ids(),from_names(), orlist().- Attributes:
- lineage_nameslist[str]
Ordered step-name strings, e.g.
['preprocess_1', 'cc_1', 'filter_1', 'stack_1', 'refstack_1'].- categorystr
Category of the terminal step, e.g.
'refstack'.- paramsMSNoiseParams
Merged configuration parameters for this lineage.
- output_folderstr
Root output folder.
- classmethod from_names(db, names: list) MSNoiseResult
Build from an explicit list of step-name strings.
- classmethod from_ids(db, preprocess=None, cc=None, psd=None, psd_rms=None, filter=None, stack=None, refstack=None, mwcs=None, mwcs_dtt=None, mwcs_dtt_dvv=None, stretching=None, stretching_dvv=None, wavelet=None, wavelet_dtt=None, wavelet_dtt_dvv=None) MSNoiseResult
Build from integer configset IDs in canonical workflow order.
- classmethod list(db, category: str, include_empty: bool = False) list
Return all done MSNoiseResult objects for a step category.
- classmethod from_bundle(path: str) MSNoiseResult
Load a read-only
MSNoiseResultfrom a bundle directory or.zipfile produced byexport_bundle().No database connection is required. All
get_*methods work immediately after construction.branches()uses a folder scan rather than the DB.- Parameters:
path – Path to a bundle directory or a
.zipfile.- Returns:
MSNoiseResultwith_db=None.- Raises:
FileNotFoundError – if
params.yamlis absent from the bundle.
Example — directory bundle:
r = MSNoiseResult.from_bundle("/data/msnoise_bundle/") ds = r.get_dvv("CC", "ZZ", ("1D", "1D"))
Example — zip bundle:
r = MSNoiseResult.from_bundle("/data/bundle.zip") da = r.get_ccf("BE.UCC..HHZ:BE.MEM..HHZ", "ZZ", ("1D", "1D")) r.verify()
- branches(include_empty: bool = False) list
Return downstream MSNoiseResult objects one step below this lineage.
In bundle / DB-free mode (
_db is None) the DAG topology is taken fromget_workflow_chains()and child directories are checked for the presence of an_outputsubdirectory to confirm they were actually computed.
- verify(verbose: bool = False) bool
Verify bundle integrity against
MANIFEST.json.Re-hashes every file listed in the manifest and compares to the stored sha256 digest. Prints a summary line; returns
Trueif all pass.- Parameters:
verbose – If
True, print every file path as it is checked.- Returns:
Trueif all checksums match,Falseotherwise.- Raises:
FileNotFoundError – if
MANIFEST.jsonis absent (not a bundle, or loaded viafrom_ids()/from_names()).
Example:
r = MSNoiseResult.from_bundle("/data/bundle/") assert r.verify(), "Bundle integrity check failed"
- get_ccf(pair=None, components=None, mov_stack=None)
Load CCFs from the stack step. Requires ‘stack’ in lineage.
- Returns:
xarray.DataArray(single pair/comp/ms) or dict of DataArrays keyed by(pair, comp, mov_stack).
- get_ccf_raw(pair=None, components=None, date=None, kind='all')
Load raw CC step outputs (per-window or daily-stacked CCFs).
Reads the files written by
s03_compute_no_rotationbefore stacking. Requires only'cc'in the lineage, so a result initialised down tofilter_N(e.g.MSNoiseResult.from_ids(db, preprocess=1, cc=1, filter=1)) exposes this method.Path layout on disk:
<o> / preprocess_1 / cc_1 / filter_1 / _output / all|daily / <comp> / <sta1>_<sta2> / <YYYY-MM-DD>.nc
- Parameters:
- pairstr or None
Station pair in
"NET.STA.LOC:NET.STA.LOC"format. When None all available pairs are returned.- componentsstr or None
Component string, e.g.
"ZZ". None = all.- datestr or None
ISO date
"YYYY-MM-DD". None = all available dates.- kind{“all”, “daily”}
"all"— per-window CCFs (_output/all/), dims(times, taxis)per file."daily"— daily-stacked CCFs (_output/daily/), dim(taxis,)per file, expanded with atimescoordinate so results can be concatenated by the caller.
- Returns:
- xarray.DataArray
When pair, components and date are all specified.
- dict
Otherwise: keys are
(pair_key, comp, date_str)tuples, collapsed to 1-tuples for any dimension fixed by the caller.
- Raises:
- ValueError
If no
cc_*or filter step is found in the lineage.- Example::
r = MSNoiseResult.from_ids(db, preprocess=1, cc=1, filter=1)
# Single file — per-window DataArray (times, taxis) da = r.get_ccf_raw(“BE.UCC..HHZ:BE.MEM..HHZ”, “ZZ”,
date=”2023-01-01”, kind=”all”)
# All dates for one pair/comp — daily stacks d = r.get_ccf_raw(“BE.UCC..HHZ:BE.MEM..HHZ”, “ZZ”, kind=”daily”) # d is {date_str: DataArray(times=[T], taxis=…)}
- get_ref(pair=None, components=None)
Load reference stacks. Requires ‘refstack’ in lineage.
- Returns:
xarray.DataArray(single pair/comp) or dict of DataArrays keyed by(pair, comp).
- get_mwcs(pair=None, components=None, mov_stack=None)
Load MWCS results. Requires ‘mwcs’ in lineage.
- Returns:
xarray.Datasetor dict of Datasets.
- get_mwcs_dtt(pair=None, components=None, mov_stack=None)
Load MWCS-DTT results. Requires ‘mwcs_dtt’ in lineage.
- Returns:
xarray.Datasetor dict of Datasets.
- get_stretching(pair=None, components=None, mov_stack=None)
Load stretching results. Requires ‘stretching’ in lineage.
- Returns:
xarray.Datasetor dict of Datasets.
- get_dvv(pair_type='ALL', components=None, mov_stack=None)
Load DVV aggregate results. Requires a DVV step in lineage.
- Returns:
xarray.Datasetor dict of Datasets keyed by(pair_type, comp, mov_stack).
- get_dvv_pairs(pair_type='ALL', components=None, mov_stack=None)
Load per-pair dv/v time series. Requires a DVV step in lineage.
Returns a Dataset with dims
(pair, times)and variablesdvvanderr, containing the standardised dv/v for every pair that was included in the aggregate — useful for per-pair QC, selection, and rejection.Same signature as
get_dvv(). ReturnsNonefor any combination where the pairs file was not yet produced (e.g. results computed before this feature was added).- Returns:
xarray.Datasetor dict of Datasets keyed by(pair_type, comp, mov_stack).
Example:
r = MSNoiseResult.from_ids(db, preprocess=1, cc=1, filter=1, stack=1, refstack=1, mwcs=1, mwcs_dtt=1, mwcs_dtt_dvv=1) ds = r.get_dvv_pairs("CC", "ZZ", ("1D", "1D")) # ds["dvv"].sel(pair="PF.FOR.00:PF.CSS.00").plot() # mask bad pairs and re-aggregate: good = ds["dvv"].where(ds["err"] < 0.01).mean("pair")
- get_wct(pair=None, components=None, mov_stack=None)
Load WCT results. Requires ‘wavelet’ in lineage.
- Returns:
xarray.Datasetor dict of Datasets.
- get_wct_dtt(pair=None, components=None, mov_stack=None)
Load WCT dt/t results. Requires ‘wavelet_dtt’ in lineage.
- Returns:
xarray.Datasetor dict of Datasets.
- get_psd(seed_id=None, day=None)
Load PSD results. Requires ‘psd’ in lineage.
- Returns:
xarray.Datasetor dict of Datasets keyed by(seed_id, day).
- get_psd_rms(seed_id=None)
Load PSD RMS results. Requires ‘psd_rms’ in lineage.
- Returns:
xarray.Datasetor dict of Datasets keyed by seed_id.
- export_bundle(dest: str, from_step: str | None = None, compress: bool = False, overwrite: bool = False) str
Export a portable, self-describing bundle of computed results.
Copies the selected portion of the
_output/tree together withparams.yamlandMANIFEST.json(sha256 per file) into dest. The result can be read on any machine with MSNoise installed usingfrom_bundle()— no database connection required.The full lineage directory nesting is preserved verbatim, so that
from_bundle()can setoutput_folder = bundle_rootand all existingxr_get_*()calls resolve paths identically.- Parameters:
- dest:
Output directory path. Created if absent. If compress is
True, a<dest>.zipfile is written instead and the temporary directory is removed on success.- from_step:
Category name of the first step whose
_output/to include. Must be present in the lineage. All downstream steps are included automatically; ancestor directory names are created as empty path components to preserve the full nesting.Accepted values (examples):
"stack","refstack","mwcs_dtt_dvv","psd"."cc"advances automatically to"filter"because raw CCFs are physically stored under the filter-step directory.None(default) picks the earliest step in the lineage that has an_outputdirectory ("stack"for CC workflows,"psd"for PSD-only workflows).- compress:
If
True, zip the bundle directory after writing and return the.zippath. The directory is removed on success.- overwrite:
If
True, overwrite an existing dest directory or.zip.
- Returns:
- str
Absolute path to the bundle directory or
.zipfile written.
- Raises:
- ValueError
If from_step is not found in the lineage.
- FileExistsError
If dest (or
<dest>.zip) already exists and overwrite isFalse.- FileNotFoundError
If the source directory for from_step does not exist on disk.
Examples
Export from refstack downwards, compressed:
db = connect() r = MSNoiseResult.list(db, "mwcs_dtt_dvv")[0] path = r.export_bundle( "belgium_2023/", from_step="refstack", compress=True, ) # → belgium_2023.zip
Read on another machine:
r2 = MSNoiseResult.from_bundle("belgium_2023.zip") r2.verify() ds = r2.get_dvv("CC", "ZZ", ("1D", "1D"))
Journal supplementary data (dvv only):
r.export_bundle( "paper_SI/", from_step="mwcs_dtt_dvv", compress=True, )
- static to_dataframe(ds)
Convert an xarray Dataset or DataArray returned by any
get_*method to aDataFrame.This is the recommended escape hatch for users who need pandas for custom analysis, plotting, or export. All
get_*methods return xarray objects; call this helper when you need a DataFrame:r = MSNoiseResult.from_ids(db, mwcs=1, mwcs_dtt=1) ds = r.get_mwcs_dtt("BE.UCC:BE.MEM", "ZZ", ("1D", "1D")) df = MSNoiseResult.to_dataframe(ds)
For Datasets with multiple variables the result is a DataFrame with a
MultiIndexon the columns. For DataArrays the result is a flat DataFrame indexed bytimes.- Parameters:
ds –
xarray.Datasetorxarray.DataArray.- Returns:
- export_dvv(path: str, pair_type: str = 'CC', components: str | None = None, mov_stack: tuple | None = None) list
Export dv/v time series bundled with full parameter provenance.
Writes one NetCDF file per
(components, mov_stack)combination. Each file embeds the dv/v statistics (mean,std,median,n_pairs, and any weighted / trimmed variants) plus global attributes for full reproducibility:lineageSlash-separated step-name path.
msnoise_paramsFull YAML dump of
params— one block per config category. Load offline withyaml.safe_load(ds.attrs["msnoise_params"]).msnoise_versionPackage version string.
generatedISO-8601 UTC timestamp.
pair_type,components,mov_stackProvenance of the specific slice exported.
- Parameters:
- path:
Output directory (created if absent) or a full
.ncpath. When a directory is given, files are nameddvv_<pair_type>_<comp>__<lineage_tag>__m<ms>.nc. A full path requires exactly one(components, mov_stack)to be specified.- pair_type:
Pair-type filter (default
"CC").- components:
Component string, e.g.
"ZZ".Noneexports all found.- mov_stack:
Tuple
(window, step).Noneexports all found.
- Returns:
- list of str
Paths of files written.
- Example::
db = connect() r = MSNoiseResult.list(db, “mwcs_dtt_dvv”)[0] written = r.export_dvv(“exports/”, pair_type=”CC”) for f in written:
print(f) # dvv_CC_ZZ__pre1-cc1-f1-stk1-ref1-mwcs1-dtt1-dvv1__m1D-1D.nc
# Reload and inspect provenance import xarray as xr, yaml ds = xr.open_dataset(written[0]) params = yaml.safe_load(ds.attrs[“msnoise_params”]) print(params[“mwcs”][“mwcs_wlen”])