Unified project access (MSNoiseProject)

MSNoiseProject is the single entry point for reading MSNoise results — regardless of whether data lives in a local live project, a project archive on disk, or a paper from the MSNoise Reproducible Papers registry.

All three paths converge on the same API:

# A — live project (cwd contains db.ini)
from msnoise.project import MSNoiseProject
project = MSNoiseProject.from_current()

# B — local project archive
project = MSNoiseProject.from_archive("level_stack.tar.zst")

# C — MSNoise Reproducible Papers (auto-download)
from msnoise.papers import MRP
project = MRP().get_paper("2016_DePlaen_PitonDeLaFournaise").get_project("stack")

# identical from here — all three paths
for result in project.list("stack"):
    ds = result.get_ccf()

Project archives vs result bundles

Two distinct archive types exist in MSNoise 2.x:

  • Project archive (.tar.zst) — full multi-lineage project at a given entry level, containing all filter / stack branches. Produced by msnoise project export, consumed by msnoise project import or MSNoiseProject.from_archive().

  • Result bundle (directory or .zip) — single-lineage portable export: params.yaml + _output/. Produced and consumed by export_bundle() / from_bundle().

Entry levels

A project archive is created at a specific entry level — the lowest pipeline step whose outputs are included.

Level         What is bundled                            Resume from …
------------  -----------------------------------------  -------------------------
preprocess    SDS waveform cache                         cc onwards
cc            Raw CCF NetCDFs                            stack + refstack
stack         Stacked CCFs + reference stacks            mwcs, stretching, wavelet
mwcs          MWCS + DTT outputs                         mwcs_dtt_dvv
stretching    Stretching outputs                         stretching_dvv
wavelet       WCT + WCT-DTT outputs                      wavelet_dtt_dvv
dvv           Final dv/v aggregates + per-pair series    Notebooks only

stack and refstack outputs are always bundled together.
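
When scripting around archives it can help to have the entry-level table available as data. The sketch below simply transcribes the table into a dictionary; the names `RESUME_FROM` and `resume_hint` are illustrative and are not part of MSNoise, and the values are the table's wording copied verbatim rather than validated step names.

```python
# Entry level -> "Resume from" text, copied verbatim from the table above.
# RESUME_FROM and resume_hint are hypothetical helpers, not MSNoise API.
RESUME_FROM = {
    "preprocess": "cc onwards",
    "cc": "stack + refstack",
    "stack": "mwcs, stretching, wavelet",
    "mwcs": "mwcs_dtt_dvv",
    "stretching": "stretching_dvv",
    "wavelet": "wavelet_dtt_dvv",
    "dvv": "Notebooks only",
}


def resume_hint(level):
    """Return the table's 'Resume from' text for an entry level."""
    try:
        return RESUME_FROM[level]
    except KeyError:
        valid = ", ".join(RESUME_FROM)
        raise ValueError(f"unknown entry level {level!r}; expected one of: {valid}") from None
```

A lookup such as `resume_hint("cc")` returns the text from the corresponding row, and an unknown level fails early with the list of valid names.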

Exporting a project archive

Run from the project root after the pipeline has finished:

msnoise project export --level stack --output /data/level_stack.tar.zst

The command prints the archive SHA-256 to paste into bundle_pointer.yaml. No database connection is needed — only the _output/ tree and project.yaml are read.

Python equivalent:

from msnoise.core.project_io import export_project
sha = export_project("/path/to/project", "stack", "/data/level_stack.tar.zst")
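
The SHA-256 reported at export time can also be recomputed independently, for instance to check an archive after copying it between machines. The following is a minimal sketch using only the standard library; `sha256_of_file` is a hypothetical helper, not an MSNoise function, and it assumes the reported value is the plain hex digest of the archive file.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Hypothetical helper (not part of MSNoise). Chunked reading keeps
    memory use flat even for multi-gigabyte archives.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the result for `/data/level_stack.tar.zst` with the value printed by the export command shows whether the file is intact.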

Importing a project archive

Download, verify, extract, and initialise the database in one step:

msnoise project import \
    --from bundle_pointer.yaml \
    --level stack \
    --project-dir ./my_project \
    --with-jobs

--with-jobs reconstructs flag=D (done) jobs from the _output/ tree, so the pipeline can be resumed immediately afterwards:

msnoise new_jobs --after stack

Reading results without a database

MSNoiseProject.list() is always filesystem-based — no database needed:

from msnoise.project import MSNoiseProject

project = MSNoiseProject.from_project_dir("/path/to/extracted")
results = project.list("stack")

for result in results:
    print(result.lineage_names)   # ['global_1', ..., 'stack_1']
    ds = result.get_ccf(component="ZZ", mov_stack=("1D", "1D"))

# Traverse to child steps (folder scan, no DB)
for result in results:
    for branch in result.branches():
        print(branch.category, branch.lineage_names[-1])

Resuming the pipeline

To continue running the pipeline after importing an archive:

# via CLI (recommended)
msnoise project import --from bundle_pointer.yaml --level stack \
    --project-dir ./my_project --with-jobs

# via Python
from msnoise.project import MSNoiseProject

project = MSNoiseProject.from_archive("level_stack.tar.zst",
                                      project_dir="./my_project")
project.init_db(with_jobs=True)
db = project.db   # SQLAlchemy session now available

class msnoise.project.MSNoiseProject(project_dir: str | Path, _db=None, _tmpdir=None, _imported_levels: list[str] | None = None)

Unified entry point for accessing MSNoise results.

Attributes:
project_dir : str

Absolute path to the project root directory. Contains project.yaml and (after init_db()) db.ini.

classmethod from_current(project_dir: str | Path = '.') -> MSNoiseProject

Load a live project from project_dir (default: current directory).

Reads db.ini and connects to the existing database. The returned object has _db populated and is ready for pipeline operations as well as result access.

Parameters:

project_dir – Path containing db.ini. Defaults to ".".

Raises:

FileNotFoundError – if db.ini is absent from project_dir.

classmethod from_archive(path: str | Path | list, project_dir: str | Path | None = None) -> MSNoiseProject

Load a project from one or more .tar.zst project archives.

If project_dir is None the archive(s) are extracted into a temporary directory kept alive for the lifetime of the returned object. Pass an explicit project_dir for a persistent extraction.

When a list of archives is supplied they are all extracted into the same project_dir — their _output/ trees never overlap, so the result is a composite project equivalent to having run every bundled level locally.

Parameters:
  • path – Path or list of paths to .tar.zst archive(s).

  • project_dir – Destination directory. None → auto temp dir.

Returns:

MSNoiseProject with _db=None.
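
The "temporary directory kept alive for the lifetime of the returned object" behaviour is a common pattern: the object holds a reference to a `tempfile.TemporaryDirectory` so that the directory is only removed once the object is cleaned up. The sketch below illustrates that pattern in isolation. `ExtractedProject` is a hypothetical stand-in, not the MSNoise class, and nothing here claims to mirror how MSNoise implements it internally.

```python
import tempfile
from pathlib import Path


class ExtractedProject:
    """Hypothetical holder showing the temp-dir lifetime pattern."""

    def __init__(self, project_dir=None):
        if project_dir is None:
            # Keep the TemporaryDirectory object itself: the directory
            # exists for as long as this reference does.
            self._tmpdir = tempfile.TemporaryDirectory(prefix="msnoise_")
            self.project_dir = Path(self._tmpdir.name)
        else:
            # Persistent extraction target chosen by the caller.
            self._tmpdir = None
            self.project_dir = Path(project_dir)
            self.project_dir.mkdir(parents=True, exist_ok=True)

    def close(self):
        """Remove the temporary directory, if one was created."""
        if self._tmpdir is not None:
            self._tmpdir.cleanup()
            self._tmpdir = None
```

The practical consequence for callers is the same as described above: with no explicit `project_dir`, paths obtained from the project stop being valid once the object goes away, so a persistent directory is the safer choice for anything that outlives the session.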

classmethod from_project_dir(project_dir: str | Path) -> MSNoiseProject

Point at an already-extracted project directory (no DB).

Use when the archive has already been extracted manually or by a previous call to from_archive() with a persistent project_dir.

Parameters:

project_dir – Path containing project.yaml.

Raises:

FileNotFoundError – if project.yaml is absent.

property db

SQLAlchemy session. Raises RuntimeError if not initialised.

Call init_db() first, or use from_current() which connects automatically.

init_db(with_jobs: bool = False) -> None

Initialise the project database from project.yaml.

Runs msnoise db init --from-yaml project.yaml in project_dir, then connects to the created database.

Parameters:

with_jobs – If True, also reconstruct flag=D jobs by scanning the extracted _output/ tree. Only needed when continuing the pipeline after importing a project archive.

Raises:

NotImplementedError – if with_jobs=True but meta.yaml is absent from the project directory.

list(category: str) -> list

Return all computed MSNoiseResult objects for category.

Always filesystem-based: no database is required. Scans project_dir for <category>_N directories that contain an _output/ subdirectory. A params.yaml is written alongside _output/ on first access and reused on subsequent calls.

The returned objects share the same interface as those obtained via from_bundle() — all get_* methods work immediately. branches() uses a folder scan and is fully functional.

Parameters:

category – Category name without set number, e.g. "stack".

Returns:

List of MSNoiseResult (_db=None), sorted by lineage path.

Raises:

FileNotFoundError – if project.yaml is absent from project_dir.
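
To make the folder scan concrete, here is a rough standard-library sketch of the same idea: walk a project tree, keep directories named `<category>_N` that contain an `_output/` subdirectory, and sort them by their path. The function `find_step_dirs` and the nested layout used in the usage note are assumptions for illustration; this is not the MSNoise implementation and it does not write params.yaml.

```python
import re
from pathlib import Path


def find_step_dirs(project_dir, category):
    """Return <category>_N directories that contain an _output/ folder.

    Hypothetical helper, not MSNoise code. Results are sorted by their
    path relative to project_dir.
    """
    root = Path(project_dir)
    # Exact match on "<category>_<digits>", e.g. "stack_1" but not "stack_old".
    name_re = re.compile(rf"^{re.escape(category)}_\d+$")
    hits = [
        d
        for d in root.rglob(f"{category}_*")
        if d.is_dir() and name_re.match(d.name) and (d / "_output").is_dir()
    ]
    return sorted(hits, key=lambda d: d.relative_to(root).parts)
```

With a tree such as `global_1/cc_1/stack_1/_output/`, calling `find_step_dirs(root, "stack")` yields the `stack_1` directory and skips siblings such as `refstack_1` or a `stack_2` folder that has no `_output/` yet.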

get_stations() -> list

Return station list as obspy.core.util.attribdict.AttribDict objects.

If a DB session is available (from_current()), queries the Station table. Otherwise parses project.yaml from project_dir — works in DB-free archive mode.

Each item exposes net, sta, X, Y, altitude, coordinates attributes, compatible with get_interstation_distance().

get_distance(pair: str) -> float

Return interstation distance in km for pair.

Parameters:

pair – Station pair in "NET.STA.LOC:NET.STA.LOC" format.

Returns:

Distance in kilometres.

Raises:

KeyError – if either station is not found.
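
Because a malformed pair string is easy to produce when building pairs by hand, it can be useful to validate the "NET.STA.LOC:NET.STA.LOC" format before calling get_distance(). The helper below is a small sketch for that purpose; `split_pair` is a hypothetical name and is not provided by MSNoise.

```python
def split_pair(pair):
    """Split 'NET.STA.LOC:NET.STA.LOC' into two (net, sta, loc) tuples.

    Hypothetical helper, not part of MSNoise. Raises ValueError when the
    string does not have exactly two dot-separated triples.
    """
    halves = pair.split(":")
    if len(halves) != 2:
        raise ValueError(f"expected 'NET.STA.LOC:NET.STA.LOC', got {pair!r}")
    ids = []
    for seed_id in halves:
        parts = seed_id.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected NET.STA.LOC, got {seed_id!r}")
        ids.append(tuple(parts))
    return ids[0], ids[1]
```

The station and network codes used to call it are up to the caller; the function only checks the shape of the string, not whether the stations exist in the project.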

See also