MSNoise Reproducible Papers — Analysis Notebook
This notebook demonstrates the three entry paths into MSNoiseProject and then shows canonical analyses that work identically regardless of how the project was loaded:
Section |
What it does |
|---|---|
0 — Load |
Path A (live), B (local archive), C (MRP registry) |
1 — CCF matrix |
|
2 — REF stacks |
|
3 — dv/v timeseries |
|
Prerequisites
MSNoise ≥ 2.0 installed (including
pooch,zstandard,platformdirs).For Path A: a live project directory with
db.ini.For Path B: a
.tar.zstproject archive (produced bymsnoise project export).For Path C: network access to the MRP registry; the paper must have a
bundle_pointer.yamlwith the requested level.
0 · Load the project
Run exactly one of the three cells below, then continue from Section 1.
Path A — live project (cwd has db.ini)
[1]:
# from msnoise.project import MSNoiseProject
# project = MSNoiseProject.from_current(".") # or pass an explicit path
# print(project.project_dir)
Path B — local project archive
[2]:
# from msnoise.project import MSNoiseProject
# project = MSNoiseProject.from_archive("level_stack.tar.zst")
# print(project.project_dir)
Path C — MSNoise Reproducible Papers registry (default for this notebook)
[3]:
from msnoise.papers import MRP
mrp = MRP()
mrp.list_papers()
ID Year Net Levels ✓
----------------------------------------------------------------------------------------------------
2014_Lecocq_MSNoiseUndervolc 2014 YA...... stack, dvv ✅
2016_DePlaen_PitonDeLaFournaise 2016 PF......
2019_DePlaen_Etna 2019 IV......
2019_Yates_WhiteIsland 2019 NZ......
2022_Wang_Hikurangi 2022 YH......
2023_Yates_ClusteringCCFs 2023 YA, NZ..
2024_Yates_RuapehuSnow 2024 NZ......
[4]:
# +
PAPER_ID = "2016_DePlaen_PitonDeLaFournaise" # ← change as needed
LEVEL = "stack" # "stack" | "dvv" | …
paper = mrp.get_paper(PAPER_ID)
paper.info()
# -
100% 0.0 / 0.0 MB
100% 0.0 / 0.0 MB
Paper: 2016_DePlaen_PitonDeLaFournaise
journal_abbrev: GRL
region: Piton de la Fournaise, La Réunion
network: PF
short_description: Single-station SC + AC, 4 frequency bands
msnoise_version_min: 2.0.0
levels_available: []
data_open: True
uses_msnoise: True
validated: False
bundle_levels_available: (none)
[5]:
project = paper.get_project(LEVEL)
print("Project dir:", project.project_dir)
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[5], line 1
----> 1 project = paper.get_project(LEVEL)
2 print("Project dir:", project.project_dir)
File D:\PythonForSource\MSNoise_Stack\MSNoise\msnoise\papers.py:414, in MRPPaper.get_project(self, level, project)
412 available_levels = self._pointer.get("levels", {})
413 if not available_levels:
--> 414 raise FileNotFoundError(
415 f"No bundle_pointer.yaml found for {self.paper_id!r}."
416 )
418 # Resolve level list
419 if level == "all":
FileNotFoundError: No bundle_pointer.yaml found for '2016_DePlaen_PitonDeLaFournaise'.
1 · CCF matrix overview
Load all stacked CCF results for the requested filter / mov_stack combination and plot an overview matrix: one row per pair, lag time on the x-axis.
[6]:
# +
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
# Parameters — adjust to match the paper's filter / mov_stack settings
COMPONENT = "ZZ"
MOV_STACK = ("1D", "1D") # (duration, step) as stored in the NetCDF
results = project.list("stack")
print(f"Found {len(results)} stack lineage(s).")
fig, axes = plt.subplots(
len(results), 1,
figsize=(12, 2.5 * max(len(results), 1)),
squeeze=False,
)
for ax, result in zip(axes[:, 0], results):
try:
ds = result.get_ccf(component=COMPONENT, mov_stack=MOV_STACK)
except Exception as exc:
ax.set_visible(False)
print(f" Skipped {result.lineage_names[-1]}: {exc}")
continue
pairs = ds.coords["pair"].values if "pair" in ds.coords else ds.coords.get("pairs", [None]).values
times = ds.coords.get("times", None)
taxis = ds.coords.get("taxis", None)
if taxis is None or times is None:
ax.set_visible(False)
continue
data = ds["ccf"].values if "ccf" in ds else ds[list(ds.data_vars)[0]].values
# data shape: (times, pairs, taxis) or (pairs, taxis) for REF
if data.ndim == 3:
data = data.mean(axis=0) # collapse times → (pairs, taxis)
t = taxis.values
for i, (row, pair) in enumerate(zip(data, pairs)):
norm = np.abs(row).max() or 1.0
ax.plot(t, row / norm + i, lw=0.8, color="steelblue")
ax.text(t[-1] + 0.1, i, str(pair), fontsize=6, va="center")
ax.set_title(f"{result.lineage_names[-1]} ({COMPONENT}, {MOV_STACK})", fontsize=9)
ax.set_xlabel("Lag (s)")
ax.set_ylabel("Pair index")
ax.xaxis.set_minor_locator(mticker.AutoMinorLocator())
if hasattr(ds, "close"):
ds.close()
fig.tight_layout()
plt.show()
# -
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[6], line 10
7 COMPONENT = "ZZ"
8 MOV_STACK = ("1D", "1D") # (duration, step) as stored in the NetCDF
---> 10 results = project.list("stack")
11 print(f"Found {len(results)} stack lineage(s).")
13 fig, axes = plt.subplots(
14 len(results), 1,
15 figsize=(12, 2.5 * max(len(results), 1)),
16 squeeze=False,
17 )
NameError: name 'project' is not defined
2 · REF stack comparison
For each stack lineage, traverse to the sibling refstack(s) and overlay their reference waveforms for a selected pair.
[7]:
# +
PAIR = None # set to e.g. "BE.UCC..HHZ:BE.MEM..HHZ" or leave None for first pair
fig2, ax2 = plt.subplots(figsize=(12, 4))
for result in results:
try:
branches = result.branches()
except Exception:
branches = []
for branch in branches:
if branch.category != "refstack":
continue
try:
ds_ref = branch.get_ref(component=COMPONENT)
except Exception:
continue
pairs_in = ds_ref.coords.get("pair", ds_ref.coords.get("pairs", None))
if pairs_in is None:
continue
target_pair = PAIR or str(pairs_in.values[0])
taxis_ref = ds_ref.coords.get("taxis", None)
if taxis_ref is None:
continue
idx = list(pairs_in.values).index(target_pair) if target_pair in pairs_in.values else 0
waveform = ds_ref["ref"].values[idx] if "ref" in ds_ref else ds_ref[list(ds_ref.data_vars)[0]].values[idx]
norm = np.abs(waveform).max() or 1.0
ax2.plot(taxis_ref.values, waveform / norm,
label=f"{branch.lineage_names[-1]}", alpha=0.8, lw=1.2)
if hasattr(ds_ref, "close"):
ds_ref.close()
ax2.set_title(f"REF stack comparison — pair: {PAIR or '(first available)'} — {COMPONENT}", fontsize=10)
ax2.set_xlabel("Lag (s)")
ax2.set_ylabel("Normalised amplitude")
ax2.legend(fontsize=8, loc="upper right")
ax2.xaxis.set_minor_locator(mticker.AutoMinorLocator())
fig2.tight_layout()
plt.show()
# -
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[7], line 6
2 PAIR = None # set to e.g. "BE.UCC..HHZ:BE.MEM..HHZ" or leave None for first pair
4 fig2, ax2 = plt.subplots(figsize=(12, 4))
----> 6 for result in results:
7 try:
8 branches = result.branches()
NameError: name 'results' is not defined
3 · dv/v timeseries
Load the final DVV aggregates (requires a project loaded at level "dvv"). If the current project was loaded at the "stack" level the cell below gracefully skips with a message.
[8]:
# +
dvv_results = project.list("mwcs_dtt_dvv")
if not dvv_results:
print(
"No mwcs_dtt_dvv outputs found in this project.\n"
"Load at level='dvv' to access the final dv/v aggregates:\n"
" project = paper.get_project('dvv')"
)
else:
fig3, ax3 = plt.subplots(figsize=(12, 4))
for dvv_result in dvv_results:
try:
ds_dvv = dvv_result.get_dvv(component=COMPONENT, mov_stack=MOV_STACK)
except Exception as exc:
print(f" Skipped {dvv_result.lineage_names[-1]}: {exc}")
continue
if "times" not in ds_dvv.coords or "dvv" not in ds_dvv:
ds_dvv.close()
continue
t = ds_dvv.coords["times"].values.astype("datetime64[D]")
dvv_mean = float(ds_dvv["dvv"].mean("pair")) if "pair" in ds_dvv.dims else ds_dvv["dvv"].values
err_mean = float(ds_dvv["err"].mean("pair")) if ("err" in ds_dvv and "pair" in ds_dvv.dims) else None
ax3.plot(t, dvv_mean * 100, label=dvv_result.lineage_names[-1], lw=1.2)
if err_mean is not None:
ax3.fill_between(t,
(dvv_mean - err_mean) * 100,
(dvv_mean + err_mean) * 100,
alpha=0.2)
ds_dvv.close()
ax3.axhline(0, color="k", lw=0.6, ls="--")
ax3.set_title(f"dv/v — {COMPONENT} {MOV_STACK}", fontsize=10)
ax3.set_xlabel("Date")
ax3.set_ylabel("dv/v (%)")
ax3.legend(fontsize=8)
fig3.tight_layout()
plt.show()
# -
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[8], line 2
1 # +
----> 2 dvv_results = project.list("mwcs_dtt_dvv")
4 if not dvv_results:
5 print(
6 "No mwcs_dtt_dvv outputs found in this project.\n"
7 "Load at level='dvv' to access the final dv/v aggregates:\n"
8 " project = paper.get_project('dvv')"
9 )
NameError: name 'project' is not defined