# Core Functions¶

## Whitening¶

move2obspy.whiten(data, Nfft, delta, freqmin, freqmax, plot=False)

This function takes 1-dimensional data timeseries array, goes to frequency domain using fft, whitens the amplitude of the spectrum in frequency domain between freqmin and freqmax and returns the whitened fft.

Parameters: data (numpy.ndarray) – Contains the 1D time series to whiten Nfft (int) – The number of points to compute the FFT delta (float) – The sampling frequency of the data freqmin (float) – The lower frequency bound freqmax (float) – The upper frequency bound plot (bool) – Whether to show a raw plot of the action (default: False) numpy.ndarray The FFT of the input trace, whitened between the frequency bounds

## Correlation¶

move2obspy.myCorr(data, maxlag, plot=False, nfft=None)

This function takes ndimensional data array, computes the cross-correlation in the frequency domain and returns the cross-correlation function between [-maxlag:maxlag].

Parameters: data (numpy.ndarray) – This array contains the fft of each timeseries to be cross-correlated. maxlag (int) – This number defines the number of samples (N=2*maxlag + 1) of the CCF that will be returned. numpy.ndarray The cross-correlation function between [-maxlag:maxlag]

## Moving-Window Cross-Spectral method¶

move2obspy.mwcs(current, reference, freqmin, freqmax, df, tmin, window_length, step, smoothing_half_win=5)

The current time series is compared to the reference. Both time series are sliced in several overlapping windows. Each slice is mean-adjusted and cosine-tapered (85% taper) before being Fourier- transformed to the frequency domain. $$F_{cur}(\nu)$$ and $$F_{ref}(\nu)$$ are the first halves of the Hermitian symmetric Fourier-transformed segments. The cross-spectrum $$X(\nu)$$ is defined as $$X(\nu) = F_{ref}(\nu) F_{cur}^*(\nu)$$

in which $${}^*$$ denotes the complex conjugation. $$X(\nu)$$ is then smoothed by convolution with a Hanning window. The similarity of the two time-series is assessed using the cross-coherency between energy densities in the frequency domain:

$$C(\nu) = \frac{|\overline{X(\nu))}|}{\sqrt{|\overline{F_{ref}(\nu)|^2} |\overline{F_{cur}(\nu)|^2}}}$$

in which the over-line here represents the smoothing of the energy spectra for $$F_{ref}$$ and $$F_{cur}$$ and of the spectrum of $$X$$. The mean coherence for the segment is defined as the mean of $$C(\nu)$$ in the frequency range of interest. The time-delay between the two cross correlations is found in the unwrapped phase, $$\phi( u)$$, of the cross spectrum and is linearly proportional to frequency:

$$\phi_j = m. u_j, m = 2 \pi \delta t$$

The time shift for each window between two signals is the slope $$m$$ of a weighted linear regression of the samples within the frequency band of interest. The weights are those introduced by [Clarke2011], which incorporate both the cross-spectral amplitude and cross-coherence, unlike [Poupinet1984]. The errors are estimated using the weights (thus the coherence) and the squared misfit to the modelled slope:

$$e_m = \sqrt{\sum_j{(\frac{w_j \nu_j}{\sum_i{w_i \nu_i^2}})^2}\sigma_{\phi}^2}$$

where $$w$$ are weights, $$\nu$$ are cross-coherences and $$\sigma_{\phi}^2$$ is the squared misfit of the data to the modelled slope and is calculated as $$\sigma_{\phi}^2 = \frac{\sum_j(\phi_j - m \nu_j)^2}{N-1}$$

The output of this process is a table containing, for each moving window: the central time lag, the measured delay, its error and the mean coherence of the segment.

Warning

The time series will not be filtered before computing the cross-spectrum! They should be band-pass filtered around the freqmin-freqmax band of interest beforehand.

Parameters: current (numpy.ndarray) – The “Current” timeseries reference (numpy.ndarray) – The “Reference” timeseries freqmin (float) – The lower frequency bound to compute the dephasing (in Hz) freqmax (float) – The higher frequency bound to compute the dephasing (in Hz) df (float) – The sampling rate of the input timeseries (in Hz) tmin (float) – The leftmost time lag (used to compute the “time lags array”) window_length (float) – The moving window length (in seconds) step (float) – The step to jump for the moving window (in seconds) smoothing_half_win (int) – If different from 0, defines the half length of the smoothing hanning window. numpy.ndarray [time_axis,delta_t,delta_err,delta_mcoh]. time_axis contains the central times of the windows. The three other columns contain dt, error and mean coherence for each window.

## WLS¶

move2obspy.linear_regression(xdata, ydata, weights=None, p0=None, intercept_origin=True, **kwargs)

Use linear least squares to fit a function, f, to data. This method is a generalized version of scipy.optimize.minpack.curve_fit(); allowing for Ordinary Least Square and Weighted Least Square regressions:

• OLS through origin: linear_regression(xdata, ydata)
• OLS with any intercept: linear_regression(xdata, ydata, intercept_origin=False)
• WLS through origin: linear_regression(xdata, ydata, weights)
• WLS with any intercept: linear_regression(xdata, ydata, weights, intercept_origin=False)

If the expected values of slope (and intercept) are different from 0.0, provide the p0 value(s).

Parameters: xdata – The independent variable where the data is measured. ydata – The dependent data - nominally f(xdata, …) weights – If not None, the uncertainties in the ydata array. These are used as weights in the least-squares problem. If None, the uncertainties are assumed to be 1. In SciPy vocabulary, our weights are 1/sigma. p0 – Initial guess for the parameters. If None, then the initial values will all be 0 (Different from SciPy where all are 1) intercept_origin – If True: solves y=a*x (default); if False: solves y=a*x+b.

Extra keword arguments will be passed to scipy.optimize.minpack.curve_fit().

Return type: tuple (slope, std_slope) if intercept_origin is True; (slope, intercept, std_slope, std_intercept) if False.

## Preprocessing¶

preprocessing.preprocess(db, stations, comps, goal_day, params, responses=None)

Fetches data for each stations and each comps using the data_availability table in the database.

To correct for instrument responses, make sure to set remove_response to “Y” in the config and to provide the responses DataFrame.

>>> from msnoise.api import connect, get_params, preload_instrument_responses
>>> from msnoise.preprocessing import preprocess
>>> db = connect()
>>> params = get_params(db)

Parameters: db (sqlalchemy.orm.session.Session) – A Session object, as obtained by msnoise.api.connect(). stations (list of str) – a list of station names, in the format NET.STA. comps (list of str) – a list of component names, in Z,N,E,1,2. goal_day (str) – the day of data to load, ISO 8601 format: e.g. 2016-12-31. params (class) – an object containing the config parameters, as obtained by msnoise.api.get_params(). responses (pandas.DataFrame) – a DataFrame containing the instrument responses, as obtained by msnoise.api.preload_instrument_responses(). obspy.core.stream.Stream A Stream object containing all traces.