Quality Control API
Quality-control modules build trace inventories, evaluate waveform and metric QC rules, create summaries, and prepare manual-review queues.
Package Entry Point
Quality-control workflow modules.
- class spatial_vtk.qc.InventoryBandSpec(label: str, period_min: float, period_max: float)
Describe one passband present in a trace-inventory CSV.
- class spatial_vtk.qc.TraceInventoryLookup(rows: dict[tuple[str, str, str, str, str], dict[str, str]] | None = None, *, csv_path: Path | None = None, available_bands: list[InventoryBandSpec] | None = None, disabled_requested_bands: set[str] | None = None)
Dictionary lookup with trace-inventory metadata.
- spatial_vtk.qc.build_comparison_eligibility(qc_summary: DataFrame | str | Path) -> DataFrame
Return rows where observed and synthetic QC both pass.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific QC summary table.
- Returns:
Comparison-eligible event/station/component/passband/metric rows.
- Return type:
pandas.DataFrame
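The both-sides-pass rule can be illustrated with a minimal pure-Python sketch. The field names source and qc_pass and the helper eligible_keys are assumptions made for illustration; they are not confirmed parts of the library's schema or implementation.

```python
from collections import defaultdict

# Identity fields taken from the documented return description.
KEY_FIELDS = ("event_id", "station", "component", "passband", "metric")


def make_row(event_id, station, source, qc_pass):
    """Build one hypothetical side-specific QC row (field names are assumed)."""
    return {
        "event_id": event_id, "station": station, "component": "Z",
        "passband": "2-5", "metric": "pgv", "source": source, "qc_pass": qc_pass,
    }


def eligible_keys(rows):
    """Return keys whose observed and synthetic rows both pass QC."""
    sides_by_key = defaultdict(dict)
    for row in rows:
        key = tuple(row[field] for field in KEY_FIELDS)
        sides_by_key[key][row["source"]] = bool(row["qc_pass"])
    return sorted(
        key for key, sides in sides_by_key.items()
        if sides.get("observed") and sides.get("synthetic")
    )


qc_rows = [
    make_row("ev1", "STA1", "observed", True),
    make_row("ev1", "STA1", "synthetic", True),
    make_row("ev1", "STA2", "observed", True),
    make_row("ev1", "STA2", "synthetic", False),  # synthetic side fails
    make_row("ev2", "STA1", "observed", True),    # no synthetic row at all
]
eligible = eligible_keys(qc_rows)
```

Only the ev1/STA1 key survives here, because it is the only key with a passing row on both sides.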
- spatial_vtk.qc.build_event_station_pair_retention_table(qc_summary: DataFrame | str | Path) -> DataFrame
Summarize comparison-pair retention for each event-station pair.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific metric QC table with observed and synthetic rows.
- Returns:
One row per event/station with total comparison pairs, retained pairs, and retained percentage across all components, passbands, and metrics.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build_metric_pair_retention_table(qc_summary: DataFrame | str | Path, *, group_cols: Sequence[str] = ('metric', 'passband')) -> DataFrame
Summarize post-QC observed/synthetic pair retention.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific metric QC table with observed and synthetic rows.
group_cols (collections.abc.Sequence, optional) – Columns used to group retention percentages, usually metric and passband.
- Returns:
Pair-retention rows with total pairs before QC, retained pairs after QC, and retained percentage.
- Return type:
pandas.DataFrame
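As a rough illustration of the grouped retention summary, the sketch below counts total and retained pairs per group and derives a percentage. The input field retained and the output names total_pairs, retained_pairs, and retained_pct are hypothetical; the real table may use different column names.

```python
from collections import Counter


def retention_by_group(pairs, group_cols=("metric", "passband")):
    """Count total and retained pairs per group and compute the retained percentage."""
    total, retained = Counter(), Counter()
    for pair in pairs:
        group = tuple(pair[col] for col in group_cols)
        total[group] += 1
        if pair["retained"]:  # assumed flag: both sides passed QC
            retained[group] += 1
    return [
        {
            **dict(zip(group_cols, group)),
            "total_pairs": count,
            "retained_pairs": retained[group],
            "retained_pct": 100.0 * retained[group] / count,
        }
        for group, count in sorted(total.items())
    ]


pairs = (
    [{"metric": "pgv", "passband": "2-5", "retained": flag} for flag in (True, True, True, False)]
    + [{"metric": "pga", "passband": "2-5", "retained": flag} for flag in (True, False)]
)
retention_rows = retention_by_group(pairs)
```

Passing a different group_cols tuple, for example ("metric",), collapses the summary to one row per metric.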
- spatial_vtk.qc.build_metric_qc_summary(event_station_records: DataFrame | str | Path, *, metrics: Sequence[str], components: Sequence[str], passbands: Sequence[str | Sequence[float]], spectral_periods_s: Sequence[float] = (), sources: Sequence[str] = ('observed', 'synthetic'), synthetic_max_frequency_hz: float | None = None, observed_available: bool = True, synthetic_available: bool = True, trace_qc_summary: DataFrame | str | Path | None = None, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) -> DataFrame
Build a side-specific metric QC summary from event-station records.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with at least event_id and station columns.
metrics (collections.abc.Sequence) – Requested metrics, components, and period bands.
components (collections.abc.Sequence) – Requested metrics, components, and period bands.
passbands (collections.abc.Sequence) – Requested metrics, components, and period bands.
spectral_periods_s (collections.abc.Sequence, optional) – Periods to check for spectral metrics.
sources (collections.abc.Sequence, optional) – Sources to include, usually observed and synthetic.
synthetic_max_frequency_hz (float | None, optional) – Maximum valid synthetic frequency. Synthetic spectral periods shorter than 1 / synthetic_max_frequency_hz fail QC.
observed_available (bool, optional) – Default availability values when the input table does not include source-specific availability columns.
synthetic_available (bool, optional) – Default availability values when the input table does not include source-specific availability columns.
trace_qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific waveform QC table. When provided, failed source/event/station/component/passband rows fail matching metric rows.
verbose (bool, optional) – Print progress messages while building QC rows.
progress_interval (int, optional) – Number of event-station records between progress messages.
checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written.
resume (bool, optional) – When true and checkpoint_path exists, skip event/station records already present in that checkpoint.
checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.
return_result (bool, optional) – When false, append checkpoint rows on disk and return an empty dataframe instead of loading/returning the full QC inventory. This is intended for large Slurm jobs.
- Returns:
Standard metric QC rows.
- Return type:
pandas.DataFrame
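The synthetic_max_frequency_hz rule can be sketched in a few lines. The helper synthetic_period_passes is hypothetical, and treating a missing limit as "always pass" is an assumption rather than documented behavior.

```python
def synthetic_period_passes(period_s, synthetic_max_frequency_hz=None):
    """Return False when a synthetic spectral period is shorter than 1 / f_max."""
    if synthetic_max_frequency_hz is None:
        return True  # assumption: without a limit, no period-based failure is applied
    min_valid_period_s = 1.0 / synthetic_max_frequency_hz
    return period_s >= min_valid_period_s


# With a 0.5 Hz limit the shortest valid synthetic period is 2.0 s.
spectral_periods_s = (0.5, 1.0, 2.0, 5.0)
period_status = {
    period: synthetic_period_passes(period, synthetic_max_frequency_hz=0.5)
    for period in spectral_periods_s
}
```

In this example the 0.5 s and 1.0 s periods fail, while 2.0 s and 5.0 s pass.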
- spatial_vtk.qc.build_post_qc_record_table(event_station_records: DataFrame | str | Path, *, events: DataFrame | str | Path | None = None, qc_summary: DataFrame | str | Path | None = None) -> DataFrame
Build station-event records for post-QC map figures.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with station coordinates.
events (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional event metadata with event coordinates.
qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional QC summary used to assign pass/fail status per event/station.
- Returns:
Records with sta_lat, sta_lon, event coordinates, and qc_status.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build_qc_availability_table(event_station_records: DataFrame | str | Path, *, qc_summary: DataFrame | str | Path | None = None, qc_aggregate: str = 'all_pass', observed_root: str | Path | None = None, synthetic_root: str | Path | None = None, observed_inventory: DataFrame | str | Path | None = None, synthetic_inventory: DataFrame | str | Path | None = None, cfg: SpatialVTKConfig | None = None) -> DataFrame
Build observed/synthetic availability rows for QC figures.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table.
qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific QC summary. When provided, availability is based on post-QC retained rows instead of file presence.
qc_aggregate (str, optional) – "all_pass" marks a side available only if all matching QC rows pass. "any_pass" marks a side available when at least one matching row passes.
observed_root (str | pathlib.Path | None, optional) – Optional roots used to inventory files when inventory tables are not already available. When omitted, the active config’s paths.observed_root and paths.synthetic_root are used when present.
synthetic_root (str | pathlib.Path | None, optional) – Optional roots used to inventory files when inventory tables are not already available. When omitted, the active config’s paths.observed_root and paths.synthetic_root are used when present.
observed_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional file inventory tables.
synthetic_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional file inventory tables.
cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used to resolve default roots.
- Returns:
Availability table with one row per event/station.
- Return type:
pandas.DataFrame
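The difference between the two qc_aggregate modes can be sketched as follows. The helper side_available is hypothetical, and treating a side with no matching QC rows as unavailable is an assumption.

```python
def side_available(qc_flags, qc_aggregate="all_pass"):
    """Collapse per-row QC pass flags for one side into a single availability flag."""
    flags = list(qc_flags)
    if not flags:
        return False  # assumption: no matching QC rows means the side is unavailable
    if qc_aggregate == "all_pass":
        return all(flags)
    if qc_aggregate == "any_pass":
        return any(flags)
    raise ValueError(f"unknown qc_aggregate: {qc_aggregate!r}")


# One event/station with three QC rows on the observed side, one of which fails.
observed_flags = [True, True, False]
strict = side_available(observed_flags, "all_pass")
lenient = side_available(observed_flags, "any_pass")
```

Under "all_pass" the single failing row makes the side unavailable; under "any_pass" the same rows leave it available.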
- spatial_vtk.qc.build_qc_waveform_comparison_records(event_station_records: DataFrame | str | Path, qc_summary: DataFrame | str | Path | None = None, *, comparison_eligible: DataFrame | str | Path | None = None, component: str = 'Z', passband: str | None = None, event_id: str | list[str] | tuple[str, ...] | None = None, max_distance_km: float | None = 50.0, max_records: int | None = 12, observed_waveform_col: str = 'observed_processed_waveform', synthetic_waveform_col: str = 'synthetic_processed_waveform') -> DataFrame
Build post-QC waveform rows for observed/synthetic visual inspection.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Prepared event-station table with waveform paths.
qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Side-specific QC table. Used to build comparison-eligible rows when comparison_eligible is not supplied.
comparison_eligible (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional precomputed output from build_comparison_eligibility().
component (str, optional) – Component to load for the visual comparison.
passband (str | None, optional) – Optional retained passband to select.
event_id (str | list[str] | tuple[str, ...] | None, optional) – Optional event ID or IDs to select before loading waveforms.
max_distance_km (float | None, optional) – Optional distance limit in kilometers.
max_records (int | None, optional) – Optional maximum retained event-station rows to load.
observed_waveform_col (str, optional) – Waveform path columns in event_station_records.
synthetic_waveform_col (str, optional) – Waveform path columns in event_station_records.
- Returns:
Rows with observed and synthetic trace objects, sample intervals, event-origin offsets, and distance metadata.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build_retention_figure_table(qc_summary: DataFrame | str | Path) -> DataFrame
Prepare QC rows for retention summary figures.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.
- Returns:
Copy with stage set to passband labels.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build_waveform_qc_summary(event_station_records: DataFrame | str | Path, *, sources: Sequence[str] = ('observed', 'synthetic'), waveform_path_columns: dict[str, str] | None = None, components: Sequence[str] | None = None, passbands: Sequence[str | Sequence[float]] | None = None, preprocessing: WaveformPreprocessing | None = None, apply_config_preprocessing_to_processed_files: bool = False, cfg: SpatialVTKConfig | None = None, min_record_length_s: float | None = None, min_end_after_origin_s: float | None = None, snr_threshold: float | None = None, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) -> DataFrame
Build observed/synthetic waveform QC rows from event-station records.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with waveform path columns.
sources (collections.abc.Sequence, optional) – Source labels to inspect.
waveform_path_columns (dict[str, str] | None, optional) – Optional mapping from source label to waveform path column.
components (collections.abc.Sequence[str] | None, optional) – Optional component and period-band selections. When omitted, the active metric settings are used.
passbands (collections.abc.Sequence[str | collections.abc.Sequence[float]] | None, optional) – Optional component and period-band selections. When omitted, the active metric settings are used.
preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional preprocessing applied before QC. When omitted and a processed waveform column is used, no extra filtering is applied.
apply_config_preprocessing_to_processed_files (bool, optional) – Whether processed waveform columns should still use config preprocessing during QC.
cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active config is used.
min_record_length_s (float | None, optional) – Optional QC threshold overrides. Missing values are read from qc.automatic in the config.
min_end_after_origin_s (float | None, optional) – Optional QC threshold overrides. Missing values are read from qc.automatic in the config.
snr_threshold (float | None, optional) – Optional QC threshold overrides. Missing values are read from qc.automatic in the config.
arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog used to anchor QC windows.
onset_phase (str, optional) – Pick phase used as the QC onset when available.
min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick.
verbose (bool, optional) – Print progress messages while loading waveforms and building QC rows.
progress_interval (int, optional) – Number of event-station records between progress messages.
checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where the combined waveform QC summary is written. Per-source intermediate checkpoints are written next to this path.
resume (bool, optional) – When true, existing per-source checkpoints are used to skip completed event/station/component groups.
checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.
return_result (bool, optional) – When false, write per-source checkpoints and combine them on disk into checkpoint_path instead of keeping all source QC rows in memory. This is intended for Slurm workers on large inventories.
- Returns:
Side-specific waveform QC rows that can be passed to build_metric_qc_summary(trace_qc_summary=...).
- Return type:
pandas.DataFrame
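The checkpoint_path, resume, and checkpoint_interval options describe a common skip-and-append pattern. The sketch below shows that pattern with the standard csv module, keyed on event/station for brevity. The file layout, the helper names, the placeholder QC result, and the behavior of resume=False (start from scratch) are all assumptions, not the library's implementation.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ("event_id", "station", "qc_pass")  # hypothetical checkpoint columns


def load_done(checkpoint_path):
    """Return the (event_id, station) keys already present in the checkpoint."""
    if not checkpoint_path.exists():
        return set()
    with checkpoint_path.open(newline="") as handle:
        return {(row["event_id"], row["station"]) for row in csv.DictReader(handle)}


def append_rows(checkpoint_path, rows):
    """Append rows, writing the header only when the file is new."""
    is_new = not checkpoint_path.exists()
    with checkpoint_path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)


def run_with_checkpoint(records, checkpoint_path, resume=True, checkpoint_interval=25):
    """Process records, skipping completed ones and flushing every N new rows."""
    if not resume and checkpoint_path.exists():
        checkpoint_path.unlink()  # assumption: resume=False discards old progress
    done = load_done(checkpoint_path)
    pending, processed = [], 0
    for event_id, station in records:
        if (event_id, station) in done:
            continue
        # Placeholder for the real QC computation.
        pending.append({"event_id": event_id, "station": station, "qc_pass": True})
        processed += 1
        if len(pending) >= checkpoint_interval:
            append_rows(checkpoint_path, pending)
            pending = []
    if pending:
        append_rows(checkpoint_path, pending)
    return processed


records = [("ev1", "STA1"), ("ev1", "STA2"), ("ev2", "STA1")]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "qc_checkpoint.csv"
    first_run = run_with_checkpoint(records[:2], path, checkpoint_interval=1)
    second_run = run_with_checkpoint(records, path, checkpoint_interval=1)
    checkpoint_keys = load_done(path)
```

The second call processes only the one record that the first call did not already write, which is the behavior a resumed Slurm job would rely on.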
- spatial_vtk.qc.build_waveform_trace_qc_summary(event_station_records: DataFrame | str | Path, *, source: str = 'observed', waveform_path_col: str = 'observed_pickle', components: tuple[str, ...] | list[str] = ('Z',), passbands: tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None = None, preprocessing: WaveformPreprocessing | None = None, min_record_length_s: float = 60.0, min_end_after_origin_s: float = 60.0, snr_threshold: float = 3.0, noise_window_min_s: float = 1.0, signal_window_min_s: float = 10.0, noise_gap_s: float = 0.5, signal_gap_s: float = 0.5, origin_tolerance_s: float = 0.5, pre_origin_signal_ratio_threshold: float = 0.5, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25) -> DataFrame
Build side-specific trace QC rows from waveform files.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station records with event IDs, station codes, event origin times, and waveform paths.
source (str, optional) – Source label copied to the output, usually "observed" or "synthetic".
waveform_path_col (str, optional) – Column containing waveform files for this source.
components (tuple[str, ...] | list[str], optional) – Components to inspect.
passbands (tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None, optional) – Period bands to report. When omitted, the public inventory standard bands are used.
preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional waveform preprocessing applied before QC calculations.
min_record_length_s (float, optional) – Minimum trace duration in seconds.
min_end_after_origin_s (float, optional) – Minimum required trace end time relative to event origin.
snr_threshold (float, optional) – Minimum RMS signal-to-noise ratio.
noise_window_min_s (float, optional) – Minimum noise-window length in seconds.
signal_window_min_s (float, optional) – Minimum signal-window length in seconds.
noise_gap_s (float, optional) – Gap between detected onset and the noise window.
signal_gap_s (float, optional) – Gap between detected onset and the signal window.
origin_tolerance_s (float, optional) – Half-width of the origin energy check window.
pre_origin_signal_ratio_threshold (float, optional) – Maximum origin/pre-origin to signal RMS ratio.
arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog. When a finite pick exists for the requested onset phase, QC uses it as the signal onset; otherwise QC falls back to the waveform-envelope onset.
onset_phase (str, optional) – Pick phase used to anchor QC noise and signal windows.
min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick.
verbose (bool, optional) – Print progress messages while loading waveform files and building rows.
progress_interval (int, optional) – Number of event-station records between progress messages.
checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written.
resume (bool, optional) – When true and checkpoint_path exists, skip event/station/component groups already present in that checkpoint.
checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.
- Returns:
Metric-QC-compatible rows with one row per source/event/station/component/passband.
- Return type:
pandas.DataFrame
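To make the window parameters more concrete, here is a rough sketch of an RMS signal-to-noise check around an onset time. The window placement (noise from the trace start up to onset minus noise_gap_s, signal from onset plus signal_gap_s to the trace end) is an assumption chosen for illustration; the library's actual windowing may differ.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sample sequence."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_snr(samples, dt_s, onset_s, noise_gap_s=0.5, signal_gap_s=0.5,
            noise_window_min_s=1.0, signal_window_min_s=10.0):
    """Return RMS(signal) / RMS(noise), or None when a window is invalid."""
    noise_end = max(int((onset_s - noise_gap_s) / dt_s), 0)
    signal_start = int((onset_s + signal_gap_s) / dt_s)
    noise = samples[:noise_end]        # assumed noise window: before the onset gap
    signal = samples[signal_start:]    # assumed signal window: after the onset gap
    if len(noise) * dt_s < noise_window_min_s:
        return None
    if len(signal) * dt_s < signal_window_min_s:
        return None
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return None
    return rms(signal) / noise_rms


# Synthetic test trace: amplitude 1 before the onset at 5 s, amplitude 4 afterwards.
dt_s = 0.5
samples = [(1.0 if i < 10 else 4.0) * (-1) ** i for i in range(60)]
snr = rms_snr(samples, dt_s, onset_s=5.0)
passes_snr = snr is not None and snr >= 3.0  # 3.0 is the documented default threshold
```

A trace that is too short for either window returns None here, which a caller could map to a separate reject reason instead of an SNR failure.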
- spatial_vtk.qc.classify_station_family(network: str, station: str) -> str
Classify one station as broadband, strong-motion, or unknown.
- Parameters:
network (str) – Required function argument.
station (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.companion_rows_from_master(master_rows: list[dict[str, object]] | DataFrame, inventory_bands: list[tuple[str, float, float]] | None = None) -> list[dict[str, object]]
Build per-event QC companion rows from master inventory rows.
- Parameters:
master_rows (list[dict[str, object]] | pandas.DataFrame) – Full master inventory rows.
inventory_bands (list[tuple[str, float, float]] | None, optional) – Inventory passbands as (label, period_min, period_max) tuples.
- Returns:
Per-event, per-variant, per-band summaries with distance-bin counts.
- Return type:
list of dict
- spatial_vtk.qc.determine_available_components(stream: Any, station: str, components: tuple[str, ...] = ('N', 'E', 'Z', 'R', 'T')) -> list[str]
Determine which requested components are available for one station.
- Parameters:
stream (Any) – ObsPy stream-like iterable with trace stats.station and stats.channel attributes.
station (str) – Station code to inspect.
components (tuple, optional) – Component suffixes to search for.
- Returns:
Components present in the stream for the station.
- Return type:
list of str
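A component check of this kind can be sketched without ObsPy by using stand-in objects that expose stats.station and stats.channel. The helper available_components and the rule "last channel character is the component" are assumptions for illustration.

```python
from types import SimpleNamespace


def available_components(stream, station, components=("N", "E", "Z", "R", "T")):
    """Return the requested components whose suffix matches a channel of the station."""
    found = {
        trace.stats.channel[-1].upper()  # assumed: component is the last channel character
        for trace in stream
        if trace.stats.station == station and trace.stats.channel
    }
    return [component for component in components if component in found]


def make_trace(station, channel):
    """Build a minimal stand-in for an ObsPy trace."""
    return SimpleNamespace(stats=SimpleNamespace(station=station, channel=channel))


stream = [make_trace("STA1", "HHZ"), make_trace("STA1", "HHN"), make_trace("STA2", "HHE")]
sta1_components = available_components(stream, "STA1")
```

The result keeps the order of the components argument, so STA1 reports N before Z even though the Z trace appears first in the stream.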
- spatial_vtk.qc.discover_event_ids(*roots: str | Path) -> list[str]
Discover candidate event IDs from one or more observed-data roots.
- Parameters:
*roots (str | pathlib.Path) – Directories containing event files or event subdirectories.
- Returns:
Sorted unique event identifiers.
- Return type:
list of str
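One way such a discovery step could work is sketched below. It assumes that an event ID is either the name of a subdirectory or the stem of a file directly under a root; that naming convention and the helper discover_ids are hypothetical.

```python
import tempfile
from pathlib import Path


def discover_ids(*roots):
    """Collect sorted unique subdirectory names and file stems under each root."""
    ids = set()
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # silently skip missing roots in this sketch
        for entry in root.iterdir():
            ids.add(entry.name if entry.is_dir() else entry.stem)
    return sorted(ids)


with tempfile.TemporaryDirectory() as tmp:
    root_a, root_b = Path(tmp) / "a", Path(tmp) / "b"
    (root_a / "ev002").mkdir(parents=True)   # event stored as a subdirectory
    root_b.mkdir()
    (root_b / "ev001.pkl").touch()           # events stored as files
    (root_b / "ev002.pkl").touch()
    found_ids = discover_ids(root_a, root_b, Path(tmp) / "missing")
```

Because the IDs are collected in a set, an event that appears under several roots is reported once.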
- spatial_vtk.qc.export_manual_review_queue(qc_summary: DataFrame | str | Path, output_path: str | Path | None = None, *, cfg: SpatialVTKConfig | None = None) -> Path
Write a manual-review queue from QC summary rows.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.
output_path (str | pathlib.Path | None, optional) – CSV or JSON output path. When omitted, the standard manual_review_queue output path is resolved from the active config.
cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used when resolving the default output path.
- Returns:
Written queue path.
- Return type:
pathlib.Path
- spatial_vtk.qc.filter_trace_summary(df: DataFrame, *, event_id: str | None = None, station: str | None = None, component: str | None = None, accepted: bool | None = None, reject_reason_contains: str | None = None) -> DataFrame
Filter a trace-summary table using common review fields.
- Parameters:
df (pandas.DataFrame) – Required function argument.
event_id (str | None, optional) – Optional function argument. Defaults to None.
station (str | None, optional) – Optional function argument. Defaults to None.
component (str | None, optional) – Optional function argument. Defaults to None.
accepted (bool | None, optional) – Optional function argument. Defaults to None.
reject_reason_contains (str | None, optional) – Optional function argument. Defaults to None.
- Returns:
Return value produced by the function.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.global_trace_reject_reasons(*, record_length_s: float, end_rel_s: float, onset_reasons: list[str], min_end_after_origin_s: float, min_record_length_s: float) -> tuple[bool, list[str]]
Apply station-level reject rules shared by every passband.
- Parameters:
record_length_s (float) – Required function argument.
end_rel_s (float) – Required function argument.
onset_reasons (list) – Required function argument.
min_end_after_origin_s (float) – Required function argument.
min_record_length_s (float) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
- spatial_vtk.qc.load_trace_inventory_lookup(csv_path: Path | str | None) -> TraceInventoryLookup
Load one master inventory CSV into a normalized lookup dictionary.
- Parameters:
csv_path (pathlib.Path | str | None) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
TraceInventoryLookup
- spatial_vtk.qc.queue_rows_from_filtered_trace_df(df: DataFrame, *, key_columns: tuple[str, ...] = ('event_id', 'station', 'component'), status: str = 'pending') -> list[dict[str, object]]
Convert a filtered trace table into manual-review queue rows.
- Parameters:
df (pandas.DataFrame) – Required function argument.
key_columns (tuple, optional) – Optional function argument. Defaults to ('event_id', 'station', 'component').
status (str, optional) – Optional function argument. Defaults to 'pending'.
- Returns:
Return value produced by the function.
- Return type:
list
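The conversion from filtered trace rows to queue rows might look like the sketch below, which works on plain dictionaries instead of a dataframe. Deduplicating on the key columns and emitting a status field are assumptions; the helper queue_rows is hypothetical.

```python
def queue_rows(rows, key_columns=("event_id", "station", "component"), status="pending"):
    """Build one review-queue row per unique key, tagged with the review status."""
    seen, queue = set(), []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key in seen:
            continue  # assumption: repeated keys (e.g. several passbands) queue once
        seen.add(key)
        queue.append({**dict(zip(key_columns, key)), "status": status})
    return queue


filtered_rows = [
    {"event_id": "ev1", "station": "STA1", "component": "Z", "passband": "2-5"},
    {"event_id": "ev1", "station": "STA1", "component": "Z", "passband": "5-10"},
    {"event_id": "ev1", "station": "STA2", "component": "N", "passband": "2-5"},
]
review_queue = queue_rows(filtered_rows)
```

Here two passband rows for the same trace collapse into a single pending review item.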
- spatial_vtk.qc.reject_passband(*, global_reasons: list[str], snr_rms: float, snr_threshold: float, noise_window_valid: bool, signal_window_valid: bool, pre_origin_window_valid: bool, pre_origin_signal_ratio: float, pre_origin_signal_ratio_threshold: float, origin_window_valid: bool, origin_signal_ratio: float) -> tuple[bool, list[str]]
Apply passband-specific reject rules and merge shared global reasons.
- Parameters:
global_reasons (list) – Required function argument.
snr_rms (float) – Required function argument.
snr_threshold (float) – Required function argument.
noise_window_valid (bool) – Required function argument.
signal_window_valid (bool) – Required function argument.
pre_origin_window_valid (bool) – Required function argument.
pre_origin_signal_ratio (float) – Required function argument.
pre_origin_signal_ratio_threshold (float) – Required function argument.
origin_window_valid (bool) – Required function argument.
origin_signal_ratio (float) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
- spatial_vtk.qc.trace_passband_is_accepted(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, component: str, passband_label: str | None = None, period_min: float | None = None, period_max: float | None = None) -> bool
Return whether one trace is accepted for one passband request.
- Parameters:
lookup (dict) – Required function argument.
observed_variant (str) – Required function argument.
event_id (str) – Required function argument.
station (str) – Required function argument.
component (str) – Required function argument.
passband_label (str | None, optional) – Optional function argument. Defaults to None.
period_min (float | None, optional) – Optional function argument. Defaults to None.
period_max (float | None, optional) – Optional function argument. Defaults to None.
- Returns:
Return value produced by the function.
- Return type:
bool
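A lookup of this shape can be queried with a five-part tuple key. The sketch below follows the key order given in the inventory_lookup_key signature (observed_variant, event_id, station, component, passband_label); the accepted field name, its string encoding, and the treatment of missing rows are assumptions.

```python
def is_accepted(lookup, *, observed_variant, event_id, station, component, passband_label):
    """Return True when the inventory row for this trace and passband is marked accepted."""
    key = (observed_variant, event_id, station, component, passband_label)
    row = lookup.get(key)
    if row is None:
        return False  # assumption: traces missing from the inventory are not accepted
    # Inventory values are strings, so the flag is parsed from text (assumed encoding).
    return row.get("accepted", "").strip().lower() in {"1", "true", "yes"}


lookup = {
    ("raw", "ev1", "STA1", "Z", "2-5"): {"accepted": "True"},
    ("raw", "ev1", "STA1", "Z", "5-10"): {"accepted": "False"},
}
short_band_ok = is_accepted(
    lookup, observed_variant="raw", event_id="ev1", station="STA1",
    component="Z", passband_label="2-5",
)
long_band_ok = is_accepted(
    lookup, observed_variant="raw", event_id="ev1", station="STA1",
    component="Z", passband_label="5-10",
)
```

The real function also accepts period_min and period_max, presumably to resolve a passband when no label is given; that path is not modeled here.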
Build
Inventory lookup and filtering helpers for QC workflows.
- class spatial_vtk.qc.build.filtering.InventoryBandSpec(label: str, period_min: float, period_max: float)
Describe one passband present in a trace-inventory CSV.
- class spatial_vtk.qc.build.filtering.TraceInventoryLookup(rows: dict[tuple[str, str, str, str, str], dict[str, str]] | None = None, *, csv_path: Path | None = None, available_bands: list[InventoryBandSpec] | None = None, disabled_requested_bands: set[str] | None = None)
Dictionary lookup with trace-inventory metadata.
- spatial_vtk.qc.build.filtering.band_key_from_label(label: str) -> str
Convert a period-band label to a stable CSV column suffix.
- Parameters:
label (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.build.filtering.band_label_from_key(key: str) -> str | None
Convert one inventory CSV suffix to a period-band label.
- Parameters:
key (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str | None
- spatial_vtk.qc.build.filtering.event_station_has_any_accepted_component(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, components: tuple[str, ...] | list[str], period_min: float | None = None, period_max: float | None = None) -> bool
Return whether any requested component survives inventory QC.
- Parameters:
lookup (dict) – Required function argument.
observed_variant (str) – Required function argument.
event_id (str) – Required function argument.
station (str) – Required function argument.
components (tuple[str, ...] | list[str]) – Required function argument.
period_min (float | None, optional) – Optional function argument. Defaults to None.
period_max (float | None, optional) – Optional function argument. Defaults to None.
- Returns:
Return value produced by the function.
- Return type:
bool
- spatial_vtk.qc.build.filtering.filter_stream_by_inventory(stream: Any, lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str) Any
Filter a stream to traces with at least one accepted inventory band.
- Parameters:
stream (Any) – Required function argument.
lookup (dict) – Required function argument.
observed_variant (str) – Required function argument.
event_id (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
Any
- spatial_vtk.qc.build.filtering.inventory_lookup_key(observed_variant: str, event_id: str, station: str, component: str, passband_label: str) tuple[str, str, str, str, str]
Build one normalized inventory key tuple.
- Parameters:
observed_variant (str) – Required function argument.
event_id (str) – Required function argument.
station (str) – Required function argument.
component (str) – Required function argument.
passband_label (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
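The key order follows the signature above: variant, event, station, component, passband. The sketch below is a hypothetical stand-in, not the library code. The normalization rules (strip, lowercase the variant, uppercase station and component) and the `status` field are assumptions made only to show why a normalized five-field tuple works as a dictionary key.

```python
def sketch_lookup_key(observed_variant, event_id, station, component, passband_label):
    """Hypothetical stand-in for inventory_lookup_key; normalization rules are assumed."""
    return (
        str(observed_variant).strip().lower(),
        str(event_id).strip(),
        str(station).strip().upper(),
        str(component).strip().upper(),
        str(passband_label).strip(),
    )


# Differently formatted inputs collapse to the same tuple, so a plain dict
# can serve as the inventory lookup. The "status" field is illustrative only.
lookup = {
    sketch_lookup_key("nonrotated", "ev001", "abc", "z", "band_a"): {"status": "accepted"}
}
key = sketch_lookup_key(" NonRotated ", "ev001", " ABC", "Z ", "band_a")
row = lookup.get(key)
```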
- spatial_vtk.qc.build.filtering.load_trace_inventory_lookup(csv_path: Path | str | None) TraceInventoryLookup
Load one master inventory CSV into a normalized lookup dictionary.
- Parameters:
csv_path (pathlib.Path | str | None) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
spatial_vtk.qc.build.filtering.TraceInventoryLookup
- spatial_vtk.qc.build.filtering.normalize_band_label(label: str | None) str
Normalize one inventory band label.
- Parameters:
label (str | None) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.build.filtering.normalize_component(component: str | None) str
Normalize one component code.
- Parameters:
component (str | None) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.build.filtering.normalize_event_id(event_id: object) str
Normalize one event identifier for inventory lookup keys.
- Parameters:
event_id (object) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.build.filtering.normalize_observed_variant(text: str | None) str
Normalize one observed-variant label.
- Parameters:
text (str | None) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.build.filtering.normalize_station_code(station: object) str
Normalize one station code for inventory lookup keys.
- Parameters:
station (object) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.build.filtering.parse_period_band_label(label: str) tuple[float, float] | None
Parse a period-band label into (period_min, period_max).
- Parameters:
label (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple[float, float] | None
- spatial_vtk.qc.build.filtering.period_band_label(period_min: float | None, period_max: float | None) str | None
Return the canonical period-band label for one requested band.
- Parameters:
period_min (float | None) – Required function argument.
period_max (float | None) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str | None
- spatial_vtk.qc.build.filtering.relevant_inventory_bands(period_min: float | None, period_max: float | None, *, available_bands: list[InventoryBandSpec] | None = None) list[str]
Resolve which inventory bands are relevant for one requested band.
- Parameters:
period_min (float | None) – Required function argument.
period_max (float | None) – Required function argument.
available_bands (list[spatial_vtk.qc.build.filtering.InventoryBandSpec] | None, optional) – Optional function argument. Defaults to None.
- Returns:
Return value produced by the function.
- Return type:
list
- spatial_vtk.qc.build.filtering.trace_has_any_accepted_passband(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, component: str) bool
Return whether one trace has at least one accepted inventory passband.
- Parameters:
lookup (dict) – Required function argument.
observed_variant (str) – Required function argument.
event_id (str) – Required function argument.
station (str) – Required function argument.
component (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
bool
- spatial_vtk.qc.build.filtering.trace_passband_is_accepted(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, component: str, passband_label: str | None = None, period_min: float | None = None, period_max: float | None = None) bool
Return whether one trace is accepted for one passband request.
- Parameters:
lookup (dict) – Required function argument.
observed_variant (str) – Required function argument.
event_id (str) – Required function argument.
station (str) – Required function argument.
component (str) – Required function argument.
passband_label (str | None, optional) – Optional function argument. Defaults to None.
period_min (float | None, optional) – Optional function argument. Defaults to None.
period_max (float | None, optional) – Optional function argument. Defaults to None.
- Returns:
Return value produced by the function.
- Return type:
bool
Inventory helpers for quality-control dataset construction.
- spatial_vtk.qc.build.inventory.build_trace_inventory(event_streams: dict[str, Any] | list[dict[str, Any]] | DataFrame, *, observed_variant: str = 'nonrotated', inventory_bands: list[tuple[str, float, float]] | None = None, min_record_length_s: float = 80.0, min_end_after_origin_s: float = 60.0, station_metadata: DataFrame | None = None, event_metadata: DataFrame | None = None) DataFrame
Build a QC trace inventory from waveform streams or trace metadata.
- Parameters:
event_streams (dict[str, Any] | list[dict[str, Any]] | pandas.DataFrame) – Mapping from event ID to stream-like objects, list of records with event_id and stream fields, or precomputed trace metadata.
observed_variant (str, optional) – Label for the observed-data variant. Defaults to 'nonrotated'.
inventory_bands (list[tuple[str, float, float]] | None, optional) – Inventory passbands as (label, period_min, period_max) tuples. Defaults to None.
min_record_length_s (float, optional) – Minimum accepted trace length in seconds. Defaults to 80.0.
min_end_after_origin_s (float, optional) – Minimum accepted trace end time relative to origin. Defaults to 60.0.
station_metadata (pandas.DataFrame | None, optional) – Optional station metadata table joined into the inventory. Defaults to None.
event_metadata (pandas.DataFrame | None, optional) – Optional event metadata table joined into the inventory. Defaults to None.
- Returns:
One row per trace with passband reject flags and reject reasons.
- Return type:
pandas.DataFrame
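The two duration thresholds can be pictured as follows. This is a minimal sketch, assuming both limits are simple comparisons on trace times in seconds; the function name and the reject-reason strings are hypothetical and are not taken from the library.

```python
def sketch_length_checks(start_s, end_s, origin_s,
                         min_record_length_s=80.0, min_end_after_origin_s=60.0):
    """Return assumed reject reasons for one trace; an empty list means accepted."""
    reasons = []
    if (end_s - start_s) < min_record_length_s:
        reasons.append("record_too_short")  # hypothetical reason label
    if (end_s - origin_s) < min_end_after_origin_s:
        reasons.append("ends_too_early")  # hypothetical reason label
    return reasons


# Times are in seconds on a shared clock; the event origin is at 0 s.
ok = sketch_length_checks(start_s=-10.0, end_s=120.0, origin_s=0.0)
bad = sketch_length_checks(start_s=-10.0, end_s=40.0, origin_s=0.0)
```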
- spatial_vtk.qc.build.inventory.build_waveform_trace_qc_summary(event_station_records: DataFrame | str | Path, *, source: str = 'observed', waveform_path_col: str = 'observed_pickle', components: tuple[str, ...] | list[str] = ('Z',), passbands: tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None = None, preprocessing: WaveformPreprocessing | None = None, min_record_length_s: float = 60.0, min_end_after_origin_s: float = 60.0, snr_threshold: float = 3.0, noise_window_min_s: float = 1.0, signal_window_min_s: float = 10.0, noise_gap_s: float = 0.5, signal_gap_s: float = 0.5, origin_tolerance_s: float = 0.5, pre_origin_signal_ratio_threshold: float = 0.5, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25) DataFrame
Build side-specific trace QC rows from waveform files.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station records with event IDs, station codes, event origin times, and waveform paths.
source (str, optional) – Source label copied to the output, usually "observed" or "synthetic". Defaults to 'observed'.
waveform_path_col (str, optional) – Column containing waveform files for this source. Defaults to 'observed_pickle'.
components (tuple[str, ...] | list[str], optional) – Components to inspect. Defaults to ('Z',).
passbands (tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None, optional) – Period bands to report. When omitted, the public inventory standard bands are used. Defaults to None.
preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional waveform preprocessing applied before QC calculations. Defaults to None.
min_record_length_s (float, optional) – Minimum trace duration in seconds. Defaults to 60.0.
min_end_after_origin_s (float, optional) – Minimum required trace end time relative to event origin. Defaults to 60.0.
snr_threshold (float, optional) – Minimum RMS signal-to-noise ratio. Defaults to 3.0.
noise_window_min_s (float, optional) – Minimum noise-window length in seconds. Defaults to 1.0.
signal_window_min_s (float, optional) – Minimum signal-window length in seconds. Defaults to 10.0.
noise_gap_s (float, optional) – Gap between detected onset and the noise window. Defaults to 0.5.
signal_gap_s (float, optional) – Gap between detected onset and the signal window. Defaults to 0.5.
origin_tolerance_s (float, optional) – Half-width of the origin energy check window. Defaults to 0.5.
pre_origin_signal_ratio_threshold (float, optional) – Maximum origin/pre-origin to signal RMS ratio. Defaults to 0.5.
arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog. When a finite pick exists for the requested onset phase, QC uses it as the signal onset; otherwise QC falls back to the waveform-envelope onset. Defaults to None.
onset_phase (str, optional) – Pick phase used to anchor QC noise and signal windows. Defaults to 'P'.
min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick. Defaults to 0.0.
verbose (bool, optional) – Print progress messages while loading waveform files and building rows. Defaults to False.
progress_interval (int, optional) – Number of event-station records between progress messages. Defaults to 25.
checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written. Defaults to None.
resume (bool, optional) – When true and checkpoint_path exists, skip event/station/component groups already present in that checkpoint. Defaults to True.
checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes. Defaults to 25.
- Returns:
Metric-QC-compatible rows with one row per source/event/station/component/passband.
- Return type:
pandas.DataFrame
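The SNR parameters above describe a noise window before the onset and a signal window after it, each separated from the onset by a gap. The sketch below is one possible reading of that layout, not the library implementation: it assumes the noise window runs from the first sample to `onset - noise_gap_s` and the signal window from `onset + signal_gap_s` to the last sample. The helper names are hypothetical.

```python
import math


def rms(values):
    """Root-mean-square of a non-empty sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def sketch_rms_snr(samples, dt, onset_s, noise_gap_s=0.5, signal_gap_s=0.5,
                   noise_window_min_s=1.0, signal_window_min_s=10.0):
    """Return signal RMS / noise RMS, or None when either window is too short.

    onset_s is measured from the first sample. Window placement is an assumption.
    """
    noise_end = int(round((onset_s - noise_gap_s) / dt))
    signal_start = int(round((onset_s + signal_gap_s) / dt))
    noise = samples[:max(noise_end, 0)]
    signal = samples[signal_start:]
    if not noise or not signal:
        return None
    if len(noise) * dt < noise_window_min_s or len(signal) * dt < signal_window_min_s:
        return None
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return math.inf
    return rms(signal) / noise_rms


# 5 s of amplitude-1 noise followed by 15 s of amplitude-4 signal at dt = 0.1 s.
samples = [1.0] * 50 + [4.0] * 150
snr = sketch_rms_snr(samples, dt=0.1, onset_s=5.0)
passes = snr is not None and snr >= 3.0  # compare against snr_threshold
too_short = sketch_rms_snr([1.0] * 20, dt=0.1, onset_s=1.0)
```

When a pick catalog supplies the onset, only `onset_s` changes; the window logic stays the same in this sketch.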
- spatial_vtk.qc.build.inventory.companion_rows_from_master(master_rows: list[dict[str, object]] | DataFrame, inventory_bands: list[tuple[str, float, float]] | None = None) list[dict[str, object]]
Build per-event QC companion rows from master inventory rows.
- Parameters:
master_rows (list[dict[str, object]] | pandas.DataFrame) – Full master inventory rows.
inventory_bands (list[tuple[str, float, float]] | None, optional) – Inventory passbands as (label, period_min, period_max) tuples. Defaults to None.
- Returns:
Per-event, per-variant, per-band summaries with distance-bin counts.
- Return type:
list of dict
- spatial_vtk.qc.build.inventory.determine_available_components(stream: Any, station: str, components: tuple[str, ...] = ('N', 'E', 'Z', 'R', 'T')) list[str]
Determine which requested components are available for one station.
- Parameters:
stream (Any) – ObsPy stream-like iterable with trace stats.station and stats.channel attributes.
station (str) – Station code to inspect.
components (tuple, optional) – Component suffixes to search for. Defaults to ('N', 'E', 'Z', 'R', 'T').
- Returns:
Components present in the stream for the station.
- Return type:
list of str
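A minimal sketch of the suffix search, assuming the component is the last character of each channel code (for example `HHZ` gives `Z`). `SimpleNamespace` objects stand in for ObsPy traces, and the function name is hypothetical.

```python
from types import SimpleNamespace


def sketch_available_components(stream, station, components=("N", "E", "Z", "R", "T")):
    """Return the requested component suffixes found on one station's channels."""
    found = {
        trace.stats.channel[-1].upper()
        for trace in stream
        if trace.stats.station == station and trace.stats.channel
    }
    # Preserve the order of the requested components.
    return [c for c in components if c in found]


def make_trace(station, channel):
    """Build a trace-like object exposing only stats.station and stats.channel."""
    return SimpleNamespace(stats=SimpleNamespace(station=station, channel=channel))


stream = [make_trace("STA1", "HHZ"), make_trace("STA1", "HHN"), make_trace("STA2", "HHE")]
available = sketch_available_components(stream, "STA1")
missing = sketch_available_components(stream, "STA3")
```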
- spatial_vtk.qc.build.inventory.discover_event_ids(*roots: str | Path) list[str]
Discover candidate event IDs from one or more observed-data roots.
- Parameters:
*roots (str | pathlib.Path) – Directories containing event files or event subdirectories.
- Returns:
Sorted unique event identifiers.
- Return type:
list of str
Spectral QC helpers for PSA and FAS period support.
Spectral Overview
This module evaluates which spectral periods are usable for observed and synthetic traces without requiring a pre-event noise window. It uses relative spectral amplitude, physical period support, and synthetic max-frequency limits.
Spectral Examples
- Find valid FAS periods for a synthetic trace:
qc = qc_fas_periods(trace, dt=0.02, periods_s=[1, 2, 5], synthetic_max_frequency_hz=1.0, source="synthetic")
- spatial_vtk.qc.build.spectral.qc_fas_periods(trace: Any, *, dt: float, periods_s: Sequence[float], threshold: float = 0.25, min_cycles_in_record: float = 3.0, synthetic_max_frequency_hz: float | None = None, source: str = 'observed', disable_relative_amplitude_qc: bool = False) DataFrame
QC FAS values on a requested period grid.
- Parameters:
trace (Any) – Waveform samples or trace-like object.
dt (float) – Sample interval in seconds.
periods_s (Sequence) – Requested periods.
threshold (float, optional) – Relative spectral support threshold. Defaults to 0.25.
min_cycles_in_record (float, optional) – Minimum cycles required in the record. Defaults to 3.0.
synthetic_max_frequency_hz (float | None, optional) – Optional synthetic maximum valid frequency. Defaults to None.
source (str, optional) – "observed" or "synthetic" for status reasons. Defaults to 'observed'.
disable_relative_amplitude_qc (bool, optional) – Whether to skip relative amplitude support. Defaults to False.
- Returns:
Period-level QC rows with FAS amplitudes and pass/fail status.
- Return type:
pandas.DataFrame
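The physical period support and synthetic max-frequency limits named in the overview can be sketched as below. The assumptions: a period is supported when the record holds at least `min_cycles_in_record` cycles of it, and synthetic periods shorter than `1 / synthetic_max_frequency_hz` fail. The function name, the row fields, and the reason strings are hypothetical, and plain dictionaries stand in for DataFrame rows.

```python
def sketch_period_support(periods_s, n_samples, dt, min_cycles_in_record=3.0,
                          synthetic_max_frequency_hz=None, source="observed"):
    """Return one pass/fail row per requested period (illustrative only)."""
    record_length_s = n_samples * dt
    rows = []
    for period in periods_s:
        reason = ""
        if record_length_s / period < min_cycles_in_record:
            reason = "too_few_cycles"  # hypothetical reason label
        elif (source == "synthetic" and synthetic_max_frequency_hz is not None
              and period < 1.0 / synthetic_max_frequency_hz):
            reason = "above_synthetic_max_frequency"  # hypothetical reason label
        rows.append({"period_s": period, "passes": reason == "", "reason": reason})
    return rows


# A 10 s synthetic record (500 samples at dt = 0.02 s), valid up to 1 Hz.
rows = sketch_period_support([0.5, 2.0, 5.0], n_samples=500, dt=0.02,
                             synthetic_max_frequency_hz=1.0, source="synthetic")
```

In this example the 0.5 s period fails the frequency limit, 2.0 s passes, and 5.0 s fails because only two cycles fit in the record.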
- spatial_vtk.qc.build.spectral.qc_psa_periods(trace: Any, *, dt: float, periods_s: Sequence[float], threshold: float = 0.25, damping: float = 0.05, min_cycles_in_record: float = 3.0, synthetic_max_frequency_hz: float | None = None, source: str = 'observed', disable_relative_amplitude_qc: bool = False) DataFrame
QC PSA values on a requested period grid.
- Parameters:
trace (Any) – Required function argument.
dt (float) – Required function argument.
periods_s (Sequence) – Required function argument.
threshold (float, optional) – Optional function argument. Defaults to 0.25.
damping (float, optional) – Optional function argument. Defaults to 0.05.
min_cycles_in_record (float, optional) – Optional function argument. Defaults to 3.0.
synthetic_max_frequency_hz (float | None, optional) – Optional function argument. Defaults to None.
source (str, optional) – Optional function argument. Defaults to 'observed'.
disable_relative_amplitude_qc (bool, optional) – Optional function argument. Defaults to False.
- Returns:
Return value produced by the function.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build.spectral.spectral_relative_amplitude_mask(periods_s: Sequence[float], amplitudes: Sequence[float], *, threshold: float = 0.25, min_period_s: float | None = None, max_period_s: float | None = None, disable_relative_amplitude_qc: bool = False) ndarray
Return valid periods based on relative spectral amplitude.
- Parameters:
periods_s (Sequence) – Period grid in seconds.
amplitudes (Sequence) – Spectral amplitudes aligned to periods_s.
threshold (float, optional) – Minimum fraction of the maximum finite amplitude required. Defaults to 0.25.
min_period_s (float | None, optional) – Optional hard lower period bound. Defaults to None.
max_period_s (float | None, optional) – Optional hard upper period bound. Defaults to None.
disable_relative_amplitude_qc (bool, optional) – Whether to skip the relative-amplitude threshold. Defaults to False.
- Returns:
Boolean validity mask.
- Return type:
numpy.ndarray
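A pure-Python sketch of the masking rule described by the parameters: a period is valid when its amplitude is at least `threshold` times the largest finite amplitude and it lies inside the optional hard bounds. It returns a list rather than a `numpy.ndarray`, and it assumes non-finite amplitudes are always invalid, which the reference does not state.

```python
import math


def sketch_relative_amplitude_mask(periods_s, amplitudes, threshold=0.25,
                                   min_period_s=None, max_period_s=None,
                                   disable_relative_amplitude_qc=False):
    """Return a list of booleans aligned to periods_s (illustrative only)."""
    finite = [a for a in amplitudes if math.isfinite(a)]
    peak = max(finite) if finite else 0.0
    mask = []
    for period, amp in zip(periods_s, amplitudes):
        valid = math.isfinite(amp)  # assumption: NaN/inf amplitudes never pass
        if valid and not disable_relative_amplitude_qc:
            valid = peak > 0.0 and amp >= threshold * peak
        if min_period_s is not None and period < min_period_s:
            valid = False
        if max_period_s is not None and period > max_period_s:
            valid = False
        mask.append(valid)
    return mask


periods = [1.0, 2.0, 5.0, 10.0]
amps = [0.1, 1.0, 0.5, float("nan")]
mask = sketch_relative_amplitude_mask(periods, amps)
bounded = sketch_relative_amplitude_mask(periods, amps, max_period_s=2.0)
unthresholded = sketch_relative_amplitude_mask(periods, amps,
                                               disable_relative_amplitude_qc=True)
```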
- spatial_vtk.qc.build.spectral.spectral_valid_period_bounds(periods_s: Sequence[float], valid_mask: Sequence[bool]) tuple[float | None, float | None]
Return minimum and maximum valid period from a validity mask.
- Parameters:
periods_s (Sequence) – Required function argument.
valid_mask (Sequence) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
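A minimal sketch of reducing a validity mask to period bounds. Returning `(None, None)` when no period is valid is an assumption inferred from the `float | None` return annotation; the function name is hypothetical.

```python
def sketch_valid_period_bounds(periods_s, valid_mask):
    """Return (min_valid_period, max_valid_period), or (None, None) if none is valid."""
    valid = [p for p, ok in zip(periods_s, valid_mask) if ok]
    if not valid:
        return (None, None)
    return (min(valid), max(valid))


bounds = sketch_valid_period_bounds([1.0, 2.0, 5.0, 10.0], [False, True, True, False])
empty = sketch_valid_period_bounds([1.0, 2.0], [False, False])
```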
High-level QC table builders for public workflows.
Workflow Overview
This module creates the standard QC tables used by notebooks, CLI commands, figures, metric filtering, dashboards, and manual-review exports.
- spatial_vtk.qc.build.workflow.build_comparison_eligibility(qc_summary: DataFrame | str | Path) DataFrame
Return rows where observed and synthetic QC both pass.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific QC summary table.
- Returns:
Comparison-eligible event/station/component/passband/metric rows.
- Return type:
pandas.DataFrame
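The both-sides-pass rule can be illustrated without pandas. This sketch assumes the side-specific table carries a `source` column and a boolean `qc_pass` column; both names are hypothetical, as is the function name. Lists of dictionaries stand in for DataFrame rows.

```python
KEY_COLS = ("event_id", "station", "component", "passband", "metric")


def sketch_comparison_eligibility(qc_rows):
    """Keep keys whose observed and synthetic rows both pass QC (illustrative only)."""
    passed = {"observed": set(), "synthetic": set()}
    for row in qc_rows:
        if row["qc_pass"] and row["source"] in passed:
            passed[row["source"]].add(tuple(row[c] for c in KEY_COLS))
    both = passed["observed"] & passed["synthetic"]
    return [dict(zip(KEY_COLS, key)) for key in sorted(both)]


def make_row(source, station, qc_pass):
    """Build one side-specific QC row with fixed event, component, band, and metric."""
    return {"source": source, "event_id": "ev001", "station": station,
            "component": "Z", "passband": "band_a", "metric": "pgv",
            "qc_pass": qc_pass}


qc_rows = [
    make_row("observed", "STA1", True), make_row("synthetic", "STA1", True),
    make_row("observed", "STA2", True), make_row("synthetic", "STA2", False),
]
eligible = sketch_comparison_eligibility(qc_rows)
```

Only `STA1` survives here because the synthetic row for `STA2` fails.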
- spatial_vtk.qc.build.workflow.build_metric_pair_retention_table(qc_summary: DataFrame | str | Path, *, group_cols: Sequence[str] = ('metric', 'passband')) DataFrame
Summarize post-QC observed/synthetic pair retention.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific metric QC table with observed and synthetic rows.
group_cols (collections.abc.Sequence, optional) – Columns used to group retention percentages, usually metric and passband. Defaults to ('metric', 'passband').
- Returns:
Pair-retention rows with total pairs before QC, retained pairs after QC, and retained percentage.
- Return type:
pandas.DataFrame
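A sketch of the retention arithmetic under stated assumptions: a pair exists when both an observed and a synthetic row share the same event/station/component/passband/metric key, and it is retained when both rows pass. The `source` and `qc_pass` column names, the output field names, and the function name are hypothetical. In this sketch `group_cols` must be drawn from the pair key columns.

```python
from collections import defaultdict

PAIR_COLS = ("event_id", "station", "component", "passband", "metric")


def sketch_pair_retention(qc_rows, group_cols=("metric", "passband")):
    """Count total and retained observed/synthetic pairs per group (illustrative only)."""
    status = defaultdict(dict)  # pair key -> {source: qc_pass}
    for row in qc_rows:
        status[tuple(row[c] for c in PAIR_COLS)][row["source"]] = bool(row["qc_pass"])
    counts = defaultdict(lambda: [0, 0])  # group -> [total_pairs, retained_pairs]
    for key, sides in status.items():
        if "observed" not in sides or "synthetic" not in sides:
            continue  # one side is missing, so this key is not a pair
        record = dict(zip(PAIR_COLS, key))
        group = tuple(record[c] for c in group_cols)
        counts[group][0] += 1
        counts[group][1] += int(sides["observed"] and sides["synthetic"])
    return [
        {**dict(zip(group_cols, group)), "total_pairs": total,
         "retained_pairs": kept, "retained_percent": 100.0 * kept / total}
        for group, (total, kept) in sorted(counts.items())
    ]


def make_row(source, station, qc_pass):
    """Build one side-specific QC row with fixed event, component, band, and metric."""
    return {"source": source, "event_id": "ev001", "station": station,
            "component": "Z", "passband": "band_a", "metric": "pgv",
            "qc_pass": qc_pass}


qc_rows = [
    make_row("observed", "STA1", True), make_row("synthetic", "STA1", True),
    make_row("observed", "STA2", True), make_row("synthetic", "STA2", False),
]
retention = sketch_pair_retention(qc_rows)
```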
- spatial_vtk.qc.build.workflow.build_metric_qc_summary(event_station_records: DataFrame | str | Path, *, metrics: Sequence[str], components: Sequence[str], passbands: Sequence[str | Sequence[float]], spectral_periods_s: Sequence[float] = (), sources: Sequence[str] = ('observed', 'synthetic'), synthetic_max_frequency_hz: float | None = None, observed_available: bool = True, synthetic_available: bool = True, trace_qc_summary: DataFrame | str | Path | None = None, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) DataFrame
Build a side-specific metric QC summary from event-station records.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with at least event_id and station columns.
metrics (collections.abc.Sequence) – Requested metrics.
components (collections.abc.Sequence) – Requested components.
passbands (collections.abc.Sequence) – Requested period bands.
spectral_periods_s (collections.abc.Sequence, optional) – Periods to check for spectral metrics. Defaults to ().
sources (collections.abc.Sequence, optional) – Sources to include, usually observed and synthetic. Defaults to ('observed', 'synthetic').
synthetic_max_frequency_hz (float | None, optional) – Maximum valid synthetic frequency. Synthetic spectral periods shorter than 1 / synthetic_max_frequency_hz fail QC. Defaults to None.
observed_available (bool, optional) – Default observed availability when the input table does not include a source-specific availability column. Defaults to True.
synthetic_available (bool, optional) – Default synthetic availability when the input table does not include a source-specific availability column. Defaults to True.
trace_qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific waveform QC table. When provided, failed source/event/station/component/passband rows fail matching metric rows. Defaults to None.
verbose (bool, optional) – Print progress messages while building QC rows. Defaults to False.
progress_interval (int, optional) – Number of event-station records between progress messages. Defaults to 25.
checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written. Defaults to None.
resume (bool, optional) – When true and checkpoint_path exists, skip event/station records already present in that checkpoint. Defaults to True.
checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes. Defaults to 25.
return_result (bool, optional) – When false, append checkpoint rows on disk and return an empty dataframe instead of loading/returning the full QC inventory. This is intended for large Slurm jobs. Defaults to True.
- Returns:
Standard metric QC rows.
- Return type:
pandas.DataFrame
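The checkpoint, resume, and checkpoint_interval parameters describe a resumable loop. The sketch below shows one way such a loop could work, using a CSV file and the standard library only. The file format, the field names, the placeholder QC result, and the choice to discard an existing checkpoint when `resume` is false are all assumptions, not documented behavior.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["event_id", "station", "qc_pass"]


def sketch_resumable_build(records, checkpoint_path, resume=True, checkpoint_interval=25):
    """Process records, appending rows to a CSV checkpoint and skipping finished ones."""
    checkpoint_path = Path(checkpoint_path)
    done = set()
    if resume and checkpoint_path.exists():
        with checkpoint_path.open(newline="") as handle:
            done = {(r["event_id"], r["station"]) for r in csv.DictReader(handle)}
    elif checkpoint_path.exists():
        checkpoint_path.unlink()  # assumed behavior: start over when resume is off
    pending = []
    processed = 0

    def flush():
        """Append pending rows, writing the header only for a new file."""
        new_file = not checkpoint_path.exists()
        with checkpoint_path.open("a", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS)
            if new_file:
                writer.writeheader()
            writer.writerows(pending)
        pending.clear()

    for record in records:
        if (record["event_id"], record["station"]) in done:
            continue  # already present in the checkpoint
        pending.append({**record, "qc_pass": True})  # placeholder for real QC work
        processed += 1
        if len(pending) >= checkpoint_interval:
            flush()
    if pending:
        flush()
    return processed


records = [{"event_id": "ev001", "station": f"STA{i}"} for i in range(5)]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "qc_checkpoint.csv"
    first = sketch_resumable_build(records[:3], path, checkpoint_interval=2)
    second = sketch_resumable_build(records, path, checkpoint_interval=2)
```

The second call processes only the two records that the first call had not yet written.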
- spatial_vtk.qc.build.workflow.build_post_qc_record_table(event_station_records: DataFrame | str | Path, *, events: DataFrame | str | Path | None = None, qc_summary: DataFrame | str | Path | None = None) DataFrame
Build station-event records for post-QC map figures.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with station coordinates.
events (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional event metadata with event coordinates. Defaults to None.
qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional QC summary used to assign pass/fail status per event/station. Defaults to None.
- Returns:
Records with sta_lat, sta_lon, event coordinates, and qc_status.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build.workflow.build_qc_availability_table(event_station_records: DataFrame | str | Path, *, qc_summary: DataFrame | str | Path | None = None, qc_aggregate: str = 'all_pass', observed_root: str | Path | None = None, synthetic_root: str | Path | None = None, observed_inventory: DataFrame | str | Path | None = None, synthetic_inventory: DataFrame | str | Path | None = None, cfg: SpatialVTKConfig | None = None) DataFrame
Build observed/synthetic availability rows for QC figures.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table.
qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific QC summary. When provided, availability is based on post-QC retained rows instead of file presence. Defaults to None.
qc_aggregate (str, optional) – "all_pass" marks a side available only if all matching QC rows pass. "any_pass" marks a side available when at least one matching row passes. Defaults to 'all_pass'.
observed_root (str | pathlib.Path | None, optional) – Optional root used to inventory observed files when inventory tables are not already available. When omitted, the active config’s paths.observed_root is used when present. Defaults to None.
synthetic_root (str | pathlib.Path | None, optional) – Optional root used to inventory synthetic files when inventory tables are not already available. When omitted, the active config’s paths.synthetic_root is used when present. Defaults to None.
observed_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional observed file inventory table. Defaults to None.
synthetic_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional synthetic file inventory table. Defaults to None.
cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used to resolve default roots. Defaults to None.
- Returns:
Availability table with one row per event/station.
- Return type:
pandas.DataFrame
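The two `qc_aggregate` modes map onto Python's `all` and `any`. A minimal sketch, assuming each side contributes a list of boolean QC results for one event/station; treating an empty list as unavailable is an assumption, and the function name is hypothetical.

```python
def sketch_side_available(qc_flags, qc_aggregate="all_pass"):
    """Collapse per-row QC results for one side into a single availability flag."""
    if not qc_flags:
        return False  # assumption: no matching QC rows means unavailable
    if qc_aggregate == "all_pass":
        return all(qc_flags)
    if qc_aggregate == "any_pass":
        return any(qc_flags)
    raise ValueError(f"unknown qc_aggregate: {qc_aggregate!r}")


flags = [True, False, True]
strict = sketch_side_available(flags, "all_pass")
lenient = sketch_side_available(flags, "any_pass")
```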
- spatial_vtk.qc.build.workflow.build_qc_waveform_comparison_records(event_station_records: DataFrame | str | Path, qc_summary: DataFrame | str | Path | None = None, *, comparison_eligible: DataFrame | str | Path | None = None, component: str = 'Z', passband: str | None = None, event_id: str | list[str] | tuple[str, ...] | None = None, max_distance_km: float | None = 50.0, max_records: int | None = 12, observed_waveform_col: str = 'observed_processed_waveform', synthetic_waveform_col: str = 'synthetic_processed_waveform') DataFrame
Build post-QC waveform rows for observed/synthetic visual inspection.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Prepared event-station table with waveform paths.
qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Side-specific QC table. Used to build comparison-eligible rows when comparison_eligible is not supplied. Defaults to None.
comparison_eligible (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional precomputed output from build_comparison_eligibility(). Defaults to None.
component (str, optional) – Component to load for the visual comparison. Defaults to 'Z'.
passband (str | None, optional) – Optional retained passband to select. Defaults to None.
event_id (str | list[str] | tuple[str, ...] | None, optional) – Optional event ID or IDs to select before loading waveforms. Defaults to None.
max_distance_km (float | None, optional) – Optional distance limit in kilometers. Defaults to 50.0.
max_records (int | None, optional) – Optional maximum retained event-station rows to load. Defaults to 12.
observed_waveform_col (str, optional) – Observed waveform path column in event_station_records. Defaults to 'observed_processed_waveform'.
synthetic_waveform_col (str, optional) – Synthetic waveform path column in event_station_records. Defaults to 'synthetic_processed_waveform'.
- Returns:
Rows with observed and synthetic trace objects, sample intervals, event-origin offsets, and distance metadata.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build.workflow.build_retention_figure_table(qc_summary: DataFrame | str | Path) DataFrame
Prepare QC rows for retention summary figures.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.
- Returns:
Copy with stage set to passband labels.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build.workflow.build_waveform_qc_summary(event_station_records: DataFrame | str | Path, *, sources: Sequence[str] = ('observed', 'synthetic'), waveform_path_columns: dict[str, str] | None = None, components: Sequence[str] | None = None, passbands: Sequence[str | Sequence[float]] | None = None, preprocessing: WaveformPreprocessing | None = None, apply_config_preprocessing_to_processed_files: bool = False, cfg: SpatialVTKConfig | None = None, min_record_length_s: float | None = None, min_end_after_origin_s: float | None = None, snr_threshold: float | None = None, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) DataFrame
Build observed/synthetic waveform QC rows from event-station records.
- Parameters:
event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with waveform path columns.
sources (collections.abc.Sequence, optional) – Source labels to inspect.
waveform_path_columns (dict[str, str] | None, optional) – Optional mapping from source label to waveform path column.
components (collections.abc.Sequence[str] | None, optional) – Optional component and period-band selections. When omitted, the active metric settings are used.
passbands (collections.abc.Sequence[str | collections.abc.Sequence[float]] | None, optional) – Optional component and period-band selections. When omitted, the active metric settings are used.
preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional preprocessing applied before QC. When omitted and a processed waveform column is used, no extra filtering is applied.
apply_config_preprocessing_to_processed_files (bool, optional) – Whether processed waveform columns should still use config preprocessing during QC.
cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active config is used.
min_record_length_s (float | None, optional) – Optional QC threshold overrides. Missing values are read from qc.automatic in the config.
min_end_after_origin_s (float | None, optional) – Optional QC threshold overrides. Missing values are read from qc.automatic in the config.
snr_threshold (float | None, optional) – Optional QC threshold overrides. Missing values are read from qc.automatic in the config.
arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog used to anchor QC windows.
onset_phase (str, optional) – Pick phase used as the QC onset when available.
min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick.
verbose (bool, optional) – Print progress messages while loading waveforms and building QC rows.
progress_interval (int, optional) – Number of event-station records between progress messages.
checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where the combined waveform QC summary is written. Per-source intermediate checkpoints are written next to this path.
resume (bool, optional) – When true, existing per-source checkpoints are used to skip completed event/station/component groups.
checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.
return_result (bool, optional) – When false, write per-source checkpoints and combine them on disk into checkpoint_path instead of keeping all source QC rows in memory. This is intended for Slurm workers on large inventories.
event_station_records – Required function argument.
sources – Optional function argument. Defaults to ('observed', 'synthetic').
waveform_path_columns – Optional function argument. Defaults to None.
components – Optional function argument. Defaults to None.
passbands – Optional function argument. Defaults to None.
preprocessing – Optional function argument. Defaults to None.
apply_config_preprocessing_to_processed_files – Optional function argument. Defaults to False.
cfg – Optional function argument. Defaults to None.
min_record_length_s – Optional function argument. Defaults to None.
min_end_after_origin_s – Optional function argument. Defaults to None.
snr_threshold – Optional function argument. Defaults to None.
arrival_pick_catalog – Optional function argument. Defaults to None.
onset_phase – Optional function argument. Defaults to 'P'.
min_onset_pick_probability – Optional function argument. Defaults to 0.0.
verbose – Optional function argument. Defaults to False.
progress_interval – Optional function argument. Defaults to 25.
checkpoint_path – Optional function argument. Defaults to None.
resume – Optional function argument. Defaults to True.
checkpoint_interval – Optional function argument. Defaults to 25.
return_result – Optional function argument. Defaults to True.
- Returns:
Side-specific waveform QC rows that can be passed to build_metric_qc_summary(trace_qc_summary=...).
- Return type:
pandas.DataFrame
- spatial_vtk.qc.build.workflow.export_manual_review_queue(qc_summary: DataFrame | str | Path, output_path: str | Path | None = None, *, cfg: SpatialVTKConfig | None = None) Path
Write a manual-review queue from QC summary rows.
- Parameters:
qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.
output_path (str | pathlib.Path | None, optional) – CSV or JSON output path. When omitted, the standard manual_review_queue output path is resolved from the active config.
cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used when resolving the default output path.
qc_summary – Required function argument.
output_path – Optional function argument. Defaults to None.
cfg – Optional function argument. Defaults to None.
- Returns:
Written queue path.
- Return type:
pathlib.Path
Review
Table helpers for manual QC review queues.
Review Overview
This module owns small public table operations for manual quality-control review: queue creation, decision CSV loading/writing, and applying manual decisions to automated inventory rows.
- spatial_vtk.qc.review.tables.apply_manual_qc_decisions(inventory_df: DataFrame, decisions: DataFrame | str | Path | None, *, band_columns: tuple[str, ...] | None = None) DataFrame
Apply manual accept/reject decisions to an automated inventory table.
- Parameters:
inventory_df (pandas.DataFrame) – Automated QC inventory.
decisions (pandas.DataFrame | str | pathlib.Path | None) – Decision table or CSV path.
band_columns (tuple[str, ...] | None, optional) – Optional reject-column suffixes. When omitted, columns named reject_* are inferred.
inventory_df – Required function argument.
decisions – Required function argument.
band_columns – Optional function argument. Defaults to None.
- Returns:
Inventory copy with manual decision columns applied.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.review.tables.decision_key(event_id: object, station: object, component: object, scope_kind: object, scope_label: object) tuple[str, str, str, str, str]
Build the normalized key for one manual QC decision.
- Parameters:
event_id (object) – Required function argument.
station (object) – Required function argument.
component (object) – Required function argument.
scope_kind (object) – Required function argument.
scope_label (object) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
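The entry above does not state how the key is normalized. As a minimal sketch of one plausible rule, the helper below converts each field to a stripped string. The name sketch_decision_key and the rule itself (str() plus whitespace stripping, with None mapped to an empty string) are assumptions for illustration, not the library's confirmed behaviour.

```python
def sketch_decision_key(event_id, station, component, scope_kind, scope_label):
    """Hypothetical stand-in for decision_key: a 5-tuple of stripped strings."""

    def norm(value):
        # Assumed rule: None becomes "", anything else is str() and stripped.
        return "" if value is None else str(value).strip()

    fields = (event_id, station, component, scope_kind, scope_label)
    return tuple(norm(value) for value in fields)


# Illustrative values only; the scope labels are made up for this example.
key = sketch_decision_key(20240101, " STA1 ", "Z", "band", "10-20s")
empty_key = sketch_decision_key(None, "STA1", "Z", "trace", "")
```

A tuple of plain strings is hashable, so keys built this way can index a dictionary of decisions regardless of whether the inputs arrived as integers, strings, or missing values.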
- spatial_vtk.qc.review.tables.filter_trace_summary(df: DataFrame, *, event_id: str | None = None, station: str | None = None, component: str | None = None, accepted: bool | None = None, reject_reason_contains: str | None = None) DataFrame
Filter a trace-summary table using common review fields.
- Parameters:
df (pandas.DataFrame) – Required function argument.
event_id (str | None, optional) – Optional function argument. Defaults to None.
station (str | None, optional) – Optional function argument. Defaults to None.
component (str | None, optional) – Optional function argument. Defaults to None.
accepted (bool | None, optional) – Optional function argument. Defaults to None.
reject_reason_contains (str | None, optional) – Optional function argument. Defaults to None.
- Returns:
Return value produced by the function.
- Return type:
pandas.DataFrame
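The filtering behaviour is not spelled out above, so the following is only a sketch of how such a filter could work with pandas. The function name sketch_filter_trace_summary and the column names accepted and reject_reasons are assumptions; the real table schema may differ.

```python
import pandas as pd


def sketch_filter_trace_summary(df, *, event_id=None, station=None, component=None,
                                accepted=None, reject_reason_contains=None):
    """Hypothetical filter: each argument left as None is simply not applied."""
    mask = pd.Series(True, index=df.index)
    for column, wanted in (("event_id", event_id),
                           ("station", station),
                           ("component", component)):
        if wanted is not None:
            # Compare as strings so integer and string identifiers both match.
            mask &= df[column].astype(str) == str(wanted)
    if accepted is not None:
        mask &= df["accepted"].astype(bool) == accepted
    if reject_reason_contains is not None:
        # Plain substring match on the assumed "reject_reasons" column.
        reasons = df["reject_reasons"].fillna("").astype(str)
        mask &= reasons.str.contains(reject_reason_contains, regex=False)
    return df.loc[mask].copy()


# Made-up rows for illustration.
traces = pd.DataFrame({
    "event_id": ["ev1", "ev1", "ev2"],
    "station": ["STA1", "STA2", "STA1"],
    "component": ["Z", "Z", "N"],
    "accepted": [True, False, False],
    "reject_reasons": ["", "low_snr", "record_too_short;low_snr"],
})
rejected_low_snr = sketch_filter_trace_summary(
    traces, accepted=False, reject_reason_contains="low_snr"
)
```

Returning a copy keeps later edits to the filtered rows from touching the source table.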
- spatial_vtk.qc.review.tables.load_manual_qc_decisions(path: str | Path | None) DataFrame
Load manual QC decisions from CSV.
- Parameters:
path (str | pathlib.Path | None) – Decision CSV path. Missing or None returns an empty table.
path – Required function argument.
- Returns:
Normalized decision table.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.review.tables.normalize_manual_qc_decisions(df: DataFrame) DataFrame
Normalize a manual QC decision table.
- Parameters:
df (pandas.DataFrame) – Raw decision table.
df – Required function argument.
- Returns:
Decision table with public columns.
- Return type:
pandas.DataFrame
- spatial_vtk.qc.review.tables.queue_rows_from_filtered_trace_df(df: DataFrame, *, key_columns: tuple[str, ...] = ('event_id', 'station', 'component'), status: str = 'pending') list[dict[str, object]]
Convert a filtered trace table into manual-review queue rows.
- Parameters:
df (pandas.DataFrame) – Required function argument.
key_columns (tuple, optional) – Optional function argument. Defaults to ('event_id', 'station', 'component').
status (str, optional) – Optional function argument. Defaults to 'pending'.
- Returns:
Return value produced by the function.
- Return type:
list
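To make the conversion above concrete, here is a hedged sketch of turning a filtered table into queue rows. The helper name sketch_queue_rows is hypothetical, and both the de-duplication of key combinations and the exact row layout (key columns plus a status field) are assumptions; the library may keep more columns or one row per input row.

```python
import pandas as pd


def sketch_queue_rows(df, *, key_columns=("event_id", "station", "component"),
                      status="pending"):
    """Hypothetical stand-in: one dict per distinct key combination."""
    rows = []
    # Assumption: repeated key combinations collapse to a single queue row.
    unique_keys = df.loc[:, list(key_columns)].drop_duplicates()
    for record in unique_keys.to_dict(orient="records"):
        row = {column: str(record[column]) for column in key_columns}
        row["status"] = status
        rows.append(row)
    return rows


# Made-up rejected traces; the first two share the same key.
traces = pd.DataFrame({
    "event_id": ["ev1", "ev1", "ev2"],
    "station": ["STA1", "STA1", "STA2"],
    "component": ["Z", "Z", "N"],
    "accepted": [False, False, False],
})
queue = sketch_queue_rows(traces)
```

A list of plain dictionaries is easy to write out as either CSV or JSON.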
- spatial_vtk.qc.review.tables.write_manual_qc_decisions(df: DataFrame, path: str | Path, *, overwrite: bool = True) Path
Write manual QC decisions to CSV.
- Parameters:
df (pandas.DataFrame) – Decision rows.
path (str | pathlib.Path) – Output CSV path.
overwrite (bool, optional) – Whether to replace an existing file.
df – Required function argument.
path – Required function argument.
overwrite – Optional function argument. Defaults to True.
- Returns:
Written path.
- Return type:
pathlib.Path
Summary
Shared QC classification and reject-rule helpers.
- spatial_vtk.qc.summary.rules.classify_station_family(network: str, station: str) str
Classify one station as broadband, strong-motion, or unknown.
- Parameters:
network (str) – Required function argument.
station (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
str
- spatial_vtk.qc.summary.rules.dedupe_reason_codes(reasons: list[str]) list[str]
Deduplicate reject reason codes while preserving first-seen order.
- Parameters:
reasons (list) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
list
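Order-preserving de-duplication of this kind can be sketched with a dictionary, since Python dictionaries keep insertion order. The helper name sketch_dedupe_reason_codes and the reason-code strings are illustrative assumptions; the library's own implementation is not shown in this reference.

```python
def sketch_dedupe_reason_codes(reasons):
    """Hypothetical stand-in: drop repeats, keep the first occurrence of each code."""
    # dict.fromkeys keeps keys in first-seen order (guaranteed since Python 3.7).
    return list(dict.fromkeys(reasons))


# Made-up reason codes for illustration.
codes = sketch_dedupe_reason_codes(["low_snr", "short_record", "low_snr"])
```

Using set() instead would also remove duplicates but would not guarantee the original order.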
- spatial_vtk.qc.summary.rules.dominant_energy_band(freqs_hz: ndarray, power: ndarray, band_edges_hz: dict[str, tuple[float, float]]) tuple[str, dict[str, float]]
Resolve which standard band carries the most spectral power.
- Parameters:
freqs_hz (numpy.ndarray) – Required function argument.
power (numpy.ndarray) – Required function argument.
band_edges_hz (dict) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
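One way to resolve a dominant band is to sum the spectral power that falls inside each band and pick the largest total. The sketch below does that under stated assumptions: the helper name sketch_dominant_energy_band is hypothetical, bands are treated as half-open intervals [low, high), and the returned dictionary holds summed power per band. The library may integrate, normalize, or treat band edges differently.

```python
import numpy as np


def sketch_dominant_energy_band(freqs_hz, power, band_edges_hz):
    """Hypothetical stand-in: return (dominant label, summed power per band)."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    power = np.asarray(power, dtype=float)
    band_power = {}
    for label, (f_low, f_high) in band_edges_hz.items():
        # Assumed convention: low edge inclusive, high edge exclusive.
        in_band = (freqs_hz >= f_low) & (freqs_hz < f_high)
        band_power[label] = float(power[in_band].sum())
    # Raises ValueError when band_edges_hz is empty.
    dominant = max(band_power, key=band_power.get)
    return dominant, band_power


# Made-up spectrum and band edges for illustration.
freqs = np.array([0.02, 0.05, 0.08, 0.2, 0.5])
spectrum = np.array([1.0, 4.0, 3.0, 0.5, 0.25])
bands = {"long": (0.01, 0.1), "short": (0.1, 1.0)}
label, totals = sketch_dominant_energy_band(freqs, spectrum, bands)
```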
- spatial_vtk.qc.summary.rules.global_trace_reject_reasons(*, record_length_s: float, end_rel_s: float, onset_reasons: list[str], min_end_after_origin_s: float, min_record_length_s: float) tuple[bool, list[str]]
Apply station-level reject rules shared by every passband.
- Parameters:
record_length_s (float) – Required function argument.
end_rel_s (float) – Required function argument.
onset_reasons (list) – Required function argument.
min_end_after_origin_s (float) – Required function argument.
min_record_length_s (float) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
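The rule logic is not documented above, so the following is a sketch of how trace-level checks of this shape are commonly written. The helper name, the reason-code strings, the reading of end_rel_s as the trace end time relative to the event origin, and the meaning of the returned boolean (True when the trace is rejected) are all assumptions.

```python
def sketch_global_trace_reject_reasons(*, record_length_s, end_rel_s, onset_reasons,
                                       min_end_after_origin_s, min_record_length_s):
    """Hypothetical stand-in: return (rejected, reasons) for one trace."""
    reasons = []
    if record_length_s < min_record_length_s:
        reasons.append("record_too_short")  # hypothetical code
    if end_rel_s < min_end_after_origin_s:
        reasons.append("ends_too_soon_after_origin")  # hypothetical code
    # Carry over any reasons raised while resolving the onset.
    reasons.extend(onset_reasons)
    # Drop repeats while keeping first-seen order.
    reasons = list(dict.fromkeys(reasons))
    return bool(reasons), reasons


rejected, reasons = sketch_global_trace_reject_reasons(
    record_length_s=40.0,
    end_rel_s=100.0,
    onset_reasons=[],
    min_end_after_origin_s=60.0,
    min_record_length_s=120.0,
)
```

Because these checks do not depend on a passband, their result can be computed once per trace and reused for every band.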
- spatial_vtk.qc.summary.rules.reject_passband(*, global_reasons: list[str], snr_rms: float, snr_threshold: float, noise_window_valid: bool, signal_window_valid: bool, pre_origin_window_valid: bool, pre_origin_signal_ratio: float, pre_origin_signal_ratio_threshold: float, origin_window_valid: bool, origin_signal_ratio: float) tuple[bool, list[str]]
Apply passband-specific reject rules and merge shared global reasons.
- Parameters:
global_reasons (list) – Required function argument.
snr_rms (float) – Required function argument.
snr_threshold (float) – Required function argument.
noise_window_valid (bool) – Required function argument.
signal_window_valid (bool) – Required function argument.
pre_origin_window_valid (bool) – Required function argument.
pre_origin_signal_ratio (float) – Required function argument.
pre_origin_signal_ratio_threshold (float) – Required function argument.
origin_window_valid (bool) – Required function argument.
origin_signal_ratio (float) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
tuple
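As a rough sketch of how passband rules might be merged with shared reasons, the helper below covers only the SNR and pre-origin checks. Everything about it is an assumption: the name sketch_reject_passband, the reason codes, the direction of each comparison, and the reduced parameter list. The origin_window_valid and origin_signal_ratio arguments are left out because this reference does not state which rule uses them.

```python
def sketch_reject_passband(*, global_reasons, snr_rms, snr_threshold,
                           noise_window_valid, signal_window_valid,
                           pre_origin_window_valid, pre_origin_signal_ratio,
                           pre_origin_signal_ratio_threshold):
    """Hypothetical stand-in: return (rejected, reasons) for one passband."""
    reasons = list(global_reasons)  # start from the shared trace-level reasons
    if not (noise_window_valid and signal_window_valid):
        reasons.append("invalid_snr_window")  # hypothetical code
    elif snr_rms < snr_threshold:
        reasons.append("low_snr")  # hypothetical code
    # Assumed direction: too much energy before the origin is a reject.
    if pre_origin_window_valid and pre_origin_signal_ratio > pre_origin_signal_ratio_threshold:
        reasons.append("pre_origin_energy")  # hypothetical code
    reasons = list(dict.fromkeys(reasons))  # de-duplicate, keep order
    return bool(reasons), reasons


band_rejected, band_reasons = sketch_reject_passband(
    global_reasons=["record_too_short"],
    snr_rms=1.5,
    snr_threshold=3.0,
    noise_window_valid=True,
    signal_window_valid=True,
    pre_origin_window_valid=True,
    pre_origin_signal_ratio=0.1,
    pre_origin_signal_ratio_threshold=0.5,
)
```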
- spatial_vtk.qc.summary.rules.station_code_has_letters(station: str) bool
Return whether one station code contains at least one alphabetic character.
- Parameters:
station (str) – Required function argument.
- Returns:
Return value produced by the function.
- Return type:
bool
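A check like this fits in one line of Python. The sketch below uses a hypothetical helper name and made-up station codes; note that str.isalpha() also accepts non-ASCII letters, which may or may not match the library's definition of alphabetic.

```python
def sketch_station_code_has_letters(station):
    """Hypothetical stand-in: True if any character in the code is a letter."""
    return any(ch.isalpha() for ch in str(station))


# Made-up station codes for illustration.
has_letters = sketch_station_code_has_letters("A123")
all_digits = sketch_station_code_has_letters("12345")
```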