Quality Control API

Quality-control modules build trace inventories, evaluate waveform and metric QC rules, create summaries, and prepare manual-review queues.

Package Entry Point

Quality-control workflow modules.

class spatial_vtk.qc.InventoryBandSpec(label: str, period_min: float, period_max: float)

Describe one passband present in a trace-inventory CSV.

class spatial_vtk.qc.TraceInventoryLookup(rows: dict[tuple[str, str, str, str, str], dict[str, str]] | None = None, *, csv_path: Path | None = None, available_bands: list[InventoryBandSpec] | None = None, disabled_requested_bands: set[str] | None = None)

Dictionary lookup with trace-inventory metadata.

spatial_vtk.qc.build_comparison_eligibility(qc_summary: DataFrame | str | Path) DataFrame

Return rows where observed and synthetic QC both pass.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific QC summary table.

Returns:

Comparison-eligible event/station/component/passband/metric rows.

Return type:

pandas.DataFrame
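
The "both sides pass" rule can be illustrated with a minimal, library-independent sketch. It is not the spatial_vtk implementation: the helper name comparison_eligible_keys and the boolean passed field are assumptions made for illustration, and plain dictionaries stand in for DataFrame rows.

```python
from collections import defaultdict

KEY_FIELDS = ("event_id", "station", "component", "passband", "metric")


def comparison_eligible_keys(rows):
    """Return sorted keys whose observed and synthetic rows both pass QC.

    `rows` is an iterable of dicts with the KEY_FIELDS, a `source` label,
    and a hypothetical boolean `passed` field.
    """
    passed_sources = defaultdict(set)
    for row in rows:
        key = tuple(row[field] for field in KEY_FIELDS)
        if row["passed"]:
            passed_sources[key].add(row["source"])
    # A key is eligible only when both sides contributed a passing row.
    return sorted(
        key
        for key, sources in passed_sources.items()
        if {"observed", "synthetic"} <= sources
    )


base = {"event_id": "ev1", "component": "Z", "passband": "2-5s", "metric": "pgv"}
example_rows = [
    {**base, "station": "STA1", "source": "observed", "passed": True},
    {**base, "station": "STA1", "source": "synthetic", "passed": True},
    {**base, "station": "STA2", "source": "observed", "passed": True},
    {**base, "station": "STA2", "source": "synthetic", "passed": False},
]
eligible = comparison_eligible_keys(example_rows)
```

In this example only the STA1 key is retained, because the synthetic row for STA2 fails.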

spatial_vtk.qc.build_event_station_pair_retention_table(qc_summary: DataFrame | str | Path) DataFrame

Summarize comparison-pair retention for each event-station pair.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific metric QC table with observed and synthetic rows.

Returns:

One row per event/station with total comparison pairs, retained pairs, and retained percentage across all components, passbands, and metrics.

Return type:

pandas.DataFrame

spatial_vtk.qc.build_metric_pair_retention_table(qc_summary: DataFrame | str | Path, *, group_cols: Sequence[str] = ('metric', 'passband')) DataFrame

Summarize post-QC observed/synthetic pair retention.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific metric QC table with observed and synthetic rows.

  • group_cols (collections.abc.Sequence, optional) – Columns used to group retention percentages, usually metric and passband.

Returns:

Pair-retention rows with total pairs before QC, retained pairs after QC, and retained percentage.

Return type:

pandas.DataFrame
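
As a rough illustration of the retention arithmetic described above, the sketch below groups observed/synthetic pairs and reports the retained percentage per group. The helper name pair_retention and the observed_pass / synthetic_pass fields are hypothetical; the real function works on a QC summary table and may count pairs differently.

```python
from collections import defaultdict


def pair_retention(pairs, group_cols=("metric", "passband")):
    """Summarize pair retention per group.

    Each pair is a dict holding the grouping columns plus two hypothetical
    boolean fields, `observed_pass` and `synthetic_pass`.
    """
    totals = defaultdict(int)
    retained = defaultdict(int)
    for pair in pairs:
        group = tuple(pair[col] for col in group_cols)
        totals[group] += 1
        # A pair survives QC only when both sides pass.
        if pair["observed_pass"] and pair["synthetic_pass"]:
            retained[group] += 1
    return {
        group: {
            "total_pairs": totals[group],
            "retained_pairs": retained[group],
            "retained_pct": 100.0 * retained[group] / totals[group],
        }
        for group in totals
    }


example_pairs = [
    {"metric": "pgv", "passband": "2-5s", "observed_pass": True, "synthetic_pass": True},
    {"metric": "pgv", "passband": "2-5s", "observed_pass": True, "synthetic_pass": True},
    {"metric": "pgv", "passband": "2-5s", "observed_pass": True, "synthetic_pass": True},
    {"metric": "pgv", "passband": "2-5s", "observed_pass": False, "synthetic_pass": True},
]
retention = pair_retention(example_pairs)
```

Here three of four pairs survive in the ("pgv", "2-5s") group, so the retained percentage is 75.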

spatial_vtk.qc.build_metric_qc_summary(event_station_records: DataFrame | str | Path, *, metrics: Sequence[str], components: Sequence[str], passbands: Sequence[str | Sequence[float]], spectral_periods_s: Sequence[float] = (), sources: Sequence[str] = ('observed', 'synthetic'), synthetic_max_frequency_hz: float | None = None, observed_available: bool = True, synthetic_available: bool = True, trace_qc_summary: DataFrame | str | Path | None = None, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) DataFrame

Build a side-specific metric QC summary from event-station records.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with at least event_id and station columns.

  • metrics (collections.abc.Sequence) – Requested metrics.

  • components (collections.abc.Sequence) – Requested components.

  • passbands (collections.abc.Sequence) – Requested period bands.

  • spectral_periods_s (collections.abc.Sequence, optional) – Periods to check for spectral metrics.

  • sources (collections.abc.Sequence, optional) – Sources to include, usually observed and synthetic.

  • synthetic_max_frequency_hz (float | None, optional) – Maximum valid synthetic frequency. Synthetic spectral periods shorter than 1 / synthetic_max_frequency_hz fail QC.

  • observed_available (bool, optional) – Default observed availability, used when the input table does not include a source-specific availability column.

  • synthetic_available (bool, optional) – Default synthetic availability, used when the input table does not include a source-specific availability column.

  • trace_qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific waveform QC table. When provided, failed source/event/station/component/passband rows fail matching metric rows.

  • verbose (bool, optional) – Print progress messages while building QC rows.

  • progress_interval (int, optional) – Number of event-station records between progress messages.

  • checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written.

  • resume (bool, optional) – When true and checkpoint_path exists, skip event/station records already present in that checkpoint.

  • checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.

  • return_result (bool, optional) – When false, append checkpoint rows on disk and return an empty dataframe instead of loading/returning the full QC inventory. This is intended for large Slurm jobs.

Returns:

Standard metric QC rows.

Return type:

pandas.DataFrame
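
The synthetic_max_frequency_hz rule above reduces to a simple period comparison. The sketch below is an assumption-laden illustration, not the library code: the helper name synthetic_period_passes is hypothetical, and treating None as "no limit" is a guess based on the parameter's default.

```python
def synthetic_period_passes(period_s, synthetic_max_frequency_hz):
    """Return True when a synthetic spectral period is within the valid range.

    Periods shorter than 1 / synthetic_max_frequency_hz fail. A value of
    None is assumed here to mean that no frequency limit is applied.
    """
    if synthetic_max_frequency_hz is None:
        return True
    min_valid_period_s = 1.0 / synthetic_max_frequency_hz
    return period_s >= min_valid_period_s


# With a 2 Hz limit the shortest valid period is 0.5 s.
period_checks = {
    period_s: synthetic_period_passes(period_s, 2.0)
    for period_s in (0.2, 0.5, 1.0)
}
```

With a 2 Hz limit, the 0.2 s period fails while 0.5 s and 1.0 s pass.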

spatial_vtk.qc.build_post_qc_record_table(event_station_records: DataFrame | str | Path, *, events: DataFrame | str | Path | None = None, qc_summary: DataFrame | str | Path | None = None) DataFrame

Build station-event records for post-QC map figures.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with station coordinates.

  • events (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional event metadata with event coordinates.

  • qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional QC summary used to assign pass/fail status per event/station.

Returns:

Records with sta_lat, sta_lon, event coordinates, and qc_status.

Return type:

pandas.DataFrame

spatial_vtk.qc.build_qc_availability_table(event_station_records: DataFrame | str | Path, *, qc_summary: DataFrame | str | Path | None = None, qc_aggregate: str = 'all_pass', observed_root: str | Path | None = None, synthetic_root: str | Path | None = None, observed_inventory: DataFrame | str | Path | None = None, synthetic_inventory: DataFrame | str | Path | None = None, cfg: SpatialVTKConfig | None = None) DataFrame

Build observed/synthetic availability rows for QC figures.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table.

  • qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific QC summary. When provided, availability is based on post-QC retained rows instead of file presence.

  • qc_aggregate (str, optional) – "all_pass" marks a side available only if all matching QC rows pass. "any_pass" marks a side available when at least one matching row passes.

  • observed_root (str | pathlib.Path | None, optional) – Optional root used to inventory observed files when an inventory table is not already available. When omitted, the active config’s paths.observed_root is used when present.

  • synthetic_root (str | pathlib.Path | None, optional) – Optional root used to inventory synthetic files when an inventory table is not already available. When omitted, the active config’s paths.synthetic_root is used when present.

  • observed_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional observed file inventory table.

  • synthetic_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional synthetic file inventory table.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used to resolve default roots.

Returns:

Availability table with one row per event/station.

Return type:

pandas.DataFrame
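
The difference between the two qc_aggregate modes can be sketched as follows. This is a simplified illustration with a hypothetical helper name (side_available); how the library treats a side with no matching QC rows is not documented here, so the "no rows means unavailable" choice is an assumption.

```python
def side_available(row_passes, qc_aggregate="all_pass"):
    """Aggregate per-row QC pass flags into one availability flag.

    "all_pass" requires every matching row to pass; "any_pass" requires
    at least one. An empty input is treated as unavailable (assumption).
    """
    passes = list(row_passes)
    if qc_aggregate == "all_pass":
        return bool(passes) and all(passes)
    if qc_aggregate == "any_pass":
        return any(passes)
    raise ValueError(f"unknown qc_aggregate: {qc_aggregate!r}")


mixed_rows = [True, False, True]
strict_result = side_available(mixed_rows, "all_pass")
lenient_result = side_available(mixed_rows, "any_pass")
```

For a side with mixed results, "all_pass" reports it as unavailable and "any_pass" reports it as available.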

spatial_vtk.qc.build_qc_waveform_comparison_records(event_station_records: DataFrame | str | Path, qc_summary: DataFrame | str | Path | None = None, *, comparison_eligible: DataFrame | str | Path | None = None, component: str = 'Z', passband: str | None = None, event_id: str | list[str] | tuple[str, ...] | None = None, max_distance_km: float | None = 50.0, max_records: int | None = 12, observed_waveform_col: str = 'observed_processed_waveform', synthetic_waveform_col: str = 'synthetic_processed_waveform') DataFrame

Build post-QC waveform rows for observed/synthetic visual inspection.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Prepared event-station table with waveform paths.

  • qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Side-specific QC table. Used to build comparison-eligible rows when comparison_eligible is not supplied.

  • comparison_eligible (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional precomputed output from build_comparison_eligibility().

  • component (str, optional) – Component to load for the visual comparison.

  • passband (str | None, optional) – Optional retained passband to select.

  • event_id (str | list[str] | tuple[str, ...] | None, optional) – Optional event ID or IDs to select before loading waveforms.

  • max_distance_km (float | None, optional) – Optional distance limit in kilometers.

  • max_records (int | None, optional) – Optional maximum retained event-station rows to load.

  • observed_waveform_col (str, optional) – Observed waveform path column in event_station_records.

  • synthetic_waveform_col (str, optional) – Synthetic waveform path column in event_station_records.

Returns:

Rows with observed and synthetic trace objects, sample intervals, event-origin offsets, and distance metadata.

Return type:

pandas.DataFrame

spatial_vtk.qc.build_retention_figure_table(qc_summary: DataFrame | str | Path) DataFrame

Prepare QC rows for retention summary figures.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.

Returns:

Copy with stage set to passband labels.

Return type:

pandas.DataFrame

spatial_vtk.qc.build_waveform_qc_summary(event_station_records: DataFrame | str | Path, *, sources: Sequence[str] = ('observed', 'synthetic'), waveform_path_columns: dict[str, str] | None = None, components: Sequence[str] | None = None, passbands: Sequence[str | Sequence[float]] | None = None, preprocessing: WaveformPreprocessing | None = None, apply_config_preprocessing_to_processed_files: bool = False, cfg: SpatialVTKConfig | None = None, min_record_length_s: float | None = None, min_end_after_origin_s: float | None = None, snr_threshold: float | None = None, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) DataFrame

Build observed/synthetic waveform QC rows from event-station records.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with waveform path columns.

  • sources (collections.abc.Sequence, optional) – Source labels to inspect.

  • waveform_path_columns (dict[str, str] | None, optional) – Optional mapping from source label to waveform path column.

  • components (collections.abc.Sequence[str] | None, optional) – Optional component selection. When omitted, the active metric settings are used.

  • passbands (collections.abc.Sequence[str | collections.abc.Sequence[float]] | None, optional) – Optional period-band selection. When omitted, the active metric settings are used.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional preprocessing applied before QC. When omitted and a processed waveform column is used, no extra filtering is applied.

  • apply_config_preprocessing_to_processed_files (bool, optional) – Whether processed waveform columns should still use config preprocessing during QC.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active config is used.

  • min_record_length_s (float | None, optional) – Optional override for the minimum record length in seconds. When omitted, the value is read from qc.automatic in the config.

  • min_end_after_origin_s (float | None, optional) – Optional override for the minimum trace end time relative to event origin. When omitted, the value is read from qc.automatic in the config.

  • snr_threshold (float | None, optional) – Optional override for the minimum signal-to-noise ratio. When omitted, the value is read from qc.automatic in the config.

  • arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog used to anchor QC windows.

  • onset_phase (str, optional) – Pick phase used as the QC onset when available.

  • min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick.

  • verbose (bool, optional) – Print progress messages while loading waveforms and building QC rows.

  • progress_interval (int, optional) – Number of event-station records between progress messages.

  • checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where the combined waveform QC summary is written. Per-source intermediate checkpoints are written next to this path.

  • resume (bool, optional) – When true, existing per-source checkpoints are used to skip completed event/station/component groups.

  • checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.

  • return_result (bool, optional) – When false, write per-source checkpoints and combine them on disk into checkpoint_path instead of keeping all source QC rows in memory. This is intended for Slurm workers on large inventories.

Returns:

Side-specific waveform QC rows that can be passed to build_metric_qc_summary(trace_qc_summary=...).

Return type:

pandas.DataFrame
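
To make the hand-off to build_metric_qc_summary(trace_qc_summary=...) concrete, the sketch below shows one way failed waveform rows could fail the matching metric rows. It is an illustration only: the helper name apply_trace_failures and the boolean passed field are assumptions, and dictionaries stand in for table rows.

```python
MATCH_FIELDS = ("source", "event_id", "station", "component", "passband")


def apply_trace_failures(metric_rows, trace_rows):
    """Fail metric rows whose matching waveform QC row failed.

    Rows are matched on source/event/station/component/passband. The
    `passed` field is a hypothetical boolean QC flag.
    """
    failed_keys = {
        tuple(row[field] for field in MATCH_FIELDS)
        for row in trace_rows
        if not row["passed"]
    }
    updated_rows = []
    for row in metric_rows:
        updated = dict(row)  # copy so the input rows are left untouched
        if tuple(row[field] for field in MATCH_FIELDS) in failed_keys:
            updated["passed"] = False
        updated_rows.append(updated)
    return updated_rows


shared = {"event_id": "ev1", "station": "STA1", "component": "Z", "passband": "2-5s"}
trace_rows = [
    {**shared, "source": "observed", "passed": False},
    {**shared, "source": "synthetic", "passed": True},
]
metric_rows = [
    {**shared, "source": "observed", "metric": "pgv", "passed": True},
    {**shared, "source": "synthetic", "metric": "pgv", "passed": True},
]
propagated = apply_trace_failures(metric_rows, trace_rows)
```

The observed metric row fails because its waveform row failed, while the synthetic metric row is unchanged.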

spatial_vtk.qc.build_waveform_trace_qc_summary(event_station_records: DataFrame | str | Path, *, source: str = 'observed', waveform_path_col: str = 'observed_pickle', components: tuple[str, ...] | list[str] = ('Z',), passbands: tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None = None, preprocessing: WaveformPreprocessing | None = None, min_record_length_s: float = 60.0, min_end_after_origin_s: float = 60.0, snr_threshold: float = 3.0, noise_window_min_s: float = 1.0, signal_window_min_s: float = 10.0, noise_gap_s: float = 0.5, signal_gap_s: float = 0.5, origin_tolerance_s: float = 0.5, pre_origin_signal_ratio_threshold: float = 0.5, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25) DataFrame

Build side-specific trace QC rows from waveform files.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station records with event IDs, station codes, event origin times, and waveform paths.

  • source (str, optional) – Source label copied to the output, usually "observed" or "synthetic".

  • waveform_path_col (str, optional) – Column containing waveform files for this source.

  • components (tuple[str, ...] | list[str], optional) – Components to inspect.

  • passbands (tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None, optional) – Period bands to report. When omitted, the public inventory standard bands are used.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional waveform preprocessing applied before QC calculations.

  • min_record_length_s (float, optional) – Minimum trace duration in seconds.

  • min_end_after_origin_s (float, optional) – Minimum required trace end time relative to event origin.

  • snr_threshold (float, optional) – Minimum RMS signal-to-noise ratio.

  • noise_window_min_s (float, optional) – Minimum noise-window length in seconds.

  • signal_window_min_s (float, optional) – Minimum signal-window length in seconds.

  • noise_gap_s (float, optional) – Gap between detected onset and the noise window.

  • signal_gap_s (float, optional) – Gap between detected onset and the signal window.

  • origin_tolerance_s (float, optional) – Half-width of the origin energy check window.

  • pre_origin_signal_ratio_threshold (float, optional) – Maximum origin/pre-origin to signal RMS ratio.

  • arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog. When a finite pick exists for the requested onset phase, QC uses it as the signal onset; otherwise QC falls back to the waveform-envelope onset.

  • onset_phase (str, optional) – Pick phase used to anchor QC noise and signal windows.

  • min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick.

  • verbose (bool, optional) – Print progress messages while loading waveform files and building rows.

  • progress_interval (int, optional) – Number of event-station records between progress messages.

  • checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written.

  • resume (bool, optional) – When true and checkpoint_path exists, skip event/station/component groups already present in that checkpoint.

  • checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.

Returns:

Metric-QC-compatible rows with one row per source/event/station/component/passband.

Return type:

pandas.DataFrame
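
The RMS signal-to-noise check can be pictured with the simplified sketch below. It assumes fixed-length windows placed around the onset with the documented gaps; the helper names (rms, rms_snr) and the noise_window_s / signal_window_s parameters are hypothetical, and the actual window selection in spatial_vtk may differ.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sample sequence."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_snr(samples, dt_s, onset_s, *, noise_window_s=1.0, signal_window_s=10.0,
            noise_gap_s=0.5, signal_gap_s=0.5):
    """Return signal RMS divided by noise RMS, or None if a window is unusable.

    Times are seconds from the trace start. The noise window ends
    `noise_gap_s` before the onset; the signal window starts
    `signal_gap_s` after it.
    """
    def window(start_s, end_s):
        # Clamp both indices to the trace so out-of-range windows come back empty.
        first = max(0, min(len(samples), round(start_s / dt_s)))
        last = max(0, min(len(samples), round(end_s / dt_s)))
        return samples[first:last]

    noise_end_s = onset_s - noise_gap_s
    signal_start_s = onset_s + signal_gap_s
    noise = window(noise_end_s - noise_window_s, noise_end_s)
    signal = window(signal_start_s, signal_start_s + signal_window_s)
    if not noise or not signal:
        return None
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return None
    return rms(signal) / noise_rms


# 10 s of unit-amplitude noise followed by 20 s of amplitude-4 signal at 10 Hz.
trace_samples = [1.0] * 100 + [4.0] * 200
snr = rms_snr(trace_samples, dt_s=0.1, onset_s=10.0)
passes_snr = snr is not None and snr >= 3.0
```

With these synthetic samples the ratio is 4, which clears the default snr_threshold of 3.0.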

spatial_vtk.qc.classify_station_family(network: str, station: str) str

Classify one station as broadband, strong-motion, or unknown.

Parameters:
  • network (str) – Required function argument.

  • station (str) – Required function argument.

Returns:

Return value produced by the function.

Return type:

str

spatial_vtk.qc.companion_rows_from_master(master_rows: list[dict[str, object]] | DataFrame, inventory_bands: list[tuple[str, float, float]] | None = None) list[dict[str, object]]

Build per-event QC companion rows from master inventory rows.

Parameters:
  • master_rows (list[dict[str, object]] | pandas.DataFrame) – Full master inventory rows.

  • inventory_bands (list[tuple[str, float, float]] | None, optional) – Inventory passbands as (label, period_min, period_max) tuples.

Returns:

Per-event, per-variant, per-band summaries with distance-bin counts.

Return type:

list of dict

spatial_vtk.qc.determine_available_components(stream: Any, station: str, components: tuple[str, ...] = ('N', 'E', 'Z', 'R', 'T')) list[str]

Determine which requested components are available for one station.

Parameters:
  • stream (Any) – ObsPy stream-like iterable with trace stats.station and stats.channel attributes.

  • station (str) – Station code to inspect.

  • components (tuple, optional) – Component suffixes to search for.

Returns:

Components present in the stream for the station.

Return type:

list of str
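
A minimal sketch of the component search, assuming that the component is the last character of stats.channel and that results follow the order of the requested components. Both points are assumptions, the helper name available_components is hypothetical, and SimpleNamespace objects stand in for ObsPy traces.

```python
from types import SimpleNamespace


def available_components(stream, station, components=("N", "E", "Z", "R", "T")):
    """List requested components that have at least one trace for a station."""
    found = set()
    for trace in stream:
        if trace.stats.station != station:
            continue
        channel = trace.stats.channel
        # Assumption: the component code is the final channel character.
        if channel and channel[-1] in components:
            found.add(channel[-1])
    return [component for component in components if component in found]


def make_trace(station, channel):
    """Build a stand-in object exposing stats.station and stats.channel."""
    return SimpleNamespace(stats=SimpleNamespace(station=station, channel=channel))


example_stream = [
    make_trace("STA1", "HHZ"),
    make_trace("STA1", "HHN"),
    make_trace("STA2", "HHE"),
]
sta1_components = available_components(example_stream, "STA1")
```

For STA1 this returns the N and Z components; the E trace belongs to a different station and is ignored.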

spatial_vtk.qc.discover_event_ids(*roots: str | Path) list[str]

Discover candidate event IDs from one or more observed-data roots.

Parameters:
  • *roots (str | pathlib.Path) – Directories containing event files or event subdirectories.

Returns:

Sorted unique event identifiers.

Return type:

list of str

spatial_vtk.qc.export_manual_review_queue(qc_summary: DataFrame | str | Path, output_path: str | Path | None = None, *, cfg: SpatialVTKConfig | None = None) Path

Write a manual-review queue from QC summary rows.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.

  • output_path (str | pathlib.Path | None, optional) – CSV or JSON output path. When omitted, the standard manual_review_queue output path is resolved from the active config.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used when resolving the default output path.

Returns:

Written queue path.

Return type:

pathlib.Path

spatial_vtk.qc.filter_trace_summary(df: DataFrame, *, event_id: str | None = None, station: str | None = None, component: str | None = None, accepted: bool | None = None, reject_reason_contains: str | None = None) DataFrame

Filter a trace-summary table using common review fields.

Parameters:
  • df (pandas.DataFrame) – Required function argument.

  • event_id (str | None, optional) – Optional function argument. Defaults to None.

  • station (str | None, optional) – Optional function argument. Defaults to None.

  • component (str | None, optional) – Optional function argument. Defaults to None.

  • accepted (bool | None, optional) – Optional function argument. Defaults to None.

  • reject_reason_contains (str | None, optional) – Optional function argument. Defaults to None.

Returns:

Return value produced by the function.

Return type:

pandas.DataFrame

spatial_vtk.qc.global_trace_reject_reasons(*, record_length_s: float, end_rel_s: float, onset_reasons: list[str], min_end_after_origin_s: float, min_record_length_s: float) tuple[bool, list[str]]

Apply station-level reject rules shared by every passband.

Parameters:
  • record_length_s (float) – Required function argument.

  • end_rel_s (float) – Required function argument.

  • onset_reasons (list) – Required function argument.

  • min_end_after_origin_s (float) – Required function argument.

  • min_record_length_s (float) – Required function argument.

Returns:

Return value produced by the function.

Return type:

tuple
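
The reference above does not spell out the individual rules, so the following is only a plausible sketch of station-level checks built from the documented arguments. The function name, the reason strings, and the convention that the returned flag means "rejected" are all assumptions.

```python
def global_reject_sketch(*, record_length_s, end_rel_s, onset_reasons,
                         min_end_after_origin_s, min_record_length_s):
    """Collect station-level reject reasons shared by every passband.

    Returns (rejected, reasons). The reason labels are illustrative only.
    """
    reasons = list(onset_reasons)  # carry over any onset-detection problems
    if record_length_s < min_record_length_s:
        reasons.append("record_too_short")
    if end_rel_s < min_end_after_origin_s:
        reasons.append("ends_too_early_after_origin")
    return bool(reasons), reasons


short_rejected, short_reasons = global_reject_sketch(
    record_length_s=45.0,
    end_rel_s=80.0,
    onset_reasons=[],
    min_end_after_origin_s=60.0,
    min_record_length_s=60.0,
)
```

A 45 s record fails the 60 s minimum length, so it is rejected with a single reason.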

spatial_vtk.qc.load_trace_inventory_lookup(csv_path: Path | str | None) TraceInventoryLookup

Load one master inventory CSV into a normalized lookup dictionary.

Parameters:

csv_path (pathlib.Path | str | None) – Required function argument.

Returns:

Return value produced by the function.

Return type:

spatial_vtk.qc.build.filtering.TraceInventoryLookup

spatial_vtk.qc.queue_rows_from_filtered_trace_df(df: DataFrame, *, key_columns: tuple[str, ...] = ('event_id', 'station', 'component'), status: str = 'pending') list[dict[str, object]]

Convert a filtered trace table into manual-review queue rows.

Parameters:
  • df (pandas.DataFrame) – Required function argument.

  • key_columns (tuple, optional) – Optional function argument. Defaults to ('event_id', 'station', 'component').

  • status (str, optional) – Optional function argument. Defaults to 'pending'.

Returns:

Return value produced by the function.

Return type:

list
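
One way to picture the conversion is shown below: each filtered trace row contributes one queue entry made of its key columns plus a status. The helper name queue_rows_sketch is hypothetical, dictionaries stand in for DataFrame rows, and dropping duplicate keys is an assumption rather than documented behavior.

```python
def queue_rows_sketch(rows, key_columns=("event_id", "station", "component"),
                      status="pending"):
    """Build review-queue entries from filtered trace rows.

    Each entry keeps only the key columns and adds a `status` field.
    Repeated keys are emitted once (assumption).
    """
    seen = set()
    queue = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key in seen:
            continue
        seen.add(key)
        entry = dict(zip(key_columns, key))
        entry["status"] = status
        queue.append(entry)
    return queue


filtered_rows = [
    {"event_id": "ev1", "station": "STA1", "component": "Z", "passband": "2-5s"},
    {"event_id": "ev1", "station": "STA1", "component": "Z", "passband": "5-10s"},
    {"event_id": "ev1", "station": "STA2", "component": "Z", "passband": "2-5s"},
]
review_queue = queue_rows_sketch(filtered_rows)
```

The two passband rows for STA1 collapse into one queue entry because they share the same event, station, and component.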

spatial_vtk.qc.reject_passband(*, global_reasons: list[str], snr_rms: float, snr_threshold: float, noise_window_valid: bool, signal_window_valid: bool, pre_origin_window_valid: bool, pre_origin_signal_ratio: float, pre_origin_signal_ratio_threshold: float, origin_window_valid: bool, origin_signal_ratio: float) tuple[bool, list[str]]

Apply passband-specific reject rules and merge shared global reasons.

Parameters:
  • global_reasons (list) – Required function argument.

  • snr_rms (float) – Required function argument.

  • snr_threshold (float) – Required function argument.

  • noise_window_valid (bool) – Required function argument.

  • signal_window_valid (bool) – Required function argument.

  • pre_origin_window_valid (bool) – Required function argument.

  • pre_origin_signal_ratio (float) – Required function argument.

  • pre_origin_signal_ratio_threshold (float) – Required function argument.

  • origin_window_valid (bool) – Required function argument.

  • origin_signal_ratio (float) – Required function argument.

Returns:

Return value produced by the function.

Return type:

tuple
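
As with the station-level rules, the exact passband rules are not documented here. The sketch below covers only a subset of the arguments (the window validity flags and the SNR threshold) and shows how passband reasons might be merged with the shared global reasons. The function name, reason strings, and the "rejected" meaning of the returned flag are assumptions.

```python
import math


def reject_passband_sketch(*, global_reasons, snr_rms, snr_threshold,
                           noise_window_valid, signal_window_valid):
    """Merge global reasons with passband-specific SNR checks.

    Returns (rejected, reasons). The reason labels are illustrative only.
    """
    reasons = list(global_reasons)
    if not noise_window_valid:
        reasons.append("invalid_noise_window")
    if not signal_window_valid:
        reasons.append("invalid_signal_window")
    # Only judge the SNR value when both windows were usable.
    if noise_window_valid and signal_window_valid:
        if not math.isfinite(snr_rms) or snr_rms < snr_threshold:
            reasons.append("low_snr")
    return bool(reasons), reasons


band_rejected, band_reasons = reject_passband_sketch(
    global_reasons=["record_too_short"],
    snr_rms=2.0,
    snr_threshold=3.0,
    noise_window_valid=True,
    signal_window_valid=True,
)
```

The passband is rejected for two reasons: the inherited global reason and the SNR falling below the threshold.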

spatial_vtk.qc.trace_passband_is_accepted(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, component: str, passband_label: str | None = None, period_min: float | None = None, period_max: float | None = None) bool

Return whether one trace is accepted for one passband request.

Parameters:
  • lookup (dict) – Required function argument.

  • observed_variant (str) – Required function argument.

  • event_id (str) – Required function argument.

  • station (str) – Required function argument.

  • component (str) – Required function argument.

  • passband_label (str | None, optional) – Optional function argument. Defaults to None.

  • period_min (float | None, optional) – Optional function argument. Defaults to None.

  • period_max (float | None, optional) – Optional function argument. Defaults to None.

Returns:

Return value produced by the function.

Return type:

bool

Build

Inventory lookup and filtering helpers for QC workflows.

class spatial_vtk.qc.build.filtering.InventoryBandSpec(label: str, period_min: float, period_max: float)

Describe one passband present in a trace-inventory CSV.

class spatial_vtk.qc.build.filtering.TraceInventoryLookup(rows: dict[tuple[str, str, str, str, str], dict[str, str]] | None = None, *, csv_path: Path | None = None, available_bands: list[InventoryBandSpec] | None = None, disabled_requested_bands: set[str] | None = None)

Dictionary lookup with trace-inventory metadata.

spatial_vtk.qc.build.filtering.band_key_from_label(label: str) str

Convert a period-band label to a stable CSV column suffix.

Parameters:

label (str) – Period-band label to convert.

Returns:

Stable CSV column suffix for the label.

Return type:

str

spatial_vtk.qc.build.filtering.band_label_from_key(key: str) str | None

Convert one inventory CSV suffix to a period-band label.

Parameters:

key (str) – Inventory CSV column suffix to convert.

Returns:

Period-band label for the suffix, or None.

Return type:

str | None

spatial_vtk.qc.build.filtering.event_station_has_any_accepted_component(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, components: tuple[str, ...] | list[str], period_min: float | None = None, period_max: float | None = None) bool

Return whether any requested component survives inventory QC.

Parameters:
  • lookup (dict) – Inventory lookup keyed by (observed_variant, event_id, station, component, passband_label) tuples.

  • observed_variant (str) – Observed-variant label.

  • event_id (str) – Event identifier.

  • station (str) – Station code.

  • components (tuple[str, ...] | list[str]) – Requested component codes.

  • period_min (float | None, optional) – Minimum period of the requested band. Defaults to None.

  • period_max (float | None, optional) – Maximum period of the requested band. Defaults to None.

Returns:

True when at least one requested component survives inventory QC.

Return type:

bool

spatial_vtk.qc.build.filtering.filter_stream_by_inventory(stream: Any, lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str) Any

Filter a stream to traces with at least one accepted inventory band.

Parameters:
  • stream (Any) – Stream to filter.

  • lookup (dict) – Inventory lookup keyed by (observed_variant, event_id, station, component, passband_label) tuples.

  • observed_variant (str) – Observed-variant label.

  • event_id (str) – Event identifier.

Returns:

Stream restricted to traces with at least one accepted inventory band.

Return type:

Any

spatial_vtk.qc.build.filtering.inventory_lookup_key(observed_variant: str, event_id: str, station: str, component: str, passband_label: str) tuple[str, str, str, str, str]

Build one normalized inventory key tuple.

Parameters:
  • observed_variant (str) – Observed-variant label.

  • event_id (str) – Event identifier.

  • station (str) – Station code.

  • component (str) – Component code.

  • passband_label (str) – Passband label.

Returns:

Normalized (observed_variant, event_id, station, component, passband_label) key.

Return type:

tuple

spatial_vtk.qc.build.filtering.load_trace_inventory_lookup(csv_path: Path | str | None) TraceInventoryLookup

Load one master inventory CSV into a normalized lookup dictionary.

Parameters:

csv_path (pathlib.Path | str | None) – Path to the master inventory CSV.

Returns:

Normalized inventory lookup loaded from the CSV.

Return type:

spatial_vtk.qc.build.filtering.TraceInventoryLookup

spatial_vtk.qc.build.filtering.normalize_band_label(label: str | None) str

Normalize one inventory band label.

Parameters:

label (str | None) – Inventory band label to normalize.

Returns:

Normalized band label.

Return type:

str

spatial_vtk.qc.build.filtering.normalize_component(component: str | None) str

Normalize one component code.

Parameters:

component (str | None) – Component code to normalize.

Returns:

Normalized component code.

Return type:

str

spatial_vtk.qc.build.filtering.normalize_event_id(event_id: object) str

Normalize one event identifier for inventory lookup keys.

Parameters:

event_id (object) – Event identifier to normalize.

Returns:

Normalized event identifier for use in inventory lookup keys.

Return type:

str

spatial_vtk.qc.build.filtering.normalize_observed_variant(text: str | None) str

Normalize one observed-variant label.

Parameters:

text (str | None) – Observed-variant label to normalize.

Returns:

Normalized observed-variant label.

Return type:

str

spatial_vtk.qc.build.filtering.normalize_station_code(station: object) str

Normalize one station code for inventory lookup keys.

Parameters:

station (object) – Station code to normalize.

Returns:

Normalized station code for use in inventory lookup keys.

Return type:

str

spatial_vtk.qc.build.filtering.parse_period_band_label(label: str) tuple[float, float] | None

Parse a period-band label into (period_min, period_max).

Parameters:

label (str) – Period-band label to parse.

Returns:

(period_min, period_max) parsed from the label, or None.

Return type:

tuple[float, float] | None

spatial_vtk.qc.build.filtering.period_band_label(period_min: float | None, period_max: float | None) str | None

Return the canonical period-band label for one requested band.

Parameters:
  • period_min (float | None) – Minimum period of the requested band.

  • period_max (float | None) – Maximum period of the requested band.

Returns:

Canonical period-band label for the requested band, or None.

Return type:

str | None
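
A sketch of the label round trip performed by period_band_label() and parse_period_band_label() may help. The label format used here ("2-5s") and both function names are hypothetical; the canonical format used by the library is not stated in this reference.

```python
import re

# Assumed label format for illustration only: "<min>-<max>s", e.g. "2-5s".
_LABEL_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*s\s*$")


def period_band_label_sketch(period_min, period_max):
    """Format a band label, or return None when either bound is missing."""
    if period_min is None or period_max is None:
        return None
    return f"{period_min:g}-{period_max:g}s"


def parse_period_band_label_sketch(label):
    """Parse a label into (period_min, period_max), or None if it does not match."""
    match = _LABEL_RE.match(label)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


label = period_band_label_sketch(2.0, 5.0)
bounds = parse_period_band_label_sketch(label)
```

Returning None instead of raising lets callers treat unknown labels as "no band" without wrapping each call in error handling, which matches the optional return types in the signatures above.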

spatial_vtk.qc.build.filtering.relevant_inventory_bands(period_min: float | None, period_max: float | None, *, available_bands: list[InventoryBandSpec] | None = None) list[str]

Resolve which inventory bands are relevant for one requested band.

Parameters:
  • period_min (float | None) – Minimum period of the requested band.

  • period_max (float | None) – Maximum period of the requested band.

  • available_bands (list[spatial_vtk.qc.build.filtering.InventoryBandSpec] | None, optional) – Inventory bands available for matching. Defaults to None.

Returns:

Labels of the inventory bands relevant to the requested band.

Return type:

list

spatial_vtk.qc.build.filtering.trace_has_any_accepted_passband(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, component: str) bool

Return whether one trace has at least one accepted inventory passband.

Parameters:
  • lookup (dict) – Inventory lookup keyed by (observed_variant, event_id, station, component, passband_label) tuples.

  • observed_variant (str) – Observed-variant label.

  • event_id (str) – Event identifier.

  • station (str) – Station code.

  • component (str) – Component code.

Returns:

True when the trace has at least one accepted inventory passband.

Return type:

bool

spatial_vtk.qc.build.filtering.trace_passband_is_accepted(lookup: dict[tuple[str, str, str, str, str], dict[str, str]], *, observed_variant: str, event_id: str, station: str, component: str, passband_label: str | None = None, period_min: float | None = None, period_max: float | None = None) bool

Return whether one trace is accepted for one passband request.

Parameters:
  • lookup (dict) – Inventory lookup keyed by (observed_variant, event_id, station, component, passband_label) tuples.

  • observed_variant (str) – Observed-variant label.

  • event_id (str) – Event identifier.

  • station (str) – Station code.

  • component (str) – Component code.

  • passband_label (str | None, optional) – Passband label to check. Defaults to None.

  • period_min (float | None, optional) – Minimum period of the requested band. Defaults to None.

  • period_max (float | None, optional) – Maximum period of the requested band. Defaults to None.

Returns:

True when the trace is accepted for the requested passband.

Return type:

bool

Inventory helpers for quality-control dataset construction.

spatial_vtk.qc.build.inventory.build_trace_inventory(event_streams: dict[str, Any] | list[dict[str, Any]] | DataFrame, *, observed_variant: str = 'nonrotated', inventory_bands: list[tuple[str, float, float]] | None = None, min_record_length_s: float = 80.0, min_end_after_origin_s: float = 60.0, station_metadata: DataFrame | None = None, event_metadata: DataFrame | None = None) DataFrame

Build a QC trace inventory from waveform streams or trace metadata.

Parameters:
  • event_streams (dict[str, Any] | list[dict[str, Any]] | pandas.DataFrame) – Mapping from event ID to stream-like objects, list of records with event_id and stream fields, or precomputed trace metadata.

  • observed_variant (str, optional) – Label for the observed-data variant.

  • inventory_bands (list[tuple[str, float, float]] | None, optional) – Inventory passbands as (label, period_min, period_max) tuples.

  • min_record_length_s (float, optional) – Minimum accepted trace length in seconds.

  • min_end_after_origin_s (float, optional) – Minimum accepted trace end time relative to origin.

  • station_metadata (pandas.DataFrame | None, optional) – Optional station metadata table joined into the inventory.

  • event_metadata (pandas.DataFrame | None, optional) – Optional event metadata table joined into the inventory.

Returns:

One row per trace with passband reject flags and reject reasons.

Return type:

pandas.DataFrame

spatial_vtk.qc.build.inventory.build_waveform_trace_qc_summary(event_station_records: DataFrame | str | Path, *, source: str = 'observed', waveform_path_col: str = 'observed_pickle', components: tuple[str, ...] | list[str] = ('Z',), passbands: tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None = None, preprocessing: WaveformPreprocessing | None = None, min_record_length_s: float = 60.0, min_end_after_origin_s: float = 60.0, snr_threshold: float = 3.0, noise_window_min_s: float = 1.0, signal_window_min_s: float = 10.0, noise_gap_s: float = 0.5, signal_gap_s: float = 0.5, origin_tolerance_s: float = 0.5, pre_origin_signal_ratio_threshold: float = 0.5, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25) DataFrame

Build side-specific trace QC rows from waveform files.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station records with event IDs, station codes, event origin times, and waveform paths.

  • source (str, optional) – Source label copied to the output, usually "observed" or "synthetic".

  • waveform_path_col (str, optional) – Column containing waveform files for this source.

  • components (tuple[str, ...] | list[str], optional) – Components to inspect.

  • passbands (tuple[str | tuple[float, float], ...] | list[str | tuple[float, float]] | None, optional) – Period bands to report. When omitted, the public inventory standard bands are used.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional waveform preprocessing applied before QC calculations.

  • min_record_length_s (float, optional) – Minimum trace duration in seconds.

  • min_end_after_origin_s (float, optional) – Minimum required trace end time relative to event origin.

  • snr_threshold (float, optional) – Minimum RMS signal-to-noise ratio.

  • noise_window_min_s (float, optional) – Minimum noise-window length in seconds.

  • signal_window_min_s (float, optional) – Minimum signal-window length in seconds.

  • noise_gap_s (float, optional) – Gap between detected onset and the noise window.

  • signal_gap_s (float, optional) – Gap between detected onset and the signal window.

  • origin_tolerance_s (float, optional) – Half-width of the origin energy check window.

  • pre_origin_signal_ratio_threshold (float, optional) – Maximum origin/pre-origin to signal RMS ratio.

  • arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog. When a finite pick exists for the requested onset phase, QC uses it as the signal onset; otherwise QC falls back to the waveform-envelope onset.

  • onset_phase (str, optional) – Pick phase used to anchor QC noise and signal windows.

  • min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick.

  • verbose (bool, optional) – Print progress messages while loading waveform files and building rows.

  • progress_interval (int, optional) – Number of event-station records between progress messages.

  • checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written.

  • resume (bool, optional) – When true and checkpoint_path exists, skip event/station/component groups already present in that checkpoint.

  • checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.

Returns:

Metric-QC-compatible rows with one row per source/event/station/component/passband.

Return type:

pandas.DataFrame
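
The checkpoint_path, resume, and checkpoint_interval options describe a resumable loop. The sketch below shows one way such a loop can be structured, using a CSV file and the standard library only. The function run_with_checkpoint_sketch, the compute_row callback, and the on-disk layout are assumptions for illustration and do not describe the library's checkpoint format.

```python
import csv
from pathlib import Path


def run_with_checkpoint_sketch(records, compute_row, checkpoint_path, *, resume=True, checkpoint_interval=25):
    """Hypothetical resumable loop keyed on (event_id, station)."""
    path = Path(checkpoint_path)
    rows = []
    if resume and path.exists():
        # Rows read back from CSV are all strings; good enough for a sketch.
        with path.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
    done = {(row["event_id"], row["station"]) for row in rows}

    def write_checkpoint():
        if not rows:
            return
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)

    pending = 0
    for record in records:
        if (record["event_id"], record["station"]) in done:
            continue  # already present in the checkpoint
        rows.append(compute_row(record))
        pending += 1
        if pending >= checkpoint_interval:
            write_checkpoint()
            pending = 0
    write_checkpoint()
    return rows
```

Writing every checkpoint_interval records bounds the work lost on interruption while avoiding a file write per record.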

spatial_vtk.qc.build.inventory.companion_rows_from_master(master_rows: list[dict[str, object]] | DataFrame, inventory_bands: list[tuple[str, float, float]] | None = None) list[dict[str, object]]

Build per-event QC companion rows from master inventory rows.

Parameters:
  • master_rows (list[dict[str, object]] | pandas.DataFrame) – Full master inventory rows.

  • inventory_bands (list[tuple[str, float, float]] | None, optional) – Inventory passbands as (label, period_min, period_max) tuples.

Returns:

Per-event, per-variant, per-band summaries with distance-bin counts.

Return type:

list of dict

spatial_vtk.qc.build.inventory.determine_available_components(stream: Any, station: str, components: tuple[str, ...] = ('N', 'E', 'Z', 'R', 'T')) list[str]

Determine which requested components are available for one station.

Parameters:
  • stream (Any) – ObsPy stream-like iterable with trace stats.station and stats.channel attributes.

  • station (str) – Station code to inspect.

  • components (tuple, optional) – Component suffixes to search for.

Returns:

Components present in the stream for the station.

Return type:

list of str
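
As a rough illustration of the matching described above, the sketch below scans trace stats.station and stats.channel and treats the last character of the channel code as the component suffix. The function name, the suffix rule, and the output ordering (requested order) are assumptions; SimpleNamespace objects stand in for ObsPy traces.

```python
from types import SimpleNamespace


def determine_available_components_sketch(stream, station, components=("N", "E", "Z", "R", "T")):
    """Hypothetical stand-in: match the last channel character against components."""
    present = set()
    for trace in stream:
        if trace.stats.station != station:
            continue
        channel = str(trace.stats.channel)
        if channel:
            present.add(channel[-1].upper())
    # Assumption: results follow the order of the requested components.
    return [component for component in components if component in present]


def _fake_trace(station, channel):
    # Minimal object exposing the two attributes the sketch reads.
    return SimpleNamespace(stats=SimpleNamespace(station=station, channel=channel))


stream = [_fake_trace("STA1", "HHZ"), _fake_trace("STA1", "HHN"), _fake_trace("STA2", "HHE")]
available = determine_available_components_sketch(stream, "STA1")
```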

spatial_vtk.qc.build.inventory.discover_event_ids(*roots: str | Path) list[str]

Discover candidate event IDs from one or more observed-data roots.

Parameters:
  • *roots (str | pathlib.Path) – Directories containing event files or event subdirectories.

Returns:

Sorted unique event identifiers.

Return type:

list of str
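
A minimal sketch of this kind of discovery, using only pathlib, is shown below. How the library derives an event ID from a path is not documented here, so the rule used (directory name, or file name without extension) and the decision to skip missing roots are assumptions.

```python
from pathlib import Path


def discover_event_ids_sketch(*roots):
    """Hypothetical stand-in: collect sorted unique IDs from direct children of each root."""
    event_ids = set()
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # assumption: roots that do not exist are skipped
        for entry in root_path.iterdir():
            # Assumption: a subdirectory name or a file stem is the event ID.
            event_ids.add(entry.name if entry.is_dir() else entry.stem)
    return sorted(event_ids)
```

Collecting into a set before sorting removes duplicates when the same event appears under more than one root.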

Spectral QC helpers for PSA and FAS period support.

Spectral Overview

This module evaluates which spectral periods are usable for observed and synthetic traces without requiring a pre-event noise window. It uses relative spectral amplitude, physical period support, and synthetic max-frequency limits.
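
The two non-amplitude checks can be sketched as follows. This is an interpretation of the parameter descriptions in this section rather than the library code: a period is taken to have physical support when min_cycles_in_record full cycles fit in the record, and a synthetic period fails when it is shorter than 1 / synthetic_max_frequency_hz. The function name and reason strings are hypothetical.

```python
def period_support_sketch(
    periods_s,
    *,
    n_samples,
    dt,
    min_cycles_in_record=3.0,
    synthetic_max_frequency_hz=None,
    source="observed",
):
    """Hypothetical per-period support check; reason strings are assumptions."""
    record_length_s = n_samples * dt
    rows = []
    for period in periods_s:
        reasons = []
        # Physical support: enough full cycles must fit inside the record.
        if period * min_cycles_in_record > record_length_s:
            reasons.append("too_few_cycles_in_record")
        # Synthetic limit: periods shorter than 1 / f_max are not resolved.
        if source == "synthetic" and synthetic_max_frequency_hz is not None:
            if period < 1.0 / synthetic_max_frequency_hz:
                reasons.append("above_synthetic_max_frequency")
        rows.append({"period_s": float(period), "qc_pass": not reasons, "reasons": reasons})
    return rows


rows = period_support_sketch(
    [0.5, 1, 2, 5],
    n_samples=500,
    dt=0.02,
    synthetic_max_frequency_hz=1.0,
    source="synthetic",
)
```

With a 10 s record, the 5 s period fails the cycle check (three cycles need 15 s) and the 0.5 s period fails the synthetic frequency limit, leaving 1 s and 2 s as supported.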

Spectral Examples

Find valid FAS periods for a synthetic trace:

from spatial_vtk.qc.build.spectral import qc_fas_periods

qc = qc_fas_periods(trace, dt=0.02, periods_s=[1, 2, 5], synthetic_max_frequency_hz=1.0, source="synthetic")

spatial_vtk.qc.build.spectral.qc_fas_periods(trace: Any, *, dt: float, periods_s: Sequence[float], threshold: float = 0.25, min_cycles_in_record: float = 3.0, synthetic_max_frequency_hz: float | None = None, source: str = 'observed', disable_relative_amplitude_qc: bool = False) DataFrame

QC FAS values on a requested period grid.

Parameters:
  • trace (Any) – Waveform samples or trace-like object.

  • dt (float) – Sample interval in seconds.

  • periods_s (Sequence) – Requested periods.

  • threshold (float, optional) – Relative spectral support threshold.

  • min_cycles_in_record (float, optional) – Minimum cycles required in the record.

  • synthetic_max_frequency_hz (float | None, optional) – Optional synthetic maximum valid frequency.

  • source (str, optional) – "observed" or "synthetic" for status reasons.

  • disable_relative_amplitude_qc (bool, optional) – Whether to skip relative amplitude support.

Returns:

Period-level QC rows with FAS amplitudes and pass/fail status.

Return type:

pandas.DataFrame

spatial_vtk.qc.build.spectral.qc_psa_periods(trace: Any, *, dt: float, periods_s: Sequence[float], threshold: float = 0.25, damping: float = 0.05, min_cycles_in_record: float = 3.0, synthetic_max_frequency_hz: float | None = None, source: str = 'observed', disable_relative_amplitude_qc: bool = False) DataFrame

QC PSA values on a requested period grid.

Parameters:
  • trace (Any) – Waveform samples or trace-like object.

  • dt (float) – Sample interval in seconds.

  • periods_s (Sequence) – Requested periods.

  • threshold (float, optional) – Relative spectral support threshold. Defaults to 0.25.

  • damping (float, optional) – Oscillator damping ratio used for the response spectrum. Defaults to 0.05.

  • min_cycles_in_record (float, optional) – Minimum cycles required in the record. Defaults to 3.0.

  • synthetic_max_frequency_hz (float | None, optional) – Optional synthetic maximum valid frequency. Defaults to None.

  • source (str, optional) – "observed" or "synthetic" for status reasons. Defaults to 'observed'.

  • disable_relative_amplitude_qc (bool, optional) – Whether to skip relative amplitude support. Defaults to False.

Returns:

Period-level QC rows with PSA amplitudes and pass/fail status.

Return type:

pandas.DataFrame

spatial_vtk.qc.build.spectral.spectral_relative_amplitude_mask(periods_s: Sequence[float], amplitudes: Sequence[float], *, threshold: float = 0.25, min_period_s: float | None = None, max_period_s: float | None = None, disable_relative_amplitude_qc: bool = False) ndarray

Return valid periods based on relative spectral amplitude.

Parameters:
  • periods_s (Sequence) – Period grid in seconds.

  • amplitudes (Sequence) – Spectral amplitudes aligned to periods_s.

  • threshold (float, optional) – Minimum fraction of the maximum finite amplitude required.

  • min_period_s (float | None, optional) – Optional hard lower period bound.

  • max_period_s (float | None, optional) – Optional hard upper period bound.

  • disable_relative_amplitude_qc (bool, optional) – Whether to skip the relative-amplitude threshold.

Returns:

Boolean validity mask.

Return type:

numpy.ndarray
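
The thresholding described above can be sketched in pure Python as below. The real function returns a numpy.ndarray; this stand-in returns a list to stay dependency-free. Treating non-finite amplitudes as invalid even when the relative check is disabled is an assumption, as is the function name.

```python
import math


def relative_amplitude_mask_sketch(
    periods_s,
    amplitudes,
    *,
    threshold=0.25,
    min_period_s=None,
    max_period_s=None,
    disable_relative_amplitude_qc=False,
):
    """Hypothetical stand-in returning a list of booleans aligned to periods_s."""
    finite = [a for a in amplitudes if math.isfinite(a)]
    peak = max(finite) if finite else float("nan")
    mask = []
    for period, amplitude in zip(periods_s, amplitudes):
        valid = math.isfinite(amplitude)
        if valid and not disable_relative_amplitude_qc:
            # Keep periods whose amplitude reaches the required fraction of the peak.
            valid = amplitude >= threshold * peak
        # Hard period bounds apply regardless of amplitude.
        if min_period_s is not None and period < min_period_s:
            valid = False
        if max_period_s is not None and period > max_period_s:
            valid = False
        mask.append(valid)
    return mask


periods = [1.0, 2.0, 5.0, 10.0]
amps = [1.0, 0.5, 0.2, float("nan")]
mask = relative_amplitude_mask_sketch(periods, amps)
```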

spatial_vtk.qc.build.spectral.spectral_valid_period_bounds(periods_s: Sequence[float], valid_mask: Sequence[bool]) tuple[float | None, float | None]

Return minimum and maximum valid period from a validity mask.

Parameters:
  • periods_s (Sequence) – Period grid in seconds.

  • valid_mask (Sequence) – Boolean validity mask aligned to periods_s.

Returns:

Minimum and maximum valid period.

Return type:

tuple
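
A short sketch of reducing a validity mask to period bounds follows. Returning (None, None) when no period is valid is an assumption suggested by the optional return type; the function name is hypothetical.

```python
def valid_period_bounds_sketch(periods_s, valid_mask):
    """Hypothetical stand-in: min and max of the periods flagged valid."""
    valid = [float(period) for period, ok in zip(periods_s, valid_mask) if ok]
    if not valid:
        return None, None  # assumption: no valid period yields a pair of None
    return min(valid), max(valid)


bounds = valid_period_bounds_sketch([1, 2, 5, 10], [False, True, True, False])
```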

High-level QC table builders for public workflows.

Workflow Overview

This module creates the standard QC tables used by notebooks, CLI commands, figures, metric filtering, dashboards, and manual-review exports.

spatial_vtk.qc.build.workflow.build_comparison_eligibility(qc_summary: DataFrame | str | Path) DataFrame

Return rows where observed and synthetic QC both pass.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific QC summary table.

Returns:

Comparison-eligible event/station/component/passband/metric rows.

Return type:

pandas.DataFrame
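
The eligibility rule (both sides pass for the same event/station/component/passband/metric) can be sketched with plain dictionaries. The library operates on DataFrames; the column names source and qc_pass, and the function name, are assumptions made so the example is self-contained.

```python
KEY_COLS = ("event_id", "station", "component", "passband", "metric")


def comparison_eligibility_sketch(qc_rows):
    """Hypothetical stand-in: keep keys where observed and synthetic both pass."""
    sides_by_key = {}
    for row in qc_rows:
        key = tuple(row[col] for col in KEY_COLS)
        sides_by_key.setdefault(key, {})[row["source"]] = bool(row["qc_pass"])
    return [
        dict(zip(KEY_COLS, key))
        for key, sides in sides_by_key.items()
        if sides.get("observed") and sides.get("synthetic")
    ]


def _qc_row(source, station, qc_pass):
    return {
        "source": source,
        "event_id": "ev1",
        "station": station,
        "component": "Z",
        "passband": "2-5s",
        "metric": "pga",
        "qc_pass": qc_pass,
    }


qc_rows = [
    _qc_row("observed", "STA1", True),
    _qc_row("synthetic", "STA1", True),
    _qc_row("observed", "STA2", True),
    _qc_row("synthetic", "STA2", False),
    _qc_row("observed", "STA3", True),  # no synthetic counterpart
]
eligible = comparison_eligibility_sketch(qc_rows)
```

Only STA1 is eligible here: STA2 fails on the synthetic side and STA3 has no synthetic row at all.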

spatial_vtk.qc.build.workflow.build_metric_pair_retention_table(qc_summary: DataFrame | str | Path, *, group_cols: Sequence[str] = ('metric', 'passband')) DataFrame

Summarize post-QC observed/synthetic pair retention.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Side-specific metric QC table with observed and synthetic rows.

  • group_cols (collections.abc.Sequence, optional) – Columns used to group retention percentages, usually metric and passband.

Returns:

Pair-retention rows with total pairs before QC, retained pairs after QC, and retained percentage.

Return type:

pandas.DataFrame
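
The retention summary can be illustrated with a small grouped count. In this sketch a pair exists before QC when both an observed and a synthetic row share the same event/station/component/passband/metric key, and it is retained when both rows pass. The column names source and qc_pass, the output column names, and the function name are assumptions.

```python
from collections import defaultdict

PAIR_COLS = ("event_id", "station", "component", "passband", "metric")


def pair_retention_sketch(qc_rows, group_cols=("metric", "passband")):
    """Hypothetical stand-in; group_cols must be a subset of PAIR_COLS here."""
    sides_by_pair = defaultdict(dict)
    for row in qc_rows:
        key = tuple(row[col] for col in PAIR_COLS)
        sides_by_pair[key][row["source"]] = bool(row["qc_pass"])

    totals = defaultdict(int)
    retained = defaultdict(int)
    for key, sides in sides_by_pair.items():
        if "observed" not in sides or "synthetic" not in sides:
            continue  # not a comparison pair before QC
        record = dict(zip(PAIR_COLS, key))
        group = tuple(record[col] for col in group_cols)
        totals[group] += 1
        retained[group] += int(sides["observed"] and sides["synthetic"])

    return [
        {
            **dict(zip(group_cols, group)),
            "total_pairs": total,
            "retained_pairs": retained[group],
            "retained_pct": 100.0 * retained[group] / total,
        }
        for group, total in sorted(totals.items())
    ]


def _qc_row(source, station, qc_pass):
    return {
        "source": source,
        "event_id": "ev1",
        "station": station,
        "component": "Z",
        "passband": "2-5s",
        "metric": "pga",
        "qc_pass": qc_pass,
    }


summary = pair_retention_sketch(
    [
        _qc_row("observed", "STA1", True),
        _qc_row("synthetic", "STA1", True),
        _qc_row("observed", "STA2", True),
        _qc_row("synthetic", "STA2", False),
        _qc_row("observed", "STA3", True),
    ]
)
```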

spatial_vtk.qc.build.workflow.build_metric_qc_summary(event_station_records: DataFrame | str | Path, *, metrics: Sequence[str], components: Sequence[str], passbands: Sequence[str | Sequence[float]], spectral_periods_s: Sequence[float] = (), sources: Sequence[str] = ('observed', 'synthetic'), synthetic_max_frequency_hz: float | None = None, observed_available: bool = True, synthetic_available: bool = True, trace_qc_summary: DataFrame | str | Path | None = None, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) DataFrame

Build a side-specific metric QC summary from event-station records.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with at least event_id and station columns.

  • metrics (collections.abc.Sequence) – Requested metrics.

  • components (collections.abc.Sequence) – Requested components.

  • passbands (collections.abc.Sequence) – Requested period bands.

  • spectral_periods_s (collections.abc.Sequence, optional) – Periods to check for spectral metrics.

  • sources (collections.abc.Sequence, optional) – Sources to include, usually observed and synthetic.

  • synthetic_max_frequency_hz (float | None, optional) – Maximum valid synthetic frequency. Synthetic spectral periods shorter than 1 / synthetic_max_frequency_hz fail QC.

  • observed_available (bool, optional) – Default observed availability when the input table does not include a source-specific availability column.

  • synthetic_available (bool, optional) – Default synthetic availability when the input table does not include a source-specific availability column.

  • trace_qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific waveform QC table. When provided, failed source/event/station/component/passband rows fail matching metric rows.

  • verbose (bool, optional) – Print progress messages while building QC rows.

  • progress_interval (int, optional) – Number of event-station records between progress messages.

  • checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where intermediate QC rows are written.

  • resume (bool, optional) – When true and checkpoint_path exists, skip event/station records already present in that checkpoint.

  • checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.

  • return_result (bool, optional) – When false, append checkpoint rows on disk and return an empty dataframe instead of loading/returning the full QC inventory. This is intended for large Slurm jobs.

Returns:

Standard metric QC rows.

Return type:

pandas.DataFrame

spatial_vtk.qc.build.workflow.build_post_qc_record_table(event_station_records: DataFrame | str | Path, *, events: DataFrame | str | Path | None = None, qc_summary: DataFrame | str | Path | None = None) DataFrame

Build station-event records for post-QC map figures.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with station coordinates.

  • events (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional event metadata with event coordinates.

  • qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional QC summary used to assign pass/fail status per event/station.

Returns:

Records with sta_lat, sta_lon, event coordinates, and qc_status.

Return type:

pandas.DataFrame

spatial_vtk.qc.build.workflow.build_qc_availability_table(event_station_records: DataFrame | str | Path, *, qc_summary: DataFrame | str | Path | None = None, qc_aggregate: str = 'all_pass', observed_root: str | Path | None = None, synthetic_root: str | Path | None = None, observed_inventory: DataFrame | str | Path | None = None, synthetic_inventory: DataFrame | str | Path | None = None, cfg: SpatialVTKConfig | None = None) DataFrame

Build observed/synthetic availability rows for QC figures.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table.

  • qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional side-specific QC summary. When provided, availability is based on post-QC retained rows instead of file presence.

  • qc_aggregate (str, optional) – "all_pass" marks a side available only if all matching QC rows pass. "any_pass" marks a side available when at least one matching row passes.

  • observed_root (str | pathlib.Path | None, optional) – Optional root used to inventory observed files when an inventory table is not already available. When omitted, the active config’s paths.observed_root is used when present.

  • synthetic_root (str | pathlib.Path | None, optional) – Optional root used to inventory synthetic files when an inventory table is not already available. When omitted, the active config’s paths.synthetic_root is used when present.

  • observed_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional observed file inventory table.

  • synthetic_inventory (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional synthetic file inventory table.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used to resolve default roots.

Returns:

Availability table with one row per event/station.

Return type:

pandas.DataFrame
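
The difference between the two qc_aggregate modes reduces to all() versus any() over the matching QC rows for one side. The sketch below illustrates that reduction; the function name and the treatment of an empty set of rows as unavailable are assumptions.

```python
def side_available_sketch(qc_pass_flags, qc_aggregate="all_pass"):
    """Hypothetical stand-in: reduce per-row pass flags to one availability flag."""
    flags = [bool(flag) for flag in qc_pass_flags]
    if not flags:
        return False  # assumption: no matching QC rows means unavailable
    if qc_aggregate == "all_pass":
        return all(flags)
    if qc_aggregate == "any_pass":
        return any(flags)
    raise ValueError(f"unknown qc_aggregate: {qc_aggregate!r}")


strict = side_available_sketch([True, False, True], "all_pass")
lenient = side_available_sketch([True, False, True], "any_pass")
```

"all_pass" is the stricter choice and suits figures that should show only fully retained event/station pairs; "any_pass" shows a side as available as long as one component, passband, or metric survives.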

spatial_vtk.qc.build.workflow.build_qc_waveform_comparison_records(event_station_records: DataFrame | str | Path, qc_summary: DataFrame | str | Path | None = None, *, comparison_eligible: DataFrame | str | Path | None = None, component: str = 'Z', passband: str | None = None, event_id: str | list[str] | tuple[str, ...] | None = None, max_distance_km: float | None = 50.0, max_records: int | None = 12, observed_waveform_col: str = 'observed_processed_waveform', synthetic_waveform_col: str = 'synthetic_processed_waveform') DataFrame

Build post-QC waveform rows for observed/synthetic visual inspection.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Prepared event-station table with waveform paths.

  • qc_summary (pandas.DataFrame | str | pathlib.Path | None, optional) – Side-specific QC table. Used to build comparison-eligible rows when comparison_eligible is not supplied.

  • comparison_eligible (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional precomputed output from build_comparison_eligibility().

  • component (str, optional) – Component to load for the visual comparison.

  • passband (str | None, optional) – Optional retained passband to select.

  • event_id (str | list[str] | tuple[str, ...] | None, optional) – Optional event ID or IDs to select before loading waveforms.

  • max_distance_km (float | None, optional) – Optional distance limit in kilometers.

  • max_records (int | None, optional) – Optional maximum retained event-station rows to load.

  • observed_waveform_col (str, optional) – Column in event_station_records that holds the observed waveform path.

  • synthetic_waveform_col (str, optional) – Column in event_station_records that holds the synthetic waveform path.

Returns:

Rows with observed and synthetic trace objects, sample intervals, event-origin offsets, and distance metadata.

Return type:

pandas.DataFrame

spatial_vtk.qc.build.workflow.build_retention_figure_table(qc_summary: DataFrame | str | Path) DataFrame

Prepare QC rows for retention summary figures.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.

Returns:

Copy of the QC summary with stage set to the passband labels.

Return type:

pandas.DataFrame

spatial_vtk.qc.build.workflow.build_waveform_qc_summary(event_station_records: DataFrame | str | Path, *, sources: Sequence[str] = ('observed', 'synthetic'), waveform_path_columns: dict[str, str] | None = None, components: Sequence[str] | None = None, passbands: Sequence[str | Sequence[float]] | None = None, preprocessing: WaveformPreprocessing | None = None, apply_config_preprocessing_to_processed_files: bool = False, cfg: SpatialVTKConfig | None = None, min_record_length_s: float | None = None, min_end_after_origin_s: float | None = None, snr_threshold: float | None = None, arrival_pick_catalog: DataFrame | str | Path | None = None, onset_phase: str = 'P', min_onset_pick_probability: float = 0.0, verbose: bool = False, progress_interval: int = 25, checkpoint_path: str | Path | None = None, resume: bool = True, checkpoint_interval: int = 25, return_result: bool = True) DataFrame

Build observed/synthetic waveform QC rows from event-station records.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – Event-station table with waveform path columns.

  • sources (collections.abc.Sequence, optional) – Source labels to inspect.

  • waveform_path_columns (dict[str, str] | None, optional) – Optional mapping from source label to waveform path column.

  • components (collections.abc.Sequence[str] | None, optional) – Optional component selection. When omitted, the active metric settings are used.

  • passbands (collections.abc.Sequence[str | collections.abc.Sequence[float]] | None, optional) – Optional period-band selection. When omitted, the active metric settings are used.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional preprocessing applied before QC. When omitted and a processed waveform column is used, no extra filtering is applied.

  • apply_config_preprocessing_to_processed_files (bool, optional) – Whether processed waveform columns should still use config preprocessing during QC.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active config is used.

  • min_record_length_s (float | None, optional) – Optional override for the minimum record length threshold, in seconds. When omitted, the value is read from qc.automatic in the config.

  • min_end_after_origin_s (float | None, optional) – Optional override for the minimum time, in seconds, that a record must extend past the event origin. When omitted, the value is read from qc.automatic in the config.

  • snr_threshold (float | None, optional) – Optional override for the signal-to-noise ratio threshold. When omitted, the value is read from qc.automatic in the config.

  • arrival_pick_catalog (pandas.DataFrame | str | pathlib.Path | None, optional) – Optional PhaseNet-style pick catalog used to anchor QC windows.

  • onset_phase (str, optional) – Pick phase used as the QC onset when available.

  • min_onset_pick_probability (float, optional) – Minimum picker probability accepted for the QC onset pick.

  • verbose (bool, optional) – Print progress messages while loading waveforms and building QC rows.

  • progress_interval (int, optional) – Number of event-station records between progress messages.

  • checkpoint_path (str | pathlib.Path | None, optional) – Optional table path where the combined waveform QC summary is written. Per-source intermediate checkpoints are written next to this path.

  • resume (bool, optional) – When true, existing per-source checkpoints are used to skip completed event/station/component groups.

  • checkpoint_interval (int, optional) – Number of event-station records between checkpoint writes.

  • return_result (bool, optional) – When false, write per-source checkpoints and combine them on disk into checkpoint_path instead of keeping all source QC rows in memory. This is intended for Slurm workers on large inventories.

Returns:

Side-specific waveform QC rows that can be passed to build_metric_qc_summary(trace_qc_summary=...).

Return type:

pandas.DataFrame

spatial_vtk.qc.build.workflow.export_manual_review_queue(qc_summary: DataFrame | str | Path, output_path: str | Path | None = None, *, cfg: SpatialVTKConfig | None = None) Path

Write a manual-review queue from QC summary rows.

Parameters:
  • qc_summary (pandas.DataFrame | str | pathlib.Path) – Metric QC summary table.

  • output_path (str | pathlib.Path | None, optional) – CSV or JSON output path. When omitted, the standard manual_review_queue output path is resolved from the active config.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object used when resolving the default output path.

Returns:

Written queue path.

Return type:

pathlib.Path

Review

Table helpers for manual QC review queues.

Review Overview

This module owns small public table operations for manual quality-control review: queue creation, decision CSV loading/writing, and applying manual decisions to automated inventory rows.

spatial_vtk.qc.review.tables.apply_manual_qc_decisions(inventory_df: DataFrame, decisions: DataFrame | str | Path | None, *, band_columns: tuple[str, ...] | None = None) DataFrame

Apply manual accept/reject decisions to an automated inventory table.

Parameters:
  • inventory_df (pandas.DataFrame) – Automated QC inventory.

  • decisions (pandas.DataFrame | str | pathlib.Path | None) – Decision table or CSV path.

  • band_columns (tuple[str, ...] | None, optional) – Optional reject-column suffixes. When omitted, columns named reject_* are inferred.

Returns:

Inventory copy with manual decision columns applied.

Return type:

pandas.DataFrame

spatial_vtk.qc.review.tables.decision_key(event_id: object, station: object, component: object, scope_kind: object, scope_label: object) tuple[str, str, str, str, str]

Build the normalized key for one manual QC decision.

Parameters:
  • event_id (object) – Required function argument.

  • station (object) – Required function argument.

  • component (object) – Required function argument.

  • scope_kind (object) – Required function argument.

  • scope_label (object) – Required function argument.

Returns:

Return value produced by the function.

Return type:

tuple
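
The key layout can be illustrated with a minimal sketch. The normalization shown here (string conversion, whitespace stripping, None mapped to an empty string) is an assumption, since the reference only states that the key is normalized. decision_key_sketch and the example scope values are hypothetical stand-ins, not the library implementation.

```python
def decision_key_sketch(event_id, station, component, scope_kind, scope_label):
    """Return a five-part string key for one manual QC decision.

    Hypothetical stand-in for decision_key(); the normalization rules
    below are assumptions chosen for illustration.
    """
    parts = (event_id, station, component, scope_kind, scope_label)
    # Assumed normalization: None becomes "", other values are str() and stripped.
    return tuple("" if part is None else str(part).strip() for part in parts)


# Example values are made up; "passband" and "10-20s" are hypothetical scope fields.
key = decision_key_sketch(20190706, " CCC ", "Z", "passband", "10-20s")
print(key)
```

A tuple of plain strings is hashable, so keys of this shape can index a dictionary of decisions regardless of whether the inputs arrived as integers, strings, or missing values.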

spatial_vtk.qc.review.tables.filter_trace_summary(df: DataFrame, *, event_id: str | None = None, station: str | None = None, component: str | None = None, accepted: bool | None = None, reject_reason_contains: str | None = None) DataFrame

Filter a trace-summary table using common review fields.

Parameters:
  • df (pandas.DataFrame) – Required function argument.

  • event_id (str | None, optional) – Optional function argument. Defaults to None.

  • station (str | None, optional) – Optional function argument. Defaults to None.

  • component (str | None, optional) – Optional function argument. Defaults to None.

  • accepted (bool | None, optional) – Optional function argument. Defaults to None.

  • reject_reason_contains (str | None, optional) – Optional function argument. Defaults to None.

Returns:

Return value produced by the function.

Return type:

pandas.DataFrame

spatial_vtk.qc.review.tables.load_manual_qc_decisions(path: str | Path | None) DataFrame

Load manual QC decisions from CSV.

Parameters:
  • path (str | pathlib.Path | None) – Decision CSV path. A missing file or None returns an empty table.

Returns:

Normalized decision table.

Return type:

pandas.DataFrame

spatial_vtk.qc.review.tables.normalize_manual_qc_decisions(df: DataFrame) DataFrame

Normalize a manual QC decision table.

Parameters:
  • df (pandas.DataFrame) – Raw decision table.

Returns:

Decision table with public columns.

Return type:

pandas.DataFrame

spatial_vtk.qc.review.tables.queue_rows_from_filtered_trace_df(df: DataFrame, *, key_columns: tuple[str, ...] = ('event_id', 'station', 'component'), status: str = 'pending') list[dict[str, object]]

Convert a filtered trace table into manual-review queue rows.

Parameters:
  • df (pandas.DataFrame) – Required function argument.

  • key_columns (tuple, optional) – Optional function argument. Defaults to ('event_id', 'station', 'component').

  • status (str, optional) – Optional function argument. Defaults to 'pending'.

Returns:

Return value produced by the function.

Return type:

list
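
A rough sketch of the conversion, written against plain dictionaries so it runs without pandas. It assumes each queue row carries the key columns plus a status field; the real row layout may include more fields. queue_rows_sketch is a hypothetical stand-in, and with a DataFrame the records could be obtained from df.to_dict("records").

```python
def queue_rows_sketch(records, *, key_columns=("event_id", "station", "component"),
                      status="pending"):
    """Build one queue row per input record (assumed row layout)."""
    rows = []
    for record in records:
        # Keep only the identifying columns, then attach the review status.
        row = {column: record[column] for column in key_columns}
        row["status"] = status
        rows.append(row)
    return rows


# Made-up filtered trace rows; snr_rms is dropped because it is not a key column.
filtered = [
    {"event_id": "ev1", "station": "CCC", "component": "Z", "snr_rms": 1.2},
    {"event_id": "ev1", "station": "WBS", "component": "N", "snr_rms": 0.8},
]
queue = queue_rows_sketch(filtered)
print(queue)
```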

spatial_vtk.qc.review.tables.write_manual_qc_decisions(df: DataFrame, path: str | Path, *, overwrite: bool = True) Path

Write manual QC decisions to CSV.

Parameters:
  • df (pandas.DataFrame) – Decision rows.

  • path (str | pathlib.Path) – Output CSV path.

  • overwrite (bool, optional) – Whether to replace an existing file.

Returns:

Written path.

Return type:

pathlib.Path

Summary

Shared QC classification and reject-rule helpers.

spatial_vtk.qc.summary.rules.classify_station_family(network: str, station: str) str

Classify one station as broadband, strong-motion, or unknown.

Parameters:
  • network (str) – Required function argument.

  • station (str) – Required function argument.

Returns:

Return value produced by the function.

Return type:

str

spatial_vtk.qc.summary.rules.dedupe_reason_codes(reasons: list[str]) list[str]

Deduplicate reject reason codes while preserving first-seen order.

Parameters:
  • reasons (list) – Required function argument.

Returns:

Return value produced by the function.

Return type:

list
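
Order-preserving deduplication can be sketched in one line with dict.fromkeys, which keeps the first occurrence of each key. This is an illustration of the described behavior, not necessarily how the library implements it; dedupe_reason_codes_sketch and the reason codes are hypothetical.

```python
def dedupe_reason_codes_sketch(reasons):
    """Drop repeated reason codes while keeping first-seen order."""
    # dict preserves insertion order, so fromkeys keeps the first occurrence.
    return list(dict.fromkeys(reasons))


# Reason code names are made up for the example.
codes = dedupe_reason_codes_sketch(["low_snr", "short_record", "low_snr"])
print(codes)  # ['low_snr', 'short_record']
```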

spatial_vtk.qc.summary.rules.dominant_energy_band(freqs_hz: ndarray, power: ndarray, band_edges_hz: dict[str, tuple[float, float]]) tuple[str, dict[str, float]]

Resolve which standard band carries the most spectral power.

Parameters:
  • freqs_hz (numpy.ndarray) – Required function argument.

  • power (numpy.ndarray) – Required function argument.

  • band_edges_hz (dict) – Required function argument.

Returns:

Return value produced by the function.

Return type:

tuple
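
One plausible reading of this function is sketched below: sum the power inside each band and pick the band with the largest total. Several details are assumptions: the half-open band edges (low <= f < high), the returned dictionary holding summed power per band, and the use of plain sequences instead of numpy.ndarray so the sketch runs without NumPy. dominant_energy_band_sketch and the band labels are hypothetical.

```python
def dominant_energy_band_sketch(freqs_hz, power, band_edges_hz):
    """Return (dominant band label, summed power per band).

    Hypothetical stand-in; edge handling and the contents of the
    returned dictionary are assumptions.
    """
    band_power = {}
    for label, (f_low, f_high) in band_edges_hz.items():
        # Assumed convention: a frequency belongs to a band when f_low <= f < f_high.
        band_power[label] = float(
            sum(p for f, p in zip(freqs_hz, power) if f_low <= f < f_high)
        )
    # Pick the label whose summed power is largest.
    dominant = max(band_power, key=band_power.get)
    return dominant, band_power


# Made-up spectrum and band edges for illustration.
freqs = [0.05, 0.1, 0.2, 0.5, 1.0]
power = [1.0, 4.0, 3.0, 0.5, 0.25]
bands = {"long": (0.04, 0.15), "mid": (0.15, 0.6), "short": (0.6, 2.0)}
label, totals = dominant_energy_band_sketch(freqs, power, bands)
print(label, totals)
```

The sketch does not handle an empty band dictionary; a real implementation would need a defined result for that case.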

spatial_vtk.qc.summary.rules.global_trace_reject_reasons(*, record_length_s: float, end_rel_s: float, onset_reasons: list[str], min_end_after_origin_s: float, min_record_length_s: float) tuple[bool, list[str]]

Apply station-level reject rules shared by every passband.

Parameters:
  • record_length_s (float) – Required function argument.

  • end_rel_s (float) – Required function argument.

  • onset_reasons (list) – Required function argument.

  • min_end_after_origin_s (float) – Required function argument.

  • min_record_length_s (float) – Required function argument.

Returns:

Return value produced by the function.

Return type:

tuple
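
The shape of these shared rules can be sketched as two threshold checks merged with the onset reasons. This assumes end_rel_s is the trace end time relative to the event origin, that the returned flag means "reject", and that rejection follows from any reason being present. global_trace_reject_reasons_sketch and the reason code names are hypothetical.

```python
def global_trace_reject_reasons_sketch(*, record_length_s, end_rel_s, onset_reasons,
                                       min_end_after_origin_s, min_record_length_s):
    """Return (rejected, reasons) for rules shared by every passband (assumed logic)."""
    # Start from the onset reasons supplied by the caller.
    reasons = list(onset_reasons)
    if record_length_s < min_record_length_s:
        reasons.append("record_too_short")  # hypothetical reason code
    if end_rel_s < min_end_after_origin_s:
        reasons.append("ends_too_early")  # hypothetical reason code
    # Remove repeats while keeping first-seen order.
    reasons = list(dict.fromkeys(reasons))
    return bool(reasons), reasons


rejected, why = global_trace_reject_reasons_sketch(
    record_length_s=40.0,
    end_rel_s=90.0,
    onset_reasons=[],
    min_end_after_origin_s=60.0,
    min_record_length_s=60.0,
)
print(rejected, why)
```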

spatial_vtk.qc.summary.rules.reject_passband(*, global_reasons: list[str], snr_rms: float, snr_threshold: float, noise_window_valid: bool, signal_window_valid: bool, pre_origin_window_valid: bool, pre_origin_signal_ratio: float, pre_origin_signal_ratio_threshold: float, origin_window_valid: bool, origin_signal_ratio: float) tuple[bool, list[str]]

Apply passband-specific reject rules and merge shared global reasons.

Parameters:
  • global_reasons (list) – Required function argument.

  • snr_rms (float) – Required function argument.

  • snr_threshold (float) – Required function argument.

  • noise_window_valid (bool) – Required function argument.

  • signal_window_valid (bool) – Required function argument.

  • pre_origin_window_valid (bool) – Required function argument.

  • pre_origin_signal_ratio (float) – Required function argument.

  • pre_origin_signal_ratio_threshold (float) – Required function argument.

  • origin_window_valid (bool) – Required function argument.

  • origin_signal_ratio (float) – Required function argument.

Returns:

Return value produced by the function.

Return type:

tuple
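
A reduced sketch of how passband-specific reasons might be merged with the global ones. It covers only the SNR and window-validity inputs; the pre-origin and origin signal-ratio checks are left out because their exact rules are not documented here. reject_passband_sketch, the reason code names, and the rule that SNR is only compared when both windows are valid are all assumptions.

```python
def reject_passband_sketch(*, global_reasons, snr_rms, snr_threshold,
                           noise_window_valid, signal_window_valid):
    """Return (rejected, reasons) for one passband (assumed, partial logic)."""
    # Global reasons apply to every passband, so they come first.
    reasons = list(global_reasons)
    if not noise_window_valid:
        reasons.append("invalid_noise_window")  # hypothetical reason code
    if not signal_window_valid:
        reasons.append("invalid_signal_window")  # hypothetical reason code
    # Only compare SNR when both windows it depends on are usable.
    if noise_window_valid and signal_window_valid and snr_rms < snr_threshold:
        reasons.append("low_snr")  # hypothetical reason code
    # Remove repeats while keeping first-seen order.
    reasons = list(dict.fromkeys(reasons))
    return bool(reasons), reasons


rejected_band, band_reasons = reject_passband_sketch(
    global_reasons=["record_too_short"],
    snr_rms=1.5,
    snr_threshold=3.0,
    noise_window_valid=True,
    signal_window_valid=True,
)
print(rejected_band, band_reasons)
```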

spatial_vtk.qc.summary.rules.station_code_has_letters(station: str) bool

Return whether one station code contains at least one alphabetic character.

Parameters:
  • station (str) – Required function argument.

Returns:

Return value produced by the function.

Return type:

bool
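
The check described above can be sketched with str.isalpha. This is an illustration under the assumption that "alphabetic" follows Python's Unicode definition; station_code_has_letters_sketch and the station codes are hypothetical.

```python
def station_code_has_letters_sketch(station):
    """Return True when the station code contains at least one alphabetic character."""
    # str() guards against numeric station codes read from a table as integers.
    return any(ch.isalpha() for ch in str(station))


# Made-up station codes for illustration.
checks = {code: station_code_has_letters_sketch(code) for code in ["CCC", "2714", "N4A", ""]}
print(checks)
```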