Input and Output API

The I/O modules prepare station and event metadata, discover waveform files, write standard output artifacts, and handle preprocessing products used by the rest of the workflow.

Package Entry Point

Input/output helpers for manifests and waveform format handling.

class spatial_vtk.io.ArtifactRecord(artifact_path: str, kind: str, name: str, status: str = 'planned', artifact_hash: str = '', metadata: Mapping[str, Any] | None = None, recorded_at: str = '')

Record one workflow artifact in a public registry.

Parameters:
  • artifact_path (str) – Artifact path on disk.

  • kind (str) – Broad artifact group such as "metrics" or "qc".

  • name (str) – Human-readable artifact name.

  • status (str) – Status label such as "planned", "written", or "missing".

  • artifact_hash (str) – Optional deterministic hash from an artifact spec.

  • metadata (Mapping[str, Any] | None) – Optional user-provided metadata.

  • recorded_at (str) – UTC ISO timestamp.

Returns:

Immutable registry entry.

Return type:

ArtifactRecord

class spatial_vtk.io.ArtifactRegistry(registry_path: str | Path)

Append-only JSON-lines artifact registry.

Parameters:

registry_path (str | pathlib.Path) – Path to the registry file.

Returns:

Registry object used to append and inspect artifact records.

Return type:

ArtifactRegistry

missing() list[ArtifactRecord]

Return registry records whose paths do not exist.

Parameters:

None

Returns:

Records for missing files.

Return type:

list of ArtifactRecord

record(artifact_path: str | Path, *, kind: str, name: str, status: str = 'written', spec: ArtifactSpec | None = None, metadata: Mapping[str, Any] | None = None) ArtifactRecord

Append one artifact record.

Parameters:
  • artifact_path (str | pathlib.Path) – Output path being recorded.

  • kind (str) – Broad artifact group such as "metrics" or "qc".

  • name (str) – Human-readable artifact name.

  • status (str, optional) – Status label such as "planned", "written", or "missing".

  • spec (spatial_vtk.io.artifacts.ArtifactSpec | None, optional) – Optional artifact spec used to compute a stable hash.

  • metadata (Mapping[str, Any] | None, optional) – Optional metadata copied into the registry.

Returns:

Appended record.

Return type:

ArtifactRecord

records() list[ArtifactRecord]

Read all registry records.

Parameters:

None

Returns:

Registry records in file order.

Return type:

list of ArtifactRecord

to_frame()

Return records as a pandas DataFrame.

Parameters:

None

Returns:

Registry table.

Return type:

pandas.DataFrame
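
The append-only JSON-lines pattern behind the registry methods above can be sketched with the standard library alone. This is a minimal illustration, not the spatial_vtk implementation: the helper names (append_record, read_records, missing_records) and the record keys are assumptions chosen to mirror the documented fields.

```python
import json
import tempfile
from pathlib import Path


def append_record(registry_path: Path, record: dict) -> None:
    """Append one record as a single JSON line; earlier lines are never rewritten."""
    with registry_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def read_records(registry_path: Path) -> list:
    """Read all records in file order, skipping blank lines."""
    if not registry_path.exists():
        return []
    with registry_path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def missing_records(registry_path: Path) -> list:
    """Return records whose artifact path does not exist on disk."""
    return [
        record
        for record in read_records(registry_path)
        if not Path(record["artifact_path"]).exists()
    ]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    registry = root / "artifacts.jsonl"
    written = root / "metrics.csv"
    written.write_text("station,value\n", encoding="utf-8")

    # One record for a file that exists, one for a file that was only planned.
    append_record(registry, {"artifact_path": str(written), "kind": "metrics",
                             "name": "demo", "status": "written"})
    append_record(registry, {"artifact_path": str(root / "absent.png"), "kind": "figure",
                             "name": "demo", "status": "planned"})

    all_names = [Path(r["artifact_path"]).name for r in read_records(registry)]
    missing_names = [Path(r["artifact_path"]).name for r in missing_records(registry)]
```

Because each record occupies exactly one line, appending never requires parsing or rewriting the existing file, which is what makes the format suitable for an append-only log.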

class spatial_vtk.io.ArtifactSpec(kind: str, name: str, scope: Mapping[str, Any] | None = None, config: Mapping[str, Any] | None = None, extension: str = '.json', subdir: str | None = None)

Describe one planned output artifact.

Parameters:
  • kind (str) – Broad artifact type such as "figure", "metrics", or "dashboard".

  • name (str) – Human-readable artifact name.

  • scope (Mapping[str, Any] | None) – Stable row/plot identity such as metric, event, station, or model.

  • config (Mapping[str, Any] | None) – Compute-relevant configuration.

  • extension (str) – Output filename extension, including the leading dot.

  • subdir (str | None) – Optional subdirectory under the artifact root.

Returns:

Immutable artifact planning record.

Return type:

ArtifactSpec

payload() dict[str, Any]

Return the deterministic payload used for hashing.

Returns:

Deterministic payload dictionary used for hashing.

Return type:

dict

class spatial_vtk.io.MetricCompleteness(expected: int, present: int, missing: int, key_columns: tuple[str, ...])

Summary of expected, present, and missing metric rows.

class spatial_vtk.io.MetricPlan(metrics: tuple[str, ...], passbands: tuple[tuple[float, float], ...], components: tuple[str, ...], models: tuple[str, ...], metric_groups: tuple[str, ...] = (), transforms: tuple[str, ...] = (), spectral_periods_s: tuple[float, ...] = (), output_mode: str = 'full', synthetic_max_frequency_hz: float | None = None, waveform_lowpass_hz: float | None = None, waveform_resample_hz: float | None = None, waveform_filter_order: int | None = None, output_path: Path | None = None)

Resolved metric-calculation plan.

Parameters:
  • metrics (tuple[str, ...]) – Metric names/codes to calculate.

  • passbands (tuple[tuple[float, float], ...]) – Period passbands as (period_min_s, period_max_s) pairs.

  • components (tuple[str, ...]) – Waveform components to process.

  • models (tuple[str, ...]) – Synthetic model aliases or names.

  • output_path (pathlib.Path | None) – Optional output path for the metric table.

Returns:

Immutable metric-calculation plan.

Return type:

MetricPlan

property transform_columns: tuple[str, ...]

Return requested transform output columns.

class spatial_vtk.io.ModelFolderCandidate(folder: str, base_model: str, basin_scope: str, has_ely: bool, implementation_tokens: tuple[str, ...], default_alias: str)

Describe one discovered synthetic model folder.

class spatial_vtk.io.ModelResolution(models: tuple[str, ...], model_folders: dict[str, str], ambiguous: dict[str, tuple[ModelFolderCandidate, ...]])

Resolved model aliases and backing folders.

class spatial_vtk.io.PreprocessedWaveform(data: ndarray, dt: float, sampling_rate_hz: float, processing_label: str)

Preprocessed waveform samples and updated timing metadata.

Parameters:
  • data (numpy.ndarray) – One-dimensional preprocessed samples.

  • dt (float) – Updated sample interval in seconds.

  • sampling_rate_hz (float) – Updated sampling rate in Hz.

  • processing_label (str) – Human-readable description of the applied preprocessing.

Returns:

Immutable preprocessing result.

Return type:

PreprocessedWaveform
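
The relationship between dt and sampling_rate_hz after a rate change can be illustrated with a small sketch. It assumes plain integer-factor decimation and a hypothetical TimingResult dataclass; it is not the library's PreprocessedWaveform, and it applies no anti-alias filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingResult:
    """Hypothetical stand-in for an immutable preprocessing result."""
    data: tuple
    dt: float
    sampling_rate_hz: float
    processing_label: str


def decimate(samples, dt, factor):
    """Keep every `factor`-th sample and update dt and sampling rate to match.

    No anti-alias filter is applied here; a real workflow would lowpass first.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    new_dt = dt * factor
    return TimingResult(
        data=tuple(samples[::factor]),
        dt=new_dt,
        sampling_rate_hz=1.0 / new_dt,
        processing_label=f"decimate x{factor}",
    )


result = decimate([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], dt=0.01, factor=2)
```

Returning the samples together with the updated dt keeps downstream code from pairing resampled data with a stale sample interval.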

class spatial_vtk.io.SyntheticFormatInfo(root: str, format: str, normalized: bool, needs_salvus_handling: bool, handling_mode: str | None = None, converted_root: str | None = None)

Description of one synthetic data layout.

class spatial_vtk.io.WaveformPreprocessing(lowpass_hz: float | None = None, highpass_hz: float | None = None, bandpass_low_hz: float | None = None, bandpass_high_hz: float | None = None, resample_hz: float | None = None, filter_order: int = 4)

Configured preprocessing applied before waveform QC, metrics, or figures.

Parameters:
  • lowpass_hz (float | None) – Optional lowpass cutoff in Hz. None means no configured lowpass.

  • highpass_hz (float | None) – Optional highpass cutoff in Hz. None means no configured highpass.

  • bandpass_low_hz (float | None) – Optional lower bandpass corner frequency in Hz. Set together with bandpass_high_hz to apply a bandpass instead of separate highpass/lowpass steps.

  • bandpass_high_hz (float | None) – Optional upper bandpass corner frequency in Hz. Set together with bandpass_low_hz to apply a bandpass instead of separate highpass/lowpass steps.

  • resample_hz (float | None) – Optional target sampling rate in Hz after filtering.

  • filter_order (int) – Butterworth filter order used for waveform filters.

Returns:

Immutable preprocessing settings.

Return type:

WaveformPreprocessing

class spatial_vtk.io.WaveformPreprocessingWorkflowResult(event_station_records: DataFrame, event_station_path: Path, manifest: DataFrame, manifest_path: Path, trace_metadata: DataFrame, trace_metadata_path: Path)

Outputs written by preprocess_waveform_files().

Parameters:
  • event_station_records (pandas.DataFrame) – Updated event-station table.

  • event_station_path (pathlib.Path) – Path where the updated event-station table was written.

  • manifest (pandas.DataFrame) – One row per source waveform file.

  • manifest_path (pathlib.Path) – Path where the preprocessing manifest was written.

  • trace_metadata (pandas.DataFrame) – One row per processed trace.

  • trace_metadata_path (pathlib.Path) – Path where trace metadata was written.

Returns:

Immutable workflow result with dataframes and written paths.

Return type:

WaveformPreprocessingWorkflowResult

spatial_vtk.io.aggregate_metric_by_station_over_events(df: DataFrame, *, metric_col: str, model_col: str = 'model', station_col: str = 'station', latitude_col: str = 'sta_lat', longitude_col: str = 'sta_lon', event_col: str = 'event_id') DataFrame

Average a metric by station after first averaging within each event.

Parameters:
  • df (pandas.DataFrame) – Metric table.

  • metric_col (str) – Numeric column to aggregate.

  • model_col (str, optional) – Name of the column identifying the model.

  • station_col (str, optional) – Name of the column identifying the station.

  • latitude_col (str, optional) – Name of the station latitude column.

  • longitude_col (str, optional) – Name of the station longitude column.

  • event_col (str, optional) – Name of the column identifying the event.

Returns:

Station-level metric table with n_events when event information is available.

Return type:

pandas.DataFrame
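
Averaging within each event first keeps events with many rows from dominating the station value. The sketch below shows that two-stage idea in plain Python; the row keys and the helper name station_mean_over_events are illustrative assumptions, and the library itself operates on pandas DataFrames.

```python
from collections import defaultdict
from statistics import mean

rows = [
    {"model": "m1", "station": "STA1", "event_id": "ev1", "value": 1.0},
    {"model": "m1", "station": "STA1", "event_id": "ev1", "value": 3.0},
    {"model": "m1", "station": "STA1", "event_id": "ev1", "value": 5.0},
    {"model": "m1", "station": "STA1", "event_id": "ev2", "value": 9.0},
]


def station_mean_over_events(rows, metric_col="value"):
    """Average within each (model, station, event), then across events per station."""
    per_event = defaultdict(list)
    for row in rows:
        per_event[(row["model"], row["station"], row["event_id"])].append(row[metric_col])

    per_station = defaultdict(list)
    for (model, station, _event), values in per_event.items():
        per_station[(model, station)].append(mean(values))

    return {
        key: {"mean": mean(event_means), "n_events": len(event_means)}
        for key, event_means in per_station.items()
    }


two_stage = station_mean_over_events(rows)
flat_mean = mean(row["value"] for row in rows)
```

Here ev1 contributes three rows and ev2 one, so the flat mean (4.5) leans toward ev1 while the two-stage mean (6.0) weights both events equally.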

spatial_vtk.io.apply_waveform_preprocessing(data: Any, dt: float, preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) ndarray

Apply configured waveform preprocessing to one sample array.

Parameters:
  • data (Any) – One-dimensional waveform samples.

  • dt (float) – Sample interval in seconds.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional config used only when preprocessing is omitted.

Returns:

Preprocessed waveform samples.

Return type:

numpy.ndarray

spatial_vtk.io.apply_waveform_preprocessing_with_metadata(data: Any, dt: float, preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) PreprocessedWaveform

Apply configured preprocessing and return updated timing metadata.

Parameters:
  • data (Any) – One-dimensional waveform samples.

  • dt (float) – Sample interval in seconds.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional config used only when preprocessing is omitted.

Returns:

Filtered/resampled samples plus updated dt and sampling rate.

Return type:

PreprocessedWaveform

spatial_vtk.io.artifact_manifest_path(artifact_path: str | Path) Path

Return the JSON manifest sidecar path for one artifact path.

Parameters:

artifact_path (str | pathlib.Path) – Artifact path for which to derive the manifest sidecar path.

Returns:

JSON manifest sidecar path for the artifact.

Return type:

pathlib.Path

spatial_vtk.io.artifact_path_for_spec(root: str | Path, spec: ArtifactSpec, *, hash_length: int = 10) Path

Return the deterministic output path for one artifact spec.

Parameters:
  • root (str | pathlib.Path) – Artifact root directory.

  • spec (spatial_vtk.io.artifacts.ArtifactSpec) – Artifact planning record.

  • hash_length (int, optional) – Hash prefix length in the filename.

Returns:

Planned artifact path.

Return type:

pathlib.Path
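
One way to derive a deterministic path from a spec is to hash its payload and embed a hash prefix in the filename. The sketch below assumes a <root>/<kind>/<name>_<hash><extension> layout and a helper named planned_path; both are hypothetical, and the actual filename layout used by spatial_vtk is not documented here.

```python
import hashlib
import json
from pathlib import Path


def planned_path(root, kind, name, payload, extension=".json", hash_length=10):
    """Build <root>/<kind>/<name>_<hash prefix><extension> (hypothetical layout)."""
    # Sorted keys and fixed separators make the text independent of dict order.
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return Path(root) / kind / f"{name}_{digest[:hash_length]}{extension}"


a = planned_path("out", "metrics", "pgv", {"event": "ev1", "model": "m1"})
b = planned_path("out", "metrics", "pgv", {"model": "m1", "event": "ev1"})
c = planned_path("out", "metrics", "pgv", {"model": "m2", "event": "ev1"})
```

The same payload always maps to the same path (a equals b), while any change to scope or configuration yields a different one (c), so stale outputs are not silently reused.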

spatial_vtk.io.atomic_write_csv(df: DataFrame, path: str | PathLike[str]) Path

Atomically write a dataframe to CSV.

Parameters:
  • df (pandas.DataFrame) – Dataframe to write.

  • path (str | os.PathLike[str]) – Destination CSV path.

Returns:

Written CSV path.

Return type:

pathlib.Path
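
A common way to make a write atomic is to write to a temporary file in the destination directory and then rename it over the target. The sketch below shows that pattern with the standard csv module; it is an assumption about the general technique, not a description of how atomic_write_csv is implemented, and atomic_write_rows is a hypothetical name.

```python
import csv
import os
import tempfile
from pathlib import Path


def atomic_write_rows(rows, fieldnames, path):
    """Write rows to a temporary file next to the target, then rename it into place."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the partial temporary file so failed writes leave no debris.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = atomic_write_rows(
        [{"station": "STA1", "value": 1.5}],
        ["station", "value"],
        Path(tmp) / "out" / "metrics.csv",
    )
    written_lines = target.read_text(encoding="utf-8").splitlines()
    leftovers = [p.name for p in target.parent.iterdir() if p.name != "metrics.csv"]
```

Readers of the destination path see either the previous complete file or the new complete file, never a half-written one.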

spatial_vtk.io.available_base_models(input_syn_path: str | Path) list[str]

Return discovered base model families.

Parameters:
  • input_syn_path (str | pathlib.Path) – Synthetic root directory or template path.

Returns:

Sorted base-model family names.

Return type:

list of str

spatial_vtk.io.build_file_inventory(root: str | Path, *, dataset: str, suffixes: Iterable[str] = frozenset({'.asdf', '.h5', '.hdf5', '.json', '.ms', '.mseed', '.pkl', '.sac'}), relative_to: str | Path | None = None, include_sha256: bool = True) DataFrame

Build a lightweight inventory of files under one input folder.

Parameters:
  • root (str | pathlib.Path) – Folder to scan recursively.

  • dataset (str) – Dataset label recorded in the output, such as "observed" or "synthetic".

  • suffixes (Iterable, optional) – File suffixes to include.

  • relative_to (str | pathlib.Path | None, optional) – Optional base path used to store relative paths.

  • include_sha256 (bool, optional) – Whether to compute file hashes.

Returns:

Inventory table with dataset, path, filename, suffix, size, and optional SHA-256 columns.

Return type:

pandas.DataFrame
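
A recursive, suffix-filtered scan of this kind can be sketched with pathlib. The helper name scan_files and the size_bytes key are assumptions for illustration; the library returns a pandas DataFrame and can also add SHA-256 hashes, which this sketch omits.

```python
import tempfile
from pathlib import Path


def scan_files(root, dataset, suffixes=(".mseed", ".sac", ".json")):
    """Recursively list files under root whose suffix is in suffixes."""
    root = Path(root)
    wanted = {suffix.lower() for suffix in suffixes}
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            rows.append({
                "dataset": dataset,
                "path": path.relative_to(root).as_posix(),
                "filename": path.name,
                "suffix": path.suffix.lower(),
                "size_bytes": path.stat().st_size,
            })
    return rows


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "ev1").mkdir()
    (base / "ev1" / "STA1.mseed").write_bytes(b"1234")
    (base / "ev1" / "notes.txt").write_text("skip me", encoding="utf-8")
    (base / "meta.json").write_text("{}", encoding="utf-8")
    inventory = scan_files(base, dataset="observed")
    inventory_paths = sorted(row["path"] for row in inventory)
```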

spatial_vtk.io.build_master_event_list(*, event_tables: Sequence[DataFrame | str | Path] | None = None, event_records: Sequence[Mapping[str, Any]] | None = None, extra_columns: Sequence[str] | None = None) DataFrame

Build a deduplicated master event list.

Parameters:
  • event_tables (collections.abc.Sequence[pandas.DataFrame | str | pathlib.Path] | None, optional) – Event metadata tables or CSV paths.

  • event_records (collections.abc.Sequence[collections.abc.Mapping[str, Any]] | None, optional) – Optional mapping records.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Optional extra columns to preserve.

Returns:

Deduplicated event table with public columns.

Return type:

pandas.DataFrame

spatial_vtk.io.build_master_station_list(*, station_tables: Sequence[DataFrame | str | Path] | None = None, streams: Sequence[Any] | None = None, extra_columns: Sequence[str] | None = None) DataFrame

Build a deduplicated master station list.

Parameters:
  • station_tables (collections.abc.Sequence[pandas.DataFrame | str | pathlib.Path] | None, optional) – Station metadata tables or CSV paths.

  • streams (collections.abc.Sequence[Any] | None, optional) – Optional waveform streams whose trace metadata includes station fields.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Optional extra columns to preserve when present in station tables.

Returns:

Deduplicated station table with public columns.

Return type:

pandas.DataFrame

spatial_vtk.io.build_observed_synthetic_inventory(observed_root: str | Path, synthetic_root: str | Path, *, suffixes: Iterable[str] = frozenset({'.asdf', '.h5', '.hdf5', '.json', '.ms', '.mseed', '.pkl', '.sac'}), relative_to: str | Path | None = None, include_sha256: bool = True) DataFrame

Build one combined inventory for observed and synthetic inputs.

Parameters:
  • observed_root (str | pathlib.Path) – Observed input folder.

  • synthetic_root (str | pathlib.Path) – Synthetic input folder.

  • suffixes (Iterable, optional) – File suffixes to include.

  • relative_to (str | pathlib.Path | None, optional) – Optional base path used to store relative paths.

  • include_sha256 (bool, optional) – Whether to compute file hashes.

Returns:

Combined observed/synthetic file inventory.

Return type:

pandas.DataFrame

spatial_vtk.io.canonical_json(value: Mapping[str, Any]) str

Serialize one mapping to stable JSON.

Parameters:
  • value (Mapping) – Mapping to serialize.

Returns:

Deterministic compact JSON string.

Return type:

str
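
Stable, compact JSON is typically obtained by sorting keys and fixing the separators. The sketch below uses json.dumps from the standard library under that assumption; stable_json is a hypothetical name, and the exact options used by canonical_json may differ.

```python
import json


def stable_json(value):
    """Serialize a mapping with sorted keys and no extra whitespace."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


# Same content, different insertion order at both nesting levels.
first = stable_json({"b": 1, "a": {"d": 2, "c": 3}})
second = stable_json({"a": {"c": 3, "d": 2}, "b": 1})
```

Both calls produce the same string, which is the property that makes the output safe to hash.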

spatial_vtk.io.classify_model_folder(folder_name: str) ModelFolderCandidate | None

Classify one folder name into a model candidate.

Parameters:
  • folder_name (str) – Folder name to classify.

Returns:

Classified model candidate, or None when no known model token is found.

Return type:

ModelFolderCandidate or None

spatial_vtk.io.compare_metric_plan_to_table(expected_df: DataFrame, metrics_df: DataFrame, *, key_columns: Sequence[str] = ('event_id', 'station', 'component', 'model', 'passband', 'metric_group', 'metric', 'period_s')) tuple[DataFrame, MetricCompleteness]

Compare expected metric rows to an existing metrics table.

Parameters:
  • expected_df (pandas.DataFrame) – Expected key table.

  • metrics_df (pandas.DataFrame) – Existing metrics table.

  • key_columns (Sequence, optional) – Columns used for comparison.

Returns:

Missing-row table and completeness summary.

Return type:

tuple
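
The comparison amounts to a key-based set difference between expected and existing rows. The following sketch shows the idea on lists of dictionaries; the Completeness dataclass and compare_keys helper are hypothetical stand-ins, it assumes expected rows have unique keys, and the library works on DataFrames.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Completeness:
    """Hypothetical stand-in for the completeness summary."""
    expected: int
    present: int
    missing: int
    key_columns: tuple


def compare_keys(expected_rows, metric_rows, key_columns):
    """Return expected rows with no matching key in metric_rows, plus counts."""
    def key(row):
        return tuple(row.get(col) for col in key_columns)

    present_keys = {key(row) for row in metric_rows}
    missing_rows = [row for row in expected_rows if key(row) not in present_keys]
    summary = Completeness(
        expected=len(expected_rows),
        present=len(expected_rows) - len(missing_rows),
        missing=len(missing_rows),
        key_columns=tuple(key_columns),
    )
    return missing_rows, summary


keys = ("event_id", "station", "metric")
expected = [
    {"event_id": "ev1", "station": "STA1", "metric": "pgv"},
    {"event_id": "ev1", "station": "STA2", "metric": "pgv"},
    {"event_id": "ev2", "station": "STA1", "metric": "pgv"},
]
computed = [
    {"event_id": "ev1", "station": "STA1", "metric": "pgv", "value": 0.8},
    {"event_id": "ev2", "station": "STA1", "metric": "pgv", "value": 1.1},
]
missing_rows, summary = compare_keys(expected, computed, keys)
```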

spatial_vtk.io.comparison_qc_passed(obs_row: dict[str, Any] | None, syn_row: dict[str, Any] | None) bool

Return whether both observed and synthetic QC rows pass.

Parameters:
  • obs_row (dict[str, Any] | None) – QC row for the observed side, or None.

  • syn_row (dict[str, Any] | None) – QC row for the synthetic side, or None.

Returns:

True when both the observed and synthetic QC rows pass.

Return type:

bool

spatial_vtk.io.compute_sha256(path: str | Path, *, chunk_size: int = 1048576) str

Compute the SHA-256 digest for one file.

Parameters:
  • path (str | pathlib.Path) – File to hash.

  • chunk_size (int, optional) – Number of bytes read per chunk.

Returns:

Hexadecimal SHA-256 digest.

Return type:

str
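
Chunked hashing keeps memory use bounded regardless of file size. A minimal sketch with hashlib follows; sha256_of_file is a hypothetical name, and the small chunk size in the usage lines is chosen only to exercise the loop.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"abc" * 1000)
    chunked = sha256_of_file(sample, chunk_size=7)
    whole = hashlib.sha256(b"abc" * 1000).hexdigest()
```

The digest does not depend on the chunk size, so chunk_size only trades memory for the number of read calls.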

spatial_vtk.io.context_dataset_paths(root: str | Path | None = None) dict[str, Path]

Return default public context-dataset paths.

Parameters:
  • root (str | pathlib.Path | None, optional) – Optional repository root containing examples/data.

Returns:

Named paths for events, geology, regions, subbasins, and event patches.

Return type:

dict

spatial_vtk.io.default_output_paths(output_dir: str | Path, names: Iterable[str], *, suffix: str = '.csv', create_dir: bool = True) SimpleNamespace

Return a namespace of standard output paths.

Parameters:
  • output_dir (str | pathlib.Path) – Directory where output files should be written.

  • names (Iterable) – Basenames without extension, such as "qc_inventory".

  • suffix (str, optional) – File extension to append when a name has no extension.

  • create_dir (bool, optional) – Whether to create output_dir.

Returns:

Namespace with one attribute per normalized name.

Return type:

types.SimpleNamespace
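
A namespace of output paths can be sketched with types.SimpleNamespace. The name normalization used here (non-word characters replaced by underscores) is an assumption, as is the helper name output_paths; the library's own normalization rules are not documented in this section.

```python
import re
import tempfile
from pathlib import Path
from types import SimpleNamespace


def output_paths(output_dir, names, suffix=".csv", create_dir=True):
    """Map each name to a path under output_dir, exposed as a namespace attribute."""
    output_dir = Path(output_dir)
    if create_dir:
        output_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name in names:
        # Append the default suffix only when the name has no extension.
        filename = name if Path(name).suffix else f"{name}{suffix}"
        # Hypothetical normalization: make the stem a valid attribute name.
        attr = re.sub(r"\W+", "_", Path(name).stem).strip("_")
        paths[attr] = output_dir / filename
    return SimpleNamespace(**paths)


with tempfile.TemporaryDirectory() as tmp:
    paths = output_paths(Path(tmp) / "outputs", ["qc_inventory", "run-summary.json"])
    inventory_name = paths.qc_inventory.name
    summary_name = paths.run_summary.name
    directory_created = paths.qc_inventory.parent.is_dir()
```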

spatial_vtk.io.ensure_run_dir(output_root: str | PathLike[str], workflow: str, config_hash: str, *, run_id: str | None = None, work_root: str | PathLike[str] | None = None) Path

Create and return a compute-workflow run directory.

Parameters:
  • output_root (str | os.PathLike[str]) – Metrics or figure output root used when work_root is not supplied.

  • workflow (str) – Workflow name, for example "metrics_export".

  • config_hash (str) – Hash identifying the effective run configuration.

  • run_id (str | None, optional) – Optional human-readable run identifier.

  • work_root (str | os.PathLike[str] | None, optional) – Optional explicit parent work directory.

Returns:

Created run directory.

Return type:

pathlib.Path

spatial_vtk.io.expected_metric_rows_from_inventory(inventory_df: DataFrame, plan: MetricPlan, *, model_column: str = 'model') DataFrame

Build expected metric row keys from an inventory table and plan.

Parameters:
  • inventory_df (pandas.DataFrame) – QC inventory with event_id, station, and component fields.

  • plan (spatial_vtk.io.plans.MetricPlan) – Resolved metric plan.

  • model_column (str, optional) – Output column used for model identity.

Returns:

Expected row-key table.

Return type:

pandas.DataFrame
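
Expected row keys can be thought of as the Cartesian product of inventory rows with the plan's models, passbands, and metrics. The sketch below illustrates that with itertools.product; expected_keys is a hypothetical helper, the passband label format is an assumption, and the real plan includes further dimensions such as metric groups and spectral periods.

```python
from itertools import product


def expected_keys(inventory, models, passbands, metrics, model_column="model"):
    """Cross every inventory row with every model, passband, and metric."""
    rows = []
    for item, model, (period_min_s, period_max_s), metric in product(
        inventory, models, passbands, metrics
    ):
        rows.append({
            "event_id": item["event_id"],
            "station": item["station"],
            "component": item["component"],
            model_column: model,
            # Hypothetical label format for a (period_min_s, period_max_s) pair.
            "passband": f"{period_min_s:g}-{period_max_s:g}s",
            "metric": metric,
        })
    return rows


inventory = [
    {"event_id": "ev1", "station": "STA1", "component": "Z"},
    {"event_id": "ev1", "station": "STA2", "component": "Z"},
]
plan_rows = expected_keys(
    inventory,
    models=("m1", "m2"),
    passbands=((2.0, 5.0), (5.0, 10.0)),
    metrics=("pgv",),
)
```

Two inventory rows, two models, two passbands, and one metric give eight expected keys.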

spatial_vtk.io.inspect_station_event_layouts(root: str | Path, *, suffixes: tuple[str, ...] = ('.csv', '.json', '.geojson', '.mseed', '.h5', '.hdf5', '.asdf'), max_files: int | None = None) DataFrame

Inventory station/event files below a root directory.

Parameters:
  • root (str | pathlib.Path) – Directory to inspect.

  • suffixes (tuple, optional) – File suffixes to include.

  • max_files (int | None, optional) – Optional cap on the number of files returned.

Returns:

File inventory with relative path, suffix, size, and light metadata.

Return type:

pandas.DataFrame

spatial_vtk.io.inspect_synthetic_format(input_syn_path: str | Path) SyntheticFormatInfo

Inspect one synthetic path and classify its format.

Parameters:
  • input_syn_path (str | pathlib.Path) – Synthetic root, file, or glob template.

Returns:

Format classification.

Return type:

SyntheticFormatInfo

spatial_vtk.io.load_csv_bundle(sources: str | Path | Sequence[str | Path] | dict[str, Any], *, base_dir: str | Path | None = None, source_column: str = '__source_csv__') DataFrame

Load one or more CSV files into a single table.

Parameters:
  • sources (str | pathlib.Path | collections.abc.Sequence[str | pathlib.Path] | dict[str, Any]) – CSV path, glob pattern, sequence of paths, or config dictionary with one of input_csv, csv_dir, or csv_files.

  • base_dir (str | pathlib.Path | None, optional) – Directory used to resolve relative paths.

  • source_column (str, optional) – Name of the column recording the source CSV path.

Returns:

Concatenated CSV table.

Return type:

pandas.DataFrame
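
Concatenating several CSV files while recording where each row came from can be sketched with the standard csv module. This is an illustration of the source-column idea only; load_csv_files is a hypothetical helper, and the library returns a pandas DataFrame and accepts globs and config dictionaries as well.

```python
import csv
import tempfile
from pathlib import Path


def load_csv_files(paths, source_column="__source_csv__"):
    """Read each CSV and tag every row with the path it was loaded from."""
    rows = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row[source_column] = str(path)
                rows.append(row)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "a.csv").write_text("station,value\nSTA1,1\n", encoding="utf-8")
    (base / "b.csv").write_text("station,value\nSTA2,2\nSTA3,3\n", encoding="utf-8")
    bundle = load_csv_files(sorted(base.glob("*.csv")))
    stations = [row["station"] for row in bundle]
    sources = [Path(row["__source_csv__"]).name for row in bundle]
```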

spatial_vtk.io.load_output_table(key: str, *, cfg: SpatialVTKConfig | None = None, **kwargs: Any) DataFrame

Load a standard output table by artifact key.

Parameters:
  • key (str) – Registered output key such as "prepared_stations".

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • **kwargs – Additional read options forwarded to read_table.

Returns:

Loaded table.

Return type:

pandas.DataFrame

spatial_vtk.io.load_waveform_collection(paths: Sequence[str | Path], format: str | None = None) Any

Load several waveform files and combine them when possible.

Parameters:
  • paths (collections.abc.Sequence) – File paths to load.

  • format (str | None, optional) – Optional ObsPy format forwarded to read_waveform_file().

Returns:

Combined ObsPy stream when all inputs are ObsPy streams, otherwise a flat list of trace-like objects.

Return type:

object

spatial_vtk.io.metric_plan_from_config(config: SpatialVTKConfig, *, command: str = 'metrics.calculate', overrides: dict[str, Any] | None = None) MetricPlan

Build a metric plan from public config sections and run defaults.

Parameters:
  • config (spatial_vtk.config.runtime.SpatialVTKConfig) – Loaded Spatial-VTK config.

  • command (str, optional) – Dotted command key used to merge run defaults.

  • overrides (dict[str, Any] | None, optional) – Explicit values for this run. These override the config file and any selected run scenario.

Returns:

Resolved metric plan.

Return type:

MetricPlan

spatial_vtk.io.metric_qc_lookup(qc_table: DataFrame | str | Path | None) dict[tuple[str, str, str, str, str, str, str], dict[str, Any]]

Build a lookup for side-specific metric QC decisions.

Parameters:
  • qc_table (pandas.DataFrame | str | pathlib.Path | None) – QC table or path. None returns an empty lookup.

Returns:

Normalized lookup keyed by source/event/station/component/group/metric/period.

Return type:

dict

spatial_vtk.io.metric_qc_passed(row: dict[str, Any] | Series | None) bool

Return whether a normalized QC row passed.

Parameters:

row (dict[str, Any] | pandas.Series | None) – Normalized QC row, or None.

Returns:

True when the QC row passed.

Return type:

bool

spatial_vtk.io.normalize_event_table(df: DataFrame, *, extra_columns: Sequence[str] | None = None) DataFrame

Normalize event metadata columns.

Parameters:
  • df (pandas.DataFrame) – Raw event table.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Extra columns to preserve.

Returns:

Table with public event columns.

Return type:

pandas.DataFrame

spatial_vtk.io.normalize_metric_qc_table(table: DataFrame | str | Path, *, source: str | None = None) DataFrame

Normalize a side-specific metric QC table.

Parameters:
  • table (pandas.DataFrame | str | pathlib.Path) – Raw QC table or CSV/parquet path.

  • source (str | None, optional) – Optional source override.

Returns:

Normalized QC table.

Return type:

pandas.DataFrame

spatial_vtk.io.normalize_metric_table(df: DataFrame) DataFrame

Normalize common legacy metric-table column names.

Parameters:
  • df (pandas.DataFrame) – Metric table with either public or legacy column names.

Returns:

Copy of the table with public column names where possible.

Return type:

pandas.DataFrame

spatial_vtk.io.normalize_metric_waveform_inventory(table: DataFrame | str | Path, *, source: str | None = None, synthetic_max_frequency_hz: float | None = None) DataFrame

Normalize an observed or synthetic waveform inventory.

Parameters:
  • table (pandas.DataFrame | str | pathlib.Path) – Raw inventory table or CSV/parquet path.

  • source (str | None, optional) – Optional source override, usually "observed" or "synthetic".

  • synthetic_max_frequency_hz (float | None, optional) – Optional synthetic max-frequency default for rows missing a value.

Returns:

Normalized inventory with public metric workflow columns.

Return type:

pandas.DataFrame

spatial_vtk.io.normalize_model_alias(alias: str) str

Normalize model alias spelling.

Parameters:
  • alias (str) – User-supplied model alias.

Returns:

Normalized alias.

Return type:

str

spatial_vtk.io.normalize_station_table(df: DataFrame, *, extra_columns: Sequence[str] | None = None) DataFrame

Normalize station metadata columns.

Parameters:
  • df (pandas.DataFrame) – Raw station table.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Extra columns to preserve.

Returns:

Table with public station columns.

Return type:

pandas.DataFrame

spatial_vtk.io.prepare_event_metadata(event_metadata: DataFrame | None = None, *, required_columns: Sequence[str] = ('event_id', 'event_lat', 'event_lon'), keep_extra_columns: bool = True) DataFrame

Prepare an event metadata table with canonical event columns.

Parameters:
  • event_metadata (pandas.DataFrame | None, optional) – Input table with event identifiers and hypocenter/source metadata. When omitted, paths.event_metadata is read from the active config.

  • required_columns (Sequence[str], optional) – Canonical columns required in the output.

  • keep_extra_columns (bool, optional) – Whether to retain input columns not used for canonical names.

Returns:

Prepared table containing at least event_id, event_lat, and event_lon.

Return type:

pandas.DataFrame

spatial_vtk.io.prepare_event_station_table(event_station_metadata: DataFrame | None = None, *, station_metadata: DataFrame | None = None, event_metadata: DataFrame | None = None) DataFrame

Prepare event-station rows and optionally join station/event metadata.

Parameters:
  • event_station_metadata (pandas.DataFrame | None, optional) – Table containing event and station identifiers. When omitted and both station_metadata and event_metadata are provided, every event-station pair is generated from those tables. When omitted without both metadata tables, paths.event_station_table is read from the active config.

  • station_metadata (pandas.DataFrame | None, optional) – Optional prepared station metadata table.

  • event_metadata (pandas.DataFrame | None, optional) – Optional prepared event metadata table.

Returns:

Event-station table with canonical identifiers and joined metadata.

Return type:

pandas.DataFrame
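
Example: the sketch below illustrates what "every event-station pair is generated" means when event_station_metadata is omitted and both metadata tables are supplied. It is not the library implementation. Plain dictionaries stand in for DataFrame rows, and all_event_station_pairs is a hypothetical helper name.

```python
from itertools import product


def all_event_station_pairs(events, stations):
    """Pair every event row with every station row (a full cross product)."""
    return [{**event, **station} for event, station in product(events, stations)]


events = [
    {"event_id": "ev1", "event_lat": 34.0, "event_lon": -118.0},
    {"event_id": "ev2", "event_lat": 35.0, "event_lon": -117.0},
]
stations = [
    {"station": "STA1", "lat": 34.1, "lon": -118.2},
    {"station": "STA2", "lat": 34.5, "lon": -117.9},
    {"station": "STA3", "lat": 35.2, "lon": -117.1},
]

# Two events and three stations give six event-station rows.
pairs = all_event_station_pairs(events, stations)
```

The row count grows as events × stations, so passing an explicit event_station_metadata table is likely preferable when only a subset of pairs has waveforms.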

spatial_vtk.io.prepare_station_metadata(station_metadata: DataFrame | None = None, *, required_columns: Sequence[str] = ('station', 'lat', 'lon'), keep_extra_columns: bool = True) DataFrame

Prepare a station metadata table with canonical station columns.

Parameters:
  • station_metadata (pandas.DataFrame | None, optional) – Input table with station identifiers and coordinates. When omitted, paths.station_metadata is read from the active config.

  • required_columns (Sequence[str], optional) – Canonical columns required in the output.

  • keep_extra_columns (bool, optional) – Whether to retain input columns not used for canonical names.

Returns:

Prepared table containing at least station, lat, and lon.

Return type:

pandas.DataFrame
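
Example: a minimal sketch of the preparation pattern described above, namely renaming known aliases to canonical names, checking required_columns, and optionally dropping extras. The ALIASES map and prepare_rows helper are hypothetical; the aliases the real function recognizes and the error it raises are not listed in this reference.

```python
# Hypothetical alias map, used only for illustration.
ALIASES = {"sta": "station", "latitude": "lat", "longitude": "lon"}


def prepare_rows(rows, required_columns=("station", "lat", "lon"), keep_extra_columns=True):
    """Rename aliased keys, verify required columns, and optionally drop extras."""
    prepared = []
    for row in rows:
        renamed = {ALIASES.get(key, key): value for key, value in row.items()}
        missing = [col for col in required_columns if col not in renamed]
        if missing:
            # Assumption: a missing canonical column is treated as an error.
            raise ValueError(f"Missing required columns: {missing}")
        if not keep_extra_columns:
            renamed = {col: renamed[col] for col in required_columns}
        prepared.append(renamed)
    return prepared


rows = [{"sta": "STA1", "latitude": 34.1, "longitude": -118.2, "elev_m": 120}]
with_extras = prepare_rows(rows)
canonical_only = prepare_rows(rows, keep_extra_columns=False)
```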

spatial_vtk.io.preprocess_stream(stream: Any, preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) Any

Apply waveform preprocessing to every trace in a stream-like object.

Parameters:
  • stream (Any) – ObsPy stream, iterable of traces, or one trace-like mapping/object.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional preprocessing settings. When omitted, the active config is used when available.

  • config (Any | None, optional) – Optional Spatial-VTK config used only when preprocessing is omitted.

Returns:

A copied stream or list of copied trace-like mappings with updated samples, sampling rate, and delta metadata.

Return type:

object

spatial_vtk.io.preprocess_waveform_files(event_station_records: DataFrame | str | Path, output_root: str | Path | None = None, *, source_columns: Mapping[str, str] | None = None, preprocessing: WaveformPreprocessing | None = None, config: Any | None = None, event_id_col: str = 'event_id', overwrite: bool = False, continue_on_error: bool = False, replace_input_columns: bool = True, drop_unprocessed_rows: bool = True, verbose: bool = False, event_station_name: str = 'event_station_records_preprocessed.csv', manifest_name: str = 'waveform_preprocessing_manifest.csv', trace_metadata_name: str = 'trace_metadata_preprocessed.csv') WaveformPreprocessingWorkflowResult

Preprocess waveform files and write reusable processed copies.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – DataFrame or CSV/Parquet path with event IDs and waveform paths.

  • output_root (str | pathlib.Path | None, optional) – Folder where processed waveforms and metadata tables will be written. When omitted, outputs.preprocessed_waveforms is read from config or the active Spatial-VTK config.

  • source_columns (Mapping[str, str] | None, optional) – Optional mapping from source label to waveform path column, such as {"observed": "observed_mseed"}. When omitted, common observed/synthetic waveform column names are detected.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional Spatial-VTK config used only when preprocessing is omitted.

  • event_id_col (str, optional) – Column containing event IDs.

  • overwrite (bool, optional) – Replace existing processed waveform files when true.

  • continue_on_error (bool, optional) – Record failed files in the manifest and continue when true. The default is to raise a clear error so downstream steps do not use missing files.

  • replace_input_columns (bool, optional) – When true, the original waveform path columns are replaced with processed paths while raw paths are preserved in *_raw_waveform.

  • drop_unprocessed_rows (bool, optional) – When true, rows that did not resolve to any processed waveform path are removed from the returned/written event-station table. Disable this only when you need to audit the full input table, including unavailable waveform records.

  • verbose (bool, optional) – Print progress messages while resolving and preprocessing waveform files. This is useful in notebooks for long-running ASDF/MiniSEED preprocessing.

  • event_station_name (str, optional) – Filename of the updated event-station table, written under output_root/metadata.

  • manifest_name (str, optional) – Filename of the preprocessing manifest, written under output_root/metadata.

  • trace_metadata_name (str, optional) – Filename of the processed trace metadata table, written under output_root/metadata.

Returns:

Updated table, manifest, trace metadata, and their written paths.

Return type:

spatial_vtk.io.preprocessing.WaveformPreprocessingWorkflowResult
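
Example: the sketch below illustrates the continue_on_error and drop_unprocessed_rows semantics described above with a generic per-file loop. It does not call Spatial-VTK. preprocess_all, fake_process, and the manifest keys (input, output, status, error) are hypothetical names chosen for illustration.

```python
def preprocess_all(paths, process, continue_on_error=False):
    """Run `process` on each path and collect one manifest row per file."""
    manifest = []
    for path in paths:
        try:
            output = process(path)
        except Exception as exc:
            if not continue_on_error:
                # Default: fail loudly so later steps never see missing files.
                raise RuntimeError(f"Preprocessing failed for {path}: {exc}") from exc
            manifest.append({"input": path, "output": None, "status": "failed", "error": str(exc)})
        else:
            manifest.append({"input": path, "output": output, "status": "ok", "error": ""})
    return manifest


def fake_process(path):
    # Hypothetical stand-in for reading, filtering, and writing one waveform file.
    if "bad" in path:
        raise ValueError("unreadable waveform")
    return path.replace("raw", "processed")


manifest = preprocess_all(["raw/a.mseed", "raw/bad.mseed"], fake_process, continue_on_error=True)

# Analogue of drop_unprocessed_rows=True: keep only rows with a processed path.
processed_only = [row for row in manifest if row["output"] is not None]
```

With continue_on_error left at its default, the same call raises on the first failing file instead of recording it.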

spatial_vtk.io.read_artifact_manifest(manifest_path: str | Path) dict[str, Any]

Read a JSON artifact manifest.

Parameters:
  • manifest_path (str | pathlib.Path) – Manifest JSON path.

Returns:

Parsed manifest payload.

Return type:

dict

spatial_vtk.io.read_config_table(dotted_key: str, *, cfg: SpatialVTKConfig | None = None, must_exist: bool = True, **kwargs: Any) DataFrame

Read a table path from the active config.

Parameters:
  • dotted_key (str) – Config key that points to a table, such as "paths.station_metadata".

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • must_exist (bool, optional) – Whether to raise an error if the configured path is missing.

  • **kwargs – Additional read options forwarded to read_table.

Returns:

Loaded table.

Return type:

pandas.DataFrame
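
Example: a dotted key such as "paths.station_metadata" names a nested config entry. The sketch below shows one way such a key could be resolved against a nested dictionary. resolve_dotted_key and the sample config are hypothetical; how SpatialVTKConfig stores and resolves keys internally is not documented here.

```python
def resolve_dotted_key(config, dotted_key):
    """Walk a nested dict one key segment at a time."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"Config key not found: {dotted_key}")
        node = node[part]
    return node


# Hypothetical config content used only for this illustration.
config = {"paths": {"station_metadata": "inputs/stations.csv"}}
table_path = resolve_dotted_key(config, "paths.station_metadata")
```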

spatial_vtk.io.read_event_metadata(path: str | Path, **kwargs) DataFrame

Read and prepare an event metadata CSV file.

Parameters:
  • path (str | pathlib.Path) – CSV path.

  • **kwargs – Extra arguments passed to prepare_event_metadata.

Returns:

Prepared event metadata table.

Return type:

pandas.DataFrame

spatial_vtk.io.read_event_patch_table(path: str | Path | None = None, **kwargs) DataFrame

Read an optional event patch/context table.

Parameters:
  • path (str | pathlib.Path | None, optional) – Event patch table path. When omitted, the public example path is used.

  • **kwargs – Additional arguments forwarded to pandas.read_csv.

Returns:

Event patch table.

Return type:

pandas.DataFrame

spatial_vtk.io.read_event_station_table(path: str | Path, **kwargs) DataFrame

Read and prepare an event-station metadata CSV file.

Parameters:
  • path (str | pathlib.Path) – CSV path.

  • **kwargs – Extra arguments passed to prepare_event_station_table.

Returns:

Prepared event-station table.

Return type:

pandas.DataFrame

spatial_vtk.io.read_events(path: str | Path | None = None, **kwargs) DataFrame

Read and standardize an event catalog.

Parameters:
  • path (str | pathlib.Path | None, optional) – Event catalog path. When omitted, the public example path is used.

  • **kwargs – Additional arguments forwarded to pandas.read_csv.

Returns:

Event table with standardized event and coordinate columns.

Return type:

pandas.DataFrame

spatial_vtk.io.read_json(path: str | PathLike[str]) Any

Read one JSON file.

Parameters:
  • path (str | os.PathLike[str]) – JSON path.

Returns:

Decoded JSON payload.

Return type:

object

spatial_vtk.io.read_station_metadata(path: str | Path, **kwargs) DataFrame

Read and prepare a station metadata CSV file.

Parameters:
  • path (str | pathlib.Path) – CSV path.

  • **kwargs – Extra arguments passed to prepare_station_metadata.

Returns:

Prepared station metadata table.

Return type:

pandas.DataFrame

spatial_vtk.io.read_stations(path: str | Path, **kwargs) DataFrame

Read and standardize a station catalog.

Parameters:
  • path (str | pathlib.Path) – Station catalog path.

  • **kwargs – Additional arguments forwarded to pandas.read_csv.

Returns:

Station table with standardized station and coordinate columns.

Return type:

pandas.DataFrame

spatial_vtk.io.read_table(path: str | Path, **kwargs: Any) DataFrame

Read one CSV or Parquet table from disk.

Parameters:
  • path (str | pathlib.Path) – Input table path ending in .csv, .parquet, or .pq.

  • **kwargs – Additional keyword arguments forwarded to pandas.read_csv for CSV files or pandas.read_parquet for Parquet files.

Returns:

Loaded table.

Return type:

pandas.DataFrame
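
Example: the sketch below illustrates the suffix-based dispatch described above by mapping each supported suffix to the name of the pandas reader it is forwarded to. It is not the library implementation. pandas_reader_name is a hypothetical helper, and both the case-insensitive match and the ValueError for other suffixes are assumptions.

```python
from pathlib import Path

# Suffix-to-reader mapping taken from the parameter description above.
READERS = {".csv": "read_csv", ".parquet": "read_parquet", ".pq": "read_parquet"}


def pandas_reader_name(path):
    """Return the name of the pandas reader that matches the file suffix."""
    suffix = Path(path).suffix.lower()  # assumption: suffix matching ignores case
    if suffix not in READERS:
        # Assumption: behavior for other suffixes is not documented in this reference.
        raise ValueError(f"Unsupported table suffix: {suffix!r}")
    return READERS[suffix]


csv_reader = pandas_reader_name("tables/stations.csv")
parquet_reader = pandas_reader_name("tables/metrics.pq")
```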

spatial_vtk.io.read_waveform_file(path: str | Path, format: str | None = None) Any

Read one waveform file.

Parameters:
  • path (str | pathlib.Path) – Waveform path. ObsPy-supported formats are read with ObsPy when it is installed. .npz and .npy files are supported without ObsPy.

  • format (str | None, optional) – Optional format string forwarded to obspy.read.

Returns:

ObsPy Stream when ObsPy handles the file, a list of lightweight mapping traces for NumPy files, or the raw object returned by ObsPy for non-stream readers.

Return type:

object

spatial_vtk.io.resolve_model_aliases(requested: list[str], input_syn_path: str | Path, *, model_folders: dict[str, str] | None = None, allow_ambiguous: bool = False) ModelResolution

Resolve requested model aliases to folders under a synthetic root.

Parameters:
  • requested (list[str]) – Requested aliases or folder names.

  • input_syn_path (str | pathlib.Path) – Synthetic root directory or path template containing {model}.

  • model_folders (dict[str, str] | None, optional) – Optional explicit alias-to-folder mapping.

  • allow_ambiguous (bool, optional) – If true, keep all folder matches by adding variant suffixes.

Returns:

Resolved aliases, selected folders, and any ambiguous matches.

Return type:

spatial_vtk.io.model_aliases.ModelResolution

spatial_vtk.io.scan_synthetic_model_folders(input_syn_path: str | Path) list[ModelFolderCandidate]

Scan a synthetic root and classify immediate child folders.

Parameters:
  • input_syn_path (str | pathlib.Path) – Synthetic root directory or path template containing {model}.

Returns:

Classified folders sorted by folder name.

Return type:

list of ModelFolderCandidate

spatial_vtk.io.stable_hash(value: Mapping[str, Any], *, length: int | None = None) str

Return a stable SHA256 hash for one mapping.

Parameters:
  • value (Mapping[str, Any]) – Mapping to hash.

  • length (int | None, optional) – Optional prefix length.

Returns:

Full hash or hash prefix.

Return type:

str
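
Example: one common way to obtain a hash that is stable across key order is to serialize the mapping as canonical JSON before hashing. The sketch below shows that technique. Whether stable_hash canonicalizes its input this way is an assumption, and stable_hash_sketch is a hypothetical name, so digests from this sketch need not match the library's.

```python
import hashlib
import json


def stable_hash_sketch(value, length=None):
    """Hash a mapping via sorted-key JSON so key order does not matter."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Optional prefix length, mirroring the `length` parameter described above.
    return digest if length is None else digest[:length]


full = stable_hash_sketch({"kind": "metrics", "name": "pgv"})
reordered = stable_hash_sketch({"name": "pgv", "kind": "metrics"})
short = stable_hash_sketch({"kind": "metrics", "name": "pgv"}, length=12)
```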

spatial_vtk.io.stream_station_table(stream: Any) DataFrame

Build a deduplicated station table from stream trace metadata.

Parameters:
  • stream (Any) – Stream-like object.

Returns:

One row per network/station pair.

Return type:

pandas.DataFrame
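
Example: "one row per network/station pair" means that several channels of the same station collapse to a single row. The sketch below shows that deduplication on trace-like mappings. The network, station, and channel keys and the station_rows helper are assumptions made for illustration.

```python
def station_rows(traces):
    """Return one row per (network, station) pair, in first-seen order."""
    seen = {}
    for trace in traces:
        key = (trace["network"], trace["station"])
        seen.setdefault(key, {"network": key[0], "station": key[1]})
    return list(seen.values())


traces = [
    {"network": "CI", "station": "PAS", "channel": "BHZ"},
    {"network": "CI", "station": "PAS", "channel": "BHN"},
    {"network": "CI", "station": "USC", "channel": "BHZ"},
]
rows = station_rows(traces)
```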

spatial_vtk.io.synthetic_reader_for(info: SyntheticFormatInfo) SyntheticReader

Return a normalized reader for one synthetic format.

Parameters:
  • info (SyntheticFormatInfo) – Synthetic format description that selects the reader.

Returns:

Reader that exposes a common read() method.

Return type:

spatial_vtk.io.synthetic_formats.SyntheticReader

spatial_vtk.io.trace_metadata_table(stream: Any, *, source: str | Path | None = None, event_id: str | None = None) DataFrame

Extract normalized metadata from trace-like objects.

Parameters:
  • stream (Any) – ObsPy stream, iterable of trace-like objects, or one trace mapping.

  • source (str | pathlib.Path | None, optional) – Optional source label or path copied into the output.

  • event_id (str | None, optional) – Optional event identifier copied into the output.

Returns:

One row per trace with stable public metadata columns.

Return type:

pandas.DataFrame

spatial_vtk.io.utc_run_id() str

Return a compact UTC timestamp suitable for run-directory names.

Parameters:

None

Returns:

Timestamp such as 20260512T024500Z.

Return type:

str
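
Example: the sample value 20260512T024500Z corresponds to the strftime pattern %Y%m%dT%H%M%SZ applied to a UTC time. The sketch below reproduces that format. utc_run_id_sketch and its now argument are hypothetical and exist only to make the output reproducible.

```python
from datetime import datetime, timezone


def utc_run_id_sketch(now=None):
    """Format a UTC time as a compact, sortable run identifier."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


fixed = datetime(2026, 5, 12, 2, 45, 0, tzinfo=timezone.utc)
run_id = utc_run_id_sketch(fixed)
```

Identifiers in this format sort lexicographically in chronological order, which is convenient for run-directory names.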

spatial_vtk.io.waveform_preprocessing_from_config(config: Any | None = None) WaveformPreprocessing

Read waveform preprocessing settings from a Spatial-VTK config.

Parameters:
  • config (Any | None, optional) – Optional SpatialVTKConfig. When omitted, the active or discovered config is used when available.

Returns:

Parsed lowpass cutoff and filter order.

Return type:

spatial_vtk.io.waveforms.WaveformPreprocessing

spatial_vtk.io.waveform_preprocessing_label(preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) str

Return a human-readable label for configured waveform preprocessing.

Parameters:
  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional config used only when preprocessing is omitted.

Returns:

Label suitable for waveform figure subtitles.

Return type:

str

spatial_vtk.io.wide_to_long_metrics(df: DataFrame, residual_mode: str = 'logratio', *, context_cols: Sequence[str] | None = None) DataFrame

Convert wide *_obs/*_syn metric columns into tidy long form.

Parameters:
  • df (pandas.DataFrame) – Metric table with observed and synthetic metric columns.

  • residual_mode (str, optional) – "logratio" for log10(synthetic / observed) or "diff" for synthetic - observed.

  • context_cols (collections.abc.Sequence[str] | None, optional) – Metadata columns to preserve in each long-form row.

Returns:

Long-form metric table with metric, value_obs, value_syn, residual, and optional score columns.

Return type:

pandas.DataFrame
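
Example: the sketch below works through the two residual modes and the wide-to-long reshaping on a single row. Plain dictionaries stand in for DataFrame rows. The metric names pga and pgv, the helper names, and the ValueError for unknown modes are assumptions. The optional score column and the handling of non-positive values are left out.

```python
import math


def residual(value_obs, value_syn, residual_mode="logratio"):
    """Compute one residual using the definitions given above."""
    if residual_mode == "logratio":
        return math.log10(value_syn / value_obs)
    if residual_mode == "diff":
        return value_syn - value_obs
    raise ValueError(f"Unknown residual_mode: {residual_mode!r}")


def wide_row_to_long(row, residual_mode="logratio", context_cols=()):
    """Turn one wide row with *_obs/*_syn columns into one long row per metric."""
    metrics = sorted(
        key[:-4] for key in row if key.endswith("_obs") and key[:-4] + "_syn" in row
    )
    context = {col: row[col] for col in context_cols}
    return [
        {
            **context,
            "metric": metric,
            "value_obs": row[metric + "_obs"],
            "value_syn": row[metric + "_syn"],
            "residual": residual(row[metric + "_obs"], row[metric + "_syn"], residual_mode),
        }
        for metric in metrics
    ]


row = {"station": "STA1", "pga_obs": 0.1, "pga_syn": 1.0, "pgv_obs": 2.0, "pgv_syn": 5.0}
long_logratio = wide_row_to_long(row, context_cols=["station"])
long_diff = wide_row_to_long(row, residual_mode="diff")
```

In "logratio" mode a synthetic value ten times the observed one gives a residual of about 1, while "diff" mode keeps the metric's native units.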

spatial_vtk.io.write_artifact_manifest(artifact_path: str | Path, spec: ArtifactSpec, *, extra: Mapping[str, Any] | None = None) Path

Write a JSON manifest next to a planned artifact.

Parameters:
  • artifact_path (str | pathlib.Path) – Output artifact path.

  • spec (spatial_vtk.io.artifacts.ArtifactSpec) – Artifact planning record.

  • extra (Mapping[str, Any] | None, optional) – Optional additional manifest fields.

Returns:

Written manifest path.

Return type:

pathlib.Path

spatial_vtk.io.write_json(path: str | PathLike[str], payload: Mapping[str, Any] | list[Any]) Path

Atomically write one JSON payload.

Parameters:
  • path (str | os.PathLike[str]) – Destination JSON path.

  • payload (Mapping[str, Any] | list[Any]) – JSON-serializable mapping or list.

Returns:

Written path.

Return type:

pathlib.Path
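
Example: a common way to write a file atomically is to write to a temporary file in the destination directory and then rename it over the target. The sketch below shows that pattern with the standard library. Whether write_json uses exactly this approach is an assumption, and write_json_atomic is a hypothetical name.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path, payload):
    """Write JSON to a temp file in the target directory, then rename it into place."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
        # os.replace swaps the file in one step when both paths share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the partial temp file so failed writes leave no debris.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path
```

Readers of the destination path see either the previous complete file or the new complete file, never a half-written one.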

spatial_vtk.io.write_master_event_list(df: DataFrame, path: str | Path, *, overwrite: bool = True) Path

Write a master event list CSV.

Parameters:
  • df (pandas.DataFrame) – Master event table to write.

  • path (str | pathlib.Path) – Destination CSV path.

  • overwrite (bool, optional) – Whether to replace an existing file at path. Defaults to True.

Returns:

Written CSV path.

Return type:

pathlib.Path

spatial_vtk.io.write_master_station_list(df: DataFrame, path: str | Path, *, overwrite: bool = True) Path

Write a master station list CSV.

Parameters:
  • df (pandas.DataFrame) – Master station table to write.

  • path (str | pathlib.Path) – Destination CSV path.

  • overwrite (bool, optional) – Whether to replace an existing file at path. Defaults to True.

Returns:

Written CSV path.

Return type:

pathlib.Path

spatial_vtk.io.write_named_tables(tables: dict[str, DataFrame], paths: SimpleNamespace | dict[str, str | Path], *, index: bool = False) dict[str, Path]

Write a set of named tables to matching named paths.

Parameters:
  • tables (dict[str, pandas.DataFrame]) – Mapping from logical table name to dataframe.

  • paths (types.SimpleNamespace | dict[str, str | pathlib.Path]) – Namespace or mapping with one path per table name.

  • index (bool, optional) – Whether to include dataframe indexes.

Returns:

Written paths keyed by table name.

Return type:

dict[str, pathlib.Path]
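
Example: because paths may be either a SimpleNamespace or a dict, the table names have to be matched against attributes in one case and keys in the other. The sketch below shows one way to normalize both forms. match_tables_to_paths is a hypothetical helper, and raising KeyError for a table without a path is an assumption.

```python
from pathlib import Path
from types import SimpleNamespace


def match_tables_to_paths(tables, paths):
    """Pair each table name with its path, accepting a namespace or a mapping."""
    path_map = vars(paths) if isinstance(paths, SimpleNamespace) else dict(paths)
    missing = sorted(set(tables) - set(path_map))
    if missing:
        # Assumption: a table without a matching path is treated as an error.
        raise KeyError(f"No path for tables: {missing}")
    return {name: Path(path_map[name]) for name in tables}


tables = {"stations": object()}  # placeholder standing in for a dataframe
from_namespace = match_tables_to_paths(
    tables, SimpleNamespace(stations="out/stations.csv", events="out/events.csv")
)
from_dict = match_tables_to_paths(tables, {"stations": "out/stations.csv"})
```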

spatial_vtk.io.write_output_table(key: str, df: DataFrame, *, outpath: str | Path | None = None, cfg: SpatialVTKConfig | None = None, index: bool = False) Path

Write a standard table using the output registry and config.

Parameters:
  • key (str) – Registered table key such as "prepared_stations".

  • df (pandas.DataFrame) – Table to write.

  • outpath (str | pathlib.Path | None, optional) – Optional explicit output path. This always wins.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • index (bool, optional) – Whether to include the dataframe index.

Returns:

Written table path.

Return type:

pathlib.Path

spatial_vtk.io.write_output_tables(tables: dict[str, DataFrame] | None = None, *, cfg: SpatialVTKConfig | None = None, index: bool = False, **named_tables: DataFrame) dict[str, Path]

Write one or more standard output tables by artifact key.

Parameters:
  • tables (dict[str, pandas.DataFrame] | None, optional) – Optional mapping from registered output keys to dataframes.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • index (bool, optional) – Whether to include dataframe indexes.

  • **named_tables – Additional registered output keys passed as keyword arguments.

Returns:

Written table paths keyed by artifact name.

Return type:

dict[str, pathlib.Path]

spatial_vtk.io.write_station_event_kml(stations: DataFrame, events: DataFrame, output_path: str | Path, *, station_col: str = 'station', station_lat_col: str = 'lat', station_lon_col: str = 'lon', event_col: str = 'event_id', event_lat_col: str = 'lat', event_lon_col: str = 'lon') Path

Write station and event coordinates to a simple KML document.

Parameters:
  • stations (pandas.DataFrame) – Table containing station names and coordinates.

  • events (pandas.DataFrame) – Table containing event names and coordinates.

  • output_path (str | pathlib.Path) – Destination KML path.

  • station_col (str, optional) – Column in stations holding station names.

  • station_lat_col (str, optional) – Column in stations holding station latitudes.

  • station_lon_col (str, optional) – Column in stations holding station longitudes.

  • event_col (str, optional) – Column in events holding event names.

  • event_lat_col (str, optional) – Column in events holding event latitudes.

  • event_lon_col (str, optional) – Column in events holding event longitudes.

Returns:

Path to the written KML file.

Return type:

pathlib.Path
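
Example: the sketch below builds a minimal KML document with one Placemark per station and event using the standard library. It is not the library implementation; the exact elements and styling that write_station_event_kml emits are not documented here. kml_document is a hypothetical helper that takes (name, lat, lon) tuples instead of DataFrames.

```python
import xml.etree.ElementTree as ET


def kml_document(stations, events):
    """Build a minimal KML string with one Placemark per (name, lat, lon) tuple."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    document = ET.SubElement(kml, "Document")
    for name, lat, lon in [*stations, *events]:
        placemark = ET.SubElement(document, "Placemark")
        ET.SubElement(placemark, "name").text = str(name)
        point = ET.SubElement(placemark, "Point")
        # KML coordinates are ordered longitude,latitude (not latitude,longitude).
        ET.SubElement(point, "coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


text = kml_document([("STA1", 34.1, -118.2)], [("ev1", 34.0, -118.0)])
```

Note that the signature above defaults both the station and event coordinate columns to 'lat' and 'lon', so an event table prepared with event_lat and event_lon columns would need event_lat_col and event_lon_col set explicitly.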

spatial_vtk.io.write_table(df: DataFrame, path: str | Path, *, index: bool = False) Path

Write one table based on the destination file extension.

Parameters:
  • df (pandas.DataFrame) – Table to write.

  • path (str | pathlib.Path) – Output path ending in .csv or .parquet. Paths without an extension are written as CSV.

  • index (bool, optional) – Whether to include the dataframe index.

Returns:

Written path.

Return type:

pathlib.Path

spatial_vtk.io.write_trace_metadata_csv(stream: Any, path: str | Path, *, source: str | Path | None = None, event_id: str | None = None) Path

Write trace metadata to CSV.

Parameters:
  • stream (Any) – Stream-like object to inspect.

  • path (str | pathlib.Path) – Output CSV path.

  • source (str | pathlib.Path | None, optional) – Optional source label or path copied into every row.

  • event_id (str | None, optional) – Optional event identifier copied into every row.

Returns:

Written CSV path.

Return type:

pathlib.Path

spatial_vtk.io.written_files_table(written: dict[str, str | Path], *, descriptions: dict[str, str] | None = None, relative_to: str | Path | None = None) DataFrame

Build a readable table of written files.

Parameters:
  • written (dict[str, str | pathlib.Path]) – Mapping from logical output name to written file path.

  • descriptions (dict[str, str] | None, optional) – Optional mapping from logical output name to display description.

  • relative_to (str | pathlib.Path | None, optional) – Optional root used to display relative paths.

Returns:

Two-column manifest with File and Description.

Return type:

pandas.DataFrame
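
Example: the sketch below builds the File and Description columns from a mapping of written paths, displaying paths relative to a root when one is given. Plain dictionaries stand in for DataFrame rows. written_files_rows is a hypothetical helper, and falling back to the logical name when no description is supplied is an assumption.

```python
from pathlib import Path


def written_files_rows(written, descriptions=None, relative_to=None):
    """Build File/Description rows for a mapping of written outputs."""
    descriptions = descriptions or {}
    rows = []
    for name, path in written.items():
        shown = Path(path)
        if relative_to is not None:
            try:
                shown = shown.relative_to(relative_to)
            except ValueError:
                pass  # path lies outside the root: keep it as given
        # Assumption: use the logical name when no description is supplied.
        rows.append({"File": str(shown), "Description": descriptions.get(name, name)})
    return rows


rows = written_files_rows(
    {"stations": "/runs/r1/tables/stations.csv", "events": "/elsewhere/events.csv"},
    descriptions={"stations": "Prepared stations"},
    relative_to="/runs/r1",
)
```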

Metadata and Inventories

Metadata preparation helpers for public Spatial-VTK workflows.

Metadata Overview

This module standardizes common station, event, and event-station metadata tables so notebooks and scripts can start from different column naming conventions without one-off renaming code.

spatial_vtk.io.metadata.prepare_event_metadata(event_metadata: DataFrame | None = None, *, required_columns: Sequence[str] = ('event_id', 'event_lat', 'event_lon'), keep_extra_columns: bool = True) DataFrame

Prepare an event metadata table with canonical event columns.

Parameters:
  • event_metadata (pandas.DataFrame | None, optional) – Input table with event identifiers and hypocenter/source metadata. When omitted, paths.event_metadata is read from the active config.

  • required_columns (Sequence[str], optional) – Canonical columns required in the output.

  • keep_extra_columns (bool, optional) – Whether to retain input columns not used for canonical names.

Returns:

Prepared table containing at least event_id, event_lat, and event_lon.

Return type:

pandas.DataFrame

spatial_vtk.io.metadata.prepare_event_station_table(event_station_metadata: DataFrame | None = None, *, station_metadata: DataFrame | None = None, event_metadata: DataFrame | None = None) DataFrame

Prepare event-station rows and optionally join station/event metadata.

Parameters:
  • event_station_metadata (pandas.DataFrame | None, optional) – Table containing event and station identifiers. When omitted and both station_metadata and event_metadata are provided, every event-station pair is generated from those tables. When omitted without both metadata tables, paths.event_station_table is read from the active config.

  • station_metadata (pandas.DataFrame | None, optional) – Optional prepared station metadata table.

  • event_metadata (pandas.DataFrame | None, optional) – Optional prepared event metadata table.

Returns:

Event-station table with canonical identifiers and joined metadata.

Return type:

pandas.DataFrame
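When event_station_metadata is omitted and both metadata tables are given, every event-station pair is generated. A minimal sketch of that pairing, using lists of dictionaries rather than DataFrames, might look like the following. The helper name `sketch_all_event_station_pairs` and the sample coordinates are made up for illustration.

```python
from itertools import product


def sketch_all_event_station_pairs(events, stations):
    """Pair every event row with every station row and merge their fields."""
    return [{**event, **station} for event, station in product(events, stations)]


events = [
    {"event_id": "ev1", "event_lat": 34.0, "event_lon": -118.0},
    {"event_id": "ev2", "event_lat": 35.0, "event_lon": -117.0},
]
stations = [
    {"station": "STA1", "lat": 34.1, "lon": -118.2},
    {"station": "STA2", "lat": 34.5, "lon": -117.9},
    {"station": "STA3", "lat": 33.9, "lon": -118.4},
]
pairs = sketch_all_event_station_pairs(events, stations)
```

Two events and three stations give six rows, each carrying both the event and the station columns.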

spatial_vtk.io.metadata.prepare_station_metadata(station_metadata: DataFrame | None = None, *, required_columns: Sequence[str] = ('station', 'lat', 'lon'), keep_extra_columns: bool = True) DataFrame

Prepare a station metadata table with canonical station columns.

Parameters:
  • station_metadata (pandas.DataFrame | None, optional) – Input table with station identifiers and coordinates. When omitted, paths.station_metadata is read from the active config.

  • required_columns (Sequence, optional) – Canonical columns required in the output.

  • keep_extra_columns (bool, optional) – Whether to retain input columns not used for canonical names.

Returns:

Prepared table containing at least station, lat, and lon.

Return type:

pandas.DataFrame

spatial_vtk.io.metadata.read_event_metadata(path: str | Path, **kwargs) DataFrame

Read and prepare an event metadata CSV file.

Parameters:
  • path (str | pathlib.Path) – CSV path.

  • **kwargs – Extra arguments passed to prepare_event_metadata.

Returns:

Prepared event metadata table.

Return type:

pandas.DataFrame

spatial_vtk.io.metadata.read_event_station_table(path: str | Path, **kwargs) DataFrame

Read and prepare an event-station metadata CSV file.

Parameters:
  • path (str | pathlib.Path) – CSV path.

  • **kwargs – Extra arguments passed to prepare_event_station_table.

Returns:

Prepared event-station table.

Return type:

pandas.DataFrame

spatial_vtk.io.metadata.read_station_metadata(path: str | Path, **kwargs) DataFrame

Read and prepare a station metadata CSV file.

Parameters:
  • path (str | pathlib.Path) – CSV path.

  • **kwargs – Extra arguments passed to prepare_station_metadata.

Returns:

Prepared station metadata table.

Return type:

pandas.DataFrame

Lightweight input file inventory helpers.

Inventory Overview

This module records what observed and synthetic input files are present before metric calculation. It is intentionally lightweight and does not read waveform samples.

spatial_vtk.io.inventory.build_file_inventory(root: str | Path, *, dataset: str, suffixes: Iterable[str] = frozenset({'.asdf', '.h5', '.hdf5', '.json', '.ms', '.mseed', '.pkl', '.sac'}), relative_to: str | Path | None = None, include_sha256: bool = True) DataFrame

Build a lightweight inventory of files under one input folder.

Parameters:
  • root (str | pathlib.Path) – Folder to scan recursively.

  • dataset (str) – Dataset label recorded in the output, such as "observed" or "synthetic".

  • suffixes (Iterable, optional) – File suffixes to include.

  • relative_to (str | pathlib.Path | None, optional) – Optional base path used to store relative paths.

  • include_sha256 (bool, optional) – Whether to compute file hashes.

Returns:

Inventory table with dataset, path, filename, suffix, size, and optional SHA-256 columns.

Return type:

pandas.DataFrame
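A recursive scan of this kind can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the library's code: the helper `sketch_file_inventory`, the reduced suffix set, and the column name `size_bytes` are hypothetical, and rows are plain dictionaries rather than a DataFrame.

```python
import hashlib
import tempfile
from pathlib import Path

# Hypothetical subset of suffixes for the sketch.
SKETCH_SUFFIXES = frozenset({".mseed", ".sac", ".h5"})


def sketch_file_inventory(root, dataset, suffixes=SKETCH_SUFFIXES,
                          relative_to=None, include_sha256=True):
    """List matching files under root without parsing waveform samples."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        shown = path.relative_to(relative_to) if relative_to else path
        row = {
            "dataset": dataset,
            "path": shown.as_posix(),
            "filename": path.name,
            "suffix": path.suffix.lower(),
            "size_bytes": path.stat().st_size,
        }
        if include_sha256:
            # Whole-file read keeps the sketch short; large files are
            # better hashed in chunks.
            row["sha256"] = hashlib.sha256(path.read_bytes()).hexdigest()
        rows.append(row)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "observed"
    (root / "ev1").mkdir(parents=True)
    (root / "ev1" / "STA1.mseed").write_bytes(b"abc")
    (root / "ev1" / "notes.txt").write_text("ignored")
    inventory = sketch_file_inventory(root, "observed", relative_to=tmp)
```

Only the `.mseed` file is listed; the text file is skipped because its suffix is not in the set.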

spatial_vtk.io.inventory.build_observed_synthetic_inventory(observed_root: str | Path, synthetic_root: str | Path, *, suffixes: Iterable[str] = frozenset({'.asdf', '.h5', '.hdf5', '.json', '.ms', '.mseed', '.pkl', '.sac'}), relative_to: str | Path | None = None, include_sha256: bool = True) DataFrame

Build one combined inventory for observed and synthetic inputs.

Parameters:
  • observed_root (str | pathlib.Path) – Observed input folder.

  • synthetic_root (str | pathlib.Path) – Synthetic input folder.

  • suffixes (Iterable, optional) – File suffixes to include.

  • relative_to (str | pathlib.Path | None, optional) – Optional base path used to store relative paths.

  • include_sha256 (bool, optional) – Whether to compute file hashes.

Returns:

Combined observed/synthetic file inventory.

Return type:

pandas.DataFrame

spatial_vtk.io.inventory.compute_sha256(path: str | Path, *, chunk_size: int = 1048576) str

Compute the SHA-256 digest for one file.

Parameters:
  • path (str | pathlib.Path) – File to hash.

  • chunk_size (int, optional) – Number of bytes read per chunk.

Returns:

Hexadecimal SHA-256 digest.

Return type:

str
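Chunked hashing of this kind is commonly written with hashlib as shown below. This is a sketch of the general technique, assuming the chunk_size parameter bounds each read; `sketch_sha256` is a hypothetical stand-in, not the function's actual body.

```python
import hashlib
import tempfile
from pathlib import Path


def sketch_sha256(path, chunk_size=1048576):
    """Hash a file in fixed-size chunks so large files never load whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"spatial" * 1000)
    chunked = sketch_sha256(sample, chunk_size=64)
    whole = hashlib.sha256(b"spatial" * 1000).hexdigest()
```

The chunked digest matches the one computed over the whole byte string, so the chunk size affects memory use but not the result.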

Master station and event list construction helpers.

Master Lists Overview

This module normalizes common station and event metadata column names into the stable public Spatial-VTK table schemas.

Master Lists Examples

Build lists from CSV files:

build_master_station_list(station_tables=[pd.read_csv("stations.csv")])
build_master_event_list(event_tables=[pd.read_csv("events.csv")])

spatial_vtk.io.master_lists.build_arg_parser() ArgumentParser

Build the module-level CLI parser.

Returns:

Argument parser for the master-list command-line interface.

Return type:

argparse.ArgumentParser

spatial_vtk.io.master_lists.build_master_event_list(*, event_tables: Sequence[DataFrame | str | Path] | None = None, event_records: Sequence[Mapping[str, Any]] | None = None, extra_columns: Sequence[str] | None = None) DataFrame

Build a deduplicated master event list.

Parameters:
  • event_tables (collections.abc.Sequence[pandas.DataFrame | str | pathlib.Path] | None, optional) – Event metadata tables or CSV paths.

  • event_records (collections.abc.Sequence[collections.abc.Mapping[str, Any]] | None, optional) – Optional mapping records.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Optional extra columns to preserve.

Returns:

Deduplicated event table with public columns.

Return type:

pandas.DataFrame

spatial_vtk.io.master_lists.build_master_station_list(*, station_tables: Sequence[DataFrame | str | Path] | None = None, streams: Sequence[Any] | None = None, extra_columns: Sequence[str] | None = None) DataFrame

Build a deduplicated master station list.

Parameters:
  • station_tables (collections.abc.Sequence[pandas.DataFrame | str | pathlib.Path] | None, optional) – Station metadata tables or CSV paths.

  • streams (collections.abc.Sequence[Any] | None, optional) – Optional waveform streams whose trace metadata includes station fields.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Optional extra columns to preserve when present in station tables.

Returns:

Deduplicated station table with public columns.

Return type:

pandas.DataFrame

spatial_vtk.io.master_lists.main(argv: Sequence[str] | None = None) int

Run the master-list CLI wrapper.

Parameters:
  • argv (collections.abc.Sequence[str] | None, optional) – Optional command-line arguments for the master-list CLI. Defaults to None.

Returns:

Integer status code from the CLI run.

Return type:

int

spatial_vtk.io.master_lists.normalize_event_table(df: DataFrame, *, extra_columns: Sequence[str] | None = None) DataFrame

Normalize event metadata columns.

Parameters:
  • df (pandas.DataFrame) – Raw event table.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Extra columns to preserve.

Returns:

Table with public event columns.

Return type:

pandas.DataFrame

spatial_vtk.io.master_lists.normalize_station_table(df: DataFrame, *, extra_columns: Sequence[str] | None = None) DataFrame

Normalize station metadata columns.

Parameters:
  • df (pandas.DataFrame) – Raw station table.

  • extra_columns (collections.abc.Sequence[str] | None, optional) – Extra columns to preserve.

Returns:

Table with public station columns.

Return type:

pandas.DataFrame

spatial_vtk.io.master_lists.write_master_event_list(df: DataFrame, path: str | Path, *, overwrite: bool = True) Path

Write a master event list CSV.

Parameters:
  • df (pandas.DataFrame) – Master event list to write.

  • path (str | pathlib.Path) – Output CSV path.

  • overwrite (bool, optional) – Whether to replace an existing file. Defaults to True.

Returns:

Written CSV path.

Return type:

pathlib.Path

spatial_vtk.io.master_lists.write_master_station_list(df: DataFrame, path: str | Path, *, overwrite: bool = True) Path

Write a master station list CSV.

Parameters:
  • df (pandas.DataFrame) – Master station list to write.

  • path (str | pathlib.Path) – Output CSV path.

  • overwrite (bool, optional) – Whether to replace an existing file. Defaults to True.

Returns:

Written CSV path.

Return type:

pathlib.Path

Metric workflow input table normalization helpers.

Metric Inputs Overview

This module defines the public table contracts used before metric calculation: waveform inventories for observed/synthetic files and side-specific QC tables.

Metric Inputs Examples

Normalize a waveform inventory:

inventory = normalize_metric_waveform_inventory(raw_df, source="observed")

spatial_vtk.io.metric_inputs.comparison_qc_passed(obs_row: dict[str, Any] | None, syn_row: dict[str, Any] | None) bool

Return whether both observed and synthetic QC rows pass.

Parameters:
  • obs_row (dict[str, Any] | None) – Observed-side QC row, or None.

  • syn_row (dict[str, Any] | None) – Synthetic-side QC row, or None.

Returns:

True when both the observed and synthetic QC rows pass.

Return type:

bool

spatial_vtk.io.metric_inputs.metric_qc_lookup(qc_table: DataFrame | str | Path | None) dict[tuple[str, str, str, str, str, str, str], dict[str, Any]]

Build a lookup for side-specific metric QC decisions.

Parameters:
  • qc_table (pandas.DataFrame | str | pathlib.Path | None) – QC table or path. None returns an empty lookup.

Returns:

Normalized lookup keyed by source/event/station/component/group/metric/period.

Return type:

dict
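The seven-part key described above can be sketched with plain dictionaries. The column names in `KEY_FIELDS`, the `passed` field, and the rule that a missing row counts as a failure are all assumptions made for this illustration; the section does not document the library's actual column names or None handling.

```python
# Hypothetical column names for the seven key parts.
KEY_FIELDS = ("source", "event_id", "station", "component", "group", "metric", "period")


def sketch_qc_lookup(rows):
    """Index QC rows by a 7-part string key; None gives an empty lookup."""
    if rows is None:
        return {}
    return {tuple(str(row[field]) for field in KEY_FIELDS): dict(row) for row in rows}


def sketch_both_passed(obs_row, syn_row):
    """Combine observed-side and synthetic-side QC decisions."""
    # Assumption: a missing row (None) counts as not passed.
    return all(row is not None and bool(row.get("passed")) for row in (obs_row, syn_row))


base = {"event_id": "ev1", "station": "STA1", "component": "Z",
        "group": "amplitude", "metric": "pgv", "period": "all"}
lookup = sketch_qc_lookup([
    {**base, "source": "observed", "passed": True},
    {**base, "source": "synthetic", "passed": False},
])
obs = lookup.get(("observed", "ev1", "STA1", "Z", "amplitude", "pgv", "all"))
syn = lookup.get(("synthetic", "ev1", "STA1", "Z", "amplitude", "pgv", "all"))
both_ok = sketch_both_passed(obs, syn)
```

Because the synthetic row fails in this made-up data, the combined check is false even though the observed row passes.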

spatial_vtk.io.metric_inputs.metric_qc_passed(row: dict[str, Any] | Series | None) bool

Return whether a normalized QC row passed.

Parameters:
  • row (dict[str, Any] | pandas.Series | None) – Normalized QC row, or None.

Returns:

True when the QC row passed.

Return type:

bool

spatial_vtk.io.metric_inputs.normalize_metric_qc_table(table: DataFrame | str | Path, *, source: str | None = None) DataFrame

Normalize a side-specific metric QC table.

Parameters:
  • table (pandas.DataFrame | str | pathlib.Path) – Raw QC table or CSV/parquet path.

  • source (str | None, optional) – Optional source override.

Returns:

Normalized QC table.

Return type:

pandas.DataFrame

spatial_vtk.io.metric_inputs.normalize_metric_waveform_inventory(table: DataFrame | str | Path, *, source: str | None = None, synthetic_max_frequency_hz: float | None = None) DataFrame

Normalize an observed or synthetic waveform inventory.

Parameters:
  • table (pandas.DataFrame | str | pathlib.Path) – Raw inventory table or CSV/parquet path.

  • source (str | None, optional) – Optional source override, usually "observed" or "synthetic".

  • synthetic_max_frequency_hz (float | None, optional) – Optional synthetic max-frequency default for rows missing a value.

Returns:

Normalized inventory with public metric workflow columns.

Return type:

pandas.DataFrame
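The source override and the synthetic max-frequency default can be pictured with a small sketch. It assumes a hypothetical `max_frequency_hz` column and assumes the default is applied only to synthetic rows that lack a value; neither detail is confirmed by this section.

```python
def sketch_fill_max_frequency(rows, source=None, synthetic_max_frequency_hz=None):
    """Apply an optional source override and a default max frequency."""
    out = []
    for row in rows:
        new = dict(row)  # leave the caller's rows untouched
        if source is not None:
            new["source"] = source
        missing = new.get("max_frequency_hz") is None
        # Assumption: only synthetic rows without a value receive the default.
        if (new.get("source") == "synthetic" and missing
                and synthetic_max_frequency_hz is not None):
            new["max_frequency_hz"] = float(synthetic_max_frequency_hz)
        out.append(new)
    return out


rows = [
    {"source": "synthetic", "path": "syn/ev1.h5", "max_frequency_hz": None},
    {"source": "synthetic", "path": "syn/ev2.h5", "max_frequency_hz": 2.0},
    {"source": "observed", "path": "obs/ev1.mseed", "max_frequency_hz": None},
]
filled = sketch_fill_max_frequency(rows, synthetic_max_frequency_hz=1.0)
```

Rows that already carry a value keep it, and the input rows are not modified.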

Waveforms and Preprocessing

Waveform loading and trace-metadata helpers.

Waveforms Overview

This module provides the public waveform I/O surface used by Spatial-VTK workflows. It prefers ObsPy for real seismology formats and includes small NumPy fallbacks for tests, examples, and simple arrays.

Waveforms Examples

Load one file and inspect its trace metadata:

stream = read_waveform_file("event.mseed")
metadata = trace_metadata_table(stream, event_id="ci123")

class spatial_vtk.io.waveforms.PreprocessedWaveform(data: ndarray, dt: float, sampling_rate_hz: float, processing_label: str)

Preprocessed waveform samples and updated timing metadata.

Parameters:
  • data (numpy.ndarray) – One-dimensional preprocessed samples.

  • dt (float) – Updated sample interval in seconds.

  • sampling_rate_hz (float) – Updated sampling rate in Hz.

  • processing_label (str) – Human-readable description of the applied preprocessing.

Returns:

Immutable preprocessing result.

Return type:

PreprocessedWaveform

class spatial_vtk.io.waveforms.WaveformPreprocessing(lowpass_hz: float | None = None, highpass_hz: float | None = None, bandpass_low_hz: float | None = None, bandpass_high_hz: float | None = None, resample_hz: float | None = None, filter_order: int = 4)

Configured preprocessing applied before waveform QC, metrics, or figures.

Parameters:
  • lowpass_hz (float | None) – Optional lowpass cutoff in Hz. None means no configured lowpass.

  • highpass_hz (float | None) – Optional highpass cutoff in Hz. None means no configured highpass.

  • bandpass_low_hz (float | None) – Optional lower bandpass corner frequency in Hz. Set both bandpass values to apply a bandpass instead of separate highpass/lowpass steps.

  • bandpass_high_hz (float | None) – Optional upper bandpass corner frequency in Hz. Set both bandpass values to apply a bandpass instead of separate highpass/lowpass steps.

  • resample_hz (float | None) – Optional target sampling rate in Hz after filtering.

  • filter_order (int) – Butterworth filter order used for waveform filters.

Returns:

Immutable preprocessing settings.

Return type:

WaveformPreprocessing
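How these settings interact can be sketched with a frozen dataclass and a step planner. The class `SketchPreprocessing` mirrors the documented fields but is a stand-in, and `sketch_planned_steps` is hypothetical. The bandpass-over-highpass/lowpass rule and resampling after filtering follow the parameter descriptions above, while applying highpass before lowpass is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchPreprocessing:
    """Stand-in with the same fields as the documented settings class."""
    lowpass_hz: Optional[float] = None
    highpass_hz: Optional[float] = None
    bandpass_low_hz: Optional[float] = None
    bandpass_high_hz: Optional[float] = None
    resample_hz: Optional[float] = None
    filter_order: int = 4


def sketch_planned_steps(settings):
    """Describe, in order, the steps the settings would trigger."""
    steps = []
    if settings.bandpass_low_hz is not None and settings.bandpass_high_hz is not None:
        # Both corners set: one bandpass replaces separate filters.
        steps.append(f"bandpass {settings.bandpass_low_hz}-{settings.bandpass_high_hz} Hz")
    else:
        # Assumption: highpass is applied before lowpass.
        if settings.highpass_hz is not None:
            steps.append(f"highpass {settings.highpass_hz} Hz")
        if settings.lowpass_hz is not None:
            steps.append(f"lowpass {settings.lowpass_hz} Hz")
    if settings.resample_hz is not None:
        # Resampling happens after filtering.
        steps.append(f"resample {settings.resample_hz} Hz")
    return steps


lowpass_plan = sketch_planned_steps(SketchPreprocessing(lowpass_hz=1.0, resample_hz=20.0))
bandpass_plan = sketch_planned_steps(
    SketchPreprocessing(bandpass_low_hz=0.1, bandpass_high_hz=1.0, lowpass_hz=2.0)
)
```

In the second plan the bandpass takes precedence, so the separate lowpass setting is ignored.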

spatial_vtk.io.waveforms.apply_waveform_preprocessing(data: Any, dt: float, preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) ndarray

Apply configured waveform preprocessing to one sample array.

Parameters:
  • data (Any) – One-dimensional waveform samples.

  • dt (float) – Sample interval in seconds.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional config used only when preprocessing is omitted.

Returns:

Preprocessed waveform samples.

Return type:

numpy.ndarray

spatial_vtk.io.waveforms.apply_waveform_preprocessing_with_metadata(data: Any, dt: float, preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) PreprocessedWaveform

Apply configured preprocessing and return updated timing metadata.

Parameters:
  • data (Any) – One-dimensional waveform samples.

  • dt (float) – Sample interval in seconds.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional config used only when preprocessing is omitted.

Returns:

Filtered/resampled samples plus updated dt and sampling rate.

Return type:

PreprocessedWaveform
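The relationship between resampling and the updated timing metadata is simple arithmetic, sketched below. The helper `sketch_updated_timing` is hypothetical, and the rounded sample count is an assumption about how a resampler would size its output.

```python
def sketch_updated_timing(n_samples, dt, resample_hz=None):
    """Return (n_samples, dt, sampling_rate_hz) after optional resampling."""
    if resample_hz is None:
        # No resampling: timing is unchanged and the rate is 1 / dt.
        return n_samples, dt, 1.0 / dt
    duration = n_samples * dt
    new_dt = 1.0 / resample_hz
    # Assumption: the output length is the duration times the new rate, rounded.
    return int(round(duration * resample_hz)), new_dt, float(resample_hz)


timing = sketch_updated_timing(n_samples=6000, dt=0.01, resample_hz=20.0)
```

A 60 s trace sampled at 100 Hz and resampled to 20 Hz ends up with a 0.05 s sample interval.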

spatial_vtk.io.waveforms.build_arg_parser() ArgumentParser

Build the module-level CLI parser.

Returns:

Argument parser for the waveform metadata command-line interface.

Return type:

argparse.ArgumentParser

spatial_vtk.io.waveforms.load_waveform_collection(paths: Sequence[str | Path], format: str | None = None) Any

Load several waveform files and combine them when possible.

Parameters:
  • paths (collections.abc.Sequence) – File paths to load.

  • format (str | None, optional) – Optional ObsPy format forwarded to read_waveform_file().

Returns:

Combined ObsPy stream when all inputs are ObsPy streams, otherwise a flat list of trace-like objects.

Return type:

object

spatial_vtk.io.waveforms.main(argv: Sequence[str] | None = None) int

Run the waveform metadata CLI wrapper.

Parameters:
  • argv (collections.abc.Sequence[str] | None, optional) – Optional command-line arguments for the waveform metadata CLI. Defaults to None.

Returns:

Integer status code from the CLI run.

Return type:

int

spatial_vtk.io.waveforms.preprocess_stream(stream: Any, preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) Any

Apply waveform preprocessing to every trace in a stream-like object.

Parameters:
  • stream (Any) – ObsPy stream, iterable of traces, or one trace-like mapping/object.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Optional preprocessing settings. When omitted, the active config is used when available.

  • config (Any | None, optional) – Optional Spatial-VTK config used only when preprocessing is omitted.

Returns:

A copied stream or list of copied trace-like mappings with updated samples, sampling rate, and delta metadata.

Return type:

object

spatial_vtk.io.waveforms.read_waveform_file(path: str | Path, format: str | None = None) Any

Read one waveform file.

Parameters:
  • path (str | pathlib.Path) – Waveform path. ObsPy-supported formats are read with ObsPy when it is installed. .npz and .npy files are supported without ObsPy.

  • format (str | None, optional) – Optional format string forwarded to obspy.read.

Returns:

ObsPy Stream when ObsPy handles the file, a list of lightweight mapping traces for NumPy files, or the raw object returned by ObsPy for non-stream readers.

Return type:

object

spatial_vtk.io.waveforms.select_waveform_trace(stream_or_path: Any, *, station: str | None = None, component: str | None = None, prefer_channel_prefixes: Sequence[str] = ('BH', 'HN')) Any

Select one trace from a stream or waveform file.

Parameters:
  • stream_or_path (Any) – ObsPy stream, iterable of traces, one trace-like object, or a path accepted by read_waveform_file().

  • station (str | None, optional) – Optional station code to match.

  • component (str | None, optional) – Optional component suffix to match, such as "Z", "R", or "T".

  • prefer_channel_prefixes (collections.abc.Sequence, optional) – Channel-prefix ordering used when several traces match.

Returns:

Selected trace-like object.

Return type:

object
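The selection rule can be sketched over lightweight trace mappings. In this illustration each trace is a dictionary with hypothetical `station` and `channel` keys. Ranking channels without a preferred prefix last, and raising LookupError when nothing matches, are assumptions rather than documented behavior.

```python
def sketch_select_trace(traces, station=None, component=None,
                        prefer_channel_prefixes=("BH", "HN")):
    """Pick one trace mapping by station, component suffix, and prefix rank."""
    matches = [
        trace for trace in traces
        if (station is None or trace["station"] == station)
        and (component is None or trace["channel"].endswith(component))
    ]
    if not matches:
        raise LookupError("no trace matches the requested station/component")

    def rank(trace):
        for position, prefix in enumerate(prefer_channel_prefixes):
            if trace["channel"].startswith(prefix):
                return position
        # Assumption: channels with no preferred prefix sort last.
        return len(prefer_channel_prefixes)

    # min() keeps the first trace among equally ranked candidates.
    return min(matches, key=rank)


traces = [
    {"station": "STA1", "channel": "HNZ"},
    {"station": "STA1", "channel": "BHZ"},
    {"station": "STA1", "channel": "BHN"},
    {"station": "STA2", "channel": "HNZ"},
]
chosen = sketch_select_trace(traces, station="STA1", component="Z")
fallback = sketch_select_trace(traces, station="STA2", component="Z")
```

For STA1 the BH channel wins over HN because it appears first in prefer_channel_prefixes; STA2 only has an HN channel, so that one is returned.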

spatial_vtk.io.waveforms.stream_station_table(stream: Any) DataFrame

Build a deduplicated station table from stream trace metadata.

Parameters:
  • stream (Any) – Stream-like object.

Returns:

One row per network/station pair.

Return type:

pandas.DataFrame
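Reducing traces to one row per network/station pair can be sketched as follows. The trace mappings and the helper `sketch_station_rows` are hypothetical, and keeping the first trace seen for each pair is an assumption.

```python
def sketch_station_rows(traces):
    """Keep one row for each (network, station) pair, in first-seen order."""
    seen = {}
    for trace in traces:
        key = (trace.get("network", ""), trace["station"])
        # setdefault keeps the first entry and ignores later duplicates.
        seen.setdefault(key, {"network": key[0], "station": key[1]})
    return list(seen.values())


traces = [
    {"network": "XX", "station": "STA1", "channel": "BHZ"},
    {"network": "XX", "station": "STA1", "channel": "BHN"},
    {"network": "XX", "station": "STA2", "channel": "BHZ"},
    {"network": "YY", "station": "STA1", "channel": "HNZ"},
]
station_rows = sketch_station_rows(traces)
```

The same station code under two networks yields two rows, since the pair, not the code alone, is the key.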

spatial_vtk.io.waveforms.trace_metadata_table(stream: Any, *, source: str | Path | None = None, event_id: str | None = None) DataFrame

Extract normalized metadata from trace-like objects.

Parameters:
  • stream (Any) – ObsPy stream, iterable of trace-like objects, or one trace mapping.

  • source (str | pathlib.Path | None, optional) – Optional source label or path copied into the output.

  • event_id (str | None, optional) – Optional event identifier copied into the output.

Returns:

One row per trace with stable public metadata columns.

Return type:

pandas.DataFrame

spatial_vtk.io.waveforms.waveform_preprocessing_from_config(config: Any | None = None) WaveformPreprocessing

Read waveform preprocessing settings from a Spatial-VTK config.

Parameters:
  • config (Any | None, optional) – Optional SpatialVTKConfig. When omitted, the active or discovered config is used when available.

Returns:

Parsed lowpass cutoff and filter order.

Return type:

WaveformPreprocessing

spatial_vtk.io.waveforms.waveform_preprocessing_label(preprocessing: WaveformPreprocessing | None = None, *, config: Any | None = None) str

Return a human-readable label for configured waveform preprocessing.

Parameters:
  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional config used only when preprocessing is omitted.

Returns:

Label suitable for waveform figure subtitles.

Return type:

str

spatial_vtk.io.waveforms.write_trace_metadata_csv(stream: Any, path: str | Path, *, source: str | Path | None = None, event_id: str | None = None) Path

Write trace metadata to CSV.

Parameters:
  • stream (Any) – Stream-like object to inspect.

  • path (str | pathlib.Path) – Output CSV path.

  • source (str | pathlib.Path | None, optional) – Optional source label or path copied into every row.

  • event_id (str | None, optional) – Optional event identifier copied into every row.

Returns:

Written CSV path.

Return type:

pathlib.Path

File-based waveform preprocessing workflow.

Preprocessing Overview

This module makes waveform preprocessing a first-class Spatial-VTK workflow step. It filters and/or resamples observed and synthetic waveform files once, writes processed copies, and returns an updated event-station table that later QC, metric, and figure steps can consume directly.

Preprocessing Examples

Preprocess observed and synthetic paths listed in an event-station table:

preprocess_waveform_files("event_stations.csv", "outputs/preprocessed", config=cfg)

Use explicit preprocessing settings:

preprocess_waveform_files(records, "outputs/preprocessed", preprocessing=WaveformPreprocessing(lowpass_hz=1.0, resample_hz=20.0))

class spatial_vtk.io.preprocessing.WaveformPreprocessingWorkflowResult(event_station_records: DataFrame, event_station_path: Path, manifest: DataFrame, manifest_path: Path, trace_metadata: DataFrame, trace_metadata_path: Path)

Outputs written by preprocess_waveform_files().

Parameters:
  • event_station_records (pandas.DataFrame) – Updated event-station table.

  • event_station_path (pathlib.Path) – Path where the updated event-station table was written.

  • manifest (pandas.DataFrame) – One row per source waveform file.

  • manifest_path (pathlib.Path) – Path where the preprocessing manifest was written.

  • trace_metadata (pandas.DataFrame) – One row per processed trace.

  • trace_metadata_path (pathlib.Path) – Path where trace metadata was written.

Returns:

Immutable workflow result with dataframes and written paths.

Return type:

WaveformPreprocessingWorkflowResult

spatial_vtk.io.preprocessing.preprocess_waveform_files(event_station_records: DataFrame | str | Path, output_root: str | Path | None = None, *, source_columns: Mapping[str, str] | None = None, preprocessing: WaveformPreprocessing | None = None, config: Any | None = None, event_id_col: str = 'event_id', overwrite: bool = False, continue_on_error: bool = False, replace_input_columns: bool = True, drop_unprocessed_rows: bool = True, verbose: bool = False, event_station_name: str = 'event_station_records_preprocessed.csv', manifest_name: str = 'waveform_preprocessing_manifest.csv', trace_metadata_name: str = 'trace_metadata_preprocessed.csv') WaveformPreprocessingWorkflowResult

Preprocess waveform files and write reusable processed copies.

Parameters:
  • event_station_records (pandas.DataFrame | str | pathlib.Path) – DataFrame or CSV/Parquet path with event IDs and waveform paths.

  • output_root (str | pathlib.Path | None, optional) – Folder where processed waveforms and metadata tables will be written. When omitted, outputs.preprocessed_waveforms is read from config or the active Spatial-VTK config.

  • source_columns (Mapping[str, str] | None, optional) – Optional mapping such as {"observed": "observed_mseed"}. When omitted, common observed/synthetic waveform column names are detected.

  • preprocessing (spatial_vtk.io.waveforms.WaveformPreprocessing | None, optional) – Explicit preprocessing settings. When omitted, settings are read from config or from the active Spatial-VTK config.

  • config (Any | None, optional) – Optional Spatial-VTK config used only when preprocessing is omitted.

  • event_id_col (str, optional) – Column containing event IDs.

  • overwrite (bool, optional) – Replace existing processed waveform files when true.

  • continue_on_error (bool, optional) – Record failed files in the manifest and continue when true. The default is to raise a clear error so downstream steps do not use missing files.

  • replace_input_columns (bool, optional) – When true, the original waveform path columns are replaced with processed paths while raw paths are preserved in *_raw_waveform.

  • drop_unprocessed_rows (bool, optional) – When true, rows that did not resolve to any processed waveform path are removed from the returned/written event-station table. Disable this only when you need to audit the full input table, including unavailable waveform records.

  • verbose (bool, optional) – Print progress messages while resolving and preprocessing waveform files. This is useful in notebooks for long-running ASDF/MiniSEED preprocessing.

  • event_station_name (str, optional) – Filename for the updated event-station table, written under output_root/metadata.

  • manifest_name (str, optional) – Filename for the preprocessing manifest, written under output_root/metadata.

  • trace_metadata_name (str, optional) – Filename for the trace metadata table, written under output_root/metadata.

Returns:

Updated table, manifest, trace metadata, and their written paths.

Return type:

spatial_vtk.io.preprocessing.WaveformPreprocessingWorkflowResult

Tables and Artifacts

Generic table loading and reshaping helpers.

spatial_vtk.io.tables.aggregate_metric_by_station_over_events(df: DataFrame, *, metric_col: str, model_col: str = 'model', station_col: str = 'station', latitude_col: str = 'sta_lat', longitude_col: str = 'sta_lon', event_col: str = 'event_id') DataFrame

Average a metric by station after first averaging within each event.

Parameters:
  • df (pandas.DataFrame) – Metric table.

  • metric_col (str) – Numeric column to aggregate.

  • model_col (str, optional) – Column identifying the model.

  • station_col (str, optional) – Column identifying the station.

  • latitude_col (str, optional) – Column containing the station latitude.

  • longitude_col (str, optional) – Column containing the station longitude.

  • event_col (str, optional) – Column identifying the event.

Returns:

Station-level metric table with n_events when event information is available.

Return type:

pandas.DataFrame
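
The two-stage averaging described above can be sketched without pandas to show the arithmetic. This is a minimal illustration, not the library implementation: the helper name station_mean_over_events and the row layout are assumptions, and coordinate columns are left out. Averaging within each event first keeps an event with many rows from dominating the station mean.

```python
from collections import defaultdict
from statistics import fmean


def station_mean_over_events(rows, metric_col="residual"):
    """Average within each (model, station, event), then across events.

    Hypothetical helper for illustration only.
    """
    per_event = defaultdict(list)
    for row in rows:
        key = (row["model"], row["station"], row["event_id"])
        per_event[key].append(float(row[metric_col]))

    # Stage 2: collect one mean per event for every (model, station) pair.
    per_station = defaultdict(list)
    for (model, station, _event), values in per_event.items():
        per_station[(model, station)].append(fmean(values))

    return {
        key: {"mean": fmean(event_means), "n_events": len(event_means)}
        for key, event_means in per_station.items()
    }


rows = [
    {"model": "m1", "station": "STA1", "event_id": "ev1", "residual": 1.0},
    {"model": "m1", "station": "STA1", "event_id": "ev1", "residual": 3.0},
    {"model": "m1", "station": "STA1", "event_id": "ev1", "residual": 5.0},
    {"model": "m1", "station": "STA1", "event_id": "ev2", "residual": 9.0},
]
summary = station_mean_over_events(rows)
```

In this example the event means are 3.0 and 9.0, so the station mean is 6.0, whereas pooling all four rows would give 4.5.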

spatial_vtk.io.tables.load_csv_bundle(sources: str | Path | Sequence[str | Path] | dict[str, Any], *, base_dir: str | Path | None = None, source_column: str = '__source_csv__') DataFrame

Load one or more CSV files into a single table.

Parameters:
  • sources (str | pathlib.Path | collections.abc.Sequence[str | pathlib.Path] | dict[str, Any]) – CSV path, glob pattern, sequence of paths, or config dictionary with one of input_csv, csv_dir, or csv_files.

  • base_dir (str | pathlib.Path | None, optional) – Directory used to resolve relative paths.

  • source_column (str, optional) – Name of the column recording the source CSV path.

Returns:

Concatenated CSV table.

Return type:

pandas.DataFrame
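
The idea of concatenating several CSV files while recording where each row came from can be sketched with the standard library. This is an assumption-labeled illustration: load_csv_rows is a hypothetical helper that returns plain dictionaries rather than a DataFrame, and it does not handle the config-dictionary form of sources.

```python
import csv
import tempfile
from pathlib import Path


def load_csv_rows(paths, source_column="__source_csv__"):
    """Read every CSV in paths and tag each row with its source file."""
    rows = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row[source_column] = str(path)
                rows.append(row)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.csv").write_text("station,value\nSTA1,1\n", encoding="utf-8")
    (root / "b.csv").write_text("station,value\nSTA2,2\n", encoding="utf-8")
    # Sorting the glob result keeps the row order reproducible.
    bundle = load_csv_rows(sorted(root.glob("*.csv")))
```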

spatial_vtk.io.tables.load_output_table(key: str, *, cfg: SpatialVTKConfig | None = None, **kwargs: Any) DataFrame

Load a standard output table by artifact key.

Parameters:
  • key (str) – Registered output key such as "prepared_stations".

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • **kwargs – Additional read options forwarded to read_table.

Returns:

Loaded table.

Return type:

pandas.DataFrame

spatial_vtk.io.tables.normalize_metric_table(df: DataFrame) DataFrame

Normalize common legacy metric-table column names.

Parameters:
  • df (pandas.DataFrame) – Metric table with either public or legacy column names.

Returns:

Copy of the table with public column names where possible.

Return type:

pandas.DataFrame

spatial_vtk.io.tables.read_config_table(dotted_key: str, *, cfg: SpatialVTKConfig | None = None, must_exist: bool = True, **kwargs: Any) DataFrame

Read a table path from the active config.

Parameters:
  • dotted_key (str) – Config key that points to a table, such as "paths.station_metadata".

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • must_exist (bool, optional) – Whether to raise an error if the configured path is missing.

  • **kwargs – Additional read options forwarded to read_table.

Returns:

Loaded table.

Return type:

pandas.DataFrame

spatial_vtk.io.tables.read_table(path: str | Path, **kwargs: Any) DataFrame

Read one CSV or Parquet table from disk.

Parameters:
  • path (str | pathlib.Path) – Input table path ending in .csv, .parquet, or .pq.

  • **kwargs – Additional keyword arguments forwarded to pandas.read_csv for CSV files or pandas.read_parquet for Parquet files.

Returns:

Loaded table.

Return type:

pandas.DataFrame

spatial_vtk.io.tables.wide_to_long_metrics(df: DataFrame, residual_mode: str = 'logratio', *, context_cols: Sequence[str] | None = None) DataFrame

Convert wide *_obs/*_syn metric columns into tidy long form.

Parameters:
  • df (pandas.DataFrame) – Metric table with observed and synthetic metric columns.

  • residual_mode (str, optional) – "logratio" for log10(synthetic / observed) or "diff" for synthetic - observed.

  • context_cols (collections.abc.Sequence[str] | None, optional) – Metadata columns to preserve in each long-form row.

Returns:

Long-form metric table with metric, value_obs, value_syn, residual, and optional score columns.

Return type:

pandas.DataFrame
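
The two residual modes can be illustrated on a single wide row. This is a minimal sketch in plain Python, assuming a hypothetical helper named wide_row_to_long and example column names; the library function operates on a DataFrame and may add score columns that are not shown here.

```python
import math


def wide_row_to_long(row, residual_mode="logratio", context_cols=("event_id", "station")):
    """Split paired <metric>_obs / <metric>_syn columns into long-form records."""
    context = {col: row[col] for col in context_cols if col in row}
    records = []
    for col, obs in row.items():
        if not col.endswith("_obs"):
            continue
        metric = col[: -len("_obs")]
        syn_col = f"{metric}_syn"
        if syn_col not in row:
            continue
        syn = row[syn_col]
        if residual_mode == "logratio":
            residual = math.log10(syn / obs)  # log10(synthetic / observed)
        elif residual_mode == "diff":
            residual = syn - obs  # synthetic - observed
        else:
            raise ValueError(f"Unknown residual_mode: {residual_mode!r}")
        records.append(
            {**context, "metric": metric, "value_obs": obs, "value_syn": syn, "residual": residual}
        )
    return records


wide = {"event_id": "ev1", "station": "STA1", "pga_obs": 2.0, "pga_syn": 20.0}
log_rows = wide_row_to_long(wide, residual_mode="logratio")
diff_rows = wide_row_to_long(wide, residual_mode="diff")
```

With a synthetic value ten times the observed one, the log-ratio residual is 1 and the difference residual is 18.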

spatial_vtk.io.tables.write_named_tables(tables: dict[str, DataFrame], paths: SimpleNamespace | dict[str, str | Path], *, index: bool = False) dict[str, Path]

Write a set of named tables to matching named paths.

Parameters:
  • tables (dict) – Mapping from logical table name to dataframe.

  • paths (types.SimpleNamespace | dict[str, str | pathlib.Path]) – Namespace or mapping with one path per table name.

  • index (bool, optional) – Whether to include dataframe indexes.

Returns:

Written paths keyed by table name.

Return type:

dict[str, pathlib.Path]

spatial_vtk.io.tables.write_output_table(key: str, df: DataFrame, *, outpath: str | Path | None = None, cfg: SpatialVTKConfig | None = None, index: bool = False) Path

Write a standard table using the output registry and config.

Parameters:
  • key (str) – Registered table key such as "prepared_stations".

  • df (pandas.DataFrame) – Table to write.

  • outpath (str | pathlib.Path | None, optional) – Optional explicit output path. This always wins.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • index (bool, optional) – Whether to include the dataframe index.

Returns:

Written table path.

Return type:

pathlib.Path

spatial_vtk.io.tables.write_output_tables(tables: dict[str, DataFrame] | None = None, *, cfg: SpatialVTKConfig | None = None, index: bool = False, **named_tables: DataFrame) dict[str, Path]

Write one or more standard output tables by artifact key.

Parameters:
  • tables (dict[str, pandas.DataFrame] | None, optional) – Optional mapping from registered output keys to dataframes.

  • cfg (spatial_vtk.config.runtime.SpatialVTKConfig | None, optional) – Optional config object. When omitted, the active/discoverable config is used.

  • index (bool, optional) – Whether to include dataframe indexes.

  • **named_tables – Additional registered output keys passed as keyword arguments.

Returns:

Written table paths keyed by artifact name.

Return type:

dict[str, pathlib.Path]

spatial_vtk.io.tables.write_table(df: DataFrame, path: str | Path, *, index: bool = False) Path

Write one table based on the destination file extension.

Parameters:
  • df (pandas.DataFrame) – Table to write.

  • path (str | pathlib.Path) – Output path ending in .csv or .parquet. Paths without an extension are written as CSV.

  • index (bool, optional) – Whether to include the dataframe index.

Returns:

Written path.

Return type:

pathlib.Path
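
The extension rule described for path can be sketched as a small dispatch function. The helper name table_format is hypothetical, and both the case-insensitive matching and the error raised for other extensions are assumptions rather than documented behavior.

```python
from pathlib import Path


def table_format(path):
    """Pick an output format from the file extension (illustrative only)."""
    suffix = Path(path).suffix.lower()
    if suffix in ("", ".csv"):
        # Paths without an extension fall back to CSV.
        return "csv"
    if suffix == ".parquet":
        return "parquet"
    raise ValueError(f"Unsupported table extension: {suffix!r}")


formats = [table_format(p) for p in ("out/metrics", "out/metrics.csv", "out/metrics.parquet")]
```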

spatial_vtk.io.tables.written_files_table(written: dict[str, str | Path], *, descriptions: dict[str, str] | None = None, relative_to: str | Path | None = None) DataFrame

Build a readable table of written files.

Parameters:
  • written (dict) – Mapping from logical output name to written file path.

  • descriptions (dict[str, str] | None, optional) – Optional mapping from logical output name to display description.

  • relative_to (str | pathlib.Path | None, optional) – Optional root used to display relative paths.

Returns:

Two-column manifest with File and Description.

Return type:

pandas.DataFrame

Reusable output-path namespace helpers.

Output Paths Overview

This module gives notebooks, scripts, and CLI wrappers a small common way to name output files without repeatedly spelling out filenames in each workflow.

Output Paths Examples

Create explicit CSV paths:

tables = default_output_paths(output_root, ["prepared_stations", "prepared_events"])
stations.to_csv(tables.prepared_stations, index=False)

spatial_vtk.io.output_paths.default_output_paths(output_dir: str | Path, names: Iterable[str], *, suffix: str = '.csv', create_dir: bool = True) SimpleNamespace

Return a namespace of standard output paths.

Parameters:
  • output_dir (str | pathlib.Path) – Directory where output files should be written.

  • names (Iterable) – Basenames without extension, such as "qc_inventory".

  • suffix (str, optional) – File extension to append when a name has no extension.

  • create_dir (bool, optional) – Whether to create output_dir.

Returns:

Namespace with one attribute per normalized name.

Return type:

types.SimpleNamespace
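
How a namespace of output paths can be built is sketched below. The helper name output_paths is hypothetical, and the normalization rule (hyphens and spaces in the stem become underscores) is an assumption; the library may normalize names differently.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def output_paths(output_dir, names, suffix=".csv", create_dir=True):
    """Return one Path attribute per name, appending suffix when missing."""
    root = Path(output_dir)
    if create_dir:
        root.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name in names:
        filename = name if Path(name).suffix else f"{name}{suffix}"
        # Assumed normalization so the stem is a valid attribute name.
        attr = Path(filename).stem.replace("-", "_").replace(" ", "_")
        paths[attr] = root / filename
    return SimpleNamespace(**paths)


with tempfile.TemporaryDirectory() as tmp:
    tables = output_paths(Path(tmp) / "metadata", ["prepared_stations", "qc-inventory.parquet"])
    stations_name = tables.prepared_stations.name
    inventory_name = tables.qc_inventory.name
    directory_created = tables.prepared_stations.parent.is_dir()
```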

Deterministic output artifact planning helpers.

Artifacts Overview

This module creates stable file paths and JSON manifests for public Spatial-VTK outputs such as figures, metrics, dashboard tables, and spatial statistics.

class spatial_vtk.io.artifacts.ArtifactRecord(artifact_path: str, kind: str, name: str, status: str = 'planned', artifact_hash: str = '', metadata: Mapping[str, Any] | None = None, recorded_at: str = '')

Record one workflow artifact in a public registry.

Parameters:
  • artifact_path (str) – Artifact path on disk.

  • kind (str) – Broad artifact group such as "metrics" or "qc".

  • name (str) – Human-readable artifact name.

  • status (str) – Status label such as "planned", "written", or "missing".

  • artifact_hash (str) – Optional deterministic hash from an artifact spec.

  • metadata (Mapping[str, Any] | None) – Optional user-provided metadata.

  • recorded_at (str) – UTC ISO timestamp.

Returns:

Immutable registry entry.

Return type:

ArtifactRecord

class spatial_vtk.io.artifacts.ArtifactRegistry(registry_path: str | Path)

Append-only JSON-lines artifact registry.

Parameters:

registry_path – Path to the registry file.

Returns:

Registry object used to append and inspect artifact records.

Return type:

ArtifactRegistry

missing() list[ArtifactRecord]

Return registry records whose paths do not exist.

Parameters:

None

Returns:

Records for missing files.

Return type:

list of ArtifactRecord

record(artifact_path: str | Path, *, kind: str, name: str, status: str = 'written', spec: ArtifactSpec | None = None, metadata: Mapping[str, Any] | None = None) ArtifactRecord

Append one artifact record.

Parameters:
  • artifact_path (str | pathlib.Path) – Output path being recorded.

  • kind (str) – Broad artifact group such as "metrics" or "qc".

  • name (str) – Human-readable artifact name.

  • status (str, optional) – Status label such as "planned", "written", or "missing".

  • spec (spatial_vtk.io.artifacts.ArtifactSpec | None, optional) – Optional artifact spec used to compute a stable hash.

  • metadata (Mapping[str, Any] | None, optional) – Optional metadata copied into the registry.

Returns:

Appended record.

Return type:

ArtifactRecord

records() list[ArtifactRecord]

Read all registry records.

Parameters:

None

Returns:

Registry records in file order.

Return type:

list of ArtifactRecord

to_frame()

Return records as a pandas DataFrame.

Parameters:

None

Returns:

Registry table.

Return type:

pandas.DataFrame
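
The append-only JSON-lines pattern behind the registry can be sketched with the standard library. The functions below are hypothetical stand-ins for record(), records(), and missing(), and the exact fields written by the library are not confirmed by this page; only the one-JSON-object-per-line layout is taken from the class description.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_record(registry_path, artifact_path, kind, name, status="written"):
    """Append one JSON object as a single line; existing lines are never rewritten."""
    record = {
        "artifact_path": str(artifact_path),
        "kind": kind,
        "name": name,
        "status": status,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    with open(registry_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_records(registry_path):
    """Return all records in file order, or an empty list if no registry exists."""
    path = Path(registry_path)
    if not path.exists():
        return []
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def missing_records(registry_path):
    """Return records whose artifact path does not exist on disk."""
    return [r for r in read_records(registry_path) if not Path(r["artifact_path"]).exists()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    registry = root / "artifacts.jsonl"
    present = root / "metrics.csv"
    present.write_text("metric,value\n", encoding="utf-8")
    append_record(registry, present, kind="metrics", name="Metric table")
    append_record(registry, root / "qc.csv", kind="qc", name="QC inventory")
    all_names = [r["name"] for r in read_records(registry)]
    missing_names = [r["name"] for r in missing_records(registry)]
```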

class spatial_vtk.io.artifacts.ArtifactSpec(kind: str, name: str, scope: Mapping[str, Any] | None = None, config: Mapping[str, Any] | None = None, extension: str = '.json', subdir: str | None = None)

Describe one planned output artifact.

Parameters:
  • kind (str) – Broad artifact type such as "figure", "metrics", or "dashboard".

  • name (str) – Human-readable artifact name.

  • scope (Mapping[str, Any] | None) – Stable row/plot identity such as metric, event, station, or model.

  • config (Mapping[str, Any] | None) – Compute-relevant configuration.

  • extension (str) – Output filename extension, including the leading dot.

  • subdir (str | None) – Optional subdirectory under the artifact root.

Returns:

Immutable artifact planning record.

Return type:

ArtifactSpec

payload() dict[str, Any]

Return the deterministic payload used for hashing.

Returns:

Deterministic payload mapping used for hashing.

Return type:

dict

spatial_vtk.io.artifacts.artifact_manifest_path(artifact_path: str | Path) Path

Return the JSON manifest sidecar path for one artifact path.

Parameters:

artifact_path (str | pathlib.Path) – Artifact path whose manifest sidecar path is requested.

Returns:

JSON manifest sidecar path.

Return type:

pathlib.Path

spatial_vtk.io.artifacts.artifact_path_for_spec(root: str | Path, spec: ArtifactSpec, *, hash_length: int = 10) Path

Return the deterministic output path for one artifact spec.

Parameters:
  • root (str | pathlib.Path) – Artifact root directory.

  • spec (spatial_vtk.io.artifacts.ArtifactSpec) – Artifact planning record.

  • hash_length (int, optional) – Hash prefix length in the filename.

Returns:

Planned artifact path.

Return type:

pathlib.Path

spatial_vtk.io.artifacts.canonical_json(value: Mapping[str, Any]) str

Serialize one mapping to stable JSON.

Parameters:
  • value (Mapping) – Mapping to serialize.

Returns:

Deterministic compact JSON string.

Return type:

str

spatial_vtk.io.artifacts.read_artifact_manifest(manifest_path: str | Path) dict[str, Any]

Read a JSON artifact manifest.

Parameters:
  • manifest_path (str | pathlib.Path) – Manifest JSON path.

Returns:

Parsed manifest payload.

Return type:

dict

spatial_vtk.io.artifacts.slugify(value: object, *, max_length: int = 96) str

Return a filesystem-safe lowercase token.

Parameters:
  • value (object) – Raw label.

  • max_length (int, optional) – Maximum returned string length.

Returns:

Safe filename token.

Return type:

str
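
One common way to produce a filesystem-safe lowercase token is sketched below. The helper name to_slug, the choice of a hyphen as separator, and the ASCII-only character set are all assumptions; the library's slugify may use different rules.

```python
import re


def to_slug(value, max_length=96):
    """Lowercase the label and collapse every non-alphanumeric run to one hyphen."""
    token = re.sub(r"[^a-z0-9]+", "-", str(value).lower()).strip("-")
    # Truncate, then drop a trailing hyphen left behind by the cut.
    return token[:max_length].rstrip("-")


label_slug = to_slug("PGA Residual (0.5-2 s)")
short_slug = to_slug("a b c", max_length=2)
```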

spatial_vtk.io.artifacts.stable_hash(value: Mapping[str, Any], *, length: int | None = None) str

Return a stable SHA256 hash for one mapping.

Parameters:
  • value (Mapping) – Mapping to hash.

  • length (int | None, optional) – Optional prefix length.

Returns:

Full hash or hash prefix.

Return type:

str
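
Stable hashing depends on serializing the mapping the same way every time. The sketch below shows one way to do that with json and hashlib; the helper names canonical and sha256_of are hypothetical, and the exact separators and encoding used by canonical_json and stable_hash are assumptions.

```python
import hashlib
import json


def canonical(value):
    """Serialize with sorted keys and no extra whitespace so key order cannot matter."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


def sha256_of(value, length=None):
    """Hash the canonical JSON text and optionally keep only a prefix."""
    digest = hashlib.sha256(canonical(value).encode("utf-8")).hexdigest()
    return digest if length is None else digest[:length]


first = {"metric": "pga", "event": "ev1"}
second = {"event": "ev1", "metric": "pga"}  # same content, different insertion order
canonical_text = canonical(first)
same_hash = sha256_of(first) == sha256_of(second)
prefix = sha256_of(first, length=10)
```

Because both dictionaries serialize to the same text, they produce the same hash, which is what makes a hash prefix usable in a deterministic filename.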

spatial_vtk.io.artifacts.write_artifact_manifest(artifact_path: str | Path, spec: ArtifactSpec, *, extra: Mapping[str, Any] | None = None) Path

Write a JSON manifest next to a planned artifact.

Parameters:
  • artifact_path (str | pathlib.Path) – Output artifact path.

  • spec (spatial_vtk.io.artifacts.ArtifactSpec) – Artifact planning record.

  • extra (Mapping[str, Any] | None, optional) – Optional additional manifest fields.

Returns:

Written manifest path.

Return type:

pathlib.Path

Shared helpers for chunked compute workflows.

Compute Manifest Overview

This module provides small, dependency-light utilities for commands that split large work into resumable chunks. It handles JSON manifests, atomic CSV writes, run-directory creation, and SLURM array script generation.

Compute Manifest Examples

Create a run directory:

run_dir = ensure_run_dir(output_root, "metrics", "abcdef")

Write a SLURM array script:

write_slurm_array_script(path, chunks_path="chunks.json", worker_command="python worker.py", num_chunks=8, max_concurrent=4)

spatial_vtk.io.compute_manifest.atomic_write_csv(df: DataFrame, path: str | PathLike[str]) Path

Atomically write a dataframe to CSV.

Parameters:
  • df (pandas.DataFrame) – Dataframe to write.

  • path (str | os.PathLike[str]) – Destination CSV path.

Returns:

Written CSV path.

Return type:

pathlib.Path
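
A typical way to make a file write atomic is to write to a temporary file in the destination directory and then rename it over the target. The sketch below uses the csv module instead of a DataFrame so it stays dependency-free; atomic_write_rows is a hypothetical helper, and whether the library uses this exact temp-file strategy is an assumption.

```python
import csv
import os
import tempfile
from pathlib import Path


def atomic_write_rows(rows, fieldnames, path):
    """Write rows to a temp file beside path, then swap it into place."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        os.replace(tmp_name, path)  # readers see the old file or the new one, never a partial file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path


with tempfile.TemporaryDirectory() as tmp:
    out = atomic_write_rows(
        [{"station": "STA1", "value": 1.5}],
        ["station", "value"],
        Path(tmp) / "run" / "metrics.csv",
    )
    written_lines = out.read_text(encoding="utf-8").splitlines()
    leftovers = [p.name for p in out.parent.iterdir() if p.name != "metrics.csv"]
```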

spatial_vtk.io.compute_manifest.ensure_run_dir(output_root: str | PathLike[str], workflow: str, config_hash: str, *, run_id: str | None = None, work_root: str | PathLike[str] | None = None) Path

Create and return a compute-workflow run directory.

Parameters:
  • output_root (str | os.PathLike[str]) – Metrics or figure output root used when work_root is not supplied.

  • workflow (str) – Workflow name, for example "metrics_export".

  • config_hash (str) – Hash identifying the effective run configuration.

  • run_id (str | None, optional) – Optional human-readable run identifier.

  • work_root (str | os.PathLike[str] | None, optional) – Optional explicit parent work directory.

Returns:

Created run directory.

Return type:

pathlib.Path

spatial_vtk.io.compute_manifest.read_json(path: str | PathLike[str]) Any

Read one JSON file.

Parameters:
  • path (str | os.PathLike[str]) – JSON path.

Returns:

Decoded JSON payload.

Return type:

object

spatial_vtk.io.compute_manifest.utc_run_id() str

Return a compact UTC timestamp suitable for run-directory names.

Parameters:

None

Returns:

Timestamp such as 20260512T024500Z.

Return type:

str
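
The timestamp layout shown above matches the strftime pattern %Y%m%dT%H%M%SZ applied to a UTC datetime. The helper run_id_from is a hypothetical name used for illustration, and it takes the moment as an argument so the result is reproducible.

```python
from datetime import datetime, timezone


def run_id_from(moment):
    """Format an aware datetime as a compact UTC run identifier."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


example_id = run_id_from(datetime(2026, 5, 12, 2, 45, 0, tzinfo=timezone.utc))
current_id = run_id_from(datetime.now(timezone.utc))
```

Identifiers in this layout sort lexicographically in chronological order, which is convenient for run-directory listings.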

spatial_vtk.io.compute_manifest.write_json(path: str | PathLike[str], payload: Mapping[str, Any] | list[Any]) Path

Atomically write one JSON payload.

Parameters:
  • path (str | os.PathLike[str]) – Destination JSON path.

  • payload (Mapping[str, Any] | list[Any]) – JSON-serializable mapping or list.

Returns:

Written path.

Return type:

pathlib.Path

spatial_vtk.io.compute_manifest.write_slurm_array_script(path: str | PathLike[str], *, chunks_path: str | PathLike[str], worker_command: str, num_chunks: int, max_concurrent: int, job_name: str = 'svtk-metrics-export', time_limit: str = '12:00:00', cpus_per_task: int = 1, memory: str = '8G') Path

Write a conservative SLURM array script for chunk workers.

Parameters:
  • path (str | os.PathLike[str]) – Script destination path.

  • chunks_path (str | os.PathLike[str]) – JSON file containing chunk specs.

  • worker_command (str) – Command prefix that accepts --chunks-json and --chunk-index.

  • num_chunks (int) – Number of chunks in the array.

  • max_concurrent (int) – Maximum concurrent array tasks.

  • job_name (str, optional) – SLURM job name.

  • time_limit (str, optional) – SLURM wall-time limit.

  • cpus_per_task (int, optional) – CPUs requested per array task.

  • memory (str, optional) – SLURM memory request.

Returns:

Written script path.

Return type:

pathlib.Path
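
The general shape of a SLURM array script for chunk workers is sketched below. The #SBATCH directives are standard SLURM options and the --chunks-json and --chunk-index flags come from the worker_command description above, but the helper name slurm_array_text, the zero-based array range, and the exact script layout are assumptions, not the text the library writes.

```python
import shlex


def slurm_array_text(chunks_path, worker_command, num_chunks, max_concurrent,
                     job_name="svtk-metrics-export", time_limit="12:00:00",
                     cpus_per_task=1, memory="8G"):
    """Build the text of a job-array script with one task per chunk."""
    if num_chunks < 1 or max_concurrent < 1:
        raise ValueError("num_chunks and max_concurrent must be positive")
    worker_line = (
        f"{worker_command} --chunks-json {shlex.quote(str(chunks_path))} "
        '--chunk-index "${SLURM_ARRAY_TASK_ID}"'
    )
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        # The %N suffix caps how many array tasks run at the same time.
        f"#SBATCH --array=0-{num_chunks - 1}%{max_concurrent}",
        f"#SBATCH --time={time_limit}",
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        f"#SBATCH --mem={memory}",
        "set -euo pipefail",
        worker_line,
    ]
    return "\n".join(lines) + "\n"


script = slurm_array_text("chunks.json", "python worker.py", num_chunks=8, max_concurrent=4)
```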

Workflow plan objects read from public Spatial-VTK configuration.

class spatial_vtk.io.plans.MetricCompleteness(expected: int, present: int, missing: int, key_columns: tuple[str, ...])

Summary of expected, present, and missing metric rows.

class spatial_vtk.io.plans.MetricPlan(metrics: tuple[str, ...], passbands: tuple[tuple[float, float], ...], components: tuple[str, ...], models: tuple[str, ...], metric_groups: tuple[str, ...] = (), transforms: tuple[str, ...] = (), spectral_periods_s: tuple[float, ...] = (), output_mode: str = 'full', synthetic_max_frequency_hz: float | None = None, waveform_lowpass_hz: float | None = None, waveform_resample_hz: float | None = None, waveform_filter_order: int | None = None, output_path: Path | None = None)

Resolved metric-calculation plan.

Parameters:
  • metrics (tuple[str, ...]) – Metric names/codes to calculate.

  • passbands (tuple[tuple[float, float], ...]) – Period passbands as (period_min_s, period_max_s) pairs.

  • components (tuple[str, ...]) – Waveform components to process.

  • models (tuple[str, ...]) – Synthetic model aliases or names.

  • output_path (pathlib.Path | None) – Optional output path for the metric table.

Returns:

Immutable metric-calculation plan.

Return type:

MetricPlan

property transform_columns: tuple[str, ...]

Return requested transform output columns.

spatial_vtk.io.plans.build_arg_parser() ArgumentParser

Build the module-level metric-plan CLI parser.

Returns:

Configured metric-plan argument parser.

Return type:

argparse.ArgumentParser

spatial_vtk.io.plans.compare_metric_plan_to_table(expected_df: DataFrame, metrics_df: DataFrame, *, key_columns: Sequence[str] = ('event_id', 'station', 'component', 'model', 'passband', 'metric_group', 'metric', 'period_s')) tuple[DataFrame, MetricCompleteness]

Compare expected metric rows to an existing metrics table.

Parameters:
  • expected_df (pandas.DataFrame) – Expected key table.

  • metrics_df (pandas.DataFrame) – Existing metrics table.

  • key_columns (Sequence[str], optional) – Columns used for comparison.

Returns:

Missing-row table and completeness summary.

Return type:

tuple
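
The comparison amounts to an anti-join on the key columns: keep every expected row whose key does not appear in the metrics table. A minimal sketch with plain dictionaries follows; missing_metric_rows is a hypothetical helper, and the summary dictionary only mirrors the MetricCompleteness fields by name.

```python
def missing_metric_rows(expected_rows, metric_rows, key_columns):
    """Return expected rows absent from metric_rows plus a completeness summary."""
    def key(row):
        return tuple(row.get(col) for col in key_columns)

    present_keys = {key(row) for row in metric_rows}
    missing = [row for row in expected_rows if key(row) not in present_keys]
    summary = {
        "expected": len(expected_rows),
        "present": len(expected_rows) - len(missing),
        "missing": len(missing),
        "key_columns": tuple(key_columns),
    }
    return missing, summary


keys = ("event_id", "station", "metric")
expected = [
    {"event_id": "ev1", "station": "STA1", "metric": "pga"},
    {"event_id": "ev1", "station": "STA1", "metric": "pgv"},
    {"event_id": "ev1", "station": "STA2", "metric": "pga"},
]
existing = [
    {"event_id": "ev1", "station": "STA1", "metric": "pga", "value": 0.12},
    {"event_id": "ev1", "station": "STA2", "metric": "pga", "value": 0.08},
]
missing, completeness = missing_metric_rows(expected, existing, keys)
```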

spatial_vtk.io.plans.expected_metric_rows_from_inventory(inventory_df: DataFrame, plan: MetricPlan, *, model_column: str = 'model') DataFrame

Build expected metric row keys from an inventory table and plan.

Parameters:
  • inventory_df (pandas.DataFrame) – QC inventory with event_id, station, and component fields.

  • plan (spatial_vtk.io.plans.MetricPlan) – Resolved metric plan.

  • model_column (str, optional) – Output column used for model identity.

Returns:

Expected row-key table.

Return type:

pandas.DataFrame
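
Expected row keys can be thought of as the Cartesian product of inventory records with the plan's models, passbands, and metrics. The sketch below assumes that interpretation; expected_keys is a hypothetical helper, the passband label format is invented for illustration, and metric groups and spectral periods are ignored.

```python
from itertools import product


def expected_keys(inventory, metrics, passbands, models, model_column="model"):
    """Expand each inventory record across every model, passband, and metric."""
    rows = []
    for record, model, passband, metric in product(inventory, models, passbands, metrics):
        rows.append(
            {
                "event_id": record["event_id"],
                "station": record["station"],
                "component": record["component"],
                model_column: model,
                # Assumed label format: "<period_min>-<period_max>s".
                "passband": f"{passband[0]:g}-{passband[1]:g}s",
                "metric": metric,
            }
        )
    return rows


inventory = [
    {"event_id": "ev1", "station": "STA1", "component": "Z"},
    {"event_id": "ev1", "station": "STA2", "component": "Z"},
]
keys_table = expected_keys(
    inventory,
    metrics=("pga", "pgv"),
    passbands=((1.0, 10.0),),
    models=("model_a", "model_b"),
)
```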

spatial_vtk.io.plans.main(argv: Sequence[str] | None = None) int

Run the metric completeness CLI wrapper.

Parameters:

argv (Sequence[str] | None, optional) – Command-line arguments to parse. Defaults to None.

Returns:

Integer exit status.

Return type:

int

spatial_vtk.io.plans.metric_plan_from_config(config: SpatialVTKConfig, *, command: str = 'metrics.calculate', overrides: dict[str, Any] | None = None) MetricPlan

Build a metric plan from public config sections and run defaults.

Parameters:
  • config (spatial_vtk.config.runtime.SpatialVTKConfig) – Loaded Spatial-VTK config.

  • command (str, optional) – Dotted command key used to merge run defaults.

  • overrides (dict[str, Any] | None, optional) – Explicit values for this run. These override the config file and any selected run scenario.

Returns:

Resolved metric plan.

Return type:

MetricPlan

Catalogs and File Formats

Catalog readers for event, station, and geologic context tables.

spatial_vtk.io.catalogs.context_dataset_paths(root: str | Path | None = None) dict[str, Path]

Return default public context-dataset paths.

Parameters:
  • root (str | pathlib.Path | None, optional) – Optional repository root containing examples/data.

Returns:

Named paths for events, geology, regions, subbasins, and event patches.

Return type:

dict

spatial_vtk.io.catalogs.read_event_patch_table(path: str | Path | None = None, **kwargs) DataFrame

Read an optional event patch/context table.

Parameters:
  • path (str | pathlib.Path | None, optional) – Event patch table path. When omitted, the public example path is used.

  • **kwargs – Additional arguments forwarded to pandas.read_csv.

Returns:

Event patch table.

Return type:

pandas.DataFrame

spatial_vtk.io.catalogs.read_events(path: str | Path | None = None, **kwargs) DataFrame

Read and standardize an event catalog.

Parameters:
  • path (str | pathlib.Path | None, optional) – Event catalog path. When omitted, the public example path is used.

  • **kwargs – Additional arguments forwarded to pandas.read_csv.

Returns:

Event table with standardized event and coordinate columns.

Return type:

pandas.DataFrame

spatial_vtk.io.catalogs.read_stations(path: str | Path, **kwargs) DataFrame

Read and standardize a station catalog.

Parameters:
  • path (str | pathlib.Path) – Station catalog path.

  • **kwargs – Additional arguments forwarded to pandas.read_csv.

Returns:

Station table with standardized station and coordinate columns.

Return type:

pandas.DataFrame

Synthetic data format inspection and Salvus handling choices.

Synthetic Formats Overview

This module classifies each configured synthetic root as either a normalized waveform product (MiniSEED, ASDF, or HDF5/H5) or a raw Salvus output layout that needs coordinate and metadata correction before use.

Synthetic Formats Examples

Inspect one root:

from spatial_vtk.io.synthetic_formats import inspect_synthetic_format

info = inspect_synthetic_format("data/examples/synthetics")
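
As a rough illustration of what format classification involves, the sketch below labels a single path by file suffix alone. The function name classify_by_suffix, the suffix table, and the returned labels are assumptions made for this example; inspect_synthetic_format() may apply different rules, and this sketch makes no attempt to detect raw Salvus layouts.

```python
from pathlib import Path

# Hypothetical suffix table for illustration; not taken from spatial_vtk.
SUFFIX_LABELS = {
    ".mseed": "miniseed",
    ".asdf": "asdf",
    ".h5": "hdf5",
    ".hdf5": "hdf5",
}


def classify_by_suffix(path):
    """Return a coarse format label for one file path, or 'unknown'."""
    return SUFFIX_LABELS.get(Path(path).suffix.lower(), "unknown")
```

A suffix check is only a first pass: an HDF5 container can hold either a normalized product or a raw receiver file, so a real classifier would likely also need to look inside the file.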

class spatial_vtk.io.synthetic_formats.SalvusConversionRequest(components: tuple[str, ...] = ('N', 'E', 'R', 'T', 'Z'), event_latitude: float | None = None, event_longitude: float | None = None, station_coordinates: Mapping[str, Mapping[str, float]] | None = None, starttime_override: str | None = None, component_map: Mapping[str, str] | None = None, acceleration_scale: float = 100.0, input_acceleration_units: str = 'm/s^2', output_acceleration_units: str = 'cm/s^2')

Options for normalizing raw Salvus-style XYZ synthetic traces.

Parameters:
  • components (tuple[str, ...]) – Desired output components. Supported values are N, E, Z, R, and T.

  • event_latitude (float | None) – Event latitude used to rotate north/east traces to radial/transverse.

  • event_longitude (float | None) – Event longitude used to rotate north/east traces to radial/transverse.

  • station_coordinates (Mapping[str, Mapping[str, float]] | None) – Mapping from station code, NET.STA, or NET.STA.LOC to latitude and longitude values.

  • starttime_override (str | None) – Explicit start time to apply to every output trace.

  • component_map (Mapping[str, str] | None) – Optional raw-to-normalized component map. Defaults to X -> E, Y -> N, and Z -> Z.

  • acceleration_scale (float) – Multiplicative scale applied while reading raw point/acceleration values. Salvus receiver files are commonly written in m/s^2 while validation workflows often compare acceleration in cm/s^2; the default therefore multiplies by 100.

  • input_acceleration_units (str) – Human-readable label for the input units, recorded on converted traces and ASDF metadata.

  • output_acceleration_units (str) – Human-readable label for the output units, recorded on converted traces and ASDF metadata.
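
The options above can be pictured with a small standalone sketch that works on plain Python lists rather than ObsPy traces. It applies the documented default component map (X -> E, Y -> N, Z -> Z) and the default scale of 100 (m/s^2 to cm/s^2), then rotates north/east samples to radial/transverse. The helper names rename_and_scale and rotate_ne_to_rt are hypothetical, the back azimuth is passed in directly instead of being derived from event and station coordinates, and the sign convention (radial pointing away from the source) is an assumption about what the library does.

```python
import math

# Documented default raw-to-normalized component map.
DEFAULT_COMPONENT_MAP = {"X": "E", "Y": "N", "Z": "Z"}


def rename_and_scale(raw, component_map=None, acceleration_scale=100.0):
    """Map raw X/Y/Z sample lists to E/N/Z and apply the unit scale."""
    component_map = component_map or DEFAULT_COMPONENT_MAP
    return {
        component_map[axis]: [value * acceleration_scale for value in samples]
        for axis, samples in raw.items()
    }


def rotate_ne_to_rt(north, east, back_azimuth_deg):
    """Rotate N/E samples to R/T (assumed convention: radial away from source)."""
    ba = math.radians(back_azimuth_deg)
    radial = [-e * math.sin(ba) - n * math.cos(ba) for n, e in zip(north, east)]
    transverse = [-e * math.cos(ba) + n * math.sin(ba) for n, e in zip(north, east)]
    return radial, transverse


# Illustrative samples in m/s^2; the values are made up for this example.
raw = {"X": [0.01, 0.02], "Y": [0.0, -0.01], "Z": [0.5, 0.25]}
normalized = rename_and_scale(raw)
radial, transverse = rotate_ne_to_rt(normalized["N"], normalized["E"], 180.0)
```

The rotation is the step that needs both event and station coordinates, which is why R and T can only be produced when those are supplied.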

class spatial_vtk.io.synthetic_formats.SyntheticFormatInfo(root: str, format: str, normalized: bool, needs_salvus_handling: bool, handling_mode: str | None = None, converted_root: str | None = None)

Description of one synthetic data layout.

class spatial_vtk.io.synthetic_formats.SyntheticReadRequest(event_id: str | None = None, station: str | None = None, component: str | None = None, stations: tuple[str, ...] | None = None, components: tuple[str, ...] | None = None, pattern: str | None = None)

Request used by the normalized synthetic reader interface.

class spatial_vtk.io.synthetic_formats.SyntheticReader(info: SyntheticFormatInfo)

Read normalized synthetic waveforms through one format-aware interface.

list_stations(request: SyntheticReadRequest | None = None) list[str]

List station codes available for one request without loading all traces.

Parameters:
  • request (spatial_vtk.io.synthetic_formats.SyntheticReadRequest | None, optional) – Request that narrows the listing. Defaults to None.

Returns:

Station codes available for the request.

Return type:

list of str

read(request: SyntheticReadRequest | None = None)

Read synthetic traces for one request.

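Parameters:
  • request (spatial_vtk.io.synthetic_formats.SyntheticReadRequest | None, optional) – Request selecting the traces to read. Defaults to None.

Returns: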

ObsPy Stream for MiniSEED and best-effort ASDF reads. HDF5/Salvus readers raise clear errors until a project-specific schema adapter is supplied.

Return type:

object

spatial_vtk.io.synthetic_formats.inspect_synthetic_format(input_syn_path: str | Path) SyntheticFormatInfo

Inspect one synthetic path and classify its format.

Parameters:
  • input_syn_path (str | pathlib.Path) – Synthetic root, file, or glob template.

Returns:

Format classification.

Return type:

SyntheticFormatInfo

spatial_vtk.io.synthetic_formats.normalize_salvus_outputs_once(info: SyntheticFormatInfo, *, output_root: str | Path) SyntheticFormatInfo

Record the intended converted output root for raw Salvus products.

Parameters:
  • info – Raw Salvus format info.

  • output_root – Root where normalized products should be written.

Returns:

Updated info pointing to the converted root.

Return type:

SyntheticFormatInfo

Notes

This function records the conversion target. Use write_normalized_salvus_mseed() when an ObsPy stream has already been loaded and should be normalized into a reusable MiniSEED product. Raw HDF5 file layouts still require a schema adapter before they can be converted.

spatial_vtk.io.synthetic_formats.normalize_salvus_stream(stream: Any, request: SalvusConversionRequest | None = None)

Return a normalized ObsPy Stream for raw Salvus-style synthetic traces.

Parameters:
  • stream (Any) – ObsPy Stream containing XYZ or already normalized component suffixes.

  • request (spatial_vtk.io.synthetic_formats.SalvusConversionRequest | None, optional) – Conversion options. Defaults to N/E/R/T/Z output with X -> E, Y -> N, and Z -> Z mapping.

Returns:

New stream with requested components. The input stream is not mutated.

Return type:

obspy.Stream

spatial_vtk.io.synthetic_formats.prompt_for_salvus_handling(info: SyntheticFormatInfo) SyntheticFormatInfo

Prompt for handling raw Salvus outputs.

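Parameters:
  • info (spatial_vtk.io.synthetic_formats.SyntheticFormatInfo) – Format info for the synthetic layout being handled.

Returns: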

Updated info with handling mode, or the input info for normalized data.

Return type:

SyntheticFormatInfo

spatial_vtk.io.synthetic_formats.read_salvus_receivers_h5(path: str | Path, *, origin_time: str, request: SalvusConversionRequest | None = None, receiver_start: int = 0, receiver_stop: int | None = None)

Read and normalize one Salvus receivers.h5 file.

Parameters:
  • path (str | pathlib.Path) – HDF5 file containing the supported Salvus receiver schema with names_ELASTIC_point and point/acceleration.

  • origin_time (str) – Authoritative event origin time. The file’s start_time_in_seconds offset is applied relative to this time.

  • request (spatial_vtk.io.synthetic_formats.SalvusConversionRequest | None, optional) – Optional conversion request. When omitted, only N/E/Z components are returned because R/T requires station coordinates.

  • receiver_start (int, optional) – Optional receiver index range for chunked conversion.

  • receiver_stop (int | None, optional) – Optional receiver index range for chunked conversion.

Returns:

Normalized stream for the requested receiver range.

Return type:

obspy.Stream
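
The relationship between origin_time and the file's start_time_in_seconds offset can be sketched with the standard library. This is a minimal illustration, not the library's code: the helper name trace_start_time is hypothetical, the origin time is an arbitrary example value, and treating a timestamp without an offset as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone


def trace_start_time(origin_time, start_time_in_seconds):
    """Return the absolute trace start as a timezone-aware UTC datetime."""
    origin = datetime.fromisoformat(origin_time)
    if origin.tzinfo is None:
        # Assumption: a naive timestamp is interpreted as UTC.
        origin = origin.replace(tzinfo=timezone.utc)
    return origin + timedelta(seconds=start_time_in_seconds)


# A negative offset means the simulation started before the origin time.
start = trace_start_time("2019-07-06T03:19:53", -0.25)
```

Keeping the origin time as the authoritative reference means the same receivers file yields consistent absolute times regardless of how the simulation's internal clock was configured.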

spatial_vtk.io.synthetic_formats.synthetic_reader_for(info: SyntheticFormatInfo) SyntheticReader

Return a normalized reader for one synthetic format.

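Parameters:
  • info (spatial_vtk.io.synthetic_formats.SyntheticFormatInfo) – Format info describing the synthetic layout to read.

Returns: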

Reader that exposes a common read() method.

Return type:

SyntheticReader

spatial_vtk.io.synthetic_formats.write_normalized_salvus_mseed(stream: Any, output_root: str | Path, *, event_id: str, request: SalvusConversionRequest | None = None) Path

Normalize a Salvus-style stream and write one event MiniSEED file.

Parameters:
  • stream (Any) – ObsPy Stream containing raw Salvus-style traces.

  • output_root (str | pathlib.Path) – Directory where the normalized event file should be written.

  • event_id (str) – Event identifier used in the output filename.

  • request (spatial_vtk.io.synthetic_formats.SalvusConversionRequest | None, optional) – Optional conversion request.

Returns:

Written MiniSEED path.

Return type:

pathlib.Path

spatial_vtk.io.synthetic_formats.write_salvus_receivers_h5_mseed(path: str | Path, output_root: str | Path, *, event_id: str, origin_time: str, request: SalvusConversionRequest | None = None, receiver_start: int = 0, receiver_stop: int | None = None) Path

Convert one Salvus receivers.h5 file into normalized MiniSEED.

Parameters:
  • path (str | pathlib.Path) – Salvus receivers.h5 file to convert.

  • output_root (str | pathlib.Path) – Directory where the normalized MiniSEED output should be written.

  • event_id (str) – Event identifier for the converted output.

  • origin_time (str) – Event origin time, as in read_salvus_receivers_h5().

  • request (spatial_vtk.io.synthetic_formats.SalvusConversionRequest | None, optional) – Optional conversion request. Defaults to None.

  • receiver_start (int, optional) – Start of the optional receiver index range for chunked conversion. Defaults to 0.

  • receiver_stop (int | None, optional) – End of the optional receiver index range for chunked conversion. Defaults to None.

Returns:

Path to the written MiniSEED file.

Return type:

pathlib.Path
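
For chunked conversion, the receiver index range has to be split into consecutive pieces. The sketch below generates such pieces; the helper name receiver_chunks is hypothetical, and it assumes receiver_stop is exclusive in the way Python slices are, which this reference does not state.

```python
def receiver_chunks(n_receivers, chunk_size):
    """Yield (receiver_start, receiver_stop) pairs covering range(n_receivers)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_receivers, chunk_size):
        # Assumption: the stop index is exclusive, like a Python slice.
        yield start, min(start + chunk_size, n_receivers)


chunks = list(receiver_chunks(10, 4))
```

Each pair could then be passed as receiver_start and receiver_stop. How the outputs of separate chunks are named or combined is not covered here.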

Synthetic model alias discovery and folder resolution.

class spatial_vtk.io.model_aliases.ModelFolderCandidate(folder: str, base_model: str, basin_scope: str, has_ely: bool, implementation_tokens: tuple[str, ...], default_alias: str)

Describe one discovered synthetic model folder.

class spatial_vtk.io.model_aliases.ModelResolution(models: tuple[str, ...], model_folders: dict[str, str], ambiguous: dict[str, tuple[ModelFolderCandidate, ...]])

Resolved model aliases and backing folders.

spatial_vtk.io.model_aliases.available_base_models(input_syn_path: str | Path) list[str]

Return discovered base model families.

Parameters:
  • input_syn_path (str | pathlib.Path) – Synthetic root directory or template path.

Returns:

Sorted base-model family names.

Return type:

list of str

spatial_vtk.io.model_aliases.classify_model_folder(folder_name: str) ModelFolderCandidate | None

Classify one folder name into a model candidate.

Parameters:
  • folder_name (str) – Folder name to classify.

Returns:

Classified model candidate, or None when no known model token is found.

Return type:

ModelFolderCandidate or None

spatial_vtk.io.model_aliases.normalize_model_alias(alias: str) str

Normalize model alias spelling.

Parameters:
  • alias (str) – User-supplied model alias.

Returns:

Normalized alias.

Return type:

str

spatial_vtk.io.model_aliases.resolve_model_aliases(requested: list[str], input_syn_path: str | Path, *, model_folders: dict[str, str] | None = None, allow_ambiguous: bool = False) ModelResolution

Resolve requested model aliases to folders under a synthetic root.

Parameters:
  • requested (list) – Requested aliases or folder names.

  • input_syn_path (str | pathlib.Path) – Synthetic root directory or path template containing {model}.

  • model_folders (dict[str, str] | None, optional) – Optional explicit alias-to-folder mapping.

  • allow_ambiguous (bool, optional) – If true, keep all folder matches by adding variant suffixes.

Returns:

Resolved aliases, selected folders, and any ambiguous matches.

Return type:

ModelResolution
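
Because input_syn_path may be either a plain root directory or a template containing {model}, a resolver has to handle both forms. The sketch below shows one way that expansion could look; the helper name expand_model_template and the folder name used in the example are hypothetical, and the actual resolution logic in resolve_model_aliases() may differ.

```python
from pathlib import Path


def expand_model_template(input_syn_path, folder):
    """Return the folder path for one model under a root or {model} template."""
    text = str(input_syn_path)
    if "{model}" in text:
        # Template form: substitute the folder name into the placeholder.
        return Path(text.replace("{model}", folder))
    # Plain root form: the model folder is an immediate child of the root.
    return Path(text) / folder


templated = expand_model_template("data/syn/{model}/waveforms", "model_a")
rooted = expand_model_template("data/syn", "model_a")
```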

spatial_vtk.io.model_aliases.scan_synthetic_model_folders(input_syn_path: str | Path) list[ModelFolderCandidate]

Scan a synthetic root and classify immediate child folders.

Parameters:
  • input_syn_path (str | pathlib.Path) – Synthetic root directory or path template containing {model}.

Returns:

Classified folders sorted by folder name.

Return type:

list of ModelFolderCandidate

Geospatial and Layout Exports

KML export helpers for station and event context maps.

spatial_vtk.io.kml.write_station_event_kml(stations: DataFrame, events: DataFrame, output_path: str | Path, *, station_col: str = 'station', station_lat_col: str = 'lat', station_lon_col: str = 'lon', event_col: str = 'event_id', event_lat_col: str = 'lat', event_lon_col: str = 'lon') Path

Write station and event coordinates to a simple KML document.

Parameters:
  • stations (pandas.DataFrame) – Station table containing names and coordinates.

  • events (pandas.DataFrame) – Event table containing names and coordinates.

  • output_path (str | pathlib.Path) – Destination KML path.

  • station_col (str, optional) – Station name column. Defaults to 'station'.

  • station_lat_col (str, optional) – Station latitude column. Defaults to 'lat'.

  • station_lon_col (str, optional) – Station longitude column. Defaults to 'lon'.

  • event_col (str, optional) – Event name column. Defaults to 'event_id'.

  • event_lat_col (str, optional) – Event latitude column. Defaults to 'lat'.

  • event_lon_col (str, optional) – Event longitude column. Defaults to 'lon'.

Returns:

Path to the written KML file.

Return type:

pathlib.Path
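
To show what a simple KML document of point placemarks looks like, the sketch below builds one with the standard library from plain tuples instead of DataFrames. The helper name build_kml and the sample coordinates are made up for this example, and the structure that write_station_event_kml() actually emits may differ. One detail worth noting is that KML stores coordinates in longitude,latitude order.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(points):
    """Build a KML string from (name, lat, lon) tuples."""
    # Serialize with KML as the default namespace (no prefix on tags).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lat, lon in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = str(name)
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinate order is longitude,latitude[,altitude].
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


kml_text = build_kml([("STA1", 34.05, -118.25)])
```

Swapping latitude and longitude is a common mistake when the source tables use lat/lon column order, so it helps to check one known location in a viewer after export.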

File-layout inspection helpers for station/event datasets.

spatial_vtk.io.layouts.inspect_station_event_layouts(root: str | Path, *, suffixes: tuple[str, ...] = ('.csv', '.json', '.geojson', '.mseed', '.h5', '.hdf5', '.asdf'), max_files: int | None = None) DataFrame

Inventory station/event files below a root directory.

Parameters:
  • root (str | pathlib.Path) – Directory to inspect.

  • suffixes (tuple, optional) – File suffixes to include.

  • max_files (int | None, optional) – Optional cap on the number of files returned.

Returns:

File inventory with relative path, suffix, size, and light metadata.

Return type:

pandas.DataFrame
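
The inventory described above can be approximated with pathlib alone. This is a simplified sketch rather than the library's implementation: the helper name inventory_files and the dictionary keys are hypothetical, it returns a list of dicts instead of a DataFrame, and it omits the light metadata column.

```python
from pathlib import Path


def inventory_files(root, suffixes=(".csv", ".mseed", ".h5"), max_files=None):
    """Return one dict per matching file below root, sorted by path."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        rows.append(
            {
                "relative_path": path.relative_to(root).as_posix(),
                "suffix": path.suffix.lower(),
                "size_bytes": path.stat().st_size,
            }
        )
        # Stop early once the optional cap is reached.
        if max_files is not None and len(rows) >= max_files:
            break
    return rows
```

Capping the result with max_files keeps a first look at a large archive cheap before committing to a full scan.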
