Snowflake Export Example

Export study records to Snowflake using the staged Parquet + COPY INTO path.

Prerequisites

  • imednet[snowflake] installed:

    pip install 'imednet[snowflake]'
    
  • Access to a Snowflake account with an internal stage and sufficient privileges to execute PUT and COPY INTO.

Environment variables

export IMEDNET_API_KEY=...
export IMEDNET_SECURITY_KEY=...      # if required by your environment
export SNOWFLAKE_ACCOUNT=myorg-myaccount
export SNOWFLAKE_USER=loader
export SNOWFLAKE_PASSWORD=...        # never hardcode
export SNOWFLAKE_DATABASE=IMEDNET_DB
export SNOWFLAKE_SCHEMA=PUBLIC
export SNOWFLAKE_WAREHOUSE=COMPUTE_WH
export SNOWFLAKE_STAGE=MY_STAGE
export SNOWFLAKE_TABLE=RECORDS
export IMEDNET_STUDY_KEY=MY_STUDY

Description

Records are fetched from the iMednet API, serialised to Parquet in a local staging directory, uploaded to the Snowflake internal stage via PUT, and bulk-loaded with COPY INTO. See Snowflake Export for a full explanation of the framework, idempotency, and cleanup behaviour.

"""Export study records to Snowflake using staged Parquet + COPY INTO.

Requires:
    pip install 'imednet[snowflake]'

Environment variables:
    IMEDNET_API_KEY       – iMednet API key
    IMEDNET_SECURITY_KEY  – iMednet security key (if required)
    SNOWFLAKE_PASSWORD    – Snowflake user password (never hardcode)

Usage::

    python examples/snowflake_export.py
"""

from __future__ import annotations

import os

from imednet import ImednetSDK
from imednet.integrations import SnowflakeSinkConfig, export_to_snowflake

# ---------------------------------------------------------------------------
# SDK initialisation — credentials are read from environment variables only.
# ---------------------------------------------------------------------------
sdk = ImednetSDK(
    api_key=os.environ["IMEDNET_API_KEY"],
    security_key=os.environ.get("IMEDNET_SECURITY_KEY", ""),
)

# ---------------------------------------------------------------------------
# Snowflake sink configuration.
# Credentials are passed via environment variables so they never appear in
# source code, logs, or exception messages.
# ---------------------------------------------------------------------------
config = SnowflakeSinkConfig(
    account=os.environ.get("SNOWFLAKE_ACCOUNT", "myorg-myaccount"),
    user=os.environ.get("SNOWFLAKE_USER", "loader"),
    **{"password": os.environ["SNOWFLAKE_PASSWORD"]},
    database=os.environ.get("SNOWFLAKE_DATABASE", "IMEDNET_DB"),
    schema=os.environ.get("SNOWFLAKE_SCHEMA", "PUBLIC"),
    warehouse=os.environ.get("SNOWFLAKE_WAREHOUSE", "COMPUTE_WH"),
    stage=os.environ.get("SNOWFLAKE_STAGE", "MY_STAGE"),
    table=os.environ.get("SNOWFLAKE_TABLE", "RECORDS"),
    # Optional: explicit local staging directory.  When omitted a temporary
    # directory is created and cleaned up automatically on exit.
    # local_staging_dir="/tmp/imednet_stage",
    #
    # Optional: JSONL manifest file for auditing loaded batches.
    # manifest_path="/tmp/imednet_manifest.jsonl",
    #
    # Idempotent mode (default True) uses FORCE = FALSE in COPY INTO so that
    # re-running this script does not reload already-ingested files.
    idempotent=True,
)

STUDY_KEY = os.environ.get("IMEDNET_STUDY_KEY", "MY_STUDY")

# ---------------------------------------------------------------------------
# Export
# ---------------------------------------------------------------------------
rows_loaded = export_to_snowflake(sdk, STUDY_KEY, config=config)
print(f"Loaded {rows_loaded} rows from study '{STUDY_KEY}' into Snowflake.")