Governance — Trust, Traceability & Audit Trail

The governance module provides three core capabilities required for regulated clinical data environments:

  1. Configuration Version Control — immutable, SHA-256-signed study configuration history with diff and rollback.

  2. Publisher Wizard (Streamlit page) — multi-stage validation checklist and security-gated publish workflow.

  3. Data Lineage Explorer (Streamlit page) — interactive drill-down from aggregated dashboard metrics to the underlying raw EDC record payloads.

These features are implemented across two packages:

  • imednet-workflows — the backend ConfigVersionStore class.

  • imednet-streamlit — the two Streamlit dashboard pages.

Configuration Version Control

imednet_workflows.config_version_control provides ConfigVersionStore, a thread-safe SQLite-backed ledger for StudyConfiguration versions.

Key properties

  • Immutability — commit rows are append-only; no history block can be edited or deleted in place.

  • Hash integrity — each commit is identified by the SHA-256 digest of its serialised JSON body. Attempting to commit an unchanged configuration raises ValueError.

  • Rollback safety — rollback_config() is non-destructive; it returns the historical StudyConfiguration without touching any existing rows.
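The hash-integrity property can be illustrated with the standard library alone. The sketch below is not the module's own serialiser: the function name content_hash and the canonicalisation choices (sorted keys, compact separators) are assumptions made for illustration. It shows why a content-derived identifier makes an unchanged configuration detectable.

```python
import hashlib
import json


def content_hash(body: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON rendering of body."""
    # Sorted keys and fixed separators make the serialisation deterministic,
    # so logically equal configurations produce the same digest.
    # (Assumed canonical form; the real store may serialise differently.)
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = content_hash({"studyKey": "MY_STUDY", "mappings": []})
b = content_hash({"mappings": [], "studyKey": "MY_STUDY"})  # same content, reordered
c = content_hash({"studyKey": "OTHER", "mappings": []})     # different content

print(a == b, a == c, len(a))  # True False 64
```

Any change to the body, however small, yields a different digest, which is what allows a commit ID to double as an integrity check.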

Quick-start

from imednet.models.study_config import StudyConfiguration, MappingRule
from imednet_workflows import ConfigVersionStore

store = ConfigVersionStore()          # default: ~/.imednet/config_versions.sqlite3

config = StudyConfiguration(
    studyKey="MY_STUDY",
    mappings=[
        MappingRule(
            domain="AE",
            targetField="aeTerm",
            sourceFormKey="AE_FORM",
            sourceVariableName="ae_term",
        )
    ],
)

# Commit a version
commit_id = store.commit_config(
    study_key="MY_STUDY",
    config=config,
    user="alice",
    desc="Initial mapping configuration",
)
print(commit_id)   # SHA-256 hex digest

# Browse history
for entry in store.get_history("MY_STUDY"):
    print(entry["version_tag"], entry["commit_id"][:12], entry["timestamp"])

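# Diff two versions (needs at least two commits; IDs are taken from the history)
history = store.get_history("MY_STUDY")
commit_a, commit_b = history[0]["commit_id"], history[-1]["commit_id"]
diff = store.diff_configs(commit_a, commit_b)
print(diff["added"], diff["removed"], diff["changed"])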

# Rollback (read-only)
old_config = store.rollback_config("MY_STUDY", commit_id)

API reference

class imednet_workflows.config_version_control.ConfigVersionStore

Bases: object

Immutable append-only store for StudyConfiguration versions.

Each call to commit_config() creates a new entry signed with a SHA-256 digest of the serialised configuration body. History is strictly read-only — individual commit rows may never be edited or deleted.

Parameters:

db_path (str | Path) – Filesystem path for the SQLite database. Defaults to ~/.imednet/config_versions.sqlite3.

__init__(db_path=PosixPath('~/.imednet/config_versions.sqlite3'))
Parameters:

db_path (str | Path) – Filesystem path for the SQLite database.

Return type:

None

commit_config(study_key, config, user, desc)

Serialise config, compute its SHA-256 hash, and persist the commit.

Parameters:
  • study_key (str) – Identifies the study this configuration belongs to.

  • config (StudyConfiguration) – The StudyConfiguration to store.

  • user (str) – Identifier of the person or process making the change.

  • desc (str) – Human-readable description of what changed.

Return type:

str

Returns:

The commit_id (SHA-256 hex digest of the serialised JSON body).

Raises:

ValueError – If a commit with the same content hash already exists for this study, indicating a no-op duplicate.
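The duplicate check can be pictured with a small stand-in ledger. This is a sketch only: TinyLedger, its table layout, and its method names are hypothetical and are not the ConfigVersionStore implementation. It assumes that uniqueness is enforced per study through a composite primary key, and it makes no attempt at thread safety.

```python
import hashlib
import json
import sqlite3
from datetime import datetime, timezone


class TinyLedger:
    """Hypothetical append-only ledger; not the real ConfigVersionStore."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(db_path)
        # The composite primary key rejects a second row with the same
        # content hash for the same study.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS commits ("
            " commit_id TEXT, study_key TEXT, modified_by TEXT,"
            " description TEXT, timestamp TEXT, config_data TEXT,"
            " PRIMARY KEY (study_key, commit_id))"
        )

    def commit(self, study_key: str, body: dict, user: str, desc: str) -> str:
        data = json.dumps(body, sort_keys=True)
        commit_id = hashlib.sha256(data.encode("utf-8")).hexdigest()
        try:
            with self._conn:  # commits on success, rolls back on error
                self._conn.execute(
                    "INSERT INTO commits VALUES (?, ?, ?, ?, ?, ?)",
                    (commit_id, study_key, user, desc,
                     datetime.now(timezone.utc).isoformat(), data),
                )
        except sqlite3.IntegrityError as exc:
            raise ValueError(f"duplicate configuration for {study_key}") from exc
        return commit_id

    def history(self, study_key: str) -> list:
        # Only INSERT and SELECT are ever issued, so rows are never rewritten.
        rows = self._conn.execute(
            "SELECT commit_id FROM commits WHERE study_key = ? ORDER BY rowid",
            (study_key,),
        )
        return [row[0] for row in rows]


ledger = TinyLedger()
first = ledger.commit("MY_STUDY", {"version": "1.0.0"}, "alice", "initial")
try:
    ledger.commit("MY_STUDY", {"version": "1.0.0"}, "alice", "no change")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
second = ledger.commit("MY_STUDY", {"version": "1.0.1"}, "alice", "bump")
```

Callers of the real commit_config() can treat ValueError the same way: as a signal that nothing changed, rather than as a storage failure.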

diff_configs(commit_a, commit_b)

Compute a property-level diff between two commits.

Compares the flat JSON key space of the two commits. Returns a dict with three sub-keys:

  • added — keys present in b but not in a.

  • removed — keys present in a but not in b.

  • changed — keys present in both but with different values.

Parameters:
  • commit_a (str) – SHA-256 commit ID of the before state.

  • commit_b (str) – SHA-256 commit ID of the after state.

Return type:

dict[str, Any]

Returns:

Dict with added, removed, and changed sub-dicts.

Raises:

KeyError – If either commit ID is not found in the store.
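A flat-key comparison of this kind can be sketched as follows. The helper names flatten and diff_flat, the dotted-path key format, and the before/after shape of the changed entries are assumptions for illustration; the source only states that the result has added, removed, and changed sub-dicts.

```python
from typing import Any


def flatten(obj: Any, prefix: str = "") -> dict:
    """Flatten nested dicts and lists into one level of dotted keys."""
    if isinstance(obj, dict) and obj:
        items = [(str(k), v) for k, v in obj.items()]
    elif isinstance(obj, list) and obj:
        items = [(str(i), v) for i, v in enumerate(obj)]
    else:
        # Leaf value or empty container: keep it under its path.
        # An empty top-level object flattens to no keys at all.
        return {prefix: obj} if prefix else {}
    flat: dict = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(value, path))
    return flat


def diff_flat(a: dict, b: dict) -> dict:
    """Compare two JSON-like bodies over their flattened key space."""
    fa, fb = flatten(a), flatten(b)
    return {
        "added": {k: fb[k] for k in fb.keys() - fa.keys()},
        "removed": {k: fa[k] for k in fa.keys() - fb.keys()},
        "changed": {
            k: {"before": fa[k], "after": fb[k]}
            for k in fa.keys() & fb.keys()
            if fa[k] != fb[k]
        },
    }


before = {"studyKey": "MY_STUDY", "version": "1.0.0",
          "mappings": [{"domain": "AE"}]}
after = {"studyKey": "MY_STUDY", "version": "1.0.1",
         "mappings": [{"domain": "AE", "targetField": "aeTerm"}]}
result = diff_flat(before, after)
```

Here result["added"] holds the key mappings.0.targetField, result["removed"] is empty, and result["changed"] holds version with its before and after values.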

get_history(study_key)

Return all commits for study_key, ordered oldest-first.

Parameters:

study_key (str) – The study whose history should be retrieved.

Return type:

list[dict[str, Any]]

Returns:

A list of dicts, each with keys commit_id, study_key, version_tag, modified_by, description, and timestamp. The config_data body is intentionally omitted to keep the payload small — use rollback_config() to retrieve the full body for a specific commit.

rollback_config(study_key, commit_id)

Restore and return the StudyConfiguration stored at commit_id.

This method is non-destructive — it does not modify any existing history rows. The caller is responsible for creating a new commit via commit_config() if they wish to record the rollback.

Parameters:
  • study_key (str) – The study the commit must belong to.

  • commit_id (str) – The SHA-256 commit ID to restore.

Return type:

StudyConfiguration

Returns:

The deserialised StudyConfiguration.

Raises:

KeyError – If the commit is not found or does not belong to the requested study.

Publisher Wizard (Streamlit page)

The Publisher Wizard (imednet_streamlit.pages.publisher_wizard) is a Streamlit dashboard page that wraps the configuration version control system with a security-gated publish workflow.

Workflow stages

  1. Identity — the user enters a username and selects a role. Only manager and admin roles may proceed to publish.

  2. History — a select box lists all committed versions for the active study key. The user chooses the version to deploy.

  3. Raw JSON viewer — the full configuration JSON can be inspected in an expandable panel.

  4. Historical diff — a side-by-side diff between any two historical commits is rendered before approval.

  5. Standards-readiness checklist — automated checks verify:

    • Field mappings are defined.

    • Terminology normalisation rules are present.

    • Dashboard widgets are configured.

    • The version tag is well-formed (semver-like).

    • The study key is non-empty.

  6. Approve & Publish — an authorised user clicks the guarded button. On success a new commit is recorded in the ledger with a bumped patch version, providing a full audit trail of the publish event.
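The gating logic in stages 1, 5, and 6 can be approximated in a few lines. This is a sketch under stated assumptions, not the page's code: the function names, the regular expression used for "semver-like", and the configuration field names terminology, widgets, and version are hypothetical. Only the two publishing roles and the five checks themselves come from the description above.

```python
import re

# Assumed reading of "semver-like": MAJOR.MINOR.PATCH with an optional "v" prefix.
SEMVER_LIKE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")
PUBLISH_ROLES = {"manager", "admin"}


def can_publish(role: str) -> bool:
    """Stage 1: only manager and admin roles may proceed to publish."""
    return role in PUBLISH_ROLES


def bump_patch(tag: str) -> str:
    """Stage 6: return the tag with its patch component incremented."""
    match = SEMVER_LIKE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a semver-like tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    prefix = "v" if tag.startswith("v") else ""
    return f"{prefix}{major}.{minor}.{patch + 1}"


def readiness_checklist(config: dict) -> dict:
    """Stage 5: run the five checks against a plain-dict configuration."""
    version = str(config.get("version") or "")
    study_key = str(config.get("studyKey") or "")
    return {
        "mappings_defined": bool(config.get("mappings")),
        "terminology_rules_present": bool(config.get("terminology")),
        "widgets_configured": bool(config.get("widgets")),
        "version_tag_well_formed": SEMVER_LIKE.fullmatch(version) is not None,
        "study_key_non_empty": bool(study_key.strip()),
    }


draft = {"studyKey": "MY_STUDY", "version": "1.2.9",
         "mappings": [{"domain": "AE"}], "terminology": [], "widgets": ["ae_table"]}
checks = readiness_checklist(draft)
ready = all(checks.values()) and can_publish("manager")
next_tag = bump_patch(draft["version"])  # "1.2.10"
```

In this example ready is False because the terminology list is empty, so the guarded publish step would stay disabled until every check passes.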

Data Lineage Explorer (Streamlit page)

The Data Lineage Explorer (imednet_streamlit.pages.data_lineage) makes every aggregated metric traceable back to its source data.

Three-pane lineage view

Selecting a record index opens a side-by-side view:

  • Left pane — raw EDC record payload from the local cache database. Sensitive field names (api_key, token, secret, etc.) are automatically redacted before display.

  • Centre pane — the mapping rules from the active StudyConfiguration that were applied to this domain.

  • Right pane — the structured canonical Pydantic model (AdverseEvent, ProtocolDeviation, or DeviceDeficiency).

Credential safety

The lineage view never exposes credentials. The _redact_sensitive helper strips any dict key whose name contains the substrings password, token, secret, api_key, apikey, key, or credential before the raw payload is rendered.
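A substring-based redaction pass along these lines might look like the sketch below. It is an illustrative re-implementation, not the actual _redact_sensitive helper: the public name redact_sensitive, the case-insensitive match, the recursion into nested structures, and the choice to mask values with a placeholder instead of dropping the keys are all assumptions.

```python
from typing import Any

# Substrings listed in the documentation above.
SENSITIVE_SUBSTRINGS = (
    "password", "token", "secret", "api_key", "apikey", "key", "credential",
)


def redact_sensitive(payload: Any, mask: str = "***REDACTED***") -> Any:
    """Return a copy of payload with sensitive-looking dict values masked."""
    if isinstance(payload, dict):
        return {
            key: mask
            if any(s in str(key).lower() for s in SENSITIVE_SUBSTRINGS)
            else redact_sensitive(value, mask)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact_sensitive(item, mask) for item in payload]
    return payload  # scalars pass through unchanged


raw = {
    "recordId": 101,
    "api_key": "abc123",
    "nested": {"auth_token": "xyz", "aeTerm": "Headache"},
    "items": [{"subject_key": "S-001"}],
}
clean = redact_sensitive(raw)
```

Because key is itself one of the substrings, any field whose name contains it is affected, not only api_key and apikey. In this sketch subject_key is masked as well, which errs on the side of hiding too much.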