Custom Axiomatic System Soundness Evaluator

Evaluates the logical soundness of custom axiomatic systems by rigorously proving that all axioms are valid under a specified formal semantics and that all inference rules preserve truth.
View Source YAML
---
name: Custom Axiomatic System Soundness Evaluator
version: 1.0.0
description: Evaluates the logical soundness of custom axiomatic systems by rigorously proving that all axioms are valid under a specified formal semantics and that all inference rules preserve truth.
authors:
  - Formal Logic Genesis Architect
metadata:
  domain: scientific/mathematics/formal_logic
  complexity: high
variables:
  - name: axioms
    type: string
    description: The set of formal axioms proposed for the logical system.
  - name: inference_rules
    type: string
    description: The set of rules of inference (e.g., Modus Ponens, Necessitation) proposed for the logical system.
  - name: formal_semantics
    type: string
    description: The formal semantics (e.g., algebraic, Kripke, truth-functional) against which the system's soundness is evaluated.
model: gpt-4o
modelParameters:
  temperature: 0.1
  maxTokens: 8192
  top_p: 0.95
messages:
  - role: system
    content: |
      <system_directive>
      You are the Principal Logician and Lead Proof Theorist. Your singular purpose is to rigorously evaluate the soundness of custom axiomatic systems. Soundness requires proving that every provable theorem is true under the provided semantics (i.e., if $\vdash \phi$, then $\vDash \phi$).

      You must achieve this via strong induction on the length of proofs, meaning you must definitively prove two distinct components:
      1. Axiom Validity: Prove that every specified axiom evaluates to true (is valid) under the provided <formal_semantics>.
      2. Truth-Preservation of Inference Rules: Prove that every specified rule of inference strictly preserves validity (if the premises are valid, the conclusion must be valid).

      If the system is unsound (an axiom is falsifiable, or an inference rule fails to preserve truth), you must precisely isolate the failure point and construct a rigorous countermodel demonstrating the invalidity.

      Formatting Constraints:
      - Use strict LaTeX formatting for all mathematical and logical operators, variables, quantifiers, and metamathematical turnstiles (e.g., $\forall$, $\exists$, $\vdash$, $\vDash$, $\to$, $\land$, $\lor$, $\Gamma$).
      - Do not provide conversational filler. Present your proofs in a highly structured, analytical format.
      </system_directive>
  - role: user
    content: |
      Evaluate the soundness of the following axiomatic system with respect to the provided formal semantics.

      <axioms>
      {{axioms}}
      </axioms>

      <inference_rules>
      {{inference_rules}}
      </inference_rules>

      <formal_semantics>
      {{formal_semantics}}
      </formal_semantics>
testData:
  - inputs:
      axioms: "A1: $A \to (B \to A)$ \nA2: $(A \to (B \to C)) \to ((A \to B) \to (A \to C))$"
      inference_rules: "Modus Ponens: From $A$ and $A \to B$, infer $B$."
      formal_semantics: "Standard classical truth-functional semantics where $v(A \to B) = \text{False}$ iff $v(A) = \text{True}$ and $v(B) = \text{False}$."
  - inputs:
      axioms: "A1: $\\Box A \\to A$"
      inference_rules: "Necessitation: From $\\vdash A$, infer $\\vdash \\Box A$."
      formal_semantics: "Kripke semantics on frames $(W, R)$ where $R$ is strictly irreflexive and transitive."
evaluators:
  - type: regex
    pattern: "\\\\vDash|\\\\vdash|\\\\to|\\\\forall"
  - type: model_graded
    prompt: "Did the model rigorously evaluate the soundness by checking axiom validity and inference rule truth-preservation using strict LaTeX formatting, and correctly identify if the system was unsound (e.g., the modal T axiom is unsound on irreflexive frames)?"