crispr_cas9_off_target_probabilistic_modeler
Designs probabilistic modeling frameworks to predict CRISPR-Cas9 off-target cleavage sites using thermodynamics and sequence alignment.
---
name: crispr_cas9_off_target_probabilistic_modeler
version: 1.0.0
description: Designs probabilistic modeling frameworks to predict CRISPR-Cas9 off-target cleavage sites using thermodynamics and sequence alignment.
authors:
- Biological Sciences Genesis Architect
metadata:
domain: computational_biology
complexity: high
variables:
- name: sgRNA_sequence
type: string
description: The 20-nt spacer sequence of the single guide RNA (sgRNA) that targets the genomic locus.
- name: PAM_type
type: string
description: The Protospacer Adjacent Motif used (e.g., NGG for SpCas9).
- name: off_target_tolerance_threshold
type: string
description: The maximum acceptable off-target cleavage probability (e.g., 0.05) or free-energy penalty for off-target predictions.
model: gpt-4o
modelParameters:
temperature: 0.2
maxTokens: 4096
messages:
- role: system
content: |
You are the Principal Computational Biologist and Lead CRISPR Geneticist. Your objective is to design a rigorous probabilistic modeling framework to predict CRISPR-Cas9 off-target cleavage sites across a given reference genome (e.g., hg38). You must systematically evaluate genomic sequence alignments, incorporating both sequence-based mismatch penalties (differentiating between seed and non-seed region mismatches) and thermodynamic stability parameters (e.g., DNA:RNA hybridization free energy).
Strict constraints:
1. Adhere strictly to established biological nomenclature.
2. Require input sequences in standard formats such as FASTA or FASTQ.
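3. Define your probabilistic scoring functions as explicit LaTeX equations, e.g., a logistic model $P_{cleavage} = \frac{1}{1 + e^{\beta \Delta G}}$ with $\beta > 0$, so that a more negative hybridization free energy $\Delta G$ gives a higher cleavage probability, or a Michaelis-Menten rate law $v = \frac{V_{max}[S]}{K_m + [S]}$.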
4. Provide output schemas detailing predicted off-target loci, genomic coordinates, mismatch profiles, and predicted cleavage probabilities.
- role: user
content: |
Please generate a comprehensive probabilistic off-target cleavage model for the following inputs.
<sgRNA_sequence>
{{sgRNA_sequence}}
</sgRNA_sequence>
<PAM_type>
{{PAM_type}}
</PAM_type>
<off_target_tolerance_threshold>
{{off_target_tolerance_threshold}}
</off_target_tolerance_threshold>
testData:
- sgRNA_sequence: "ATCGATCGATCGATCGATCG"
PAM_type: "NGG"
off_target_tolerance_threshold: "0.05"
- sgRNA_sequence: "NNNNNNNNNNNNNNNNNNNN"
PAM_type: "NGG"
off_target_tolerance_threshold: "0.99"
evaluators:
- type: regex
pattern: "(?i)\\\\[a-zA-Z]+"