Research Use Only - Validated Results

SAE Intelligence: Interpretable Genomic Features

Go beyond the score. See the exact biological features—exons, TF motifs, protein structures—that drive a prediction and understand *why* a variant is disruptive.

Why It Matters

  • Transform black-box predictions into transparent, biologically-grounded stories
  • Expose the model's internal logic to explain variant impact
  • Flag risky designs (live) and steer generative AI (roadmap)

What We Delivered

  • Interactive feature visualizations with disruption scores
  • Automated prompt safety checks
  • Clear biological explanations for every prediction

Built for Different Audiences

SAE Intelligence serves both scientific discovery and engineering excellence

For Scientists

  • **Readable Biology:** See features like exon boundaries, TF motifs, and secondary structures.
  • **Quantifiable Disruption:** Pinpoint exactly which features a variant impacts with disruption scores (ΔLL).
  • **Explainable AI:** Move from a simple score to a full, auditable explanation for every prediction.

For Engineers

  • **Live Frontend Components:** Interactive visualizations powered by robust simulations.
  • **Clear Data Contracts:** Stable JSON from simulations drives predictable UI behavior.
  • **Roadmap to Production:** Clear path from current RUO simulations to future-state backend services.

How It Works Today

Current implementation details and technical foundation

Component 1

**`DynamicOracleExplain` Component:** An interactive, multi-track visualizer that displays SAE features and their disruption scores (ΔLL) directly on the genomic sequence.

Component 2

**`simulateVariantImpactWithSAE` Function:** A simulation in `simulations.ts` that generates the feature and attribution data behind our visualizations.

Component 3

**Prompt Quality Checker:** A safety gate that flags pathological inputs (like low‑complexity repeats) in our design flows.

Core Capabilities

From feature attribution to activation steering

Feature Attribution (Live)

Technical

We simulate the extraction of active SAE features for a given sequence and calculate the change in log-likelihood (ΔLL) caused by a variant.
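
The calculation described above can be sketched as follows. This is a minimal illustration, assuming ΔLL is defined per feature as the variant log-likelihood minus the reference log-likelihood. The `FeatureLogLik` type and the `computeDeltaLL` helper are hypothetical and are not taken from `simulations.ts`; only the output field names (`featureId`, `description`, `deltaLL`) follow the data contract.

```typescript
// Hypothetical input shape: per-feature log-likelihoods under the
// reference and variant sequences (not taken from simulations.ts).
interface FeatureLogLik {
  featureId: string;
  description: string;
  refLogLik: number; // log-likelihood with the reference allele
  altLogLik: number; // log-likelihood with the variant allele
}

interface DeltaLLEntry {
  featureId: string;
  description: string;
  deltaLL: number;
}

// Assumed definition: deltaLL = altLogLik - refLogLik, so a negative
// value means the variant makes the feature less likely.
function computeDeltaLL(features: FeatureLogLik[]): DeltaLLEntry[] {
  return features
    .map((f) => ({
      featureId: f.featureId,
      description: f.description,
      deltaLL: f.altLogLik - f.refLogLik,
    }))
    // Most disruptive (most negative) feature first.
    .sort((a, b) => a.deltaLL - b.deltaLL);
}

// Illustrative numbers only; they are not model output.
const ranked = computeDeltaLL([
  { featureId: "f1", description: "TF Motif (AP-1)", refLogLik: -20, altLogLik: -28 },
  { featureId: "f2", description: "Exon Boundary", refLogLik: -10, altLogLik: -22 },
]);
```

Sorting by ascending ΔLL puts the most disrupted feature at the top of the list, which matches how the disruption scores are used to rank impact.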

Scientific

Connects the model's internal logic to human-readable biological concepts (RUO).

Business

- **Trust:** Defend and document decisions with feature-linked, quantitative explanations.

Use Cases

Today:
1. **Interactive feature tracks** in our `DynamicOracleExplain` component.
2. **Quantitative disruption scores** to rank a variant's impact.

Prompt Safety (Live)

Technical

Detect low‑complexity repeats and other pathological attractors; flag viral/sensitive content (aligned with Forge safety gates).
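
One way to detect low-complexity input is sketched below. It is an assumption-laden illustration, not the shipped checker: `checkPrompt`, its thresholds (`minEntropy`, `maxRun`), and the warning strings are all hypothetical. It combines base-composition entropy with a homopolymer run check and does not attempt the viral/sensitive content screening.

```typescript
interface PromptCheck {
  ok: boolean;
  warnings: string[];
}

// Shannon entropy of the base composition in bits per base.
// A uniform A/C/G/T mix gives the maximum of 2.0.
function shannonEntropy(seq: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < seq.length; i++) {
    const base = seq.charAt(i);
    counts[base] = (counts[base] || 0) + 1;
  }
  let entropy = 0;
  for (const base of Object.keys(counts)) {
    const p = counts[base] / seq.length;
    entropy -= (p * Math.log(p)) / Math.LN2;
  }
  return entropy;
}

// Length of the longest run of a single repeated base.
function longestHomopolymer(seq: string): number {
  let best = 0;
  let run = 0;
  for (let i = 0; i < seq.length; i++) {
    run = i > 0 && seq.charAt(i) === seq.charAt(i - 1) ? run + 1 : 1;
    best = Math.max(best, run);
  }
  return best;
}

// Thresholds are illustrative defaults, not values from the product.
function checkPrompt(rawSeq: string, minEntropy = 1.5, maxRun = 8): PromptCheck {
  const seq = rawSeq.toUpperCase();
  const warnings: string[] = [];
  if (seq.length === 0) warnings.push("Empty sequence.");
  if (/[^ACGT]/.test(seq)) warnings.push("Contains ambiguous or non-ACGT characters.");
  if (seq.length > 0 && shannonEntropy(seq) < minEntropy) {
    warnings.push("Low-complexity base composition.");
  }
  if (longestHomopolymer(seq) > maxRun) warnings.push("Long homopolymer run.");
  return { ok: warnings.length === 0, warnings };
}

const balanced = checkPrompt("ACGTACGTACGT");
const polyA = checkPrompt("AAAAAAAAAAAA");
```

Returning a list of warnings rather than a single boolean lets the UI show the user why an input was flagged.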

Scientific

Reduces junk outputs and improves the reliability of generative demos.

Business

- **Quality:** Fewer dead‑ends in design flows and cleaner, more compelling demos.

Use Cases

Today:
1. **Automated safety checks** on design inputs, with clear user warnings.

Activation Steering (Roadmap)

Technical

Expose endpoints to nudge/target feature activations (e.g., chromatin patterns, motif presence) with compute‑aware beam search.
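
Because this capability is still on the roadmap, no endpoint or algorithm is specified. The sketch below only illustrates the general idea of a compute-aware beam search: candidates are extended one base at a time, ranked by a feature score, and the search stops once a budget of scorer calls is spent. Everything here is hypothetical, including `steerWithBeamSearch`, the `maxScorerCalls` budget, and the toy `motifProgress` scorer, which stands in for a real feature activation.

```typescript
type Scorer = (seq: string) => number;

// Toy stand-in for a motif feature activation: 1 if the motif occurs
// anywhere, otherwise the fraction of the motif matched at the 3' end.
function motifProgress(motif: string): Scorer {
  return (seq) => {
    if (seq.indexOf(motif) !== -1) return 1;
    for (let k = Math.min(motif.length - 1, seq.length); k > 0; k--) {
      if (seq.slice(seq.length - k) === motif.slice(0, k)) return k / motif.length;
    }
    return 0;
  };
}

interface BeamResult {
  sequence: string;
  score: number;
  scorerCalls: number;
}

function steerWithBeamSearch(
  prefix: string,
  steps: number,
  beamWidth: number,
  maxScorerCalls: number, // compute budget, counted in scorer calls
  score: Scorer
): BeamResult {
  const bases = ["A", "C", "G", "T"];
  let beam = [{ sequence: prefix, score: score(prefix) }];
  let calls = 1;
  for (let step = 0; step < steps; step++) {
    const candidates: { sequence: string; score: number }[] = [];
    for (const item of beam) {
      for (const base of bases) {
        if (calls >= maxScorerCalls) break; // budget spent
        const sequence = item.sequence + base;
        candidates.push({ sequence, score: score(sequence) });
        calls++;
      }
    }
    if (candidates.length === 0) break; // nothing could be scored
    candidates.sort((a, b) => b.score - a.score);
    beam = candidates.slice(0, beamWidth);
  }
  return { sequence: beam[0].sequence, score: beam[0].score, scorerCalls: calls };
}

const ap1 = motifProgress("TGACTCA");
const unconstrained = steerWithBeamSearch("", 7, 2, 1000, ap1);
const budgeted = steerWithBeamSearch("", 7, 2, 20, ap1);
```

With a generous budget the search reaches the full motif, while a tight budget returns the best partial result found before the calls run out. That trade-off is the sense in which quality scales with compute in this sketch.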

Scientific

Maps CrisPRO.ai‑style inference‑time scaling to controllable design objectives.

Business

- **Control:** Achieve predictable design quality scaling with transparent, auditable controls.

Use Cases

Roadmap:
1. **Steer** generation towards desired feature sets; **measure** quality and efficacy metrics.

Interactive Demonstrations

See SAE Intelligence in action

Feature Overlay Visualization

Genomic Sequence (43044290-43044450):

Feature Types: Exon, Intron, TFBS, Structure, Motif

Disruption Scores (ΔLL)

  • Features: 4
  • High Impact: 2
  • Total ΔLL (sum of magnitudes): 25.6

  • Exon Boundary: High Impact, ΔLL -12.5
  • TF Motif (AP-1): High Impact, ΔLL -8.2
  • Secondary Structure: Medium Impact, ΔLL -3.1
  • Splice Site: Low Impact, ΔLL -1.8

Key Insight:

The ΔLL (Delta Log-Likelihood) score quantifies how much a variant disrupts each biological feature. Negative values indicate disruption, with more negative values showing greater impact.
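
The summary figures in this demo can be reproduced with a short sketch. The impact cut-offs in `classifyImpact` are hypothetical, chosen only so that the four demo values fall into the tiers shown here; the page does not state the thresholds actually used. The `featureId` strings are likewise made up for illustration.

```typescript
interface DeltaLLPoint {
  featureId: string;
  description: string;
  deltaLL: number;
}

type Impact = "High" | "Medium" | "Low";

// Hypothetical cut-offs: not specified by the source.
function classifyImpact(deltaLL: number): Impact {
  if (deltaLL <= -5) return "High";
  if (deltaLL <= -2.5) return "Medium";
  return "Low";
}

function summarize(points: DeltaLLPoint[]) {
  const highImpact = points.filter((p) => classifyImpact(p.deltaLL) === "High").length;
  // The total is reported as a sum of magnitudes, so it is positive.
  const total = points.reduce((sum, p) => sum + Math.abs(p.deltaLL), 0);
  return {
    features: points.length,
    highImpact,
    totalMagnitude: Number(total.toFixed(1)), // round away float noise
  };
}

// The four demo values from this section.
const demoPoints: DeltaLLPoint[] = [
  { featureId: "exon_boundary", description: "Exon Boundary", deltaLL: -12.5 },
  { featureId: "tf_motif_ap1", description: "TF Motif (AP-1)", deltaLL: -8.2 },
  { featureId: "secondary_structure", description: "Secondary Structure", deltaLL: -3.1 },
  { featureId: "splice_site", description: "Splice Site", deltaLL: -1.8 },
];

const demoSummary = summarize(demoPoints);
```

Under these assumptions the summary comes out as 4 features, 2 high impact, and a total magnitude of 25.6.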

Prompt Safety Checker

  • Check Types: 2
  • Safe Patterns: 0
  • High Risk: 1

Key Benefits:

  • Prevents pathological inputs that could generate junk outputs
  • Flags low-complexity repeats and ambiguous sequences
  • Improves reliability of generative AI demonstrations
  • Provides clear suggestions for sequence improvement

Activation Steering (Roadmap)

Roadmap Feature

Overall Progress: 46% Complete

  • AP-1 Binding Sites (TFBS): Transcription factor binding motifs. Current: 0.3
  • Open Chromatin (Chromatin): Accessible chromatin regions. Current: 0.5
  • Alpha Helix (Structure): Protein secondary structure. Current: 0.2

Activation steering is currently in development. This demo shows the planned interface for controlling feature activations during generation, with compute-aware beam search and predictable quality scaling.

Planned Benefits:

  • Steer generation towards desired biological features
  • Predictable quality scaling with transparent controls
  • Compute-aware beam search for efficient generation
  • Auditable design process with clear provenance

Observed Outcomes

Real-world impact from SAE Intelligence

  • Clearer "why" lines on variant reports, linked directly to biological features.
  • Fewer junk outputs in design flows via the integrated safety checker.
  • Increased stakeholder trust, as interpretable overlays reduce black‑box concerns.

Institutional Value

Why SAE Intelligence matters for your organization

For the Institution

  • Interpretable overlays increase confidence and adoption across teams.
  • Safer demos and design explorations with automated prompt checks.
  • A clear path to controllable, auditable in-silico design (roadmap).

Technical Implementation

Current state and roadmap details

Data Contract

SAE Features

`{ featureId, description, position, strength }` - The active biological features at specific locations.

Delta LL Series

`{ featureId, description, deltaLL }` - The quantitative disruption score for each feature caused by the variant.

Provenance

run_id, model_profile, etc.
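
The contract above could be expressed as TypeScript types along these lines. The field names come from the contract; the field types, the `SAEResult` envelope, and the `isDeltaLLPoint` guard are assumptions added for illustration. Provenance fields beyond `run_id` and `model_profile` are not specified, so they are left out.

```typescript
// Field names follow the data contract; the types are assumptions.
interface SAEFeature {
  featureId: string;
  description: string;
  position: number; // assumed: a single genomic coordinate
  strength: number; // assumed: activation strength
}

interface DeltaLLPoint {
  featureId: string;
  description: string;
  deltaLL: number;
}

// Only the two named provenance fields are modeled here.
interface Provenance {
  run_id: string;
  model_profile: string;
}

// Hypothetical envelope tying the three parts together.
interface SAEResult {
  features: SAEFeature[];
  deltaLLSeries: DeltaLLPoint[];
  provenance: Provenance;
}

// Runtime guard for untyped JSON before it reaches the UI.
function isDeltaLLPoint(value: unknown): value is DeltaLLPoint {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const deltaLL = v["deltaLL"];
  return (
    typeof v["featureId"] === "string" &&
    typeof v["description"] === "string" &&
    typeof deltaLL === "number" &&
    isFinite(deltaLL)
  );
}

const parsed: unknown = JSON.parse(
  '{"featureId":"splice_site","description":"Splice Site","deltaLL":-1.8}'
);
const parsedIsValid = isDeltaLLPoint(parsed);
```

A guard like this keeps malformed simulation output from reaching the visualization layer, which supports the goal of predictable UI behavior.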

Code Locations

  • Frontend Simulation (Live): `src/utils/simulations.ts`
  • Frontend Component (Live): `src/components/site/blocks/DynamicOracleExplain.tsx`
  • Backend Service (Roadmap): `@/api/routers/sae.py`

Ready to See SAE Intelligence in Action?

Experience interpretable AI that explains every prediction