Runtime Validation
Hard post-generation gate for production AI systems
Overview
Runtime validation creates a hard safety gate between AI generation and user output:
LLM → Validator → Output
Unlike static analysis (which checks your code), runtime validation checks AI-generated content before showing it to users.
Why Runtime Validation?
The Problem
AI models can generate unsafe content despite prompts and constraints:
// Your carefully crafted prompt
const prompt = "Provide supportive wellness insights only.";
const llmOutput = await callLLM(prompt);
// But the LLM can still generate: "You have cardiovascular disease."
Prompts aren't enough. You need a hard gate.
The Solution
Runtime validation blocks unsafe content:
import { createValidator } from '@the-governor-hq/constitution-core';
const validator = createValidator({ domain: 'wearables' });
const llmOutput = await callLLM(prompt);
const result = await validator.validate(llmOutput);
if (!result.safe) {
return result.safeAlternative; // Safe fallback
}
return result.output; // Safe to show
How It Works
Fast Pattern Matching
Primary validation uses regex and keyword matching (under 10ms):
const result = await validator.validate("Your HRV looks good!");
// ✓ Safe (no violations)
const unsafe = await validator.validate("You have heart disease.");
// ✗ Blocked (medical diagnosis detected)
Checks (see the sketch after this list):
- Medical diagnosis patterns
- Disease/condition naming
- Treatment/cure claims
- Authoritative language
- Supplement dosing
- Emergency language
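To see how these categories surface, you can run a few representative phrases through the validator and inspect which rules fire (a quick sketch; the sample phrases are illustrative and the exact violation IDs depend on the active rule set):
import { createValidator } from '@the-governor-hq/constitution-core';
const validator = createValidator({ domain: 'wearables' });
// One illustrative phrase per category; IDs in the output depend on the rule set.
const samples = [
  'You have atrial fibrillation.',        // disease/condition naming
  'This will cure your insomnia.',        // treatment/cure claims
  'Take 400mg of magnesium before bed.',  // supplement dosing
  'Call emergency services right now.',   // emergency language
];
for (const text of samples) {
  const result = await validator.validate(text);
  console.log(text, '→', result.safe ? 'safe' : result.violations.map(v => v.id));
}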
Optional LLM Judge
For edge cases, use an LLM to judge safety:
const validator = createValidator({
domain: 'wearables',
useLLMJudge: true,
strictMode: true,
apiKey: process.env.OPENAI_API_KEY
});
const result = await validator.validate(llmOutput);
LLM Judge:
- More thorough than patterns
- Catches subtle violations
- ~500ms latency
- API costs apply
When to use: Critical production flows, regulatory compliance
Deployment Modes
Choose the right mode for your infrastructure:
🪶 Lightweight Mode (Default - v3.4.0+)
Fast regex patterns only - ideal for small Node.js projects, serverless, CPU-constrained environments:
const validator = createValidator({
domain: 'wearables',
useSemanticSimilarity: false // Default (no ML model)
});
Characteristics:
- ⚡ Ultra-fast: <10ms validation
- 📦 Zero dependencies: No ML models or vector DB
- 🪶 Lightweight: ~50KB bundle size
- 🌐 English-only: Pattern matching works best in English
- ⚠️ Basic evasion protection: Can be bypassed with spacing (d i a g n o s e) or misspellings
Best for:
- Small Node.js apps
- Serverless functions (AWS Lambda, Vercel, Cloudflare Workers)
- Development/testing
- English-only applications
- CPU/memory-constrained environments
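For the serverless case in particular, create the validator once at module scope so warm invocations reuse it. The sketch below assumes a Cloudflare-Workers-style fetch handler; adapt the request/response plumbing to your platform:
import { createValidator } from '@the-governor-hq/constitution-core';
// Module scope: reused across warm invocations; lightweight mode is the default.
const validator = createValidator({ domain: 'wearables' });
export default {
  async fetch(request: Request): Promise<Response> {
    const { text } = await request.json() as { text: string };
    const result = await validator.validate(text);
    return Response.json({
      message: result.safe ? result.output : result.safeAlternative,
      blocked: !result.safe,
    });
  },
};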
🛡️ Enhanced Mode (Opt-in)
Semantic similarity + multilingual support - for production deployments requiring maximum safety:
const validator = createValidator({
domain: 'wearables',
useSemanticSimilarity: true, // Opt-in (downloads ~420MB ML model)
semanticThreshold: 0.75
});
Characteristics:
- 🌍 Multilingual: Validates medical advice in 50+ languages
- 🛡️ Adversarial protection: Prevents spacing, special char, misspelling attacks
- 🧠 Semantic understanding: Catches concept-level violations
- 📦 Heavy: ~420MB ML model download on first use
- ⏱️ Slower: 100-300ms validation latency
- 💻 CPU-intensive: Uses transformers.js for embeddings
Best for:
- Production deployments with non-English users
- Security-critical applications
- High-volume services with ML infrastructure
- Preventing adversarial attacks
Model info: Uses paraphrase-multilingual-MiniLM-L12-v2 (384-dim embeddings, 50+ languages)
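Conceptually, semanticThreshold is a cosine-similarity cutoff: text whose embedding sits closer than the threshold to a known unsafe concept is flagged, even when the wording is obfuscated. The sketch below reproduces that comparison directly with transformers.js for illustration only; the validator does this internally, and the Xenova model id and sample phrases are assumptions for the example:
import { pipeline } from '@xenova/transformers';
// Illustration only: roughly what enhanced mode does internally.
const embedder = await pipeline('feature-extraction', 'Xenova/paraphrase-multilingual-MiniLM-L12-v2');
async function embed(text: string): Promise<number[]> {
  const output = await embedder(text, { pooling: 'mean', normalize: true });
  return Array.from(output.data as Float32Array);
}
function cosine(a: number[], b: number[]): number {
  // Embeddings are normalized above, so the dot product is the cosine similarity.
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}
const concept = await embed('You have been diagnosed with a heart condition.');
const candidate = await embed('U hav3 a he4rt c0nditi0n'); // obfuscated phrasing
console.log(cosine(concept, candidate) > 0.75 ? 'flag as violation' : 'pass'); // 0.75 = semanticThreshold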
🔍 LLM Judge Mode
Most thorough validation - uses an external LLM to judge safety:
const validator = createValidator({
domain: 'wearables',
useLLMJudge: true,
llmProvider: 'groq', // or 'openai', 'anthropic'
apiKey: process.env.GROQ_API_KEY
});
Characteristics:
- 🎯 Most accurate: Catches subtle violations patterns miss
- 💸 API costs: ~$0.0001-0.001 per validation
- ⏱️ Slowest: ~500ms latency
- 🔑 Requires API key
Best for:
- Critical production flows
- Regulatory compliance
- Edge cases patterns can't catch
- When accuracy > speed
🚀 Hybrid Mode (Recommended for Production)
Combine modes for optimal performance:
const validator = createValidator({
domain: 'wearables',
useSemanticSimilarity: true, // Adversarial protection
useLLMJudge: true, // Edge case fallback
apiKey: process.env.GROQ_API_KEY
});
Flow (see the sketch after this list):
- Fast patterns (5ms) - catch obvious violations
- Semantic similarity (150ms) - catch obfuscated/multilingual
- LLM judge (500ms) - final check for edge cases
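Conceptually, the cascade short-circuits as soon as a cheaper stage finds a violation. The sketch below shows only that control flow; the stage functions are placeholders, not the library's internals:
// Control-flow sketch of the hybrid cascade; the stages are placeholders.
type StageResult = { safe: boolean };
type Stage = (text: string) => Promise<StageResult>;
async function hybridValidate(
  text: string,
  stages: { patterns: Stage; semantic: Stage; judge: Stage }
): Promise<StageResult> {
  const fromPatterns = await stages.patterns(text); // ~5ms: obvious violations
  if (!fromPatterns.safe) return fromPatterns;
  const fromSemantic = await stages.semantic(text); // ~150ms: obfuscated / multilingual
  if (!fromSemantic.safe) return fromSemantic;
  return stages.judge(text);                        // ~500ms: subtle edge cases
}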
🔮 Remote API Mode (Coming Soon)
Call remote semantic similarity API instead of running ML locally:
const validator = createValidator({
domain: 'wearables',
useSemanticSimilarity: true,
semanticApiUrl: 'https://api.governor-hq.com/semantic' // Not yet available
});
When available:
- All benefits of enhanced mode
- No local ML model download
- Faster cold starts
- Serverless-friendly
Basic Setup
Install
npm install @the-governor-hq/constitution-core
Create Validator
import { createValidator } from '@the-governor-hq/constitution-core';
const validator = createValidator({
domain: 'wearables', // or 'bci', 'therapy', 'core'
onViolation: 'block', // 'block' | 'sanitize' | 'warn' | 'log'
strictMode: false, // Enable LLM judge
useLLMJudge: false, // For edge cases
});
Validate AI Output
async function generateInsight(userData: UserData): Promise<string> {
// Generate with LLM
const llmOutput = await callLLM(userData);
// Validate before returning
const result = await validator.validate(llmOutput);
if (!result.safe) {
logger.warn('Blocked unsafe insight', {
violations: result.violations,
original: llmOutput
});
return result.safeAlternative;
}
return result.output;
}
Configuration Options
ValidatorConfig
interface ValidatorConfig {
/** Domain rules: 'wearables' | 'bci' | 'therapy' | 'core' */
domain?: string;
/** Action when violation detected */
onViolation?: 'block' | 'sanitize' | 'warn' | 'log';
/** Use LLM judge for validation */
useLLMJudge?: boolean;
/** LLM judge provider: 'groq' | 'openai' | 'anthropic' */
llmProvider?: string;
/** Strict mode (enables LLM judge) */
strictMode?: boolean;
/** API key for the LLM judge provider */
apiKey?: string;
/** Enable semantic similarity checks (Enhanced Mode) */
useSemanticSimilarity?: boolean;
/** Cosine-similarity threshold for semantic checks */
semanticThreshold?: number;
/** Custom safe fallback message */
defaultSafeMessage?: string;
/** Custom validation rules */
customRules?: ValidationRule[];
}
Violation Modes
Block Mode (Production)
Replace unsafe content with safe alternative:
const validator = createValidator({
domain: 'wearables',
onViolation: 'block'
});
const result = await validator.validate("You have sleep apnea.");
console.log(result.output);
// "Your sleep patterns show interruptions. Consider discussing with a doctor."Sanitize Mode
Attempt to fix unsafe patterns automatically:
const validator = createValidator({
domain: 'wearables',
onViolation: 'sanitize'
});
const result = await validator.validate("You have insomnia and need CBT.");
console.log(result.output);
// "You've mentioned difficulty sleeping. Consider talking to a professional."Warn Mode (Development)
Log warnings but allow content:
const validator = createValidator({
domain: 'wearables',
onViolation: 'warn'
});
const result = await validator.validate("You have high blood pressure.");
// Logs warning but returns original
console.log(result.output);
// "You have high blood pressure." (+ warning logged)Use in development/staging to identify issues.
Log Mode (Analytics)
Silently log violations for monitoring:
const validator = createValidator({
domain: 'wearables',
onViolation: 'log'
});
const result = await validator.validate(llmOutput);
// Violations logged silently, content allowed
Use for A/B testing or gradual rollout.
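One way to run a gradual rollout is to keep a logging validator and a blocking validator side by side and choose per user cohort (a sketch; the cohort bucketing is illustrative and not part of the library):
import { createValidator } from '@the-governor-hq/constitution-core';
const logValidator = createValidator({ domain: 'wearables', onViolation: 'log' });
const blockValidator = createValidator({ domain: 'wearables', onViolation: 'block' });
// Stable bucketing by user id; swap in your experimentation framework.
function inBlockingCohort(userId: string, rolloutPercent: number): boolean {
  const bucket = [...userId].reduce((sum, ch) => sum + ch.charCodeAt(0), 0) % 100;
  return bucket < rolloutPercent;
}
async function validateForUser(userId: string, text: string) {
  const validator = inBlockingCohort(userId, 25) ? blockValidator : logValidator;
  return validator.validate(text);
}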
ValidationResult
Every validation returns:
interface ValidationResult {
safe: boolean; // Is content safe?
output: string; // Original (if safe)
violations: Violation[]; // List of violations
safeAlternative?: string; // Safe fallback
metadata: {
processingTime: number; // ms
patternsChecked: number;
usedLLMJudge: boolean;
};
}
Usage:
const result = await validator.validate(llmOutput);
if (!result.safe) {
// Log violations
console.error('Violations:', result.violations.map(v => v.id));
// Return safe alternative
return result.safeAlternative;
}
// Safe to show
return result.output;
Production Patterns
API Endpoint
app.post('/api/chat', async (req, res) => {
const { message } = req.body;
// Generate AI response
const aiResponse = await callLLM(message);
// Validate before sending
const result = await validator.validate(aiResponse);
res.json({
message: result.safe ? result.output : result.safeAlternative,
blocked: !result.safe,
violations: result.violations.map(v => v.id)
});
});
Background Jobs
async function processUserInsights(userId: string) {
const userData = await getUserData(userId);
const insight = await generateInsight(userData);
// Validate before saving
const result = await validator.validate(insight);
if (!result.safe) {
await saveInsight(userId, result.safeAlternative, {
blocked: true,
violations: result.violations
});
} else {
await saveInsight(userId, result.output);
}
}
Batch Validation
async function validateBatch(texts: string[]): Promise<ValidationResult[]> {
return Promise.all(
texts.map(text => validator.validate(text))
);
}
const results = await validateBatch([
"Your HRV is low.",
"You have cardiovascular disease.", // Will be blocked
"Consider talking to a doctor."
]);
Custom Rules
Add domain-specific validation:
const customRules: ValidationRule[] = [
{
id: 'no-age-diagnosis',
pattern: /\b(aging|old age|elderly)\b.*\b(disease|disorder)\b/i,
severity: 'error',
message: 'Age-related medical claims not allowed',
safeAlternative: 'Changes can occur over time. Discuss with a healthcare provider.'
}];
const validator = createValidator({
domain: 'wearables',
customRules
});
Performance
Pattern Matching
Speed: <10ms per validation
Throughput: 100+ validations/second
Cost: Free (no API calls)
const start = Date.now();
const result = await validator.validate(text);
console.log(`Validated in ${Date.now() - start}ms`);
// Validated in 7ms
LLM Judge
Speed: ~500ms per validation
Thoroughness: Higher (catches edge cases)
Cost: API usage (OpenAI)
const validator = createValidator({
domain: 'wearables',
useLLMJudge: true,
apiKey: process.env.OPENAI_API_KEY
});
const start = Date.now();
const result = await validator.validate(text);
console.log(`Validated in ${Date.now() - start}ms`);
// Validated in 523ms
Caching
Cache validation results for repeated content:
import NodeCache from 'node-cache';
const cache = new NodeCache({ stdTTL: 3600 });
async function validateWithCache(text: string) {
const cached = cache.get(text);
if (cached) return cached as ValidationResult;
const result = await validator.validate(text);
cache.set(text, result);
return result;
}
Monitoring
Track violations in production:
const validator = createValidator({
domain: 'wearables',
onViolation: (violations, original) => {
// Send to monitoring
metrics.increment('ai.safety.violations', {
count: violations.length,
rules: violations.map(v => v.id).join(',')
});
// Log for debugging
logger.warn('Safety violation detected', {
violations: violations.map(v => ({ id: v.id, message: v.message })),
original,
timestamp: new Date()
});
return 'Unable to generate safe response.';
}
});
Testing
Unit Tests
import { createValidator } from '@the-governor-hq/constitution-core';
describe('Runtime Validator', () => {
const validator = createValidator({ domain: 'wearables' });
it('blocks medical diagnosis', async () => {
const result = await validator.validate("You have sleep apnea.");
expect(result.safe).toBe(false);
expect(result.violations[0].id).toContain('diagnosis');
});
it('allows observation language', async () => {
const result = await validator.validate("Your sleep shows interruptions.");
expect(result.safe).toBe(true);
});
});
Integration Tests
describe('API Safety', () => {
it('blocks unsafe AI responses', async () => {
const response = await request(app)
.post('/api/chat')
.send({ message: 'What does my data mean?' });
expect(response.body.message).not.toContain('disease');
expect(response.body.message).not.toContain('diagnosis');
});
});
Summary
Runtime validation:
- ✅ Hard gate between AI and user
- ✅ Fast pattern matching (<10ms)
- ✅ Optional LLM judge for edge cases
- ✅ Multiple violation modes
- ✅ Custom rules support
- ✅ Production-ready
Use cases:
- API endpoints serving AI content
- Background job processing
- Batch insight generation
- Real-time chat applications
Best practices:
- Use block mode in production
- Cache validation results
- Monitor violations
- Test edge cases
- Start with patterns, add LLM judge if needed
Next: Evaluation System for testing safety compliance