NEXUS Runtime Changelog

v1.6.0 May 2026

Reliability Score, Root Cause Analysis, GitOps CI/CD Pipeline & Interactive Run Detail

New, Improvement

NEXUS Reliability Score — New signature metric that combines gateway availability, latency stability, policy friction, prompt eval health, and NEXUS AI deployment health into a transparent 0-100 score.
AI root cause analysis — Observability now explains why an AI request failed by correlating request logs, provider errors, rate limits, policy violations, eval failures, and NEXUS AI deployment failures into categorized RCA signals, each with evidence and next actions.
GitOps pipeline runner — Three-step async pipeline (validate → deploy_prompt → verify) executes fire-and-forget via setImmediate. Each step records start time, completion time, and output independently. The HTTP response returns 202 before any pipeline logic runs.
Interactive run detail drawer — Clicking any pipeline run row opens a slide-in detail panel showing step-by-step progress, output for each step, duration, branch, commit SHA, and triggered-by info. Auto-refreshes every 3 seconds for active runs.
HMAC-SHA256 webhook validation — GitHub push webhooks are validated with X-Hub-Signature-256 before any pipeline logic executes. Invalid signatures return 400 immediately.
Rollback integration — Release Management rows link prompt names directly to the Prompts page with the relevant prompt pre-selected.
Fix: deployed_by NOT NULL constraint — Pipeline runs triggered by webhooks now pass triggered_by as deployed_by. The column also accepts NULL for webhook-triggered deployments.
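
The webhook validation and fire-and-forget pipeline entries above can be sketched roughly as follows. This is a minimal illustration, not the NEXUS implementation: the names signBody, isValidSignature, runPipeline, handleWebhook, and StepRecord are hypothetical, the step bodies are caller-supplied placeholders, and the handling of a failed step (the run simply stops and the error is logged) is an assumption the changelog does not state. The "sha256=<hex>" header format is the one GitHub uses for X-Hub-Signature-256.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type StepName = "validate" | "deploy_prompt" | "verify";

interface StepRecord {
  name: StepName;
  startedAt?: Date;
  completedAt?: Date;
  output?: string;
}

// Hypothetical helper: computes the "sha256=<hex>" value for a raw body.
function signBody(rawBody: string, secret: string): string {
  return "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Hypothetical check of the X-Hub-Signature-256 header against the raw body.
function isValidSignature(rawBody: string, header: string | undefined, secret: string): boolean {
  if (header === undefined) return false;
  const received = Buffer.from(header, "utf8");
  const wanted = Buffer.from(signBody(rawBody, secret), "utf8");
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return received.length === wanted.length && timingSafeEqual(received, wanted);
}

// Runs the steps in order; each step records its own start, completion, and output.
async function runPipeline(
  steps: StepRecord[],
  exec: (name: StepName) => Promise<string>,
): Promise<void> {
  for (const step of steps) {
    step.startedAt = new Date();
    step.output = await exec(step.name);
    step.completedAt = new Date();
  }
}

// Hypothetical handler: 400 on a bad signature, otherwise 202 with the
// pipeline deferred via setImmediate so no step runs before the return.
function handleWebhook(
  rawBody: string,
  header: string | undefined,
  secret: string,
  exec: (name: StepName) => Promise<string>,
): { status: number; steps: StepRecord[] } {
  if (!isValidSignature(rawBody, header, secret)) return { status: 400, steps: [] };
  const steps: StepRecord[] = [{ name: "validate" }, { name: "deploy_prompt" }, { name: "verify" }];
  setImmediate(() => {
    // Assumption: a failing step aborts the run and is only logged.
    runPipeline(steps, exec).catch((err) => console.error("pipeline run failed:", err));
  });
  return { status: 202, steps };
}
```

The signature is computed over the raw request body, so a handler built this way would need the unparsed payload rather than a re-serialized JSON object.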

v1.5.0 April 2026

Policy Engine & Budget Governance

New

Policy engine — Intercepts gateway requests with ordered, composable rules. Policies execute by priority (lower = first). Four built-in types: PII detection, model restriction, spend limit, and time restriction.
PII detection — Scans request content for credit card numbers, SSNs, email addresses, and phone numbers. Configurable action: block or log-only.
Spend limit policies — Block requests when a tenant's daily/monthly spend exceeds a configured threshold. Tracked via the tenant_daily_spend rollup table.
Budget alerts — Email alerts fire when spend crosses configured thresholds (80% and 100% of the monthly budget). A background budget-checker runs every 5 minutes.
Quota enforcement middleware — enforceQuota() checks the API key's monthly request quota before routing. Returns QUOTA_EXCEEDED with HTTP 429 on breach.
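
The priority ordering described for the policy engine can be sketched as below. This is an illustration under stated assumptions: the Policy and GatewayRequest shapes, the evaluatePolicies function, and the two factory helpers are hypothetical, and stopping at the first blocking policy is an assumption, since the changelog only says that policies execute by priority with lower values first.

```typescript
type Decision =
  | { allowed: true }
  | { allowed: false; policy: string; reason: string };

// Hypothetical request shape carrying only what the sample policies need.
interface GatewayRequest {
  model: string;
  tenantMonthlySpendUsd: number;
}

interface Policy {
  name: string;
  priority: number; // lower values run first
  evaluate(req: GatewayRequest): string | null; // block reason, or null to pass
}

// Evaluates policies in ascending priority order.
// Assumption: the first policy that blocks ends the evaluation.
function evaluatePolicies(policies: Policy[], req: GatewayRequest): Decision {
  const ordered = [...policies].sort((a, b) => a.priority - b.priority);
  for (const policy of ordered) {
    const reason = policy.evaluate(req);
    if (reason !== null) return { allowed: false, policy: policy.name, reason };
  }
  return { allowed: true };
}

// Hypothetical model restriction policy: only listed models pass.
function modelRestriction(allowedModels: string[], priority: number): Policy {
  return {
    name: "model_restriction",
    priority,
    evaluate: (req) =>
      allowedModels.includes(req.model) ? null : "model " + req.model + " is not allowed",
  };
}

// Hypothetical spend limit policy: blocks once spend exceeds the threshold.
function spendLimit(limitUsd: number, priority: number): Policy {
  return {
    name: "spend_limit",
    priority,
    evaluate: (req) =>
      req.tenantMonthlySpendUsd > limitUsd ? "monthly spend limit exceeded" : null,
  };
}
```

Copying the array before sorting keeps the caller's policy list untouched, so the configured order is never mutated by an evaluation.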

v1.4.0 March 2026

Eval Suites with gate_publish

New

Eval suites — Create eval suites with N test cases per prompt. Each case specifies a variable map and one or more assertions. Results show per-case pass/fail with output excerpts.
Assertion types — pass_fail (boolean), json_schema (validates output against a JSON Schema), regex (pattern match), and semantic (embedding-based similarity check).
gate_publish — Mark an eval suite with gate_publish: true to block publishing a prompt version if any case fails. Enforces quality gates without manual review steps.
Eval run history — Full run history with timestamps, pass rates, and per-case diffs. Archived runs are queryable for regression tracking.
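
The gate_publish behavior above can be sketched as a simple check over suite results. This is a minimal illustration, assuming hypothetical EvalCase, EvalSuite, runSuite, and canPublish names; each case carries a single regex assertion as a stand-in, and the json_schema and semantic assertion types are left out because they would need external libraries.

```typescript
interface EvalCase {
  variables: Record<string, string>;
  pattern: RegExp; // stands in for one regex assertion; avoid the g flag, which makes test() stateful
}

interface EvalSuite {
  name: string;
  gate_publish: boolean;
  cases: EvalCase[];
}

interface CaseResult {
  passed: boolean;
  excerpt: string;
}

type RenderFn = (variables: Record<string, string>) => string;

// Runs every case and reports per-case pass/fail with a short output excerpt.
function runSuite(suite: EvalSuite, render: RenderFn): CaseResult[] {
  return suite.cases.map((testCase) => {
    const output = render(testCase.variables);
    return { passed: testCase.pattern.test(output), excerpt: output.slice(0, 80) };
  });
}

// Publishing is blocked if any case fails in any suite marked gate_publish: true.
// Suites without the flag are ignored by the gate.
function canPublish(suites: EvalSuite[], render: RenderFn): boolean {
  return suites
    .filter((suite) => suite.gate_publish)
    .every((suite) => runSuite(suite, render).every((result) => result.passed));
}
```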

v1.3.0 February 2026

Observability Dashboard & Render Logs

New, Improvement

Observability dashboard — Six metric panels: per-provider latency (p50/p95/p99), token throughput, error rate by status code, circuit breaker state timeline, daily cost trend, and per-model request volume.
Provider health polling — Portal polls /v1/gateway/health every 30 seconds to display real-time circuit breaker state per provider.
Prompt render logs — Every prompt render is logged with a SHA-256 hash of the variable map for deduplication. Logs are partitioned by month.
Request log partitioning — The request_logs table is partitioned by day with 90-day retention. Old partitions are dropped automatically.
Live stats polling — Dashboard stats refresh every 10 seconds using React Query with refetchInterval.
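
The render-log hashing mentioned above might look like the following sketch. The function name hashVariables is hypothetical, and sorting the keys before hashing is an assumption made here so that two maps with the same entries in a different order yield the same digest; the changelog does not say how the variable map is serialized or how the hash is then used for deduplication.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 hex digest of a variable map.
// Assumption: keys are sorted first so insertion order does not change the hash.
function hashVariables(variables: Record<string, string>): string {
  const canonical = JSON.stringify(
    Object.keys(variables)
      .sort()
      .map((key) => [key, variables[key]]),
  );
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

Serializing as an array of key/value pairs avoids relying on object key order in JSON.stringify.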

v1.2.0 January 2026

Multi-Provider AI Router with Circuit Breaker

New

Circuit breaker per provider — Three-state machine (closed → open → half-open). Opens after 5 consecutive 5xx errors or timeouts. Stays open for 30 seconds. Two probe successes close it. 4xx errors never trip the circuit.
Automatic failover — executeWithFailover() builds a provider chain ordered by weight, skips providers whose circuit is open, and retries on 408, 429, and 5xx responses. Custom error types: ProviderHttpError, ProviderTimeoutError, GatewayError, AllProvidersFailedError.
Provider registry — Static model-to-provider map with exact and prefix matching. Stores per-provider base URL, API key env var, timeout, and weight.
Response headers — x-nexus-provider, x-nexus-attempt, and x-nexus-latency-ms are attached to every response.
Streaming pass-through — SSE streams piped directly from provider to client without buffering. Zero additional latency on streaming paths.
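
The circuit breaker state machine described above can be sketched as follows. The thresholds (5 consecutive failures, 30 seconds open, 2 probe successes) come from the changelog entry; the class and method names are hypothetical, and three behaviors are assumptions the entry does not state: a failed probe in half-open reopens the circuit, 4xx responses neither count toward nor reset the failure streak, and outcomes that arrive while the circuit is open are ignored.

```typescript
type BreakerState = "closed" | "open" | "half-open";
type Outcome = "success" | "client_error" | "server_error" | "timeout";

const FAILURE_THRESHOLD = 5; // consecutive 5xx errors or timeouts
const OPEN_MS = 30_000; // how long the circuit stays open
const PROBES_TO_CLOSE = 2; // probe successes needed to close again

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private probeSuccesses = 0;
  private openedAt = 0;

  // Returns whether a request may be sent at time `now` (milliseconds).
  // Moves open -> half-open once the open interval has elapsed.
  allow(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= OPEN_MS) {
      this.state = "half-open";
      this.probeSuccesses = 0;
    }
    return this.state !== "open";
  }

  // Records the outcome of a request that finished at time `now`.
  record(outcome: Outcome, now: number): void {
    // 4xx never trips the circuit; late results while open are ignored (assumption).
    if (outcome === "client_error" || this.state === "open") return;
    if (outcome === "success") {
      if (this.state === "half-open") {
        this.probeSuccesses += 1;
        if (this.probeSuccesses >= PROBES_TO_CLOSE) {
          this.state = "closed";
          this.failures = 0;
        }
      } else {
        this.failures = 0; // a success breaks the consecutive-failure streak
      }
      return;
    }
    // server_error or timeout
    if (this.state === "half-open") {
      this.trip(now); // assumption: a failed probe reopens the circuit
      return;
    }
    this.failures += 1;
    if (this.failures >= FAILURE_THRESHOLD) this.trip(now);
  }

  current(): BreakerState {
    return this.state;
  }

  private trip(now: number): void {
    this.state = "open";
    this.openedAt = now;
    this.failures = 0;
  }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real gateway would more likely read the time itself.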

v1.1.0 December 2025

Prompt Registry: Semver, Deployments & Rollback

New

Semver versioning — Prompts follow semantic versioning. Versions are immutable once published. A partial unique index enforces the single-draft constraint at the database level.
Per-environment deployments — prompt_deployments stores one active deployment per (tenant, prompt, env). Supports dev, staging, and prod independently.
One-step rollback — previous_version_id stored at deploy time enables instant rollback without scanning deployment history.
Handlebars-style renderer — Resolves {{varName}} and {{varName | default}}. A SHA-256 hash of the variable map is used to deduplicate render logs.
Audit log — All create/update/publish/deploy/rollback actions are logged to prompt_audit_log in fire-and-forget fashion. Audit failures never block the primary operation.
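
The two placeholder forms handled by the renderer, {{varName}} and {{varName | default}}, can be illustrated with a small sketch. The function name renderTemplate is hypothetical, and throwing on a missing variable with no default is an assumption; the changelog does not say what the renderer does in that case.

```typescript
// Matches {{name}} and {{name | default}}, tolerating spaces inside the braces.
const PLACEHOLDER = /\{\{\s*(\w+)\s*(?:\|\s*([^}]*?)\s*)?\}\}/g;

// Hypothetical renderer: substitutes variables, falling back to the inline default.
function renderTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(PLACEHOLDER, (_match: string, name: string, fallback: string | undefined) => {
    if (Object.prototype.hasOwnProperty.call(variables, name)) return String(variables[name]);
    if (fallback !== undefined) return fallback;
    // Assumption: a missing variable without a default is treated as an error.
    throw new Error("missing variable: " + name);
  });
}
```

A supplied variable takes precedence over the inline default, so the default only applies when the key is absent from the map.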

v1.0.0 November 2025

Initial Release — Multi-Tenant Platform Foundation

Launch

Multi-tenant architecture — All resources are scoped to tenants. Tenant isolation is enforced at the database level via a tenant_id column on every table. The requireTenant middleware enforces membership on all protected routes.
API key management — Keys are stored only as SHA-256 hashes; plaintext is never stored. Keys have configurable scopes and quotas. The key prefix (first 8 chars) is shown in the portal.
Google & GitHub OAuth — Session-based auth for portal access. OAuth is optional; API key auth still works without it.
Rate limiting — Per-key sliding-window rate limiting backed by Redis. Configurable RPM limits.
Developer portal — React + React Query frontend with spend dashboards, API key management, and router configuration. Local draft pattern for router config.
Database partitioning — request_logs partitioned by day, prompt_render_logs by month. Automated retention cleanup.
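
The per-key sliding-window rate limiting listed above can be sketched with an in-memory log of request timestamps. This is a deliberate simplification: the release uses Redis, while this sketch keeps the timestamps in a Map to stay self-contained. The class name SlidingWindowLimiter and its method are hypothetical, and the 60-second default window reflects the RPM limits mentioned in the entry.

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if the key is under its limit
  // for the window ending at `now` (milliseconds); otherwise returns false.
  allow(key: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps that still fall inside the sliding window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

Rejected requests are not added to the log, so a client that keeps retrying while limited does not extend its own lockout.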