Ethics & Safety.

Principles, fairness, privacy, alignment.

What a CEO/CTO needs to know
If your safety review is a thing you will do before launch, it is a thing that will not happen. The regulator finds the issue you deferred.

User input passes a safety gate before it reaches the model, and red-team probes test that gate continuously.

[WHAT IT IS]

The engineer’s view, in plain language.

Bias audits, PII handling, and alignment evaluations are built into the eval suite from day one, not bolted on after launch when legal asks. Safety is a property of the architecture, not a checkbox at the end.

[HOW WE BUILD IT]

What “done right” looks like.

Safety evals alongside functional

A safety test set runs next to the functional one, so a release that regresses on bias or leakage fails the same gate as a broken feature.
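One way to picture the shared gate is a single pass/fail check that does not care whether a failing suite is functional or safety-related. The sketch below is illustrative only: `EvalResult`, `failing_suites`, `release_gate`, the suite names, and the thresholds are all hypothetical, not part of any particular CI product.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Outcome of one eval suite (all field names are assumptions)."""
    name: str         # hypothetical suite name, e.g. "functional", "bias", "leakage"
    kind: str         # "functional" or "safety"; deliberately ignored by the gate
    score: float      # fraction of test cases passed, 0.0 to 1.0
    threshold: float  # minimum score required to ship


def failing_suites(results: list[EvalResult]) -> list[str]:
    """Return the names of every suite below its threshold, whatever its kind."""
    return [r.name for r in results if r.score < r.threshold]


def release_gate(results: list[EvalResult]) -> bool:
    """Return True only when no suite, functional or safety, is failing."""
    return not failing_suites(results)
```

In a CI job, a `False` result from `release_gate` would map to a non-zero exit code, so a bias or leakage regression blocks the merge exactly as a failing unit test would.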

Red-team in CI

Adversarial prompts run on every PR. PII detection covers both inputs and outputs, not just what the user typed.
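As a rough sketch of what this can look like, the code below scans both the prompt and the reply for PII, then replays adversarial probes and checks whether planted "canary" strings still reach the user. Everything here is an assumption for illustration: the two regex patterns are far from a complete PII detector, and `find_pii`, `guarded_call`, `leaked_probes`, and the idea of passing the model as a plain function are hypothetical names and choices.

```python
import re
from typing import Callable

# Illustrative patterns only (assumption); a real detector needs much broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text: str) -> list[str]:
    """Return the names of the PII patterns that match the text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]


def guarded_call(model: Callable[[str], str], prompt: str) -> str:
    """Scan the prompt before the model sees it and the reply before the user does."""
    if find_pii(prompt):
        return "[blocked: PII in input]"
    reply = model(prompt)
    if find_pii(reply):
        return "[blocked: PII in output]"
    return reply


def leaked_probes(
    model: Callable[[str], str], probes: list[str], canaries: list[str]
) -> list[str]:
    """Return the probes whose guarded reply still contains a planted canary."""
    leaks = []
    for probe in probes:
        reply = guarded_call(model, probe)
        if any(canary in reply for canary in canaries):
            leaks.append(probe)
    return leaks
```

Checking against canaries rather than reusing `find_pii` is a deliberate choice in this sketch: it lets the red-team run expose gaps in the detector itself, such as a data type no pattern covers. A PR check would fail whenever `leaked_probes` returns a non-empty list.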

Reviewed before release

Alignment checks run against the firm's policy stack, and the client's InfoSec signs off before each release, not after the incident.
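The sign-off half of this step can be enforced mechanically rather than by convention. The sketch below assumes sign-offs are stored as simple records tied to a release identifier; `SignOff`, `can_ship`, the `"infosec"` team label, and the field names are hypothetical, and the alignment checks themselves are out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SignOff:
    """One approval record (hypothetical schema)."""
    release_id: str      # the exact release being approved
    approver: str        # who approved it
    team: str            # approver's team, e.g. "infosec" (assumed label)
    signed_at: datetime  # when the approval was recorded


def can_ship(release_id: str, signoffs: list[SignOff], now: datetime) -> bool:
    """Return True only if InfoSec approved this exact release no later than now."""
    return any(
        s.release_id == release_id and s.team == "infosec" and s.signed_at <= now
        for s in signoffs
    )


# Example: release 2.4.0 has a sign-off, release 2.5.0 does not.
records = [
    SignOff("2.4.0", "a.reviewer", "infosec", datetime(2024, 1, 10, tzinfo=timezone.utc)),
]
ship_time = datetime(2024, 1, 12, tzinfo=timezone.utc)
print(can_ship("2.4.0", records, ship_time))
print(can_ship("2.5.0", records, ship_time))
```

Matching on the exact release identifier matters in this sketch: an approval for an earlier version does not carry over to the next one.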

[MATURITY LADDER]

Where does your build sit?

Four rungs, from absent (L0) to production-grade (L3). Level 3 is the target, and the only one that survives a real production incident.

Absent

No safety evals. Bias and leakage are discovered by users or regulators.

Ad-hoc

Occasional manual spot-checks before big launches, nothing automated.

Managed

A safety test set exists but does not gate releases, and review is inconsistent.

L3 Target

Production-grade

Safety evals gate every release, red-team runs in CI, InfoSec signs off pre-release.

[VALIDATE IT YOURSELF]

How to check it’s really there.

You do not need to read the code. Ask these questions and demand these artifacts. Vague answers are the finding.

★ Ask your team

? Does a safety regression fail the build the same way a broken feature does?
? Do we red-team adversarial prompts on every change, or only before launch?
? Who signs off on safety before a release ships?

★ Demand to see

A safety eval suite wired into CI
Red-team prompt sets + PII detection on inputs and outputs
A pre-release InfoSec sign-off record

● WHAT L0 LOOKS LIKE

The failure mode, in production.

"We will do the safety review before launch" was said three months ago, and the review never started. A regulator finds the issue first, and the engagement turns into a remediation project.

