Designing Approval Workflows for ML Models: What Compliance Teams Need from the Sign-Off Interface
When we started designing the compliance review interface for Cognify, we made the mistake that most ML tool designers make: we assumed that compliance reviewers were a variant of ML engineers who needed a simplified version of an engineering dashboard. We were wrong, and the feedback from the first few compliance officers who used the early prototype was unambiguous about why.
Compliance teams don't want a simplified ML dashboard. They don't want to navigate to a run ID, expand a metrics panel, and form their own opinion about whether the evaluation results are acceptable. What they want is a structured, auditable, workflow-oriented interface: here is what you need to review, here is the context you need to evaluate it, here are your options (approve, reject, request clarification), and when you choose one of those options, the decision is recorded immutably with your identity and timestamp attached.
This is a different UX paradigm — closer to a document management and approval system than an engineering monitoring tool. This post covers what we learned designing it, what compliance teams actually need from a model sign-off interface, and the specific requirements that distinguish a real compliance approval workflow from a model registry stage transition.
The ML dashboard problem for compliance teams
ML dashboards are optimized for exploration. You can filter runs, sort by metric, compare parameters, drill down into artifacts. This is valuable for engineers because their job is to understand the full landscape of training runs and make good engineering decisions.
Compliance reviewers have a different job. They're not exploring a space of possible models — they're evaluating a specific model version against specific criteria to make a binary decision (approve or not approve). Their workflow is more like a structured review of a deliverable than an open-ended analysis. The information they need to see is fixed (the compliance documentation package for this specific run), the criteria they're evaluating against are defined (policy thresholds, regulatory requirements), and the outcome they're producing is a formal decision record.
Exposing compliance teams to a full ML dashboard creates several problems. They see information that's irrelevant to their review (loss curves, GPU utilization, hyperparameter sweep artifacts from rejected candidate runs) alongside the information they need. The relevant information isn't structured around their review criteria. There's no guided workflow — no indication of what they should look at, in what order, or what decision options are available. And critically, when they make a decision, there's no place to record it that creates a defensible approval record.
What a structured review interface requires
A compliance review interface built around the reviewers' actual workflow has several required elements:
Review package presentation. The reviewer should see a structured package — the compliance documentation for the specific model version they're reviewing — presented in a fixed, consistent format. Not a link to the MLflow run. Not a zip file of artifacts. A structured document where each section corresponds to a review criterion: training data provenance, evaluation results against policy benchmarks, known limitations, deployment scope, regulatory cross-references.
Contextual annotation. The reviewer should be able to add comments to specific sections of the review package. If they have a question about a data source authorization record, they annotate that section — creating a documented request for clarification that the ML team can respond to. The comment thread is part of the approval record, not a separate email or Jira ticket.
Decision action with identity binding. The approval or rejection action should be bound to the reviewer's authenticated identity — not just a button press, but a signed assertion that this specific person reviewed this specific version of this documentation and made this specific decision. The identity binding is what gives the approval record legal significance.
Immutability after decision. The approval record — the documentation package, the reviewer's identity, the decision, the timestamp, and any comments — should be locked after the decision is made. The reviewer should not be able to change their approval to a rejection retroactively. The ML team should not be able to modify the documentation after it's been approved. The record becomes evidence, not a live document.
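To make the last two requirements concrete, here is a minimal sketch of a decision record that is bound to a reviewer identity and cannot be changed once created. The names (`DecisionRecord`, `record_decision`, the field names) are hypothetical and not taken from Cognify; a frozen dataclass only prevents mutation inside one Python process, so a real system would also need append-only storage behind it.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    REQUEST_CLARIFICATION = "request_clarification"


@dataclass(frozen=True)  # frozen: attribute assignment raises after creation
class DecisionRecord:
    reviewer_id: str            # authenticated identity (assumed to come from SSO)
    package_version: str        # exact documentation version that was reviewed
    decision: Decision
    decided_at: datetime        # UTC timestamp captured when the decision is made
    comments: tuple[str, ...] = ()  # a tuple, so the thread cannot be appended to


def record_decision(reviewer_id, package_version, decision, comments=()):
    """Create the record once; there is deliberately no update function."""
    return DecisionRecord(
        reviewer_id=reviewer_id,
        package_version=package_version,
        decision=decision,
        decided_at=datetime.now(timezone.utc),
        comments=tuple(comments),
    )


record = record_decision("reviewer@example.com", "package-v3", Decision.APPROVE)
try:
    record.decision = Decision.REJECT  # retroactive change is refused
except FrozenInstanceError:
    print("decision is locked")
```

Changing a decision in this model means creating a new record for a new review round, which keeps the earlier record intact as evidence.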
Role-based gates and sequencing
Real compliance approval workflows at regulated enterprises typically involve multiple reviewers with different roles, in a specific sequence. A model deployed in a bank's credit division might require: a data governance review (confirming the training data provenance), a model risk management review (evaluating performance against MRM thresholds), a fair lending review (confirming disparate impact analysis), and a final approval by the AI governance committee.
These reviews need to happen in sequence — the data governance review confirms the training data before the model risk review evaluates model performance, because the MRM team's evaluation is only meaningful if the data provenance is confirmed. And each review step needs to be gated: the model risk reviewer shouldn't be able to approve before the data governance reviewer has completed their review.
The workflow also needs exception handling. What happens if one reviewer rejects but others approve? What if a reviewer is out of office and a deadline is approaching? What if the ML team makes a minor documentation correction after one reviewer has already approved — does that approval still stand, or does it need to be re-obtained?
We're not saying there's a single right answer to these workflow questions — different enterprises have different governance models, and a compliance tool should be configurable to the organization's specific approval chain. What matters is that the tool enforces the defined workflow rather than allowing it to be circumvented, and that every workflow event (submission, review completion, approval, rejection, clarification request) is logged in the audit trail.
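The gating and logging described above can be sketched as a small state machine. This is an illustration under stated assumptions, not Cognify's implementation: `ApprovalChain`, `GateError`, and the stage names are hypothetical, the stage order is whatever the organization configures, and exception paths such as delegation or deadlines are left out.

```python
from datetime import datetime, timezone


class GateError(Exception):
    """Raised when a review action would bypass the configured sequence."""


class ApprovalChain:
    def __init__(self, stages):
        self.stages = list(stages)  # ordered, configured per organization
        self.approved = {}          # stage -> reviewer_id
        self.events = []            # append-only list of workflow events

    def _log(self, kind, stage, reviewer_id):
        self.events.append({
            "kind": kind,
            "stage": stage,
            "reviewer_id": reviewer_id,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def next_stage(self):
        """Return the first stage that has not been approved, or None."""
        for stage in self.stages:
            if stage not in self.approved:
                return stage
        return None

    def _check_gate(self, stage):
        expected = self.next_stage()
        if expected is None:
            raise GateError("all stages are already approved")
        if stage != expected:
            raise GateError(f"{stage!r} is gated; waiting on {expected!r}")

    def approve(self, stage, reviewer_id):
        self._check_gate(stage)
        self.approved[stage] = reviewer_id
        self._log("approval", stage, reviewer_id)

    def reject(self, stage, reviewer_id):
        self._check_gate(stage)
        self._log("rejection", stage, reviewer_id)  # stage stays open for resubmission


chain = ApprovalChain(["data_governance", "model_risk", "fair_lending", "governance_committee"])
chain.approve("data_governance", "dg.reviewer@example.com")
chain.approve("model_risk", "mrm.reviewer@example.com")
```

Calling `chain.approve("governance_committee", ...)` at this point raises `GateError`, because the fair lending review has not been completed yet.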
E-signature requirements
Some regulated enterprise contexts require e-signatures on model approval records. Financial services organizations operating under specific control frameworks sometimes require that approval records include an e-signature that meets the definition in the E-SIGN Act or equivalent electronic signature standards. Healthcare organizations may have similar requirements for clinical AI approvals.
An e-signature in this context means more than a typed name or a checkbox. It means an authenticated signature that can be traced to a specific individual's verified identity, with a timestamp that can be independently verified, and a mechanism to detect if the signed document has been modified after signing.
In Cognify's review interface, the approval action binds the reviewer's authenticated session identity (their organization's SSO credentials, if SAML 2.0 is configured) to the approval record, along with a SHA-256 hash of the documentation package computed at the time of signing. If the documentation package is modified after signing, the recomputed hash will no longer match the stored one and the signature will be flagged as invalid. This provides the integrity guarantee that makes an electronic approval record defensible as evidence.
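The hash check itself is simple to illustrate. The sketch below assumes the documentation package can be represented as a JSON-serializable dictionary and is hashed in a canonical form (sorted keys, fixed separators); that serialization choice and the function names are assumptions made for the example, since the details of how the package is serialized are not specified here.

```python
import hashlib
import json


def package_digest(package: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order cannot change the hash."""
    canonical = json.dumps(package, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sign_approval(package: dict, reviewer_id: str) -> dict:
    """Bind the reviewer identity to the package hash at signing time."""
    return {"reviewer_id": reviewer_id, "package_sha256": package_digest(package)}


def signature_is_valid(signature: dict, package: dict) -> bool:
    """Recompute the hash; any edit to the package after signing makes this False."""
    return signature["package_sha256"] == package_digest(package)


package = {"model_version": "v7", "known_limitations": "none documented"}
signature = sign_approval(package, "reviewer@example.com")

package["known_limitations"] = "edited after approval"
print(signature_is_valid(signature, package))  # False: the approval no longer covers this text
```

The same check gives one possible answer to the "minor correction after approval" question: any change, however small, invalidates the earlier signature, and the workflow configuration decides whether re-approval is required.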
Rejection and revision cycles
A compliance workflow that only supports approval is incomplete. Models get rejected — documentation is insufficient, evaluation results don't meet policy thresholds, a data source authorization is questionable. The rejection itself needs to be documented: what specifically was rejected, what the reviewer's concerns were, and what would need to change for the model to be approvable.
After rejection, the ML team makes changes and resubmits. The resubmission is a new version of the compliance package, with changes documented. The reviewer who rejected should be able to see exactly what changed between the rejected version and the resubmission — not just the new documentation, but a structured diff showing what was added, removed, or modified. This diff is itself part of the compliance record: it demonstrates that the concerns raised in the rejection were specifically addressed.
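A structured diff between two package versions could look something like the following. It assumes each package is a mapping from section name to section text, which is an illustrative simplification; `diff_packages` and `section_diff` are hypothetical helper names, and the line-level diff uses Python's standard `difflib`.

```python
import difflib


def diff_packages(old: dict, new: dict) -> dict:
    """Classify sections as added, removed, or modified between two versions."""
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "modified": sorted(k for k in old if k in new and old[k] != new[k]),
    }


def section_diff(old_text: str, new_text: str) -> list:
    """Line-level unified diff for one modified section."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="rejected",
        tofile="resubmitted",
        lineterm="",
    ))


rejected = {
    "data_provenance": "Source A: authorization pending",
    "deployment_scope": "Credit division",
}
resubmitted = {
    "data_provenance": "Source A: authorization granted",
    "deployment_scope": "Credit division",
    "known_limitations": "Not validated for thin-file applicants",
}

summary = diff_packages(rejected, resubmitted)
for name in summary["modified"]:
    print("\n".join(section_diff(rejected[name], resubmitted[name])))
```

Storing the `summary` alongside the resubmission lets a reviewer confirm that the sections named in the rejection are the ones that changed.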
The revision cycle can repeat. A model with complex compliance challenges might go through several rounds before approval. Each round — the rejection, the changes, the resubmission — is part of the model's compliance history and should be retained as a complete record. The fact that a model required three review cycles to get approved is not a mark against it; it's evidence that the compliance process was working as intended.
The approval record itself
The output of the approval workflow — the final approval record — is a first-class compliance artifact. It contains: the model version identifier, the version of the compliance documentation package that was approved (with hash), the identity of each reviewer and their role, the date and time of each review action, any conditions placed on the approval (e.g., "approved for use in X department only, requires re-review after 12 months"), and the hash-chain linkage to the immutable audit log.
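For readers unfamiliar with hash chaining, the idea is that each audit log entry includes the hash of the entry before it, so altering any earlier entry breaks every later link. The sketch below is a generic illustration of that technique, not a description of Cognify's audit log format; the entry layout, the all-zero genesis value, and the function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the event together with the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> dict:
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev, "event": event, "hash": entry_hash(prev, event)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Walk the log from the start and recompute every link."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log = []
append_event(log, {"stage": "data_governance", "decision": "approve"})
final = append_event(log, {"stage": "model_risk", "decision": "approve"})

# The approval record can carry final["hash"] as its linkage to the log.
print(verify_chain(log))
```

An approval record that stores the hash of its own log entry can later be checked against the log: if the chain verifies and the entry hash matches, the record is the one that was written at decision time.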
This record is what the compliance team attaches to the model deployment decision. It's what the ML team cites when a downstream party asks "was this model approved?" It's what the regulator examines during a model risk review. Its usefulness depends entirely on it being accurate, complete, and tamper-evident — which is why the immutability guarantees of the underlying audit log matter, not just for engineering reasons but for the entire value of the compliance workflow.
Integrating with GRC systems
Many regulated enterprises maintain GRC (governance, risk, and compliance) systems — Vanta, Drata, ServiceNow GRC, or Archer — as their central compliance evidence repository. Model approval records need to flow into these systems, not exist only in the ML compliance tool.
The integration pattern Cognify supports exports the approval record (and the linked compliance package) in a structured format that GRC systems can ingest: a signed JSON document with defined schema, plus a reference to the immutable audit log entry that records the approval event. The GRC system can store the exported document as compliance evidence without needing to connect directly to Cognify's systems.
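As a rough illustration of what a signed JSON export involves, the sketch below wraps an approval record and its audit log reference in a document and signs it. Everything specific here is an assumption for the example: the schema label, the field names, and especially the use of HMAC-SHA256 with a shared key, which stands in for whatever signing scheme a real export uses (an asymmetric signature would let the GRC system verify without holding a secret).

```python
import hashlib
import hmac
import json


def _canonical(body: dict) -> bytes:
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def export_record(record: dict, audit_log_ref: str, key: bytes) -> str:
    """Produce a self-contained JSON document with a signature over its body."""
    body = {
        "schema": "approval-record/v1",   # hypothetical schema identifier
        "record": record,
        "audit_log_ref": audit_log_ref,   # pointer to the audit log entry
    }
    signature = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return json.dumps({"body": body, "signature": signature}, sort_keys=True)


def verify_export(document: str, key: bytes) -> bool:
    """Recompute the signature from the body and compare in constant time."""
    parsed = json.loads(document)
    expected = hmac.new(key, _canonical(parsed["body"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, parsed["signature"])


key = b"demo-key-not-for-production"
document = export_record(
    {"model_version": "v7", "decision": "approve", "reviewer_id": "reviewer@example.com"},
    audit_log_ref="audit-entry-0042",
    key=key,
)
print(verify_export(document, key))
```

Because the exported document carries both the record and its signature, it can be archived and re-verified later without a live connection to the system that produced it.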
This export-and-import pattern also addresses the long-term retention question: even if Cognify is no longer in use in five years, the exported compliance documents in the GRC system provide the evidentiary record that a regulator needs. The compliance tool produces the evidence; the GRC system archives it. Both roles are distinct and both need to be filled for a complete compliance program.