
On-Premises LLM Fine-Tuning: Why Air-Gapped Environments Still Need Audit Infrastructure

Darius Wren

CTO & Co-Founder, Cognify

There's a common assumption among security-conscious enterprises that running ML infrastructure on-premises or in an air-gapped environment substantially addresses compliance obligations. The reasoning goes: if sensitive training data never leaves the perimeter, the primary compliance risk is mitigated. What stays inside the perimeter stays inside the perimeter.

This reasoning addresses one real risk, data exfiltration, while leaving another category of compliance obligation unaddressed: the documentation and audit trail requirements that apply regardless of where the training runs. Regulators examining a fine-tuned model for compliance aren't primarily concerned with whether your training data left the building. They're concerned with whether you can demonstrate that the right data was used, the right process was followed, and the right approvals were obtained. That documentation needs to exist and be accessible regardless of the perimeter around the training environment.

Why on-prem doesn't mean compliance solved

On-premises or air-gapped training addresses specific compliance risks: meeting data residency requirements, keeping training data off shared cloud infrastructure whose terms may not satisfy regulatory requirements, and reducing exposure under HIPAA's breach notification rules by keeping PHI within organizational control. These are legitimate reasons to run training on-premises, and many regulated enterprises have them.

What on-premises training doesn't address:

Documentation requirements. SR 11-7, HIPAA audit controls, the EU AI Act technical documentation requirements — none of these are satisfied by keeping data on-premises. The documentation obligations apply regardless of deployment architecture.

Approval workflow requirements. A compliance approval for a fine-tuned model requires a designated reviewer to evaluate the training documentation and formally sign off. That process has to happen whether training runs on-premises or in the cloud. And the approval record — who approved, what they reviewed, when — needs to be preserved.

Retention obligations. Model documentation retention requirements (3-10 years depending on industry and jurisdiction) apply to on-premises systems as much as cloud systems. On-premises environments that lack proper archival infrastructure can actually make retention compliance harder, not easier: there's no managed service like S3 Object Lock or Azure WORM providing storage-level immutability by default, so equivalent guarantees have to be built and operated in-house.

Access to the audit trail for external parties. When a regulator or external auditor needs to review model documentation, they need access to the audit trail. If that trail exists only within an air-gapped environment, producing it for review requires a controlled export process with its own documentation requirements.

The audit trail still needs to leave

The key insight is that in an air-gapped environment, training data stays in and audit trail evidence comes out. These are two different data flows with different security and compliance requirements.

Training data: high sensitivity, zero egress, maximum perimeter enforcement. The dataset bytes and model weights should not leave the training environment. This is where air-gapped architectures serve their purpose.

Audit trail evidence: moderate sensitivity, controlled egress, structured export process. The metadata, hashes, provenance records, and compliance documentation need to be accessible outside the training environment for review, approval, and regulatory production. This is where air-gapped architectures create challenges if not specifically addressed.

We're not saying the audit trail needs to be in the cloud or accessible to external parties in real time — many compliance frameworks accept controlled, documented export processes. But the audit trail needs to be exportable in a way that's auditable itself: who exported what, when, and why, with cryptographic verification that the exported content matches the original records.
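One way to picture an auditable export is a small record that captures who exported which records, when, and why, alongside a SHA-256 digest of the exported content. The sketch below is illustrative only: the field names, the canonical JSON serialization, and the helper functions are assumptions made for this example, not Cognify's actual export format.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonical_bytes(obj) -> bytes:
    """Serialize deterministically so the same records always hash the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_export_record(records, exported_by, reason, exported_at=None):
    """Describe an export: who, what, when, why, plus a digest of the content."""
    exported_at = exported_at or datetime.now(timezone.utc).isoformat()
    return {
        "exported_by": exported_by,
        "reason": reason,
        "exported_at": exported_at,
        "record_ids": [r["id"] for r in records],
        "content_sha256": hashlib.sha256(canonical_bytes(records)).hexdigest(),
    }


def export_matches(records, export_record) -> bool:
    """Recompute the digest outside the perimeter and compare it to the logged one."""
    digest = hashlib.sha256(canonical_bytes(records)).hexdigest()
    return digest == export_record["content_sha256"]


# Hypothetical audit records: metadata and hashes only, no training data.
records = [
    {"id": "run-001", "dataset_sha256": "ab" * 32, "approved_by": "reviewer-a"},
    {"id": "run-002", "dataset_sha256": "cd" * 32, "approved_by": "reviewer-b"},
]
entry = build_export_record(records, "compliance-officer", "external audit request")
print(export_matches(records, entry))
```

The export record itself would be appended to the audit log inside the perimeter, so the act of exporting becomes part of the trail that a reviewer can later check.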

Specific challenges in air-gapped environments

Several specific challenges arise when building compliant ML audit infrastructure in air-gapped or highly restricted network environments:

No cloud-based audit storage. Standard compliance audit infrastructure often relies on cloud-based append-only storage with managed immutability (S3 Object Lock, Azure WORM). In an air-gapped environment, this storage has to be implemented on-premises with equivalent properties — which requires deliberate infrastructure investment, not a default.

No real-time external access for reviewers. Compliance reviewers at a healthcare system or bank may not have direct access to the air-gapped training environment. The approval workflow — which requires the reviewer to examine the training documentation and formally sign off — needs to work either within the air-gapped network (which may require provisioning access for compliance personnel) or through a controlled export-and-review process.

Dependency management for SDK and server. In environments without internet access, installing and updating software requires a curated internal package mirror or pre-bundled distribution. The Cognify SDK and on-premises server need to be deployable in environments where pip install cognify-sdk won't work because PyPI is inaccessible.
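As a rough sketch of what an offline install could look like, the helper below checks a wheel's SHA-256 against a digest pinned in advance and then builds a pip command that resolves only from a local directory. The `--no-index` and `--find-links` options are standard pip flags; the function names and the idea of a separately distributed pinned digest are assumptions for illustration. A digest check is not a substitute for verifying the package signature, whose mechanism is not described here.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file so large wheels do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def offline_install_command(wheel: Path, pinned_sha256: str) -> list:
    """Return a pip command that installs only from the wheel's directory.

    Raises ValueError if the wheel does not match the pinned digest.
    """
    actual = sha256_of(wheel)
    if actual != pinned_sha256:
        raise ValueError(f"{wheel.name}: digest {actual} does not match pinned value")
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                       # never contact PyPI
        "--find-links", str(wheel.parent),  # resolve from the local directory
        "cognify-sdk",
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` by whatever tooling manages the internal mirror.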

Clock synchronization. Audit timestamps require reliable time sources. Air-gapped environments that can't reach external NTP servers need an internal time authority. Timestamp accuracy matters for compliance records — a hash-chained audit log with inconsistent timestamps can create questions about record order and authenticity.
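To make the timestamp concern concrete, here is a minimal hash-chain sketch in which each entry commits to the previous entry's hash, and a checker flags both broken links and timestamps that run backwards. The entry layout and function names are assumptions for this example and are not the documented Cognify log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in predecessor hash for the first entry


def entry_hash(prev_hash: str, timestamp: float, payload: dict) -> str:
    """Hash the previous link, the timestamp, and the payload together."""
    body = json.dumps(
        {"prev": prev_hash, "ts": timestamp, "payload": payload},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, timestamp: float, payload: dict) -> None:
    """Add an entry that is linked to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "ts": timestamp, "payload": payload,
                  "hash": entry_hash(prev, timestamp, payload)})


def check_chain(chain: list) -> list:
    """Return a list of problems found; an empty list means the chain is clean."""
    problems = []
    prev_hash, prev_ts = GENESIS, float("-inf")
    for i, e in enumerate(chain):
        expected = entry_hash(e["prev"], e["ts"], e["payload"])
        if e["prev"] != prev_hash or e["hash"] != expected:
            problems.append(f"entry {i}: hash link broken")
        if e["ts"] < prev_ts:
            problems.append(f"entry {i}: timestamp earlier than previous entry")
        prev_hash, prev_ts = e["hash"], e["ts"]
    return problems


chain = []
append(chain, 1700000000.0, {"event": "dataset_registered"})
append(chain, 1700000060.0, {"event": "training_started"})
append(chain, 1699999000.0, {"event": "approval_recorded"})  # clock stepped back
print(check_chain(chain))
# prints ['entry 2: timestamp earlier than previous entry']
```

The third entry is cryptographically valid, yet its timestamp contradicts the chain order. That is the kind of inconsistency an unreliable internal time source can introduce.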

Cognify's on-premises server architecture

Cognify's Enterprise plan includes an on-premises server deployment option designed specifically for environments where training data cannot leave the network perimeter. The architecture has three components:

On-premises tracking server. Runs within the customer's network, receives SDK calls from the training environment, stores audit records in the customer's own storage (PostgreSQL for structured data, object storage compatible with S3 API for artifacts and compliance package exports). No training data or model weights transit this server — only metadata, hashes, and provenance records.

Air-gapped SDK mode. The Cognify SDK can be configured to communicate only with the on-premises tracking server, with no outbound connections to Cognify's cloud infrastructure. All logging happens within the perimeter. The SDK is distributed as a signed wheel package that can be hosted in an internal package repository.

Controlled export interface. When compliance packages need to be produced for review or regulatory production, the on-premises server provides an export API that generates signed, hash-verified compliance packages. The export itself is logged in the audit record — who initiated the export, what run IDs were included, when the export was completed. This creates an auditable record of documentation leaving the perimeter.

For environments that need a physical air gap (no network connectivity at all between the training environment and the tracking server), the SDK supports a batch mode where audit records are written to a local queue and periodically transferred to the tracking server through an authorized data transfer channel. The transfer itself is hash-verified to ensure record integrity.
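The batch pattern can be sketched with nothing but the standard library: records are written as JSON Lines next to a sidecar file holding their SHA-256, and the receiving side refuses the batch unless the digest matches. The file layout and function names here are hypothetical and do not describe the SDK's actual queue format.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def write_batch(records, queue_dir: Path, batch_name: str) -> Path:
    """Write records as JSON Lines plus a sidecar file holding the SHA-256."""
    batch = queue_dir / f"{batch_name}.jsonl"
    data = "".join(json.dumps(r, sort_keys=True) + "\n" for r in records).encode("utf-8")
    batch.write_bytes(data)
    (queue_dir / f"{batch_name}.sha256").write_text(hashlib.sha256(data).hexdigest())
    return batch


def read_verified_batch(batch: Path) -> list:
    """On the receiving side: reject the batch unless the digest matches."""
    expected = batch.with_suffix(".sha256").read_text().strip()
    data = batch.read_bytes()
    if hashlib.sha256(data).hexdigest() != expected:
        raise ValueError(f"digest mismatch for {batch.name}")
    return [json.loads(line) for line in data.decode("utf-8").splitlines()]


# Hypothetical audit records queued in the training environment.
sample = [
    {"run_id": "run-001", "event": "training_started", "seq": 1},
    {"run_id": "run-001", "event": "training_finished", "seq": 2},
]
with tempfile.TemporaryDirectory() as tmp:
    path = write_batch(sample, Path(tmp), "batch-0001")
    received = read_verified_batch(path)
print(len(received))
```

A sidecar digest catches corruption in transit. Resisting deliberate tampering would additionally require signing the digest or carrying it over a separate trusted channel.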

Compliance package export without cloud connectivity

One of the practical advantages of the on-premises deployment for regulated enterprises is that compliance package exports don't require internet connectivity. The on-premises server has all the data needed to generate the compliance package — dataset provenance records, training configuration, eval results, approval records — and can generate and sign the package locally.

The generated compliance package is a self-contained artifact: a signed archive containing the structured documentation and the cryptographic evidence needed to verify its authenticity. An external auditor receiving this package can verify the hash chain using the published verification algorithm without needing access to Cognify's cloud infrastructure or to the customer's on-premises network.

This self-contained verification property matters for the long-term retention use case: a compliance package generated from an on-premises Cognify server in 2025 should be verifiable in 2035 by an examiner who has no access to either the original training environment or Cognify's current software. The verification algorithm is documented, the hash algorithm (SHA-256) is stable, and the package is self-describing.

Network-segmented deployments

A common pattern in large regulated enterprises is not a full air gap but network segmentation: the training environment is in a restricted segment with controlled egress, while compliance teams work in a separate network segment with different access privileges. The challenge is that the tracking server needs to be accessible to both segments — receiving logs from training in one segment and serving the compliance review interface in another.

This is typically solved with a DMZ-style architecture: the Cognify on-premises server runs in a controlled DMZ segment, with firewall rules permitting inbound SDK calls from the training segment (metadata and hashes only, not data) and inbound browser/API access from the compliance team's segment. Outbound connectivity from the DMZ to either segment is restricted.

The key constraint to preserve is that no actual training data flows through the tracking server to the compliance segment. All data stays in the training segment. What crosses the DMZ are hashes, metadata, and structured documentation — the audit trail, not the data it describes. This architecture can satisfy the requirements of compliance teams who need to review documentation without having access to the sensitive training data itself.
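One way to enforce this constraint in software, in addition to firewall rules, is an allowlist applied before any record leaves the training segment. The sketch below assumes a hypothetical set of permitted metadata fields and rejects anything else, so raw rows or free-form payloads cannot ride along by accident.

```python
import re

# Hypothetical allowlist: only metadata and hashes may cross the boundary.
ALLOWED_FIELDS = {"run_id", "dataset_sha256", "config_sha256", "row_count", "approved_by"}
SHA256_HEX = re.compile(r"^[0-9a-f]{64}$")


def to_audit_metadata(record: dict) -> dict:
    """Return a copy of the record if it contains only permitted fields.

    Raises ValueError for unexpected fields or malformed digests.
    """
    unexpected = set(record) - ALLOWED_FIELDS
    if unexpected:
        raise ValueError(
            f"fields not permitted to leave the training segment: {sorted(unexpected)}"
        )
    for key, value in record.items():
        if key.endswith("_sha256") and not SHA256_HEX.match(str(value)):
            raise ValueError(f"{key} is not a SHA-256 hex digest")
    return dict(record)


ok = to_audit_metadata({"run_id": "run-001", "dataset_sha256": "ab" * 32, "row_count": 5000})
print(sorted(ok))
```

Failing closed on unknown fields means a schema change in the training pipeline has to be reviewed before new information starts crossing into the compliance segment.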

For enterprises evaluating on-premises deployment, the planning questions are: where will the tracking server run, who has network access to it, how will compliance packages be exported for external review, and what archival storage will be used for the long-retention documentation? Addressing these questions as part of the deployment architecture decision — rather than after the training infrastructure is already built — prevents the common outcome where a well-constructed on-premises training pipeline lacks the compliance infrastructure to turn its outputs into defensible regulatory documentation.