After launching PrisML, I wanted to clearly document how it was actually built in practice.

This post is that behind-the-scenes look: the architecture I chose, the implementation order, what broke along the way, and what I’d do differently today.

And most importantly: what is truly implemented in the repository right now.


The problem I was trying to solve

I didn’t start by trying to build “yet another ML library.”

I started from a recurring product pain:

  • the model works in a notebook;
  • it becomes a separate service;
  • the app schema changes;
  • the model keeps running with outdated assumptions;
  • nobody notices until metrics drop.

PrisML came from this: treat training as compilation and inference as a local, type-safe function call.


How Prisma shaped my mental model

A key part of PrisML’s origin came from my experience with Prisma ORM.

What I like about Prisma is the mental model:

  • you describe intent in a schema;
  • generate type-safe code;
  • and when contracts break, the issue appears early in the dev cycle.

That flow directly influenced how I approached ML in this project.

Instead of treating models as a “detached service,” I wanted the same contract discipline for predictions:

  • TypeScript-first definitions;
  • compiled artifacts with explicit metadata;
  • compatibility checks against schema before inference.

At a high level, it was a mindset transfer: applying ORM-style predictability to ML.


The architecture I chose

I used a monorepo with separated responsibilities:

  • @vncsleal/prisml-core: types, defineModel, schema hashing, encoding, validations;
  • @vncsleal/prisml-cli: prisml train and prisml check;
  • @vncsleal/prisml-runtime: ONNX prediction session;
  • @vncsleal/prisml: umbrella entry package for consumption.

At first (v0.1.0), the core was these four packages. Then in v0.2.0, I added a fifth package:

  • @vncsleal/prisml-generator: Prisma generator integration for schema annotations.

That separation was critical for two reasons:

  1. keeping runtime clean (no training dependencies);
  2. isolating compilation complexity in the CLI.

The core flow is:

TypeScript model definition -> prisml train -> ONNX + metadata.json -> PredictionSession

And from v0.2.0 onward, the flow can also include the generator as an optional DX layer for schema annotations.


Phase 1: define the contract before training anything

Before touching Python, I defined the artifact contract (metadata.json) and typed errors.

Why?

Because if the contract is loose, everything else turns into guesswork.

I locked down early:

  • feature schema and vector ordering;
  • imputation rules;
  • categorical encoding;
  • Prisma schema SHA256 hash;
  • metrics and quality gates.

Only after that did I move to the training pipeline.
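
To make that contract concrete, here is a sketch of what such a metadata shape could look like. The field names are illustrative assumptions on my part, not PrisML’s actual metadata.json keys:

```typescript
// Illustrative sketch of an artifact contract. Field names are
// hypothetical, not PrisML's real metadata.json keys.
interface FeatureSpec {
  name: string;
  type: "numeric" | "boolean" | "categorical";
  imputation: { strategy: "mean" | "mode"; value: number | string | boolean };
  categories?: string[]; // fixed order for one-hot encoding
}

interface ModelMetadata {
  version: string;                       // artifact/contract version
  schemaHash: string;                    // SHA-256 of the Prisma schema
  features: FeatureSpec[];               // ordered: defines the vector layout
  metrics: Record<string, number>;       // e.g. { r2: 0.91 }
  qualityGates: Record<string, number>;  // thresholds enforced at build time
}
```

The important property is that the feature array is *ordered*: the contract itself defines the input vector layout, so there is nothing to guess at inference time.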


Phase 2: compilation pipeline (prisml train)

The train command became an explicit pipeline:

  1. load config and schema;
  2. validate model definitions;
  3. materialize dataset through Prisma;
  4. extract features via TS resolvers;
  5. normalize into deterministic numeric vectors;
  6. train through Python backend (scikit-learn);
  7. export ONNX + metadata;
  8. enforce quality gates and fail build when needed.
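
The last step — failing the build on weak metrics — can be sketched as a simple threshold check. The shapes here are assumptions, not PrisML’s actual gate config format:

```typescript
// Hypothetical quality-gate check: compare trained-model metrics against
// configured minimum thresholds and fail the build when any gate is violated.
function enforceQualityGates(
  metrics: Record<string, number>,
  gates: Record<string, number> // e.g. { accuracy: 0.85 }
): void {
  const failures = Object.entries(gates)
    .filter(([name, min]) => (metrics[name] ?? -Infinity) < min)
    .map(([name, min]) => `${name}: ${metrics[name]} < required ${min}`);
  if (failures.length > 0) {
    // In a CLI, this is where you would print the failures and exit non-zero.
    throw new Error(`Quality gates failed:\n${failures.join("\n")}`);
  }
}
```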

Concrete details from current implementation:

  • fixed-seed split (42) with 80/20 train/test;
  • X_train, X_test, y_train, y_test serialized to .dataset.json per model;
  • local Python backend with numpy, scikit-learn, skl2onnx, onnx;
  • currently supported algorithms: linear, tree, forest, gbm (regression and classification);
  • the current imputation path uses constant strategies: mean for numeric fields, mode for booleans, and an explicit mode fallback for strings.
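
The fixed-seed 80/20 split can be illustrated in TypeScript with a seeded PRNG. PrisML’s actual split happens in the Python backend via scikit-learn; this is just an equivalent sketch, and mulberry32 is my illustrative choice of PRNG:

```typescript
// Deterministic 80/20 split with a fixed seed, analogous to
// scikit-learn's train_test_split(..., random_state=42).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function trainTestSplit<T>(
  rows: T[],
  testRatio = 0.2,
  seed = 42
): { train: T[]; test: T[] } {
  const rand = mulberry32(seed);
  const shuffled = [...rows];
  // Fisher-Yates shuffle driven by the seeded PRNG: same seed, same order.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const testSize = Math.floor(rows.length * testRatio);
  return { test: shuffled.slice(0, testSize), train: shuffled.slice(testSize) };
}
```

The point is reproducibility: the same seed must always yield the same split, so training is repeatable and artifacts are comparable across runs.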

Main rule: anything that affects inference must be serialized into metadata.

No hidden behavior living only in code.


Why Python lives in CLI (not runtime)

This was a very intentional architecture choice.

In PrisML, Python exists to train and export models (scikit-learn + skl2onnx). That work is done by prisml train, which runs in the CLI/build layer.

At app runtime, the goal is different:

  • load already-compiled artifacts (.onnx + .metadata.json);
  • validate contracts (including schema hash);
  • run in-process inference with ONNX Runtime on Node.

Practical gains from this separation:

  1. Simpler deploys

    • Node app does not need Python in production.
  2. Isolated dependencies

    • training stack stays in build toolchain, not runtime critical path.
  3. Clear compile-time vs runtime boundary

    • training generates immutable artifacts; runtime only executes them.
  4. Lower operational coupling

    • training pipeline can evolve without turning inference into a separate service.

What I learned from ONNX in this project

For me, ONNX started as an “export format.”

In this project, it became something more important: a runtime boundary between training and inference.

A few practical lessons:

  1. Portability helps, but contracts are mandatory

    • exporting to ONNX solves execution in Node, but not feature semantics.
    • that’s why .metadata.json became a required artifact.
  2. Inference is simple when preparation is strict

    • ONNX Runtime works well in-process.
    • the hard part is guaranteeing runtime input vectors match training vectors exactly.
  3. Product ML is less about models and more about consistency

    • ONNX makes execution predictable;
    • schema hash + serialized encoding/imputation makes predictions auditable.

In PrisML, ONNX is not “the whole solution.”

It’s the executable model format, while TypeScript + metadata + validation ensure it runs under the right contract.


Phase 3: minimal, predictable runtime

At runtime, I wanted a small API:

  • initialize model (metadata + onnx + current schema hash);
  • predict (predict / predictBatch);
  • fail with clear errors when contracts are broken.

Result: runtime doesn’t “guess.”

If there’s schema drift, invalid feature extraction, or incompatible values against serialized contract, it fails early with typed errors (SchemaDriftError, FeatureExtractionError, EncodingError, etc.).

The runtime also implements an atomic preflight for predictBatch: if any entity fails feature validation, the whole batch is aborted before ONNX inference starts.
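
That atomic preflight can be sketched as: validate everything first, infer only if nothing failed. The callback names here are placeholders, not PrisML’s API:

```typescript
// Sketch of an atomic preflight: every entity is validated before any
// inference runs, so a single invalid row aborts the whole batch.
function predictBatchAtomic<T, R>(
  entities: T[],
  validate: (e: T) => string | null, // returns an error message, or null if valid
  infer: (batch: T[]) => R[]
): R[] {
  const errors = entities
    .map((e, i) => ({ i, err: validate(e) }))
    .filter((x) => x.err !== null);
  if (errors.length > 0) {
    throw new Error(
      `Batch aborted before inference: ${errors
        .map((x) => `row ${x.i}: ${x.err}`)
        .join("; ")}`
    );
  }
  return infer(entities); // only reached when every entity validated
}
```

The design choice: partial batch results are worse than no results, because a half-succeeded batch silently corrupts downstream aggregates.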


The most important decision: block schema drift with hash checks

Schema drift protection became the core of the system.

During training, I compute and store the Prisma schema hash. At runtime initialization, I compare it against the current schema’s hash.

If it doesn’t match, inference doesn’t run.

This removes one of the most expensive bugs in this kind of stack: a model “working” with outdated schema semantics.
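
A minimal sketch of that check, using Node’s built-in crypto module. The error class name mirrors the one mentioned earlier in this post; the real implementation is PrisML-internal:

```typescript
import { createHash } from "node:crypto";

// Sketch of the drift check: hash the current Prisma schema text and
// compare against the hash stored in the training metadata.
class SchemaDriftError extends Error {}

function schemaHash(schemaSource: string): string {
  return createHash("sha256").update(schemaSource).digest("hex");
}

function assertNoDrift(currentSchema: string, trainedHash: string): void {
  const current = schemaHash(currentSchema);
  if (current !== trainedHash) {
    throw new SchemaDriftError(
      `Schema drift: model trained on ${trainedHash.slice(0, 12)}, ` +
        `current schema is ${current.slice(0, 12)}`
    );
  }
}
```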


What MVP already covers (and what it still doesn’t)

What is solid in code today:

  • artifact contract (.onnx + .metadata.json) with versioning;
  • Prisma schema SHA256 in training + runtime validation;
  • prisml check for schema-contract validation without retraining;
  • quality gates with explicit build failure;
  • Node in-process runtime with ONNX.

Important nuance: in the current prisml check, a schema-hash mismatch is only a warning; hard failures are reserved for field, type, and nullability incompatibilities.
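
That warning-vs-error split can be sketched as follows. The shapes are illustrative, not the CLI’s real internals:

```typescript
// Sketch of check semantics: hash mismatch is a warning; missing fields
// and type/nullability changes are hard failures.
interface FieldSpec { name: string; type: string; nullable: boolean }

function checkCompatibility(
  expected: FieldSpec[],            // from the serialized training contract
  current: Map<string, FieldSpec>,  // from the current Prisma schema
  hashMatches: boolean
): { errors: string[]; warnings: string[] } {
  const errors: string[] = [];
  const warnings: string[] = [];
  if (!hashMatches) warnings.push("schema hash differs from training (warning only)");
  for (const f of expected) {
    const cur = current.get(f.name);
    if (!cur) errors.push(`missing field: ${f.name}`);
    else if (cur.type !== f.type) errors.push(`type changed: ${f.name} (${f.type} -> ${cur.type})`);
    else if (cur.nullable !== f.nullable) errors.push(`nullability changed: ${f.name}`);
  }
  return { errors, warnings };
}
```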

What is still MVP-bounded (not fully expanded):

  • advanced resolver AST analysis is still an MVP stub;
  • static validation currently relies more on serialized contracts + schema/feature checks than deep resolver inference;
  • for dynamic cases, runtime is where stricter validation happens.

What was hardest

Three things were harder than they looked:

  1. Feature determinism

    • column order, encoding, and imputation must be identical in training and inference.
  2. Feature contracts vs schema reality

    • not every feature maps to a direct Prisma field.
    • I had to clearly separate what is statically checkable in check vs what is only reliable at runtime.
  3. CLI DX

    • ML errors are often cryptic.
    • I spent significant effort on actionable error messages with context (model, feature, threshold, etc.).
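
Point 1 — feature determinism — boils down to encoding rows strictly in the order serialized at training time. A hedged sketch, with illustrative names:

```typescript
// Sketch of deterministic encoding: column order comes from the serialized
// contract (never from object-key order), missing values fall back to the
// stored imputation constant, and one-hot uses a fixed category order.
interface EncodedFeature {
  name: string;
  kind: "numeric" | "categorical";
  impute: number | string;  // serialized mean (numeric) or mode (categorical)
  categories?: string[];    // fixed order captured at training time
}

function encodeRow(row: Record<string, unknown>, specs: EncodedFeature[]): number[] {
  const vector: number[] = [];
  for (const spec of specs) {                // contract order, always
    const raw = row[spec.name] ?? spec.impute; // constant imputation fallback
    if (spec.kind === "numeric") {
      vector.push(Number(raw));
    } else {
      for (const cat of spec.categories ?? []) {
        vector.push(raw === cat ? 1 : 0);    // one-hot in fixed category order
      }
    }
  }
  return vector;
}
```

As long as training and inference both read from the same serialized specs, the vectors line up by construction.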

Mistakes I made

A few practical lessons:

  • I initially underestimated how central metadata contracts are;
  • I was too permissive in early inference paths, which could hide issues;
  • I kept CI/lint too loose at first and paid for avoidable failures.

In the end, what worked was:

strict contracts + explicit errors + simple public API.


What I’d do differently today

If I started over:

  • I’d invest earlier in prisml check for CI paths without full retraining;
  • I’d create schema-drift fixtures from day one;
  • I’d add runtime performance regression benchmarks earlier.

Conclusion

PrisML was not built to compete with large-scale ML platforms.

It was built for a specific scenario: product teams that want predictive power in TypeScript apps with less friction, less infrastructure, and less room for silent failure.

If that’s your context, production feedback is what helps the most right now.

If you want the positioning post (more “why”), it’s here too: I built a compiler-first ML library for TypeScript builders