
Complete Guide to LLM Fine-Tuning Production with TypeScript

A comprehensive guide to building production LLM fine-tuning infrastructure in TypeScript, covering architecture, code examples, and production-ready patterns.

Muneer Puthiya Purayil · 20 min read

While Python dominates the LLM training ecosystem, TypeScript provides a strong foundation for the production infrastructure surrounding fine-tuning: data pipelines, evaluation services, API gateways, and monitoring dashboards. This guide covers building the complete fine-tuning production system in TypeScript, interfacing with Python training processes where necessary.

Data Pipeline in TypeScript

Training Data Management Service

```typescript
import { createHash } from "node:crypto";
import { readFile, writeFile } from "node:fs/promises";

interface TrainingExample {
  id: string;
  instruction: string;
  input: string;
  output: string;
  sourceSystem: string;
  sourceDocumentId: string;
  contentHash: string;
  createdAt: string;
  reviewStatus: "pending" | "approved" | "rejected";
  annotatorId?: string;
}

interface ValidationResult {
  valid: boolean;
  totalExamples: number;
  issues: Array<{ lineNumber: number; issue: string }>;
}

class TrainingDataPipeline {
  async loadAndValidate(filePath: string): Promise<{
    examples: TrainingExample[];
    validation: ValidationResult;
  }> {
    const content = await readFile(filePath, "utf-8");
    const lines = content.trim().split("\n");
    const examples: TrainingExample[] = [];
    const issues: ValidationResult["issues"] = [];
    const seenHashes = new Set<string>();

    for (let i = 0; i < lines.length; i++) {
      const lineNum = i + 1;
      try {
        const item = JSON.parse(lines[i]);

        if (!item.instruction || !item.output) {
          issues.push({ lineNumber: lineNum, issue: "Missing required fields" });
          continue;
        }

        const contentHash = createHash("sha256")
          .update(`${item.instruction}|${item.output}`)
          .digest("hex");

        if (seenHashes.has(contentHash)) {
          issues.push({ lineNumber: lineNum, issue: "Duplicate content" });
          continue;
        }
        seenHashes.add(contentHash);

        if (item.output.split(/\s+/).length < 10) {
          issues.push({ lineNumber: lineNum, issue: "Output too short" });
          continue;
        }

        examples.push({
          id: `te_${contentHash.slice(0, 16)}`,
          instruction: item.instruction,
          input: item.input || "",
          output: item.output,
          sourceSystem: item.source_system || "manual",
          sourceDocumentId: item.source_document_id || "",
          contentHash,
          createdAt: new Date().toISOString(),
          reviewStatus: "pending",
          annotatorId: item.annotator_id,
        });
      } catch {
        issues.push({ lineNumber: lineNum, issue: "Invalid JSON" });
      }
    }

    return {
      examples,
      validation: {
        valid: issues.length === 0,
        totalExamples: examples.length,
        issues,
      },
    };
  }

  async formatForTraining(
    examples: TrainingExample[],
    outputPath: string,
    format: "instruction" | "chat" = "instruction",
  ): Promise<void> {
    const formatted = examples
      .filter((ex) => ex.reviewStatus === "approved")
      .map((ex) => {
        if (format === "chat") {
          return JSON.stringify({
            messages: [
              { role: "system", content: "You are a helpful assistant." },
              { role: "user", content: ex.input ? `${ex.instruction}\n\n${ex.input}` : ex.instruction },
              { role: "assistant", content: ex.output },
            ],
          });
        }
        return JSON.stringify({
          instruction: ex.instruction,
          input: ex.input,
          output: ex.output,
        });
      });

    await writeFile(outputPath, formatted.join("\n"));
  }
}
```
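The deduplication above keys on a content hash rather than raw text, so the same instruction/output pair always yields the same id, and re-running ingestion over overlapping source documents cannot create duplicates. A minimal standalone sketch of that id scheme (the helper name `exampleId` is hypothetical):

```typescript
import { createHash } from "node:crypto";

// Mirrors the pipeline's id scheme: sha256 over "instruction|output",
// truncated to 16 hex chars and prefixed with "te_".
function exampleId(instruction: string, output: string): string {
  const contentHash = createHash("sha256")
    .update(`${instruction}|${output}`)
    .digest("hex");
  return `te_${contentHash.slice(0, 16)}`;
}
```

Because the id is deterministic, review status and annotations keyed on it survive re-ingestion of the same source data.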

Training Job Orchestration

TypeScript excels at orchestrating Python training processes and managing their lifecycle:

```typescript
import { spawn } from "node:child_process";
import { EventEmitter } from "node:events";

interface TrainingConfig {
  baseModel: string;
  datasetPath: string;
  outputDir: string;
  loraR: number;
  loraAlpha: number;
  learningRate: number;
  numEpochs: number;
  batchSize: number;
  gradientAccumulation: number;
  useQlora: boolean;
}

interface TrainingMetrics {
  step: number;
  loss: number;
  learningRate: number;
  epoch: number;
  gpuMemoryUsed?: number;
}

class TrainingOrchestrator extends EventEmitter {
  private activeJobs = new Map<string, { process: ReturnType<typeof spawn>; config: TrainingConfig }>();

  async startTraining(jobId: string, config: TrainingConfig): Promise<void> {
    const args = [
      "scripts/train.py",
      "--base-model", config.baseModel,
      "--dataset", config.datasetPath,
      "--output-dir", config.outputDir,
      "--lora-r", String(config.loraR),
      "--lora-alpha", String(config.loraAlpha),
      "--learning-rate", String(config.learningRate),
      "--num-epochs", String(config.numEpochs),
      "--batch-size", String(config.batchSize),
      "--gradient-accumulation", String(config.gradientAccumulation),
    ];

    if (config.useQlora) {
      args.push("--use-qlora");
    }

    // Named `child` rather than `process` so the global `process.env`
    // stays reachable inside the spawn options
    const child = spawn("python", args, {
      env: { ...process.env, PYTHONUNBUFFERED: "1" },
    });

    this.activeJobs.set(jobId, { process: child, config });

    child.stdout.on("data", (data: Buffer) => {
      const lines = data.toString().split("\n").filter(Boolean);
      for (const line of lines) {
        try {
          const metrics: TrainingMetrics = JSON.parse(line);
          this.emit("metrics", { jobId, metrics });

          if (metrics.loss > 10) {
            this.emit("alert", {
              jobId,
              type: "loss_spike",
              message: `Loss spike detected: ${metrics.loss}`,
            });
          }
        } catch {
          this.emit("log", { jobId, message: line });
        }
      }
    });

    child.stderr.on("data", (data: Buffer) => {
      this.emit("error", { jobId, message: data.toString() });
    });

    child.on("exit", (code) => {
      this.activeJobs.delete(jobId);
      this.emit("complete", { jobId, exitCode: code });
    });
  }

  async stopTraining(jobId: string): Promise<void> {
    const job = this.activeJobs.get(jobId);
    if (job) {
      job.process.kill("SIGTERM");
      // Escalate to SIGKILL if the trainer has not exited after 30s
      setTimeout(() => {
        if (this.activeJobs.has(jobId)) {
          job.process.kill("SIGKILL");
        }
      }, 30000);
    }
  }
}
```
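The protocol between the Python trainer and the orchestrator is deliberately dumb: one JSON object per stdout line, and anything that does not parse falls through as a plain log. Pulled out as pure functions (the names `parseMetricsLine` and `isLossSpike` are hypothetical), the parsing and spike check are easy to unit-test without spawning a process:

```typescript
interface ParsedMetrics {
  step: number;
  loss: number;
  learningRate: number;
  epoch: number;
}

// Returns parsed metrics for a valid JSON metrics line, or null so the
// caller can emit the raw line as a log event instead.
function parseMetricsLine(line: string): ParsedMetrics | null {
  try {
    const parsed = JSON.parse(line);
    if (typeof parsed?.loss !== "number" || typeof parsed?.step !== "number") {
      return null;
    }
    return parsed as ParsedMetrics;
  } catch {
    return null;
  }
}

// The same spike rule the orchestrator applies before emitting an alert.
function isLossSpike(metrics: ParsedMetrics, threshold = 10): boolean {
  return metrics.loss > threshold;
}
```

Keeping these pure also means the spike threshold can later become per-job configuration without touching the process-management code.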

Evaluation Service

```typescript
import Fastify from "fastify";
import { readFile } from "node:fs/promises";

interface EvalRequest {
  modelEndpoint: string;
  evalDatasetPath: string;
  maxExamples?: number;
}

interface EvalItem {
  prompt: string;
  expected: string;
  category?: string;
  formatSpec?: { type: string };
}

interface EvalResult {
  accuracy: number;
  formatCompliance: number;
  avgSimilarity: number;
  totalExamples: number;
  perCategory: Record<string, number>;
  failedExamples: Array<{
    prompt: string;
    expected: string;
    generated: string;
    category: string;
  }>;
}

const app = Fastify({ logger: true });

app.post<{ Body: EvalRequest }>("/evaluate", async (request): Promise<EvalResult> => {
  const { modelEndpoint, evalDatasetPath, maxExamples = 100 } = request.body;

  const evalData = await loadEvalDataset(evalDatasetPath, maxExamples);

  let correct = 0;
  let formatOk = 0;
  const similarities: number[] = [];
  const perCategory: Record<string, { correct: number; total: number }> = {};
  const failedExamples: EvalResult["failedExamples"] = [];

  for (const item of evalData) {
    const generated = await callModel(modelEndpoint, item.prompt);
    const category = item.category || "general";

    if (!perCategory[category]) {
      perCategory[category] = { correct: 0, total: 0 };
    }
    perCategory[category].total++;

    const isCorrect = normalizeText(generated) === normalizeText(item.expected);
    const isFormatOk = checkFormat(generated, item.formatSpec);
    const similarity = computeSimilarity(generated, item.expected);

    if (isCorrect) {
      correct++;
      perCategory[category].correct++;
    } else {
      failedExamples.push({
        prompt: item.prompt.slice(0, 200),
        expected: item.expected.slice(0, 200),
        generated: generated.slice(0, 200),
        category,
      });
    }
    if (isFormatOk) formatOk++;
    similarities.push(similarity);
  }

  const total = evalData.length;
  return {
    accuracy: correct / total,
    formatCompliance: formatOk / total,
    avgSimilarity: similarities.reduce((a, b) => a + b, 0) / similarities.length,
    totalExamples: total,
    perCategory: Object.fromEntries(
      Object.entries(perCategory).map(([k, v]) => [k, v.correct / v.total])
    ),
    failedExamples: failedExamples.slice(0, 20),
  };
});

async function loadEvalDataset(path: string, maxExamples: number): Promise<EvalItem[]> {
  // Eval sets are stored as JSONL, one example per line
  const content = await readFile(path, "utf-8");
  return content
    .trim()
    .split("\n")
    .slice(0, maxExamples)
    .map((line) => JSON.parse(line) as EvalItem);
}

async function callModel(endpoint: string, prompt: string): Promise<string> {
  const response = await fetch(`${endpoint}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "default",
      messages: [{ role: "user", content: prompt }],
      temperature: 0.1,
      max_tokens: 512,
    }),
  });

  const data = (await response.json()) as {
    choices: Array<{ message: { content: string } }>;
  };
  return data.choices[0].message.content;
}

function normalizeText(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

function computeSimilarity(a: string, b: string): number {
  const na = normalizeText(a);
  const nb = normalizeText(b);
  if (na.length === 0 && nb.length === 0) return 1.0;

  let matches = 0;
  const wordsA = na.split(" ");
  const wordsB = new Set(nb.split(" "));
  for (const word of wordsA) {
    if (wordsB.has(word)) matches++;
  }
  return matches / Math.max(wordsA.length, wordsB.size);
}

function checkFormat(output: string, formatSpec?: { type: string }): boolean {
  if (!formatSpec) return true;
  if (formatSpec.type === "json") {
    try { JSON.parse(output); return true; } catch { return false; }
  }
  return true;
}

// Port is illustrative; configure per environment
app.listen({ port: 3000, host: "0.0.0.0" });
```
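To make the similarity metric concrete, here is a self-contained restatement of the `computeSimilarity` logic above: identical strings (after whitespace and case normalization) score 1.0, disjoint strings 0.0, and partial overlap is normalized by the larger word count:

```typescript
// Bag-of-words overlap, normalized by the larger side's word count.
// Same logic as computeSimilarity in the evaluation service.
function wordOverlap(a: string, b: string): number {
  const norm = (t: string) => t.trim().toLowerCase().replace(/\s+/g, " ");
  const wordsA = norm(a).split(" ");
  const wordsB = new Set(norm(b).split(" "));
  let matches = 0;
  for (const word of wordsA) {
    if (wordsB.has(word)) matches++;
  }
  return matches / Math.max(wordsA.length, wordsB.size);
}
```

This metric is crude but cheap and dependency-free; for semantically equivalent but differently worded outputs, an embedding-based similarity is a better fit when you can afford the extra model call.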


Model Registry API

```typescript
import { Pool } from "pg";

interface ModelVersion {
  id: string;
  modelName: string;
  version: number;
  stage: "development" | "staging" | "production";
  adapterPath: string;
  mergedModelPath?: string;
  evalMetrics: EvalResult;
  trainingConfig: TrainingConfig;
  createdAt: string;
  promotedBy?: string;
  promotedAt?: string;
}

class ModelRegistry {
  private db: Pool;

  constructor(connectionString: string) {
    this.db = new Pool({ connectionString });
  }

  async registerModel(
    modelName: string,
    adapterPath: string,
    evalMetrics: EvalResult,
    trainingConfig: TrainingConfig,
  ): Promise<ModelVersion> {
    // Gate on quality before touching the database
    if (evalMetrics.accuracy < 0.85) {
      throw new Error(
        `Model accuracy ${evalMetrics.accuracy} below minimum threshold 0.85`
      );
    }

    const latestVersion = await this.db.query(
      "SELECT MAX(version) AS max_version FROM model_versions WHERE model_name = $1",
      [modelName]
    );

    const version = (latestVersion.rows[0]?.max_version || 0) + 1;

    const result = await this.db.query(
      `INSERT INTO model_versions (model_name, version, stage, adapter_path, eval_metrics, training_config)
       VALUES ($1, $2, 'development', $3, $4, $5) RETURNING *`,
      [modelName, version, adapterPath, JSON.stringify(evalMetrics), JSON.stringify(trainingConfig)]
    );

    // Assumes snake_case columns are mapped to the camelCase interface
    // (e.g. via a row mapper, omitted here for brevity)
    return result.rows[0];
  }

  async promoteModel(
    modelName: string,
    version: number,
    targetStage: "staging" | "production",
    approvedBy: string,
  ): Promise<ModelVersion> {
    const model = await this.getModelVersion(modelName, version);

    if (targetStage === "production" && model.stage !== "staging") {
      throw new Error("Models must pass through staging before production");
    }

    if (targetStage === "production" && model.evalMetrics.accuracy < 0.90) {
      throw new Error(
        `Production requires 90%+ accuracy. Current: ${model.evalMetrics.accuracy}`
      );
    }

    const result = await this.db.query(
      `UPDATE model_versions SET stage = $1, promoted_by = $2, promoted_at = NOW()
       WHERE model_name = $3 AND version = $4 RETURNING *`,
      [targetStage, approvedBy, modelName, version]
    );

    return result.rows[0];
  }

  async getModelVersion(modelName: string, version: number): Promise<ModelVersion> {
    const result = await this.db.query(
      "SELECT * FROM model_versions WHERE model_name = $1 AND version = $2",
      [modelName, version]
    );
    return result.rows[0];
  }

  async getProductionModel(modelName: string): Promise<ModelVersion | null> {
    const result = await this.db.query(
      "SELECT * FROM model_versions WHERE model_name = $1 AND stage = 'production' ORDER BY version DESC LIMIT 1",
      [modelName]
    );
    return result.rows[0] || null;
  }
}
```
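The registry enforces two gates: 0.85 accuracy to register at all, 0.90 (plus a pass through staging) to reach production. Factored into a pure function, the promotion rule becomes trivially testable without a database. This is a sketch with a hypothetical helper name (`canPromote`), using the thresholds from the code above:

```typescript
type Stage = "development" | "staging" | "production";

// Encodes the registry's promotion rules: staging is mandatory before
// production, and production additionally requires 90%+ eval accuracy.
function canPromote(
  currentStage: Stage,
  targetStage: "staging" | "production",
  accuracy: number,
): { allowed: boolean; reason?: string } {
  if (targetStage === "production" && currentStage !== "staging") {
    return { allowed: false, reason: "must pass through staging first" };
  }
  if (targetStage === "production" && accuracy < 0.9) {
    return { allowed: false, reason: `production requires 90%+ accuracy, got ${accuracy}` };
  }
  return { allowed: true };
}
```

Keeping the gate separate from the SQL also means the same rules can be reused by a CI check or a CLI before anyone touches the registry.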

Monitoring Dashboard Backend

```typescript
import { WebSocket, WebSocketServer } from "ws";

interface TrainingMetricsEvent {
  jobId: string;
  timestamp: string;
  step: number;
  loss: number;
  evalLoss?: number;
  learningRate: number;
  gpuMemoryPercent: number;
  throughputTokensPerSec: number;
}

class MonitoringService {
  private wss: WebSocketServer;
  private metricsBuffer: TrainingMetricsEvent[] = [];

  constructor(port: number) {
    this.wss = new WebSocketServer({ port });

    // New dashboard connections replay recent history for instant charts
    this.wss.on("connection", (ws) => {
      ws.send(JSON.stringify({
        type: "history",
        data: this.metricsBuffer.slice(-1000),
      }));
    });
  }

  pushMetrics(event: TrainingMetricsEvent): void {
    this.metricsBuffer.push(event);

    // Bound memory: once past 10k events, keep only the 5k most recent
    if (this.metricsBuffer.length > 10000) {
      this.metricsBuffer = this.metricsBuffer.slice(-5000);
    }

    const message = JSON.stringify({ type: "metrics", data: event });
    for (const client of this.wss.clients) {
      if (client.readyState === WebSocket.OPEN) {
        client.send(message);
      }
    }
  }
}
```
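The buffer trim is the detail worth getting right here: it bounds memory for long training runs while preserving the recent history that new dashboard connections replay. As a standalone pure function (hypothetical name, same policy as the service above):

```typescript
// Once the buffer exceeds `max` entries, drop the oldest and retain
// only the `keep` most recent. Returns the input unchanged otherwise.
function trimBuffer<T>(buffer: T[], max = 10000, keep = 5000): T[] {
  return buffer.length > max ? buffer.slice(-keep) : buffer;
}
```

Trimming to half the cap (rather than to the cap itself) amortizes the cost: the slice runs once per 5,000 pushes instead of on every push past the limit.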

Conclusion

TypeScript's role in LLM fine-tuning production is not replacing Python for training — it's building the production infrastructure around it. Data pipelines, evaluation services, model registries, and monitoring dashboards are natural TypeScript territory. The result is a system where Python handles the math and TypeScript handles the operations, leveraging each language's strengths.

This architecture works particularly well for teams that are already TypeScript-centric. Rather than requiring every team member to learn Python for operational tasks, the Python surface area is contained to training scripts while TypeScript handles everything the broader engineering team interacts with.
