
Complete Guide to Agentic AI Workflows with TypeScript

A comprehensive guide to implementing agentic AI workflows in TypeScript, covering architecture, code examples, and production-ready patterns.

Muneer Puthiya Purayil · 12 min read

Introduction

Why This Matters

TypeScript has quietly become the most practical language for production agentic AI systems in web-native and full-stack teams. The LangGraph.js port is now feature-complete with the Python version. The Anthropic and OpenAI SDKs are first-class TypeScript packages with full type definitions. Vercel's AI SDK has matured into a production-grade abstraction for streaming, tool calling, and multi-step agent loops. And the type system — the actual TypeScript type system, not the loose dynamic typing of plain JavaScript — gives you compile-time guarantees about workflow state that Python only provides at runtime with Pydantic.

For teams already building backend services in Node.js, NestJS, or Next.js, adding agentic AI in TypeScript is a smaller architectural leap than switching languages. The deployment infrastructure, the CI/CD pipeline, the monitoring stack — all of it carries over. You add LLM logic; you do not add a new language runtime.

That said, TypeScript's agentic AI ecosystem is still younger than Python's, and the ecosystem fragmentation is higher. This guide cuts through the noise: here is what works in production, here is the code, and here is how to harden it.

Who This Is For

This guide targets TypeScript engineers — backend (Node.js, NestJS, Express), full-stack (Next.js, Remix), or platform — who are adding agentic AI features to existing applications or building new AI-native products. Experience with async/await and generics is assumed. Familiarity with Zod for runtime type validation will help.

If you are a Python-first engineer evaluating whether TypeScript is viable for agentic AI, this guide will give you an honest answer with concrete code to evaluate.

What You Will Learn

  • Why TypeScript's type system is a genuine advantage for agentic workflow state management
  • A complete project structure for a production agentic workflow in TypeScript
  • Typed tool definitions, structured output validation with Zod, and retry logic
  • LangGraph.js for stateful multi-step workflows
  • Vercel AI SDK patterns for streaming agent responses in Next.js
  • Testing strategy: mocking LLM calls, testing tool implementations, integration test patterns

Core Concepts

Key Terminology

Agent: An LLM-powered system that autonomously decides which actions to take. The LLM receives a goal and context; it chooses to either respond directly or invoke a tool.

Tool: A typed function the LLM can call. In TypeScript, tools are defined with a name, description, and Zod schema for parameters. The LLM sees the schema as its calling interface.

Structured output: LLM response constrained to a TypeScript type or Zod schema. Essential for consuming LLM output in application code without string parsing.

Workflow state: A typed object that accumulates information across workflow steps. In LangGraph.js, this is your Annotation type; in custom implementations, it is a plain TypeScript interface.

Trace: A record of every LLM call and tool invocation in a single workflow run. Essential for debugging non-deterministic behavior.

Tool call: The JSON format LLMs use to request a tool invocation. The LLM generates { name: "search_kb", args: { query: "..." } }; your runtime executes the function and returns the result.

Mental Models

Type safety does not eliminate non-determinism. TypeScript gives you compile-time guarantees about the shape of your workflow state. It cannot guarantee what the LLM will put in that shape. A field typed as string will always be a string — but it might be the wrong string. Validation (Zod .parse()) at the LLM output boundary handles this.
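A minimal sketch of what that boundary check does. It is hand-rolled here to stay dependency-free; in practice this is a Zod schema's safeParse, and the Classification shape is purely illustrative:

```typescript
// Hand-rolled stand-in for a Zod schema's safeParse at the LLM boundary.
// The raw value is typed `unknown` — the compiler cannot know what the
// model actually produced, so the check must happen at runtime.
interface Classification {
  label: 'bug' | 'feature' | 'question';
  confidence: number;
}

function parseClassification(raw: unknown): Classification | null {
  if (typeof raw !== 'object' || raw === null) return null;
  const obj = raw as Record<string, unknown>;
  const validLabel =
    obj.label === 'bug' || obj.label === 'feature' || obj.label === 'question';
  const validConfidence =
    typeof obj.confidence === 'number' &&
    obj.confidence >= 0 &&
    obj.confidence <= 1;
  return validLabel && validConfidence
    ? (obj as unknown as Classification)
    : null;
}

// Well-shaped output passes; a wrong-but-still-a-string label is rejected.
const good = parseClassification(JSON.parse('{"label":"bug","confidence":0.9}'));
const bad = parseClassification(JSON.parse('{"label":"urgent","confidence":0.9}'));
```

The shape is what TypeScript can promise; the content is what the runtime check has to verify.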

Tools are contracts, not implementations. When you define a tool for the LLM, you are writing a contract: here is what you can call and what parameters it expects. The LLM calls against this contract; your implementation must honor it. Keep the contract (tool definition) stable; change the implementation freely.

Async is not optional. Every LLM call is an async operation. Every tool that touches an external service is async. TypeScript's async/await and Promise types are not stylistic choices for agentic code — they are requirements. A synchronous workflow step that calls an LLM will block the Node.js event loop.

Foundational Principles

  1. Use Zod at every LLM output boundary. TypeScript types are erased at runtime. The only guarantee about an LLM's JSON output is that it is... JSON. Zod .parse() validates and narrows the type at runtime, giving you the type guarantee TypeScript cannot make across the LLM boundary.

  2. Immutable state updates. Pass state into steps; return new state. Never mutate state objects in place. This makes workflow debugging tractable — each step's input and output is a discrete snapshot.

  3. Typed errors, not thrown strings. Define an AgentError discriminated union type. Return errors as values where possible; only throw for genuinely exceptional conditions (programmer error, system corruption).

  4. Separate tool definitions from tool implementations. The tool definition (name, description, Zod schema) goes next to the LLM call. The implementation (the actual business logic) goes in a testable function. They meet at one line of glue code.
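Principle 3 might look like the following sketch. The variant names and fields are illustrative, not a fixed API:

```typescript
// Illustrative AgentError discriminated union. The `kind` discriminant
// lets the compiler force exhaustive handling at every call site.
type AgentError =
  | { kind: 'llm_timeout'; elapsedMs: number }
  | { kind: 'tool_failed'; toolName: string; message: string }
  | { kind: 'invalid_output'; raw: string }
  | { kind: 'budget_exceeded'; tokensUsed: number; budget: number };

// Errors travel as values, not thrown strings.
type StepResult<T> = { ok: true; value: T } | { ok: false; error: AgentError };

function describeError(error: AgentError): string {
  switch (error.kind) {
    case 'llm_timeout':
      return `LLM call timed out after ${error.elapsedMs}ms`;
    case 'tool_failed':
      return `Tool ${error.toolName} failed: ${error.message}`;
    case 'invalid_output':
      return `Output failed validation: ${error.raw.slice(0, 80)}`;
    case 'budget_exceeded':
      return `Token budget exceeded: ${error.tokensUsed}/${error.budget}`;
  }
}

const result: StepResult<string> = {
  ok: false,
  error: { kind: 'tool_failed', toolName: 'search_kb', message: 'HTTP 503' },
};
```

Adding a new variant to AgentError makes every non-exhaustive switch a compile error, which is exactly the property you want in a workflow with many failure modes.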


Architecture Overview

High-Level Design

┌─────────────────────────────────────────────┐
│ API Layer (NestJS / Next.js)                │
│ POST /api/workflow → returns run_id         │
│ GET /api/workflow/:id → returns status      │
└─────────────────────┬───────────────────────┘
                      │
┌─────────────────────▼───────────────────────┐
│ Workflow Runner (BullMQ)                    │
│ Picks up jobs, enforces timeout/budget      │
│ Publishes result to Redis on completion     │
└─────────────────────┬───────────────────────┘
                      │
┌─────────────────────▼───────────────────────┐
│ Agent + Tools (LangGraph.js)                │
│ StateGraph with typed Annotation            │
│ Tool definitions (Zod schemas)              │
│ LLM client (Anthropic/OpenAI SDK)           │
└─────────────────────┬───────────────────────┘
                      │
┌─────────────────────▼───────────────────────┐
│ State & Observability                       │
│ Typed workflow state, structured logs       │
│ Langfuse for traces, token tracking         │
└─────────────────────────────────────────────┘

Component Breakdown

WorkflowState interface: Immutable typed state object. All workflow data — inputs, tool results, intermediate reasoning, final output — lives here. Passed through LangGraph.js nodes.

Tool registry: A Map<string, ToolDefinition & { execute: (...args) => Promise<string> }>. Separates the LLM-facing definition (Zod schema, description) from the implementation.

LLM client wrapper: Thin class wrapping the provider SDK. Adds retry with exponential backoff, token spend tracking, and trace correlation. Never called directly by business logic.

LangGraph.js StateGraph: The control flow. Nodes are TypeScript functions that transform WorkflowState. Edges are routing functions. The graph is compiled once at startup.

BullMQ worker: For workflows exceeding 10 seconds, a BullMQ worker picks up the job, runs the graph, and writes the result. The API route returns a job ID immediately.

Data Flow

typescript
// TypeScript type-level view of data flow
type WorkflowInput = { userId: string; userInput: string };
type WorkflowResult =
  | { runId: string; output: WorkflowOutput }
  | { runId: string; error: AgentError };

// 1. API receives input, creates run record, enqueues job
async function submitWorkflow(input: WorkflowInput): Promise<string>;

// 2. Worker picks up job, initializes typed state
const initialState: WorkflowState = { runId, ...input, messages: [], tokenSpend: 0 };

// 3. LangGraph executes nodes, each returning new state
type AgentNode = (state: WorkflowState) => Promise<Partial<WorkflowState>>;

// 4. Zod validates final output before state update
const parsed = WorkflowOutputSchema.safeParse(rawOutput);

// 5. Result written to DB; client polls or receives webhook

Implementation Steps

Step 1: Project Setup

bash
mkdir agent-ts && cd agent-ts
npm init -y
npm install \
  @langchain/langgraph \
  @langchain/anthropic \
  @langchain/core \
  @anthropic-ai/sdk \
  openai \
  zod \
  bullmq \
  ioredis \
  langfuse \
  pino \
  dotenv

npm install -D typescript @types/node tsx
npx tsc --init
agent-ts/
├── src/
│   ├── agent/
│   │   ├── state.ts          # WorkflowState type + Annotation
│   │   ├── tools.ts          # Tool definitions + implementations
│   │   ├── llm.ts            # LLM client wrapper
│   │   ├── graph.ts          # LangGraph StateGraph
│   │   └── prompts/
│   │       └── system.md
│   ├── worker/
│   │   └── workflow.worker.ts
│   ├── api/
│   │   └── routes.ts
│   └── index.ts
├── tsconfig.json
└── .env
typescript
// src/agent/state.ts
import { Annotation } from '@langchain/langgraph';
import type { BaseMessage } from '@langchain/core/messages';

export interface WorkflowOutput {
  answer: string;
  confidence: number;
  sources: string[];
}

export const WorkflowAnnotation = Annotation.Root({
  runId: Annotation<string>(),
  userId: Annotation<string>(),
  userInput: Annotation<string>(),
  messages: Annotation<BaseMessage[]>({
    reducer: (existing, update) => existing.concat(update),
    default: () => [],
  }),
  toolResults: Annotation<Record<string, string>>({
    reducer: (existing, update) => ({ ...existing, ...update }),
    default: () => ({}),
  }),
  finalOutput: Annotation<WorkflowOutput | null>({ default: () => null }),
  tokenSpend: Annotation<number>({
    reducer: (existing, update) => existing + update,
    default: () => 0,
  }),
  error: Annotation<string | null>({ default: () => null }),
});

export type WorkflowState = typeof WorkflowAnnotation.State;

Step 2: Core Logic

typescript
// src/agent/tools.ts
import { z } from 'zod';
import { tool } from '@langchain/core/tools';

// Tool definition + implementation in one place
export const searchKnowledgeBase = tool(
  async ({ query }: { query: string }): Promise<string> => {
    const response = await fetch(`${process.env.KB_URL}/search`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, limit: 3 }),
      signal: AbortSignal.timeout(10_000),
    });

    if (!response.ok) {
      return `Search failed: HTTP ${response.status}`;
    }

    const results = await response.json() as Array<{ text: string }>;
    if (results.length === 0) return 'No results found.';
    return results.map(r => r.text).join('\n\n---\n\n');
  },
  {
    name: 'search_knowledge_base',
    description: 'Search the internal knowledge base for relevant information.',
    schema: z.object({
      query: z.string().describe('Natural language search query'),
    }),
  }
);

export const createSupportTicket = tool(
  async ({ title, description, priority }: {
    title: string;
    description: string;
    priority: 'low' | 'medium' | 'high' | 'critical';
  }): Promise<string> => {
    const response = await fetch(`${process.env.TICKETING_URL}/tickets`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ title, description, priority }),
      signal: AbortSignal.timeout(10_000),
    });

    if (!response.ok) {
      return `Ticket creation failed: HTTP ${response.status}`;
    }

    const ticket = await response.json() as { id: string };
    return `Ticket created: ${ticket.id}`;
  },
  {
    name: 'create_support_ticket',
    description: 'Create a support ticket for user-reported issues.',
    schema: z.object({
      title: z.string().max(100).describe('Brief ticket title'),
      description: z.string().describe('Detailed description'),
      priority: z.enum(['low', 'medium', 'high', 'critical']),
    }),
  }
);

export const TOOLS = [searchKnowledgeBase, createSupportTicket];
typescript
// src/agent/llm.ts
import { ChatAnthropic } from '@langchain/anthropic';
import { TOOLS } from './tools';

export function createLLM() {
  const llm = new ChatAnthropic({
    model: process.env.LLM_MODEL ?? 'claude-3-5-sonnet-20241022',
    temperature: 0,
    maxTokens: 4096,
    // Timeout in ms — never rely on default
    timeout: Number(process.env.LLM_TIMEOUT_MS ?? 30_000),
  });

  return llm.bindTools(TOOLS);
}

Step 3: Integration

typescript
// src/agent/graph.ts
import { StateGraph, END } from '@langchain/langgraph';
import { ToolNode } from '@langchain/langgraph/prebuilt';
import { HumanMessage, SystemMessage } from '@langchain/core/messages';
import { WorkflowAnnotation, WorkflowState } from './state';
import { createLLM } from './llm';
import { TOOLS } from './tools';
import { readFileSync } from 'fs';
import { join } from 'path';
import { z } from 'zod';

const SYSTEM_PROMPT = readFileSync(
  join(__dirname, 'prompts/system.md'),
  'utf-8'
);

const WorkflowOutputSchema = z.object({
  answer: z.string(),
  confidence: z.number().min(0).max(1),
  sources: z.array(z.string()),
});

const llm = createLLM();

async function agentNode(state: WorkflowState): Promise<Partial<WorkflowState>> {
  // Seed the history with the user's input on the first turn only;
  // later turns already carry it in the accumulated messages.
  const newMessages = state.messages.length === 0
    ? [new HumanMessage(state.userInput)]
    : [];

  const messages = [
    new SystemMessage(SYSTEM_PROMPT),
    ...state.messages,
    ...newMessages,
  ];

  const response = await llm.invoke(messages);
  const tokenSpend = response.usage_metadata?.total_tokens ?? 0;

  return {
    messages: [...newMessages, response],
    tokenSpend,
  };
}

async function outputNode(state: WorkflowState): Promise<Partial<WorkflowState>> {
  // Parse the last assistant message as structured output
  const lastMessage = state.messages[state.messages.length - 1];
  const content = typeof lastMessage.content === 'string'
    ? lastMessage.content
    : JSON.stringify(lastMessage.content);

  try {
    const parsed = WorkflowOutputSchema.parse(JSON.parse(content));
    return { finalOutput: parsed };
  } catch {
    // Fallback: wrap free-text response in the expected structure
    return {
      finalOutput: {
        answer: content,
        confidence: 0.7,
        sources: [],
      },
    };
  }
}

function shouldContinue(state: WorkflowState): 'tools' | 'output' {
  const lastMessage = state.messages[state.messages.length - 1];
  const hasToolCalls = 'tool_calls' in lastMessage &&
    Array.isArray((lastMessage as any).tool_calls) &&
    (lastMessage as any).tool_calls.length > 0;
  return hasToolCalls ? 'tools' : 'output';
}

const toolNode = new ToolNode(TOOLS);

const graph = new StateGraph(WorkflowAnnotation)
  .addNode('agent', agentNode)
  .addNode('tools', toolNode)
  .addNode('output', outputNode)
  .addEdge('__start__', 'agent')
  .addConditionalEdges('agent', shouldContinue)
  .addEdge('tools', 'agent')
  .addEdge('output', END);

export const workflow = graph.compile();


Code Examples

Basic Implementation

A minimal typed agent without a framework — direct SDK usage with tool calling:

typescript
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

const SentimentSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  confidence: z.number().min(0).max(1),
  keyPhrases: z.array(z.string()),
});

type SentimentResult = z.infer<typeof SentimentSchema>;

async function analyzeSentiment(text: string): Promise<SentimentResult> {
  const response = await client.messages.create({
    model: 'claude-3-5-haiku-20241022',
    max_tokens: 512,
    system: `Analyze the sentiment of the provided text.
Respond with JSON matching exactly:
{"sentiment":"positive|negative|neutral","confidence":0.0-1.0,"keyPhrases":["..."]}
Respond ONLY with the JSON object.`,
    messages: [{ role: 'user', content: text }],
  });

  const raw = response.content[0];
  if (raw.type !== 'text') throw new Error('Unexpected response type');

  return SentimentSchema.parse(JSON.parse(raw.text));
}

Advanced Patterns

Multi-step agent with typed state transitions:

typescript
import { StateGraph, Annotation, END } from '@langchain/langgraph';
import { SystemMessage, HumanMessage } from '@langchain/core/messages';

// Assumes `llm` (e.g. from createLLM) and a `performSearch` helper
// are defined elsewhere in the module.

const ResearchAnnotation = Annotation.Root({
  query: Annotation<string>(),
  searchResults: Annotation<string[]>({
    reducer: (a, b) => [...a, ...b],
    default: () => [],
  }),
  synthesis: Annotation<string | null>({ default: () => null }),
  currentStep: Annotation<'search' | 'synthesize' | 'done'>({
    default: () => 'search',
  }),
});

type ResearchState = typeof ResearchAnnotation.State;

// Each node is a pure async function on typed state
async function searchNode(state: ResearchState): Promise<Partial<ResearchState>> {
  const results = await performSearch(state.query);
  return {
    searchResults: results,
    currentStep: 'synthesize',
  };
}

async function synthesizeNode(state: ResearchState): Promise<Partial<ResearchState>> {
  const response = await llm.invoke([
    new SystemMessage('Synthesize the search results into a clear answer.'),
    new HumanMessage(
      `Query: ${state.query}\n\nResults:\n${state.searchResults.join('\n\n')}`
    ),
  ]);
  return {
    synthesis: response.content as string,
    currentStep: 'done',
  };
}

function routeStep(state: ResearchState): string {
  return state.currentStep === 'done' ? END : state.currentStep;
}

const researchGraph = new StateGraph(ResearchAnnotation)
  .addNode('search', searchNode)
  .addNode('synthesize', synthesizeNode)
  .addEdge('__start__', 'search')
  .addConditionalEdges('search', routeStep)
  .addConditionalEdges('synthesize', routeStep)
  .compile();

Parallel tool execution with typed results:

typescript
import pLimit from 'p-limit';
import { TOOLS } from './tools';

interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  toolCallId: string;
  content: string;
}

const TOOL_MAP = new Map(TOOLS.map(t => [t.name, t]));
const limit = pLimit(5); // max 5 concurrent tool calls

async function executeToolCallsParallel(toolCalls: ToolCall[]): Promise<ToolResult[]> {
  return Promise.all(
    toolCalls.map(call =>
      limit(async (): Promise<ToolResult> => {
        const tool = TOOL_MAP.get(call.name);
        if (!tool) {
          return { toolCallId: call.id, content: `Unknown tool: ${call.name}` };
        }

        try {
          const result = await (tool as any).invoke(call.args);
          return { toolCallId: call.id, content: String(result) };
        } catch (err) {
          return {
            toolCallId: call.id,
            content: `Tool error: ${err instanceof Error ? err.message : 'unknown'}`,
          };
        }
      })
    )
  );
}

Production Hardening

typescript
// src/worker/workflow.worker.ts
import { Worker, Job } from 'bullmq';
import { workflow } from '../agent/graph';
import { WorkflowState } from '../agent/state';
import { pino } from 'pino';
import { randomUUID } from 'crypto';

const logger = pino({ level: process.env.LOG_LEVEL ?? 'info' });
const MAX_TOKEN_BUDGET = Number(process.env.MAX_TOKEN_BUDGET ?? 50_000);
const WORKFLOW_TIMEOUT_MS = Number(process.env.WORKFLOW_TIMEOUT_MS ?? 120_000);

interface WorkflowJobData {
  userId: string;
  userInput: string;
}

async function processWorkflow(job: Job<WorkflowJobData>): Promise<unknown> {
  const runId = randomUUID();
  const startMs = Date.now();

  logger.info({ runId, jobId: job.id, userId: job.data.userId }, 'workflow_start');

  const initialState: Partial<WorkflowState> = {
    runId,
    userId: job.data.userId,
    userInput: job.data.userInput,
  };

  let result: WorkflowState;

  try {
    result = await Promise.race([
      workflow.invoke(initialState),
      new Promise<never>((_, reject) =>
        setTimeout(() => reject(new Error('WORKFLOW_TIMEOUT')), WORKFLOW_TIMEOUT_MS)
      ),
    ]) as WorkflowState;
  } catch (err) {
    const errorMessage = err instanceof Error ? err.message : 'unknown_error';
    logger.error({ runId, error: errorMessage }, 'workflow_failed');
    throw err; // BullMQ will retry based on job options
  }

  const durationMs = Date.now() - startMs;

  if (result.tokenSpend > MAX_TOKEN_BUDGET) {
    logger.warn(
      { runId, tokenSpend: result.tokenSpend, budget: MAX_TOKEN_BUDGET },
      'token_budget_exceeded'
    );
  }

  logger.info(
    { runId, durationMs, tokenSpend: result.tokenSpend, hasError: !!result.error },
    'workflow_complete'
  );

  return {
    runId,
    output: result.finalOutput,
    error: result.error,
    meta: { durationMs, tokenSpend: result.tokenSpend },
  };
}

const worker = new Worker<WorkflowJobData>(
  'agent-workflows',
  processWorkflow,
  {
    connection: { host: process.env.REDIS_HOST, port: Number(process.env.REDIS_PORT) },
    concurrency: Number(process.env.WORKER_CONCURRENCY ?? 5),
  }
);

worker.on('failed', (job, err) => {
  logger.error({ jobId: job?.id, error: err.message }, 'job_failed');
});

Performance Considerations

Latency Optimization

Streaming for user-facing responses. When displaying agent output in real time, use the Vercel AI SDK's streamText or LangChain.js's streamEvents. Perceived latency drops significantly when users see tokens appear instead of waiting for the complete response.

typescript
// Next.js App Router route with streaming (Vercel AI SDK)
import { streamText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
// SYSTEM_PROMPT and searchKnowledgeBase come from the agent module

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: anthropic('claude-3-5-sonnet-20241022'),
    system: SYSTEM_PROMPT,
    messages,
    tools: {
      search: {
        description: 'Search the knowledge base',
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => searchKnowledgeBase.invoke({ query }),
      },
    },
    maxSteps: 10, // max tool call iterations
  });

  return result.toDataStreamResponse();
}

Parallel tool execution. The ToolNode in LangGraph.js executes tool calls sequentially by default. Replace it with a custom parallel implementation using Promise.all when tool calls in a single turn are independent.

Model routing per step. Route simpler intermediate steps (classification, extraction from short text) to claude-3-5-haiku-20241022 or gpt-4o-mini. Reserve Sonnet/GPT-4o for steps requiring deep reasoning. This reduces p50 latency by 40–60% and cost by a similar margin for mixed workloads.
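One way to sketch that routing. The step names and the mapping are illustrative assumptions, not a prescribed scheme:

```typescript
// Illustrative per-step model router. Step names and the haiku/sonnet
// split are assumptions for the sketch; adapt to your workflow's steps.
type StepKind = 'classify' | 'extract' | 'reason' | 'synthesize';

const MODEL_BY_STEP: Record<StepKind, string> = {
  classify: 'claude-3-5-haiku-20241022',    // cheap, fast
  extract: 'claude-3-5-haiku-20241022',     // cheap, fast
  reason: 'claude-3-5-sonnet-20241022',     // deep reasoning
  synthesize: 'claude-3-5-sonnet-20241022', // deep reasoning
};

function modelForStep(step: StepKind): string {
  return MODEL_BY_STEP[step];
}
```

Keeping the mapping in one table makes the cost/latency tradeoff visible and easy to tune per deployment.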

Memory Management

Node.js single-process LLM workers can accumulate memory through context window growth. Key mitigations:

Truncate tool results. Cap the string length of tool results before adding to the message history. A tool that returns a 500KB JSON blob will bloat every subsequent LLM call's context.

typescript
const MAX_TOOL_RESULT_LENGTH = 4000;

function truncateToolResult(result: string): string {
  if (result.length <= MAX_TOOL_RESULT_LENGTH) return result;
  return (
    result.slice(0, MAX_TOOL_RESULT_LENGTH) +
    `\n\n[... truncated ${result.length - MAX_TOOL_RESULT_LENGTH} chars]`
  );
}

Use separate BullMQ workers per concurrency group. Long workflows on the same worker process as short workflows create head-of-line blocking. Separate queues for fast (< 10s) and standard (> 10s) workflows allow independent scaling.
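A sketch of the routing decision for that split. The queue names and the 10-second cutoff are illustrative; the actual enqueue still goes through BullMQ:

```typescript
// Illustrative queue selection for a fast/standard split. The names
// 'agent-fast' / 'agent-standard' and the 10s cutoff are assumptions.
const FAST_CUTOFF_MS = 10_000;

type QueueName = 'agent-fast' | 'agent-standard';

function queueFor(estimatedDurationMs: number): QueueName {
  return estimatedDurationMs < FAST_CUTOFF_MS ? 'agent-fast' : 'agent-standard';
}

// Each queue gets its own Worker process, so a 2-minute research workflow
// never blocks a 3-second classification behind it.
```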

Set Node.js --max-old-space-size explicitly. Default heap size in Node.js is environment-dependent and often lower than needed for concurrent agentic workflows with large contexts. Set --max-old-space-size=4096 (4GB) explicitly for worker processes.

Load Testing

typescript
// k6 load test — agent workflow endpoint
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  scenarios: {
    steady_load: {
      executor: 'ramping-vus',
      startVUs: 1,
      stages: [
        { duration: '2m', target: 10 },
        { duration: '5m', target: 20 },
        { duration: '2m', target: 0 },
      ],
    },
  },
  thresholds: {
    http_req_duration: ['p(95)<5000'], // 95% under 5s (submission, not completion)
    http_req_failed: ['rate<0.01'],    // <1% HTTP errors
  },
};

const TEST_INPUTS = [
  'What is the return policy?',
  'Create a ticket for payment processing failure',
  'How do I upgrade my subscription?',
];

export default function () {
  const payload = JSON.stringify({
    userInput: TEST_INPUTS[Math.floor(Math.random() * TEST_INPUTS.length)],
  });

  const response = http.post(
    `${__ENV.BASE_URL}/api/workflow`,
    payload,
    { headers: { 'Content-Type': 'application/json' } }
  );

  check(response, {
    'status 202': (r) => r.status === 202,
    'has run_id': (r) => JSON.parse(r.body as string).runId !== undefined,
  });

  sleep(1);
}

Testing Strategy

Unit Tests

Test tool implementations independently — these should be fast, deterministic, and have no LLM dependency:

typescript
// src/agent/tools.test.ts (Vitest)
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { searchKnowledgeBase } from './tools';

describe('searchKnowledgeBase', () => {
  beforeEach(() => {
    vi.stubGlobal('fetch', vi.fn());
  });

  it('returns formatted results', async () => {
    vi.mocked(fetch).mockResolvedValueOnce({
      ok: true,
      json: async () => [{ text: 'Policy text here' }, { text: 'More context' }],
    } as Response);

    const result = await searchKnowledgeBase.invoke({ query: 'return policy' });
    expect(result).toContain('Policy text here');
    expect(result).toContain('More context');
  });

  it('handles empty results gracefully', async () => {
    vi.mocked(fetch).mockResolvedValueOnce({
      ok: true,
      json: async () => [],
    } as Response);

    const result = await searchKnowledgeBase.invoke({ query: 'nonexistent topic' });
    expect(result).toBe('No results found.');
  });

  it('handles HTTP errors', async () => {
    vi.mocked(fetch).mockResolvedValueOnce({
      ok: false,
      status: 503,
    } as Response);

    const result = await searchKnowledgeBase.invoke({ query: 'test' });
    expect(result).toMatch(/failed.*503/i);
  });
});

Integration Tests

Test the graph with mocked LLM calls:

typescript
// src/agent/graph.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';

// vi.mock calls are hoisted above imports, so the mock function must be
// created with vi.hoisted to be visible inside the factory.
const { mockInvoke } = vi.hoisted(() => ({ mockInvoke: vi.fn() }));

vi.mock('@langchain/anthropic', () => ({
  ChatAnthropic: vi.fn().mockImplementation(() => ({
    bindTools: vi.fn().mockReturnThis(),
    invoke: mockInvoke,
  })),
}));

import { workflow } from './graph';

const MOCK_FINAL_RESPONSE = {
  content: JSON.stringify({
    answer: 'Returns are accepted within 30 days.',
    confidence: 0.95,
    sources: ['knowledge-base-chunk-42'],
  }),
  tool_calls: [],
  usage_metadata: { total_tokens: 350 },
};

describe('workflow graph', () => {
  beforeEach(() => {
    mockInvoke.mockReset();
  });

  it('completes successfully with mocked LLM', async () => {
    mockInvoke.mockResolvedValue(MOCK_FINAL_RESPONSE);

    const result = await workflow.invoke({
      runId: 'test-run-1',
      userId: 'user-123',
      userInput: 'What is the return policy?',
    });

    expect(result.finalOutput).not.toBeNull();
    expect(result.finalOutput?.answer).toContain('30 days');
    expect(result.error).toBeNull();
  });

  it('handles tool calls in the loop', async () => {
    // First call returns a tool call; second returns the final response
    mockInvoke
      .mockResolvedValueOnce({
        content: '',
        tool_calls: [{ id: 'tc1', name: 'search_knowledge_base', args: { query: 'returns' } }],
        usage_metadata: { total_tokens: 200 },
      })
      .mockResolvedValueOnce(MOCK_FINAL_RESPONSE);

    const result = await workflow.invoke({
      runId: 'test-run-2',
      userId: 'user-123',
      userInput: 'What is the return policy?',
    });

    expect(mockInvoke).toHaveBeenCalledTimes(2);
    expect(result.finalOutput).not.toBeNull();
  });
});

End-to-End Validation

For E2E validation, record real LLM interactions as fixtures and replay them:

typescript
// scripts/record-cassette.ts — run once, commit the output
import { workflow } from '../src/agent/graph';
import { writeFileSync } from 'fs';

// Set up an interceptor that records all HTTP calls to the LLM API,
// then replay from the recorded fixture in tests

async function record() {
  const calls: unknown[] = [];
  const originalFetch = global.fetch;

  global.fetch = (async (url: any, init?: any) => {
    const response = await originalFetch(url, init);
    const body = await response.clone().json();
    calls.push({ url, requestBody: init?.body, responseBody: body });
    return response;
  }) as typeof fetch;

  await workflow.invoke({
    runId: 'cassette-1',
    userId: 'test-user',
    userInput: 'What is the cancellation policy?',
  });

  writeFileSync('src/agent/__fixtures__/cancellation-query.json', JSON.stringify(calls, null, 2));
  global.fetch = originalFetch;
}

record().catch(console.error);

Conclusion

TypeScript's genuine advantage for agentic AI is not ecosystem size — Python still leads there — but compile-time safety for workflow state and tool contracts. Zod schemas at LLM output boundaries give you runtime validation that TypeScript's erased types cannot provide. Discriminated unions for agent actions make illegal states unrepresentable. And the async-first nature of Node.js aligns naturally with the I/O-bound reality of LLM orchestration.

The practical path forward: define your workflow state as an immutable typed interface, validate every LLM output with Zod before it touches application logic, separate tool definitions from implementations so both are independently testable, and run long workflows through BullMQ rather than synchronous request-response cycles. Use LangGraph.js when you need stateful multi-step graphs with checkpointing; use the Vercel AI SDK when you need streaming responses in a Next.js context. The TypeScript agentic ecosystem is younger than Python's, but the type system advantage compounds — every bug caught at compile time is a production incident you never have to diagnose.



Muneer Puthiya Purayil

SaaS Architect & AI Systems Engineer. 10+ years shipping production infrastructure across fintech, automotive, e-commerce, and healthcare.
