When LangChain launched in 2022, it became the default choice for building LLM-powered applications. Then LlamaIndex emerged as the go-to for retrieval-augmented generation (RAG). In 2026, many production teams are quietly abandoning both in favor of thin wrappers around the raw Anthropic and OpenAI SDKs.
This is not a story of bad libraries. It is a story of the ecosystem maturing to a point where framework abstraction adds more complexity than it removes. This post explains when each approach is genuinely the right choice.
The Core Tradeoff
[Diagram: abstraction tradeoff. LangChain and LlamaIndex sit at the more-abstraction end (batteries included, faster start); the raw SDK plus custom utilities sits at the less-abstraction end (more control, fewer bugs).]
LangChain: When It Shines, When It Doesn't
LangChain provides chains, agents, memory, and tool abstractions that connect LLM calls into complex pipelines.
Where LangChain excels:
- Rapid prototyping of multi-step LLM pipelines.
- Pre-built integrations with 100+ vector stores, document loaders, and tools.
- LCEL (LangChain Expression Language) for composable, observable chains.
Where it becomes a liability:
- Abstraction leakage: When something goes wrong, debugging through LangChain's layers is painful.
- Version instability: LangChain's API has changed significantly multiple times since launch.
- Overhead: For simple use cases, you're importing thousands of lines for a single chat() call.
// LangChain approach
import { ChatAnthropic } from '@langchain/anthropic';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const model = new ChatAnthropic({ model: 'claude-sonnet-4-5' });
const prompt = ChatPromptTemplate.fromMessages([
['system', 'You are a helpful assistant.'],
['user', '{input}'],
]);
const chain = prompt.pipe(model).pipe(new StringOutputParser());
const result = await chain.invoke({ input: 'Hello!' });
// Raw SDK equivalent: simpler, no abstraction
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-sonnet-4-5',
max_tokens: 1024,
system: 'You are a helpful assistant.', // same system prompt as the LangChain version
messages: [{ role: 'user', content: 'Hello!' }],
});
const result = response.content[0].type === 'text' ? response.content[0].text : '';
For simple cases, the raw SDK is less code and more readable.
LlamaIndex: RAG Done Right
LlamaIndex specializes in document indexing and retrieval-augmented generation. If your application needs to answer questions over a corpus of documents, LlamaIndex's abstractions genuinely save significant implementation effort.
// LlamaIndex: Full RAG pipeline in ~20 lines
import {
VectorStoreIndex,
SimpleDirectoryReader,
OpenAIEmbedding,
Settings,
} from 'llamaindex';
Settings.embedModel = new OpenAIEmbedding({ model: 'text-embedding-3-small' });
// Load documents
const reader = new SimpleDirectoryReader();
const documents = await reader.loadData('./docs');
// Index documents
const index = await VectorStoreIndex.fromDocuments(documents);
// Query
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
query: 'What are the main features of the product?',
});
Implementing this from scratch with a raw SDK would require:
- Chunking documents into segments.
- Generating embeddings for each chunk.
- Storing embeddings in a vector database.
- Implementing semantic search at query time.
- Injecting retrieved chunks into the LLM context.
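To give a sense of the work behind each step above, here is a minimal, dependency-free sketch in TypeScript. It is an illustration under loud assumptions, not a production design: `embed` is a toy term-frequency stand-in for a real embedding model, the "vector database" is an in-memory array, and the names `chunkText`, `buildIndex`, `retrieve`, and `buildPrompt` are invented for this example.

```typescript
// ASSUMPTION: a sparse term-frequency vector stands in for a real embedding.
type Vector = Map<string, number>;

interface IndexedChunk {
  text: string;
  vector: Vector;
}

// Step 1: split a document into overlapping, word-based chunks.
function chunkText(text: string, chunkSize = 50, overlap = 10): string[] {
  if (overlap >= chunkSize) throw new Error('overlap must be smaller than chunkSize');
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= words.length) break; // last window reached the end
  }
  return chunks;
}

// Step 2 (toy): count word occurrences. A real pipeline would call an embedding API here.
function embed(text: string): Vector {
  const vector: Vector = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vector.set(word, (vector.get(word) ?? 0) + 1);
  }
  return vector;
}

function cosineSimilarity(a: Vector, b: Vector): number {
  let dot = 0;
  let sumSqA = 0;
  let sumSqB = 0;
  a.forEach((weight, term) => {
    dot += weight * (b.get(term) ?? 0);
    sumSqA += weight * weight;
  });
  b.forEach((weight) => {
    sumSqB += weight * weight;
  });
  const denominator = Math.sqrt(sumSqA) * Math.sqrt(sumSqB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Step 3: "store" the embeddings. Here the store is just an array in memory.
function buildIndex(documents: string[], chunkSize = 50, overlap = 10): IndexedChunk[] {
  const index: IndexedChunk[] = [];
  for (const doc of documents) {
    for (const text of chunkText(doc, chunkSize, overlap)) {
      index.push({ text, vector: embed(text) });
    }
  }
  return index;
}

// Step 4: rank every chunk against the query and keep the top K.
function retrieve(index: IndexedChunk[], query: string, topK = 3): string[] {
  const queryVector = embed(query);
  return index
    .map((chunk) => ({ text: chunk.text, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((hit) => hit.text);
}

// Step 5: inject the retrieved chunks into the prompt sent to the LLM.
function buildPrompt(question: string, context: string[]): string {
  return [
    'Answer the question using only the context below.',
    '',
    ...context.map((text, i) => `[${i + 1}] ${text}`),
    '',
    `Question: ${question}`,
  ].join('\n');
}
```

Even this toy version skips what a real pipeline needs: an actual embedding model, persistent storage, document parsing, and metadata handling.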
For RAG specifically, LlamaIndex's abstractions are worth the dependency.
Raw SDK + Custom Utilities: The Production Default
In 2026, many teams building serious production AI features find themselves writing thin utility layers around the raw Anthropic or OpenAI SDK:
// lib/ai/client.ts — a minimal, typed AI client
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
export interface ChatMessage {
role: 'user' | 'assistant';
content: string;
}
export async function chat(
messages: ChatMessage[],
options: {
system?: string;
model?: string;
maxTokens?: number;
temperature?: number;
} = {}
): Promise<string> {
const response = await client.messages.create({
model: options.model ?? 'claude-sonnet-4-5',
max_tokens: options.maxTokens ?? 2048,
system: options.system,
temperature: options.temperature, // previously declared in options but never forwarded
messages,
});
const textBlock = response.content.find(b => b.type === 'text');
if (!textBlock || textBlock.type !== 'text') {
throw new Error('No text content in response');
}
return textBlock.text;
}
export async function chatWithTools(
messages: ChatMessage[],
tools: Anthropic.Tool[],
options: { system?: string; model?: string } = {}
): Promise<{ message: string; toolCalls: Anthropic.ToolUseBlock[] }> {
const response = await client.messages.create({
model: options.model ?? 'claude-sonnet-4-5',
max_tokens: 4096,
system: options.system,
messages,
tools,
});
const textBlocks = response.content.filter(b => b.type === 'text') as Anthropic.TextBlock[];
const toolCalls = response.content.filter(b => b.type === 'tool_use') as Anthropic.ToolUseBlock[];
return {
message: textBlocks.map(b => b.text).join(''),
toolCalls,
};
}
This 50-line file replaces LangChain for the 80% of use cases that don't need complex pipelines.
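One piece `chatWithTools` leaves to the caller is executing the returned tool calls. A possible sketch of that step follows. `runToolCalls`, `ToolHandler`, and the local `ToolCall` and `ToolResult` interfaces are hypothetical names for this example; `ToolCall` mirrors only the `id`, `name`, and `input` fields of the SDK's tool-use block so the snippet runs without the SDK installed.

```typescript
// ASSUMPTION: minimal local mirror of the fields read from a tool-use block.
interface ToolCall {
  id: string;
  name: string;
  input: unknown;
}

// Shape of a tool_result content block, sent back to the model in a user message.
interface ToolResult {
  type: 'tool_result';
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// A handler receives the model-provided input and returns text for the model.
type ToolHandler = (input: unknown) => Promise<string> | string;

// Run every requested tool, turning unknown tools and thrown errors into
// error results instead of rejecting, so the model can see what went wrong.
async function runToolCalls(
  toolCalls: ToolCall[],
  handlers: Map<string, ToolHandler>
): Promise<ToolResult[]> {
  return Promise.all(
    toolCalls.map(async (call): Promise<ToolResult> => {
      const handler = handlers.get(call.name);
      if (!handler) {
        return {
          type: 'tool_result',
          tool_use_id: call.id,
          content: `Unknown tool: ${call.name}`,
          is_error: true,
        };
      }
      try {
        const content = await handler(call.input);
        return { type: 'tool_result', tool_use_id: call.id, content };
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        return { type: 'tool_result', tool_use_id: call.id, content: message, is_error: true };
      }
    })
  );
}
```

The returned blocks would go back to the model as the content of a follow-up user message, after the assistant turn that requested the tools. That means the `ChatMessage` type above, which only allows string content, would need to be widened to accept content blocks.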
Comparison Matrix
| Criterion | LangChain | LlamaIndex | Raw SDK |
|---|---|---|---|
| RAG pipelines | ✅ Good | ✅ Excellent | ⚠️ Manual work |
| Simple chat | ⚠️ Overkill | ❌ Not designed for | ✅ Best choice |
| Agentic tool use | ✅ Good | ⚠️ Limited | ✅ Full control |
| Debugging | ⚠️ Hard (abstraction layers) | ⚠️ Moderate | ✅ Easy |
| Bundle size | ❌ Large | ⚠️ Moderate | ✅ Minimal |
| API stability | ⚠️ Frequent breaking changes | ✅ Stable | ✅ Stable |
| Custom memory | ✅ Built-in | ⚠️ Limited | ✅ Full control |
Decision Framework
SHOULD I USE A FRAMEWORK?
Is your app primarily RAG (Q&A over documents)?
├── YES → LlamaIndex
└── NO
│
Do you need complex multi-step chains with many pre-built integrations?
├── YES → LangChain (but accept the maintenance cost)
└── NO
│
→ Raw SDK + custom utility layer
Conclusion
LangChain and LlamaIndex are not bad libraries — they are specialized tools that solve real problems in specific contexts. LlamaIndex is the right choice for RAG applications. LangChain provides value when you need its pre-built integrations and can accept its complexity overhead. For everything else — simple chat, agentic tool use, custom pipelines — the raw Anthropic or OpenAI SDK plus a small custom utility layer gives you more control, better debuggability, and a smaller bundle with less code. The default in 2026 should be to start with the raw SDK and reach for a framework only when you hit a wall it was specifically designed to solve.