Building AI Agents

AI/ML Engineering Track | Complexity: [COMPLEX] | Time: 6-8

Or: The Other Frameworks You Should Know About

Reading Time: 5-6 hours Prerequisites: Module 18

The Fork in the Road: When One Framework Isn’t Enough

San Francisco. December 15, 2022. 11:23 PM.

Jerry Liu was staring at his laptop in a cramped WeWork office, surrounded by takeout containers and energy drink cans. He’d just quit his job as a data platform engineer at Uber. His friends thought he was crazy—leaving a stable Big Tech job to build yet another AI framework.

“Everyone’s building chatbots,” his co-founder said. “We need to build something different.”

Jerry opened ChatGPT and typed: “What are the key findings from Uber’s 2022 financial report?” ChatGPT responded with its characteristic confidence, but the answer was completely wrong. It hallucinated numbers, invented executives, and fabricated entire initiatives.

“That’s the problem,” Jerry muttered. “These models don’t know anything about your data. They know the internet, not your documents.”

He created a new Python file and typed: # GPT Index - connecting LLMs to your data. Over the next 48 hours, barely sleeping, he wrote the first version of what would become LlamaIndex: a system to index your documents so LLMs could actually use them.

November 20, 2022: First commit to GitHub.

December 2022: 1,000 GitHub stars.

March 2023: 15,000 stars, $8.5 million seed round.

Today: 35,000+ stars, used by Notion, Stripe, and hundreds of enterprises.

What made LlamaIndex explode? A simple insight that changed everything: LLMs are commodities, but your data is your moat. Every company has the same ChatGPT. But only you have your documents, your knowledge, your domain expertise. The winning framework wouldn’t be the one that wrapped the best LLM—it would be the one that connected LLMs to proprietary data most effectively.

In this module, you’ll explore LlamaIndex and the ecosystem of alternative frameworks that have emerged. Each solves a different problem, embodies a different philosophy, and excels at different tasks. Understanding them all will help you pick the right tool for each job—because no single framework wins every battle.

What You’ll Be Able to Do

By the end of this module, you will:

Understand LlamaIndex’s architecture and when to use it
Compare LangChain vs LlamaIndex for different use cases
Explore alternative multi-agent frameworks (CrewAI, AutoGen)
Make informed framework selection decisions
Build integrations between frameworks

The Framework Landscape: A Map of the Territory

You’ve now mastered LangChain and LangGraph. But the AI framework ecosystem is rich with alternatives, each with different philosophies and strengths. Think of it like choosing a programming language—Python, JavaScript, and Rust all solve problems, but they make different trade-offs and excel in different domains.

Why This Module Matters

Different teams solve AI challenges differently, and those differences compound into distinct philosophies:

Framework	Philosophy	Best For
LangChain	Composable chains, flexibility	General-purpose LLM apps
LlamaIndex	Data-centric, indexing focus	RAG and knowledge systems
CrewAI	Role-based agents, simplicity	Multi-agent collaboration
AutoGen	Conversational agents	Research, complex dialogues
Semantic Kernel	Enterprise, Microsoft stack	.NET/Azure integration
Haystack	Search-first	Production search systems

It’s like the difference between a Swiss Army knife (LangChain), a specialized chef’s knife (LlamaIndex), a complete kitchen set (CrewAI), and a restaurant kitchen (AutoGen). Each has its place.

The Big Picture

                         AI Application Frameworks
                                   │
        ┌──────────────────────────┼──────────────────────────┐
        │                          │                          │
   LangChain/LangGraph         LlamaIndex                Multi-Agent
        │                          │                          │
   ┌────┴────┐               ┌────┴────┐               ┌────┴────┐
   │         │               │         │               │         │
 Chains    Agents          Index    Query            CrewAI  AutoGen
 Memory    Tools           RAG      Engine           Roles   Agents

LlamaIndex: The Data Framework for LLMs

What is LlamaIndex?

LlamaIndex (formerly GPT Index) is a data framework for building LLM applications. While LangChain focuses on chains and agents, LlamaIndex focuses on a different trinity:

Data Ingestion: Connect to any data source (files, databases, APIs)
Data Indexing: Structure data for efficient retrieval
Query Interface: Natural language access to your data

Think of LlamaIndex like a librarian. LangChain is the patron asking questions and taking actions. The librarian knows where every book is, understands how they’re organized, and can quickly find relevant information. Without the librarian, even the smartest patron would spend hours searching the stacks.

Core Philosophy

LlamaIndex’s philosophy: “Your data is your moat.”

“The LLM is commoditized. What differentiates AI applications is your proprietary data and how you leverage it.” — Jerry Liu, LlamaIndex founder

This philosophy manifests in practical design decisions. LlamaIndex makes it trivially easy to load data from 150+ sources, index it in various ways (vector, knowledge graph, tree), and query it with natural language. The framework handles the complexity of chunking, embedding, and retrieval so you can focus on your data.

LlamaIndex Architecture

┌─────────────────────────────────────────────────────────────┐
│                       LlamaIndex                             │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐              │
│  │  Data    │───▶│  Index   │───▶│  Query   │              │
│  │ Loaders  │    │  Types   │    │ Engines  │              │
│  └──────────┘    └──────────┘    └──────────┘              │
│       │               │               │                     │
│       ▼               ▼               ▼                     │
│  - PDF, CSV      - Vector        - RAG                     │
│  - Web, API      - Summary       - Chat                    │
│  - Database      - Knowledge     - SQL                     │
│  - Notion        - Tree          - Agent                   │
│                                                              │
└─────────────────────────────────────────────────────────────┘

The architecture is like a three-stage rocket: data loaders fuel the system, indexes organize it, and query engines extract value. Each stage is independent—you can swap out loaders, combine index types, or chain query engines without rebuilding the whole system.

Key Components

1. Data Connectors (Loaders)

LlamaIndex has 150+ data connectors—like USB adapters for every device:

from llama_index.core import SimpleDirectoryReader
from llama_index.readers.web import SimpleWebPageReader
from llama_index.readers.database import DatabaseReader

# Load from directory - handles PDF, DOCX, TXT, etc.
documents = SimpleDirectoryReader("./data").load_data()

# Load from web - scrapes and converts to text
web_docs = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://example.com/page1", "https://example.com/page2"]
)

# Load from database - runs SQL, converts rows to documents
db_reader = DatabaseReader(
    sql_database="postgresql://user:pass@host:5432/db"
)
documents = db_reader.load_data(query="SELECT * FROM articles")

2. Index Types

Different index types for different use cases—like different filing systems:

from llama_index.core import (
    VectorStoreIndex,
    SummaryIndex,
    TreeIndex,
    KeywordTableIndex
)

# Vector Index - best for semantic search
# Like organizing books by topic rather than title
vector_index = VectorStoreIndex.from_documents(documents)

# Summary Index - good for summarization
# Like having cliff notes for every book
summary_index = SummaryIndex.from_documents(documents)

# Tree Index - hierarchical organization
# Like organizing books by category > subcategory > title
tree_index = TreeIndex.from_documents(documents)

# Keyword Index - good for exact matches
# Like a traditional library card catalog
keyword_index = KeywordTableIndex.from_documents(documents)

3. Query Engines

Query engines process natural language queries—they’re the librarian answering your questions:

# Basic query engine
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic?")

# With retrieval settings
query_engine = index.as_query_engine(
    similarity_top_k=5,           # Return top 5 most relevant chunks
    response_mode="compact"       # Summarize results
)

# Chat engine (maintains conversation context)
chat_engine = index.as_chat_engine()
response = chat_engine.chat("Tell me about the document")
follow_up = chat_engine.chat("Can you elaborate?")  # Remembers context

LlamaIndex RAG Pipeline

Here’s a complete RAG pipeline in LlamaIndex—notice how concise it is:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

# 1. Load documents (handles chunking internally)
documents = SimpleDirectoryReader("./data").load_data()

# 2. Create index (embeds and stores)
index = VectorStoreIndex.from_documents(documents)

# 3. Create query engine
query_engine = index.as_query_engine(
    llm=OpenAI(model="gpt-5"),
    similarity_top_k=3
)

# 4. Query
response = query_engine.query("What are the key findings?")
print(response)

Compare this to LangChain—which requires explicit chunking, vector store setup, and chain configuration. LlamaIndex’s opinionated defaults handle the common case elegantly.

Advanced LlamaIndex Features

Sub-Question Query Engine

Complex queries often need to be broken down. Like a detective who solves a case by answering smaller questions first:

from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# Create tools from multiple indexes
tools = [
    QueryEngineTool(
        query_engine=financial_index.as_query_engine(),
        metadata=ToolMetadata(
            name="financial_data",
            description="Financial reports and metrics"
        )
    ),
    QueryEngineTool(
        query_engine=product_index.as_query_engine(),
        metadata=ToolMetadata(
            name="product_data",
            description="Product documentation"
        )
    )
]

# Sub-question engine decomposes complex queries automatically
query_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=tools)
response = query_engine.query(
    "How did Q3 product launches affect revenue?"
)
# Internally: "What products launched in Q3?" + "What was Q3 revenue?" → synthesis

Router Query Engine

For systems with multiple specialized indexes—like a receptionist directing calls:

from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),  # LLM decides which index to use
    query_engine_tools=[
        QueryEngineTool(
            query_engine=technical_index.as_query_engine(),
            metadata=ToolMetadata(
                name="technical",
                description="Technical documentation and code"
            )
        ),
        QueryEngineTool(
            query_engine=business_index.as_query_engine(),
            metadata=ToolMetadata(
                name="business",
                description="Business processes and policies"
            )
        )
    ]
)

Knowledge Graph Index

Build and query knowledge graphs—like connecting dots in a mystery novel:

from llama_index.core import KnowledgeGraphIndex

# Create knowledge graph from documents
kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    max_triplets_per_chunk=3,    # Extract relationships
    include_embeddings=True      # Also enable semantic search
)

# Query with graph traversal
query_engine = kg_index.as_query_engine(
    include_text=True,
    retriever_mode="keyword",
    response_mode="tree_summarize"
)

LangChain vs LlamaIndex: A Deep Comparison

Philosophy Comparison

Aspect	LangChain	LlamaIndex
Core Focus	Chains, agents, tools	Data indexing, retrieval
Mental Model	”Building blocks for LLM apps"	"Data framework for LLMs”
Strength	Flexibility, agent patterns	RAG, knowledge management
Complexity	Higher learning curve	More opinionated, simpler
Use Case	General-purpose	Data-intensive apps

Think of it this way: LangChain is like a workshop with every tool imaginable—you can build anything, but you need to know how to use each tool. LlamaIndex is like a specialized RAG factory—it does one thing extremely well with minimal configuration.

When to Use Each

Choose LangChain when:

Building complex agent systems with multiple tools
Need maximum flexibility in architecture
Require stateful workflows (with LangGraph)
Integrating many external tools beyond documents
Building chatbots with complex conversational logic

Choose LlamaIndex when:

RAG is your primary use case
Working with many heterogeneous data sources
Need sophisticated indexing strategies (graph, tree, summary)
Building knowledge-intensive applications
Want simpler RAG setup with sensible defaults

Code Comparison: Basic RAG

LangChain RAG (explicit, flexible):

from langchain_community.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import RetrievalQA

# Load and split - you control every step
loader = DirectoryLoader("./data")
documents = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
chunks = splitter.split_documents(documents)

# Create vector store - explicit embedding configuration
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

# Create chain - wire up retriever and LLM
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=vectorstore.as_retriever()
)

# Query
response = qa_chain.invoke("What is the main topic?")

LlamaIndex RAG (opinionated, concise):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load and index - handles splitting internally with smart defaults
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query - one line
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic?")

Observation: LlamaIndex is more concise for basic RAG—10 lines vs 20. LangChain offers more control over each step. Neither is “better”—they’re optimized for different priorities.

Using Both Together

Here’s the secret: they’re not mutually exclusive! Use LlamaIndex for data management, LangChain for agents:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain.tools import Tool

# LlamaIndex for indexing - leverage its data handling
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Wrap as LangChain tool - bridge the frameworks
def search_docs(query: str) -> str:
    response = query_engine.query(query)
    return str(response)

search_tool = Tool(
    name="document_search",
    func=search_docs,
    description="Search internal documents for information"
)

# Use in LangChain agent - leverage its agent patterns
agent = create_tool_calling_agent(llm, [search_tool], prompt)
executor = AgentExecutor(agent=agent, tools=[search_tool])

CrewAI: Role-Based Multi-Agent

What is CrewAI?

CrewAI is a framework for orchestrating role-playing AI agents. Its philosophy is based on human team dynamics—instead of thinking about graphs and states, you think about roles, responsibilities, and collaboration.

Think of CrewAI like directing a film. You cast actors (agents) for specific roles, give them their motivation and backstory, and let them work together to create something greater than any individual contribution.

Core Concepts

┌─────────────────────────────────────────────────────────────┐
│                         CrewAI                               │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐                 │
│  │  Agent  │    │  Task   │    │  Crew   │                 │
│  │         │    │         │    │         │                 │
│  │ - Role  │    │ - Goal  │    │ - Team  │                 │
│  │ - Goal  │    │ - Agent │    │ - Tasks │                 │
│  │ - Tools │    │ - Output│    │ - Flow  │                 │
│  └─────────┘    └─────────┘    └─────────┘                 │
│                                                              │
└─────────────────────────────────────────────────────────────┘

CrewAI Example

Here’s how you’d build a content creation team:

from crewai import Agent, Task, Crew, Process

# Define agents with roles - like casting actors
researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive information about AI frameworks",
    backstory="You are an expert at finding and analyzing technical information.",
    tools=[search_tool, web_scraper],
    verbose=True
)

writer = Agent(
    role="Technical Writer",
    goal="Create clear, engaging technical content",
    backstory="You specialize in making complex topics accessible.",
    verbose=True
)

reviewer = Agent(
    role="Quality Reviewer",
    goal="Ensure content accuracy and clarity",
    backstory="You have a keen eye for errors and unclear explanations.",
    verbose=True
)

# Define tasks - like scenes in a script
research_task = Task(
    description="Research the latest AI framework developments in 2024",
    agent=researcher,
    expected_output="Comprehensive research notes with sources"
)

writing_task = Task(
    description="Write a blog post based on the research",
    agent=writer,
    expected_output="1500-word blog post in markdown",
    context=[research_task]  # Uses output from research
)

review_task = Task(
    description="Review and improve the blog post",
    agent=reviewer,
    expected_output="Edited blog post with improvements",
    context=[writing_task]
)

# Create crew - like assembling the production
crew = Crew(
    agents=[researcher, writer, reviewer],
    tasks=[research_task, writing_task, review_task],
    process=Process.sequential  # or Process.hierarchical
)

# Run the crew - action!
result = crew.kickoff()

CrewAI Strengths

Intuitive Role-Based Design: Matches how humans think about teams
Built-in Collaboration: Agents naturally hand off work via context
Simple API: Less boilerplate than LangGraph for common patterns
Memory and Learning: Agents remember past interactions

CrewAI vs LangGraph

Aspect	CrewAI	LangGraph
Mental Model	Human teams, roles	State machines, graphs
Flexibility	More opinionated	More flexible
Learning Curve	Lower	Higher
Complex Workflows	Limited	Excellent
Customization	Moderate	High

Choose CrewAI for quick prototypes and role-based tasks. Choose LangGraph when you need precise control over state and complex conditional logic.

AutoGen: Conversational Agents

What is AutoGen?

AutoGen (Microsoft Research) focuses on conversational multi-agent systems where agents communicate through chat. It’s like a group chat where AI participants discuss, debate, and collaborate.

Core Concept

┌─────────────────────────────────────────────────────────────┐
│                         AutoGen                              │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│    Agent A ◄─────── Messages ───────► Agent B               │
│       │                                   │                 │
│       ▼                                   ▼                 │
│  "I'll research..."              "Based on that..."         │
│       │                                   │                 │
│       └──────────► Agent C ◄──────────────┘                 │
│                "Let me synthesize..."                       │
│                                                              │
└─────────────────────────────────────────────────────────────┘

AutoGen Example

from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

# Create agents - each with a persona
assistant = AssistantAgent(
    name="assistant",
    llm_config={"model": "gpt-5"},
    system_message="You are a helpful AI assistant."
)

coder = AssistantAgent(
    name="coder",
    llm_config={"model": "gpt-5"},
    system_message="You write Python code to solve problems."
)

critic = AssistantAgent(
    name="critic",
    llm_config={"model": "gpt-5"},
    system_message="You review code and suggest improvements."
)

# User proxy for human-in-the-loop
user_proxy = UserProxyAgent(
    name="user",
    human_input_mode="TERMINATE",  # or "ALWAYS" for full control
    code_execution_config={"work_dir": "coding"}
)

# Group chat - agents converse naturally
group_chat = GroupChat(
    agents=[user_proxy, assistant, coder, critic],
    messages=[],
    max_round=10
)

manager = GroupChatManager(groupchat=group_chat)

# Start conversation
user_proxy.initiate_chat(
    manager,
    message="Create a Python function to calculate fibonacci numbers"
)

AutoGen Strengths

Natural Conversations: Agents communicate like humans in chat
Code Execution: Built-in safe code execution sandbox
Research-Grade: From Microsoft Research, academically rigorous
Flexible Termination: Fine-grained control over when to stop

When to Use AutoGen

Research and experimentation with multi-agent systems
Code generation workflows with review cycles
Complex multi-turn dialogues with multiple perspectives
When agents need to debate and collaborate naturally

Other Notable Frameworks

Semantic Kernel (Microsoft)

Best for: Enterprise, Azure integration, .NET developers

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

kernel = sk.Kernel()
kernel.add_service(AzureChatCompletion(
    deployment_name="gpt-5",
    endpoint="https://your-resource.openai.azure.com/",
    api_key="your-key"
))

# Create semantic function - natural language as code
summarize = kernel.create_semantic_function(
    "Summarize this text: {{$input}}",
    max_tokens=200
)

result = await kernel.invoke(summarize, input="Long text here...")

Haystack

Best for: Search-focused applications, European enterprises

from haystack import Pipeline
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder

# Build pipeline - search-first design
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store))
pipeline.add_component("prompt_builder", PromptBuilder(template="""
    Context: {{documents}}
    Question: {{query}}
    Answer:
"""))
pipeline.add_component("generator", OpenAIGenerator())

pipeline.connect("retriever", "prompt_builder.documents")
pipeline.connect("prompt_builder", "generator")

result = pipeline.run({"query": "What is the capital of France?"})

DSPy (Stanford)

Best for: Prompt optimization, research

import dspy

# Configure
lm = dspy.OpenAI(model="gpt-5")
dspy.settings.configure(lm=lm)

# Define signature - what, not how
class QA(dspy.Signature):
    """Answer questions based on context."""
    context = dspy.InputField()
    question = dspy.InputField()
    answer = dspy.OutputField()

# Create module - DSPy optimizes prompts automatically
qa = dspy.ChainOfThought(QA)

# Use - no manual prompt engineering needed
result = qa(context="Paris is the capital of France", question="What is the capital?")

Framework Selection Guide

Decision Framework

                    What's your primary use case?
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
   RAG/Knowledge        Agent/Workflow       Multi-Agent
        │                     │                     │
   LlamaIndex            LangChain           ┌────┴────┐
        │                     │              │         │
   (simpler setup)      (flexible)      CrewAI    AutoGen
                              │              │         │
                        LangGraph        (roles)  (chat)
                              │
                    (complex state)

Quick Decision Table

Need	Recommended Framework
Simple RAG	LlamaIndex
Complex RAG with custom logic	LangChain
Stateful agent workflows	LangGraph
Role-based team of agents	CrewAI
Conversational agent research	AutoGen
Enterprise/Azure	Semantic Kernel
Search-first application	Haystack
Prompt optimization	DSPy

Practical Recommendations

Start with one framework - Don’t try to learn all at once
LangChain + LlamaIndex is a powerful combination
CrewAI for quick multi-agent prototypes
LangGraph when you need precise control
Consider team skills - Choose what your team can maintain

Did You Know? The LlamaIndex Origin Story

Jerry Liu created LlamaIndex in November 2022, just two weeks after ChatGPT launched. Originally called “GPT Index,” it was born from a simple frustration:

“I was at Uber working on their data platform. When ChatGPT came out, I tried to use it with our internal docs. The context window was tiny, and everyone was just copy-pasting text. I thought: ‘We solved this for databases 40 years ago. Why are we starting from scratch?’”

Jerry quit his job at Uber and started coding. The first commit was on November 20, 2022. By December, it had 1,000 GitHub stars. By March 2023, it had 15,000 stars and Jerry had raised a $8.5 million seed round led by Greylock Partners.

In April 2024, LlamaIndex raised a $19 million Series A, valuing the company at around $150 million. All from solving the “how do I use my data with LLMs?” problem.

Did You Know? The Name Change Drama

The original name “GPT Index” caused problems:

OpenAI (politely) asked about trademark concerns
Users confused it with an OpenAI product
It implied GPT-only support (it works with Claude, Gemini, etc.)

The rename to “LlamaIndex” in February 2023 was inspired by Meta’s LLaMA model release that same month. The llama mascot stuck—now it’s one of the most recognizable brands in AI tooling.

Fun fact: Jerry considered “VectorPanda” and “EmbedBear” before settling on LlamaIndex.

Did You Know? The CrewAI Viral Moment

João Moura, a Brazilian developer, was frustrated in December 2023. He’d spent 3 days trying to build a simple multi-agent system in LangChain:

“I just wanted three agents to work together—a researcher, a writer, and an editor. The LangChain code was 400 lines. I thought: ‘This should be 40 lines.’”

He built CrewAI over a weekend, posted it on Twitter/X on January 4, 2024, and went to bed.

He woke up to:

2,000+ GitHub stars overnight
500+ forks in 12 hours
DMs from Y Combinator, Andreessen Horowitz, and Sequoia

By January 7th, CrewAI had 10,000 stars. By February, João had quit his job and raised $2 million in pre-seed funding.

The magic? This code that creates a functional AI team:

crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, write_task, edit_task],
    process=Process.sequential
)
result = crew.kickoff()

That’s it. No complex chains, no state management, no 400 lines.

Did You Know? AutoGen: The Rejected Paper

AutoGen’s story starts with a rejected paper. In early 2022, Microsoft Research submitted a paper on “multi-agent conversation protocols” to NeurIPS. Rejected. ICML? Rejected. The reviewers said it was “too engineering-focused.”

Then in September 2023, they released AutoGen as open-source code instead of a paper. Within weeks:

5,000+ GitHub stars
Microsoft’s fastest-growing AI repo (at the time)
Adopted by researchers worldwide for multi-agent experiments

The irony? The “rejected paper” concepts are now cited in hundreds of papers. Sometimes open-source beats peer review.

Did You Know? The $100 Million Framework War That Never Happened

In early 2024, tech journalists predicted an “AI framework war” between:

LangChain ($135M raised, 80K+ GitHub stars)
LlamaIndex ($27M raised, 30K+ GitHub stars)
CrewAI ($18M raised, 15K+ GitHub stars)

Headlines screamed: “Only one will survive!”

What actually happened? Collaboration.

LlamaIndex created llama-index-langchain for LangChain integration
CrewAI added LlamaIndex tools support
LangChain’s LangSmith works with LlamaIndex traces
Harrison Chase (LangChain) and Jerry Liu (LlamaIndex) regularly promote each other’s work

The lesson: In developer tools, interoperability beats competition. Developers want frameworks that work together, not walled gardens.

Did You Know? Enterprise Adoption Patterns

A 2024 survey of Fortune 500 companies using AI frameworks revealed surprising patterns:

Framework	Fortune 500 Adoption	Surprise Factor
LangChain	47%	Expected
LlamaIndex	38%	Higher than expected
Custom built	31%	They build their own!
CrewAI	12%	Fast for a new framework
AutoGen	8%	Mostly research teams

The biggest surprise? 31% built custom frameworks because:

Existing frameworks changed too fast (breaking changes)
They needed specific compliance features
“Not invented here” syndrome is real in enterprises

Did You Know? DSPy: The Framework That Programs Prompts

While everyone focused on chains and agents, Omar Khattab at Stanford was asking a different question:

“Why are we hand-writing prompts like it’s 1999?”

His solution, DSPy (Declarative Self-improving Python), automatically optimizes prompts based on examples. Results from the paper:

25% better accuracy than hand-written prompts on GSM8K math
40% token reduction through optimized prompts
Works across different LLMs without prompt rewrites

It’s called the “compiler for LLMs”—you write what you want, DSPy figures out how to get it.

Did You Know? The Framework Half-Life Problem

Here’s a sobering stat: Of the 47 AI frameworks that had 1,000+ GitHub stars in January 2024:

12 are now abandoned (no commits in 6+ months)
8 merged into other projects
6 pivoted to completely different purposes
21 are still active

The average “half-life” of an AI framework is about 14 months. Before committing to a framework, check:

Last commit date
Open issue response time
Who’s funding development
Enterprise adoption (they don’t switch often)

Did You Know? Real Production Usage

Company	Framework	Scale	Use Case
Notion	LlamaIndex	30M+ users	Document Q&A
Replit	LangGraph	40M+ users	AI code assistant
Klarna	LangChain	2.3M conv/mo	Customer service
Stripe	Custom + LlamaIndex	Billions $/yr	Fraud detection
Shopify	Custom	4.5M merchants	Product descriptions
Netflix	Custom + embeddings	260M users	Recommendations

Notice the pattern? The biggest companies either use custom solutions or combine multiple frameworks. There’s no “one framework to rule them all.”

Best Practices

1. Start Simple

# Don't do this first:
from crewai import Agent, Task, Crew, Process
from langchain.agents import create_tool_calling_agent
from llama_index.core import VectorStoreIndex

# Do this first - learn one framework deeply:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

2. Abstract Your Framework

Prepare for change by abstracting framework specifics:

# Good: Framework-agnostic interface
class RAGSystem:
    def __init__(self, implementation: str = "llamaindex"):
        if implementation == "llamaindex":
            self._engine = LlamaIndexRAG()
        elif implementation == "langchain":
            self._engine = LangChainRAG()

    def query(self, question: str) -> str:
        return self._engine.query(question)

3. Benchmark Before Committing

Test frameworks on YOUR data before choosing:

# Compare frameworks on YOUR data
frameworks = ["llamaindex", "langchain", "haystack"]
test_queries = load_test_queries()

for framework in frameworks:
    engine = create_engine(framework)
    results = [engine.query(q) for q in test_queries]
    print(f"{framework}: {evaluate(results)}")

4. Consider Maintenance

Questions to ask before committing:

How active is the community?
How often are there breaking changes?
Is the documentation good?
Are there examples for your use case?
Who’s funding development?

Key Takeaways

LlamaIndex excels at RAG. Its opinionated defaults and data-centric philosophy make it the go-to for knowledge-intensive applications.
LangChain excels at flexibility. When you need complex agents, multiple tools, or custom orchestration, LangChain’s building blocks shine.
They work together. Use LlamaIndex for data indexing, LangChain for agent orchestration. The frameworks interoperate well.
CrewAI simplifies multi-agent. Role-based design matches human intuition and reduces boilerplate significantly.
AutoGen is research-grade. When you need agents to debate, collaborate, and execute code, AutoGen’s conversational model excels.
Frameworks have half-lives. Before committing, check activity, funding, and enterprise adoption. Build abstractions for portability.
Start simple, then specialize. Master one framework deeply before combining multiple. Premature optimization applies to framework choices too.
Your data is your moat. As Jerry Liu says, LLMs are commoditized. Your proprietary data and how you leverage it is the differentiator.

Summary

Frameworks Covered

Framework	Philosophy	Best For
LlamaIndex	Data framework	RAG, knowledge systems
LangChain	Composable chains	General LLM apps
LangGraph	State machines	Complex workflows
CrewAI	Role-based teams	Multi-agent collab
AutoGen	Conversations	Research, dialogues

Key Insights

No single “best” framework - Choose based on use case
Frameworks are converging - Interoperability is improving
LlamaIndex excels at RAG - Simpler than LangChain for indexing
CrewAI is beginner-friendly - Good for multi-agent prototypes
Combine frameworks - LlamaIndex indexing + LangChain agents

Decision Summary

Simple RAG → LlamaIndex
Complex agents → LangChain + LangGraph
Quick multi-agent → CrewAI
Research/experiments → AutoGen
Enterprise/Azure → Semantic Kernel

Next Steps

With framework knowledge complete, you’re ready for:

Module 20: Advanced Agentic AI
Agent memory systems (short-term, long-term, episodic)
Planning algorithms (ReWOO, Plan-and-Execute)
Multi-agent collaborative systems

You can now choose the right tool for the job!

Module 19 Complete!

Next: Module 20 - Advanced Agentic AI