The modern LLM landscape
Artificial intelligence (AI) has long been a subject of fascination and research, but recent advancements in generative AI have propelled it into mainstream adoption. Unlike traditional AI systems that classify data or make predictions, generative AI can create new content—text, images, code, and more—by leveraging vast amounts of training data.
The generative AI revolution was catalyzed by the 2017 introduction of the transformer architecture, which enabled models to process text with unprecedented understanding of context and relationships. As researchers scaled these models from millions to billions of parameters, they discovered something remarkable: larger models didn’t just perform incrementally better—they exhibited entirely new emergent capabilities like few-shot learning, complex reasoning, and creative generation that weren’t explicitly programmed. Eventually, the release of ChatGPT in 2022 marked a turning point, demonstrating these capabilities to the public and sparking widespread adoption.
The landscape shifted again with the open-source revolution led by models like Llama and Mistral, democratizing access to powerful AI beyond the major tech companies. However, these advanced capabilities came with significant limitations—models couldn’t reliably use tools, reason through complex problems, or maintain context across interactions. This gap between raw model power and practical utility created the need for specialized frameworks like LangChain that transform these models from impressive text generators into functional, production-ready agents capable of solving real-world problems.
Key terminologies
- Tools: External utilities or functions that AI models can use to interact with the world. Tools allow agents to perform actions like searching the web, calculating values, or accessing databases to overcome LLMs’ inherent limitations.
- Memory: Systems that allow AI applications to store and retrieve information across interactions. Memory enables contextual awareness in conversations and complex workflows by tracking previous inputs, outputs, and important information.
- Reinforcement learning from human feedback (RLHF): A training technique where AI models learn from direct human feedback, optimizing their performance to align with human preferences. RLHF helps create models that are more helpful, safe, and aligned with human values.
- Agents: AI systems that can perceive their environment, make decisions, and take actions to accomplish goals. In LangChain, agents use LLMs to interpret tasks, choose appropriate tools, and execute multi-step processes with minimal human intervention.
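To make the tool and memory concepts concrete, here is a minimal, framework-free sketch: a tool is just a callable plus metadata the model can reason about, and memory is a record of past steps. All names here are illustrative, not LangChain’s actual API, which we cover in later chapters.

```python
# A tool is a callable plus a description the model can use to decide
# when to invoke it. All names are illustrative, not LangChain's API.

def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression and return the result."""
    # eval() is used only for brevity in this sketch; never eval untrusted input.
    return str(eval(expression, {"__builtins__": {}}, {}))

# A registry of tools, keyed by name, with descriptions an LLM could use
# to choose which tool fits the current task.
TOOLS = {
    "calculator": {
        "description": "Evaluates arithmetic expressions, e.g. '2 * (3 + 4)'.",
        "func": calculator,
    },
}

# A trivial "memory": the running history of tool calls and results.
memory: list[dict] = []

def run_tool(name: str, argument: str) -> str:
    """Look up a tool by name, invoke it, and record the step in memory."""
    result = TOOLS[name]["func"](argument)
    memory.append({"tool": name, "input": argument, "output": result})
    return result

print(run_tool("calculator", "2 * (3 + 4)"))  # -> 14
```

An agent loop would ask the LLM which registered tool (if any) to call next, execute it via something like `run_tool`, and feed the recorded result back into the prompt.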
| Year | Development | Key Features |
|------|-------------|--------------|
| 1990s | IBM Alignment Models | Statistical machine translation |
| 2000s | Web-scale datasets | Large-scale statistical models |
| 2009 | Statistical models dominate | Large-scale text ingestion |
| 2012 | Deep learning gains traction | Neural networks outperform statistical models |
| 2016 | Neural Machine Translation (NMT) | Seq2seq deep LSTMs replace statistical methods |
| 2017 | Transformer architecture | Self-attention revolutionizes NLP |
| 2018 | BERT and GPT-1 | Transformer-based language understanding and generation |
| 2019 | GPT-2 | Large-scale text generation, public awareness increases |
| 2020 | GPT-3 | API-based access, state-of-the-art performance |
| 2022 | ChatGPT | Mainstream adoption of LLMs |
| 2023 | Large Multimodal Models (LMMs) | AI models process text, images, and audio |
| 2024 | OpenAI o1 | Stronger reasoning capabilities |
| 2025 | DeepSeek R1 | Open-weight, large-scale AI model |
Table 1.1: A timeline of major developments in language models
The field of LLMs is rapidly evolving, with multiple models competing in terms of performance, capabilities, and accessibility. Each provider brings distinct advantages, from OpenAI’s advanced general-purpose AI to Mistral’s open-weight, high-efficiency models. Understanding the differences between these models helps practitioners make informed decisions when integrating LLMs into their applications.
Model comparison
The following points outline key factors to consider when comparing different LLMs, focusing on their accessibility, size, capabilities, and specialization:
- Open-source vs. closed-source models: Open-source models like Mistral and Llama provide transparency and the ability to run locally, while closed-source models like GPT-4 and Claude are accessible through APIs. Open-source LLMs can be downloaded and modified, enabling developers and researchers to investigate and build upon their architectures, though specific usage terms may apply.
- Size and capabilities: Larger models generally offer better performance but require more computational resources. Smaller models, by contrast, are well suited to devices with limited compute or memory and are significantly cheaper to run. Small language models (SLMs) typically have millions to a few billion parameters, as opposed to LLMs, which can have hundreds of billions or even trillions of parameters.
- Specialized models: Some LLMs are optimized for specific tasks, such as code generation (for example, Codex) or mathematical reasoning (for example, Minerva).
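The size trade-off above can be made tangible with a back-of-the-envelope calculation: the memory needed just to hold a model’s weights is roughly the parameter count times the bytes per parameter at a given numeric precision. This sketch ignores activations, KV cache, and framework overhead, so treat the numbers as rough lower bounds.

```python
# Rough memory needed to hold model weights at different precisions.
# Ignores activations, KV cache, and framework overhead.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes for a given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for params, label in [(7e9, "7B"), (70e9, "70B")]:
    for precision in ("fp16", "int4"):
        gb = weight_memory_gb(params, precision)
        print(f"{label} model at {precision}: ~{gb:.1f} GB")
```

A 7B model in fp16 needs about 14 GB for weights alone, which is why quantization (for example, to int4) is often what makes local inference on consumer hardware feasible.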
The increase in the scale of language models has been a major driving force behind their impressive performance gains. Recently, however, shifts in architectures and training methods have improved parameter efficiency, allowing smaller models to deliver comparable performance.
Model scaling laws
Empirically derived scaling laws predict the performance of LLMs based on the training budget, dataset size, and number of parameters. Taken at face value, these laws imply that the most powerful systems will remain concentrated in the hands of Big Tech; however, recent developments have challenged this assumption.
The KM scaling law, proposed by Kaplan et al. (2020), was derived by empirically fitting model performance across varied data sizes, model sizes, and training compute. It describes power-law relationships between model performance and each of three factors: model size, dataset size, and training compute.
The Chinchilla scaling law, proposed by the DeepMind team (Hoffmann et al., 2022), was based on experiments with a wider range of model and data sizes. It suggests an optimal allocation of a fixed compute budget between model size and dataset size, determined by minimizing training loss under a compute constraint; in practice, it recommends scaling model size and training data in roughly equal proportion.
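The Chinchilla result can be sketched numerically. The snippet below uses the parametric loss fit published by Hoffmann et al. (2022); the coefficients are their reported fitted values, but treat the output as illustrative of the trade-off rather than predictive for any specific model.

```python
# Parametric loss fit from the Chinchilla paper (Hoffmann et al., 2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
# where N is parameter count and D is training tokens. The coefficients
# are the published fitted values; the numbers are illustrative only.

E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Training FLOPs are roughly 6 * N * D. At a fixed compute budget, a 70B
# model on 1.4T tokens (about 20 tokens per parameter) is predicted to
# beat a 175B model trained on correspondingly fewer tokens.
flops = 6 * 70e9 * 1.4e12
loss_optimal = chinchilla_loss(70e9, 1.4e12)
loss_oversized = chinchilla_loss(175e9, flops / (6 * 175e9))
print(f"compute-optimal: {loss_optimal:.3f}, oversized: {loss_oversized:.3f}")
assert loss_optimal < loss_oversized
```

This is the quantitative core of the “train smaller models on more data” shift that followed the Chinchilla paper.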
However, future progress may depend more on model architecture, data quality, and algorithmic innovation than on sheer size. For example, the phi models, first presented in Textbooks Are All You Need (Gunasekar et al., 2023), showed that a model of about 1 billion parameters can achieve high accuracy on evaluation benchmarks despite its smaller scale. The authors suggest that improving data quality can dramatically change the shape of scaling laws.
Further, there is a body of work on simplified model architectures that have substantially fewer parameters with only a modest drop in accuracy (for example, One Wide Feedforward is All You Need, Pessoa Pires et al., 2023). Additionally, fine-tuning, quantization, distillation, and prompting can enable smaller models to leverage the capabilities of large foundation models without replicating their costs. To compensate for model limitations, tools such as search engines and calculators have been incorporated into agents, and multi-step reasoning strategies, plugins, and extensions may be increasingly used to expand capabilities.
The future could see the co-existence of massive, general models with smaller and more accessible models that provide faster and cheaper training, maintenance, and inference.
With these distinctions in mind, let’s now look at the providers behind these models. Understanding who offers what, and under which access terms, will help you select the most suitable LLM for your specific needs and applications.
LLM provider landscape
You can access LLMs from major providers like OpenAI, Google, and Anthropic, along with a growing number of others, through their websites or APIs. As the demand for LLMs grows, numerous providers have entered the space, each offering models with unique capabilities and trade-offs. Developers need to understand the various access options available for integrating these powerful models into their applications. The choice of provider will significantly impact development experience, performance characteristics, and operational costs.
The table below provides a comparative overview of leading LLM providers and examples of the models they offer:
| Provider | Notable models | Key features and strengths |
|----------|----------------|----------------------------|
| OpenAI | GPT-4o, GPT-4.5, o1, o3-mini | Strong general performance, proprietary models, advanced reasoning; multimodal reasoning across text, audio, vision, and video in real time |
| Anthropic | Claude 3.7 Sonnet, Claude 3.5 Haiku | Toggle between real-time responses and extended “thinking” phases; outperforms OpenAI’s o1 in coding benchmarks |
| Google | Gemini 2.5, 2.0 (flash and pro), Gemini 1.5 | Low latency and costs, large context window (up to 2M tokens), multimodal inputs and outputs, reasoning capabilities |
| Cohere | Command R, Command R Plus | Retrieval-augmented generation, enterprise AI solutions |
| Mistral AI | Mistral Large, Mistral 7B | Open weights, efficient inference, multilingual support |
| AWS | Titan | Enterprise-scale AI models, optimized for the AWS cloud |
| DeepSeek | R1 | Maths-first: solves Olympiad-level problems; cost-effective, optimized for multilingual and programming tasks |
| Together AI | Infrastructure for running open models | Competitive pricing; growing marketplace of models |
Table 1.2: Comparative overview of major LLM providers and their flagship models for LangChain implementation
Other organizations develop LLMs but do not necessarily provide them through application programming interfaces (APIs) to developers. For example, Meta AI develops the very influential Llama model series, which offers strong reasoning and code-generation capabilities and is released under a permissive community license.
There is a whole zoo of open-source models that you can access through Hugging Face or through other providers. You can even download these open-source models, fine-tune them, or fully train them. We’ll try this out practically starting in Chapter 2.
Once you’ve selected an appropriate model, the next crucial step is understanding how to control its behavior to suit your specific application needs. While accessing a model gives you computational capability, it’s the choice of generation parameters that transforms raw model power into tailored output for different use cases within your applications.
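The single most commonly tuned generation parameter is temperature, which rescales the model’s next-token scores before sampling. The following pure-Python sketch shows the mechanism with made-up logits for three hypothetical candidate tokens; real providers apply the same idea inside their sampling loop.

```python
import math

# How the `temperature` generation parameter reshapes a model's
# next-token distribution. The logits below are made up for illustration.

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")

# Low temperature concentrates probability mass on the top token (more
# deterministic output); high temperature flattens the distribution
# (more diverse, less predictable output).
```

In practice you set this as a single parameter on the provider’s API or LangChain model wrapper; low values suit extraction and code, higher values suit brainstorming and creative writing.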
Now that we’ve covered the LLM provider landscape, let’s discuss another critical aspect of LLM implementation: licensing considerations. The licensing terms of different models significantly impact how you can use them in your applications.
Licensing
LLMs are available under different licensing models that impact how they can be used in practice. Open-source models like Mixtral and BERT can be freely used, modified, and integrated into applications. These models allow developers to run them locally, investigate their behavior, and build upon them for both research and commercial purposes.
In contrast, proprietary models like GPT-4 and Claude are accessible only through APIs, with their internal workings kept private. While this ensures consistent performance and regular updates, it means depending on external services and typically incurring usage costs.
Some models like Llama 2 take a middle ground, offering permissive licenses for both research and commercial use while maintaining certain usage conditions. For detailed information about specific model licenses and their implications, refer to the documentation of each model or consult the model openness framework: https://isitopen.ai/.
The model openness framework (MOF) evaluates language models based on criteria such as access to model architecture details, training methodology and hyperparameters, data sourcing and processing information, documentation around development decisions, ability to evaluate model workings, biases, and limitations, code modularity, published model card, availability of servable model, option to run locally, source code availability, and redistribution rights.
In general, open-source licenses promote wide adoption, collaboration, and innovation around the models, benefiting both research and commercial development. Proprietary licenses typically give companies exclusive control but may limit academic research progress. Non-commercial licenses often restrict commercial use while enabling research.
By making knowledge and knowledge work more accessible and adaptable, generative AI models have the potential to level the playing field and create new opportunities for people from all walks of life.
The evolution of AI has brought us to a pivotal moment where AI systems can not only process information but also take autonomous action. The next section explores the transformation from basic language models to more complex, and finally, fully agentic applications.
The information provided about AI model licensing is for educational purposes only and does not constitute legal advice. Licensing terms vary significantly and evolve rapidly. Organizations should consult qualified legal counsel regarding specific licensing decisions for their AI implementations.