The modern LLM landscape

Artificial intelligence (AI) has long been a subject of fascination and research, but recent advancements in generative AI have propelled it into mainstream adoption. Unlike traditional AI systems that classify data or make predictions, generative AI can create new content—text, images, code, and more—by leveraging vast amounts of training data.

The generative AI revolution was catalyzed by the 2017 introduction of the transformer architecture, which enabled models to process text with unprecedented understanding of context and relationships. As researchers scaled these models from millions to billions of parameters, they discovered something remarkable: larger models didn’t just perform incrementally better—they exhibited entirely new emergent capabilities like few-shot learning, complex reasoning, and creative generation that weren’t explicitly programmed. Eventually, the release of ChatGPT in 2022 marked a turning point, demonstrating these capabilities to the public and sparking widespread adoption.

The landscape shifted again with the open-source revolution led by models like Llama and Mistral, democratizing access to powerful AI beyond the major tech companies. However, these advanced capabilities came with significant limitations—models couldn’t reliably use tools, reason through complex problems, or maintain context across interactions. This gap between raw model power and practical utility created the need for specialized frameworks like LangChain that transform these models from impressive text generators into functional, production-ready agents capable of solving real-world problems.

Key terminologies

Tools: External utilities or functions that AI models can use to interact with the world. Tools allow agents to perform actions like searching the web, calculating values, or accessing databases to overcome LLMs’ inherent limitations.

Memory: Systems that allow AI applications to store and retrieve information across interactions. Memory enables contextual awareness in conversations and complex workflows by tracking previous inputs, outputs, and important information.

Reinforcement learning from human feedback (RLHF): A training technique where AI models learn from direct human feedback, optimizing their performance to align with human preferences. RLHF helps create models that are more helpful, safe, and aligned with human values.

Agents: AI systems that can perceive their environment, make decisions, and take actions to accomplish goals. In LangChain, agents use LLMs to interpret tasks, choose appropriate tools, and execute multi-step processes with minimal human intervention.
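
To make the tools and agents terminology concrete, here is a minimal sketch of defining a tool and letting a chat model request it through LangChain. It assumes the langchain-core and langchain-openai packages are installed and an OpenAI API key is configured; the multiply function and the model name are illustrative, not taken from this chapter.

```python
# A minimal sketch of the "tools" concept. The tool and model name are
# illustrative; requires langchain-core, langchain-openai, and an
# OPENAI_API_KEY environment variable.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the result."""
    return a * b


# Binding the tool lets the model decide when to call it; an agent (or
# your own loop) then executes the call and feeds the result back,
# which is where memory and multi-step behaviour come in.
llm = ChatOpenAI(model="gpt-4o-mini")
llm_with_tools = llm.bind_tools([multiply])

response = llm_with_tools.invoke("What is 7.5 times 12?")
print(response.tool_calls)  # the tool invocation(s) the model requested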

| Year | Development | Key Features |
| --- | --- | --- |
| 1990s | IBM Alignment Models | Statistical machine translation |
| 2000s | Web-scale datasets | Large-scale statistical models |
| 2009 | Statistical models dominate | Large-scale text ingestion |
| 2012 | Deep learning gains traction | Neural networks outperform statistical models |
| 2016 | Neural Machine Translation (NMT) | Seq2seq deep LSTMs replace statistical methods |
| 2017 | Transformer architecture | Self-attention revolutionizes NLP |
| 2018 | BERT and GPT-1 | Transformer-based language understanding and generation |
| 2019 | GPT-2 | Large-scale text generation, public awareness increases |
| 2020 | GPT-3 | API-based access, state-of-the-art performance |
| 2022 | ChatGPT | Mainstream adoption of LLMs |
| 2023 | Large Multimodal Models (LMMs) | AI models process text, images, and audio |
| 2024 | OpenAI o1 | Stronger reasoning capabilities |
| 2025 | DeepSeek R1 | Open-weight, large-scale AI model |

Table 1.1: A timeline of major developments in language models

The field of LLMs is rapidly evolving, with multiple models competing in terms of performance, capabilities, and accessibility. Each provider brings distinct advantages, from OpenAI’s advanced general-purpose AI to Mistral’s open-weight, high-efficiency models. Understanding the differences between these models helps practitioners make informed decisions when integrating LLMs into their applications.

Model comparison

The following points outline key factors to consider when comparing different LLMs, focusing on their accessibility, size, capabilities, and specialization:

  • Open-source vs. closed-source models: Open-source models like Mistral and Llama provide transparency and the ability to run locally, while closed-source models like GPT-4 and Claude are accessible only through APIs. Open-source LLMs can be downloaded and modified, enabling developers and researchers to investigate and build upon their architectures, though specific usage terms may apply.
  • Size and capabilities: Larger models generally offer better performance but require more computational resources. This makes smaller models a good fit for devices with limited computing power or memory, and they can be significantly cheaper to run. Small language models (SLMs) have a relatively small number of parameters, typically millions to a few billion, as opposed to LLMs, which can have hundreds of billions or even trillions of parameters.
  • Specialized models: Some LLMs are optimized for specific tasks, such as code generation (for example, Codex) or mathematical reasoning (e.g., Minerva).

The increase in the scale of language models has been a major driving force behind their impressive performance gains. However, recent shifts in architecture and training methods have improved parameter efficiency, delivering more performance per parameter.

Model scaling laws

Empirically derived scaling laws predict the performance of LLMs from the training budget, dataset size, and number of parameters. Taken at face value, this suggests that the most powerful systems would be concentrated in the hands of Big Tech; however, we have seen a significant shift in recent months.

The KM scaling law, proposed by Kaplan et al., was derived by empirically fitting model performance across varied model sizes, dataset sizes, and amounts of training compute. It presents power-law relationships indicating a strong codependence between model performance and each of these factors.
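
As a rough illustration of the form these relationships take, the fits below follow Kaplan et al. (2020); the exponent values are the approximate figures reported in that paper, not values taken from this chapter, and should be read as empirical estimates rather than exact constants.

```latex
% Power-law fits in the style of Kaplan et al. (2020): L is test loss,
% N the parameter count, D the dataset size in tokens, C the training
% compute. Exponents are approximate empirical estimates.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C},
\quad \text{with } \alpha_N \approx 0.076,\ \alpha_D \approx 0.095,\ \alpha_C \approx 0.05.
```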

The Chinchilla scaling law, proposed by the Google DeepMind team, involved experiments across a wider range of model sizes and data sizes. It suggests an optimal allocation of the compute budget between model size and data size, which can be determined by minimizing a loss function under a fixed compute constraint.
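
A widely cited rule of thumb distilled from the Chinchilla results is that compute-optimal training uses roughly 20 training tokens per model parameter. The short sketch below applies that heuristic; the factor of 20 is an approximation attributed to Hoffmann et al. (2022), not a value from this chapter, and should be treated as an order-of-magnitude guide.

```python
# Rough Chinchilla-style heuristic: ~20 training tokens per parameter
# for compute-optimal training (approximate rule of thumb only).
TOKENS_PER_PARAM = 20


def chinchilla_optimal_tokens(num_params: float) -> float:
    """Return the approximate compute-optimal number of training tokens."""
    return TOKENS_PER_PARAM * num_params


# Example: a 7-billion-parameter model would want on the order of
# 140 billion training tokens under this heuristic.
print(f"{chinchilla_optimal_tokens(7e9):.2e} tokens")
```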

However, future progress may depend more on model architecture, data cleansing, and model algorithmic innovation rather than sheer size. For example, models such as phi, first presented in Textbooks Are All You Need (2023, Gunasekar et al.), with about 1 billion parameters, showed that models can – despite a smaller scale – achieve high accuracy on evaluation benchmarks. The authors suggest that improving data quality can dramatically change the shape of scaling laws.

Further, there is a body of work on simplified model architectures that have substantially fewer parameters with only a modest drop in accuracy (for example, One Wide Feedforward is All You Need, Pessoa Pires et al., 2023). Additionally, techniques such as fine-tuning, quantization, distillation, and prompting can enable smaller models to leverage the capabilities of large foundation models without replicating their costs. To compensate for model limitations, tools like search engines and calculators have been incorporated into agents, and multi-step reasoning strategies, plugins, and extensions may be increasingly used to expand capabilities.

The future could see the co-existence of massive, general models with smaller and more accessible models that provide faster and cheaper training, maintenance, and inference.

With these distinctions in mind—open-source vs. closed-source access, model size and capabilities, and specialization—let’s now look at the provider landscape so you can select the most suitable LLM for your specific needs and applications.

LLM provider landscape

You can access LLMs from major providers like OpenAI, Google, and Anthropic, along with a growing number of others, through their websites or APIs. As the demand for LLMs grows, numerous providers have entered the space, each offering models with unique capabilities and trade-offs. Developers need to understand the various access options available for integrating these powerful models into their applications. The choice of provider will significantly impact development experience, performance characteristics, and operational costs.

The table below provides a comparative overview of leading LLM providers and examples of the models they offer:

| Provider | Notable models | Key features and strengths |
| --- | --- | --- |
| OpenAI | GPT-4o, GPT-4.5, o1, o3-mini | Strong general performance, proprietary models, advanced reasoning; multimodal reasoning across text, audio, vision, and video in real time |
| Anthropic | Claude 3.7 Sonnet, Claude 3.5 Haiku | Toggle between real-time responses and extended “thinking” phases; outperforms OpenAI’s o1 in coding benchmarks |
| Google | Gemini 2.5, 2.0 (Flash and Pro), Gemini 1.5 | Low latency and costs, large context window (up to 2M tokens), multimodal inputs and outputs, reasoning capabilities |
| Cohere | Command R, Command R Plus | Retrieval-augmented generation, enterprise AI solutions |
| Mistral AI | Mistral Large, Mistral 7B | Open weights, efficient inference, multilingual support |
| AWS | Titan | Enterprise-scale AI models, optimized for the AWS cloud |
| DeepSeek | R1 | Maths-first: solves Olympiad-level problems; cost-effective, optimized for multilingual and programming tasks |
| Together AI | Infrastructure for running open models | Competitive pricing; growing marketplace of models |

Table 1.2: Comparative overview of major LLM providers and their flagship models for LangChain implementation
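
In LangChain, most of these providers can be reached through a common chat-model interface, which makes it easy to swap one for another. The sketch below is a minimal illustration using init_chat_model; the model names are only examples, and each provider requires its own integration package (for example, langchain-openai or langchain-anthropic) plus an API key.

```python
# A minimal sketch of swapping providers behind LangChain's common
# chat-model interface. Model names are illustrative; each provider
# needs its integration package installed and an API key configured.
from langchain.chat_models import init_chat_model

openai_model = init_chat_model("gpt-4o-mini", model_provider="openai")
anthropic_model = init_chat_model("claude-3-5-haiku-latest", model_provider="anthropic")

for model in (openai_model, anthropic_model):
    reply = model.invoke("In one sentence, what is LangChain?")
    print(reply.content)
```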

Other organizations develop LLMs but do not necessarily provide them to developers through application programming interfaces (APIs). For example, Meta AI develops the highly influential Llama model series, which offers strong reasoning and code-generation capabilities and is released under a permissive, open-weight license.

There is a whole zoo of open-source models that you can access through Hugging Face or through other providers. You can even download these open-source models, fine-tune them, or fully train them. We’ll try this out practically starting in Chapter 2.
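
As a quick preview of that hands-on work, the following sketch loads a small open model locally with Hugging Face's transformers library; the model identifier is only an example, and any compatible text-generation checkpoint could be substituted.

```python
# A minimal local-inference sketch with an open model from Hugging Face.
# Requires the transformers (and torch) packages; the model id below is
# just an example - substitute any text-generation checkpoint you like.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
output = generator("Generative AI frameworks such as LangChain", max_new_tokens=40)
print(output[0]["generated_text"])
```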

Once you’ve selected an appropriate model, the next crucial step is understanding how to control its behavior to suit your specific application needs. While accessing a model gives you computational capability, it’s the choice of generation parameters that transforms raw model power into tailored output for different use cases within your applications.
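
As a brief preview of those generation parameters, the sketch below adjusts temperature and output length on a chat model; the model name and parameter values are illustrative and assume the langchain-openai package with an OpenAI API key.

```python
# Generation parameters shape the output of the same underlying model.
# Model name and values are illustrative; requires langchain-openai
# and an OpenAI API key.
from langchain_openai import ChatOpenAI

# Low temperature -> more deterministic, focused answers.
precise = ChatOpenAI(model="gpt-4o-mini", temperature=0.0, max_tokens=128)

# Higher temperature -> more varied, creative answers.
creative = ChatOpenAI(model="gpt-4o-mini", temperature=0.9, max_tokens=128)

prompt = "Suggest a name for a coffee shop run by robots."
print(precise.invoke(prompt).content)
print(creative.invoke(prompt).content)
```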

Now that we’ve covered the LLM provider landscape, let’s discuss another critical aspect of LLM implementation: licensing considerations. The licensing terms of different models significantly impact how you can use them in your applications.

Licensing

LLMs are available under different licensing models that impact how they can be used in practice. Open-source models like Mixtral and BERT can be freely used, modified, and integrated into applications. These models allow developers to run them locally, investigate their behavior, and build upon them for both research and commercial purposes.

In contrast, proprietary models like GPT-4 and Claude are accessible only through APIs, with their internal workings kept private. While this ensures consistent performance and regular updates, it means depending on external services and typically incurring usage costs.

Some models like Llama 2 take a middle ground, offering permissive licenses for both research and commercial use while maintaining certain usage conditions. For detailed information about specific model licenses and their implications, refer to the documentation of each model or consult the model openness framework: https://isitopen.ai/.

The model openness framework (MOF) evaluates language models on criteria such as:

  • Access to model architecture details, training methodology, and hyperparameters
  • Data sourcing and processing information
  • Documentation around development decisions
  • The ability to evaluate model workings, biases, and limitations
  • Code modularity and source code availability
  • A published model card
  • Availability of a servable model and the option to run it locally
  • Redistribution rights

In general, open-source licenses promote wide adoption, collaboration, and innovation around the models, benefiting both research and commercial development. Proprietary licenses typically give companies exclusive control but may limit academic research progress. Non-commercial licenses often restrict commercial use while enabling research.

By making knowledge and knowledge work more accessible and adaptable, generative AI models have the potential to level the playing field and create new opportunities for people from all walks of life.

The evolution of AI has brought us to a pivotal moment where AI systems can not only process information but also take autonomous action. The next section explores the transformation from basic language models to more complex, and finally, fully agentic applications.

The information provided about AI model licensing is for educational purposes only and does not constitute legal advice. Licensing terms vary significantly and evolve rapidly. Organizations should consult qualified legal counsel regarding specific licensing decisions for their AI implementations.
