Semantic Kernel Cookbook (English)
Since the release of Semantic Kernel in March of 2023, we've seen a community of AI engineers emerge from around the globe. And Kinfey Lo's "Semantic Kernel Cookbook" is certain to enable even more "AI chefs" to find their path to combining native code with semantic code. We're all learning together — in the open. And we know that the best way to build a community is through sharing recipes so we can all learn together, faster. Open source has a wonderful capability to enable greater learning just by being open. If you've chosen to show up, definitely try out some of Kinfey's cooking. And if you're in the mood for it, make your own recipe for AI with SK and share it back.
Introduction

With the rise of LLMs, AI has entered the 2.0 era.
Content

Getting Started with LLM
GPT-4 and beyond: With the continuous development of technology, subsequent GPT models may continue
to improve in terms of model scale, understanding capabilities, and multi-modal capabilities.
GPT application areas

GPT models are widely used in many fields, including but not limited to:
Text generation: such as article writing, creative writing, code generation, etc.
Chatbots: Provide a smooth conversational experience.
Natural language understanding: such as sentiment analysis, text classification, etc.
Translation and multilingual tasks: automatically translate different languages.
Knowledge extraction and question answering: Extract information from large amounts of text to answer
specific questions.
Overall, the GPT model represents an important milestone in the current field of artificial intelligence and
natural language processing. Its powerful capabilities and diverse application prospects continue to lead the
trend of technological development.
GPT-3
GPT-3 is a large language model developed by OpenAI that can understand and generate natural language. It is one of the largest LLMs to date, with 175 billion parameters, and can complete text summarization, machine translation, dialogue systems, code generation, and more. The characteristic of GPT-3 is that it can adapt to different tasks and fields through simple text prompts, that is, "few-shot learning", without requiring additional fine-tuning or labeled data. GPT-3 opened Pandora's box and changed the rules of the industry. GPT-3 has been used in many products and services, such as the OpenAI API, OpenAI Codex, and the early GitHub Copilot, which make it easier for developers, creators, and scholars to use and learn artificial intelligence. GPT-3 has also triggered discussion and reflection on the ethics, society, and security of artificial intelligence, such as its bias, explainability, responsibility, and impact.
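For example, a classic few-shot prompt (an illustrative example in the style popularized by the GPT-3 paper, not taken from this book) looks like this:

```
Translate English to French:

sea otter => loutre de mer
cheese => fromage
mint => menthe
butter =>
```

Given only these examples, the model completes the pattern with "beurre", adapting to the translation task without any fine-tuning.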
GPT-3.5 and ChatGPT
GPT-3.5 and ChatGPT are both large language models based on the GPT-3 architecture, which can understand and generate natural language. Both have 175 billion parameters and perform amazingly well on a variety of language processing tasks, such as text summarization, machine translation, dialogue systems, and code generation.
The main difference between GPT-3.5 and ChatGPT is their scope and purpose. GPT-3.5 is a general language model that can handle a variety of language processing tasks. ChatGPT, on the other hand, is a specialized model designed specifically for chat applications. It emphasizes interaction and communication with users and can play different roles, such as a cat girl, a celebrity, or a politician. It can also generate multimedia content such as images, music, and videos based on user input.
Another difference between GPT-3.5 and ChatGPT is their training data and training methods. GPT-3.5 is pre-trained on 570 GB of text data from different sources such as websites, books, and articles. It is trained through self-supervised learning, which generates its own labels for the input data by predicting the next word or token in a sequence given the previous words. ChatGPT is based on GPT-3.5 and is further fine-tuned on more conversation data, such as social media, chat records, and movie scripts. Its training method is multi-task learning, that is, optimizing multiple goals at the same time, such as language modeling, dialogue generation, emotion classification, and image generation.
GPT-4
GPT-4 (Generative Pre-trained Transformer 4th Generation) is the latest generation of artificial intelligence
language models developed by OpenAI. It is the successor of GPT-3 with more advanced and refined
features. Here are some of the key features of GPT-4:
Larger knowledge base and data processing capabilities: GPT-4 can process larger amounts of data, and its
knowledge base is broader and deeper than GPT-3.
Higher language understanding and generation capabilities: GPT-4 has significantly improved in
understanding and generating natural language, and can more accurately understand complex language
structures and meanings.
Multi-modal capabilities: GPT-4 can not only process text, but also understand and generate images,
providing a multi-modal interactive experience.
Better contextual understanding: GPT-4 can better understand and maintain context in long conversations,
providing more coherent and consistent responses.
Improved security and reliability: OpenAI has strengthened the filtering and control of inappropriate content
in GPT-4 to provide a more secure and reliable user experience.
Wide range of application fields: GPT-4 can be used in various fields, including but not limited to chatbots, content creation, educational assistance, language translation, data analysis, etc.
Overall, GPT-4 has made significant improvements over its predecessor, providing more powerful and diverse features. GPT-4 holds an absolute leadership position at this stage and is the benchmark that many companies' large models aim for.
GPT-4V

The full name of GPT-4V is GPT-4 Turbo with Vision. It can understand pictures, analyze pictures for users, and answer questions related to pictures. GPT-4V can accurately understand the content of images, identify objects in images, count objects, provide image-related insights and information, extract text, and more. It can be said that GPT-4V is the king of LLMs, and it also allows LLMs to better understand the world. GPT-4V's main vision capabilities and application directions include:
Object Detection: GPT-4V is able to identify and detect a variety of common objects in images, such as
cars, animals, and household items. Its recognition capabilities have been evaluated on standard image
datasets.
Text Recognition: This model features optical character recognition (OCR) technology that finds printed or
handwritten text in images and converts it into machine-readable text. This feature is proven in images such
as documents, logos, and titles.
Face Recognition: GPT-4V is able to find and recognize faces in images. It also has a degree of ability to determine gender, age, and racial attributes from facial features. The model's facial analysis capabilities have been tested on datasets such as FairFace and LFW.
CAPTCHA Solving: GPT-4V demonstrates visual reasoning capabilities in solving text- and image-based CAPTCHAs. This indicates that the model has advanced puzzle-solving skills.
Geolocation: GPT-4V is able to identify cities or geographical locations represented in landscape images.
This shows that the model has mastered knowledge about the real world, but it also means that there is a
risk of privacy leakage.
Complex Images: The model performs poorly when dealing with complex scientific diagrams, medical
scans, or images with multiple overlapping text components. It cannot grasp contextual details.
DALL·E
DALL·E is an advanced artificial intelligence program developed by OpenAI specifically designed to
generate images. It is a neural network model based on the GPT-3 architecture, but unlike GPT-3, which
mainly processes text, DALL·E's expertise lies in generating corresponding images based on text
descriptions. The name of this model is a tribute to the famous artist Salvador Dalí and the popular
animated character WALL·E.
Key features of DALL·E

Text-to-image conversion: DALL·E can generate images based on text descriptions provided by users. These descriptions can be very specific or creative, and the model will do its best to generate images that match the description.
Creativity and Flexibility: DALL·E displays amazing creativity when generating images, able to combine
different concepts and elements to create unique and innovative visual works.
Variety and detail: The model is capable of generating multiple styles and types of images and can handle
complex, detailed descriptions.
Application potential: DALL·E has extensive application potential in art creation, advertising, design and
other fields.
DALL·E's application scenarios include:
Artistic Creation: Artists and designers can use DALL·E to explore new ideas and visual expressions.
Advertising and Media: Generate images that fit a specific theme or concept.
Education and Entertainment: Used in the production of instructional materials or the creation of
entertainment content.
Research and exploration: Explore the possibilities of artificial intelligence in the field of visual arts.
The emergence of DALL·E marks important progress for artificial intelligence in creative tasks and shows the huge potential of AI in the field of visual arts. The latest DALL·E model is DALL·E 3.
Whisper
Whisper is an advanced automatic speech recognition (ASR) model developed by OpenAI. This model
focuses on transcribing speech into text and has shown excellent performance in multiple languages and
different environments. Here are some key features about the Whisper model:
Features
Multi-language support: Whisper models are capable of handling many different languages and dialects,
making them widely applicable across the globe.
High-precision recognition: It can accurately recognize and transcribe speech, maintaining a high accuracy
even in environments with a lot of background noise.
Adaptable to different contexts: Whisper can not only recognize standard voice input, but also adapt to
various colloquial and informal conversation styles.
Easy to integrate and use: As a machine learning model, Whisper can be integrated into various applications
and services to provide speech recognition capabilities.
Application
Automatic subtitles and transcription: Automatically generate subtitles or text for video and audio content.
Voice assistants and chatbots: Improve the ability of voice assistants and chatbots to recognize voice
commands.
Accessibility Services: Help people with hearing impairments better understand audio content.
Meeting and Lecture Recording: Automatically record and transcribe meeting or lecture content.
Overall, the Whisper model represents an important advancement in the field of automatic speech
recognition, and its multi-language and high-precision recognition capabilities make it extremely valuable in
a variety of application scenarios.
Microsoft & OpenAI
The partnership between Microsoft and OpenAI is an important development in contemporary artificial
intelligence. Microsoft has been an important partner and supporter of OpenAI since its inception. Here are
some key aspects and impacts of their collaboration:
Investment and Cooperation
Financial support: Microsoft made significant investments in OpenAI in its early days, including hundreds of
millions of dollars in funding. These investments help OpenAI develop its research projects and technology.
Cloud computing resources: Microsoft provides OpenAI with the resources of its Azure cloud computing
platform, which is crucial for training and running large AI models, such as GPT and DALL·E series models.
Technical cooperation
Joint research and development: The two companies have cooperated on multiple AI projects and
technologies to jointly promote the development of artificial intelligence.
Product integration: Some of OpenAI's technologies, such as GPT-3, have been integrated into Microsoft products and services, such as Microsoft Azure and other enterprise-level solutions.
Strategic Cooperation
Sustainable and safe AI: Both parties are committed to developing AI technology that is both sustainable
and safe, and pay attention to AI ethics and safety issues.
Expand AI applications: Through cooperation, the two companies are committed to applying AI technology
to a wider range of fields, such as health care, education, and environmental protection.
Influence
Accelerate the development of AI technology: This cooperation promotes the rapid development and
innovation of AI technology.
Business applications and services: Microsoft has promoted the widespread application of artificial
intelligence in the business field by applying OpenAI's technology to its products and services.
Promote AI democratization: This collaboration helps make advanced AI technology accessible and usable
to more enterprises and developers.
Overall, the cooperation between Microsoft and OpenAI is a model of combining technological innovation
and commercial applications. This cooperation has had a profound impact on the development and
popularization of artificial intelligence technology. As cooperation between the two parties continues to
deepen, it can be expected that they will continue to play an important role in the field of artificial
intelligence.
Azure OpenAI Service
Azure OpenAI Service is a collaboration between Microsoft Azure and OpenAI. Azure OpenAI Service is a
cloud-based platform that enables developers and data scientists to quickly and easily build and deploy
artificial intelligence models. With Azure OpenAI, users can access a variety of AI tools and technologies to
create intelligent applications, including natural language processing, computer vision, and deep learning.
Azure OpenAI Service is designed to accelerate the development of AI applications, allowing users to focus
on creating innovative solutions that create value for their organizations and customers.
Azure OpenAI Service provides REST API access to OpenAI's powerful language models, including GPT-4, GPT-4 Turbo with Vision, GPT-3.5-Turbo, and a family of embeddings models. Additionally, the new GPT-4 and GPT-3.5-Turbo model series are now officially released. These models can be easily adapted to specific tasks, including but not limited to content generation, summarization, image understanding, semantic search, and natural language to code conversion. Users can access the service through a REST API, the Python SDK, or the web-based interface in Azure OpenAI Studio.

To use Azure OpenAI Service, you need an Azure account; apply through this link, then wait 1-3 working days before you can use Azure OpenAI Service.
Azure OpenAI Studio

We can manage our models through Azure OpenAI Studio, as well as test them in the Playground.
Hugging Face

Model sharing and community: Hugging Face has built a strong community that promotes model and knowledge sharing among researchers and developers. Through its platform, anyone can upload, share, and use pre-trained models.
Research and Collaboration: Hugging Face conducts active research in the field of artificial intelligence,
collaborating with numerous teams in academia and industry.
Education and Resources: Hugging Face also provides a variety of educational resources, including
tutorials, documentation, and research papers, to help people better understand and use NLP technology.
Influence
Technological innovation: Hugging Face has played an important role in promoting technological innovation
in the field of NLP, especially in the development and application of pre-trained models.
Lowering the technical threshold: By providing easy-to-use tools and resources, Hugging Face lowers the
technical threshold for working in the field of NLP, allowing more researchers and developers to participate
in this field.
Community building: Its strong community and open source culture promote knowledge sharing and
collaboration, accelerating the development and innovation of NLP technology.
While Hugging Face originally started as a consumer-facing chatbot application, it quickly transformed into
a company focused on providing NLP technology and resources. Now, it not only supports research and
education, but also provides commercial solutions to enterprises, such as custom model training, data
processing and machine learning consulting services.
In summary, Hugging Face is a key player in the NLP field, and its open source ethos and contributions to
the community have played an important role in promoting the democratization and innovation of artificial
intelligence technology.
Azure AI Studio also supports the introduction of the Hugging Face model, which allows enterprises to
better combine business scenarios and use different models to solve problems in different application
scenarios.
Summary

This chapter introduced current knowledge related to LLMs, especially the mainstream large language model platforms such as OpenAI, Microsoft, and Hugging Face, as well as the application scenarios and performance of different models. In real applications it is impossible for us to use only one model; in the AI 2.0 era, we need the support of different models to complete more intelligent application scenarios. Whether in the cloud or locally, the application scenarios of large language models will be a hot topic for the next few years. As a beginner, what you need to do is understand the different models and build applications based on actual scenarios.
After clicking 'Create', configure the region where your Azure OpenAI resource is located. Please note: because resource distribution differs, different regions offer different OpenAI models. Make sure you understand this before choosing a region.
Go to the created resource, where you can deploy the model and obtain the Key and Endpoint required when calling the SDK.
Enter 'Model Deployment' and select 'Management Deployment' to enter Azure OpenAI Studio.
Congratulations, you have successfully deployed the model. Now you can use the SDK to connect it.
Using SDK with Azure OpenAI Service
The SDKs that interface with Azure OpenAI Service include the Python SDK released by OpenAI and the .NET SDK released by Microsoft. As a beginner, it is recommended to use them in a Notebook environment, which makes it easier to understand the key steps of execution.
Python SDK
The official Python SDK released by OpenAI supports connecting to both OpenAI and Azure OpenAI Service. OpenAI has now released version 1.x of the SDK, although many people are still using the 0.2x versions. **The content of this course is based on OpenAI SDK version 1.x and Python 3.10.x.**
.NET SDK
Microsoft has released an SDK for Azure OpenAI Service. You can get the latest package through NuGet to build generative AI applications in .NET. The content of this course is based on .NET 8 and the latest Azure.AI.OpenAI SDK, and Polyglot Notebook will also be used as the environment.

We have configured the SDK environment for .NET / Python above. Next, we need to create the client and complete the related initialization work.
Getting started with the Python environment

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint='Your Azure OpenAI Service Endpoint',
    api_key='Your Azure OpenAI Service Key',
    api_version='Your Azure OpenAI API version'
)
```

1. Completions API

Completions with .NET:

```csharp
Response<Completions> completionsResponse = client.GetCompletions(completionsOptions);
```

Completions with Python:

```python
response = client.completions.create(
    model=deployment_name,  # the name of your model deployment
    prompt=start_phrase,
    max_tokens=1000
)
```
2. Chat API

This is an API based on the gpt-35-turbo and gpt-4 models for chat scenarios.
Chat with .NET:

```csharp
Response<ChatCompletions> response = client.GetChatCompletions(chatCompletionsOptions);
```
Chat with Python:

```python
response = client.chat.completions.create(
    model="gpt-35-turbo",  # model = "deployment_name"
    messages=[
        {"role": "system", "content": "You are my coding assistant."},
        {"role": "user", "content": "Can you tell me how to write a Python Flask application?"}
    ]
)

print(response.choices[0].message.content)
```
3. Image generation API

Generate images with Python:

```python
import json

result = client.images.generate(
    model="dalle3",  # the name of your DALL·E 3 deployment
    prompt="Chinese New Year picture for the Year of the Dragon",
    n=1
)

json_response = json.loads(result.model_dump_json())
```
4. Embeddings API
This API is based on the text-embedding-ada-002 model and converts text into vectors.

Embeddings with .NET
Samples

Examples related to the above APIs are listed below:

Python examples: please click here

.NET examples: please click here
Summary

We used the most basic SDK to interact with Azure OpenAI Service. This is our first step towards generative AI programming. Without a framework, we can understand the different interfaces more quickly, and it also lays the foundation for us to move on to Semantic Kernel.
Python

Note: We are using the latest version of Semantic Kernel, 0.4.3.dev0, and using a Notebook to complete the related learning.

.NET

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
```

Python

```python
import semantic_kernel as sk
import semantic_kernel.connectors.ai.open_ai as skaoai
```
Create the kernel and register the chat service:

```python
kernel = sk.Kernel()
deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()
kernel.add_chat_service(
    "azure_chat_completion_service",
    skaoai.AzureChatCompletion(deployment, endpoint, api_key=api_key, api_version="2023-07-01-preview")
)
```
.NET

```csharp
var plugin = kernel.CreatePluginFromPromptDirectory(Path.Combine(pluginDirectory, "TranslatePlugin"));
```

Python

```python
pluginFunc = kernel.import_semantic_skill_from_directory(base_plugin, "TranslatePlugin")
```
Step 4: Running

.NET

```csharp
translateContent.GetValue<string>();
```

Python

```python
translateFunc = pluginFunc["Basic"]
```
We know that the original data of LLMs is time-limited, and if you want to add real-time content or enterprise knowledge, there are considerable shortcomings. OpenAI connects ChatGPT to third-party applications via plugins. These plugins enable ChatGPT to interact with developer-defined APIs, thereby enhancing ChatGPT's functionality and allowing a wider range of operations, such as:

- Retrieving real-time information, such as sports scores, stock prices, the latest news, etc.
- Retrieving knowledge base information, such as company documents, personal notes, etc.
- Assisting users with related operations, such as booking flights, ordering meals, etc.
Semantic Kernel follows the OpenAI plugin specification and can easily import and export plugins (such as plugins based on Bing, Microsoft 365, and OpenAI), which allows developers to easily call different plugin services. In addition to plugins compatible with OpenAI, Semantic Kernel also has its own way of defining plugins: not only can plugins be defined in a specified template format, but they can also be defined within a function.
Plugins Style

In early preview versions of Semantic Kernel, plugins were defined as Skills, in line with OpenAI's terminology as mentioned above, but the specific goals have not changed. You can understand Semantic Kernel's plugins as providing developers with various functions to complete intelligent business requirements. Think of Semantic Kernel as a Lego baseplate: to build a Great Wall on it, you need various functional modules, and these functional modules are what we call plugins.
We may have a business-based plugin set, such as HRPlugins, which contains different functions, such as:

| Plugin | Intro |
| --- | --- |
| Holidays | Holiday content |
| Contracts | About employee contracts |
| Departments | About organizational structure |

These sub-functions can be effectively combined according to different requirements to complete different planned tasks, for example in the following structure:
```txt
|-plugins
    |-HRPlugins
        |-Holidays
        |-Contract
        |-Departments
    |-EmailsPlugins
        |-HRMail
        |-CustomMail
    |-PeoplesPlugins
        |-Managers
        |-Workers
```
Suppose we issue an instruction to the LLM: "Please send an email to the manager whose contract has expired." The kernel finds a combination among the different components and completes the work based on Contract + Managers + HRMail.
Semantic Kernel's plugins
Define plugins through templates
We know we can have conversations with LLMs through prompt engineering. For an enterprise or a start-up, handling business often requires not a single prompt but a collection of prompts. We can put these business-oriented prompt collections into Semantic Kernel plugins. For prompt-based plugins, Semantic Kernel has a fixed template: the prompt is placed in the skprompt.txt file, and related parameter settings are placed in the config.json file. The final file structure looks like this:
```txt
|-plugins
    |-HRPlugins
        |-Holidays
            |-skprompt.txt
            |-config.json
        |-Contract
            |-skprompt.txt
            |-config.json
        |-Departments
            |-skprompt.txt
            |-config.json
    |-EmailsPlugins
        |-HRMail
            |-skprompt.txt
            |-config.json
        |-CustomMail
            |-skprompt.txt
            |-config.json
    |-PeoplesPlugins
        |-Managers
            |-skprompt.txt
            |-config.json
        |-Workers
            |-skprompt.txt
            |-config.json
```
Let's first take a look at the definition of skprompt.txt. This is generally where business-related prompts are placed; multiple parameters are supported, and each parameter is placed in {{$parameter_name}}, as in the following format:
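The prompt file itself is not reproduced here; a minimal illustrative skprompt.txt for the translation example described below could be:

```
Translate {{$input}} into {{$language}}
```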
Our job here is to translate the input content into a specific language. input and language are the two parameters, which means you can give any value to either of them.
config.json contains configuration-related content. In addition to setting parameters related to LLMs, you
can also set input parameters and related descriptions.
```json
{
    "schema": 1,
    "type": "completion",
    "description": "Translate sentences into a language of your choice",
    "completion": [
        {
            "max_tokens": 2000,
            "temperature": 0.7,
            "top_p": 0.0,
            "presence_penalty": 0.0,
            "frequency_penalty": 0.0,
            "stop_sequences": [
                "[done]"
            ]
        }
    ],
    "input": {
        "parameters": [
            {
                "name": "input",
                "description": "sentence to translate",
                "defaultValue": ""
            },
            {
                "name": "language",
                "description": "Language to translate to",
                "defaultValue": ""
            }
        ]
    }
}
```
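To make the two parameters concrete, here is a hedged sketch of invoking this translation function with the SK 0.4.x-era Python API used elsewhere in this book (method names may differ in later versions):

```python
import semantic_kernel as sk

# Assumes the plugin was imported as shown earlier:
# pluginFunc = kernel.import_semantic_skill_from_directory(base_plugin, "TranslatePlugin")
translateFunc = pluginFunc["Basic"]

# Fill the two parameters declared in config.json.
context_vars = sk.ContextVariables()
context_vars["input"] = "Hello, how are you?"
context_vars["language"] = "French"

result = await kernel.run_async(translateFunc, input_vars=context_vars)
print(result)
```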
Define plugins within a function

In .NET, mark a native function with the KernelFunction attribute:

```csharp
[KernelFunction, Description("search weather")]
public string WeatherSearch(string text)
{
    return "Guangzhou, 2 degree, rainy";
}
```
Note: It is recommended to use business class encapsulation to define different extension functions to make them easier to call, for example:

```csharp
using Microsoft.SemanticKernel;
using System.ComponentModel;
using System.Globalization;

// The class wrapper matches the CompanySearchPlugin import shown below.
public class CompanySearchPlugin
{
    [KernelFunction, Description("search weather")]
    public string WeatherSearch(string text)
    {
        return text + ", 2 degree, rainy";
    }
}
```
```csharp
var companySearchPlugin = kernel.ImportPluginFromObject(companySearchPluginObj, "CompanySearchPlugin");
```

Read the result:

```csharp
weatherContent.GetValue<string>()
```
Python

To define a function extension, decorators need to be added.

Place all custom functions in native_function.py and define them through classes, such as:
```python
from semantic_kernel.skill_definition import sk_function, sk_function_context_parameter
from semantic_kernel.orchestration.sk_context import SKContext


class API:
    @sk_function(
        description="Get news from the web",
        name="NewsPlugin"
    )
    @sk_function_context_parameter(name="location", description="location name")
    def get_news_api(self, context: SKContext) -> str:
        return "Get news from the " + context["location"] + "."

    @sk_function(
        description="Search Weather in a city",
        name="WeatherFunction"
    )
    @sk_function_context_parameter(name="city", description="city string")
    def ask_weather_function(self, context: SKContext) -> str:
        return "Guangzhou's weather is 30 celsius degrees, and very hot."

    @sk_function(
        description="Search Docs",
        name="DocsFunction"
    )
    @sk_function_context_parameter(name="docs", description="docs string")
    def ask_docs_function(self, context: SKContext) -> str:
        return "ask docs: " + context["docs"]
```
Calling functions:

```python
api_plugin = kernel.import_native_skill_from_directory(base_plugin, "APIPlugin")

context_variables = sk.ContextVariables(variables={
    "location": "China"
})

# The invocation itself is elided in the original; with the SK 0.4.x-era API it
# was typically something like:
# news_result = await kernel.run_async(api_plugin["NewsPlugin"], input_vars=context_variables)

print(news_result)
```
Predefined plugins

Semantic Kernel has many predefined plugins as capabilities for solving general business problems. Of course, as Semantic Kernel matures, more built-in plugins will be incorporated.
Samples
.NET samples: please click here

Python samples: please click here
Summary
Through plugins, you can accomplish more work based on business scenarios. Through studying this chapter, you have mastered the basics of defining plugins and how to use them. I hope you can create a plugin library for your enterprise based on real business scenarios.
Import the library:

```csharp
using Microsoft.SemanticKernel.Planning;
```

Note: When using Handlebars, be aware that these features are still evolving; please ignore SKEXP0060 when using them.
Python

Import the library:
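The Python import lines are not shown here; as a hedged sketch against the SK 0.4.x-era Python API (planner names and signatures have changed in later releases), a basic flow looked roughly like this:

```python
from semantic_kernel.planning.basic_planner import BasicPlanner

planner = BasicPlanner()

# Ask the planner to compose the imported plugins into a plan, then execute it.
plan = await planner.create_plan_async("Translate 'hello' into French", kernel)
result = await planner.execute_plan_async(plan, kernel)
print(result)
```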
Note: The settings of Python's Planner and .NET's Planner are different here. Python should eventually be synchronized with .NET, so when using Python, be prepared for future changes.
The Changing Planner

The official blog mentions changes in Planner: https://devblogs.microsoft.com/semantic-kernel/migrating-from-the-sequential-and-stepwise-planners-to-the-new-handlebars-and-stepwise-planner/ combines Function Calling and rearranges the different Planner integrations in the preview version. You can follow this content to learn more.

If you want to understand the principles of the Planner implementation, please refer to https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/Planners/Planners.Handlebars/Handlebars/CreatePlanPrompt.handlebars
Sample
.NET sample: please click here

Python sample: please click here
Summary

The addition of Planner greatly improves the usability of Semantic Kernel, especially for business and tool scenarios. Building an enterprise-level plugin library is also very important for applying Planner; after all, we use plugins to combine different tasks to complete the work.
Embedding Skills
Many industries hope to harness the capabilities of LLMs to solve their own internal problems. These include employee-related content such as onboarding instructions, leave and reimbursement processes, and benefit inquiries; enterprise business-flow content such as relevant documents, regulations, and execution processes; and some customer-facing inquiries. Although LLMs have strong general knowledge, they cannot obtain industry-specific data and knowledge on their own. So how do we inject this industry knowledge? This is an important step for LLMs to enter the corporate world. In this chapter, we will talk about how to inject industry data and knowledge to make LLMs more professional. This is the basis on which we create RAG applications.
Let's start with vectors in natural language

In the field of natural language, the finest granularity is the word: words form sentences, and sentences form paragraphs, chapters, and finally documents. Computers don't understand words, so we need to convert words into a mathematical representation. This representation is a vector. A vector is a mathematical concept: a quantity with both magnitude and direction. With vectors we can effectively represent text numerically, which is the foundation of computer natural language processing. There are many vectorization methods, such as One-hot, TF-IDF, Word2Vec, GloVe, ELMo, GPT, and BERT. These methods have their own advantages and disadvantages, but they are all based on word vectorization, that is, word vectors. Word vectors are the basis of natural language processing and of all OpenAI models. Let's look at several common word vector methods.
One-hot
One-hot encoding uses 0s and 1s to represent words. For example, we have 4 words: I, love, Beijing, and Tiananmen. Then we can use 4 vectors to represent these 4 words:

```txt
I         = [1, 0, 0, 0]
love      = [0, 1, 0, 0]
Beijing   = [0, 0, 1, 0]
Tiananmen = [0, 0, 0, 1]
```
In traditional natural language applications, we treat each word, represented by a One-Hot vector, as a unique discrete symbol. The number of words in the vocabulary is the dimension of the vector; the example above contains four words, so a four-dimensional vector suffices. In this representation each word is independent and has no relationship to any other. One-Hot vectors are simple and easy to understand, but their disadvantage is obvious: the dimension grows with the vocabulary. With 1000 words, the vector has 1000 dimensions and is very sparse (most of its values are 0), which wastes computation. Therefore, the disadvantages of One-Hot encoding are large vector dimensions, a large amount of calculation, and low computational efficiency.
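A minimal Python sketch (illustrative, using the four-word vocabulary above) makes the idea concrete:

```python
# One-hot encoding over a fixed vocabulary: each word becomes a vector with a
# single 1 at its vocabulary index and 0 everywhere else.
vocabulary = ["I", "love", "Beijing", "Tiananmen"]

def one_hot(word: str) -> list[int]:
    return [1 if word == w else 0 for w in vocabulary]

print(one_hot("Beijing"))  # [0, 0, 1, 0]
```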
TF-IDF
TF-IDF is a statistic that evaluates the importance of a word to a corpus. TF-IDF is the abbreviation of Term Frequency-Inverse Document Frequency. The main idea of TF-IDF is: if a word appears frequently in one article but rarely in other articles, then this word is a keyword of that article. We usually split the concept into two parts: TF and IDF.
TF - Term Frequency
Term Frequency refers to the frequency with which a word appears in an article:

TF = (number of times the word appears in the article) / (total number of words in the article)
One problem with TF alone is that if a word appears many times in an article, its TF value will be very large and we would consider it a keyword of the article. But then many common words would also qualify as keywords, and we could no longer tell which words really matter. So we need to adjust TF, and that adjustment is IDF.
IDF - Inverse Document Frequency

Inverse Document Frequency measures how rare a word is across all articles:

IDF = log(total number of documents in the corpus / (number of documents containing the word + 1))

In the IDF formula, 1 is added to the denominator to avoid division by zero. The total number of documents in the corpus is fixed, so we only need to count the documents containing the word. If a word appears in many articles, its IDF value will be small; if it appears in few articles, its IDF value will be large. We can then calculate the TF-IDF value of a word by multiplying TF and IDF:

TF-IDF = TF × IDF
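A small Python sketch (with an illustrative corpus and whitespace tokenization) that follows the formulas above:

```python
import math

def tf(word: str, document: str) -> float:
    # Term frequency: occurrences of the word / total words in the document.
    words = document.split()
    return words.count(word) / len(words)

def idf(word: str, corpus: list[str]) -> float:
    # Inverse document frequency, with +1 in the denominator to avoid division by zero.
    containing = sum(1 for doc in corpus if word in doc.split())
    return math.log(len(corpus) / (containing + 1))

def tf_idf(word: str, document: str, corpus: list[str]) -> float:
    return tf(word, document) * idf(word, corpus)

corpus = [
    "the cat sat on the mat",
    "the dog barked at the cat",
    "dogs and cats are pets",
]
print(tf_idf("mat", corpus[0], corpus))  # "mat" is rare, so it scores above zero
```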
Word2Vec

Word2Vec is also called Word Embeddings. Its main idea is that the semantics of a word can be determined by its context. Word2Vec has two models: CBOW and Skip-Gram. CBOW, short for Continuous Bag-of-Words, predicts a word from its context; Skip-Gram does the opposite and predicts the context from a word. Compared with One-Hot and TF-IDF encodings, the advantage of Word2Vec is that it captures the semantics of words and the relationships between words. Its disadvantage is that it is computationally intensive and requires a large corpus.
We mentioned before that the dimension of One-Hot encoding equals the number of words, while the dimension of Word2Vec encoding can be specified, generally 100 or 300. The higher the dimensionality, the richer the relationships between words that can be captured, but the greater the computational cost; the lower the dimensionality, the simpler the relationships, but the smaller the amount of calculation. 100 or 300 dimensions meet most application scenarios.
In other words, what Word2Vec produces are Word Embeddings: word vectors of a specified dimension.
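As an illustrative sketch using the third-party gensim library (not used elsewhere in this book; assumes `pip install gensim`), training a tiny Word2Vec model with a chosen dimension looks like this:

```python
from gensim.models import Word2Vec

# A toy corpus of tokenized sentences.
sentences = [
    ["I", "love", "Beijing"],
    ["I", "love", "natural", "language", "processing"],
]

# vector_size sets the embedding dimension (100 and 300 are common choices);
# sg=1 selects Skip-Gram, sg=0 (the default) selects CBOW.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
print(model.wv["Beijing"].shape)  # (100,)
```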
GPT

The training process of GPT is divided into two stages: pre-training and fine-tuning. The pre-training corpora are Wikipedia and BookCorpus, and the fine-tuning corpora come from different natural language tasks. The goal of pre-training is to predict a word from its context; the goal of fine-tuning is to adapt the semantic model to different natural language tasks, such as text classification, text generation, and question answering systems.

The GPT model has gone through four stages, the most famous of which are GPT-3.5 and GPT-4 as used by ChatGPT. GPT opens a new era, and its emergence lets us see the infinite possibilities of natural language processing. The advantage of the GPT model is that it captures the semantics of words and the relationships between them; its disadvantage is that it is computationally intensive and requires a large corpus. Many people hope to have a GPT benchmarked to their own industry, and this is the problem we need to solve in this chapter.
BERT
BERT is the abbreviation of Bidirectional Encoder Representations from Transformers. BERT is a pre-trained model whose training corpora are Wikipedia and BookCorpus. The main idea of BERT is that the semantics of a word can be determined by its context. Like Word2Vec and GPT, BERT captures the semantics of words and the relationships between them; its disadvantage, again, is that it is computationally intensive and requires a large corpus.
Vector embedding

We have mentioned One-Hot encoding, TF-IDF encoding, Word2Vec encoding, BERT encoding, and GPT models. These encodings and models are all types of embedding technology. The main idea of embeddings is that the semantics of a word can be determined by its context; their advantage is that they capture the semantics of words and the relationships between them. Embeddings are the basis of deep learning for natural language; their emergence lets us see the infinite possibilities of natural language processing.
For the embedding of text content, let's combine this with the previous section. You will find that since the birth of Word2Vec, the embedding of text content has been continuously strengthened: from Word2Vec to GPT to BERT, the effect keeps getting better. The essence of embedding technology is "compression": using fewer dimensions to represent more information, which saves storage space and improves computing efficiency.
In Azure OpenAI Service, embedding technology is widely used to convert text strings into floating-point vectors and to measure the similarity between texts by the distance between vectors. If an industry wants to add its own data, it can convert that enterprise data into vectors with the OpenAI Embeddings model text-embedding-ada-002 and save the mapping. At query time, the question is also converted into a vector, and similarity algorithms find the closest top-N results, locating the enterprise content related to the question.
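As a short hedged sketch of this flow, using the OpenAI Python SDK 1.x client style from the earlier SDK chapter ("EmbeddingModel" is an assumed deployment name for text-embedding-ada-002):

```python
# Convert a question into a query vector with the text-embedding-ada-002 deployment.
response = client.embeddings.create(
    model="EmbeddingModel",  # assumed deployment name
    input="What is the company's leave process?"
)
query_vector = response.data[0].embedding  # a list of 1536 floating-point numbers
```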
We can vectorize enterprise data, save it in a vector database, and then use the text-embedding-ada-002 model to query by vector similarity to find the enterprise content related to the question. Commonly used vector databases include Qdrant, Milvus, Faiss, Annoy, NMSLIB, etc.
OpenAI's Embeddings model

OpenAI's text embeddings measure the correlation of text strings. Embeddings are usually used in the following scenarios:
- Search (results are ranked by relevance to a query string)
- Clustering (text strings are grouped by similarity)
- Recommendations (items with related text strings are recommended)
- Anomaly detection (outliers with little correlation are identified)
- Diversity measurement (similarity distributions are analyzed)
- Classification (text strings are classified by their most similar label)
Embeddings are vectors (lists) of floating-point numbers. The distance between two vectors measures their correlation: small distances indicate high correlation, and large distances indicate low correlation. For example, a string "dog" with embedding [0.1, 0.2, 0.3] is more correlated with a string "cat" with embedding [0.2, 0.3, 0.4] than with a string "car" with embedding [0.9, 0.8, 0.7].
Semantic Kernel's Embeddings
Semantic Kernel has very good support for embeddings. In addition to supporting text-embedding-ada-002, it also supports vector databases. Semantic Kernel abstracts over vector databases, so developers can call them through a consistent API. This case uses Qdrant as an example. In order to run the example smoothly, please install Docker first, then install and run the Qdrant container. The running script is as follows:
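As an illustrative sketch (assuming the official qdrant/qdrant image; 6333 is Qdrant's default REST port, matching the code below):

```bash
docker pull qdrant/qdrant
docker run -p 6333:6333 qdrant/qdrant
```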
.NET

Add the NuGet library, then reference it:

```csharp
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Connectors.Qdrant;
```

Note: The Semantic Kernel Memory component is still in the adjustment stage, so you need to be aware of the risk of interface changes, and you also need to ignore the related compiler warnings.
Python

Reference the library and add the embedding service:

```python
kernel.add_text_embedding_generation_service(
    "embeddings_services",
    AzureTextEmbedding("EmbeddingModel", endpoint, api_key=api_key, api_version="2023-07-01-preview")
)
```
Add Memory

```python
qdrant_store = QdrantMemoryStore(vector_size=1536, url="http://localhost", port=6333)
await qdrant_store.create_collection_async('aboutMe')
kernel.register_memory_store(memory_store=qdrant_store)
```
Note: The Memory component is still in the adjustment stage, so you need to pay attention to the risk of
interface changes.
Save and search your vectors in Semantic Kernel

In Semantic Kernel, the interfaces to different vector stores are unified through abstraction, so you can easily save and search your vectors.

.NET

Save vector data

Python

Save and search vector data
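The save and search calls below are a hedged sketch against the SK 0.4.x-era Python memory API used above (method names have changed in later versions); the collection name matches the 'aboutMe' collection created earlier, while the ids and texts are illustrative:

```python
# Save a few facts into the 'aboutMe' collection created above (illustrative texts).
await kernel.memory.save_information_async('aboutMe', id='info1', text='My name is Kinfey')
await kernel.memory.save_information_async('aboutMe', id='info2', text='I live in Guangzhou')

# Search the collection by semantic similarity; `memories` feeds the loop below.
memories = await kernel.memory.search_async('aboutMe', 'what is my name', limit=2)
```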
```python
i = 0
for memory in memories:
    i = i + 1
    print(f"Top {i} Result: {memory.text} with score {memory.relevance}")
```
You can easily access any vector database to complete the related operations, which also means that you can build RAG applications very simply.
Sample

.NET sample: please click here

Python sample: please click here
Summary

Much enterprise data enters LLMs through embeddings to build RAG applications. Semantic Kernel gives us a very simple way to complete the related functions in both Python and .NET, which is very helpful for anyone who wants to add RAG capabilities to their projects.
Semantic Kernel Cookbook