Ebook 2023 Glossary AI Terms
Intelligence Glossary
2023 Edition
Introduction
knowledge of AI.
[...] activation function of a node defines the output of that node given an input or set [...]
[...] for finding bounding boxes in an object detection problem. For example, square [...] detection models.
A statistical way of comparing two (or [...]
[...] input. These notorious inputs are indistinguishable to the human eye, but cause the network to fail to identify the [...]
There are many ways to describe a bounding box's size and position (JSON, XML, TXT, etc) and to delineate which [...]
A research field that lies at the intersection of machine learning (ML) and computer [...]
Describes what types of objects you are identifying. For example, "chess pieces" or [...]
Accuracy
Refers to the percentage of correct predictions the classifier made.
AI Algorithms
Extended subset of machine learning that tells the computer how to learn to operate on its own through a set of rules or instructions.
Application Programming Interface (API)
A set of commands, functions, protocols, and objects that programmers can use to create software or interact with an external system.
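
As a small illustration of the Accuracy entry above, the percentage of correct predictions can be computed by comparing each prediction with its true label. This is a minimal sketch, not a reference implementation; the function name `accuracy` and the sample labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: 4 of the 5 predictions are correct.
labels = ["spam", "not spam", "spam", "spam", "not spam"]
predictions = ["spam", "not spam", "not spam", "spam", "not spam"]
print(f"{accuracy(labels, predictions):.0%}")  # prints 80%
```

Accuracy on its own can be misleading when one class is much more common than the others, which is why it is usually read alongside measures such as precision.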
Architecture
A specific neural network layout (layers, neurons, blocks, etc). These often come in multiple sizes whose design is similar except for the number of parameters.
Artificial Intelligence
Artificial Neural Network
A learning model created to act like a human brain that solves tasks that are too difficult for traditional computer systems to solve.
Automatic Speech Recognition (ASR)
A technology that processes human speech into readable text.
Automates each step of the ML workflow so that it’s easier for users, with minimal effort and machine learning expertise.
What is Autonomous AI?
The most advanced form of AI is autonomous artificial intelligence, in which processes are automated to generate the intelligence that allows machines, bots and systems to act on their own, independent of human intervention. It is often used in autonomous vehicles.
This field of AI is still very new, and researchers are continually refining their algorithms and their approaches to the problem, but it entails multiple layers.
Building a model of the constantly shifting world requires a collection of sensors that are usually cameras and often controlled lighting from lasers or other sources. The sensors usually also include position information from GPS or some other independent mechanism.
Fusion
The details from the various sensors must be organized into a single, [...] sensor fusion algorithms must sort through the details and construct a reliable model that can be used in later stages for planning.
Finding the best path forward requires studying the model and also importing information from other sources like mapping software, weather forecasts, traffic sensors and more.
Control
After a path is chosen, any device must ensure that the motors and steering work to move along the path without being diverted by bumps or [...]
In general, information flows from the top layer of the sensors down to the control layer as decisions are made. There are feedback loops, though, [...]
makes errors.
B
Batch Inference
Asynchronous process that executes predictions based on existing models and observations, and then stores the output.
Batch Size
The number of training examples utilized [...] of model training.
Black Box AI
An AI system whose inputs and operations are not visible to the user. A black box, in a general sense, is an impenetrable system.
Boosting
A machine learning technique that [...]
Brute Force Search
A search that isn't limited by clustering/approximations; it searches across all inputs. Often more time-consuming and expensive, but more thorough.
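
To make the Brute Force Search entry concrete, the sketch below finds the nearest neighbor of a query by comparing it with every candidate. It is one possible illustration only; the function name `brute_force_nearest`, the use of Euclidean distance, and the sample vectors are assumptions made for the example.

```python
import math


def brute_force_nearest(query, candidates):
    """Return (index, distance) of the candidate closest to query.

    Every candidate is compared with the query, so the result is exact,
    but the cost grows linearly with the number of candidates.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    best_index, best_distance = -1, math.inf
    for index, candidate in enumerate(candidates):
        distance = math.dist(query, candidate)  # Euclidean distance
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance


# Hypothetical 2-D vectors standing in for the inputs being searched.
vectors = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
index, distance = brute_force_nearest((0.9, 1.2), vectors)
print(index)  # prints 2
```

Approximate methods trade some of this thoroughness for speed by only examining part of the data, for example a few clusters.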
C
Calibration Layer
A post-prediction adjustment, typically to account for prediction bias. The adjusted predictions and probabilities should match the distribution of an observed set of labels.
Chatbot
Simulates human conversation, using response workflows or artificial intelligence to interact with people based on verbal and written cues. Chatbots have become increasingly sophisticated in recent years and in the future may be indistinguishable from humans.
Checkpoint
Data that captures the state of the variables of a model at a particular time. Checkpoints enable exporting model weights, performing training across multiple sessions and continuing training past errors.
Class
One of a set of enumerated target values for a label. For example, in a binary classification model that detects spam, the two classes are spam and not spam. In a multi-class classification model that identifies dog breeds, the classes would be poodle, beagle, pug, etc.
Class Balance
The relative distribution between the number of examples of each class used to train a model. A model performs better if there are a relatively even number of examples for each class.
Classification
Process of grouping and categorizing objects and ideas recognized, differentiated, and understood in data.
Classifier
An algorithm that implements classification. It refers to the mathematical function implemented by a classification algorithm that maps input data to a category.
Cluster
A group of observations that show similarities to each other and are organized by similarities.
Clustering
A method of unsupervised learning and common statistical data analysis technique. In this method, observations that show similarities to each other are organized into groups (clusters).
Cognitive Computing
A computerized model that mimics the way the human brain thinks. It involves self learning through the use of data mining, natural language processing, and pattern recognition.
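
The Class Balance entry above can be checked before training by counting how many examples each class has. The following is a minimal sketch under stated assumptions: the helper `class_balance` and the label list are hypothetical and reuse the spam example from the Class entry.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the training examples."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Hypothetical training labels for the spam example.
training_labels = ["spam"] * 2 + ["not spam"] * 8
balance = class_balance(training_labels)
print(balance)  # prints {'spam': 0.2, 'not spam': 0.8}
```

Here one class makes up only 20% of the data, which suggests collecting more spam examples or rebalancing before training.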
Computer Vision
Field of AI that trains computers to interpret and understand the visual world. Using digital images from cameras and videos and deep learning models, machines can accurately identify and classify objects, and then react to what they “see.”
Concept
Describes an input, similar to a "tag" or "keyword." There are two types: those that you specify to train a model, and those that a model assigns as a prediction.
Confidence
A model is inherently statistical. Along with its prediction, it also outputs a confidence value that quantifies how "sure" it is that its prediction is correct.
Confidence Threshold
We often discard predictions that fall below a certain bar. This bar is the confidence threshold.
Container
A virtualized environment that packages its dependencies together into a portable environment. Docker is one common way to create containers.
Convolutional Filter
A convolution is a type of block that helps a model learn information about relationships between nearby pixels.
Convolutional Neural Network
Convolutional neural networks are deep artificial neural networks that are used primarily to classify images (e.g. name what they see), cluster them by similarity (photo search), and perform object recognition within scenes.
CoreML
A proprietary format used to encode weights for Apple devices that takes advantage of the hardware accelerated neural engine present on iPhone and iPad devices.
Curse of Dimensionality
The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.
Custom Dataset
A set of images and annotations pertaining to a domain specific problem. In contrast to a research benchmark dataset like COCO or Pascal VOC.
Custom Training
The process of teaching a model to make certain predictions.
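
The Confidence Threshold entry above describes discarding predictions that fall below a bar. A minimal sketch of that step follows; the dictionary layout with `concept` and `confidence` keys, the function name, and the sample values are assumptions chosen for illustration, not the output format of any particular platform.

```python
def filter_by_confidence(predictions, threshold=0.5):
    """Keep only predictions whose confidence meets the threshold."""
    return [p for p in predictions if p["confidence"] >= threshold]


# Hypothetical model output: one concept name and confidence per prediction.
predictions = [
    {"concept": "poodle", "confidence": 0.92},
    {"concept": "beagle", "confidence": 0.41},
    {"concept": "pug", "confidence": 0.07},
]
kept = filter_by_confidence(predictions, threshold=0.4)
print([p["concept"] for p in kept])  # prints ['poodle', 'beagle']
```

Raising the threshold returns fewer but more reliable predictions; lowering it returns more predictions at the cost of more mistakes.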
to understand, but the related terminology (such as sound, activity, and text
can be… confusing. classification).
D
Dataset
A collection of data and a ground truth of outputs that you use to train machine [...]
Deep Neural Network
An artificial neural network (ANN) with multiple layers between the input and [...]
[...] structured data is highly specific and is stored in a predefined format such as a spreadsheet table, whereas unstructured data is a conglomeration of many varied [...]
[...] purposes it can be considered duplicate data. Using visual search, a similarity threshold can be set to decide what should be removed.
Taking the results of a trained model and using them to do inference on real world data. This could mean hosting a model on a server or installing it to an edge device.
being “datum”.
Also known as object detection. A model that [...]
[...] objects within images or video frames.
Data Annotation
The process of collecting, organizing, cleaning, labeling, and maintaining data for [...]
[...] ages, races and ethnicities, abilities and disabilities, genders, religions, cultures, [...]
Data Mining
The process by which patterns are discovered within large sets of data with the goal of extracting useful information [...]
Deep Learning
The general term for machine learning using layered (or deep) algorithms to learn patterns in data. It is most often used for supervised learning problems.
Domain Adaptation
A technique to improve the performance of a model where there is little data in the target domain by using knowledge learned by [...]
Folksonomy
User-generated system of classifying and organizing online content into different categories by the use of metadata such as electronic tags.
Framework
Deep learning frameworks implement neural network concepts. Some are designed for training and inference (TensorFlow, PyTorch, FastAI, etc.), and others are designed particularly for speedy inference (OpenVINO, TensorRT, etc.).
Generative Adversarial Networks (GANs)
A class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework. This technique can generate photographs that look at least superficially authentic to human observers, having many realistic characteristics (though in tests people can tell real from generated in many cases).
Generative AI
Models that can be trained using existing content like text, audio files, or images to create new original content.
Grid Search
Grid search is a tuning technique that attempts to compute the optimal values of hyperparameters for training models by performing an exhaustive search through a subset of hyperparameters.
H
Hashing
In machine learning, a mechanism for bucketing categorical data, particularly when the number of categories is large, but the number of categories actually appearing in the dataset is comparatively small.
Hidden Layer
A synthetic layer in a neural network between the input layer (that is, the features) and the output layer (the prediction). Hidden layers typically contain an activation function (such as ReLU) for training. A deep neural network contains more than one hidden layer.
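
The Grid Search entry above can be sketched as a loop over every combination of candidate hyperparameter values. This is a minimal illustration, not a definitive implementation: `grid_search` and `toy_score` are hypothetical names, and `toy_score` merely stands in for training a model and returning its validation score.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Try every combination in grid and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, batch_size):
    """Hypothetical stand-in for 'train a model, return validation score'."""
    return -abs(learning_rate - 0.01) - abs(batch_size - 32) / 100


grid = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [16, 32, 64]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params)  # prints {'learning_rate': 0.01, 'batch_size': 32}
```

The number of combinations is the product of the list lengths (nine here), so the cost of an exhaustive search grows quickly as more hyperparameters or values are added.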
Holdout Data
Examples intentionally not used during training. The validation dataset and test dataset are examples of holdout data. It helps evaluate your model's ability to generalize to data other than the data on which it was trained.
Hyperparameter
The levers by which you can tune your model during training. These include things like learning rate and batch size. You can experiment with changing hyperparameters to see which ones perform best with a given model for [...]
Image Segmentation
The process of dividing a digital image into multiple segments with the goal of simplifying the representation of an image into something that is easier to analyze. Segmentation divides whole images into pixel groupings, which can [...]
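
As an illustration of the Holdout Data entry above, the sketch below sets aside a fraction of the examples so that they are never used for training. The function name `holdout_split`, the 20% default, and the fixed seed are assumptions made for the example.

```python
import random


def holdout_split(examples, holdout_fraction=0.2, seed=0):
    """Shuffle examples and split them into (training, holdout) lists."""
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    holdout_size = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[holdout_size:], shuffled[:holdout_size]


examples = list(range(10))  # hypothetical example IDs
training, holdout = holdout_split(examples, holdout_fraction=0.2)
print(len(training), len(holdout))  # prints 8 2
```

Shuffling before splitting avoids a holdout set that only contains, say, the most recently collected examples.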
Hosted Model
A set of trained weights located in the cloud that you can receive predictions [...]
I
A large visual database designed for [...] software research.
Image Recognition
[...] actions in images.
Information Retrieval
The area of Computer Science studying [...]
Any information or data sent to a computer [...]
Input Layer
The first layer (the one that receives the [...]
Implicit Bias
Automatically making an association or assumption based on one's mental models [...]
Intelligent Character Recognition (ICR)
Related technology to OCR designed to recognize handwritten characters.
and developed.
J
Jetson
An edge computing device created by NVIDIA that includes an onboard GPU.
JSON
A freeform data serialization format originally created as part of JavaScript but now used much more broadly. Many [...]
A common data science tool that enables you to execute Python code [...]
K
L
Label
Assigning a class or category to a specific object in your dataset.
Labeling
Also known as data labeling; the process of annotating datasets to train machine learning models.
[...] process itself as well as written definitions and a multitude of visual examples for [...]
[...] workflows to label images and video at [...]
M
Machine Intelligence
An umbrella term that encompasses machine learning, deep learning and [...]
Machine Learning
A general term for algorithms that can learn patterns from existing data and use [...]
A language model that predicts the probability of candidate tokens to fill in [...]
[...] object, a component of an object, or a [...] model.
Multimodal Model
Named Entity Recognition
Pre-trained Model
[...] generally using another data set. (for example, finding lines, corners, and [...] large dataset like the huge Common Objects in Context (COCO), which has [...] objects to detect, can reduce the number of custom images you need to [...]
Used to model the probability of different outcomes in a process that cannot easily [...] random variables. It’s a technique used to understand the impact of risk and [...] working on nuclear weapons in the 1940s, and was given the code name “Monte [...]
A model that uses observations [...] probability that a different sample or remainder of the population will exhibit the same behavior or have [...]
[...] for each concept, this model indicates [...] coloring book) of regions for each concept.
Classification problems that distinguish between more than two classes. For example, there are approximately 53 species [...]
Software that is installed and runs on computers located on the premises of the [...]
[...] have to be in a similar domain as your training example.
Open Neural Network Exchange (ONNX)
Optical Character Recognition (OCR)
A computer system that takes images of typed, handwritten, or printed text and converts them into machine-readable text.
The selection of the best element (with regard to some criterion) from some set of [...]
Paying people to annotate, or label, your data. Its effectiveness can depend on the domain expertise of annotators. Providing comprehensive labeling criteria is crucial for training annotators before beginning a project.
Overfitting
A machine learning problem where an algorithm is unable to discern information that is relevant to its assigned task from information which is irrelevant within training data. Overfitting inhibits the algorithm's predictive performance when [...]
P
A branch of machine learning that focuses on the recognition of patterns and regularities in data, although it is in some cases considered to be nearly synonymous with machine learning.
Precision
Indicator of a machine learning model's performance: the quality of a positive prediction made by the model. Refers to the number of true positives divided by the total number of positive predictions.
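
The Precision entry above reduces to a short calculation: true positives divided by all positive predictions. The sketch below is a minimal example; the function name, the 0/1 label encoding, and the choice to return 0.0 when nothing is predicted positive are assumptions made here.

```python
def precision(y_true, y_pred, positive=1):
    """Return true positives divided by all positive predictions."""
    true_positives = sum(
        1 for t, p in zip(y_true, y_pred) if p == positive and t == positive
    )
    predicted_positives = sum(1 for p in y_pred if p == positive)
    if predicted_positives == 0:
        return 0.0  # convention chosen here when nothing is predicted positive
    return true_positives / predicted_positives


# Hypothetical labels: 3 positive predictions, 2 of them correct.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
print(precision(y_true, y_pred))  # 2 of 3 positive predictions are correct
```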
[...] to precision.
Prediction
An attempt by a model to replicate the ground truth. A prediction usually contains a confidence value for each class.
PyTorch
A popular open source deep learning framework developed by Facebook. It focuses on accelerating the path from research prototyping to production deployment.
Regression
A statistical measure used to determine the strength of the relationships between dependent and independent variables.
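
One common way to make the Regression entry concrete is simple linear regression: fit a least-squares line to one independent and one dependent variable and report Pearson's correlation coefficient as the strength of the relationship. The sketch below assumes that simple setting; `linear_fit` and the data points are hypothetical.

```python
def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept, plus Pearson's r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with 2+ points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # +1 or -1 means a perfect linear relationship
    return slope, intercept, r


# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept, r = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, r)  # prints 2.0 1.0 1.0
```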
Reinforcement Learning
A type of machine learning in which machines are "taught" to achieve their [...] reinforcement when they do not. This is differentiated from supervised learning, which would require an annotation for every individual action the algorithm would take.
S
[...] needs. If the query itself is a piece of visual content then that is what is known as a "visual search query."
Specificity
The rate of how often a model predicts “no,” when it’s actually “no.”
[...] to one of a fixed set of categories. In machine learning, this is often achieved by learning a function that maps an input to a score for each potential category.
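
The Specificity entry above corresponds to true negatives divided by all actual negatives. A minimal sketch follows; the function name, the 0/1 encoding with 0 as the negative ("no") class, and the sample labels are assumptions for illustration.

```python
def specificity(y_true, y_pred, negative=0):
    """Return true negatives divided by all actual negatives."""
    actual_negatives = sum(1 for t in y_true if t == negative)
    true_negatives = sum(
        1 for t, p in zip(y_true, y_pred) if t == negative and p == negative
    )
    if actual_negatives == 0:
        return 0.0  # convention chosen here when there are no actual negatives
    return true_negatives / actual_negatives


# Hypothetical labels: 4 actual negatives, 3 of them predicted as negative.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(specificity(y_true, y_pred))  # prints 0.75
```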
ReLU
In the context of artificial neural networks, the ReLU (rectified linear unit) activation function is an activation function which outputs the same as its input if the input is positive, and zero if the input is negative. A related function is the leaky rectified linear unit (leaky ReLU) which [...]
Responsible AI
Umbrella term for aspects of making appropriate business and ethical choices when adopting AI, including business and societal value, risk, trust, transparency, fairness, bias mitigation, explainability, and regulatory compliance.
Selective Filtering
When a model ignores "noise" to focus on [...]
[...] classify image inputs it trains two neural networks that learn simultaneously to find similarity between images.
Signal
Inputs, information, data.
A set of software development tools that allows for the creation of applications on a specific platform.
Strong AI
Structured Data
Data that resides in a fixed field within a file or record. Structured data is typically stored in a relational database. It can consist of numbers and text, and sourcing can happen automatically or manually, as long as it's within an RDBMS structure.
[...] defined by its use of labeled datasets. These datasets are designed to train or "supervise" algorithms into classifying data or predicting outcomes accurately. Using labeled inputs and outputs, the model can measure its accuracy and learn over time.
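
The ReLU description above (output equals input when positive, zero when negative) can be written in a few lines. The leaky variant shown next to it uses the commonly cited form in which negative inputs are multiplied by a small slope; the slope value 0.01 and both function names are assumptions for this sketch, since the glossary text does not specify them.

```python
def relu(x):
    """Return x for positive inputs and zero otherwise."""
    return x if x > 0 else 0.0


def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but negative inputs are scaled instead of zeroed.

    negative_slope=0.01 is a commonly used default, assumed here.
    """
    return x if x > 0 else negative_slope * x


print([relu(v) for v in (-2.0, 0.0, 3.0)])        # prints [0.0, 0.0, 3.0]
print([leaky_relu(v) for v in (-2.0, 0.0, 3.0)])  # prints [-0.02, 0.0, 3.0]
```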
Symbiotic Intelligence
[...] fully in control.
Synthetic Intelligence
Synthetic intelligence (SI) is an alternative [...] can be a genuine form of intelligence. An analogy can be made with simulated diamonds (such as cubic zirconia) versus synthetic diamonds (real diamonds made of carbon created by humans).
In essence, a taxonomy is a worldview, or the framework for how your model sees its training data. In practice, it’s a list of visually-distinct model concepts and the definitions of those concepts.
TensorFlow
An open-source software library also used for machine learning applications such as [...]
Data recorded at different points in time.
Train
[...] model’s learning algorithms. Models create and refine their rules using this data. It's a set of data samples used to fit the parameters of a machine learning model to train it by example.
Transferring information from one machine learning task to another. It might involve transferring knowledge from the solution of a simpler task to a more complex one, or involve transferring knowledge from a task where there is more data to one where [...]
[...] sequence-to-sequence tasks.
Torch
A scientific computing framework with wide support for machine learning algorithms, written in C and Lua.
True Positives
Actual positives that are correctly identified as actual “Yes” or predicted “Yes.”
True Negatives
Actual negatives that are correctly identified as an actual “No” or predicted “No.”
Turing Test
A test developed by Alan Turing in 1950, used to identify true artificial intelligence. It tested a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Unstructured Data
Unstructured data is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured data may include documents, images, video and audio.
Unsupervised Learning
Uses machine learning algorithms to analyze and cluster unlabeled datasets. These algorithms discover hidden patterns or data groupings without the need for human intervention. Its ability to discover similarities and differences in information makes it the ideal solution for exploratory data analysis, cross-selling strategies, customer segmentation, and image recognition.
V
[...] unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.
Variance
The error due to sensitivity to fluctuations in the training set computed as the expectation of the squared deviation of a random variable from its mean.
Verify/Verification
The process of verifying that labeled data has been labeled correctly in adherence to the ground truth.
Video Frame Interpolation
The synthesis of several frames between two adjacent frames of video.
Visual Recognition
The ability of software to identify objects, places, people, writing, and actions in images and videos.
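
The Variance entry above describes variance as the expectation of the squared deviation of a random variable from its mean. For a finite list of values this can be sketched as follows; the function name and the sample scores (imagined as one model's results after training on four different samples of data) are hypothetical.

```python
def variance(values):
    """Population variance: the mean squared deviation from the mean."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical scores of one model trained on 4 different samples.
scores = [0.5, 0.75, 0.75, 1.0]
print(variance(scores))  # prints 0.03125
```

A model whose results swing widely from one training sample to the next has high variance, which often goes hand in hand with overfitting.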
Visual Match Width
About Clarifai
Clarifai is the leading deep learning AI platform for computer vision, natural language processing, and automatic speech
recognition. We help enterprises and public sector organizations transform unstructured images, video, text, and audio data into
structured data, significantly faster and more accurately than humans would be able to do on their own. Founded in 2013 by Matt
Zeiler, Ph.D., Clarifai has been a market leader in computer vision AI since winning the top five places in image classification at
the 2013 ImageNet Challenge. Clarifai, headquartered in Wilmington, DE, is continuing to grow with more than 90 employees in
North America and Europe. For more information, please visit: www.clarifai.com.