CLASS – XII (AI)
NOTES
UNIT 4: AI with Orange Data Mining Tool
Summary
This unit introduces students to the Orange Data
Mining tool, emphasizing its intuitive visual
programming interface and component-based
approach.
Students learn to use its diverse widgets for data
visualization, preprocessing, feature selection,
modeling, and evaluation across three domains:
Data Science, Computer Vision, and Natural
Language Processing (NLP).
Aim: To provide practical insights into its real-world
applications.
4.1. What is Data Mining?
Definition: The process of discovering trends, useful
information, and patterns from large datasets.
It involves analyzing and interpreting data to extract
meaningful insights that aid decision-making.
4.2. Introduction to Orange Data Mining Tool
Orange: A component-based visual programming
software package for data visualization, machine
learning, data mining, and analysis.
Widgets: Components within Orange that provide
functionalities from basic data visualization to
advanced modeling and evaluation.
Visual Programming: Workflows are created by
interconnecting widgets in an intuitive drag-and-
drop interface.
4.3. Beneficiaries of Orange Data Mining
Data Analysts and Scientists: User-friendly interface,
works even without programming skills.
Researchers: Tools for exploring and analyzing
research data, testing hypotheses, and generating
insights.
Educators and Students: Simplifies complex topics
with visual programming.
Business Professionals: Supports trend analysis,
customer behavior prediction, and process
optimization.
Open-Source Community: Orange is open-source,
meaning its code is freely available.
4.4. Getting Started with Orange Tool
Installation: Download the installer that matches your
system (e.g., Standalone for Windows, Apple Silicon
for newer Macs) from orangedatamining.com/download/.
Launch: Open Orange from the system's applications
menu after installation.
4.5. Components of Orange Data Mining Tool
1. Blank Canvas
o Workspace for building analysis workflows by
dragging and dropping widgets.
o Allows adding, rearranging, and connecting
widgets to form a data pipeline.
2. Widgets
o Graphical elements for specific data tasks (e.g.,
File, Data Table, Scatter Plot, Tree).
o Categories: Data, Transform, Visualize, Model,
Evaluate, Unsupervised.
3. Connectors
o Lines that link widgets together.
o Show data flow from one widget to another.
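The canvas, widget, and connector idea can be pictured in plain Python, where each widget is a function and each connector passes one function's output to the next. This is only an analogy, not Orange's own API: the function names and the three rows below are hypothetical.

```python
# Each hypothetical function stands in for one widget on the canvas.

def file_widget():
    """Source widget: outputs a small in-memory table (made-up rows)."""
    return [
        {"petal_length": 1.4, "species": "setosa"},
        {"petal_length": 4.5, "species": "versicolor"},
        {"petal_length": 5.8, "species": "virginica"},
    ]

def select_rows_widget(rows, min_petal_length):
    """Transform widget: keeps rows that satisfy a condition."""
    return [r for r in rows if r["petal_length"] >= min_petal_length]

def data_table_widget(rows):
    """Display widget: turns rows into readable lines of text."""
    return [f'{r["species"]}: {r["petal_length"]}' for r in rows]

# "Connectors": the output of one widget becomes the input of the next.
data = file_widget()
filtered = select_rows_widget(data, min_petal_length=4.0)
table = data_table_widget(filtered)
print(table)
```

Changing one step (for example, the filter value) changes everything downstream, which is also how a connected workflow behaves on the canvas.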
4.6. Default Widget Catalog
Data Widgets: File, Data Table, SQL Table.
Transform Widgets: Perform data transformation.
Visualize Widgets: Scatter plots, bar charts,
heatmaps.
Model Widgets: Apply supervised ML algorithms
(classification, regression).
Evaluate Widgets: Evaluate performance (cross-
validation, confusion matrices).
Unsupervised Widgets: Exploratory analysis
(clustering, dimensionality reduction).
4.7. Key Domains of AI with Orange
4.7.1. Data Science with Orange
Focus: Data visualization and classification models.
Example: Exploring Iris Flower Dimensions (Data
Visualization)
1. Launch Orange software.
2. Drag File widget → load iris dataset.
3. Change "iris" column role to Target.
4. Connect Data Table widget to display dataset.
5. Connect Scatter Plot widget to visualize variables.
4.7.1.1. Classification
Goal: Classify iris flowers (Setosa, Versicolor,
Virginica) using sepal & petal dimensions.
Steps:
1. Prepare testing dataset (spreadsheet of iris
measurements).
2. Add Tree widget (classification algorithm) →
connect the training File widget to it.
3. Add Predictions widget.
4. Connect Tree widget to Predictions (this supplies
the trained model).
5. Add second File widget (testing data) → connect
to Predictions.
6. Interpret results in Predictions (predicted class
labels; accuracy is available only if the testing
data includes the actual labels).
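The train-then-predict flow of the steps above can be sketched in plain Python. This is not the algorithm behind Orange's Tree widget; it is a small hand-written decision tree that splits on Gini impurity. The measurements are hypothetical values chosen for illustration, not rows from the real iris dataset, and all function names are made up for this sketch.

```python
from collections import Counter

# Hypothetical training rows: (petal_length_cm, petal_width_cm, species).
TRAIN = [
    (1.4, 0.2, "setosa"), (1.5, 0.2, "setosa"), (1.3, 0.3, "setosa"),
    (4.5, 1.5, "versicolor"), (4.1, 1.3, "versicolor"), (4.7, 1.4, "versicolor"),
    (5.8, 2.2, "virginica"), (6.0, 2.5, "virginica"), (5.5, 2.1, "virginica"),
]

def gini(rows):
    """Gini impurity of the class labels in rows (0 means one class only)."""
    counts = Counter(label for *_, label in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())

def best_split(rows):
    """Return the (feature_index, threshold) that lowers impurity most, or None."""
    best, best_score = None, gini(rows)
    for f in range(len(rows[0]) - 1):
        values = sorted({r[f] for r in rows})
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2  # candidate threshold between two values
            left = [r for r in rows if r[f] <= t]
            right = [r for r in rows if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, depth=0, max_depth=3):
    """Training step: returns a nested (feature, threshold, left, right) tree."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        # Leaf node: predict the most common class among the remaining rows.
        return Counter(label for *_, label in rows).most_common(1)[0][0]
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (f, t, build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def predict(tree, features):
    """Prediction step: follow the splits until a leaf (a class name) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if features[f] <= t else right
    return tree

tree = build_tree(TRAIN)                       # role of the Tree widget
TEST = [(1.6, 0.2), (4.4, 1.4), (5.9, 2.3)]    # role of the second File widget
predictions = [predict(tree, x) for x in TEST]  # role of the Predictions widget
print(predictions)
```

The model is built only from the training rows; the testing rows are used only for prediction, which matches the two separate File widgets in the workflow.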
4.7.1.2. Evaluating the Classification Model
Importance: Assess performance with accuracy,
precision, recall, F1 score, confusion matrix.
Cross-Validation: Resampling method (Orange
default = 10-fold cross-validation).
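To make k-fold cross-validation concrete: the rows are divided into k folds, and each fold is used once for testing while the remaining folds are used for training. The sketch below assumes 150 rows (the size of the classic iris dataset) and a simple interleaved assignment of rows to folds. The function names are hypothetical, and real tools usually also shuffle or stratify the rows.

```python
def k_fold_indices(n_rows, k):
    """Assign row indices 0..n_rows-1 to k folds of near-equal size."""
    return [list(range(i, n_rows, k)) for i in range(k)]

def cross_validation_splits(n_rows, k):
    """Yield (train_indices, test_indices) once for every fold."""
    folds = k_fold_indices(n_rows, k)
    for i, test_idx in enumerate(folds):
        # Training rows are all rows that are not in the current test fold.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

splits = list(cross_validation_splits(n_rows=150, k=10))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

With 150 rows and 10 folds, each round trains on 135 rows and tests on 15, and every row is tested exactly once.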
Steps:
1. Connect training File widget and Tree widget →
Test and Score widget (it needs both the data and
the learning algorithm).
2. Test and Score calculates metrics.
3. Interpret metrics: Accuracy, Precision, Recall,
F1 Score.
4. Add Confusion Matrix widget → connect to Test
and Score.
5. Analyze confusion matrix: true positives (TP), true
negatives (TN), false positives (FP), false
negatives (FN).
Exam Focus: Role of Test and Score and Confusion
Matrix in evaluation.
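The evaluation metrics listed above follow directly from the confusion matrix counts. The sketch below computes them in plain Python for one chosen class, treating that class as "positive" and the other classes as "negative". The actual and predicted labels are made-up values for illustration, and the function names are hypothetical.

```python
# Hypothetical actual and predicted labels for 10 flowers.
actual = ["setosa", "setosa", "setosa",
          "versicolor", "versicolor", "versicolor", "versicolor",
          "virginica", "virginica", "virginica"]
predicted = ["setosa", "setosa", "setosa",
             "versicolor", "versicolor", "versicolor", "virginica",
             "virginica", "virginica", "versicolor"]

def accuracy(actual, predicted):
    """Share of all predictions that match the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def class_metrics(actual, predicted, positive):
    """TP, TN, FP, FN and the derived metrics for one class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn,
            "precision": precision, "recall": recall, "F1": f1}

overall_accuracy = accuracy(actual, predicted)
versicolor = class_metrics(actual, predicted, positive="versicolor")
print(overall_accuracy)
print(versicolor)
```

Here 8 of 10 predictions are correct, so accuracy is 0.8. For versicolor, 3 of 4 predicted versicolor flowers are truly versicolor (precision 0.75), and 3 of 4 actual versicolor flowers are found (recall 0.75).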
4.7.2. Computer Vision with Orange
Focus: Extracting insights from image data.
Example: Clustering Images of Dogs and Cats
1. Install Image Analytics Add-On from Options > Add-
ons.
2. Drag Import Images widget → upload dataset folder.
3. Connect Image Viewer widget to visualize images.
4. Connect Image Embedding widget → converts
images to numerical vectors.
5. Connect Distances widget (set the metric to Cosine).
6. Connect to Hierarchical Clustering widget.
7. Double-click to view dendrogram, visualize clusters
(dogs grouped, cats grouped).
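The distance and clustering steps above can be sketched in plain Python. The 3-number vectors below are hypothetical stand-ins for image embeddings (real embeddings have many more dimensions). The sketch uses average linkage as an assumption; it is one of several possible linkage rules and not necessarily the one selected in Orange.

```python
import math

# Hypothetical embeddings: similar images get similar vectors.
EMBEDDINGS = {
    "dog1": [0.9, 0.1, 0.2],
    "dog2": [0.8, 0.2, 0.1],
    "cat1": [0.1, 0.9, 0.3],
    "cat2": [0.2, 0.8, 0.2],
}

def cosine_distance(u, v):
    """1 - cosine similarity: near 0 when two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def average_linkage(cluster_a, cluster_b):
    """Mean distance over all pairs drawn from the two clusters."""
    total = sum(cosine_distance(EMBEDDINGS[a], EMBEDDINGS[b])
                for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

def agglomerate(names, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        candidate_pairs = [(i, j) for i in range(len(clusters))
                           for j in range(i + 1, len(clusters))]
        i, j = min(candidate_pairs,
                   key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

clusters = agglomerate(list(EMBEDDINGS), n_clusters=2)
print(clusters)
```

The sequence of merges is what a dendrogram draws: the closest images join first, and the two large groups join last.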
4.7.3. Natural Language Processing (NLP) with Orange
Focus: Understanding text and extracting patterns.
Steps:
1. Install Text Add-On from Options > Add-ons.
2. Load/create textual data using Corpus or Create
Corpus widget.
3. Connect Corpus Viewer to browse/search text.
4. Connect Word Cloud widget → visualize word
frequencies.
5. Connect Preprocess Text widget → noise removal,
normalization:
o Convert text to lowercase.
o Tokenization (split into words).
o Remove punctuation.
o Remove stop words (the, is, etc.).
o Optional: stemming/lemmatization.
6. Connect cleaned output → Word Cloud widget for
better results.
Exam Focus: Text preprocessing steps:
Normalization, Tokenization, Stop word removal.
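The preprocessing steps above can be sketched with Python's standard library. This is a simplified illustration, not the Preprocess Text widget itself: the sample sentence and the tiny stop word list are made up, and stemming/lemmatization is left out.

```python
import string
from collections import Counter

# Tiny illustrative stop word list; real lists are much longer.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "on"}

def preprocess(text):
    """Lowercase, remove punctuation, tokenize, and drop stop words."""
    lowered = text.lower()                                   # normalization
    no_punctuation = lowered.translate(
        str.maketrans("", "", string.punctuation))           # noise removal
    tokens = no_punctuation.split()                          # tokenization
    return [t for t in tokens if t not in STOP_WORDS]        # stop word removal

text = "The cat is on the mat. The dog is in the garden, and the cat is happy!"
tokens = preprocess(text)
frequencies = Counter(tokens)   # the counts a word cloud would visualize
print(tokens)
print(frequencies.most_common(1))
```

Without the stop word step, "the" and "is" would be the most frequent words, which is why the word cloud improves after preprocessing.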
Summary / Key Takeaways for Unit 4
Orange Data Mining is an open-source, visual
programming tool for data visualization, ML, and
data analysis.
Core components: Blank Canvas, Widgets,
Connectors.
Supports AI domains:
o Data Science: Visualization, classification (Tree
+ Predictions), evaluation (Test and Score,
Confusion Matrix).
o Computer Vision: With Image Analytics add-on
(Import Images, Image Embedding, Distances,
Hierarchical Clustering).
o NLP: With Text add-on (Corpus, Viewer, Word
Cloud, Preprocess Text).
Makes complex AI tasks accessible to users without
coding expertise.