
Data Exploration

Data exploration is the process of analyzing data to understand its characteristics and identify patterns. It typically involves both automatic and manual activities such as data profiling, visualization, and filtering. The goal is to gain insights from the data and understand features such as size, completeness, and relationships between attributes. Attributes represent characteristics of data objects and come in different types: nominal, ordinal, binary, numeric, discrete, and continuous. Correctly identifying attribute types is important for preprocessing data.

Uploaded by

FUNअनंत
Copyright
© All Rights Reserved

Data Exploration

Definition 1:

▪ It is the process of gathering information about a target data set in order to understand its characteristics.

▪ These characteristics can include
• the size or amount of data,
• the completeness of the data,
• the correctness of the data,
• possible relationships among data elements or files/tables within the data.
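
The characteristics listed above can be checked with a few lines of code. The following is a minimal sketch, assuming the data is held as a list of Python dictionaries; the records, the helper names `profile` and `out_of_range`, and the valid range of 0 to 100 for marks are all hypothetical and are not taken from the original slides.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"roll_no": 1, "name": "Asha", "marks": 82},
    {"roll_no": 2, "name": "Ravi", "marks": None},
    {"roll_no": 3, "name": None, "marks": 140},
    {"roll_no": 4, "name": "Meera", "marks": 67},
]


def profile(rows):
    """Report size (row count) and completeness (share of non-missing values)."""
    attributes = sorted({key for row in rows for key in row})
    completeness = {}
    for attr in attributes:
        present = sum(1 for row in rows if row.get(attr) is not None)
        completeness[attr] = present / len(rows)
    return {"rows": len(rows), "attributes": attributes, "completeness": completeness}


def out_of_range(rows, attr, low, high):
    """Simple correctness check: rows whose value falls outside [low, high]."""
    return [
        row for row in rows
        if row.get(attr) is not None and not (low <= row[attr] <= high)
    ]


summary = profile(records)
suspect = out_of_range(records, "marks", 0, 100)  # assumed valid range for marks

print(summary)
print(suspect)
```

Size and completeness fall out of simple counting, while correctness needs a rule supplied by someone who knows the domain.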
Data Exploration
Definition 2:

▪ Data exploration is usually conducted using a combination of automatic and manual activities.

▪ Automatic activities can include data profiling, data visualization, or tabular reports that give the analyst an initial view of the data and an understanding of its key characteristics. This is usually followed by manual drill-down or filtering of the data to identify anomalies or patterns surfaced by the automatic steps.
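
The two-stage workflow described above can be sketched as follows. This is only an illustration: the order data, the function name `tabular_report`, and the threshold of 500 are hypothetical and do not come from the original slides.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical order records.
orders = [
    {"region": "North", "amount": 120.0},
    {"region": "North", "amount": 135.0},
    {"region": "South", "amount": 110.0},
    {"region": "South", "amount": 980.0},
    {"region": "South", "amount": 105.0},
]


def tabular_report(rows, group_key, value_key):
    """Automatic step: summarize one numeric attribute per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        group: {"count": len(values), "mean": mean(values), "max": max(values)}
        for group, values in groups.items()
    }


report = tabular_report(orders, "region", "amount")
for region, stats in report.items():
    print(region, stats)

# Manual step: the report shows an unusually high maximum in "South",
# so the analyst filters down to the rows behind it (threshold is assumed).
drill_down = [r for r in orders if r["region"] == "South" and r["amount"] > 500]
print(drill_down)
```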
Attribute Introduction
▪ An attribute can be defined as a field for storing data that represents a characteristic of a data object.
▪ An attribute is a property of the object.
▪ Different attributes represent different features of the object.
▪ Examples:
✔ For a customer object, attributes can be customer Id, address, etc.
✔ For a student object, attributes can be Roll_no, name, marks, etc.
▪ A set of attributes used to describe a given object is known as an attribute vector or feature vector.
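
As a small illustration of the student example, the sketch below models a data object with a Python dataclass and reads its attributes out as a feature vector. The class name `Student` and the sample values are assumptions made for this example.

```python
from dataclasses import astuple, dataclass


@dataclass
class Student:
    """A data object; each field is one attribute."""
    roll_no: int
    name: str
    marks: float


student = Student(roll_no=17, name="Asha", marks=82.5)  # hypothetical values

# The ordered collection of attribute values is the object's feature vector.
feature_vector = astuple(student)
print(feature_vector)
```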
Types of Attribute
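▪ Identifying attribute types is the first step of data preprocessing.
▪ We first differentiate between the different types of attributes and then preprocess the data accordingly.
▪ Types of Attribute: qualitative (nominal, binary, ordinal) and quantitative (numeric, discrete, continuous).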
Types of Qualitative Attribute
1. Nominal Attributes – related to names:
▪ The values of a nominal attribute are names of things or some kind of symbols.
▪ The values of a nominal attribute represent a category or state, which is why nominal attributes are also referred to as categorical attributes.
▪ There is no order (rank, position) among the values of a nominal attribute.
▪ Example:
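
Because nominal values have no order, the operations that make sense on them are equality tests and counting. A minimal sketch, using hypothetical hair-color values that are not taken from the original slides:

```python
from collections import Counter

# Hypothetical nominal attribute: the values are labels with no ranking.
hair_colors = ["black", "brown", "black", "blond", "brown", "black"]

# Counting how often each category occurs is meaningful.
counts = Counter(hair_colors)

# The mode (most frequent category) is the usual measure of central tendency.
mode_value, mode_count = counts.most_common(1)[0]

# Equality is meaningful; "greater than" or an average would not be.
same_category = hair_colors[0] == hair_colors[2]

print(counts)
print(mode_value, mode_count, same_category)
```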
Types of Qualitative Attribute
2. Binary Attributes
▪ A binary attribute has only two values/states, for example yes or no, affected or unaffected, true or false.
i. Symmetric: Both values are equally important.
▪ Example: Gender.
ii. Asymmetric: The two values are not equally important.
▪ Example: Result.
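
One place where the symmetric/asymmetric distinction matters is when comparing two objects described by binary attributes. The sketch below contrasts the simple matching coefficient, which treats both values as equally informative, with the Jaccard coefficient, which ignores positions where both values are 0. Choosing these two measures is an assumption made for illustration; the original slides do not name them, and the test-result data is hypothetical.

```python
def simple_matching(a, b):
    """Share of positions where the two binary vectors agree (0-0 and 1-1 both count)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def jaccard(a, b):
    """Share of 1-1 agreements among positions where at least one vector has a 1."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # If neither vector contains a 1 the ratio is undefined; 0.0 is returned
    # here as an arbitrary convention for this sketch.
    return both / either if either else 0.0


# Hypothetical test results for two patients (1 = positive, 0 = negative).
patient_a = [1, 0, 0, 0, 0, 1]
patient_b = [1, 0, 0, 0, 0, 0]

smc = simple_matching(patient_a, patient_b)
jac = jaccard(patient_a, patient_b)
print(smc, jac)
```

With mostly negative results the simple matching score is high because shared negatives count as agreement, while the Jaccard score looks only at the positives.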
Types of Qualitative Attribute
3. Ordinal Attributes
▪ An ordinal attribute has values with a meaningful sequence or ranking (order) between them, but the magnitude of the difference between successive values is not known. The order shows which value ranks higher but does not indicate by how much.

▪ Example:
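
For ordinal values, sorting and the median are meaningful, while differences and means are not. A minimal sketch, assuming a hypothetical size attribute with the order small < medium < large (not taken from the original slides):

```python
from statistics import median_low

# Assumed ranking of the categories, lowest first.
SIZE_ORDER = ["small", "medium", "large"]
RANK = {label: position for position, label in enumerate(SIZE_ORDER)}

sizes = ["medium", "small", "large", "small", "medium"]

# Sorting by rank is meaningful because the categories are ordered.
sorted_sizes = sorted(sizes, key=RANK.__getitem__)

# median_low always returns an actual data point, so it maps back to a label.
median_size = SIZE_ORDER[median_low(RANK[s] for s in sizes)]

# The ranks say that large > small, but not by how much.
print(sorted_sizes, median_size, RANK["large"] > RANK["small"])
```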
Types of Quantitative Attribute
1. Numeric Attributes
▪ A numeric attribute is quantitative because it is a measurable quantity, represented in integer or real values. Numeric attributes are of two types: interval-scaled and ratio-scaled.
i. An interval-scaled attribute
• It has values whose differences are interpretable, but it does not have a true reference point, also called a zero point.
• Values on an interval scale can be added and subtracted but cannot be meaningfully multiplied or divided.
• Example: Temperature in degrees Celsius. If the temperature on one day is twice the temperature on another day, we cannot say that the first day is twice as hot as the other.
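
The temperature example can be checked numerically. The sketch below uses two hypothetical readings, 20 °C and 10 °C, and converts them to Kelvin, a scale with a true zero, to show why the ratio of Celsius values is not meaningful.

```python
def celsius_to_kelvin(celsius):
    """Convert to Kelvin, which has a true zero point."""
    return celsius + 273.15


day_a_c = 20.0  # hypothetical readings in degrees Celsius
day_b_c = 10.0

# Differences are meaningful on an interval scale: day A is 10 degrees warmer.
difference = day_a_c - day_b_c

# The Celsius ratio is 2.0, but this depends on where the scale puts its zero.
naive_ratio = day_a_c / day_b_c

# On the Kelvin scale the same two days differ by only about 3.5 percent.
kelvin_ratio = celsius_to_kelvin(day_a_c) / celsius_to_kelvin(day_b_c)

print(difference, naive_ratio, round(kelvin_ratio, 4))
```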
Types of Quantitative Attribute
1. Numeric Attributes

ii. A ratio-scaled attribute

• It is a numeric attribute with a fixed (true) zero point.

• If a measurement is ratio-scaled, we can speak of a value as being a multiple (or ratio) of another value.

• The values are ordered, and we can also compute the difference between values, as well as the mean, median, mode, quantile range, and five-number summary.
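
The summary statistics mentioned above can be computed with Python's standard `statistics` module. This is a minimal sketch on hypothetical weights in kilograms; note that `statistics.quantiles` uses its default "exclusive" method here, and other quartile conventions can give slightly different values.

```python
from statistics import mean, median, quantiles

# Hypothetical ratio-scaled attribute: weight in kg (0 kg means no weight).
weights_kg = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Ratios are meaningful: 8 kg is twice as heavy as 4 kg.
ratio = weights_kg[7] / weights_kg[3]

# Quartiles with the default "exclusive" method.
q1, q2, q3 = quantiles(weights_kg, n=4)

# Five-number summary: minimum, Q1, median, Q3, maximum.
five_number = (min(weights_kg), q1, q2, q3, max(weights_kg))

print(ratio, mean(weights_kg), median(weights_kg))
print(five_number)
```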
Types of Quantitative Attribute
2. Discrete Attributes

• Discrete data take distinct, separate values; they can be numerical or categorical.

• A discrete attribute has a finite or countably infinite set of values.

• Example:
Types of Quantitative Attribute
3. Continuous Attributes

• Continuous data have an infinite number of possible states.

• Continuous data are typically represented as floating-point values.

• There are infinitely many possible values between 2 and 3.

• Example:
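
The practical difference between discrete and continuous attributes shows up when summarizing them: discrete values can be counted directly, while continuous values usually need to be grouped into bins first. A minimal sketch with hypothetical data; the attribute names and the bin width of 0.5 are assumptions made for this example.

```python
from collections import Counter

# Discrete attribute: number of children per household (countable values).
children = [0, 2, 1, 2, 3, 2]
child_counts = Counter(children)

# Continuous attribute: lengths in meters, all between 2 and 3.
lengths_m = [2.05, 2.5, 2.51, 2.975, 2.25]

# Exact repeats are rare for continuous data, so counting raw values says little.
all_distinct = len(set(lengths_m)) == len(lengths_m)

# Group into bins of width 0.5 starting at 2.0: bin 0 is [2.0, 2.5), bin 1 is [2.5, 3.0).
BIN_START, BIN_WIDTH = 2.0, 0.5
bin_counts = Counter(int((x - BIN_START) / BIN_WIDTH) for x in lengths_m)

print(child_counts)
print(all_distinct, bin_counts)
```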
Data Exploration Link

▪ Types of Attributes of Data - Data Exploration - Data Mining and Business Intelligence - YouTube

▪ Statistical Description of Data - Data Exploration - Data Mining and Business Intelligence - YouTube

▪ What is Data Mining? How Does it Work with Statistics for Knowledge Extraction | Springboard Blog
