How to Create Categorical Variables in R?
Last Updated: 19 Dec, 2021
In this article, we will learn how to create categorical variables in the R programming language.
In statistics, variables fall into two broad types: quantitative and categorical. A quantitative variable holds numerical, measurable values. A categorical variable takes one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or nominal category on the basis of some qualitative property. In R, categorical variables are represented as factors.
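As a quick illustration of this distinction, the sketch below stores one quantitative and one categorical vector. The vector names and values are made up for this example and do not come from any real dataset.

```r
# Hypothetical example data: heights are quantitative, blood types are categorical
height_cm  <- c(172.5, 160.1, 181.0, 168.3)
blood_type <- factor(c("A", "O", "B", "O"))

# Arithmetic summaries make sense for quantitative values
mean(height_cm)

# Categorical values are usually summarised by counting each category
table(blood_type)

# levels() lists the fixed set of categories the factor can take
levels(blood_type)
```

Summaries such as mean() apply to the numeric vector, while the factor is described by its levels and their counts.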
Method 1: Categorical Variable from Scratch
To create a categorical variable from scratch, i.e., by supplying a value for each row manually, we pass the vector of values to the factor() function. factor() encodes the vector as a factor whose levels are the distinct values found in it, so rows that share a value belong to the same category.
Syntax:
df$categorical_variable <- factor( categorical_vector )
where
- df: the data frame.
- categorical_variable: the name of the new column that will hold the categorical data.
- categorical_vector: the vector to be converted into a factor.
Example:
Here is a basic data frame to which a new column, group, is added as a categorical variable.
R
# create sample data frame
df <- data.frame(x = c(10, 23, 13, 41, 15),
                 y = c(71, 17, 28, 32, 12))
# create categorical vector
group_vector <- c('A','B','C','D','E')
# Add categorical variable to the data frame
df$group <- factor(group_vector)
# print data frame
df
Output:
   x  y group
1 10 71     A
2 23 17     B
3 13 28     C
4 41 32     D
5 15 12     E
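To see what factor() actually stores, the following sketch inspects a factor built from the same group values. The variable names group and group_rev are introduced here only for illustration.

```r
# Same group values as in the example above
group_vector <- c('A', 'B', 'C', 'D', 'E')
group <- factor(group_vector)

# Confirm that the result is a factor
is.factor(group)

# The distinct categories; by default they are sorted
levels(group)

# The integer codes stored behind the labels
as.integer(group)

# Supplying levels explicitly controls their order (here reversed)
group_rev <- factor(group_vector, levels = c('E', 'D', 'C', 'B', 'A'))
levels(group_rev)
```

Setting the levels argument is useful when the default sorted order is not the order wanted in tables or plots.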
Method 2: Categorical Variable from an Existing Column Using Two Values
To create a categorical variable with two possible values from an existing column, we call the vectorized ifelse() function inside as.factor(). Each row receives one value if the condition is true and another value otherwise.
Syntax:
df$categorical_variable <- as.factor( ifelse(condition, val1, val2) )
where
- df: the data frame.
- categorical_variable: the name of the new column that will hold the categorical data.
- condition: the condition checked for each row.
- val1: the value used where the condition is true.
- val2: the value used where the condition is false.
Example:
Here is a basic data frame to which a new column, group, is added as a categorical variable based on an if-else condition.
R
# create sample data frame
df <- data.frame(x = c(10, 23, 13, 41, 15),
                 y = c(71, 17, 28, 32, 12))
# Add categorical variable to the data frame
df$group <- as.factor(ifelse(df$x > 20, 'A', 'B'))
# print data frame
df
Output:
   x  y group
1 10 71     B
2 23 17     A
3 13 28     B
4 41 32     A
5 15 12     B
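Two details of this approach are worth checking on real data: how missing values behave and how the levels are ordered. The sketch below is a variation on the example above with one NA added to x as an assumption for illustration, and it uses factor() with an explicit levels argument instead of as.factor().

```r
# Variation on the example data, with a missing value added for illustration
df <- data.frame(x = c(10, 23, NA, 41, 15))

# ifelse() is vectorised; a missing x yields a missing result
flag <- ifelse(df$x > 20, 'A', 'B')

# factor() with explicit levels puts 'B' first instead of the default sorted order
df$group <- factor(flag, levels = c('B', 'A'))

levels(df$group)

# Count each category, including missing values if there are any
table(df$group, useNA = "ifany")
```

Rows where the condition cannot be evaluated stay NA rather than being assigned to either group.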
Method 3: Categorical Variable from an Existing Column Using Multiple Values
To create a categorical variable with more than two values from an existing column, we nest several ifelse() calls inside as.factor(). The conditions are checked in order, and each row receives the value attached to the first condition that is true. If none of the conditions is true, the row receives the else value of the innermost ifelse() call.
Syntax:
df$categorical_variable <- as.factor( ifelse(condition, val, ifelse(condition, val, ifelse(condition, val, ifelse(condition, val, val_else)))) )
where
- df: the data frame.
- categorical_variable: the name of the new column that will hold the categorical data.
- condition: a condition checked for each row; where it is true, the corresponding val is used.
- val_else: the value used where no condition is true.
Example:
Here is a basic data frame to which a new column, group, is added as a categorical variable based on multiple if-else conditions.
R
# create sample data frame
df <- data.frame(x = c(10, 23, 13, 41, 15, 11, 23, 45, 95, 23, 75),
                 y = c(71, 17, 28, 32, 12, 13, 41, 15, 11, 23, 34))
# Add categorical variable to the data frame
df$group <- as.factor(ifelse(df$x < 20, 'A',
                      ifelse(df$x < 30, 'B',
                      ifelse(df$x < 50, 'C',
                      ifelse(df$x < 90, 'D', 'E')))))
# print data frame
df
Output:
    x  y group
1  10 71     A
2  23 17     B
3  13 28     A
4  41 32     C
5  15 12     A
6  11 13     A
7  23 41     B
8  45 15     C
9  95 11     E
10 23 23     B
11 75 34     D
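Deeply nested ifelse() calls can become hard to read. As an alternative that this article does not otherwise cover, base R's cut() function bins a numeric vector into a factor directly. The sketch below reproduces the grouping above, assuming the same thresholds of 20, 30, 50 and 90; right = FALSE makes each interval include its lower bound and exclude its upper bound, which matches the < comparisons in the nested version.

```r
# Same x values as in the example above
df <- data.frame(x = c(10, 23, 13, 41, 15, 11, 23, 45, 95, 23, 75))

# Bin x into five groups: [-Inf, 20), [20, 30), [30, 50), [50, 90), [90, Inf)
df$group <- cut(df$x,
                breaks = c(-Inf, 20, 30, 50, 90, Inf),
                labels = c('A', 'B', 'C', 'D', 'E'),
                right = FALSE)

# print data frame
df
```

cut() returns a factor, so no as.factor() call is needed, and all five labels become levels even if a bin happens to be empty.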