Generating Word Cloud in R Programming
Last Updated: 07 May, 2024
A word cloud is a data visualization technique for representing text data in which the size of each word indicates its frequency or importance. Significant textual data points can be highlighted using a word cloud. Word clouds are widely used for analyzing data from social network websites.
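To make the size-by-frequency idea concrete, here is a minimal sketch in R that scales a word's display size linearly with its frequency. The word counts and the size range are made-up values for illustration, and a real word cloud package may use a different mapping.

```r
# Made-up word frequencies (assumption for illustration only).
freq <- c(text = 4, r = 2, word = 1)

# Smallest and largest display size (assumed values).
size_range <- c(0.5, 4)

# Scale each frequency relative to the most frequent word,
# then map it linearly onto the size range.
normed <- freq / max(freq)
size <- size_range[1] + (size_range[2] - size_range[1]) * normed

print(size)
```

The most frequent word receives the largest size, and every other word shrinks in proportion to how much rarer it is.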
Why Word Cloud?
Reasons to present text data as a word cloud:
- Word clouds add simplicity and clarity. The most frequent keywords stand out at a glance.
- Word clouds are a potent communication tool. They are easy to understand, easy to share, and impactful.
- Word clouds are more visually engaging than a table of data.
Implementation in R
Here are the steps to create a word cloud in R.
Step 1: Create a Text File
Copy and paste the text into a plain text file (e.g., file.txt) and save the file.
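If you prefer to create the file from R rather than by hand, a minimal sketch could look like the following. The sample sentences and the temporary folder location are assumptions for illustration; use your own text and path.

```r
# Sample sentences (made up for illustration).
sample_text <- c(
  "R makes text mining simple.",
  "Word clouds show frequent words in a larger font.",
  "Text mining in R starts with clean text."
)

# Hypothetical location: a file named file.txt in R's temporary folder.
path <- file.path(tempdir(), "file.txt")
writeLines(sample_text, path)

# Confirm that the file exists and holds one line per sentence.
print(file.exists(path))
print(length(readLines(path)))
```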
Step 2: Install and Load the Required Packages
R
# install the required packages
install.packages("tm") # for text mining
install.packages("SnowballC") # for text stemming
install.packages("wordcloud") # word-cloud generator
install.packages("RColorBrewer") # color palettes
# load the packages
library("tm")
library("SnowballC")
library("wordcloud")
library("RColorBrewer")
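Running install.packages() on every run reinstalls packages that are already present. As a sketch, the check below lists only the packages that are still missing. missing_packages() is a hypothetical helper written for this example, not part of any package.

```r
# Packages used in this article.
required <- c("tm", "SnowballC", "wordcloud", "RColorBrewer")

# Hypothetical helper: return the subset of `pkgs` that is not installed yet.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(required)
print(to_install)

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```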
Step 3: Text Mining
- Load the Text:
The text is loaded using the Corpus() function from the text mining (tm) package. A corpus is a collection of documents.
- Start by importing the text file created in Step 1:
To import a file saved locally on your computer, run the following R code. You will be asked to choose the text file interactively.
R
text = readLines(file.choose())
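file.choose() needs an interactive session. For scripts that run unattended, one possible alternative is to read the file by path. read_text_file() below is a hypothetical helper written for this sketch, and the temporary demo file stands in for your own file.

```r
# Hypothetical helper: read a text file by path instead of opening a
# file chooser, and drop lines that are empty or only white space.
read_text_file <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  lines <- readLines(path, warn = FALSE)
  lines[nzchar(trimws(lines))]
}

# Demo with a temporary file so the sketch runs on its own.
demo_path <- tempfile(fileext = ".txt")
writeLines(c("first line", "", "second line"), demo_path)

text <- read_text_file(demo_path)
print(text)
```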
Load the data as a corpus:
R
# VectorSource() treats each element of the
# character vector (each line) as one document
docs = Corpus(VectorSource(text))
Text transformation:
Transformations are applied with the tm_map() function, for example to replace special characters such as "/", "@", and "#" with spaces.
R
# toSpace() replaces every match of a pattern with a space
toSpace = content_transformer(
  function (x, pattern) gsub(pattern, " ", x))
# Apply each replacement to the result of the previous one
docs1 = tm_map(docs, toSpace, "/")
docs1 = tm_map(docs1, toSpace, "@")
docs1 = tm_map(docs1, toSpace, "#")
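To see what this replacement does without building a corpus, the sketch below applies the same gsub() call to a plain character vector. The sample strings are made up, and each replacement is fed into the next one so that no earlier result is lost.

```r
# Sample strings (made up) containing the characters to strip.
x <- c("follow @rstats", "#wordcloud and/or #tm")

# Replace each special character with a space, one pattern at a time,
# always passing the previous result into the next call.
for (pattern in c("/", "@", "#")) {
  x <- gsub(pattern, " ", x)
}

print(x)
```

A single call with a character class, gsub("[/@#]", " ", x), would give the same result in one pass.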
- Cleaning the Text:
The tm_map() function is also used to convert the text to lower case, remove numbers (with removeNumbers), remove common stopwords, and strip unnecessary white space.
R
# Convert the text to lower case
docs1 = tm_map(docs1, content_transformer(tolower))
# Remove numbers
docs1 = tm_map(docs1, removeNumbers)
# Remove common English stopwords
docs1 = tm_map(docs1, removeWords, stopwords("english"))
# Remove extra white space
docs1 = tm_map(docs1, stripWhitespace)
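The effect of these cleaning steps can be sketched on a single string with base R functions. This is an illustration of the idea rather than what tm does internally, and the two-word stopword list is a made-up stand-in for a real one.

```r
x <- "The  2 Quick   Foxes and the 10 dogs"

# Convert to lower case.
x <- tolower(x)

# Remove numbers.
x <- gsub("[[:digit:]]+", "", x)

# Remove stopwords (tiny made-up list for illustration).
stop_words <- c("the", "and")
pattern <- paste0("\\b(", paste(stop_words, collapse = "|"), ")\\b")
x <- gsub(pattern, "", x, perl = TRUE)

# Collapse runs of white space and trim both ends.
x <- trimws(gsub("[[:space:]]+", " ", x))

print(x)
```

Stripping white space last matters, because removing numbers and stopwords leaves extra gaps behind.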
Step 4: Build a Term-Document Matrix
A term-document matrix is a table containing the frequency of the words. Row names are words (terms) and column names are documents. The TermDocumentMatrix() function from the tm package can be used as follows.
R
# Build the matrix from the cleaned corpus
dtm = TermDocumentMatrix(docs1)
m = as.matrix(dtm)
# Total frequency of each word across all documents, highest first
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v), freq = v)
# Show the 10 most frequent words
head(d, 10)
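To illustrate what such a matrix holds, the sketch below builds one by hand in base R from three made-up, already cleaned documents. It is a simplified illustration and does not reproduce the tokenization rules of TermDocumentMatrix().

```r
# Three made-up documents that are already cleaned.
docs_clean <- c("text mining with r",
                "word cloud with r text",
                "text text cloud")

# Split each document into words and collect the distinct terms.
tokens <- strsplit(docs_clean, " ", fixed = TRUE)
terms <- sort(unique(unlist(tokens)))

# Count every term in every document: rows are terms, columns are documents.
m <- vapply(tokens,
            function(tok) as.integer(table(factor(tok, levels = terms))),
            integer(length(terms)))
rownames(m) <- terms
colnames(m) <- paste0("doc", seq_along(tokens))
print(m)

# Total frequency per term, highest first, as a data frame.
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
print(d)
```

Summing across each row gives the overall frequency of a word, which is the value the word cloud uses for sizing.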
Step 5: Generate the Word Cloud
The importance of words can be illustrated as a word cloud as follows.
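R
wordcloud(words = d$word,
          freq = d$freq,
          min.freq = 1,          # skip words that occur less often than this
          max.words = 200,       # plot at most this many words
          random.order = FALSE,  # plot words in decreasing order of frequency
          rot.per = 0.35,        # proportion of words rotated by 90 degrees
          colors = brewer.pal(8, "Dark2"))  # palette from RColorBrewer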
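The min.freq and max.words arguments decide which words are drawn. The sketch below mimics that selection on a small made-up data frame. select_words() is a hypothetical helper for illustration and is not how the wordcloud package is implemented.

```r
# Made-up frequency table in the same shape as `d` above.
d_demo <- data.frame(
  word = c("text", "r", "cloud", "with", "mining", "word"),
  freq = c(4, 2, 2, 2, 1, 1)
)

# Hypothetical helper: keep words that occur at least `min_freq` times,
# then keep at most `max_words` of the most frequent ones.
select_words <- function(d, min_freq = 1, max_words = 200) {
  kept <- d[d$freq >= min_freq, , drop = FALSE]
  kept <- kept[order(kept$freq, decreasing = TRUE), , drop = FALSE]
  head(kept, max_words)
}

top <- select_words(d_demo, min_freq = 2, max_words = 3)
print(top)
```

Raising min_freq or lowering max_words in this way is a quick check of how crowded the final cloud will be.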
The complete code for the word cloud in R is given below.
R
# R program to illustrate
# generating a word cloud
# Install the required packages (needed only once)
install.packages("tm") # for text mining
install.packages("SnowballC") # for text stemming
install.packages("wordcloud") # word-cloud generator
install.packages("RColorBrewer") # color palettes
# Load the packages
library("tm")
library("SnowballC")
library("wordcloud")
library("RColorBrewer")
# Choose the text file interactively
text = readLines(file.choose())
# VectorSource() treats each element of the
# character vector (each line) as one document
docs = Corpus(VectorSource(text))
# Text transformation: replace special characters with spaces
toSpace = content_transformer(
  function (x, pattern) gsub(pattern, " ", x))
docs1 = tm_map(docs, toSpace, "/")
docs1 = tm_map(docs1, toSpace, "@")
docs1 = tm_map(docs1, toSpace, "#")
# Cleaning the text
docs1 = tm_map(docs1, content_transformer(tolower))
docs1 = tm_map(docs1, removeNumbers)
docs1 = tm_map(docs1, removeWords, stopwords("english"))
docs1 = tm_map(docs1, stripWhitespace)
# Build a term-document matrix from the cleaned corpus
dtm = TermDocumentMatrix(docs1)
m = as.matrix(dtm)
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v), freq = v)
head(d, 10)
# Generate the word cloud
wordcloud(words = d$word,
          freq = d$freq,
          min.freq = 1,
          max.words = 200,
          random.order = FALSE,
          rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))
Output: a word cloud plot in which the most frequent words are drawn in the largest font.
Advantages of Word Clouds
- They are useful for analyzing customer and employee feedback.
- They help identify new SEO keywords to target.
- They are effective visualization tools that present text data in a simple and clear format.
- They are effective communication tools, handy for anyone who wants to communicate a basic insight.
Drawbacks of Word Clouds
- Word clouds are not suitable for every situation.
- The text must be prepared for its context (for example, by removing stopwords); otherwise uninformative words dominate the cloud.
- Word clouds typically fail to give the actionable insights needed to improve and grow a business.