Skewness in R Programming

Last Updated : 14 Apr, 2025

Skewness is a numerical measure of the asymmetry of a distribution or data set. It indicates on which side of the mean the bulk of the data values lie and on which side the longer tail extends. It is used in many disciplines, including data analysis, the social sciences, economics and finance.

  • Positive Skewness (Right-Skewed): The tail on the right side of the distribution is longer or fatter. Typically, Mean > Median.
  • Negative Skewness (Left-Skewed): The tail on the left side of the distribution is longer or fatter. Typically, Mean < Median.
  • Zero Skewness (Symmetrical): The distribution is symmetrical, so Mean = Median.
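
The mean-versus-median rule of thumb in the list above can be checked directly in base R. The sketch below uses a small made-up sample (the values are purely illustrative, not taken from any real data set) together with its mirror image.

```r
# Hypothetical right-skewed sample: most values are small, one is large
right_skewed <- c(1, 2, 2, 3, 3, 4, 15)

# Negating the values mirrors the sample, moving the long tail to the left
left_skewed <- -right_skewed

# The long right tail pulls the mean above the median
print(mean(right_skewed) > median(right_skewed))

# The long left tail pulls the mean below the median
print(mean(left_skewed) < median(left_skewed))
```

Both comparisons should print TRUE for this sample. The relationship is a tendency rather than a guarantee, so it is worth checking on real data instead of assuming it.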

Mathematical Definition of Skewness

\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^3

Where:

  • n is the number of observations,
  • x_i is the i-th observation,
  • \bar{x} is the sample mean,
  • s is the sample standard deviation.

Interpreting the skewness value:

  • Skewness > 0 indicates positive skewness.
  • Skewness < 0 indicates negative skewness.
  • Skewness = 0 indicates no skewness (as in a symmetrical distribution).

How to Calculate Skewness in R

R offers several ways to calculate skewness: functions from specialized packages such as e1071 and moments, or a manual calculation in base R.

1. Using the e1071 Package

The e1071 package provides a straightforward function to calculate skewness.

R
install.packages("e1071")
library(e1071)

data <- c(2, 3, 5, 6, 8, 9, 12, 15, 18, 21)

skewness_value <- skewness(data)
print(skewness_value)

Output:

[1] 0.3880299

2. Using the moments Package

Another popular package for calculating skewness is moments.

R
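install.packages("moments")  # only needed once
library(moments)

# Same data as in the e1071 example
data <- c(2, 3, 5, 6, 8, 9, 12, 15, 18, 21)

# e1071 also exports skewness(), so name the package explicitly
skewness_value <- moments::skewness(data)
print(skewness_value)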

Output:

[1] 0.454466

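The result differs from the e1071 value because the two packages use different estimators by default. moments::skewness() returns the moment-based coefficient g1 = m3 / m2^(3/2), while e1071::skewness() defaults to type = 3, which rescales g1 by ((n - 1) / n)^(3/2). The gap shrinks as the sample size grows. Both packages export a function named skewness(), so the package loaded last masks the other; write e1071::skewness() or moments::skewness() to be explicit.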

3. Base R Implementation

Base R does not have a built-in skewness function, but you can apply the formula from the mathematical definition directly. The code reuses the data vector defined in the first example:

R
n <- length(data)
mean_data <- mean(data)
sd_data <- sd(data)
skewness_value <- (n * sum((data - mean_data)^3)) / ((n - 1) * (n - 2) * sd_data^3)
print(skewness_value)

Output:

[1] 0.5389304
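
The three outputs shown so far (0.3880299, 0.454466 and 0.5389304) do not contradict each other. They correspond to three common sample skewness estimators that differ only by a factor depending on n. The base R sketch below reproduces all three from the same data. The helper name skewness_types is hypothetical, and the g1, G1 and b1 labels follow the type = 1, 2, 3 convention described in the e1071::skewness() documentation.

```r
# Hypothetical helper: returns three common sample skewness estimators
skewness_types <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- mean(d^2)          # second central moment (divides by n)
  m3 <- mean(d^3)          # third central moment (divides by n)
  g1 <- m3 / m2^(3 / 2)    # moment-based coefficient

  c(
    g1 = g1,                                # same estimator as the moments result
    G1 = g1 * sqrt(n * (n - 1)) / (n - 2),  # same estimator as the manual base R formula
    b1 = g1 * ((n - 1) / n)^(3 / 2)         # same estimator as the e1071 default (type = 3)
  )
}

data <- c(2, 3, 5, 6, 8, 9, 12, 15, 18, 21)
estimates <- skewness_types(data)
print(round(estimates, 4))
```

All three estimators agree on the sign, and the differences between them fade as n grows. Which one to report is a convention, so it helps to state the estimator alongside the number.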

Types of Skewness in R

Skewness falls into three types, according to the sign of the skewness value. Each type is illustrated below with a histogram:

Positive Skewness in R

Positive skewness refers to distributions where the tail extends towards higher values. In such cases, the mean is typically greater than the median.

R
# Required for skewness() function
library(moments)

x <- c(40, 41, 42, 43, 50)
print(skewness(x))  # positive: the single large value (50) stretches the right tail

hist(x)

Output:

Positive Skewness

Negative Skewness in R

Negative skewness refers to distributions where the tail extends towards lower values. The mean is typically less than the median in such cases.

R
library(moments)

x <- c(10, 11, 21, 22, 23, 25)
print(skewness(x))  # negative: the two small values (10, 11) stretch the left tail
hist(x)

Output:

Negative Skewness

A histogram showing negative skewness, with a tail extending towards lower values.

Zero Skewness in R

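Zero skewness is what a symmetrical distribution produces: the data is balanced around the mean, and the mean and median are equal. For a random sample, the computed skewness is usually close to zero rather than exactly zero.

R
library(moments)

# Normally distributed sample (n = 50, mean = 10, sd = 10); seed fixed for reproducibility
set.seed(123)
x <- rnorm(50, 10, 10)
print(skewness(x))  # should be fairly close to zero, though not exactly zero
hist(x)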

Output: 

Zero Skewness or Symmetric
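
Because a random sample only approximates its parent distribution, it can help to check how the sample skewness behaves on simulated data. The base R sketch below uses a hypothetical helper skew_g1 (the moment-based coefficient) with an arbitrary seed and sample size, and compares a symmetric normal sample with a right-skewed exponential one.

```r
# Hypothetical helper: moment-based skewness, m3 / m2^(3/2)
skew_g1 <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

set.seed(1)  # arbitrary seed, for reproducibility only

normal_sample <- rnorm(10000, mean = 10, sd = 10)  # symmetric parent distribution
expo_sample   <- rexp(10000, rate = 1)             # right-skewed parent distribution

normal_skew <- skew_g1(normal_sample)  # expected to be near 0
expo_skew   <- skew_g1(expo_sample)    # expected to be clearly positive (theoretical value is 2)

print(c(normal = normal_skew, exponential = expo_skew))
```

With a large sample the normal value should sit close to zero and the exponential value close to 2. With only 50 observations, as in the histogram example, noticeably larger deviations from zero are normal.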

Importance of Skewness

Skewness matters in practice for several reasons:

  • Normality Assumption: Many statistical methods, like t-tests and ANOVA, assume data is normally distributed. Skewness can indicate departures from normality.
  • Impact on Mean and Median: In skewed distributions, the mean is pulled towards the tail, making the median a better measure of central tendency.
  • Interpretation in Data Analysis: Understanding the skewness of your data can influence decisions in model selection, data transformation and interpretation of results.
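
As a concrete case of the data transformation point above, a log transform is one common way to reduce right skew in strictly positive data. The sketch below assumes simulated log-normal data (the variable name income and all parameters are illustrative) and a hypothetical helper skew_g1. Whether a transform is appropriate for real data depends on the analysis.

```r
# Hypothetical helper: moment-based skewness, m3 / m2^(3/2)
skew_g1 <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

set.seed(42)  # arbitrary seed, for reproducibility only

# Simulated, strongly right-skewed positive data (illustrative parameters)
income <- rlnorm(10000, meanlog = 3, sdlog = 1)

skew_raw <- skew_g1(income)       # large positive skewness
skew_log <- skew_g1(log(income))  # near zero, since log(income) is normally distributed

print(c(raw = skew_raw, log_transformed = skew_log))

# The mean is pulled towards the long right tail; the median is not
print(c(mean = mean(income), median = median(income)))
```

Here the log scale removes most of the asymmetry, and the raw mean lands well above the raw median, which is why the median is often reported for skewed quantities.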
