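Data Manipulation in R with data.table

Last Updated : 24 Jun, 2025

data.table is an R package for handling and manipulating large datasets. It supports fast creation, modification, grouping and summarizing of data, and it is often faster than other tools such as dplyr on large datasets.

1. Creating and Sub-Setting Data

We can either convert an existing data frame (for example with as.data.table()) or create a new data.table object directly with the data.table() function. Rows are filtered by writing a logical condition in the first argument of the square brackets.

R
    library(data.table)

    # Create a data.table with a numeric, a character and a logical column
    DT <- data.table(x = c(1, 2, 3, 4),
                     y = c("A", "B", "C", "D"),
                     z = c(TRUE, FALSE, TRUE, FALSE))
    print(DT)

    # Keep only the rows where x is greater than 2
    subset_DT <- DT[x > 2]
    print(subset_DT)

Output: the first print shows all four rows; subset_DT contains only the two rows where x is 3 and 4.

2. Grouping the Data

We can group data by one or more columns with the by argument and compute sums, averages and other summaries for each group.

R
    # Sum x within each value of y; naming the result avoids the default column name V1
    grouped_DT <- DT[, .(sum_x = sum(x)), by = y]
    print(grouped_DT)

Output: one row per value of y. Because every value of y is unique in this example, each group sum equals that row's own x.

3. Joining the Data

We can merge datasets, for example by performing an inner join on a common column. The form DT[DT2, on = "y"] looks up each row of DT2 in DT and by default keeps DT2 rows that have no match; adding nomatch = NULL drops those rows, which makes the result an inner join.

R
    DT2 <- data.table(y = c("A", "B", "C", "D"),
                      v = c("alpha", "beta", "gamma", "delta"))

    # Inner join on y; nomatch = NULL drops rows of DT2 that have no match in DT
    inner_join_DT <- DT[DT2, on = "y", nomatch = NULL]
    print(inner_join_DT)

Output: four rows containing the columns of DT plus v from DT2, since every value of y appears in both tables.

4. Modifying the Data

We can modify data by adding, updating or replacing columns. The := operator changes the table by reference, so no copy is made and no reassignment is needed.

R
    # Add a new column x_squared to DT in place
    DT[, x_squared := x^2]
    print(DT)

Output: DT now has an additional column x_squared with the values 1, 4, 9 and 16.

5. Comparison with dplyr Package

While the dplyr package is common, data.table is often faster for large datasets. We can use microbenchmark to compare execution times.

R
    # Install microbenchmark if it is not already available
    if (!require(microbenchmark)) {
      install.packages("microbenchmark")
    }
    library(microbenchmark)
    library(dplyr)

    # Filter, group and summarise with dplyr
    dplyr_time <- microbenchmark(
      .dplyr <- DT %>%
        filter(x > 2) %>%
        group_by(y) %>%
        summarise(sum_x = sum(x)),
      times = 10
    )
    print(dplyr_time)

    # The same operation in data.table syntax
    data.table_time <- microbenchmark(
      .data.table <- DT[x > 2, sum(x), by = y],
      times = 10
    )
    print(data.table_time)

Output: the execution time of the dplyr and data.table operations, including the minimum, median and maximum times across 10 runs. With a four-row table these timings mostly reflect per-call overhead, so they indicate little about performance on large datasets; a meaningful comparison needs a much larger table.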
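The join in section 3 uses tables whose keys all match, so it cannot show how unmatched rows are treated. The following is a minimal sketch of that behaviour, not part of the original example: the tables `orders` and `labels` and their values are hypothetical, and it assumes a data.table version that accepts `nomatch = NULL` (1.12.0 or later).

```r
library(data.table)

# Hypothetical tables for illustration only:
# "D" appears only in orders, "E" appears only in labels.
orders <- data.table(y = c("A", "B", "D"), x = c(1, 2, 4))
labels <- data.table(y = c("A", "B", "E"), v = c("alpha", "beta", "epsilon"))

# X[Y, on = ...] keeps every row of Y (labels here);
# rows of Y without a match get NA in the columns that come from X.
right_join <- orders[labels, on = "y"]

# nomatch = NULL drops the unmatched rows, giving an inner join.
inner_join <- orders[labels, on = "y", nomatch = NULL]

# merge() is an alternative spelling; its default (all = FALSE) is an inner join.
inner_merge <- merge(orders, labels, by = "y")

print(right_join)
print(inner_join)
print(inner_merge)
```

Here `right_join` has three rows, including `y = "E"` with `x` set to NA, while `inner_join` and `inner_merge` contain only the rows for "A" and "B". Which form to prefer is largely a matter of taste: the bracket form composes with filtering and grouping in a single call, whereas `merge()` reads more like a base R join.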