
Select First and Last Row Based on Group Column in R Data Frame
In data analysis we often need to extract the rows that carry the most important information about a data set. For grouped data, these are frequently the first and last row of each group, for example to compare the initial and final values across groups. The slice function of the dplyr package can select these rows: after grouping with group_by, slice(c(1, n())) keeps row 1 and row n() of every group, where n() is the number of rows in the current group.
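To make the idea concrete before the larger examples, here is a minimal sketch on a small hand-made data frame. The names toy, grp, val, and first_last are hypothetical and chosen only for illustration; the values are fixed so the result is easy to check by eye.

```r
library(dplyr)

# Hypothetical toy data: two groups with three rows each
toy <- data.frame(
  grp = rep(c("a", "b"), each = 3),
  val = c(10, 20, 30, 40, 50, 60)
)

# Within each group, keep row 1 and row n() (the group's last row)
first_last <- toy %>%
  group_by(grp) %>%
  slice(c(1, n())) %>%
  ungroup()

print(first_last)
```

Group a contributes the rows with val 10 and 30, and group b the rows with val 40 and 60. The trailing ungroup() is optional; it only drops the grouping so that later operations act on the whole result.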
Example
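Consider the data frame below. Because x2 is drawn at random with rpois, the values you get will differ from run to run:
> x1 <- rep(1:4, each = 10)
> x2 <- rpois(40, 5)
> df1 <- data.frame(x1, x2)
> head(df1, 12)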
Output
   x1 x2
1   1  3
2   1  4
3   1  6
4   1  6
5   1  3
6   1  4
7   1  7
8   1  8
9   1  7
10  1  2
11  2  8
12  2  7
Example
> tail(df1,12)
Output
   x1 x2
29  3  4
30  3  5
31  4  4
32  4  6
33  4  7
34  4  5
35  4  5
36  4  4
37  4  9
38  4  4
39  4  3
40  4  6
Loading the dplyr package:
> library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
    filter, lag
The following objects are masked from ‘package:base’:
    intersect, setdiff, setequal, union
Selecting the first and last row of each group defined by column x1:
Example
> df1 %>% group_by(x1) %>% slice(c(1, n()))
Output
# A tibble: 8 x 2
# Groups:   x1 [4]
     x1    x2
  <int> <int>
1     1     3
2     1     2
3     2     8
4     2     4
5     3     5
6     3     5
7     4     4
8     4     6
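One case the example above does not cover is a group that contains a single row, where the positions 1 and n() are the same. The sketch below is one possible way to handle it, not part of the original example: wrapping the index vector in unique() ensures such a group contributes its row only once. The data frame df_small and the result name ends are hypothetical.

```r
library(dplyr)

# Hypothetical data: group 3 has only one row
df_small <- data.frame(
  x1 = c(1, 1, 1, 2, 2, 3),
  x2 = c(5, 7, 9, 4, 6, 8)
)

# unique() collapses c(1, 1) to 1 when a group has a single row
ends <- df_small %>%
  group_by(x1) %>%
  slice(unique(c(1, n()))) %>%
  ungroup()

print(ends)
```

Here groups 1 and 2 each yield two rows and group 3 yields one, for five rows in total.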
Let’s look at another example, this time with a character group column:
Example
> y1 <- rep(c("A", "B", "C"), each = 10)
> y2 <- rnorm(30)
> df2 <- data.frame(y1, y2)
> head(df2, 12)
Output
   y1         y2
1   A -1.1640927
2   A  0.3146504
3   A -1.5213974
4   A -1.3728970
5   A -0.9964678
6   A -0.5022738
7   A -0.4225463
8   A -0.3501037
9   A  0.3043838
10  A -1.5216102
11  B -0.2425732
12  B  0.5554217
Example
> tail(df2,12)
Output
   y1          y2
19  B  0.30172320
20  B  1.68341427
21  C  0.55127997
22  C -1.77840803
23  C  0.03001296
24  C -1.19246335
25  C  0.03612258
26  C -0.35468216
27  C -0.63579743
28  C -1.90074403
29  C  0.50072577
30  C  0.31911138
Example
> df2 %>% group_by(y1) %>% slice(c(1, n()))
Output
# A tibble: 6 x 2
# Groups:   y1 [3]
  y1        y2
  <fct>  <dbl>
1 A     -1.16
2 A     -1.52
3 B     -0.243
4 B      1.68
5 C      0.551
6 C      0.319
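If loading dplyr is not an option, a similar result can be obtained in base R. The sketch below is an alternative technique, not the slice approach used above: it relies on duplicated(), which flags repeated values, so negating it marks the first occurrence of each group value, and the fromLast = TRUE argument does the same for the last occurrence. The data frame df and the helper names are hypothetical, and the values are fixed rather than random so the result is easy to verify.

```r
# Hypothetical data: three groups of four rows with fixed values
df <- data.frame(
  y1 = rep(c("A", "B", "C"), each = 4),
  y2 = 1:12
)

# TRUE for the first row in which each group value appears
is_first <- !duplicated(df$y1)

# TRUE for the last row in which each group value appears
is_last <- !duplicated(df$y1, fromLast = TRUE)

# Keep rows that are first or last within their group
ends_base <- df[is_first | is_last, ]

print(ends_base)
```

The rows kept are 1, 4, 5, 8, 9, and 12, in their original order. A single-row group is kept once, because the logical OR selects each row at most one time.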