
Remove Underscore from Column Names of an R Data Frame
When data is imported from an outside source, the column names often arrive with underscore-separated parts, either because the import step produced them or because the original data used that format. To make the headers shorter and cleaner, we can remove the underscores with the gsub function, which replaces every match of a pattern in a character vector.
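As a minimal sketch of what gsub does before applying it to a data frame, the snippet below works on a plain character vector. The names in `cols` are hypothetical and chosen only for illustration; `fixed = TRUE` is an optional argument that makes the pattern a literal string rather than a regular expression.

```r
# Hypothetical column names, as they might arrive from an imported file
cols <- c("first_name", "last_name", "zip_code_5")

# gsub() replaces every match; fixed = TRUE treats "_" as a literal string
no_underscore <- gsub("_", "", cols, fixed = TRUE)

# sub() replaces only the first match in each element
first_only <- sub("_", "", cols, fixed = TRUE)

print(no_underscore)  # "firstname" "lastname" "zipcode5"
print(first_only)     # "firstname" "lastname" "zipcode_5"
```

The comparison with sub shows why gsub is the better fit here: a name with more than one underscore would otherwise keep all but the first.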
Consider the following data frame −
Example
x_1<-sample(1:10,20,replace=TRUE)
x_2<-sample(1:10,20,replace=TRUE)
x_3<-sample(1:10,20,replace=TRUE)
x_4<-sample(1:10,20,replace=TRUE)
x_5<-sample(1:10,20,replace=TRUE)
df1<-data.frame(x_1,x_2,x_3,x_4,x_5)
df1
Output
   x_1 x_2 x_3 x_4 x_5
 1  10   4   6   5  10
 2   6  10   2   1   4
 3   9   9   6   1   4
 4   6   1   5   5   8
 5   7   7   4   7   4
 6   1   5   2   1   8
 7   8   5   5   2   9
 8   8   4   1   9   8
 9   8   1   7   4   3
10   5   9   3  10   3
11   2   7   5   6   9
12  10   1   4   1   5
13   8  10  10   1   2
14   3  10   5   7   6
15   5   6   9   1  10
16   3   8   6   4   7
17   8   9   5   7   2
18   6  10   5   6   8
19   1   8   3   2   9
20   8   1   5  10   5
Removing underscore from column names −
Example
names(df1)<-gsub("_","",names(df1))
df1
Output
   x1 x2 x3 x4 x5
 1  6  8  2  9  6
 2  1  9  3  4 10
 3  2  1  8 10 10
 4  4 10  3  6  1
 5 10  6  6  6  5
 6  9  4  6  6  2
 7  3  9 10  5  9
 8  8  1  5  3  8
 9  4  9  2  5  6
10  9  3  3  5  4
11  7  1  4  6  3
12 10  6  3  3  1
13  7  6 10 10  8
14  9  6  4  1  1
15  7  5 10  2  1
16  1  3  7  4  8
17  2  1  7  2  8
18  1 10  8  2  3
19  8  7  6  6 10
20  3  8  9  8  3
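The values in the two printouts above differ, presumably because the random sample was drawn again between runs. Renaming the columns does not change the data itself. The following sketch illustrates this on a smaller data frame; the seed value, the data frame `df`, and the copy `before` are assumptions made for this illustration only.

```r
set.seed(42)  # arbitrary seed, only so the sample can be reproduced
df <- data.frame(x_1 = sample(1:10, 5, replace = TRUE),
                 x_2 = sample(1:10, 5, replace = TRUE))
before <- df  # keep a copy with the original names

# Only the names attribute is modified
names(df) <- gsub("_", "", names(df))

print(names(df))                     # "x1" "x2"
print(identical(df$x1, before$x_1))  # TRUE
print(identical(df$x2, before$x_2))  # TRUE
```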
Let’s have a look at another example −
Example
y_1<-rnorm(20)
y_2<-rnorm(20,2,1)
y_3<-rnorm(20,2,0.5)
y_4<-rnorm(20,2,0.0003)
y_5<-rnorm(20,10,1)
df2<-data.frame(y_1,y_2,y_3,y_4,y_5)
df2
Output
            y_1       y_2      y_3      y_4       y_5
 1  0.514450792 2.4374182 3.230083 1.999826 12.625661
 2 -0.312792686 0.8350701 2.769788 1.999740  8.699441
 3 -0.710758168 2.7832089 1.971917 2.000519  8.430542
 4 -0.060647019 1.4626953 1.971298 2.000600  9.568890
 5  2.363567996 0.8239008 2.626454 2.000266 10.038633
 6  1.227010669 2.6716199 1.844929 1.999768  7.838243
 7 -0.994717233 1.1798125 2.084188 1.999643 11.254072
 8  2.584374114 1.6053897 2.453163 2.000089 11.256447
 9  0.863363636 1.0685646 1.457286 2.000659 11.001834
10 -0.190736476 1.4468239 1.829696 2.000229 10.425032
11  0.716178594 2.7498080 2.406190 1.999487  9.906237
12 -1.670744103 1.1184815 2.206973 2.000288  8.993506
13  1.011970392 2.7794836 2.560877 2.000160 12.564313
14 -0.099591556 1.5176429 1.841669 2.000175 12.050816
15  3.230713917 1.8450534 2.065576 2.000189  9.243683
16  0.734370382 0.8649671 1.550325 2.000698 10.320533
17  1.156661539 3.8099910 2.842250 1.999826 10.134682
18 -0.496844480 2.0082680 1.456640 2.000119 10.498172
19 -0.001995988 1.7054230 2.702496 1.999963  8.572382
20 -0.190562902 2.6200714 1.822893 1.999612  9.683227
Removing underscore from column names −
Example
names(df2)<-gsub("_","",names(df2))
df2
Output
            y1        y2        y3       y4        y5
 1  0.35283126 2.7403674 1.5855939 1.999599 10.615962
 2  2.04048363 1.7570445 1.9365559 1.999934 10.734033
 3 -0.99194313 1.9299296 3.4318183 2.000200  8.821012
 4  0.03923376 2.8984508 1.3765896 1.999948  8.371278
 5  0.48921437 1.7272755 2.0049735 1.999814 10.769563
 6 -1.52296501 1.1843431 1.3387394 1.999670 10.984169
 7 -0.43659539 3.0847073 2.0724138 2.000099 10.163438
 8 -1.07562516 2.4046583 2.3631921 1.999976  8.119308
 9  0.25897051 4.0599361 2.5180669 2.000179  8.780155
10  0.90011031 0.5844179 3.0924616 2.000156 10.945022
11 -1.01455924 1.3601391 1.3491111 2.000197 11.172243
12 -1.21902395 1.5613617 1.6721161 2.000014  9.752595
13  1.10335026 3.0485505 2.5479672 2.000200 10.851384
14  1.66150031 0.9157312 2.0733168 2.000298 10.045139
15 -2.88733135 1.6426962 1.4906487 1.999932 10.596103
16 -0.20689147 1.7962494 0.9636048 1.999893 10.489436
17 -0.66668766 2.0058826 1.7932363 2.000102 10.702172
18 -0.32072057 2.8834813 2.1764040 2.000017 10.699573
19 -0.29862766 4.6416591 2.8638125 1.999819 10.211451
20 -0.47632229 1.2781510 2.8128627 1.999981  9.046588
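Since the same renaming step is applied to each data frame in turn, it could be wrapped in a small function. This is only a sketch: `remove_underscores` is a hypothetical helper, not part of base R, and `df_a` and `df_b` are small stand-in data frames made up for the illustration.

```r
# remove_underscores() is a hypothetical helper, not part of base R
remove_underscores <- function(df) {
  names(df) <- gsub("_", "", names(df), fixed = TRUE)
  df  # return the data frame with cleaned names
}

# Small stand-in data frames for illustration
df_a <- data.frame(x_1 = 1:3, x_2 = 4:6)
df_b <- data.frame(y_1 = c(0.5, 1.5), y_2 = c(2.5, 3.5))

# Apply the helper to several data frames at once
cleaned <- lapply(list(a = df_a, b = df_b), remove_underscores)

print(names(cleaned$a))  # "x1" "x2"
print(names(cleaned$b))  # "y1" "y2"
```

Returning the modified data frame, instead of changing it in place, lets the caller decide whether to overwrite the original object.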