R and Python Tables

This cheat sheet compares table operations across R's built-in data.frame class, R's data.table package, and Python's pandas DataFrame. For each, it lists the class name and the functions for getting help and for creating, loading, saving, printing, and inspecting a table, then shows how to select, add, remove, and transform rows and columns.

Uploaded by

Leo Gama


Operation R Python (pandas)

Class name data.frame data.table DataFrame

Load package/module # built-in library(data.table) import pandas as pd

Get help ?data.frame help(package=data.table) help(pd)

?`[.data.frame` ?data.table ?pd.DataFrame

Create table
    data.frame:  data.frame(a=c(0.3, 1.5, 7), b=c(NA, 6, 2), row.names=c("x", "y", "z"))
    data.table:  data.table(i=c("x", "y", "z"), a=c(0.3, 1.5, 7), b=c(NA, 6, 2), key="i")
    pandas:      pd.DataFrame(data={"a": [0.3, 1.5, 7], "b": [None, 6, 2]}, index=["x", "y", "z"])

Save table to text file
    data.frame:  write.table(T, "table.txt", col.names=TRUE, row.names=TRUE, sep="\t", na="NA", quote=FALSE)
    data.table:  write.table(T, "table.txt", col.names=TRUE, row.names=FALSE, sep="\t", na="NA", quote=FALSE)
    pandas:      T.to_csv("table.txt", header=True, index=True, sep="\t", na_rep="NA", quoting=csv.QUOTE_MINIMAL)  # needs import csv

Read table from text file
    data.frame:  read.delim("table.txt", header=TRUE, row.names=1, col.names=c(NA, NA, NULL), na.strings="NA")
    data.table:  fread("table.txt", header="auto", select=c("i", "a", "b"), na.strings="NA")  # use setkey(T, i) for indexing
    pandas:      pd.read_table("table.txt", header="infer", index_col=0, usecols=["a", "b"], na_values="NA")
    Ex:
              a     b     c*
        i -------------
        x |   0.3   7     NA
        y |   1.5   NA    6

Print first/last rows head(T); tail(T) head(T); tail(T) T.head(); T.tail()

Table quartiles summary(T) summary(T) T.describe()

Table information str(T) str(T); tables() T.info()

Table dimensions dim(T) dim(T) T.shape

Number of rows nrow(T) nrow(T) len(T)

Number of columns ncol(T) ncol(T) len(T.columns)

Index rownames(T) rownames(T) T.index

rownames(T) <- new_index setkey(T, new_index) # column name of T.index = new_index

Columns colnames(T) colnames(T) T.columns

colnames(T) <- new_columns setnames(T, new_columns) T.columns = new_columns
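
The pandas column of the rows above can be strung together into one round trip: create the table, save it, read it back, and inspect it. The following is a minimal sketch, not part of the original sheet. It assumes an in-memory buffer in place of "table.txt" so that no file is left behind, and it leaves out the usecols argument so that every column is read back. The names buffer and T2 are made up for the example.

```python
import csv
import io

import pandas as pd

# Same example table as in the "Create table" row.
T = pd.DataFrame(data={"a": [0.3, 1.5, 7], "b": [None, 6, 2]},
                 index=["x", "y", "z"])

# Assumption: write to an in-memory buffer rather than "table.txt".
buffer = io.StringIO()
T.to_csv(buffer, header=True, index=True, sep="\t", na_rep="NA",
         quoting=csv.QUOTE_MINIMAL)

# Rewind the buffer and read the tab-separated text back in,
# using the first column as the index.
buffer.seek(0)
T2 = pd.read_table(buffer, header="infer", index_col=0, na_values="NA")

# Inspection calls from the rows above.
print(T2.head())
print(T2.shape)                  # (rows, columns)
print(len(T2), len(T2.columns))  # number of rows, number of columns
print(list(T2.index), list(T2.columns))
```

The missing value in column b is written as the text NA and comes back as a missing value, because na_values="NA" matches na_rep="NA".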
Operation R Python (pandas)
Class name data.frame data.table DataFrame

Get 1st row T[1, ] T[1, ]; T[1] T.iloc[0]; T.iloc[0, :]

Get row "x" T["x", ] T["x", ]; T["x"] T.loc["x"]; T.loc["x", :]

Get 1st column T[[1]]; T[1] T[[1]]; T[, 1, with=F] T.iloc[:, 0]

Get column "a" T$a; T[["a"]]; T["a"]; T[, "a"] T$a; T[["a"]]; T[, a]; T[, "a", with=F] T.a; T["a"]; T.loc[:, "a"]

Add column T$d <- new_column T[, d := new_column]; # see help(`:=`) T["d"] = new_column;
# or any of the above forms set(T, NULL, "d", new_column) T.insert(0, "d", new_column)

Remove column T[["d"]] <- NULL; # idem T[, d := NULL]; del T["d"];
T <- T[names(T) != "d"] set(T, NULL, "d", NULL) T.drop("d", axis=1, inplace=True)

Get subset (example) T[1:2, c("a", "c")] T[1:2, .(a, c)] # .(a, c) == list(a, c) T.iloc[0:2][["a", "c"]] # .ix (iloc + loc) was removed in pandas 1.0

Reorder columns* T <- T[order(colnames(T))] setcolorder(T, new_order) T = T.reindex(columns=new_order)

Sort* T <- T[order(rownames(T)), ] setorder(T, i) # setkey also sorts T.sort_index(axis=0, inplace=True)

Apply func to rows/cols* lapply(T,func); apply(T, 2, func) T[, lapply(.SD, func), .SDcols=-1] T.apply(func) # axis=1 for rows

Apply func elementwise* apply(T, 1:2, func) ??? T.applymap(func) # T.map(func) in pandas >= 2.1

Mean column* T$sum <- rowMeans(T) T[, sum := rowMeans(.SD)] T["sum"] = T.mean(axis=1)

Add two columns* T$b + T$c T$b + T$c; T[, b + c] T.b + T.c

* Untested.
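
The starred pandas entries can be checked with a few lines. The following is a minimal sketch, not part of the original sheet, and it uses the method names of recent pandas releases (reindex, sort_index, sort_values), which may differ from the spellings in older versions. Column c, the list new_order, and the lambda functions are made up for the example.

```python
import pandas as pd

# Columns a and b follow the "Create table" row; column c is an
# assumption added so the subset and column-sum rows have data.
T = pd.DataFrame(data={"a": [0.3, 1.5, 7.0],
                       "b": [None, 6.0, 2.0],
                       "c": [1.0, 2.0, 3.0]},
                 index=["x", "y", "z"])

# Get subset: first two rows by position, columns "a" and "c" by label.
subset = T.iloc[0:2][["a", "c"]]

# Reorder columns using an explicit (hypothetical) order.
new_order = ["c", "b", "a"]
reordered = T.reindex(columns=new_order)

# Sort by the index, or by the values of one column.
by_index = T.sort_index(axis=0, ascending=False)
by_column = T.sort_values("a", ascending=False)

# Apply a function to each column (the default, axis=0).
col_range = T.apply(lambda col: col.max() - col.min())

# Apply a function elementwise. DataFrame.map exists in pandas >= 2.1;
# older releases spell it applymap, so pick whichever is available.
elementwise = T.map if hasattr(T, "map") else T.applymap
doubled = elementwise(lambda value: value * 2)

# Row means (missing values are skipped) and the sum of two columns.
row_mean = T.mean(axis=1)
b_plus_c = T["b"] + T["c"]

print(subset)
print(reordered)
print(row_mean)
```

Computing row_mean before assigning it as a new column keeps the mean from including any previously added summary column.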

More about:
● R data.table differences from data.frame (data.table FAQ) - http://datatable.r-forge.r-project.org/datatable-faq.pdf
● pandas DataFrame comparison with data.frame - http://pandas.pydata.org/pandas-docs/stable/comparison_with_r.html
● http://graphlab.com/learn/translator/
● https://sites.google.com/site/gappy3000/home/pandas_r
● https://drive.google.com/folderview?id=0ByIrJAE4KMTtaGhRcXkxNHhmY2M&usp=sharing
● http://mathesaurus.sourceforge.net/matlab-python-xref.pdf
