
Different ways to create Pandas Dataframe

Last Updated : 02 Jan, 2025

A DataFrame is the most commonly used Pandas object. The pd.DataFrame() constructor is used to create a DataFrame, and because it accepts several kinds of input, there are several ways to create a Pandas DataFrame in Python.

Example: Creating a DataFrame from a Dictionary

Python
import pandas as pd

# Initialize a dictionary of lists.
data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}

# Create DataFrame
df = pd.DataFrame(data)

print(df)

Output:

    Name  Age
0    Tom   20
1   nick   21
2  krish   19
3   jack   18

Explanation: Here, a dictionary named data is created. The dictionary contains two keys: 'Name' and 'Age'.

  • The value for 'Name' is a list of names: ['Tom', 'nick', 'krish', 'jack'].
  • The value for 'Age' is a list of corresponding ages: [20, 21, 19, 18].
  • This dictionary structure is suitable for creating a DataFrame, as it allows each key to represent a column in the resulting DataFrame.
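As a quick check of the key-to-column mapping described above, the following sketch rebuilds the same DataFrame and inspects its column labels and shape. The second dictionary with lists of unequal length is a made-up illustration, not part of the original example; it is there only to show that the lists are expected to have the same length.

```python
import pandas as pd

data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}
df = pd.DataFrame(data)

# Each dictionary key becomes a column label, in insertion order.
print(df.columns.tolist())

# One row per list element, one column per key.
print(df.shape)

# Illustrative assumption: lists of unequal length cannot be
# aligned into columns, so pandas raises a ValueError.
mismatch_error = None
try:
    pd.DataFrame({'Name': ['Tom', 'nick'], 'Age': [20]})
except ValueError as err:
    mismatch_error = err
    print("ValueError:", err)
```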

Pandas Create Dataframe Syntax

pandas.DataFrame(data, index, columns)

Parameters:

  • data: The data from which the DataFrame is built. It can be a list, dictionary, Series, NumPy array, another DataFrame, or a scalar value (a scalar requires index and columns to be supplied as well).
  • index: Optional row labels. If it is omitted and the data carries no row labels of its own, the rows are labeled 0 to n-1, where n is the number of rows.
  • columns: Optional column labels. If it is omitted and the data carries no column labels of its own (dictionary keys, for example), the columns are labeled 0 to n-1, where n is the number of columns.

Returns:

  • DataFrame object
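To make the effect of the index and columns parameters concrete, here is a minimal sketch that builds a DataFrame from the same rows twice, once with default labels and once with explicit ones. The variable names and the labels 'r1', 'r2', 'x', 'y', 'z' are hypothetical and chosen only for illustration.

```python
import pandas as pd

rows = [[1, 2, 3], [4, 5, 6]]

# No labels given: rows and columns both get integer labels 0..n-1.
default_df = pd.DataFrame(rows)
print(default_df)

# Explicit row labels (index) and column labels (columns).
labeled_df = pd.DataFrame(rows, index=['r1', 'r2'], columns=['x', 'y', 'z'])
print(labeled_df)
```

With explicit labels, a single value can then be looked up by name, for example labeled_df.loc['r2', 'z'].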

Now that we have discussed the DataFrame() constructor, let’s look at the different ways to create a Pandas DataFrame.

Create an Empty DataFrame

An empty DataFrame can be created by calling the DataFrame() constructor of the Pandas library with no arguments.

Python
# Importing Pandas to create DataFrame
import pandas as pd

# Creating Empty DataFrame and Storing it in variable df
df = pd.DataFrame()

print(df)

Output:

Empty DataFrame
Columns: []
Index: []
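A common variation, sketched here as an illustration rather than as part of the original example, is to create an empty DataFrame that already has its column labels. The names empty_df and schema_df are hypothetical.

```python
import pandas as pd

# Completely empty: no rows and no columns.
empty_df = pd.DataFrame()
print(empty_df.empty)
print(empty_df.shape)

# Empty, but with column labels declared up front.
schema_df = pd.DataFrame(columns=['Name', 'Age'])
print(schema_df.columns.tolist())
print(len(schema_df))
```

The empty attribute is a convenient way to test whether a DataFrame holds any data before processing it.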

Creating a DataFrame from Lists or Arrays

Python
import pandas as pd

# initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age'])

print(df)

Output:

   Name  Age
0   tom   10
1  nick   15
2  juli   14

Explanation: To create a Pandas DataFrame from a list of lists, pass it to the pd.DataFrame() function. Each inner list becomes one row, so the DataFrame has as many rows as there are inner lists and as many columns as each inner list has elements. The columns parameter supplies the column names.
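The heading also mentions arrays. The following sketch shows the same idea with a two-dimensional NumPy array; the array values and the column names 'A' and 'B' are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# A 3x2 array: three rows, two columns.
arr = np.array([[1, 2], [3, 4], [5, 6]])

# Each row of the array becomes a row of the DataFrame.
df_arr = pd.DataFrame(arr, columns=['A', 'B'])

print(df_arr)
```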

Create DataFrame from List of Dictionaries

Python
import pandas as pd

# Initialize a list of dictionaries.
data = [{'a': 1, 'b': 2, 'c': 3},
        {'a': 10, 'b': 20, 'c': 30}]

# Creates DataFrame.
df = pd.DataFrame(data)

print(df)

Output:

    a   b   c
0   1   2   3
1  10  20  30

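Explanation: A Pandas DataFrame can be created by passing a list of dictionaries as input data. Each dictionary becomes one row, and the dictionary keys are used as column labels.

Another example is to create a Pandas DataFrame by passing a list of dictionaries together with row index labels. A key that is missing from a dictionary produces NaN in that row, which is why column 'a' below is displayed as a float.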

Python
import pandas as pd

# Initialize a list of dictionaries (the first one has no 'a' key)
data = [{'b': 2, 'c': 3}, {'a': 10, 'b': 20, 'c': 30}]

# Creates pandas DataFrame by passing
# Lists of dictionaries and row index.
df = pd.DataFrame(data, index=['first', 'second'])

print(df)

Output:

         b   c     a
first    2   3   NaN
second  20  30  10.0
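The NaN in the output above comes from the missing 'a' key in the first dictionary. The sketch below, using the same data, checks where the missing value is and shows one way to choose which keys become columns through the columns parameter. The name selected_df is hypothetical.

```python
import pandas as pd

data = [{'b': 2, 'c': 3}, {'a': 10, 'b': 20, 'c': 30}]
df = pd.DataFrame(data, index=['first', 'second'])

# The missing key shows up as NaN, so column 'a' holds floats.
print(df['a'].isna().tolist())
print(df.dtypes)

# columns= selects which keys to keep and in what order.
selected_df = pd.DataFrame(data, index=['first', 'second'],
                           columns=['a', 'b'])
print(selected_df)
```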

Creating a DataFrame from Another DataFrame

Python
import pandas as pd

original_df = pd.DataFrame({
    'Name': ['Tom', 'Nick', 'Krish', 'Jack'],
    'Age': [20, 21, 19, 18]
})

# Double brackets select a list of columns and return a DataFrame
new_df = original_df[['Name']]
print(new_df)

Output:

    Name
0    Tom
1   Nick
2  Krish
3   Jack

Explanation: You can create a new DataFrame based on an existing DataFrame by selecting specific columns or rows.
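The explanation also mentions selecting rows. As a minimal sketch, the example below filters rows with a boolean condition and makes an independent copy with copy(); the age threshold and the names adults_df and copy_df are assumptions made for illustration.

```python
import pandas as pd

original_df = pd.DataFrame({
    'Name': ['Tom', 'Nick', 'Krish', 'Jack'],
    'Age': [20, 21, 19, 18]
})

# Row selection: keep only the rows where Age is at least 20.
adults_df = original_df[original_df['Age'] >= 20]
print(adults_df)

# copy() returns an independent DataFrame, so changing it
# leaves the original untouched.
copy_df = original_df.copy()
copy_df.loc[0, 'Age'] = 99
print(original_df.loc[0, 'Age'])
print(copy_df.loc[0, 'Age'])
```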

Create DataFrame from a Dictionary of Series

Python
import pandas as pd

# Initialize a dictionary of Series.
d = {'one': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd']),
     'two': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd'])}

# creates Dataframe.
df = pd.DataFrame(d)

print(df)

Output:

   one  two
a   10   10
b   20   20
c   30   30
d   40   40

Explanation: A DataFrame can be created by passing a dictionary of Series to pd.DataFrame(). Each key becomes a column, and the resulting index is the union of the indexes of all the passed Series.
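In the example above both Series share the same index, so the union is not visible. The following sketch uses two Series with partly different indexes (values chosen only for illustration) to show how the union is formed and where NaN appears.

```python
import pandas as pd

d = {'one': pd.Series([10, 20, 30], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3], index=['b', 'c', 'd'])}

df = pd.DataFrame(d)

# The index is the union a, b, c, d; a label that is missing
# from one Series gets NaN in that column.
print(df)
```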

Create DataFrame using the zip() function

Python
import pandas as pd

# List1
Name = ['tom', 'krish', 'nick', 'juli']

# List2
Age = [25, 30, 26, 22]

# Pair the elements of the two lists into
# a list of tuples by using zip().
list_of_tuples = list(zip(Name, Age))

# Convert the list of tuples into
# a pandas DataFrame.
df = pd.DataFrame(list_of_tuples,
                  columns=['Name', 'Age'])

print(df)

Output:

    Name  Age
0    tom   25
1  krish   30
2   nick   26
3   juli   22

Explanation: The zip() function pairs the elements of the two lists into tuples, one tuple per row. Passing the resulting list of tuples to pd.DataFrame() together with the column names creates the DataFrame.
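One thing worth keeping in mind with this approach is that zip() stops at the shortest input. The sketch below uses a deliberately shortened list of ages (an assumption made for illustration) to show that the unmatched name is dropped without any error.

```python
import pandas as pd

names = ['tom', 'krish', 'nick', 'juli']
ages = [25, 30, 26]  # one value short, for illustration only

# zip() stops at the shortest list, so 'juli' is silently dropped.
df_short = pd.DataFrame(list(zip(names, ages)),
                        columns=['Name', 'Age'])

print(df_short)
print(len(df_short))
```

Checking that the lists have equal length before zipping avoids losing rows this way.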

Create a DataFrame by Providing the Index Label Explicitly

Python
import pandas as pd

# Initialize a dictionary of lists.
data = {'Name': ['Tom', 'Jack', 'nick', 'juli'],
        'marks': [99, 98, 95, 90]}

# Creates pandas DataFrame.
df = pd.DataFrame(data, index=['rank1',
                               'rank2',
                               'rank3',
                               'rank4'])

# print the data
print(df)

Output:

       Name  marks
rank1   Tom     99
rank2  Jack     98
rank3  nick     95
rank4  juli     90

Explanation: To create a DataFrame by providing the index label explicitly, you can use the index parameter of the pd.DataFrame() constructor. The index parameter takes a list of index labels as input, and the DataFrame will use these labels for the rows of the DataFrame.
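Once the rows have explicit labels, they can be looked up by label with loc. The sketch below reuses the data from the example and also shows, as an illustrative assumption about a likely mistake, that an index list whose length does not match the data raises a ValueError.

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'nick', 'juli'],
        'marks': [99, 98, 95, 90]}

df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])

# Look up a row or a single value by its index label.
print(df.loc['rank2'])
print(df.loc['rank2', 'Name'])

# The index must have one label per row; two labels for
# four rows is rejected.
length_error = None
try:
    pd.DataFrame(data, index=['rank1', 'rank2'])
except ValueError as err:
    length_error = err
    print("ValueError:", err)
```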


