A Pandas DataFrame is a two-dimensional data structure for storing and manipulating tabular data (rows and columns), similar to an Excel spreadsheet or an SQL table. It simplifies handling, filtering and analyzing large datasets. A DataFrame can be created from various data structures such as lists, dictionaries and NumPy arrays.
Creating an Empty DataFrame
An empty Pandas DataFrame is a table with no data, though it can have defined columns or indexes. It’s useful for setting up a structure before adding data and can be created using the DataFrame constructor.
import pandas as pd
df = pd.DataFrame()
print(df)
Output
Empty DataFrame
Columns: []
Index: []
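An empty DataFrame can still carry column labels, as noted above. The following is a minimal sketch of that case; the column names `name` and `score` are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Column names are illustrative placeholders for this sketch.
df = pd.DataFrame(columns=['name', 'score'])

print(df.empty)          # True: the frame has no rows
print(list(df.columns))  # ['name', 'score']
print(df.shape)          # (0, 2)
```

Defining the columns up front fixes the layout of the table before any rows are added.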
Creating a DataFrame from a List
One way to create a DataFrame is by using a single list. Pandas automatically assigns index values to the rows when you pass a list.
- Each item in the list becomes a row.
- The DataFrame consists of a single column, labeled 0 by default.
import pandas as pd
lst = ['Geeks', 'For', 'Geeks', 'is', 'portal', 'for', 'Geeks']
df = pd.DataFrame(lst)
print(df)
Output
        0
0   Geeks
1     For
2   Geeks
3      is
4  portal
5     for
6   Geeks
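The default labels can be replaced when the DataFrame is built. This sketch passes the constructor's `columns` and `index` parameters; the label `word` and the index values `a`, `b`, `c` are hypothetical names chosen for illustration.

```python
import pandas as pd

lst = ['Geeks', 'For', 'Geeks']

# 'word' and the index labels are hypothetical names for this sketch.
df = pd.DataFrame(lst, columns=['word'], index=['a', 'b', 'c'])

print(list(df.columns))     # ['word']
print(df.loc['b', 'word'])  # For
```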
Creating a DataFrame from a Dictionary of NumPy Arrays
We can create a Pandas DataFrame using a dictionary of NumPy arrays. Each key in the dictionary represents a column name and the corresponding NumPy array provides the values for that column.
import numpy as np
import pandas as pd
data = {
    'A': np.array([1, 4, 7]),
    'B': np.array([2, 5, 8]),
    'C': np.array([3, 6, 9]),
}
df = pd.DataFrame(data)
print(df)
Output
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9
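Two details of this approach are worth checking: each column takes its dtype from the array it came from, and the arrays are expected to have the same length. The sketch below illustrates both with made-up values; the variable `length_error` is introduced here only to capture the exception.

```python
import numpy as np
import pandas as pd

data = {
    'A': np.array([1, 4, 7]),        # integer array
    'B': np.array([2.5, 5.0, 8.5]),  # float array
}
df = pd.DataFrame(data)

# Each column keeps the dtype of its source array.
print(df.dtypes)

# Arrays of different lengths cannot be aligned into columns.
length_error = None
try:
    pd.DataFrame({'A': np.array([1, 2, 3]), 'B': np.array([1, 2])})
except ValueError as exc:
    length_error = exc

print(type(length_error).__name__)  # ValueError
```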
Creating a DataFrame from a List of Dictionaries
We can create a DataFrame using a list of dictionaries, where each dictionary represents a row and its keys become the column names. This is useful for structured records, such as JSON returned by an API or rows collected during web scraping.
import pandas as pd
data = [
{'name': 'Mike', 'degree': 'MBA', 'score': 90},
{'name': 'Dan', 'degree': 'BCA', 'score': 40},
{'name': 'Emilia', 'degree': 'M.Tech', 'score': 80},
]
df = pd.DataFrame(data)
print(df)
Output
     name  degree  score
0    Mike     MBA     90
1     Dan     BCA     40
2  Emilia  M.Tech     80
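Records from APIs often lack some keys. As a minimal sketch with one key deliberately omitted, the example below shows that Pandas fills the gap with NaN rather than raising an error.

```python
import pandas as pd

data = [
    {'name': 'Mike', 'degree': 'MBA', 'score': 90},
    {'name': 'Dan', 'degree': 'BCA'},  # 'score' omitted on purpose
]
df = pd.DataFrame(data)

# The missing value is stored as NaN, so the column becomes float.
print(df['score'].isna().tolist())  # [False, True]
print(df['score'].dtype)            # float64
```

The columns are the union of the keys across all dictionaries, so a row that is missing a key does not change the shape of the table.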