Reading specific columns of a CSV file using Pandas
When working with large datasets stored in CSV (Comma-Separated Values) files, it’s often unnecessary to load the entire dataset into memory. Instead, you can selectively read specific columns using Pandas in Python.
Read Specific Columns From CSV File
Let us see how to read specific columns of a CSV file using Pandas. This is done with the pandas.read_csv() function: pass the path of the CSV file as the first argument, and pass the list of columns to keep through the usecols keyword argument. The function returns a DataFrame that contains only those columns.
The usecols parameter of the read_csv() function filters the columns to be loaded into the DataFrame. This is particularly useful when:
- The dataset contains hundreds or thousands of columns.
- You are only interested in analyzing a subset of the data.
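As a rough, self-contained sketch of the effect, the snippet below reads the same CSV text twice, once in full and once with usecols, and compares the results. The in-memory sample data and the 'Hours' column name are assumptions made for illustration; they are not the real contents of any file used in this article.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
# The 'Hours' column is an assumed name used only for this illustration.
csv_text = """Hours,IQ,Scores,Pass
2,80,18,0
5,80,45,1
3,70,25,0
8,90,72,1
"""

# Read every column, then read only two of them.
full = pd.read_csv(io.StringIO(csv_text))
subset = pd.read_csv(io.StringIO(csv_text), usecols=["IQ", "Scores"])

print(list(full.columns))    # ['Hours', 'IQ', 'Scores', 'Pass']
print(list(subset.columns))  # ['IQ', 'Scores']

# Fewer columns means fewer bytes held in memory.
full_bytes = int(full.memory_usage(deep=True).sum())
subset_bytes = int(subset.memory_usage(deep=True).sum())
print(subset_bytes < full_bytes)  # True
```

On a four-row sample the saving is tiny, but the same pattern applies unchanged to files with many columns.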
The examples below show different ways to read specific columns of a CSV file using Pandas.
Read All Columns of a CSV File
In this example, the Pandas library is imported, and the code reads the entire content of the “student_scores2.csv” file into a DataFrame ‘df’ using Pandas. The printed output displays the entire dataset for further examination.
import pandas as pd
# read all columns of the csv file using Pandas
df = pd.read_csv("student_scores2.csv")
print(df)
Read Specific Columns of a CSV File Using usecols
In this example, the Pandas library is imported, and the code uses it to read only the ‘IQ’ and ‘Scores’ columns from the “student_scores2.csv” file, storing the result in the DataFrame ‘df’. The printed output displays the selected columns for analysis.
import pandas as pd
# read specific columns of csv file using Pandas
df = pd.read_csv("student_scores2.csv", usecols=['IQ', 'Scores'])
print(df)

As another example, the code below reads the ‘Survived’ and ‘Pclass’ columns from the “titanic.csv” file using Pandas. The resulting DataFrame ‘df’ contains only the selected columns.
import pandas as pd
# read specific columns of csv file using Pandas
df = pd.read_csv("titanic.csv", usecols = ['Survived','Pclass'])
print(df)

Selecting Specific Columns by Index
If you don’t know the column names or prefer working with indices, you can pass a list of integers representing column positions:
import pandas as pd
# Read only the third and last columns (zero-based indices 2 and 3)
df = pd.read_csv('student_scores2.csv', usecols=[2, 3])
df
Output:
   Scores  Pass
0      18     0
1      45     1
2      25     0
3      72     1
4      30     0
5      20     0
6      88     1
..
..
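One detail worth knowing when selecting by position: the pandas documentation notes that the order of the elements in usecols is ignored, so the columns come back in file order. The sketch below illustrates this and shows one way to reorder the columns after reading. The sample data and the 'Hours' column name are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical sample data; the 'Hours' column is an assumed name.
csv_text = """Hours,IQ,Scores,Pass
2,80,18,0
5,80,45,1
3,70,25,0
"""

# The order of the indices in usecols does not change the result.
df_a = pd.read_csv(io.StringIO(csv_text), usecols=[2, 3])
df_b = pd.read_csv(io.StringIO(csv_text), usecols=[3, 2])
print(list(df_a.columns))  # ['Scores', 'Pass']
print(list(df_b.columns))  # ['Scores', 'Pass']

# To get a different column order, reindex the DataFrame after reading.
reordered = pd.read_csv(io.StringIO(csv_text), usecols=[2, 3])[["Pass", "Scores"]]
print(list(reordered.columns))  # ['Pass', 'Scores']
```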
Using Lambda for Dynamic Selection
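The usecols parameter also accepts a callable. Pandas calls it with each column name and keeps the columns for which it returns True, so you can select columns dynamically with a lambda, or with a function that applies a regular expression. For example:
import pandas as pd
# Dynamically select columns whose names contain "IQ" or "Pass"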
df = pd.read_csv('/content/student_scores2.csv', usecols=lambda col: 'IQ' in col or 'Pass' in col)
df
Output:
    IQ  Pass
0   80     0
1   80     1
2   70     0
3   90     1
4   70     0
5   80     0
6  100     1
7   90     1
..
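Note that a substring test such as 'IQ' in col also matches any longer column name that happens to contain "IQ". If exact or pattern-based matching is needed, the callable can wrap a compiled regular expression instead. The following is a minimal sketch of that idea; the sample data, the 'Hours' column name, and the pattern are assumptions made for illustration.

```python
import io
import re
import pandas as pd

# Hypothetical sample data; the 'Hours' column is an assumed name.
csv_text = """Hours,IQ,Scores,Pass
2,80,18,0
5,80,45,1
3,70,25,0
"""

# Keep only columns whose full name is exactly "IQ" or "Pass".
pattern = re.compile(r"IQ|Pass")

df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda col: pattern.fullmatch(col) is not None,
)
print(list(df.columns))  # ['IQ', 'Pass']
```

Using fullmatch rather than a substring test keeps the selection from picking up unintended columns as the file grows.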