
How to Load a Dataset From Google Drive Into Google Colab

Last Updated : 08 Oct, 2024

Google Colab (short for Colaboratory) is a free, cloud-hosted service from Google for writing and running Python in Jupyter notebooks, so you can load and analyze a dataset without relying on local resources. One of its advantages is access to GPUs and TPUs, which allows for faster computations while keeping a workflow similar to a local Jupyter Notebook. Colab also needs little setup and lets users share notebooks in real time, making it a good choice for collaborative projects.

Step-by-Step Guide to Loading Datasets from Google Drive

To read a dataset in Google Colab from an external source such as Google Drive, you need to write a few lines of code. Here’s a step-by-step guide to loading a dataset from Drive:

Step 1: Mount Google Drive

You can make the files and folders in your Google Drive account available to a notebook by mounting the Drive from a code cell. To mount your Google Drive, run the following code:

from google.colab import drive
drive.mount("/content/drive")

The mount() function attaches your Google Drive to the Colab file system at the given path, so any code in the notebook can access the files stored there. Once the Drive is mounted, you can browse to your datasets and read them like ordinary local files.
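If the same notebook is sometimes run outside Colab, the mount call can be wrapped so that it is skipped when the google.colab module is unavailable. The following is a minimal sketch under that assumption; the helper name mount_drive is hypothetical and not part of Colab.

```python
import os


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; return True on success."""
    try:
        # google.colab can only be imported inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        print("Not running in Colab; skipping the Drive mount.")
        return False

    # Prompts for authorization the first time it runs in a session.
    drive.mount(mount_point)
    return os.path.isdir(mount_point)


mounted = mount_drive()
```

Inside Colab this behaves like the two-line version above; elsewhere it prints a short message and returns False instead of raising an error.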

Step 2: Authorise Access

When you run the code cell, you will be prompted to grant Google Colab permission to access your Google Drive files. The Drive cannot be mounted until you approve this request.

[Image: Google Drive access request prompt]


After you grant permission, a page confirms which Google account is being given access. Older versions of Colab then display an authorization code that you paste into the prompt in the notebook; newer versions typically finish the mount as soon as you approve the request in the pop-up window. Either way, this step ensures that only an account you have approved can reach the files in your Google Drive.

[Image: Google Drive authentication page]

Step 3: Google Drive Mounted

After completing Step 2, your Google Drive will be mounted, as illustrated in the image below. At this point, you can easily read your dataset file from Google Drive.

[Image: Google Drive mounted in Colab]

Before proceeding, check your current working directory with the command:

!pwd

pwd stands for print working directory. It is a command on Unix-like operating systems, such as Linux and macOS, that prints the full path of the directory you are currently in. In a Colab cell, the leading ! runs the rest of the line as a shell command rather than as Python.

Knowing the working directory is useful when you explore the file system from the command line, because relative file paths are resolved against it.

In a Colab cell, the command prints /content, which is the default working directory. The Drive was mounted at /content/drive in Step 1, so any path to your dataset must start from /content/drive.
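The same check can be done from Python, and building dataset paths from a single root constant helps avoid typos in long paths. This is a small sketch; DRIVE_ROOT and drive_path are hypothetical names, and the "data" folder in the usage note is only an illustration.

```python
import os
from pathlib import PurePosixPath

# Colab mounts Drive at /content/drive; files in your own Drive sit under MyDrive.
DRIVE_ROOT = PurePosixPath("/content/drive/MyDrive")


def drive_path(*parts):
    """Build an absolute path to a file or folder inside the mounted Drive."""
    return str(DRIVE_ROOT.joinpath(*parts))


print(os.getcwd())              # Python equivalent of !pwd
print(drive_path("sales.csv"))  # /content/drive/MyDrive/sales.csv
```

For a file in a subfolder, pass each path component separately, for example drive_path("data", "sales.csv").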

Step 4: Accessing the Dataset

Once Step 3 is complete, you can navigate to the folder where your dataset is stored. To do this, use the following command:

!ls

ls is a command commonly used in Unix-like operating systems, including Linux and macOS, for listing the files and directories in the current directory (or a specified directory). It provides a way to view the contents of a directory from the command line.

For example, we will use a file named sales.csv. When ls is given a file path, it prints that path if the file exists and reports an error otherwise:

!ls /content/drive/MyDrive/sales.csv

[Image: output of the ls command for sales.csv]

Here, sales.csv sits directly inside MyDrive, the folder that corresponds to the top level of your own Google Drive. Listing a path this way confirms that the file exists and that the path is spelled correctly before you try to load it.
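When a folder holds many files, it can be handy to list only the dataset files from Python. The sketch below uses pathlib for that; list_datasets is a hypothetical helper, and the default "*.csv" pattern is an assumption based on the sales.csv example.

```python
from pathlib import Path


def list_datasets(folder, pattern="*.csv"):
    """Return the sorted names of files in `folder` that match `pattern`."""
    root = Path(folder)
    if not root.is_dir():
        # Usually means Drive is not mounted or the folder name is wrong.
        raise FileNotFoundError(f"Folder not found: {root}")
    return sorted(p.name for p in root.glob(pattern) if p.is_file())


# In Colab, after mounting Drive:
# list_datasets("/content/drive/MyDrive")
```

The function only looks at the given folder itself; it does not search subfolders.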

Step 5: Loading the Dataset

Now, depending on the structure of your dataset, you can load a dataset in Google Colab using Python libraries like Pandas for tabular data or NumPy for arrays.

import pandas as pd

# Read the CSV file from the mounted Drive into a DataFrame
df = pd.read_csv("/content/drive/MyDrive/sales.csv")

pd.read_csv is a function from Pandas, a Python library widely used for data manipulation and analysis. It reads a CSV (comma-separated values) file into a Pandas DataFrame.
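A common stumbling block is calling pd.read_csv before the Drive is mounted or with a mistyped path. The sketch below wraps the call in a hypothetical helper, load_csv, that checks the path first and forwards any extra keyword arguments (such as nrows or usecols) to pd.read_csv unchanged.

```python
from pathlib import Path

import pandas as pd


def load_csv(path, **read_csv_kwargs):
    """Read a CSV into a DataFrame, failing early with a clearer message."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found. Is Drive mounted and the path correct?"
        )
    # Extra keyword arguments (e.g. nrows, usecols) go straight to pandas.
    return pd.read_csv(csv_path, **read_csv_kwargs)


# In Colab, after mounting Drive:
# df = load_csv("/content/drive/MyDrive/sales.csv", nrows=1000)
# df.head()
```

Reading only the first rows with nrows is a quick way to inspect the columns of a large file before loading it in full.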

You can now work with the dataset in Google Colab just as you would in any other Python environment.

By following these steps, you can load a dataset from Google Drive into Google Colab and use it for data analysis and model training.

Conclusion

In this article, we discussed how to load a dataset from Google Drive into Google Colab. We covered how to mount Google Drive, authorise access, check the working directory with !pwd, locate the dataset with !ls, and read it into a Pandas DataFrame with pd.read_csv. With these steps, you can use any dataset stored in your Google Drive for analysis and model training in Colab.

