Extract Emails From a Text File Using Grep Command in Linux
Last Updated: 25 Oct, 2024
When dealing with large text files containing various information, it’s often necessary to extract specific data such as email addresses. While manual extraction is possible, it can be time-consuming and error-prone. This is where the powerful grep command in Linux comes to our rescue. In this article, we’ll explore how to use grep to efficiently extract email addresses from text files.
Grep Command in Linux
The grep command is a powerful tool in Linux used for searching and matching patterns within files or text streams. It uses regular expressions to find and print lines that match a specified pattern.
Syntax
grep [options] pattern [file...]
Where,
- options: Modify the behavior of grep (optional)
- pattern: The search pattern or regular expression
- file: The file(s) to search in (optional, grep can also read from standard input)
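As a small illustration of the last point, grep can read from standard input when no file is given. The sketch below pipes three made-up lines into grep; the input text is a hypothetical example, not part of the article's sample data.

```shell
# Hypothetical input: three lines are piped to grep instead of being read from a file.
matches=$(printf 'alpha\nbeta\nalphabet\n' | grep 'alpha')

# Prints the two lines that contain "alpha": "alpha" and "alphabet".
printf '%s\n' "$matches"
```

Because grep matches substrings by default, "alphabet" is selected as well as "alpha".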
Basic Example
Let’s start with a basic example of using grep to search for a simple pattern in a file:
grep "example" sample.txt
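This command searches for the string “example” in the file sample.txt and prints every line that contains it. Because grep matches substrings by default, lines containing words such as “examples” are printed as well.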

Basic grep command output
Key Options for Grep
Grep offers various options to modify its behavior and output. Here are some commonly used options:
Option | Description
-i | Ignore case distinctions
-v | Invert the match (select non-matching lines)
-n | Print line numbers along with matching lines
-r | Recursively search subdirectories
-e | Specify the pattern explicitly (useful for supplying several patterns or a pattern that begins with a hyphen)
-o | Print only the matched parts of a matching line
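To see how a few of these options behave, here is a minimal sketch that runs them against a small temporary file. The file contents and the variable names (demo, ci, inv, num, only) are hypothetical and chosen only for illustration.

```shell
# Hypothetical demo file; its contents are illustrative only.
demo=$(mktemp)
printf 'Error: disk full\nall good\nerror: retry\n' > "$demo"

ci=$(grep -i 'error' "$demo")        # case-insensitive: matches lines 1 and 3
inv=$(grep -v -i 'error' "$demo")    # inverted: only "all good" remains
num=$(grep -n 'error' "$demo")       # case-sensitive with line number: "3:error: retry"
only=$(grep -o -i 'error' "$demo")   # only the matched text, one match per line

printf '%s\n' "$ci" "$inv" "$num" "$only"
rm -f "$demo"
```

Options can be combined, as in `grep -v -i` above, which selects the lines that do not contain "error" in any letter case.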
Extracting Email Addresses
Now, let’s focus on our main task: extracting email addresses from a text file. We’ll use a regular expression to match the general format of email addresses.
Email Format and Regular Expression
A typical email address follows this format: username@domain.tld
We can create a regular expression to match this pattern:
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
This regular expression matches:
- One or more characters that can be letters, numbers, or the symbols . _ % + - (username)
- Followed by an @ symbol
- Followed by one or more characters that can be letters, numbers, dots, or hyphens (domain)
- Followed by a dot and two or more letters (top-level domain)
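One detail worth checking is the dot before the top-level domain. In an extended regular expression an unescaped dot matches any character, so it is written as \. when a literal dot is intended. The sketch below compares the two forms on a made-up string; the variable names and the test string are hypothetical.

```shell
# Two variants of the pattern: one with the dot escaped, one without.
strict='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
loose='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}'

# Hypothetical test string with no dot after the @ sign.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
strict_hit=$(printf 'user@host_com\n' | grep -E -o "$strict" || true)
loose_hit=$(printf 'user@host_com\n' | grep -E -o "$loose" || true)

printf 'strict: [%s]\n' "$strict_hit"   # empty: no literal dot after the domain
printf 'loose:  [%s]\n' "$loose_hit"    # the unescaped dot matched the underscore
```

The escaped form rejects the string, while the unescaped form accepts it because "." is allowed to stand in for "_".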
Example Dataset
Let’s create a sample text file (sample.txt) with some content including email addresses:
Welcome to our company!
Contact us at [email protected] for more information.
Our support team can be reached at [email protected].
For sales inquiries, email [email protected] or call 555-1234.
John Doe: [email protected]
Jane Smith: [email protected]
Invalid email: not.an.email
Another invalid: @missing.username.com
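One way to create a file like this from the shell is a here-document. In the sketch below, every address is a hypothetical placeholder on the reserved example.com and example.org domains, not the data from the original listing.

```shell
# Write a sample file; all addresses here are hypothetical placeholders.
cat > sample.txt <<'EOF'
Welcome to our company!
Contact us at info@example.com for more information.
Our support team can be reached at support@example.com.
For sales inquiries, email sales@example.com or call 555-1234.
John Doe: john.doe@example.com
Jane Smith: jane.smith@example.org
Invalid email: not.an.email
Another invalid: @missing.username.com
EOF

# Count the lines that contain an @ sign (six in this placeholder file).
at_lines=$(grep -c '@' sample.txt)
printf 'lines containing @: %s\n' "$at_lines"
```

Quoting the delimiter as 'EOF' stops the shell from expanding anything inside the here-document, so the text is written exactly as typed.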
Extracting Emails Using Grep
Now, let’s use grep with our regular expression to extract email addresses:
grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' sample.txt
Here’s what each part of the command does:
- -E: Use extended regular expressions
- -o: Print only the matched parts of a matching line
- The regular expression pattern we created earlier
- sample.txt: The input file

Grep command output for email extraction
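When the same address appears more than once, the extracted list can be piped through sort -u to remove duplicates. The following is a minimal sketch under stated assumptions: the temporary input file, its contents, and the variable names are hypothetical.

```shell
# Email pattern with the dot before the top-level domain escaped.
pattern='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

# Hypothetical input in which one address is repeated.
input=$(mktemp)
cat > "$input" <<'EOF'
Contact info@example.com or sales@example.com.
Reminder: info@example.com is monitored daily.
Another invalid: @missing.username.com
EOF

# Extract every match, then sort and drop duplicates.
unique=$(grep -E -o "$pattern" "$input" | sort -u)

# Count the unique addresses (grep -c '' counts lines).
count=$(printf '%s\n' "$unique" | grep -c '')

printf '%s\n' "$unique"
printf 'unique addresses: %s\n' "$count"
rm -f "$input"
```

With this input the pipeline prints info@example.com and sales@example.com once each. The trailing period after the second address is not included in the match, because the pattern requires letters after the final dot.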
Conclusion
The grep command, combined with regular expressions, provides a powerful and efficient way to extract email addresses from text files in Linux. By understanding the basic syntax and options of grep, along with crafting an appropriate regular expression, you can easily automate the process of finding and extracting specific patterns of data from large text files.
This technique can be extended to search for other types of data patterns, making grep an invaluable tool for text processing and data extraction tasks in Linux environments.