
Get Tag Name Using BeautifulSoup in Python
BeautifulSoup is one of the most widely used Python packages for web scraping. It parses HTML and XML documents, which makes it simpler and quicker to extract data from websites. One of the most frequent tasks when working with such documents is getting the tag name of a given element.
The BeautifulSoup library can be installed with the following command:
pip install beautifulsoup4
Method 1: Using the name attribute
This method reads the name attribute of a BeautifulSoup Tag object. The attribute holds the name of the tag as a string. Below is the syntax of the name attribute:
Syntax
tag.name
Return Type: a string containing the name of the tag.
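As a quick illustration of the syntax, the sketch below reads name from a Tag and from the BeautifulSoup object itself. The one-line markup is a hypothetical input chosen only for this illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical one-line document, used only for illustration
soup = BeautifulSoup("<div><a href='#'>link</a></div>", "html.parser")

link = soup.find("a")
print(link.name)        # a
print(type(link.name))  # <class 'str'>

# The BeautifulSoup object is not a real tag, so it reports a placeholder name
print(soup.name)        # [document]
```

Because the value is an ordinary string, it can be compared directly, for example link.name == "a".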
Algorithm
Import the BeautifulSoup module.
Define a multi-line HTML string to parse.
Create a BeautifulSoup object by passing the HTML document and a parser to the BeautifulSoup constructor. The html.parser parser is used in this case.
Find the first occurrence of <p> tag in the document using the soup.find() method.
Use the name attribute for getting the name of the p Tag object.
Print the tag name using the print() statement.
Example 1
The following example demonstrates this approach:
from bs4 import BeautifulSoup

# HTML document to be parsed
html_doc = """
<html>
<head>
<title>TutorialsPoint</title>
</head>
<body>
<p>TutorialsPoint</p>
</body>
</html>
"""

# Parse the HTML document using BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

# Get the first <p> tag in the HTML document
p_tag = soup.find('p')

# Get the tag name using the name attribute
tag_name = p_tag.name

# Print the tag name
print("Tag name is:", tag_name)
Output
Tag name is: p
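One detail the example does not cover: find() returns None when no tag matches, and reading name on None raises an AttributeError. The sketch below guards against that case. The helper first_tag_name is a hypothetical name introduced here for illustration, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup


def first_tag_name(markup, tag):
    """Return the name of the first matching tag, or None if nothing matches."""
    soup = BeautifulSoup(markup, "html.parser")
    found = soup.find(tag)
    # find() returns None when there is no match, so check before reading .name
    return found.name if found is not None else None


html_doc = "<html><body><p>TutorialsPoint</p></body></html>"
print(first_tag_name(html_doc, "p"))      # p
print(first_tag_name(html_doc, "table"))  # None
```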
Example 2
In this example, we parse an XML document and get the tag name of a custom tag. The 'xml' parser relies on the lxml package, which has to be installed separately (pip install lxml).
from bs4 import BeautifulSoup

xml_doc = '''
<book>
<title>Harry Potter</title>
<author>J.K. Rowling</author>
<publisher>Bloomsbury</publisher>
</book>
'''

# Parse the XML document using BeautifulSoup
soup = BeautifulSoup(xml_doc, 'xml')

# Get the first <author> tag in the XML document
tag = soup.find('author')

# Get the tag name using the name attribute
tag_name = tag.name

# Print the tag name
print("Tag name is:", tag_name)
Output
Tag name is: author
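A difference worth knowing when switching between HTML and XML parsing is how letter case is handled. The sketch below uses hypothetical upper-case markup to show that html.parser reports tag names in lower case. XML is case-sensitive, so the 'xml' parser should keep names as they are written; that half is not shown here, to keep the sketch free of the lxml dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup written with upper-case tag names
soup = BeautifulSoup("<DIV><P>Hello</P></DIV>", "html.parser")

# find_all(True) matches every tag in the document, in document order
names = [tag.name for tag in soup.find_all(True)]
print(names)  # ['div', 'p']
```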
Example 3
In this example, we find a tag by its class and then read its name attribute to get the name of the tag.
from bs4 import BeautifulSoup

# HTML document to be parsed
html_doc = """
<html>
<head>
<title class="tut">TutorialsPoint</title>
</head>
<body>
<p>TutorialsPoint</p>
</body>
</html>
"""

# Parse the HTML document using the BeautifulSoup constructor
soup = BeautifulSoup(html_doc, 'html.parser')

# Get the tag using its class
tag = soup.find(class_='tut')

# Get the tag name using the name attribute
tag_name = tag.name

# Print the tag name
print("Tag name is:", tag_name)
Output
Tag name is: title
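A class can appear on more than one element, and those elements need not share a tag name. As a minimal sketch, assuming a slightly extended document in which a paragraph also carries the tut class, find_all() can collect the name of every match:

```python
from bs4 import BeautifulSoup

# Assumed variant of the document above: two different tags share the class
html_doc = """
<html>
  <head><title class="tut">TutorialsPoint</title></head>
  <body>
    <p class="tut">First paragraph</p>
    <span class="other">Not matched</span>
  </body>
</html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find_all() returns every tag with the class, in document order
names = [tag.name for tag in soup.find_all(class_="tut")]
print(names)  # ['title', 'p']
```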
Example 4
In this example, we find a tag by its id and then read its name attribute to get the name of the tag.
from bs4 import BeautifulSoup

# HTML document to be parsed
html_doc = """
<html>
<head>
<title id="tut">TutorialsPoint</title>
</head>
<body>
<p>TutorialsPoint</p>
</body>
</html>
"""

# Parse the HTML document using BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

# Get the tag using its id
tag = soup.find(id='tut')

# Get the tag name using the name attribute
tag_name = tag.name

# Print the tag name
print("Tag name is:", tag_name)
Output
Tag name is: title
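The same id lookup can also be written as a CSS selector. The sketch below uses select_one(), which assumes the soupsieve package is available (it is normally installed together with recent beautifulsoup4 releases).

```python
from bs4 import BeautifulSoup

html_doc = """
<html>
  <head><title id="tut">TutorialsPoint</title></head>
  <body><p>TutorialsPoint</p></body>
</html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# "#tut" is the CSS selector for the element whose id is "tut"
tag = soup.select_one("#tut")
print("Tag name is:", tag.name)  # Tag name is: title
```

Like find(), select_one() returns None when nothing matches, so the same None check applies before reading name.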
Conclusion
BeautifulSoup is a robust Python module that makes parsing HTML and XML documents simple. It offers a variety of tools and options for searching, navigating, and modifying the document tree, and the name attribute gives direct access to the name of any tag found along the way.
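The name attribute also plays a part in navigating and modifying the tree. As a brief sketch on a hypothetical one-line document, the code below reads the name of a parent tag and then renames a tag by assigning to name:

```python
from bs4 import BeautifulSoup

# Hypothetical one-line document, used only for illustration
soup = BeautifulSoup("<body><p>TutorialsPoint</p></body>", "html.parser")
p_tag = soup.find("p")

# Navigating: the parent of <p> is also a Tag, so it has a name as well
print(p_tag.parent.name)  # body

# Modifying: assigning to .name renames the tag in the tree
p_tag.name = "div"
print(soup)  # <body><div>TutorialsPoint</div></body>
```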
All four examples rely on the same name attribute. They differ only in how the tag is located: by tag name in an HTML document, by tag name in an XML document, by class, or by id. Choose the lookup that best fits the structure of the document you are parsing.