Extracting Numeric Entities using Duckling in Python
Last Updated :
24 Apr, 2025
Wit.ai is a natural language processing (NLP) platform that allows developers to build conversational experiences for various applications. One of its key features is entity extraction: recognizing structured values in user input and pulling them out.
Part of this entity extraction is handled by Duckling, an open-source library that can extract entities such as times, dates, durations, and numbers from text input.
In this article, we will focus on the numeric entity tagging functionality provided by Duckling and how it can be implemented using Python.
Getting Started with Duckling :
Required Modules :
pip install wit
pip install duckling
pip install --force-reinstall JPype1==0.6.3 # To avoid a common dependency error
We can then create a Python file and import the necessary libraries:
Python3
import wit
import duckling
We can now use the wit library to connect to the Wit.ai API:
Python3
access_token = "your-access-token"
client = wit.Wit(access_token)
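Hard-coding the token is fine for a quick test, but it is easy to leak a token that way. A minimal sketch, assuming the token is stored in an environment variable; both the helper load_wit_token and the variable name WIT_ACCESS_TOKEN are hypothetical names chosen for this example and are not part of the wit library:

```python
import os


def load_wit_token(env_var="WIT_ACCESS_TOKEN"):
    """Return the Wit.ai access token stored in the given environment variable.

    The variable name is an assumption made for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

With this in place, the client could be created as wit.Wit(load_wit_token()) instead of passing a string literal.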
We can test our connection by sending a message to the Wit.ai API:
Python3
example0 = "I want to read 3 geeksforgeeks articles."
response = client.message(example0)
print(response)
The client.message() method sends the message to the Wit.ai API and returns the JSON response, which includes the entities that Wit.ai was able to extract from the message. In this case, the response should look something like this:
{
   "text": "I want to read 3 geeksforgeeks articles.",
   "intents": [],
   "entities": {
      "wit$amount_of_money:amount_of_money": [
         {
            "id": "12345678-1234-5678-1234-567812345678",
            "name": "wit$amount_of_money",
            "role": "amount_of_money",
            "start": 15,
            "end": 16,
            "body": "3",
            "confidence": 0.9975,
            "entities": [],
            "value": 3.0,
            "type": "value"
         }
      ]
   }
}
Here Wit.ai extracted the numeric entity “3” from the message but tagged it as an amount of money. What we want is the plain number, without a domain-specific tag. This is where Duckling comes in.
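To make the structure of the Wit.ai response concrete, here is a small sketch that walks the "entities" mapping and collects each match. The helper wit_entity_values is a hypothetical name, and the sample dictionary is a trimmed copy of the entities in the response above, not a live API call:

```python
def wit_entity_values(response):
    """Collect (entity name, matched text, value) triples from a Wit.ai response dict."""
    found = []
    # "entities" maps "name:role" keys to lists of matches
    for matches in response.get("entities", {}).values():
        for match in matches:
            found.append((match.get("name"), match.get("body"), match.get("value")))
    return found


# Trimmed copy of the entities from the sample response
sample = {
    "entities": {
        "wit$amount_of_money:amount_of_money": [
            {"name": "wit$amount_of_money", "body": "3", "value": 3.0}
        ]
    }
}

print(wit_entity_values(sample))
# [('wit$amount_of_money', '3', 3.0)]
```

The entity name in each triple shows the tag that Wit.ai attached to the number, which is the part we would like to avoid.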
Using Duckling for Numeric Entity Tagging :
To use Duckling for numeric entity tagging, we first need to create a Duckling parser.
Python3
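# DucklingWrapper loads the Duckling model when it is created and
# provides the parse() and parse_time() methods used below
parser = duckling.DucklingWrapper()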
We can then use the parser to extract numeric entities from a message:
Example 1:
Python3
import json
example1 = "I want to read 3 geeksforgeeks articles."
response = parser.parse(example1)
print(json.dumps(response, indent=3))
The parser.parse() method runs the message through Duckling and returns a list of all the entities that Duckling was able to extract. In this case, the response should look something like this:
Output:
[
   {
      "dim": "number",
      "text": "3",
      "start": 15,
      "end": 16,
      "value": {
         "value": 3.0
      }
   },
   {
      "dim": "distance",
      "text": "3",
      "start": 15,
      "end": 16,
      "value": {
         "value": 3.0,
         "unit": null
      }
   },
   {
      "dim": "volume",
      "text": "3",
      "start": 15,
      "end": 16,
      "value": {
         "value": 3.0,
         "unit": null,
         "latent": true
      }
   },
   {
      "dim": "temperature",
      "text": "3",
      "start": 15,
      "end": 16,
      "value": {
         "value": 3.0,
         "unit": null
      }
   },
   {
      "dim": "time",
      "text": "3",
      "start": 15,
      "end": 16,
      "value": {
         "value": "2023-03-28T03:00:00.000+05:30",
         "grain": "hour",
         "others": [
            {
               "grain": "hour",
               "value": "2023-03-28T03:00:00.000+05:30"
            },
            {
               "grain": "hour",
               "value": "2023-03-28T15:00:00.000+05:30"
            },
            {
               "grain": "hour",
               "value": "2023-03-29T03:00:00.000+05:30"
            }
         ]
      }
   }
]
Duckling extracted the numeric entity “3” from the message and tagged it as a number. Because a bare “3” is ambiguous, it also returned the same span under several other dimensions (distance, volume, temperature, and time). In this output the number entity comes first, so we can read its value from the response using the following code:
Python3
number_value = response[0]["value"]["value"]
print(number_value)
Output:
3.0
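Indexing with response[0] assumes the number entity happens to come first in the list. A more defensive sketch filters on the "dim" field instead. The helper values_for_dim is a hypothetical name written for this example, and the sample list is a trimmed copy of the output above rather than a live Duckling call:

```python
def values_for_dim(entities, dim="number"):
    """Return the values of all parsed entities whose dimension matches `dim`."""
    return [e["value"]["value"] for e in entities if e.get("dim") == dim]


# Trimmed copy of the parsed output for "I want to read 3 geeksforgeeks articles."
sample = [
    {"dim": "number", "text": "3", "start": 15, "end": 16,
     "value": {"value": 3.0}},
    {"dim": "distance", "text": "3", "start": 15, "end": 16,
     "value": {"value": 3.0, "unit": None}},
    {"dim": "temperature", "text": "3", "start": 15, "end": 16,
     "value": {"value": 3.0, "unit": None}},
]

print(values_for_dim(sample))
# [3.0]
```

Filtering by dimension keeps the code working even if the order of the returned entities changes.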
Example 2:
Python3
example2 = "Let's meet at tomorrow at half past six to read a geeksforgeeks article."
duck_parsed = parser.parse_time(example2)
print(json.dumps(duck_parsed[0], indent=3))
Output:
{
   "dim": "time",
   "text": "tomorrow at half past six",
   "start": 14,
   "end": 39,
   "value": {
      "value": "2023-03-28T06:30:00.000+05:30",
      "grain": "minute",
      "others": [
         {
            "grain": "minute",
            "value": "2023-03-28T06:30:00.000+05:30"
         },
         {
            "grain": "minute",
            "value": "2023-03-28T18:30:00.000+05:30"
         }
      ]
   }
}
As you can see, Duckling recognizes the relative date and time in the raw text and returns a date-time string as the value: tomorrow’s date, 2023-03-28 (28 March 2023), combined with the time given in the text (half past six), i.e. 6:30 AM. The 6:30 PM reading is listed as an alternative under "others". The string is in ISO 8601 format, so we can parse it into a datetime object like this:
Python3
import datetime
print(datetime.datetime.fromisoformat(duck_parsed[0]['value']['value']))
This prints the datetime object:
2023-03-28 06:30:00+05:30
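"Half past six" is ambiguous between morning and evening, and the top-level value is the morning reading. The sketch below converts every candidate under "others" and keeps the evening one. It assumes the "others" entries are dictionaries with a "value" key, as in the output above (other versions of the library may shape this field differently); time_candidates is a hypothetical helper, and the sample is a trimmed copy of that output:

```python
from datetime import datetime


def time_candidates(entity):
    """Return every candidate reading of a parsed time entity as a datetime."""
    others = entity["value"].get("others", [])
    return [datetime.fromisoformat(o["value"]) for o in others]


# Trimmed copy of the parsed output for "tomorrow at half past six"
sample = {
    "dim": "time",
    "text": "tomorrow at half past six",
    "value": {
        "value": "2023-03-28T06:30:00.000+05:30",
        "grain": "minute",
        "others": [
            {"grain": "minute", "value": "2023-03-28T06:30:00.000+05:30"},
            {"grain": "minute", "value": "2023-03-28T18:30:00.000+05:30"},
        ],
    },
}

candidates = time_candidates(sample)
# Keep only the readings at or after noon
evening = [c for c in candidates if c.hour >= 12]
print(evening[0])
# 2023-03-28 18:30:00+05:30
```

Which candidate is right depends on the application; a meeting scheduler might prefer the evening reading, while an alarm app might prefer the morning one.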