ZAX RESEARCH_PAPER_2
ISSN: 2319-7064
SJIF (2022): 7.942
Abstract: ZAX is a virtual embedded voice assistant that uses gTTS and Python to provide a personalized assistant. ZAX integrates
the functionality of AIML with the text-to-speech platform of Google, an industry leader, so that male and female voices are
available through the gTTS library, with a persona inspired by the Marvel world. It also adopts the Python pyttsx library alongside
gTTS, which supports finely tuned dialogue between the assistant and its users. ZAX helps end users with daily activities such as
general conversation, query search on Google, Bing or Yahoo, video search, image retrieval, live weather, word meanings, and
predicting and reminding users of scheduled events and tasks. Because AIML is easy to use and merges dynamically with Python
libraries such as pyttsx and gTTS (Google Text-to-Speech), ZAX has a standard structure that is highly reusable and needs little or
no maintenance.
1. Introduction
An AI voice assistant, also known as a virtual or digital assistant, is a device that uses voice recognition technology,
natural language processing, and Artificial Intelligence (AI) to respond to people. Using this technology, the device
collects user messages, breaks them down, evaluates them, and gives meaningful feedback in return. Artificial
intelligence makes real conversations possible. Virtual assistants understand natural language voice commands and perform
tasks for users. These tasks, previously performed by a personal assistant or secretary, include taking dictation, reading
text or email messages aloud, and scheduling appointments for end users. The AI assistant can also perform other
activities, such as sending messages, answering phone calls, and getting directions. It also helps to read news and weather
updates, open Google, YouTube, Stack Overflow, etc., answer questions, scrape the web, play music, etc. Although this
definition emphasizes the digital form of a virtual assistant, the term virtual assistant or virtual personal assistant is
also commonly used to describe contract employees who work from home and perform administrative tasks usually performed by
executive assistants or secretaries. Digital assistants can also be compared with another form of consumer-facing AI
programming known as smart advisers. Smart adviser programs are topic-oriented, whereas virtual assistants are
task-oriented.
AI voice assistants often perform simple tasks for end users, such as adding tasks to the calendar, providing information
that can usually be searched for in an Internet browser, or controlling and checking the status of smart devices in the
home. They can also send emails, set alarms, get weather reports, give your location, perform basic mathematical
calculations, check the news, start music, and open websites such as Stack Overflow, YouTube, and Facebook.
2. Related Work
2.1 Generalization
The pie chart in Figure 2.1 shows the analysis of virtual assistants in the context of education as well as the purpose of
this work, drawing on papers from 13 countries. England made the highest contribution with the most papers (3), followed by
Russia and Switzerland (2 papers each). The remaining countries named, Singapore, Pakistan, Canada, India, France,
Bulgaria, Saudi Arabia, and Germany, are represented by 1 paper each (Figure 2.1).
3. System Analysis
3.1 Training Model
• With the help of a neural network (NN) and natural language processing (NLP), create the brain of the model.
• With the help of machine learning and deep learning modules, build emotions into the model and the dataset to help the
model in training.
3.2 Neural Networks
"NN reflects the behavior of the human brain, enabling computer programs to recognize patterns and solve common problems."
Once an input layer is specified, weights are assigned. These weights indicate the importance of each input variable:
inputs with larger weights contribute more to the output than other inputs. Each input is then multiplied by its weight and
the products are summed. The sum is then passed through an activation function to produce the node's output. When this
output exceeds a certain threshold, the node is triggered and its data is propagated to the next layer in the network. The
output of one node thus becomes the input of the nodes in the next layer. This method of passing data from one layer to the
next defines the network as a feed-forward network.
Figure 3.2: Natural Language Processing
3.3 Speech Recognition System
The speech recognition system is the core of the voice application system. It understands the voice input given by the
user while operating the applications efficiently and generating voice feedback to the user. It is an important component
for users because it is the gateway for using their voice as input (Figure 3.3). In short, to recognize the user's speech
command clearly and get a response from the system, the speech recognition system must cover the whole process by which the
application system converts the voice signal into text data and extracts its meaning.
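The feed-forward computation described in Section 3.2 can be illustrated with a minimal Python sketch. It is not the ZAX model itself: the sigmoid activation, the threshold of 0.5, and all function names are assumptions chosen only to make the weighted sum, the threshold test, and the layer-to-layer propagation concrete.

```python
import math


def sigmoid(x):
    """Assumed activation function; the paper does not name one."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights):
    """Multiply each input by its weight, sum the products, then activate."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(total)


def layer_forward(inputs, layer_weights, threshold=0.5):
    """Evaluate every node in one layer.

    A node passes its output on only when it exceeds the threshold;
    otherwise it contributes 0.0 to the next layer.
    """
    outputs = []
    for weights in layer_weights:
        out = node_output(inputs, weights)
        outputs.append(out if out > threshold else 0.0)
    return outputs


def feed_forward(inputs, layers, threshold=0.5):
    """Feed the outputs of each layer into the next layer, in order."""
    activations = inputs
    for layer_weights in layers:
        activations = layer_forward(activations, layer_weights, threshold)
    return activations


# Hypothetical two-layer network: one input, two hidden nodes, one output node.
layers = [
    [[10.0], [-10.0]],  # hidden layer: one weight list per node
    [[1.0, 1.0]],       # output layer
]
print(feed_forward([1.0], layers))
```

In this example the second hidden node stays below the threshold, so only the first hidden node contributes to the output layer.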
First, the system checks whether the ZAX voice assistant is active. If it is active, it asks for user input; otherwise ZAX
is activated (switched on). The user then provides input as speech or text. If the input is text, it goes directly to the
action to be taken, that is, the skill to be executed. If the input is speech, ZAX uses the speech recognition feature to
convert it into text and then proceeds to the action (Figure 4.1).
Next, if ZAX is able to execute the requested skill, it gives the user a positive response in the form of speech and then
executes the commands for the operation, producing console output and speech. If, on the other hand, the requested skill is
not supported by ZAX or is inappropriate for it, ZAX gives a negative response and executes no further commands to produce
console output (Figure 4.1).
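The flow described above (Figure 4.1) can be sketched in Python as follows. This is only an illustration under stated assumptions: `run_once`, the `skills` dictionary keyed by keyword, the `state` dictionary, and the `transcribe` callable (a stand-in for the speech recognition feature) are hypothetical names, not part of ZAX as published.

```python
def run_once(user_input, is_speech, skills, transcribe, state):
    """One pass through the flow: activate, normalise input, dispatch a skill.

    `transcribe` is a placeholder for the speech recognition feature and
    `skills` maps a keyword to a callable; both are assumptions.
    """
    # Step 1: make sure the assistant is switched on.
    if not state.get("active", False):
        state["active"] = True

    # Step 2: speech input is converted to text; text input is used as-is.
    command = transcribe(user_input) if is_speech else user_input
    command = command.strip().lower()

    # Step 3: look for a skill that can handle the command.
    for keyword, skill in skills.items():
        if keyword in command:
            # Positive response, then the skill is executed.
            return {"response": "positive", "output": skill(command)}

    # No adequate skill: negative response and nothing is executed.
    return {"response": "negative", "output": None}


# Placeholder skill and a fake recogniser that simply decodes bytes.
skills = {"time": lambda command: "It is noon."}
state = {"active": False}
fake_transcribe = lambda audio: audio.decode("utf-8")

print(run_once(b"What TIME is it", True, skills, fake_transcribe, state))
print(run_once("dance", False, skills, fake_transcribe, state))
```

Passing the recogniser in as a parameter keeps the sketch runnable without a microphone; a real deployment would substitute an actual speech-to-text call.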
• The user sends a command to the voice assistant ZAX, which forwards it to the interpreter (here, the speech recognition
feature). The command is then directed to the task model to perform the specific task. After processing in the task
model, ZAX executes the task and gives the response or feedback to the user (Figure 4.2).
• If some information is still missing after processing in the task model, ZAX asks for that information, takes the input
again, gathers all the information, and follows the same process as detailed above (Figure 4.2).
Figure 4.3: Use Case Diagram
4.4 Activity Diagram
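The request-and-clarification loop in the bullets above (Figure 4.2) might look roughly like the following Python sketch. Every name here (`execute_task`, `parse`, `ask_user`, `perform`, `max_prompts`) is hypothetical, and the limit on repeated prompts is an added assumption so that the loop always terminates.

```python
def execute_task(command, required_fields, parse, ask_user, perform, max_prompts=3):
    """Gather the information a task needs, asking again for missing fields.

    `parse` plays the role of the interpreter, `ask_user` requests one missing
    piece of information, and `perform` runs the task. Returns None if a
    field is still missing after `max_prompts` attempts.
    """
    info = dict(parse(command))
    for field in required_fields:
        prompts = 0
        # Keep asking until the field is supplied or the limit is reached.
        while field not in info and prompts < max_prompts:
            answer = ask_user(f"What is the {field}?")
            prompts += 1
            if answer:
                info[field] = answer
        if field not in info:
            return None
    return perform(info)


# Example: the command names the task but not the time, so ZAX has to ask.
answers = iter(["", "9 am"])  # first reply is empty, second supplies the time
result = execute_task(
    "remind me to call mom",
    ["task", "time"],
    parse=lambda command: {"task": "call mom"},
    ask_user=lambda prompt: next(answers),
    perform=lambda info: f"Reminder set: {info['task']} at {info['time']}",
)
print(result)
```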
5. Experimental Result
On the user's speech command, the voice assistant displays the Google search results for the query and also reads the
answer aloud to the user (Figure 5.1).
On the user's speech command, the voice assistant opens ‘my location’ (Figure 5.3).
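As an illustration of the query search result, the sketch below builds a Google search URL from the recognised query and opens it in the default browser using only the Python standard library. The paper does not state how ZAX performs the search, so the function names and the choice of `webbrowser` are assumptions; reading the answer aloud is not covered here.

```python
import webbrowser
from urllib.parse import urlencode


def google_search_url(query):
    """Build a Google search URL for a recognised query string."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def open_search(query):
    """Open the search results in the user's default browser."""
    return webbrowser.open(google_search_url(query))


print(google_search_url("live weather"))
```

`urlencode` escapes spaces and special characters, so a spoken query can be passed through unchanged.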