
Technical Paper

Uploaded by

pameluft
Copyright
© All Rights Reserved

AI - Voice Desktop Assistant

Prof. Aarti Dharmani, Mayuri Khatpe, Priyanka Gayake, Suhasini Sharma

Usha Mittal Institute of Technology, SNDT University

Abstract—AI desktop assistants like Apple's "SIRI" and Google's
"Google Voice Search" can perform tasks and provide services based on
user commands. These systems use speech recognition to interpret
spoken commands and respond with synthesized speech, allowing users to
communicate with their devices. The proposed system, which can work
with or without internet connectivity, uses voice recognition to
process user input and provide various outputs. AI-based personal
assistants aim to bridge the communication gap between humans and
machines, creating a more engaging user experience.

Index Terms—AI, SIRI, Google Voice Search, Speech Recognition,
Internet, Personal Assistants, User Experience

I. INTRODUCTION
AI assistants have a long history, starting with the Turing Test in
the 1950s. Advances in machine learning, particularly deep learning
and neural networks, led to breakthroughs like OpenAI's GPT-3 in 2020.
Today, AI assistants are used in customer service, translation,
content generation, and personal virtual assistants like Siri and
Alexa.

The goal of creating an AI desktop assistant that talks like humans
is to enhance user experience, facilitate efficient task execution,
make technology more accessible and inclusive, provide personalized
assistance, and push the boundaries of natural language processing
(NLP). The project requires advanced natural language understanding
and aims for an assistant that communicates with users in a natural
and relatable way, making technology more approachable, efficient,
and user-centric.

The research aims to enhance natural language understanding in AI
desktop assistants, train models effectively, mitigate biases, and
improve user experience, promoting inclusivity, enhancing
productivity, and pushing NLP capabilities.

This paper explores the development of artificial intelligence
desktop assistants, focusing on recent developments and techniques.
Section II presents a literature survey. Section III discusses the
proposed work. Section IV gives the implementation details, including
the design and implementation of the virtual desktop assistant.
Section V presents the results and discussion, which give insight
into the assistant's limitations and efficacy. Section VI concludes
the study and outlines future directions for further research. A list
of references is provided at the end of the paper, highlighting the
sources that provided information and support for the research
project.

II. LITERATURE SURVEY
In the realm of technology, researchers are continuously advancing
the capabilities of virtual assistants and AI-driven communication
systems. Leandro Tibola and Liane Margarida Rockenbach Tarouco [1]
emphasize the importance of interoperability in virtual worlds,
highlighting the role of WWW services using HTTP and XML to enhance
communication between virtual and real-world entities while
bolstering security measures against modern operating systems.

Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever
[2] showcase significant advancements in natural language
understanding through generative pre-training and discriminative
fine-tuning. Their task-agnostic model outperforms discriminatively
trained models in various language understanding tasks, marking a
notable stride in language processing capabilities.

Deepak Shende, Ria Umahiya, Monika Raghorte, Aishwarya Bhisikar, and
Anup Bhange [3] present an AI-based voice assistant project
implemented using Python, leveraging open-source software modules and
community support to ensure adaptability to future updates.

Rajat Sharma and Adweteeya Dwivedi [4] introduce "JARVIS," an AI
voice assistant system employing speech recognition, gTTS, neural
networks, and natural language processing to deliver intelligent and
responsive interactions tailored to specific circumstances.

Afra Ali, Shweta Dubey, Shyam Dwivedi, Divisha Pandey, Md. Saif Raza,
and Muskan Srivastava [5] unveil a voice assistant service for
desktop users, integrating internet-of-things technology, speech
recognition, and modern AI technologies to provide enhanced
functionalities and a seamless user experience.

Vedant Kulkarni, Shreyas Kallurkar, Vipul Waikar, Saurabh Patil, and
Swarupa Deshpande [6] present a framework for a virtual assistant
that overcomes existing constraints, promising improved effectiveness
and usability in processing voice inputs.

Dr. C. K. Gomathy, Redrouthu Venkata Narayana, Thota Vamsi Khrishna,
and Dr. V. Geetha [7] contribute to automated communication systems
with an artificial intelligence chatbot built in Python, showcasing
the potential of tools and libraries such as pip, NumPy, TensorFlow,
and Python's random module, albeit with limited functionality for
specific queries and conversation types.

Mahesh T R [8] proposes a personal AI desktop assistant utilizing
Python's speech recognition and text-to-speech libraries, aiming to
enhance user convenience and productivity in daily computer-related
tasks.

Ravivanshikumar Sangpal, Tanvee Gawand, Sahil Vaykar, and Neha
Madhavi [9] explore the integration of gTTS, AIML, and Python in
their interpretation of JARVIS, highlighting the benefits while
acknowledging dependency issues on specific platforms.

Finally, Umapathi Nagappan, Karthick Ganesan, Natesan Venkateswaran,
Jegadeesan Ramalingam, and Dava Srinivas [10] contribute to the
discourse with the development of a desktop virtual assistant using
Python, leveraging speech recognition, APIs, and text-to-speech
capabilities to bridge the gap between human and computer
interactions.

Taken together, the surveyed works bring diverse perspectives and
methodologies to the pursuit of more intuitive, responsive, and
inclusive AI-driven solutions.

III. PROPOSED WORK
In this section, we present our proposed work, aimed at developing an
advanced AI-driven voice assistant system using Python and various
modern technologies. Building upon the foundational research
conducted by experts in the field, our project endeavors to create a
versatile and intelligent voice assistant capable of seamlessly
integrating with users' daily routines while offering enhanced
functionalities and user experiences.

A. Methodology
To achieve our objectives, we adopt a multi-faceted methodology that
encompasses several key components. The flowchart in Figure 1 gives
an overview of the overall steps in the methodology.

Figure 1. The flowchart details the various steps involved in the
development of the virtual desktop assistant.

The various steps involved in the methodology are briefly explained
as follows:
• Installing necessary libraries: We began by installing the
necessary Python libraries and modules, prioritizing those that are
crucial for the functionality of the virtual assistant. Some of them
are as follows:
− Speech Recognition: Converts speech to text, enabling the AI
assistant to understand spoken commands.
− Pyttsx3: Converts text to speech, allowing the AI assistant to
provide spoken responses to users.
− Datetime: Provides access to current date and time information,
facilitating time-based functionalities such as setting alarms or
providing time-related responses.
− OS: Enables interaction with the operating system, allowing the AI
assistant to perform system-related tasks such as file operations or
launching applications.
− Pyaudio: Handles audio input and output, essential for voice-based
applications like speech recognition and voice assistants.
Additionally, other important libraries may include:
− Wikipedia: Conducts searches on Wikipedia, providing access to vast
amounts of information.
− Requests: Sends HTTP requests, useful for accessing web-based data
or APIs.
− Webbrowser: Allows for opening web browsers or
displaying web-based documents directly from code.
− Random: Generates pseudo-random numbers, facilitating randomness in
various actions.
− Beautiful Soup: Extracts data from XML and HTML files, useful for
web scraping tasks.
And so on for additional libraries.
• Speech Recognition and Text to Speech: We implemented a speech
recognition system that converts spoken language into text, using the
SpeechRecognition library in Python. Alongside it, we implemented a
text-to-speech system that converts the assistant's responses (text)
into natural-sounding speech.
• Task Execution and Personalization: We developed modules to perform
various tasks based on user intents, such as opening websites,
playing music, and reporting the time and date.
• User Interface Design: A user-friendly interface will be developed
to facilitate intuitive interactions with the voice assistant,
ensuring a smooth user experience across various devices and
platforms.
• Integration of External APIs: External APIs are integrated for
accessing relevant information and services, such as weather
forecasts, news updates, and online search functionalities, to enrich
the assistant's capabilities.

B. System Design
The virtual desktop assistant takes voice commands, recognizes them,
processes them, and executes the requested task. It responds
dynamically to user input, executing tasks based on the user's
request, as depicted by the data flow diagrams (DFDs) of Figure 2 and
Figure 3.

Figure 2. The diagram depicts the Level 0 DFD of an end user-virtual
assistant interaction. The user sends a command, the assistant
processes it, retrieving the relevant information using APIs, and the
user receives a response. The diagram shows the user's role in the
communication process, with the assistant using various APIs.

Figure 3. The diagram depicts the Level 1 DFD, illustrating the end
user input process and the subsequent system response and actions.

IV. IMPLEMENTATION DETAILS
The minimum hardware configurations required to implement the voice
desktop assistant are as follows:
• Processor: The program needs a processor that is at least as
powerful as an Intel Core 2 Duo. Although this is the bare minimum, a
more potent CPU is advised for lag-free performance. A multi-core
processor, like an Intel Core i5 or i7, improves performance
considerably, especially for jobs requiring voice recognition and
artificial intelligence.
• RAM (Random Access Memory): A minimum of 6 GB of RAM is required.
However, to handle resource-intensive tasks and ensure smooth
multitasking, it is advisable to have more RAM, such as 8 GB or 16
GB. The amount of RAM affects the software's ability to process voice
commands and AI tasks efficiently.
• Hard Drive (HDD/SSD): A minimum of 256 GB of storage space is
essential for software installation, data storage, and updates.
Consider using a Solid State Drive (SSD) instead of a Hard Disk Drive
(HDD) for faster data access and improved overall system performance.
Additionally, having extra storage space is beneficial if you plan to
store a significant amount of data.
The minimum software configurations required are as follows:
• Operating System: The software is compatible with Windows 11
(64-bit), which provides access to the latest features and security
updates. Ensure that your system meets the hardware requirements
listed above.
• Integrated Development Environment (IDE): Visual Studio Code (VS
Code) is a free, open-source code editor that supports Python and
offers real-time functionality, creating an efficient environment for
code development, debugging, and collaboration.
• Programming Language (Python): The software is developed in Python,
which is well suited to AI and voice recognition tasks, and requires
Python version 3.9.11 to function properly.
− Why Choose Python Language for Vikram?
Python, which originated before the surge in popularity of machine
learning and AI, remains a preferred choice owing to attributes that
distinguish it from other programming languages. These qualities
include:
1) Rich Collection of Packages and Libraries: Python offers an
extensive array of packages and libraries, which make it invaluable
for machine learning and AI even though the language predates the
rise of these domains.
2) Enhanced Code Readability: Python's elegant and concise syntax
aids machine learning work by simplifying program composition and
enhancing readability, which makes code easier to understand and
maintain.
3) Versatility and Flexibility: Python's versatility lets developers
tackle complex tasks efficiently with minimal overhead, making it an
ideal choice for diverse applications and rapid prototyping.
The enduring popularity of the Python language is evident, as
illustrated in Figure 4.
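
To make the recognize, process, and execute flow described under System Design more concrete, the following is a minimal sketch of how recognized text might be mapped to tasks such as greeting the user, reporting the time, handling a "Wikipedia (query)" command, or opening a website. It is an illustration under stated assumptions, not the authors' implementation: the function names `greeting_for` and `handle_command`, the greeting cut-off hours, the reply wording, and the `https://www.<site>.com` URL pattern are all hypothetical. Only the Python standard library is used, and the input is assumed to be text that a speech recognizer has already produced.

```python
import datetime
import webbrowser
from typing import Callable, Optional


def greeting_for(hour: int) -> str:
    """Choose a greeting from the hour of day (0-23); cut-offs are assumptions."""
    if 5 <= hour < 12:
        return "Good morning"
    if 12 <= hour < 18:
        return "Good afternoon"
    return "Good evening"


def handle_command(
    text: str,
    now: Optional[datetime.datetime] = None,
    open_url: Callable[[str], object] = webbrowser.open,
) -> str:
    """Map already-recognized text to an action; return the reply to speak.

    `now` and `open_url` are injectable so the function can be exercised
    without a real clock or a real browser.
    """
    now = now or datetime.datetime.now()
    command = text.lower().strip()
    words = command.split()

    if words and words[0] in ("hello", "hi"):
        return f"{greeting_for(now.hour)}! How can I assist you?"
    if "time" in words:
        return "The time is " + now.strftime("%H:%M")
    if "date" in words:
        return "Today is " + now.date().isoformat()
    if command.startswith("wikipedia "):
        query = command[len("wikipedia "):].strip()
        # A real assistant would query a Wikipedia client here (not shown).
        return f"Searching Wikipedia for {query}"
    if command.startswith("open "):
        site = command[len("open "):].strip().replace(" ", "")
        # Hypothetical URL pattern; real site names need a proper lookup.
        open_url(f"https://www.{site}.com")
        return f"Opening {site}"
    return "Sorry, I did not understand that."
```

In a full assistant, the `text` argument would presumably come from the SpeechRecognition library and the returned string would be passed to a text-to-speech engine such as pyttsx3; both steps are left out here so that the sketch stays runnable without a microphone or third-party packages. Matching on whole words rather than substrings is a deliberate choice, so that a command like "open timesheet" is not mistaken for a request for the time.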

Figure 4. The figure showcases the top 10 most popular programming
languages globally, including Python, Java, and JavaScript.

V. RESULTS
This section presents visual project results, including screenshots
of our AI assistant's functionality and user interface. These provide
a firsthand view of the system's performance and usability,
highlighting user interaction flow, interface design, and tasks
executed.

Figure 5. The assistant greets the user based on the time of day and
asks how it can assist.

Figure 6. On saying "Wikipedia (query)", the assistant gives the
relevant information for the given query.

Figure 7. Controlling video playback on YouTube. The user can perform
actions such as pausing, playing, seeking backward or forward,
toggling full screen, and muting. Here, on the user's request to
"pause" the video, the assistant pauses the video.

Figure 8. On saying "WhatsApp", the assistant enables the user to
send WhatsApp messages to specified contacts by dictating the message
via voice input.

Figure 9. The WhatsApp message is sent to the specified person.

VI. CONCLUSION AND FUTURE SCOPE
AI desktop assistants are changing human-machine interaction. They
can benefit customer service, healthcare, and education by
streamlining tasks, enhancing productivity, and providing
personalized assistance.

While AI desktop assistants have already demonstrated their value,
there are several avenues for future development and enhancement to
fully unlock their potential, considering the limitations, security,
and stability of the system. Here are some suggestions for future
scope:
• Offline Functionality: Develop offline capabilities to ensure
uninterrupted functionality even without an internet connection,
expanding the assistant's utility and accessibility in diverse
environments.
• Enhanced Security Features: Implement advanced security measures
such as encryption, multi-factor authentication, and biometric
recognition to safeguard user data and privacy, bolstering trust and
confidence in the assistant.
• AI Model Training: Continuously refine AI models through ongoing
training and optimization to enhance accuracy, responsiveness, and
natural language understanding, ensuring more personalized and
effective interactions.
• Voice Recognition Improvement: Invest in research and development
to improve voice recognition algorithms, enabling the assistant to
better understand diverse accents and environments for seamless
interaction.
• Integration with Smart Home Devices: Expand integration with smart
home devices to enable seamless control and automation of connected
devices, enhancing convenience and efficiency for users.
• User Feedback Mechanisms: Implement mechanisms for gathering user
feedback and suggestions to iteratively improve the assistant's
features, usability, and overall user experience.
By addressing these areas for future development, we can further
elevate the capabilities, security, and user experience of AI desktop
assistants, ensuring their continued relevance and effectiveness in
meeting the evolving needs of users across various domains.

ACKNOWLEDGMENT
First, we would like to thank Professor Rajesh Kolte, Head of
Department (Data Science), and our guide, Ms. Aarti Dharmani, for her
valuable guidance and continuous support during the project, and for
her patience, motivation, enthusiasm, and immense knowledge. Her
direction and mentoring helped us to work successfully on the project
topic.

Our sincere gratitude to Dr. Yogesh Nerkar, Principal (Usha Mittal
Institute of Technology), for his valuable encouragement and
insightful comments.

We would also like to thank all the teaching and non-teaching staff
for their valuable support.

Last but not least, we would like to thank our parents and friends.

REFERENCES
[1] Leandro Tibola, Liane Margarida Rockenbach Tarouco (2013).
"Interoperability in Virtual World."
[2] Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
(2018). "Improving Language Understanding by Generative
Pre-Training."
[3] Deepak Shende, Ria Umahiya, Monika Raghorte, Aishwarya Bhisikar,
Anup Bhange (2019). "AI Based Voice Assistant Using Python."
[4] Rajat Sharma, Adweteeya Dwivedi (2022). "JARVIS - AI Voice
Assistant."
[5] Divisha Pandey, Afra Ali, Shweta Dubey, Muskan Srivastava, Shyam
Dwivedi, Md. Saif Raza (2022). "Voice Assistant Using Python and AI."
[6] Vedant Kulkarni, Shreyas Kallurkar, Vipul Waikar, Saurabh Patil,
Swarupa Deshpande (2022). "Virtual Assistant Using Python."
[7] Dr. C. K. Gomathy, Redrouthu Venkata Narayana, Thota Vamsi
Khrishna, Dr. V. Geetha (2022). "Artificial Intelligence Chatbot
using Python."
[8] T R, Mahesh (2023). "Personal A.I. Desktop Assistant."
[9] Sangpal, Ravivanshikumar, Gawand, Tanvee, Vaykar, Sahil, Madhavi,
Neha (2019). "JARVIS: An interpretation of AIML with integration of
gTTS and Python."
[10] Nagappan, Umapathi, Ganesan, Karthick, Venkateswaran, Natesan,
Ramalingam, Jegadeesan, Srinivas, Dava (2023). "DESKTOP'S VIRTUAL
ASSISTANT USING PYTHON."
