An AI Powered Desktop Voice Assistant For Windows
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - This work introduces a Windows-compatible AI assistant that executes tasks via voice commands. Our architecture implements two key AI components: (1) real-time language interpretation and (2) adaptive command recognition. The assistant enables users to perform a range of tasks entirely through voice commands, such as opening and closing applications, searching for information on the web, setting reminders, creating notes, opening and managing files, opening various AI tools, managing the drive, opening the calculator and doing basic calculations, and even controlling system settings. The voice assistant leverages advanced speech recognition technology to accurately interpret and process human voice commands. Upon receiving user input, the system dynamically executes the requested task or retrieves relevant information. Key capabilities include real-time voice-to-text conversion, context-aware natural language understanding, and seamless integration with third-party APIs to extend its functionality. Designed for universal accessibility, the user interface is intuitive and user-friendly, providing a seamless experience for both novice and experienced users. By creating a personal desktop assistant that combines convenience, automation, and personalized features, this project aims to enhance users' productivity and efficiency in their day-to-day computer tasks.

Key Words: Desktop, Voice-based, Integration, Language, Voice Assistant, Action, Response, GUI

1. INTRODUCTION

What is a voice assistant, and how does it work? Many of us already know about voice assistants and use them in our day-to-day lives. A voice assistant is a digital assistant that uses voice recognition, language processing algorithms, and voice synthesis to listen to specific voice commands and return relevant information or perform specific functions as requested by the user. These personal assistants can be easily configured to perform many regular tasks simply by giving voice commands. The best-known voice assistant on the iPhone is “Siri”, which lets the end user communicate with the phone by voice and responds to the user's voice commands. Google has developed a similar application, “Google Voice Search”, which is used on Android phones. The application presented here, by contrast, works mostly on the desktop. It accepts spoken commands via a microphone array or typed queries through the GUI. Modern AI voice assistants boost productivity by enabling hands-free PC control, reducing task time by 30-40%. AI-powered voice automation thus saves time that can be spent on other work.

1.1 LITERATURE REVIEW

This study examines the acceptability of voice-activated personal assistants (VAPA) in public areas and emphasizes users' concerns related to privacy, social norms, and usability. The research emphasizes the importance of a VAPA design that is in line with the expectations of users and with public etiquette. It also discusses the potential of VAPAs to increase human-computer interaction in shared environments. These findings provide valuable knowledge for developers aiming to improve the public usability of voice assistants [1].

This article presents an AI-based voice assistant that improves user interaction through natural language processing and machine learning. The authors discuss the technical architecture and functionality of the system and emphasize its potential applications in intelligent environments. The study emphasizes the growing role of AI in making assistants more intuitive and efficient. It also deals with challenges such as accuracy and user adaptability [2].

This research focuses on the development of an AI-powered voice assistant built with Python's speech recognition libraries (SpeechRecognition, PyAudio) and NLP frameworks (NLTK, spaCy) for intent processing, and describes in detail the implementation of speech recognition and text-to-speech functions. The paper emphasizes the simplicity and efficiency of Python libraries such as SpeechRecognition and PyAudio in building voice systems. It also discusses potential applications in automation and for end users. The study serves as a practical guide for developers who are interested in creating voice solutions [3].

This work examines the creation of a Python-based ASR system leveraging SpeechRecognition and PyAudio and emphasizes the integration of libraries such as pyttsx3 and SpeechRecognition. The authors discuss the ability of the system to perform tasks such as executing voice commands and converting text. The contribution emphasizes the growing availability of speech technology for developers. It also emphasizes the potential of such systems for improving productivity and user experience [4].

This study presents the development of a desktop AI assistant using Python, focusing on its ability to perform tasks through voice commands. The authors discuss the integration of AI and natural language processing to improve user interaction. The contribution emphasizes the potential of the system for automating routine tasks and improving efficiency. It also deals with challenges such as accuracy and response time [5].

This research focuses on designing a voice personal assistant for PCs using Python and emphasizes its application in simplifying user interactions. The authors discuss the use of Python libraries to achieve speech recognition and task automation. The contribution emphasizes the potential of the system to increase availability and productivity. It also deals with limitations in internet connectivity and language support [6].

This study discusses the integration of NLP techniques, machine learning models, and voice recognition technologies to create responsive and context-aware personal assistants. The authors examine various existing intelligent assistants, such as Apple’s Siri, Google Assistant, and Amazon Alexa, comparing their functionalities, architectures, and limitations. Additionally, the paper highlights challenges like data privacy, user adaptation, and multilingual support, while also speculating on future advancements, including enhanced contextual understanding and emotion recognition. This work contributes to the broader discourse on human-computer interaction and AI-driven automation [7].

This study explores the integration of automatic speech recognition (ASR) with NLP and automation techniques to enhance user interaction with computers. The authors describe the development process, including the use of Python libraries such as SpeechRecognition, pyttsx3, and NLTK to enable voice command execution. The paper highlights various functionalities of the assistant, such as web searches, email automation, and smart home control. Additionally, it discusses challenges related to accuracy, response time, and security, providing insights into future enhancements for more efficient and intelligent virtual assistants [8].

1.2 TECHNOLOGIES USED:

a. Python: Python is a popular high-level programming language known for its simplicity, readability, and ease of use. Python 3.10.0 is used in the development of this AI-powered voice assistant.

b. Visual Studio Code: Microsoft's Visual Studio Code (VS Code) is an extensible code editor that provides comprehensive language support, including Python, JavaScript, and C++, through its modular extension system.

2. METHODOLOGY

There are three modules in this assistant. First, the assistant takes in the user's voice input. Second, it analyses the user's input and translates it into the appropriate intent and function. Third, the assistant provides the user with the result all the [...]
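The three modules described in the methodology (take input, map it to an intent, return a result) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `interpret` and `run_once` are hypothetical, intent detection is reduced to keyword matching, and `listen` and `respond` are plain callables standing in for the real speech-recognition and speech-synthesis components.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[], str]


def interpret(text: str, intents: Dict[str, Handler]) -> Optional[Handler]:
    """Module 2 (assumed design): pick a handler by keyword matching."""
    lowered = text.lower()
    for keyword, handler in intents.items():
        if keyword in lowered:
            return handler
    return None


def run_once(listen: Callable[[], str],
             respond: Callable[[str], None],
             intents: Dict[str, Handler]) -> None:
    """One pass through the pipeline: input, interpretation, response."""
    text = listen()                      # module 1: obtain the user's input
    handler = interpret(text, intents)   # module 2: map input to an intent
    if handler is None:
        respond("Sorry, I did not understand that.")
    else:
        respond(handler())               # module 3: report the result


if __name__ == "__main__":
    # Hypothetical intents; a real assistant would launch applications here.
    demo_intents = {
        "open notepad": lambda: "Opening Notepad.",
        "calculator": lambda: "Opening the calculator.",
    }
    run_once(lambda: "Please open Notepad", print, demo_intents)
```

Passing `listen` and `respond` in as arguments keeps the control flow testable without a microphone or speaker. In a real assistant they would wrap the speech libraries.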
© 2025, IJSREM | www.ijsrem.com DOI: 10.55041/IJSREM45648 | Page 2
International Journal of Scientific Research in Engineering and Management (IJSREM)
Volume: 09 Issue: 04 | April - 2025 SJIF Rating: 8.586 ISSN: 2582-3930
• The user has the option to configure the assistant by participating in the “Configure Assistant” use case.
• The “Perform Voice Command” use case leads to the “Process Command” use case, where the assistant analyses and comprehends the received voice command.
• Upon understanding the command, the assistant proceeds to the “Execute Action” use case, where it performs the appropriate action.
• Once the action is executed, the assistant enters the “Provide Response” use case, generating a response for the user.
• The user can interact with the output or response through the “Interact with Output” use case.
• The loop indicates that the user can continue performing voice commands and interacting with the assistant as needed.

3.3 SAMPLE CODE

Several desktop activities can be performed using this AI-powered voice assistant. Separate code is developed for each such activity. A sample of this code is given in Table 1.

Table 1. Sample code for opening a Notepad

import os
import subprocess

# speak() and takeCommand() are the assistant's text-to-speech and
# speech-to-text helpers; they are defined elsewhere in the project.


def open_notepad():
    """Opens Notepad and listens for user commands to open new tabs, write, or manage files."""
    try:
        open_files = []  # List to track opened files

        while True:
            speak("Do you want to open a new file, a previously saved file, or open a new Notepad window?")
            response = takeCommand().lower()

            if "new file" in response or "open new window" in response:
                speak("Opening a new Notepad file.")
                temp_file_path = os.path.join(os.environ["USERPROFILE"], "Desktop",
                                              f"Temp_Notepad_File_{len(open_files) + 1}.txt")
                # The next two lines are missing from the listing; they are
                # assumed to mirror the "open new tab" branch below.
                subprocess.Popen(["notepad.exe", temp_file_path])
                open_files.append(temp_file_path)

            elif "open new tab" in response:
                speak("Opening a new tab in Notepad.")
                temp_file_path = os.path.join(os.environ["USERPROFILE"], "Desktop",
                                              f"Temp_Notepad_File_{len(open_files) + 1}.txt")
                subprocess.Popen(["notepad.exe", temp_file_path])
                open_files.append(temp_file_path)

            elif "previous" in response or "saved" in response:
                speak("What is the name of the file you want to open?")
                file_name = takeCommand().strip()

                if file_name:  # Only proceed if a valid file name is given
                    search_directories = {
                        "Desktop": os.path.join(os.environ["USERPROFILE"], "Desktop"),
                        "Documents": os.path.join(os.environ["USERPROFILE"], "Documents"),
                    }

                    found_file = None
                    for location, directory in search_directories.items():
                        file_path = os.path.join(directory, file_name + ".txt")
                        if os.path.exists(file_path):
                            speak(f"Opening the file {file_name} saved in {location}.")
                            subprocess.Popen(["notepad.exe", file_path])
                            open_files.append(file_path)
                            found_file = file_path
                            break

                    if not found_file:
                        speak(f"Sorry, I couldn't find a file named {file_name} in your Documents or Desktop.")

            else:
                speak("I didn't understand. Opening a new Notepad file by default.")
                temp_file_path = os.path.join(os.environ["USERPROFILE"], "Desktop",
                                              f"Temp_Notepad_File_{len(open_files) + 1}.txt")
                # The listing ends here; the next two lines are assumed to
                # follow the same pattern as the branches above.
                subprocess.Popen(["notepad.exe", temp_file_path])
                open_files.append(temp_file_path)

            # Later steps (writing, saving, leaving the loop) are not shown in the listing.

    except Exception as error:
        # Assumed handler: the original except clause is not part of the listing.
        speak(f"Sorry, something went wrong while opening Notepad: {error}")
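The file lookup in the Notepad sample can be isolated into a small helper so that it can be exercised without a microphone or a running Notepad. The sketch below is an illustration only: `find_saved_file` and its `extension` parameter are hypothetical names, and the directory mapping is passed in as an argument so that temporary folders can stand in for the user's Desktop and Documents.

```python
import os
import tempfile


def find_saved_file(file_name, search_directories, extension=".txt"):
    """Return (location, path) of the first matching file, or None if absent.

    search_directories maps a spoken-friendly label (e.g. "Desktop") to a
    directory path, as in the sample code; directories are tried in order.
    """
    for location, directory in search_directories.items():
        candidate = os.path.join(directory, file_name + extension)
        if os.path.exists(candidate):
            return location, candidate
    return None


if __name__ == "__main__":
    # Temporary folders stand in for the real Desktop and Documents.
    with tempfile.TemporaryDirectory() as root:
        desktop = os.path.join(root, "Desktop")
        documents = os.path.join(root, "Documents")
        os.makedirs(desktop)
        os.makedirs(documents)
        with open(os.path.join(documents, "notes.txt"), "w") as handle:
            handle.write("demo")

        directories = {"Desktop": desktop, "Documents": documents}
        print(find_saved_file("notes", directories))    # found in Documents
        print(find_saved_file("missing", directories))  # None
```

Returning the location label together with the path lets the caller build the spoken confirmation ("saved in Documents") without searching twice.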
4. RESULTS & FUTURE SCOPE

4.1 RESULTS

The GUI created and used for this work is shown in Fig. 4. All the use cases were thoroughly tested for functionality, and the results are encouraging. These results were then compared with existing voice assistants. One of the results, the response to the “Open Notepad” voice command, is shown as a sample in Fig. 5.

Feature | Proposed work | Existing voice assistants
Opening Apps | Can open apps like Notepad, Calculator, Chrome, etc. | Can open apps but fails in 20% of cases (e.g., misinterprets the app name).
Opening Notepad and Asking to Save | Can open Notepad with multiple tabs when asked to open, ask to save the file, and specify the file name. | Cannot handle file-saving commands. Fails to recognize "save" or "file name" commands.
[...] (Copy, Move, Delete Files) | [...] with voice commands. | [...] operations.
AI-Powered Summarization | Includes AI-powered summarization and web search functionality via AI | Available in Copilot for Microsoft 365 (Word, Outlook) but not system-wide

The graphical representation of the above comparison is shown in Fig. 6. The overall efficiency of the proposed work comes out to be 92.33%, against 52.78% for the existing voice assistants [9, 10].
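To put the two overall efficiency figures quoted above (92.33% and 52.78%) in proportion, the short calculation below expresses the difference both in percentage points and as a relative gain. It only restates the reported numbers. It does not reproduce how the authors derived them, and the variable names are illustrative.

```python
proposed = 92.33  # overall efficiency of the proposed work (%), as reported
existing = 52.78  # overall efficiency of existing assistants (%), as reported

# Absolute difference, in percentage points.
gap_points = proposed - existing

# Relative improvement over the existing assistants, in percent.
relative_gain = gap_points / existing * 100

print(f"Absolute gap: {gap_points:.2f} percentage points")  # 39.55
print(f"Relative gain: {relative_gain:.1f}%")               # 74.9%
```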
[...] performs tasks such as opening and closing applications, file management, mathematical calculations, volume control, and video playback using voice commands.

A comparative analysis with an existing project highlights significant improvements in accuracy and performance. Our assistant achieves an overall accuracy of 92.33%, compared to 52.78% for the existing system. Notable enhancements include 100% accuracy in opening applications, 96% accuracy in file searching, and 92% accuracy in mathematical calculations, outperforming previous models in every key functionality.

The results indicate that the proposed assistant offers a more reliable and user-friendly experience, with potential applications in productivity, automation, and accessibility. Future work could explore AI-driven enhancements for better contextual understanding and expanded functionality.

REFERENCES

[1] Easwara Moorthy, Aarthi & Vu, Kim-Phuong, “Voice Activated Personal Assistant: Acceptability of Use in the Public Space”, HIMI 2014. Lecture Notes in Computer Science, vol 8522. Springer, pp. 324-334, 10.1007/978-3-319-07863-2_32.
[2] Subhash, P. N. Srivatsa, S. Siddesh, A. Ullas and B. Santhosh, “Artificial Intelligence-based Voice Assistant,” 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4), 2020, pp. 593-596, doi: 10.1109/WorldS450073.2020.9210344.
[3] Harshit Agrawal, Nivedita Singh, Gaurav Kumar, Dr. Diwakar Yagyasen, Mr. Surya Vikram Singh, “Voice Assistant Using Python”, 2021, IJIRT Volume 8 Issue 2, ISSN: 2349-6002, pp. 419-423.
[4] Mrs. A. M. Sermakani, J. Monisha, G. Shrisha, G. Sumisha, “Creating Desktop Speech Recognization Using Python Programming”, IJARCCE, Vol. 10, Issue 3, March 2021, ISSN (Online), pp. 129-134.
[5] Abeed Sayyed, Ashpak Shaikh, Ashish Sancheti, Swikar Sangamnere, Prof. Jayant H Bhangale, “Desktop Assistant AI Using Python” (2021), International Journal of Advanced Research in Science, Communication and Technology (IJARSCT), Volume 6, Issue 2, June 2021. ISSN (Online): 2581-9429.
[6] V Geetha & Gomathy, C K & Kottamasu, Manasa & Kumar, Nukala. (2021). The Voice Enabled Personal Assistant for Pc using Python. International Journal of Engineering and Advanced Technology. 10. 162-165. 10.35940/ijeat.D2425.0410421.
[7] Aditya Sinha, Gargi Garg, Gourav Rajwani, Shimona Tayal, “Intelligent Personal Assistant”, International Journal of Informative Futuristic Research, Volume 4, Issue 8, April 2017.
[8] Vadaboyina Appalaraju, V Rajesh, K Saikumar, P. Sabitha, “Design and Development of Intelligent Voice Personal Assistant using Python”, 2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN).
[9] Silky Sharma, Prof. (Dr.) Gurinder Singh, “Comparison of Voice Based Virtual Assistants fostering Indian Higher Education – A Technical Perspective”, 2021 International Conference on Technological Advancements and Innovations (ICTAI), November 2021, DOI: 10.1109/ICTAI53825.2021.9673307.
[10] Andreas M. Klein, Maria Rauschenberger, Jorg Thomaschewski, and Maria José Escalona, “Comparing Voice Assistant Risks and Potential with Technology-Based Users: A Study from Germany and Spain”, Journal of Web Engineering, Vol. 20, No. 7, pp. 1991–2016, doi: 10.13052/jwe1540-9589.2071, November 2021.