


default search action
Siddhant Arora
- > Home > Persons > Siddhant Arora
Publications
- 2026
[i56]Siddhant Arora, Jinchuan Tian, Jiatong Shi, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Optimizing Conversational Quality in Spoken Dialogue Systems with Reinforcement Learning from AI Feedback. CoRR abs/2601.19063 (2026)- 2025
[c51]Yosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Siddhant Arora, Shinji Watanabe
:
Hypothesis Clustering and Merging: Novel MultiTalker Speech Recognition with Speaker Tokens. ICASSP 2025: 1-5
[c48]Siddhant Arora, Jinchuan Tian, Hayato Futami, Jee-weon Jung, Jiatong Shi, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems. INTERSPEECH 2025
[c47]Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Yuki Ito, Hassan Shahmohammadi, Siddhant Arora, Shinji Watanabe:
Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs. INTERSPEECH 2025
[c42]Siddhant Arora, Yifan Peng, Jiatong Shi, Jinchuan Tian, William Chen, Shikhar Bharadwaj, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Shuichiro Shimizu, Vaibhav Srivastav, Shinji Watanabe:
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems. NAACL (System Demonstrations) 2025: 248-259
[i52]Siddhant Arora, Yifan Peng, Jiatong Shi, Jinchuan Tian, William Chen, Shikhar Bharadwaj, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Shuichiro Shimizu, Vaibhav Srivastav, Shinji Watanabe
:
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems. CoRR abs/2503.08533 (2025)
[i48]Siddhant Arora, Jinchuan Tian, Hayato Futami, Jee-weon Jung, Jiatong Shi, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe
:
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems. CoRR abs/2506.00722 (2025)
[i47]Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Yuki Ito, Hassan Shahmohammadi, Siddhant Arora, Shinji Watanabe
:
Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs. CoRR abs/2506.10299 (2025)
[i43]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe:
Spiralformer: Low Latency Encoder for Streaming Speech Recognition with Circular Layer Skipping and Early Exiting. CoRR abs/2510.00982 (2025)
[i41]Siddhant Arora, Jinchuan Tian, Hayato Futami, Jiatong Shi, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems. CoRR abs/2510.02066 (2025)- 2024
[c39]Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Hiroaki Ogawa, Siddhant Arora, Shinji Watanabe
:
Phoneme-Aware Encoding for Prefix-Tree-Based Contextual ASR. ICASSP 2024: 10641-10645
[c36]Hayato Futami, Siddhant Arora, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe
:
Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model. INTERSPEECH 2024
[c34]Yosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Siddhant Arora, Shinji Watanabe
:
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting. INTERSPEECH 2024
[c32]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe
:
Decoder-only Architecture for Streaming End-to-end Speech Recognition. INTERSPEECH 2024
[c31]Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan S. Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Karen Livescu, Shinji Watanabe
:
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions. NAACL-HLT 2024: 2754-2774
[i37]Hayato Futami, Siddhant Arora, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe
:
Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model. CoRR abs/2406.12317 (2024)
[i36]Yosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Siddhant Arora, Shinji Watanabe
:
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting. CoRR abs/2406.12611 (2024)
[i35]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe
:
Decoder-only Architecture for Streaming End-to-end Speech Recognition. CoRR abs/2406.16107 (2024)
[i33]Yao-Fei Cheng, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Wen Shen Teo, Siddhant Arora, Shinji Watanabe
:
Task Arithmetic for Language Expansion in Speech Translation. CoRR abs/2409.11274 (2024)
[i32]Yosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Siddhant Arora, Shinji Watanabe
:
Hypothesis Clustering and Merging: Novel MultiTalker Speech Recognition with Speaker Tokens. CoRR abs/2409.15732 (2024)- 2023
[c25]Siddhant Arora, Hayato Futami, Emiru Tsunoo, Brian Yan, Shinji Watanabe
:
Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History. ICASSP 2023: 1-5
[c24]Siddhant Arora, Hayato Futami, Shih-Lun Wu, Jessica Huynh, Yifan Peng
, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe
:
A Study on the Integration of Pipeline and E2E SLU Systems for Spoken Semantic Parsing Toward Stop Quality Challenge. ICASSP 2023: 1-2
[c23]Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe
:
The Pipeline System of ASR and NLU with MLM-based data Augmentation Toward Stop Low-Resource Challenge. ICASSP 2023: 1-2
[c22]Hayato Futami, Emiru Tsunoo, Kentaro Shibata, Yosuke Kashiwagi, Takao Okuda, Siddhant Arora, Shinji Watanabe
:
Streaming Joint Speech Recognition and Disfluency Detection. ICASSP 2023: 1-5
[c21]Yosuke Kashiwagi, Siddhant Arora, Hayato Futami, Jessica Huynh, Shih-Lun Wu, Yifan Peng
, Brian Yan, Emiru Tsunoo, Shinji Watanabe
:
E-Branchformer-Based E2E SLU Toward Stop on-Device Challenge. ICASSP 2023: 1-2
[c20]Yosuke Kashiwagi, Siddhant Arora, Hayato Futami, Jessica Huynh, Shih-Lun Wu, Yifan Peng
, Brian Yan, Emiru Tsunoo, Shinji Watanabe
:
Tensor decomposition for minimization of E2E SLU model toward on-device processing. INTERSPEECH 2023: 710-714
[c19]Siddhant Arora, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe
:
Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding. INTERSPEECH 2023: 720-724
[c18]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe
:
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition. INTERSPEECH 2023: 1369-1373
[i29]Siddhant Arora, Hayato Futami, Emiru Tsunoo, Brian Yan, Shinji Watanabe
:
Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History. CoRR abs/2305.00926 (2023)
[i28]Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng
, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge. CoRR abs/2305.01194 (2023)
[i27]Siddhant Arora, Hayato Futami, Shih-Lun Wu, Jessica Huynh, Yifan Peng
, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe
:
A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge. CoRR abs/2305.01620 (2023)
[i24]Siddhant Arora, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding. CoRR abs/2307.11005 (2023)
[i23]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe:
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition. CoRR abs/2307.12767 (2023)
[i22]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe
:
Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation. CoRR abs/2309.08876 (2023)
[i18]Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng
, Roshan S. Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe
:
UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network. CoRR abs/2310.02973 (2023)
[i17]Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Hiroaki Ogawa, Siddhant Arora, Shinji Watanabe
:
Phoneme-aware Encoding for Prefix-tree-based Contextual ASR. CoRR abs/2312.09582 (2023)- 2022
[i10]Hayato Futami, Emiru Tsunoo, Kentaro Shibata, Yosuke Kashiwagi, Takao Okuda, Siddhant Arora, Shinji Watanabe
:
Streaming Joint Speech Recognition and Disfluency Detection. CoRR abs/2211.08726 (2022)

manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from
to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the
of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from
,
, and
to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from
and
to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from
.
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2026-03-27 00:09 CET by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint


Google
Google Scholar
Semantic Scholar
Internet Archive Scholar
CiteSeerX
ORCID






