Research
As a Software Engineer in the GenAI organization at Google DeepMind, I specialize in developing and refining large multimodal AI models, leveraging Large Language Models (LLMs) alongside techniques such as Parameter-Efficient Fine-Tuning (PEFT), Supervised Fine-Tuning (SFT), and Reinforcement Learning from Human Feedback (RLHF). Previously, as part of Google’s Speech Research Team, I contributed to building large-scale Automatic Speech Recognition (ASR) models, with a focus on domain adaptation, data minimization through unsupervised learning, parameter-efficient fine-tuning, speech personalization, contextualization, and bias mitigation.
Strongest Areas: GenAI, AI, LLM, ASR, Deep Learning, Natural Language Processing, Speech, Machine Learning, Data Structures and Algorithms
Publications
Improving Speech Recognition for African American English with Audio Classification
Authors: Shefali Garg, Zhouyuan Huo, Khe Chai Sim, Suzan Schwartz, Mason Chua, Alëna Aksënova, Tsendsuren Munkhdalai, Levi King, Darryl Wright, Zion Mengesha, Dongseong Hwang, Tara Sainath, Françoise Beaufays, Pedro Moreno Mengibar
ICASSP 2024 - IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
Link to Paper
Large-scale ASR Domain Adaptation Using Self- and Semi-supervised Learning
Authors: Dongseong Hwang, Ananya Misra, Zhouyuan Huo, Nikhil Siddhartha, Shefali Garg, David Qiu, Khe Chai Sim, Trevor Strohman, Françoise Beaufays, Yanzhang He
ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, 2022
Link to Paper
A Comparison of Supervised and Unsupervised Pre-Training of End-to-End Models
Authors: A. Misra, D. Hwang, Z. Huo, S. Garg, N. Siddhartha, A. Narayanan, K.C. Sim
Interspeech 2021, pp. 731-735
Link to Paper
UserLibri: A Dataset for ASR Personalization Using Only Text
Authors: Theresa Breiner, Swaroop Ramaswamy, Ehsan Variani, Shefali Garg, Rajiv Mathews, Khe Chai Sim, Kilol Gupta, Mingqing Chen, Lara McConnaughey
Interspeech 2022
Link to Paper
Pentagon at MEDIQA 2019: Multi-task Learning for Filtering and Re-ranking Answers Using Language Inference and Question Entailment
Authors: H. Pugaliya, K. Saxena, S. Garg, S. Shalini, P. Gupta, E. Nyberg, T. Mitamura
ACL-BioNLP Workshop 2019; arXiv preprint arXiv:1907.01643, 2019
Link to Paper
Incremental Layer-wise Self-supervised Learning for Efficient Speech Domain Adaptation on Device
Authors: Zhouyuan Huo, Dongseong Hwang, Khe Chai Sim, Shefali Garg, Ananya Misra, Nikhil Siddhartha, Trevor Strohman, Françoise Beaufays
arXiv preprint arXiv:2110.00155, 2021
Link to Paper
Press & media
- Intern developing facial recognition app for Google Glass [link]
Memberships & academic services
- Reviewer for International Conference on Speech and Computer (SPECOM) 2024
- Reviewer for Conference on Neural Information Processing Systems (NeurIPS) 2024
- Reviewer for IEEE Spoken Language Technology Workshop (SLT) 2024
- Member, IEEE Signal Processing Society (SPS)
Scholarships & Awards
- Mitacs Globalink Research Internship: Issued by Mitacs, Canada in Mar 2015
- Inspire Scholarship for Higher Education (S.H.E): Issued by the Department of Science and Technology (DST), Ministry of Science and Technology, Government of India in Aug 2011
- National Talent Search Examination Scholar (NTSE): Issued by the National Council of Educational Research and Training in Mar 2009