TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context NR Koluguri, T Park, B Ginsburg ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 52 | 2022 |
SpeakerNet: 1D depth-wise separable convolutional network for text-independent speaker recognition and verification NR Koluguri, J Li, V Lavrukhin, B Ginsburg arXiv preprint arXiv:2010.12653, 2020 | 34 | 2020 |
Meta-learning for robust child-adult classification from speech NR Koluguri, M Kumar, SH Kim, C Lord, S Narayanan ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 18 | 2020 |
Comparison of Speech Tasks and Recording Devices for Voice Based Automatic Classification of Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis. BN Suhas, D Patel, NR Koluguri, Y Belur, P Reddy, A Nalini, R Yadav, ... INTERSPEECH, 4564-4568, 2019 | 18 | 2019 |
Spectrogram enhancement using multiple window Savitzky-Golay (MWSG) filter for robust bird sound detection NR Koluguri, GN Meenakshi, PK Ghosh IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (6), 1183 …, 2017 | 17 | 2017 |
Multi-scale speaker diarization with dynamic scale weighting TJ Park, NR Koluguri, J Balam, B Ginsburg arXiv preprint arXiv:2203.15974, 2022 | 11 | 2022 |
AmberNet: A Compact End-to-End Model for Spoken Language Identification F Jia, NR Koluguri, J Balam, B Ginsburg arXiv preprint arXiv:2210.15781, 2022 | 3 | 2022 |
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition D Rekesh, S Kriman, S Majumdar, V Noroozi, H Juang, O Hrinchuk, ... arXiv preprint arXiv:2305.05084, 2023 | 2 | 2023 |
The CHiME-7 Challenge: System Description and Performance of NeMo Team’s DASR System TJ Park, H Huang, A Jukic, K Dhawan, KC Puvvada, N Koluguri, N Karpov, ... CHiME-7 Workshop, 2023 | 1 | 2023 |
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition KC Puvvada, NR Koluguri, K Dhawan, J Balam, B Ginsburg arXiv preprint arXiv:2309.10922, 2023 | | 2023 |
Investigating End-to-End ASR Architectures for Long Form Audio Transcription NR Koluguri, S Kriman, G Zelenfroind, S Majumdar, D Rekesh, V Noroozi, ... arXiv preprint arXiv:2309.09950, 2023 | | 2023 |
Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach TJ Park, K Dhawan, N Koluguri, J Balam arXiv preprint arXiv:2309.05248, 2023 | | 2023 |
NeMo Open Source Speaker Diarization System TJ Park, NR Koluguri, F Jia, J Balam, B Ginsburg Proc. Interspeech 2022, 853-854, 2022 | | 2022 |
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation TJ Park, H Huang, C Hooper, N Koluguri, K Dhawan, A Jukic, J Balam, ... | | |
Prototypical Networks for Robust Automatic Child-Adult Classification from Speech M Kumar, N Koluguri, SH Kim, C Lord, S Narayanan INSAR 2020 Virtual Meeting | | |