Nithin Rao Koluguri


Citations

	All	Since 2019
Citations	280	276
h-index	8	8
i10-index	7	7

Citations per year

Year	Citations
2018	3
2019	4
2020	10
2021	21
2022	29
2023	105
2024	104

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Boris Ginsburg, NVIDIA. Verified email at nvidia.com
Taejin Park, NVIDIA. Verified email at nvidia.com
Prasanta Kumar Ghosh, Associate Professor, Indian Institute of Science (IISc), Bangalore. Verified email at iisc.ac.in
Shrikanth (Shri) Narayanan, University Professor and Niki & Max Nikias Chair in Engineering, University of Southern California. Verified email at sipi.usc.edu


Nithin Rao Koluguri

NVIDIA Corporation

Verified email at nvidia.com

Speech Processing, Deep Neural Networks, Machine Learning


Title	Authors	Venue	Cited by	Year
Titanet: Neural model for speaker representation with 1d depth-wise separable convolutions and global context	NR Koluguri, T Park, B Ginsburg	ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	85	2022
SpeakerNet: 1D depth-wise separable convolutional network for text-independent speaker recognition and verification	NR Koluguri, J Li, V Lavrukhin, B Ginsburg	arXiv preprint arXiv:2010.12653, 2020	42	2020
Fast conformer with linearly scalable attention for efficient speech recognition	D Rekesh, NR Koluguri, S Kriman, S Majumdar, V Noroozi, H Huang, ...	2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023	37	2023
Comparison of Speech Tasks and Recording Devices for Voice Based Automatic Classification of Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis	BN Suhas, D Patel, NR Koluguri, Y Belur, P Reddy, A Nalini, R Yadav, ...	INTERSPEECH, 4564-4568, 2019	26	2019
Meta-learning for robust child-adult classification from speech	NR Koluguri, M Kumar, SH Kim, C Lord, S Narayanan	ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020	20	2020
Spectrogram enhancement using multiple window Savitzky-Golay (MWSG) filter for robust bird sound detection	NR Koluguri, GN Meenakshi, PK Ghosh	IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (6), 1183 …, 2017	20	2017
Multi-scale speaker diarization with dynamic scale weighting	TJ Park, NR Koluguri, J Balam, B Ginsburg	arXiv preprint arXiv:2203.15974, 2022	18	2022
Enhancing speaker diarization with large language models: A contextual beam search approach	TJ Park, K Dhawan, N Koluguri, J Balam	ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	9	2024
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition	KC Puvvada, NR Koluguri, K Dhawan, J Balam, B Ginsburg	ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	8	2024
Ambernet: A compact end-to-end model for spoken language identification	F Jia, NR Koluguri, J Balam, B Ginsburg	arXiv preprint arXiv:2210.15781, 2022	5*	2022
Property-aware multi-speaker data simulation: A probabilistic modelling technique for synthetic data generation	TJ Park, H Huang, C Hooper, N Koluguri, K Dhawan, A Jukic, J Balam, ...	arXiv preprint arXiv:2310.12371, 2023	3	2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System	TJ Park, H Huang, A Jukic, K Dhawan, KC Puvvada, N Koluguri, N Karpov, ...	arXiv preprint arXiv:2310.12378, 2023	3	2023
Spectral Codecs: Spectrogram-Based Audio Codecs for High Quality Speech Synthesis	R Langman, A Jukić, K Dhawan, NR Koluguri, B Ginsburg	arXiv preprint arXiv:2406.05298, 2024	2	2024
Investigating End-to-End ASR Architectures for Long Form Audio Transcription	NR Koluguri, S Kriman, G Zelenfroind, S Majumdar, D Rekesh, V Noroozi, ...	ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	1	2024
NeMo Open Source Speaker Diarization System	T Park, NR Koluguri, F Jia, J Balam, B Ginsburg	INTERSPEECH, 853-854, 2022	1	2022
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations	K Dhawan, NR Koluguri, A Jukić, R Langman, J Balam, B Ginsburg	arXiv preprint arXiv:2407.03495, 2024		2024
Less is More: Accurate Speech Recognition & Translation without Web-Scale Data	KC Puvvada, P Żelasko, H Huang, O Hrinchuk, NR Koluguri, K Dhawan, ...	arXiv preprint arXiv:2406.19674, 2024		2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5	Z Chen, H Huang, O Hrinchuk, KC Puvvada, NR Koluguri, P Żelasko, ...	arXiv preprint arXiv:2406.19954, 2024		2024
Multi-scale speaker diarization for conversational ai systems and applications	T Park, NR Koluguri, J Balam, B Ginsburg	US Patent App. 17/979,989, 2024		2024
Speaker identification, verification, and diarization using neural networks for conversational ai systems and applications	NR Koluguri, T Park, B Ginsburg	US Patent App. 17/962,248, 2024		2024
