Title, authors, venue | Cited by | Year |
Autovc: Zero-shot voice style transfer with only autoencoder loss K Qian, Y Zhang, S Chang, X Yang, M Hasegawa-Johnson International Conference on Machine Learning, 5210-5219, 2019 | 329 | 2019 |
Unsupervised speech decomposition via triple information bottleneck K Qian, Y Zhang, S Chang, M Hasegawa-Johnson, D Cox International Conference on Machine Learning, 7836-7846, 2020 | 127 | 2020 |
Speech Enhancement Using Bayesian Wavenet. K Qian, Y Zhang, S Chang, X Yang, D Florêncio, M Hasegawa-Johnson Interspeech, 2013-2017, 2017 | 89 | 2017 |
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder K Qian, Z Jin, M Hasegawa-Johnson, GJ Mysore ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 81 | 2020 |
Deep learning based speech beamforming K Qian, Y Zhang, S Chang, X Yang, D Florencio, M Hasegawa-Johnson 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 38 | 2018 |
Global rhythm style transfer without text transcriptions K Qian, Y Zhang, S Chang, J Xiong, C Gan, D Cox, M Hasegawa-Johnson arXiv preprint arXiv:2106.08519, 2021 | 27* | 2021 |
Parp: Prune, adjust and re-prune for self-supervised speech recognition CIJ Lai, Y Zhang, AH Liu, S Chang, YL Liao, YS Chuang, K Qian, ... Advances in Neural Information Processing Systems 34, 21256-21272, 2021 | 25 | 2021 |
Contentvec: An improved self-supervised speech representation by disentangling speakers K Qian, Y Zhang, H Gao, J Ni, CI Lai, D Cox, M Hasegawa-Johnson, ... International Conference on Machine Learning, 18003-18017, 2022 | 20 | 2022 |
Unsupervised text-to-speech synthesis by unsupervised automatic speech recognition J Ni, L Wang, H Gao, K Qian, Y Zhang, S Chang, M Hasegawa-Johnson arXiv preprint arXiv:2203.15796, 2022 | 11 | 2022 |
Speechsplit2.0: Unsupervised speech disentanglement for voice conversion without tuning autoencoder bottlenecks CH Chan, K Qian, Y Zhang, M Hasegawa-Johnson ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 9 | 2022 |
Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding. H Gao, J Ni, Y Zhang, K Qian, S Chang, M Hasegawa-Johnson Interspeech, 1304-1308, 2021 | 9 | 2021 |
Wavprompt: Towards few-shot spoken language understanding with frozen language models H Gao, J Ni, K Qian, Y Zhang, S Chang, M Hasegawa-Johnson arXiv preprint arXiv:2203.15863, 2022 | 8 | 2022 |
On the interplay between sparsity, naturalness, intelligibility, and prosody in speech synthesis CIJ Lai, E Cooper, Y Zhang, S Chang, K Qian, YL Liao, YS Chuang, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 5 | 2022 |
Speech denoising with auditory models MR Saddler, A Francl, J Feather, K Qian, Y Zhang, JH McDermott arXiv preprint arXiv:2011.10706, 2020 | 5* | 2020 |
Continuous cnn for nonuniform time series H Shi, Y Zhang, H Wu, S Chang, K Qian, M Hasegawa-Johnson, J Zhao ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 4* | 2021 |
Application of local binary patterns for SVM based stop consonant detection K Qian, Y Zhang, M Hasegawa-Johnson Proc. Speech Prosody, 1114-1118, 2016 | 3 | 2016 |
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing Y Fu, Y Zhang, K Qian, Z Ye, Z Yu, CIJ Lai, C Lin Advances in Neural Information Processing Systems 35, 20902-20920, 2022 | 1 | 2022 |
Monaural singing voice separation using fusion-net with time-frequency masking F Li, K Qian, M Hasegawa-Johnson, M Akagi 2019 Asia-Pacific Signal and Information Processing Association Annual …, 2019 | 1 | 2019 |
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos K Su, K Qian, E Shlizerman, A Torralba, C Gan Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | | 2023 |
Global prosody style transfer without text transcriptions K Qian, Y Zhang, S Chang, JJ Xiong, C Gan, D Cox US Patent App. 17/337,518, 2022 | | 2022 |