Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation Y Jia, Y Ding, A Bapna, C Cherry, Y Zhang, A Conneau, N Morioka arXiv preprint arXiv:2203.13339, 2022 | 14 | 2022 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 13 | 2024 |
LibriTTS-R: A restored multi-speaker text-to-speech corpus Y Koizumi, H Zen, S Karita, Y Ding, K Yatabe, N Morioka, M Bacchiani, ... arXiv preprint arXiv:2305.18802, 2023 | 9 | 2023 |
Residual adapters for few-shot text-to-speech speaker adaptation N Morioka, H Zen, N Chen, Y Zhang, Y Ding arXiv preprint arXiv:2210.15868, 2022 | 8 | 2022 |
Miipher: A robust speech restoration model integrating self-supervised speech and text representations Y Koizumi, H Zen, S Karita, Y Ding, K Yatabe, N Morioka, Y Zhang, W Han, ... 2023 IEEE Workshop on Applications of Signal Processing to Audio and …, 2023 | 7 | 2023 |
Translatotron 3: Speech to speech translation with monolingual data E Nachmani, A Levkovitch, Y Ding, C Asawaroengchai, H Zen, ... ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 1 | 2024 |
Residual adapters for few-shot text-to-speech speaker adaptation N Morioka, B Chun, N Chen, Y Zhang, D Yifan US Patent App. 18/493,770, 2024 | | 2024 |
LibriTTS-R: Restoration of a Large-Scale Multi-Speaker TTS Corpus Y Koizumi, H Zen, S Karita, Y Ding, K Yatabe, N Morioka, MAU Bacchiani, ... | | 2023 |
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech Representation and Linguistic Features Y Koizumi, H Zen, S Karita, Y Ding, K Yatabe, N Morioka, Y Zhang, W Han, ... | | 2023 |