Kentaro Mitsui
Projects
PSLM (Parallel Speech Language Model)
Parallel generation of text and speech using an LLM for low-latency spoken dialogue.
arXiv
Demo
Nue ASR
Integrating pretrained HuBERT and GPT for automatic speech recognition.
arXiv
GitHub
Hugging Face
CHATS (CHatty Agents Text-to-Speech)
Natural AI-to-AI conversation with spoken content control over written dialogue.
arXiv
Demo
Koemotion
Japanese text-to-speech and facial keypoint generation with speaker control over a 2D map.
UniFLG (Unified Facial Landmark Generator)
Integrating audiovisual speech synthesis (text-to-speech and text-to-face) and speech-driven facial animation (speech-to-face) for multimodal interaction.
arXiv
Demo