Python PyQt Convert Speech Recognition CodeSource

Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision

Abstract: There exist three approaches for multilingual and crosslingual automatic speech recognition (MCL-ASR): supervised pretraining with phonetic or graphemic transcription, and self-supervised ...

IEEE

Multi-Stage Confidence-Guided Diffusion and Emotional Bidirectional Mamba for Robust Speech Emotion Recognition

Abstract: Speech Emotion Recognition (SER) in noisy environments is challenging due to the overlap between emotional and noise-related signals. We propose a novel emotion-diffusion approach to enhance ...

Microsoft

Understand your customers better with constrained speech recognition

In today’s voice-first world, it’s not enough for systems to simply hear what users say. They need to understand it with precision. In high-stakes environments like healthcare, finance, or enterprise ...

GitHub

WenetSpeech-Yue: A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation

This is the official repository 👑 for the WenetSpeech-Yue dataset and the source code for the WenetSpeech-Pipe speech data preprocessing pipeline. To address the unique linguistic characteristics of ...

GitHub

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

[2025.06.26] - This paper has been accepted by ICCV 2025 🎉! [2025.02.13] - The benchmark and evaluation code are available! [2024.12.05] - The training dataset and generative dataset (v1: 0.43m and v2: ...
