Speech to Speech Translation
#3
by Owos - opened
Can I use this model for speech-to-speech translation?
If so, how can I adapt the model for it?
I'm currently doing speech-to-text with Whisper (which can translate while transcribing the audio) and then text-to-speech with this model. The problem is that Whisper's timestamps aren't used, so the generated speech isn't in sync with the original audio.
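For reference, here is a rough sketch of that cascade that keeps the Whisper timestamps next to each synthesized clip instead of discarding them. It assumes the Transformers `pipeline` API, the `openai/whisper-small` checkpoint (any Whisper checkpoint should work), and a `speaker_embeddings` x-vector tensor of shape `(1, 512)` that you supply yourself. The function names `chunks_to_segments` and `translate_and_speak` are made up for illustration. Note that Whisper's `translate` task only produces English, and long recordings may need extra chunking options.

```python
def chunks_to_segments(chunks, audio_duration=None):
    """Turn Whisper pipeline chunks into (start, end, text) tuples, skipping empty text."""
    segments = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()
        if not text:
            continue
        # Whisper may leave the last end timestamp as None; fall back to the
        # caller-supplied audio duration (which may itself be None).
        segments.append((start, end if end is not None else audio_duration, text))
    return segments


def translate_and_speak(audio_path, speaker_embeddings, asr_model="openai/whisper-small"):
    """Translate speech to English text with Whisper, then synthesize each chunk with SpeechT5."""
    # Imported lazily so the helper above works without the heavy dependency.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=asr_model)
    tts = pipeline("text-to-speech", model="microsoft/speecht5_tts")

    # return_timestamps=True yields a "chunks" list with (start, end) per segment.
    result = asr(audio_path, return_timestamps=True, generate_kwargs={"task": "translate"})

    clips = []
    for start, end, text in chunks_to_segments(result["chunks"]):
        speech = tts(text, forward_params={"speaker_embeddings": speaker_embeddings})
        # Keep the original timing alongside the synthesized audio.
        clips.append((start, end, speech["audio"], speech["sampling_rate"]))
    return clips
```

Each returned tuple carries the source segment's start and end time, so a later step can decide where to place the clip on the output timeline.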
Have you found a satisfactory way to do speech-to-speech that keeps the two audio tracks in sync?
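One possible direction, offered only as a sketch and not as a tested solution: start each synthesized clip at the timestamp of its source segment, and speed it up when it is longer than the original slot. The snippet below only computes that plan; applying the rate needs a separate time-stretching step that is not shown. `Placement`, `plan_placements`, and the `max_rate` cap are hypothetical names and values chosen for illustration, and the 16 kHz default matches SpeechT5's output sampling rate.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    start_sample: int  # where the clip begins on the output timeline
    rate: float        # playback-rate factor; values above 1.0 mean "speed up"


def plan_placements(segments, clip_durations, sample_rate=16000, max_rate=1.5):
    """Plan where each synthesized clip goes and how much to speed it up.

    segments: list of (start, end, text) in seconds, from the source audio.
    clip_durations: duration in seconds of each synthesized clip, same order.
    """
    placements = []
    for (start, end, _text), duration in zip(segments, clip_durations):
        slot = max(end - start, 1e-6)  # guard against zero-length slots
        # Never slow a clip down; cap the speed-up so speech stays intelligible.
        rate = min(max(duration / slot, 1.0), max_rate)
        placements.append(Placement(round(start * sample_rate), rate))
    return placements
```

Clips that fit their slot keep a rate of 1.0 and are simply followed by silence. Clips that hit the cap will still overrun into the next segment, so this does not fully solve the sync problem.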