The microsoft text-to-speech platform uses the TTS engine of the Microsoft Speech Service to read text aloud with natural-sounding voices. This integration uses an API that is part of the Cognitive Services offering and is known as the Microsoft Speech API. For this integration to work, you need a free API key. You can use your Azure subscription to create an Azure Speech resource.
To enable text-to-speech with Microsoft, add the following lines to your configuration.yaml file:
# Example configuration.yaml entry
tts:
  - platform: microsoft
    api_key: YOUR_API_KEY
language: The language to use. Note that if you set the language to anything other than the default, you will need to specify a matching voice type as well. For the supported languages check the list of available languages.
gender: The gender you would like to use for the voice. Accepted values are
type: The voice type you want to use. Accepted values are listed as the service name mapping in the documentation.
rate: Change the rate of speaking in percentage. Example values:
volume: Change the volume of the output in percentage. Example values:
contour: Change the contour of the output in percentages. This overrides the pitch setting. See the W3 SSML specification for what it does. Example value:
region: The region of your API endpoint. See documentation.
Not all Azure regions support high-quality neural voices. Use this overview to determine the availability of standard and neural voices by region/endpoint.
New users (any newly created Azure Speech resource after August 31st, 2021) can only use neural voices. Existing resources can continue using standard voices through August 31st, 2024.
If you set the language to anything other than the default en-us, you will need to specify a matching voice type as well.
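As an illustration of pairing a non-default language with a matching voice, the following sketch configures German. The voice name KatjaNeural is an assumption chosen for illustration; check the service name mapping in the documentation to confirm which voices exist for your language and region.

```yaml
# Sketch: non-default language with a matching voice type
tts:
  - platform: microsoft
    api_key: YOUR_API_KEY
    language: de-de
    # Assumed voice name for de-de; verify it against the voice list
    gender: Female
    type: KatjaNeural
```

If the voice type does not belong to the configured language, the request is likely to fail, so change both values together.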
A full configuration sample including optional variables:
# Example configuration.yaml entry
tts:
  - platform: microsoft
    api_key: YOUR_API_KEY
    language: en-gb
    gender: Male
    type: RyanNeural
    rate: 20
    volume: -50
    pitch: high
    contour: (0, 0) (100, 100)
    region: eastus
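Once the platform is configured, speech can be triggered from an automation or script. The sketch below assumes the service is exposed as tts.microsoft_say, following the usual naming pattern for TTS platforms, and uses media_player.living_room as a placeholder entity; neither name comes from this page, so adjust both to match your setup.

```yaml
# Sketch of an automation action; the service name and entity are assumptions
service: tts.microsoft_say
data:
  # Placeholder media player that should play the generated audio
  entity_id: media_player.living_room
  message: "The front door is open."
```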