The google_cloud platform lets you use Google Cloud Platform APIs and integrate them into Home Assistant.
To use Google Cloud Platform, provide the path, relative to the config directory, of the API key file you are going to use. Place the file in the config folder and set the key_file parameter in the tts platform configuration:

    # Example configuration.yaml entry
    tts:
      - platform: google_cloud
        key_file: googlecloud.json
The process for obtaining an API key is described in the corresponding Google Cloud documentation.
Basic instructions for all APIs:
- Visit Cloud Resource Manager.
- Click the CREATE PROJECT button at the top.
- Enter a Project name and click the button to create the project.
- Enable the Cloud API you need, either from its own page or from the APIs library, by selecting your Project from the dropdown list and confirming.
- Set up authentication:
  - Open the Service account list page.
  - From the toolbar above the Service account list, select Create service account.
  - In the Service account name field, enter any name.
  - If you are requesting a text-to-speech API key, don't select a value from the Role list. No role is required to access this service.
  - Click Create. If a note appears warning that this service account has no role, you may ignore it.
  - Return to the Service account list page and click the service account you just created to see its details.
  - Choose the Keys tab within the details view for this service account.
  - In the Add Key dropdown, select Create New Key.
  - Specify a JSON key type and click Create.
  - A [serviceaccountname].json file will download to your browser.
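Before pointing key_file at the downloaded file, it can help to confirm that the file is readable JSON and looks like a service account key. The following is a minimal sketch, not part of Home Assistant: the helper name check_key_file and the list of expected fields are assumptions made for illustration, not an official schema.

```python
import json
from pathlib import Path

# Fields assumed to appear in a service account key file.
# This list is illustrative only, not an official schema.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return a list of problems found in the key file (empty if none)."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"{key_path} does not exist"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as err:
        return [f"{key_path} is not valid JSON: {err}"]
    if not isinstance(data, dict):
        return [f"{key_path} does not contain a JSON object"]
    # Report every expected field that is absent from the file.
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in data]
    # A present but unexpected "type" value is reported separately.
    if "type" in data and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical location; adjust to where your config folder lives.
    for problem in check_key_file("googlecloud.json"):
        print(problem)
```

An empty result only means the file has the expected shape. It does not prove that the key is valid or that the API is enabled for the project.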
Google Cloud text-to-speech converts text into human-like speech in more than 100 voices across 20+ languages and variants. It applies groundbreaking research in speech synthesis (WaveNet) and Google’s powerful neural networks to deliver high-fidelity audio. With this easy-to-use API, you can create lifelike interactions with your users that transform customer service, device interaction, and other applications.
The Cloud text-to-speech API is billed monthly based on the number of characters (or bytes, depending on the voice type) sent to the service to be synthesized into audio.
| Voice | Monthly free tier | Paid usage |
| --- | --- | --- |
| Neural2 | 0 to 1 million bytes | $16.00 USD / 1 million bytes |
| Polyglot (Preview) | 0 to 1 million bytes | $16.00 USD / 1 million bytes |
| Studio (Preview) | 0 to 100 thousand bytes | $160.00 USD / 1 million bytes |
| Standard | 0 to 4 million characters | $4.00 USD / 1 million characters |
| WaveNet | 0 to 1 million characters | $16.00 USD / 1 million characters |
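To get a rough idea of what a given monthly volume might cost, the figures in the pricing table can be turned into a small calculation. This sketch assumes that the free tier is deducted first and that the remainder is billed linearly at the listed rate. The function name estimate_monthly_cost is hypothetical, and actual billing may differ.

```python
# Figures copied from the pricing table:
# voice type -> (free units per month, USD per 1 million units).
# Units are bytes or characters, depending on the voice type.
PRICING = {
    "Neural2": (1_000_000, 16.00),
    "Polyglot (Preview)": (1_000_000, 16.00),
    "Studio (Preview)": (100_000, 160.00),
    "Standard": (4_000_000, 4.00),
    "WaveNet": (1_000_000, 16.00),
}


def estimate_monthly_cost(voice_type, units):
    """Estimate the monthly cost in USD for `units` bytes or characters.

    Assumption: usage within the free tier costs nothing and everything
    above it is billed proportionally at the per-million rate.
    """
    free_units, usd_per_million = PRICING[voice_type]
    billable = max(0, units - free_units)
    return billable * usd_per_million / 1_000_000


if __name__ == "__main__":
    for voice_type in PRICING:
        print(voice_type, estimate_monthly_cost(voice_type, 5_000_000))
```

For example, under these assumptions 5 million characters with a Standard voice leave 1 million billable characters, which the table prices at $4.00.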
- key_file: API key file to use with Google Cloud Platform. If not specified, the path in os.environ['GOOGLE_APPLICATION_CREDENTIALS'] will be used.
- language: Default language of the voice, e.g., en-US.
- gender: Default gender of the voice, e.g., male. Supported languages, genders, and voices are listed in the Google Cloud documentation.
- voice: Default voice name, e.g., en-US-Wavenet-F. Supported languages, genders, and voices are listed in the Google Cloud documentation. Important: if set, this parameter overrides the language and gender parameters.
- encoding: Default audio encoding, e.g., linear16.
- speed: Default rate/speed of the voice, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice. 2.0 is twice as fast, and 0.5 is half as fast. If unset (0.0), defaults to the native 1.0 speed.
- pitch: Default pitch of the voice, in the range [-20.0, 20.0]. 20 means an increase of 20 semitones from the original pitch. -20 means a decrease of 20 semitones from the original pitch.
- gain: Default volume gain (in dB) of the voice, in the range [-96.0, 16.0]. If unset, or set to 0.0 dB, audio plays at the normal native signal amplitude. A value of -6.0 dB plays at approximately half the normal native signal amplitude. A value of +6.0 dB plays at approximately twice the normal native signal amplitude. We strongly recommend not exceeding +10 dB, as there is usually no effective increase in loudness beyond that.
- profiles: A list of identifiers that select the audio effects profiles applied to the synthesized speech. Effects are applied on top of each other in the order they are given. Supported profile IDs are listed in the Google Cloud documentation.
- text_type: Default text type. Supported text types include ssml. See the SSML documentation for what it is and how to use it.
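The numeric ranges for speed, pitch, and gain can be checked before a configuration is written, and the decibel and semitone figures can be converted into ratios to get a feel for their effect. This is a sketch only. The helper names are hypothetical, the amplitude conversion uses the standard 20·log10 decibel definition, and the pitch conversion assumes equal-tempered semitones, which the text above does not state explicitly.

```python
# Ranges taken from the parameter descriptions: (minimum, maximum).
RANGES = {
    "speed": (0.25, 4.0),
    "pitch": (-20.0, 20.0),
    "gain": (-96.0, 16.0),
}


def out_of_range(settings):
    """Return the names of settings whose values fall outside RANGES."""
    return [
        name
        for name, value in settings.items()
        if name in RANGES and not RANGES[name][0] <= value <= RANGES[name][1]
    ]


def gain_to_amplitude_ratio(gain_db):
    """Convert a gain in dB to an amplitude ratio (20*log10 convention)."""
    return 10 ** (gain_db / 20)


def semitones_to_frequency_ratio(semitones):
    """Convert a pitch shift in semitones to a frequency ratio.

    Assumption: equal temperament, where 12 semitones double the frequency.
    """
    return 2 ** (semitones / 12)


if __name__ == "__main__":
    settings = {"speed": 0.9, "pitch": -2.5, "gain": -5.0}
    print(out_of_range(settings))
    print(gain_to_amplitude_ratio(settings["gain"]))
    print(semitones_to_frequency_ratio(settings["pitch"]))
```

With the decibel formula, -6.0 dB and +6.0 dB come out close to 0.5 and 2.0, which matches the "approximately half" and "approximately twice" wording in the gain description. The range check treats a speed of 0.0 as out of range, so leave speed out of the input when it is meant to be unset.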
The Google Cloud text-to-speech configuration can look like:
    # Example configuration.yaml entry
    tts:
      - platform: google_cloud
        key_file: googlecloud.json
        language: en-US
        gender: male
        voice: en-US-Wavenet-F
        encoding: linear16
        speed: 0.9
        pitch: -2.5
        gain: -5.0
        text_type: ssml
        profiles:
          - telephony-class-application
          - wearable-class-device