Embedding mp3s into a voice interface is one of the most compelling capabilities that SSML (Speech Synthesis Markup Language) has to offer voice designers. Intro music, sound effects, and recorded voices are underused yet powerful elements of voice design projects.
Add MP3 audio to your voice interface:
1. Encode your mp3 to make it compatible.
MP3 files need to have the following specifications: MPEG version 2, bit rate 48 kbps, sample rate 16000 Hz.
Our team at Sayspring has created an easy way to convert files to a format compatible with voice assistants. Just drag and drop your mp3 or wav file into our converter and you’ll have the compatible file. Try it out here.
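If you would rather encode the file yourself, the specs above map onto a few ffmpeg options. The sketch below is one way to do it, assuming ffmpeg with the libmp3lame encoder is installed on your machine; the helper names and file names are hypothetical. At a 16000 Hz sample rate, an MP3 encoder should write MPEG version 2 frames, so no separate version flag is set here.

```python
import subprocess


def build_ffmpeg_command(src, dst):
    """Return ffmpeg arguments for a 48 kbps, 16000 Hz MP3 (hypothetical helper)."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", src,                 # input file (mp3 or wav)
        "-codec:a", "libmp3lame",  # encode the audio as MP3
        "-b:a", "48k",             # bit rate: 48 kbps
        "-ar", "16000",            # sample rate: 16000 Hz
        dst,
    ]


def encode_for_voice(src, dst):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Calling encode_for_voice("intro.wav", "intro-encoded.mp3") would write the converted file next to the original.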
2. Host your encoded mp3. Grab the link to the file.
You must host your mp3 at an internet-accessible HTTPS endpoint. The hosting domain must have a valid, trusted SSL certificate.
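A quick sanity check on the link can catch the most common mistake, a plain http address. This is a minimal sketch with a hypothetical helper name and a placeholder URL; it only inspects the text of the link and does not verify the SSL certificate, which you would still need to confirm with your host.

```python
from urllib.parse import urlparse


def is_https_url(url):
    """Return True if url uses the https scheme and names a host."""
    parts = urlparse(url)
    return parts.scheme == "https" and bool(parts.netloc)


# Placeholder address, not a real file.
print(is_https_url("https://example.com/audio/intro.mp3"))
```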
3. Insert your sound using simple SSML.
Write your speech inside the <speak> tags. Embed your mp3 with an <audio> tag whose src attribute holds the link to your hosted file. Follow the lead of the example below.
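Here is a small sketch of what that markup can look like, written as a Python helper that assembles the SSML string. The function name, the sample sentence, and the example.com URL are placeholders for illustration, not the original example from this post.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(speech, audio_url):
    """Wrap speech text and one audio clip in a <speak> element."""
    # escape() protects characters such as & and < in the spoken text;
    # quoteattr() escapes the URL and adds the surrounding quotes.
    return f"<speak>{escape(speech)} <audio src={quoteattr(audio_url)} /></speak>"


print(build_ssml("Welcome back.", "https://example.com/audio/intro.mp3"))
# <speak>Welcome back. <audio src="https://example.com/audio/intro.mp3" /></speak>
```

To play the clip before the speech instead, place the audio tag ahead of the text inside the speak element.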
The audio clip will now play as part of the response in your project.
Some important limitations to note:
- You can use up to 5 audio tags in a single response.
- The combined length of all the audio files in a response can’t exceed 90 seconds.
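Both limits are easy to check before you ship a response. The sketch below assumes you already know each clip's length and pass it in as a dictionary keyed by URL; the function name and that dictionary are hypothetical, since SSML itself carries no duration information. A clip that is used twice counts twice toward the total.

```python
import xml.etree.ElementTree as ET

MAX_AUDIO_TAGS = 5       # audio tags allowed in one response
MAX_TOTAL_SECONDS = 90   # combined audio length allowed in one response


def check_audio_limits(ssml, durations):
    """Return a list of problems found; an empty list means both limits are met.

    durations maps each audio URL to its length in seconds (supplied by you).
    A URL missing from durations raises KeyError.
    """
    root = ET.fromstring(ssml)
    sources = [tag.get("src") for tag in root.iter("audio")]
    problems = []
    if len(sources) > MAX_AUDIO_TAGS:
        problems.append(f"{len(sources)} audio tags, limit is {MAX_AUDIO_TAGS}")
    total = sum(durations[src] for src in sources)
    if total > MAX_TOTAL_SECONDS:
        problems.append(f"{total} seconds of audio, limit is {MAX_TOTAL_SECONDS}")
    return problems
```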
Play audio as your entire response, or as an accompaniment to a voice response. The audio tag lets you include sound effects, earcons and short music. If your brand has a particular voice, you can include recordings of that in your design.
Audio is a compelling and memorable way to brand a voice-first user experience.
Think of the NBC chimes, the McDonald’s “I’m Lovin’ It” jingle, or the Law & Order dun-dun. With this simple code, any voice design can build a more emotional, more delightful, and more memorable brand.