To get started, make sure you set the Say element type attribute to SSML. See https://docs.ytel.com/docs/say for more information.

Break

The Break element controls pausing or other prosodic boundaries between words. Using Break between any pair of tokens is optional. If this element is not present between words, the break is automatically determined based on the linguistic context.

Element Attributes

Attribute	Description
`time` optional	Sets the length of the break in seconds or milliseconds (e.g. "3s" or "250ms").
`strength` optional	Sets the strength of the output's prosodic break in relative terms. Valid values are "none", "x-weak", "weak", "medium", "strong", and "x-strong". The value "none" indicates that no prosodic break boundary should be output, which can be used to prevent a prosodic break that the processor would otherwise produce. The other values indicate monotonically non-decreasing (conceptually increasing) break strength between tokens. Stronger boundaries are typically accompanied by pauses.

Example

The following example shows how to use the Break element to pause between steps:

<response>  
  <say type='ssml'>
    Step 1, take a deep breath. <break time="200ms"/>
    Step 2, exhale.
    Step 3, take a deep breath again. <break strength="weak"/>
    Step 4, exhale.
  </say>
</response>
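
As a further sketch, the snippet below combines a break length given in seconds with strength="none", which the attribute table describes as suppressing a break the processor would otherwise insert. The spoken text is made up for illustration, and how audible each boundary is will likely depend on the voice in use.

```xml
<!-- Sketch only: the sentences are illustrative.
     time="2s" requests a two-second pause; strength="none" asks the
     processor not to insert a break at that point; strength="x-strong"
     requests the strongest relative boundary. -->
<response>
  <say type='ssml'>
    Please hold while we look up your account. <break time="2s"/>
    Thank you for waiting.
    Your balance is <break strength="none"/> forty dollars.
    <break strength="x-strong"/>
    Goodbye.
  </say>
</response>
```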

`<say-as>`

The <say-as> element lets you specify information about the type of text construct that is contained within the element.

Attributes

interpret-as: Determines how the value is spoken
format: Optional formatting for specific interpret-as values
detail: Optional detail level for specific interpret-as values

Supported Values

cardinal: Speaks numbers as words
ordinal: Speaks numbers as ordinal terms
characters: Spells out words letter by letter
fraction: Converts fractions to spoken words
expletive or beep: Censors text
unit: Converts units to appropriate form
verbatim or spell-out: Spells out text
time: Speaks time in a natural format
date: Speaks dates with configurable detail

Example

<response>  
  <say type='ssml'>
    Cardinal value <say-as interpret-as="cardinal">12345</say-as>
    Ordinal value <say-as interpret-as="ordinal">1</say-as>
    Characters value <say-as interpret-as="characters">can</say-as>
    Fraction value <say-as interpret-as="fraction">5+1/2</say-as>
    Expletive or beep value <say-as interpret-as="expletive">censor this</say-as>
    Unit value <say-as interpret-as="unit">10 foot</say-as>
    Verbatim or spell-out value <say-as interpret-as="verbatim">abcdefg</say-as>
    Time value <say-as interpret-as="time" format="hms12">2:30pm</say-as>  
    Date value with detail 1 <say-as interpret-as="date" format="yyyymmdd" detail="1">1960-09-10</say-as>
    Date value with detail 2 <say-as interpret-as="date" format="dmy" detail="2">10-9-1960</say-as>
  </say>
</response>
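
The example above uses "expletive" and "verbatim". As a minimal sketch, the alternate names "beep" and "spell-out" from the Supported Values list could be used the same way inside ordinary sentences. The confirmation code and the other spoken text are hypothetical.

```xml
<!-- Sketch only: "spell-out" and "beep" are the alternate names listed
     under Supported Values; the code XK42 and the sentences are invented
     for illustration. -->
<response>
  <say type='ssml'>
    Your confirmation code is <say-as interpret-as="spell-out">XK42</say-as>.
    You are caller number <say-as interpret-as="ordinal">3</say-as> in the queue.
    The caller said <say-as interpret-as="beep">something rude</say-as>.
  </say>
</response>
```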

`<p>` and `<s>`

The <p> and <s> elements let you mark up paragraphs and sentences.

Use <s>...</s> tags to wrap full sentences, especially if they contain SSML elements that change prosody (that is, <audio>, <break>, <emphasis>, <par>, <prosody>, <say-as>, <seq>, and <sub>).

If a break in speech is intended to be long enough that you can hear it, use <s>...</s> tags and put that break between sentences.

Example

<response>  
  <say type='ssml'>
    <p>
      <s>This is sentence one.</s>
      <s>This is sentence two.</s>
    </p>
  </say>
</response>

Additional Notes

The <s> tag helps define sentence boundaries.
SSML elements within <s> tags can modify how the sentence is spoken.
Proper use of these tags can improve speech synthesis clarity.
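
To illustrate the advice about audible breaks, here is a minimal sketch that wraps each sentence in <s>...</s> and places a one-second break between the two sentences rather than inside either of them. The wording of the sentences is illustrative only.

```xml
<!-- Sketch only: the break sits between two complete sentences,
     not in the middle of one. -->
<response>
  <say type='ssml'>
    <p>
      <s>Your appointment is confirmed.</s>
      <break time="1s"/>
      <s>Press 1 to hear the details again.</s>
    </p>
  </say>
</response>
```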

Prosody

The Prosody element lets you customize the pitch, speaking rate, and volume of text contained by the element. Currently the rate, pitch, and volume attributes are supported.

The rate and volume attributes can be set according to the W3C SSML specification. The pitch attribute accepts one of three kinds of value:

Option	Description
Relative	Specify a relative value (e.g. "low", "medium", or "high"), where "medium" is the default pitch.
Semitones	Increase or decrease pitch by "N" semitones using "+Nst" or "-Nst" respectively. Note that "+/-" and "st" are required.
Percentage	Increase or decrease pitch by "N" percent by using "+N%" or "-N%" respectively. Note that "%" is required but "+/-" is optional.

Example

<response>  
  <say type='ssml'>
    <prosody rate="slow" pitch="-2st">Can you hear me now?</prosody>
  </say>
</response>
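
The three pitch options can be compared side by side. The sketch below uses one <prosody> element per option, plus one that sets rate and volume with the labels "fast" and "loud" from the W3C SSML specification. The spoken text is illustrative, and the exact audible effect will probably vary by voice.

```xml
<!-- Sketch only: one prosody element per pitch option from the table,
     followed by a rate and volume example using W3C SSML labels. -->
<response>
  <say type='ssml'>
    <prosody pitch="high">This uses a relative pitch value.</prosody>
    <prosody pitch="+3st">This is raised by three semitones.</prosody>
    <prosody pitch="-10%">This is lowered by ten percent.</prosody>
    <prosody rate="fast" volume="loud">This is faster and louder.</prosody>
  </say>
</response>
```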

Audio

The Audio element supports the insertion of recorded audio files and other audio formats in conjunction with synthesized speech output.

Refer to the W3C specification for detailed information: https://www.w3.org/TR/speech-synthesis/#S3.3.1

Attributes

Attribute	Required	Default	Description
`src`	Yes	N/A	URI referring to the audio media source (HTTPS only)
`clipBegin`	No	0	Offset from audio source's beginning to start playback
`clipEnd`	No	Infinity	Offset from audio source's beginning to end playback
`speed`	No	100%	Playback rate relative to normal input rate
`repeatCount`	No	1	Number of times to insert the audio
`repeatDur`	No	Infinity	Duration limit for inserted audio
`soundLevel`	No	+0dB	Sound level adjustment in decibels

Example

<response>  
  <say type='ssml'>
    <audio src="cat_purr_close.ogg">
      <desc>a cat purring</desc>
      PURR (sound didn't load)
    </audio>
  </say>
</response>
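
As a sketch of the optional attributes in the table, the snippet below plays part of a clip twice at a reduced level. The https://example.com URL is a placeholder, and the value formats ("1s", "4s", "-3dB") are assumptions based on the defaults shown in the table and the W3C time and decibel notation, so check them against your account before relying on them.

```xml
<!-- Sketch only: the src URL is a placeholder, not a real file.
     clipBegin/clipEnd select seconds 1 through 4 of the clip,
     repeatCount inserts that selection twice, and soundLevel
     lowers it by 3 decibels. -->
<response>
  <say type='ssml'>
    <audio src="https://example.com/audio/cat_purr_close.ogg"
           clipBegin="1s" clipEnd="4s"
           repeatCount="2" soundLevel="-3dB">
      <desc>a cat purring</desc>
      PURR (sound didn't load)
    </audio>
  </say>
</response>
```

The text inside the element is the same fallback used in the example above, spoken if the audio cannot be loaded.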
