THE FUTURE OF
VOICE
WEBDAGENE 2017
CHERYL PLATZ
Owner, IDEAPLATZ
Senior Designer, MICROSOFT
I’ve been designing for voice
and multimodal interfaces
since 2006.
AT AMAZON:
First designer on Echo Look
and Alexa Notifications
AT MICROSOFT:
Designer for voice and
multimodal interfaces on
Windows Automotive and
Cortana
COMPUTER, WHO IS CHERYL?
CHERYL PLATZ //
@MUPPETAPHRODITE
HUMANS HAVE
DEVELOPED THE ART OF
CONVERSATION FOR
THOUSANDS OF YEARS.
The accessibility benefits are vast, and not just
limited to those with permanent accessibility
challenges.
Voice user
interfaces leverage
this experience to
improve lives.
“My wife passed away 4 years ago leaving
me, not only a widow, but a widowed
quadriplegic trying to survive on his own…
Alexa has been a blessing beyond my
imagination. She has given me an opportunity
that I never thought would be possible.”
AMAZON ECHO REVIEW FROM MICHAEL DAVIS, FEB 2017
DESCRIBING ECHO’S AID IN HIS LIFE AS A QUADRIPLEGIC
35.6 MILLION
AMERICANS USE
A VOICE-
ACTIVATED
ASSISTANT
DEVICE AT
LEAST ONCE A
MONTH.
SOURCE: eMarketer
VOICE UI IS NOW
MAINSTREAM, BUT IT’S FAR
FROM MATURE.
IN TODAY’S WEAKNESSES
LIE THREE KEY
OPPORTUNITIES FOR THE
FUTURE OF VOICE UI.
Limited training data and an affluent user
base are excluding underrepresented groups
through inaccuracy.
Today’s voice
interfaces are
inherently biased.
OPPORTUNITY 1
“…looking at race, I found that
Caucasian speakers had by far the
lowest error rate. African-American
speakers and speakers with a mixed
racial background had higher error rates.”
DR. RACHEL TATMAN, LINGUISTICS, UNIVERSITY OF WASHINGTON
ON ACCURACY OF SIRI FOR VARIOUS DEMOGRAPHIC GROUPS
KUOW, SEPTEMBER 19, 2017
GENDER
Systems were initially
trained with internal
data collection – at
companies where
engineering teams
are still largely male.
ETHNICITY
Training data
expands to include
early adopters, often
affluent.
This may exclude
underrepresented
ethnicities due to
wage gaps.
ACCENT
The North American
focus of most of
today’s products
means we have yet to
attain critical mass of
training data for
second-language
speakers.
DECONSTRUCTING VOICE UI BIAS
BIAS SPIRAL
Biased Training Data → Poor Accuracy for Excluded Groups → High Attrition by Excluded Groups → Biased Training Data
WE MUST FIND A WAY TO BREAK THE BIAS
SPIRAL,
AND MAKE THE FUTURE OF VOICE UI
We are wasting time re-implementing the same
basic tasks on multiple systems. Most systems
emphasize a single modality at a time.
Today’s voice
interfaces are
simple and siloed.
OPPORTUNITY 2
We currently have an ecosystem of voice
assistants chasing each other’s tails.
What could we accomplish if we relied on each
other’s expertise?
TIME LOST TO TIMERS
The voice of the future (en) – med Cheryl Platz
Complicated
Clip from Adobe vision video: “What if you had an intelligent agent for voice editing?”
DO WE NEED ONE ASSISTANT TO
RULE THEM ALL?
“Through its collaboration with
Microsoft, Amazon said, Alexa
users will get answers to some
of the same questions that
Cortana can now answer – for
instance, when is the next
budget review with the boss?”
NICK WINGFIELD, NEW YORK TIMES
AUGUST 30, 2017
ILLUSTRATION: MENGXIN LI
LET’S BUILD A CHOIR OF
HARMONIOUS VOICE
INTERFACES TOGETHER.
Alexa, Google Home and Cortana essentially
allow only command-and-control scenarios.
Today’s voice UIs
aren’t
conversational –
yet.
OPPORTUNITY 3
IT LOOKS LIKE YOU MIGHT BE
IN THE AWKWARD EARLY
STAGES OF
CONVERSATIONAL UI. CAN I
HELP?
PLEASE
NO
RUN
AWAY
SPOKEN CONVERSATION IS MORE THAN WORDS
AUDIBLE CUES: Tone, Speed, Volume
PHYSICAL CUES: Eye contact & gaze, Heart rate, Posture, Gesture
Clip from “Her”: Warner Bros. / Annapurna Pictures
CONVERSATION REQUIRES
TRUST.
HUMANS BUILD TRUST
OVER TIME.
WHAT BENEFIT CAN
HUMANS GAIN FROM
TRUSTING THESE
ASSISTANTS?
“The other night, I found Gary playing his
own version of a memory game with
Alexa. He was trying to come up with
songs he remembered and hadn't heard
for awhile and would ask her to play
them.”
AMAZON ECHO REVIEW FROM ALEX S.
DESCRIBING ECHO’S AID IN HUSBAND’S STRUGGLE WITH PARKINSON’S
“People have serious conversations with Siri.
People talk to Siri about all kinds of things,
including when they’re having a stressful day
or have something serious on their mind.
They turn to Siri in emergencies or when they
want guidance on living a healthier life.”
APPLE JOB POSTING, SIRI SOFTWARE ENGINEER, HEALTH AND WELLNESS
APRIL 4, 2017
Clip from “Her”: Warner Bros. / Annapurna Pictures
• How can we (ethically) model a relationship over time?
• What information is saved, and what is discarded?
• What level of transparency and control is required?
• Does the assistant’s personality adapt, or remain fixed?
WHAT DOES A RELATIONSHIP
LOOK LIKE?
HOW DO WE GET TO
THE FUTURE OF
VOICE UI?
• Inclusive and unbiased speech recognition
• Harmonious cross-product partnerships
• Semantic web represents common knowledge
• Trust built over time with shared context
• Conversation informed by non-verbal cues
THE FUTURE OF VOICE
THESE ADVANCES WILL
COMBINE TO OPEN NEW
OPPORTUNITIES AND A
NEW ERA IN HUMAN
EMPOWERMENT.
ENHANCED PRODUCTIVITY
MULTIMODAL ADAPTIVITY
COMPANIONSHIP AND
COMFORT
Clip from Star Trek IV: The Voyage Home / Paramount Pictures
LET’S BUILD A FUTURE
OF INTERFACES WHERE
OUR HUMANITY IS
AMPLIFIED,
NOT ATROPHIED.
May the voice be with you.
http://ideaplatz.com
CHERYL PLATZ
Owner, IDEAPLATZ -- Senior Designer, MICROSOFT
Twitter & Medium: @MuppetAphrodite


Editor's Notes

  1. https://www.amazon.com/gp/customer-reviews/R2B1YLP2SI17OK/ref=cm_cr_srp_d_rvw_ttl?ie=UTF8&ASIN=B00Y3QOH5G
  2. https://www.emarketer.com/Article/Alexa-Say-What-Voice-Enabled-Speaker-Usage-Grow-Nearly-130-This-Year/1015812
  3. http://kuow.org/post/turns-out-siri-might-be-racist
  4. We must lead the call for a more representative speech user experience across all platforms. This may take the form of new products, improvement of existing products, or open-source speech models.
  5. A semantic web, or third-party ontologies for specific subject matter like healthcare or IT, could allow each voice assistant to understand similar concepts and innovate on the response.
  6. “What if You Had an intelligent agent for voice editing?” – Adobe - https://www.youtube.com/watch?v=e6TccXFBY5g Powerful and yet simplistic. Could this semantic structure be exposed to multiple assistants?
  7. (Is it a coincidence that circles factor so prominently in the branding of this generation of assistants?)
  8. https://www.nytimes.com/2017/08/30/technology/amazon-alexa-microsoft-cortana.html
  9. Clip from “Her”: Warner Bros. / Annapurna Pictures
  10. Love letters: a time-honored human tradition that builds shared context. Today’s voice systems don’t build shared context. Today’s popular voice assistants only maintain conversational context for a matter of seconds. Human relationships require a shared understanding built over time. Trust without memory is difficult.
  11. https://www.amazon.com/gp/customer-reviews/RTRDKUJDZCO4B/ref=cm_cr_arp_d_rvw_ttl?ie=UTF8&ASIN=B00Y3QOH5G
  12. Screen capture of Apple job posting for “health and wellness” domain
  13. Screen capture from BMW/Alexa announcement: https://www.theverge.com/2017/9/27/16372566/bmw-alexa-integration-2018
  14. We are an aging population, and many of us need companionship at times when we are alone. As our voice interfaces become sophisticated, when is it appropriate for our digital voice assistants to fill this gap?
  15. Star Trek IV: The Voyage Home. Copyright Paramount Pictures. http://www.youtube.com/watch?v=LkqiDu1BQXY