Technology vs. Conversation

One of the curious things about human language is that it is so infrequently written. Of the 6,000 or so currently spoken languages, only about 200 have a writing system. And the earliest writing system wasn’t developed very long ago when you consider that we’ve been talking for somewhere between 80,000 and 120,000 years.

So you’d think that voice recognition technology would be enormously intuitive and helpful and wonderful. And it almost is, but not quite.

Let’s take a small step back, or just off to the side: who here really loves leaving voicemail messages? Anybody? I don’t, not even when I know exactly what I want to say.

Not even, really, when the message is just going to be: “Hey, call me when you get a chance.” Why?

Part of it might be that there’s not a person there to look at while you’re talking, but that seems kinda thin. I think it has more to do with the fact that there isn’t someone on the other end to have a conversation with. If you hesitate or stutter or say “...um” three times in a row, there isn’t anybody on the other end to give you any feedback. Nobody is there to say, “Oh, yeah, cool,” or “Hey, can I call you back?” or “Look, the restraining order covers phone calls, too!” You could just as well be shouting into the void, because electronics basically look like a void to the human voice.

We learned to talk from other people, whether it was 100,000 years ago or 10 years ago. We learned speech as assertions, questions, conversations, demands and explorations, not as mere recitation or performance. We haven’t evolved beyond needing someone to listen and respond, to signal to us that they understand. Even talking to ourselves is helpful (another form of embodied cognition?) for formulating thoughts and clarifying ideas. But trying to get your demands met by voice recognition is hit-and-miss without the fluid (if sometimes absolutely maddening) exchange you would have with another person.

For now, we’re stuck performing for our devices in the hope that they will perform for us. We’ll probably stay in an uncanny valley, where voice recognition is weirdly good but not quite perfect, for another decade or so. In the meantime, I’ll just have to get used to turning toward my smart speaker and trying to make it understand that I just want it to listen to me.
