The Art of Dictation

Discovering What Lies Outside of Human Dictation

Dictation is a human process. While some might not take an interest in the process of dictation, closer consideration reveals that it is a fascinating, unique part of the modern human experience.

Consider the following:

  • The ability to use language is a recent development, evolutionarily speaking
  • The ability to empathize with others may have developed only as recently as our mirror neurons
  • Dictation involves both of these defining human characteristics, immortalized for all time

Interested yet?

What’s the “Best” Dictation Software?

Before we attempt to answer this question, it’s important that we first define what we mean by “best.”

By the “best” dictation software, we don’t mean the cheapest, or the most time-efficient, or the most accurate. These are simply measurable attributes by which we describe the differences between human dictation – the gold standard – and its machine alternatives.

What we mean by the “best” dictation software is the one that by its very nature best preserves the human element of dictation.

For the purposes of simplification, here we will compare two options for dictation software that not only represent the top options currently available on the market, but also represent two very different models for how machine dictation can be performed.

Here we will compare LilySpeech to Dragon.

LilySpeech is a relatively young player in the world of PC dictation and has not yet had much time to establish itself. Dragon Dictate, on the other hand, has been around a long time and is considered the industry leader.

The two products have a lot in common, but they are driven by two totally different technologies.

Dragon Dictate is owned by a company called Nuance. Nuance developed its own speech-to-text engine, which for a very long time was the only highly accurate engine in the world.

This was true until Google entered the arena and got very serious about its own speech-to-text engine. You may be familiar with it if you’ve ever sent a text message with your voice on an Android phone.

Dragon Dictate seems to rely more on understanding each individual word, which requires a longer training process for the user and imposes more rigid requirements on the microphone. The Google speech-to-text service, by contrast, seems to rely heavily on the context of what you’re saying. This seems to allow it to produce better results with less user training, and to hold up in situations where there is a lot of background noise or the user has a lower-quality microphone.

LilySpeech is the first desktop software to make Google’s speech-to-text engine available for use in any application.

In our opinion, if you’re comparing LilySpeech to Dragon Dictate, LilySpeech is the winner because it uses what we feel is the superior speech-to-text engine. Also, LilySpeech costs $2.49 per month, which for the vast majority of people is much cheaper than Dragon Dictate. It also offers a free 30-day trial.

So while no dictation software can replace the gold standard of human dictation, LilySpeech (and the recent technological revolution it represents) comes the closest.

The Idea of Accent in Dictation

As anyone who has experience dictating knows, the presence of accents can add a layer of difficulty to the process.

However, by considering the idea of accent more closely, we access a very interesting and important truth about language and why it matters.

Let’s take a look.

Accents make dictation difficult for humans as well as for machines. When a human scribe is unfamiliar with an individual’s accent, chances are good they will make more mistakes. If an accent is too thick, the scribe may be unable to understand the person at all. Similarly, dictation software generally has difficulty performing well when an individual has a thick or unrecognized accent.

So what’s the solution? And what does it say about language?

Well, for human dictation, the solution is simple. If you need dictation for speech that features a thick accent, then simply find a scribe who either shares that accent or can easily understand it. This usually involves finding someone from the same geographical region, whether they grew up in that part of the world or immigrated there. Find someone who closely identifies with the other person.

For dictation software, you often simply click a drop-down menu, and instead of selecting “English – Canadian,” you choose “English – Australian,” or any number of other options.

It’s interesting to think that with this new drop-down menu solution to the problem of accent in dictation, modern dictation technologies have removed the element of geographical closeness from the equation. No longer is personal connection – however slight – between speaker and scribe requisite for understanding different accents. As dictation moves to machines, that closeness is no longer necessary.

Is this a bad thing? Is there any danger in this?

Language serves an important unifying function, in that it allows us to communicate with other members of the human species regardless of age, creed, race, or background. Paradoxically, however, language also serves an important separating function, helping individuals cohere into social groups that are defined by their geography and that protect one another.

Now, of course, moving from human dictation to machine dictation isn’t going to destroy the human race.

However, it’s nonetheless interesting to consider the human element that lies at the heart of the process of dictation, and how easy it is for technology to enable human beings to live in a more isolated fashion.

Individuals who share our particular dialect or accent and individuals who don’t are both a part of the rich linguistic fabric of our world, and it would certainly be a shame to lose this diversity!

Will Human Dictation Be Surpassed by Machines?

At the heart of dictation lies a very human element: understanding.

While the simple transcription of a routine court proceeding or the copying down of a medical procedure for insurance purposes may feel mechanical, there are other uses for dictation that remind us of the amazing power that dictation has to connect us to other people in mutual understanding.

For example, consider a clinical interview.

In this case, an individual struggling with some sort of psychological issue brings their difficulties to another individual who has professional training in how to help them grow and thrive. The psychologist diligently records the interview, as is part of the job, and later dictates the conversation (by hand or using dictation software) in order to review it.

In the act of dictation, the psychologist not only listens to every word of his or her patient, but also takes the time to immortalize those words.

There is something poetic about the ability to listen to other members of the human species and present their deepest thoughts and intuitions in language through the act of dictation. Upon deep reflection, dictation comes to represent that which is most advanced about human beings.

This, in part, is why it can feel so wrong-headed to consider the idea that machines could do it for us.

Dictation software and dictation technologies remove the human element from dictation, yielding a process that is mute, impersonal, thoughtless, and dry. The very notion doesn’t compute: It is illogical to consider the idea of a machine “listening” to a human speak or “understanding” what a human is saying. And yet these terms are used unthinkingly each and every day.

Recent advances in speech recognition technology have certainly closed the gap between human dictation and machine dictation as far as accuracy is concerned. But what do these technological advances miss?

Machine dictation lacks the ability to assign meaning and value to the process of human communication.

A human scribe, by contrast, when copying down the words that someone is speaking or has recorded, has the ability – simply through empathically listening and understanding – to change the record of what has been said in order to make it more meaningful. Even something as simple as a punctuation mark or a bit of text formatting like italicization can tell a story.

Machines do not tell stories. Machines simply enact complex sets of rules.

So while the tech revolution has the potential to change the way that dictation is performed in a practical sense, no matter how advanced the technology gets, it will never surpass human dictation in one sense: it will never be able to replace the fundamental human skill that lies at the heart of dictation, the ability to understand.

