Could Siri Be a Game-Changer for Physicians?

26 October 2011

Editor’s Note: Henry Wei, MD, is a board-certified internist and a Clinical Instructor in Medicine at Weill-Cornell Medical College. He is currently a Senior Medical Director at Aetna, where he leads Clinical Research & Development for ActiveHealth Management. (The views and opinions expressed here do not necessarily reflect those of Dr. Wei’s employers.)

By now you’ve probably heard plenty about Siri. Or perhaps Siri has heard plenty about you. Siri, the speech recognition software Apple’s built into the iPhone 4S, has been billed as an agile virtual assistant, able to perform tasks ranging from taking dictation and sending the outcome as an email to finding facts and scheduling appointments. But here’s something that hasn’t gotten quite as much popular press: Siri is really good at medical terminology. And swear words (more on that later).

I unboxed my own iPhone 4S after a recent clinic week. With a backlog of patient notes, I became curious about dictating these cases–leaving out patient names for reasons I’ll get into later. After watching some Will Ferrell clips (who doesn’t?), I tested out the Siri feature using some very precise medical terms–language I wouldn’t expect anyone but a physician to be using in real life. But Siri got every word right off the bat, without any ambiguity–a task that can be challenging for most humans, let alone a computerized technology.

Impressed, I began testing Siri from the other perspective: the patient’s. What if I say I have chest pain, or I’m depressed? Parsing through Siri’s replies, which mostly had to do with finding the nearest hospital, I noticed that these answers were coming from something more than just speech recognition–actual meaning was being recognized. That was exciting to me; it showed a layer of design and consideration beyond the literal. You see, my day job involves developing clinical decision support and innovating healthcare IT. When I see a system that’s strong enough with semantic meaning to, say, take an order at the drive-up window of a fast food joint, I get excited.

Siri’s technology isn’t unique to Apple, which apparently has partnered with Nuance, the company with the near-monopoly on speech recognition. Elsewhere, Google Voice and others have made great strides with similar services.

Regardless, this feels magical. That is, Steve Jobs seems to have taken Arthur Clarke’s law to heart: “Any sufficiently advanced technology is indistinguishable from magic.” Siri is a consumer-level tool that starts working right away–no instruction manual needed–and though most of us have no idea how, it can match speech and meaning with appropriate tasks. When it works well, it enables us to skip an entire step in the process of getting things done. We can move from having an idea to being ready to put it in motion without needing to search or transcribe.

Imagine, for example, if Siri or a similar service evolved to the point where you, as a doctor, could just say out loud, “please order a CBC with differential, chemistry panel and a lipid panel,” and the program were smart enough–and had enough data on you and the patient–to interpret your meaning and remind you that you’d recently ordered a CBC for this case, and that you’d forgotten to order a liver function test. Having that kind of rapid work-check would be an incredibly valuable safety and efficiency feature.
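To make that work-check a little more concrete, here is a minimal sketch in Python of what the logic might look like once the speech has already been turned into text. Everything in it is an assumption for illustration: the phrase list, the order codes, the 30-day window, and the "usually paired with" rule are hypothetical placeholders, not clinical guidance and not anything Siri or an EMR actually exposes.

```python
from datetime import date, timedelta

# Hypothetical vocabulary mapping spoken phrases to order codes.
KNOWN_TESTS = {
    "cbc with differential": "CBC_DIFF",
    "chemistry panel": "CHEM",
    "lipid panel": "LIPID",
    "liver function test": "LFT",
}

# Hypothetical pairing rule, for illustration only (not clinical guidance).
COMPANION_TESTS = {"LIPID": {"LFT"}}


def parse_order(utterance: str) -> set[str]:
    """Return the order codes whose phrases appear in the transcribed text."""
    text = utterance.lower()
    return {code for phrase, code in KNOWN_TESTS.items() if phrase in text}


def check_order(
    utterance: str,
    recent_orders: dict[str, date],
    today: date,
    window_days: int = 30,
) -> list[str]:
    """Flag recently duplicated tests and commonly paired tests left out."""
    requested = parse_order(utterance)
    reminders = []
    for code in sorted(requested):
        last = recent_orders.get(code)
        # Duplicate check: was the same test ordered within the window?
        if last is not None and (today - last) <= timedelta(days=window_days):
            reminders.append(f"{code} was already ordered on {last.isoformat()}")
        # Omission check: is a commonly paired test missing from the request?
        for companion in sorted(COMPANION_TESTS.get(code, set()) - requested):
            reminders.append(
                f"{code} is usually paired with {companion}, which was not ordered"
            )
    return reminders


reminders = check_order(
    "please order a CBC with differential, chemistry panel and a lipid panel",
    recent_orders={"CBC_DIFF": date(2011, 10, 20)},
    today=date(2011, 10, 26),
)
for line in reminders:
    print(line)
```

The hard part, of course, is not this lookup but everything around it: getting from speech to reliable meaning, and having trustworthy data on the patient's recent orders to check against.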

What if, even further down the line, Siri (or a device like it) were able to sit as a fly on the wall, listening to the doctor-patient conversation to a point where it could act as a second expert opinion for diagnoses? Today, physicians practice primarily in a vacuum, with little oversight. It’s too expensive and outright embarrassing to have someone observe and coach you, as a physician. Atul Gawande wrote recently on coaching and the experience of being coached as a surgeon, noting that “we treat guidance for professionals as a luxury.” But imagine if you had a technology that could watch what physicians did in real time with the goal of offering real coaching. For instance, a continuous speech recognition system could notice specific syntax and nuances of the patient’s wording that might indicate a higher risk of medication non-adherence, and offer an appropriate strategy to overcome that risk. This type of computerized guidance is–paradoxically–somewhat offensive to me as a physician, but incredibly exciting to me as a patient.

Clearly, with further development, something like Siri can be game-changing for medicine. But here, a caveat. Only if doctors really want it. Change always seems to come with two elements. One is a profound need, but the other–and this is the one we in healthcare frequently miss–is an actual desire to change. That said, the more expert and more successful the workers in any given field become, the less they’ll actually want any change.

More broadly, we should think of Daniel Pink, who describes the keys to motivation as Autonomy, Mastery, and Purpose. Doctors have for generations performed medicine in a very autonomous manner; it’s one of the things we’re good at. How good are we at relinquishing control, or even accepting external guidance?

Still, something like Siri has a leg up because it’s intuitive. Our own brains spend a good deal of time decoding words and sounds, so speech-based communication is more intuitive to us than a keyboard or even a touchscreen. This type of user interface should bring a sense of satisfaction when we notice the job is being done right. This intuitive component is something we just don’t get from a lot of healthcare technology today, which suffers from the “clicky-clicky problem,” a term that my wife (also a physician) coined when describing the irritating way that most EMRs force doctors into drop-down menus, radio buttons, and dozens of text entry boxes–death by clicks.

While I don’t foresee a conversion to Siri-like decision support services among the majority of physicians anytime soon–our desire for autonomy is too strong–I do believe that its ability to slip into iPhone users’ everyday patterns will translate to longer-term adoption. I’d see its course going something like this: We begin by using it in our personal lives. Insidiously, it works its way into the administrative sphere of offices and hospitals–not to the doctors at first. From there, it makes its way into the operating room, where, in an environment not dissimilar from driving a car, we’d prefer not to have to type or touch anything to express our ideas, queries, and commands. Finally it becomes truly pervasive not just among early adopters, but also among those older doctors–particularly those used to dictating already–who are so adept at what they’re doing that the effort of change seems unnecessary.

One of our rate-limiting steps may be the interoperability aspect. Nobody wants a technology that functions in a vacuum; if you take a dictation, you’re going to need it to be stored somewhere, and its semantic meaning to be interoperable with other systems. We’ll need apps, and those take time to go from a concept in someone’s head to actual living, breathing code and user interfaces. That said, the SMART platform championed by Zak Kohane and Ken Mandl at Harvard, and the iNexx platform from Medicity on the private side, are both very promising at establishing an easier way to build interoperable “apps” in the healthcare environment.

Some people, like the good Dr. Alexander Blau at Doximity, have raised important concerns about security and potential HIPAA violations with Siri. It’s my sense that the patent for the technology suggests that the data are either encrypted or have the option of being encrypted. Either way, I suspect that Apple or Nuance will be asked to make this part a little more transparent by healthcare IT companies interested in leveraging their technology–particularly if they’re using any data as training data for the system. For the time being, though, the possibility of HIPAA-insecurity is well worth keeping in mind. Longer-term, we’ll have to see if the FDA will start playing a larger role in regulating medical technologies that employ speech recognition.

Oh, the part about swear words? Well, believe it or not, back in medical school, my roommates and I invented a game where we would call and swear at toll-free automated voice response systems (e.g. MovieFone and airlines) to see what the system would do. The smart systems would actually direct you to a live human operator right away. Well, Siri actually recognizes a lot of obscenities, and even has some pretty humorous responses. And at the end of the day, if speech recognition technology in medicine starts to be able to pick up on the emotional intent of users–not just the semantic meaning of, say, medical terms or swear words–then my bet is that we’ll have entered a truly magical era of Healthcare IT.
