Another view: of speech recognition
- 6 May 2016
A few years ago, I wrote an article criticising speech recognition. “Useless” was, I think, my general conclusion.
At the time, I couldn’t get my pre-iPhone smart device or my car to understand anything I said. My wife fared slightly better; the car appeared to understand her dulcet tones – although it didn’t always ring the person she wanted.
I “got it” that certain groups doing very specialised things – such as reporting x-rays – would find a system that recognised their specialist vocabulary very useful. I just wasn’t convinced it was a useful tool for general practice.
Well, I may be changing my mind. Not just because Siri on my iPhone understands most of what I say, but because I can start to see a role for it.
This has been helped by the fact that Nuance has lent me a copy of its Dragon medical software and given me some training sessions on it. My short review: amazingly good at recognition, but the interface needs a lot of work and, to some extent, a total rethink.
Making better use of secretaries
Medical secretaries are not cheap and don’t grow on trees. We have some excellent ones, but it’s a shame to use them as typists when what they are really good at is chasing stuff up; especially when it involves ringing the hospital and speaking to other secretaries.
It also seems a shame to have them taking dictation or transcribing. Dictation through shorthand went ages ago; we moved over to tape and then to a digital dictation system. I have a microphone/handset connected to my computer and I dictate to a centralised queue that our three secretaries pick up.
It works well – it’s not rocket science. However, one of the projects we are working on as a GP federation is how to join up teams across buildings. To realise savings from federations, you need to utilise your distributed workforce better.
This is a problem I’ll address again in the future; but widening the pool of people who can pick up and deal with the dictations is an obvious area to look at.
Meantime, the idea behind using speech recognition is that instead of touch typing the queue of letters, the system does the typing. The secretaries just do some checking, which is quicker, so they are freed up to do other things.
On hearing this idea, one of my secretaries nearly had a fit laughing. She recommended that I listen to some of the garbage she had to deal with and translate into English. “Punctuation-less gibberish interspersed with instructions, not text” is how she described it.
To be fair, I’m not convinced today’s engines are up to this; but a colleague down the road who has tried speech recognition says that a lot of using it effectively is about knowing you are dictating to a machine. He reckons that once you get your mind around that, accuracy rates get good.
Getting doctors talking to computers
When I was a junior doctor doing outpatients, I used to take handwritten notes. Then, I’d dictate a letter – nominally to the GP – for every patient I saw.
GPs don’t work like this. Our notes tend to be in the SOAP format, where we have lines for what the patients have told us and what we have found. It can look ordered, but actually it’s often filled in poorly, unless people are guided by a template.
How does speech recognition fit in here? Do I just keep clicking and typing and filling in the odd field by speaking? Or can we do something more radical?
Can you imagine if I dictated something like: “This 44 year old man, recently divorced and a heavy smoker, presents with alcohol problems having been living rough for a few weeks.” And the system extracted all the right codes and put them in the correct places in the records? That would be progress.
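To make the idea concrete, here is a tiny, purely illustrative sketch of the extraction step I have in mind. The trigger phrases, the Read-style codes and the record sections below are invented for the example; a real system would need proper clinical natural language processing and a full terminology such as Read or SNOMED CT behind it.

```python
# Toy illustration only: map phrases in dictated text to made-up
# clinical codes and the part of the record they belong in.
PHRASE_TO_CODE = {
    "heavy smoker":      ("137R.", "social history"),   # codes invented for the example
    "recently divorced": ("133S.", "social history"),
    "alcohol problems":  ("E23..", "problem list"),
    "living rough":      ("13F6.", "social history"),
}

def extract_codes(dictation: str):
    """Return (phrase, code, record section) for each trigger phrase found."""
    text = dictation.lower()
    return [(phrase, code, section)
            for phrase, (code, section) in PHRASE_TO_CODE.items()
            if phrase in text]

note = ("This 44 year old man, recently divorced and a heavy smoker, "
        "presents with alcohol problems having been living rough for a few weeks.")

for phrase, code, section in extract_codes(note):
    print(f"{section:15} {code:6} <- '{phrase}'")
```

Crude keyword matching like this is nowhere near good enough for real records, but it shows the shape of the problem: going from free dictation to coded entries in the right places.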
In other words, what if we could use speech recognition for more than just entering data; if we could use the system more efficiently?
Linking automation, help and voice
Going back to Siri: what do I use it for? Setting alarms and meetings. It’s actually easier to say: “Siri, set an alarm for 6.30am tomorrow” than it is to find the clock app and do it manually.
I seem to spend my working life clicking around tiny icons, trying to remember which one is which and how to get them to do certain tasks over and over.
Can you imagine if I could say: “Computer: print a blood form for FBC and renal bloods, on the DNs’ printer in their room, and send them a message asking them if they could send their phlebotomist to take the bloods.”
Or if I could read a letter and say: “This patient needs yearly PSAs and alert the usual doc if the reading is above 20”; and it did all that was needed to make sure that happened.
Or perhaps even simpler things: “Do any of the medications that this patient is on have hyponatraemia as a common side effect?”
I saw in the ‘Five Year Forward View’ for GPs something about introducing automation into GP systems. I haven’t seen any detail on what this might mean, but I hope it means deep API access to automate tasks. Linking this to speech recognition might be amazing.
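Purely as a sketch of what linking voice to that kind of API access could look like: the recognised utterance gets parsed into an intent and handed to whatever calls the GP system exposes. Everything here (the command grammar, print_blood_form and send_team_message) is invented for illustration; no current GP system offers these hooks, as far as I know.

```python
# Illustrative only: dispatch a recognised voice command to imaginary
# GP-system API calls. None of these endpoints exists today.
import re

def print_blood_form(tests, printer):
    print(f"[stub] printing blood form for {tests} on the {printer} printer")

def send_team_message(team, message):
    print(f"[stub] message to {team}: {message}")

def handle_utterance(text):
    """Very crude intent matching on the recognised text."""
    text = text.lower()
    handled = False
    match = re.search(r"print a blood form for (.+?), on the (.+?) printer", text)
    if match:
        print_blood_form(tests=match.group(1), printer=match.group(2))
        handled = True
    if "send them a message" in text:
        send_team_message("district nurses",
                          "Please could your phlebotomist come and take the bloods?")
        handled = True
    if not handled:
        print("[stub] sorry, I didn't catch that")

handle_utterance("Computer: print a blood form for FBC and renal bloods, "
                 "on the DNs' printer in their room, and send them a message "
                 "asking if their phlebotomist could take the bloods")
```

The hard work is obviously in the recognition and in the GP system exposing safe, well-documented calls; the dispatch layer itself is the easy part.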
I saw in the NHS Alliance’s ‘top ten tips’ for general practice document that touch typing training for GPs was a good idea. This was on the grounds it made computer use less stressful and easier.
I don’t disagree with that; however, it seems a cop out. Will 30,000 GPs really go on a touch typing course? Perhaps they will if they can get CPD points for it. However, it might be easier to make the systems they use better and easier to interface with.
What do others reckon? Is it time to move away from WIMP to voice?
Neil Paul