We’ve built a prototype to show how we could interact with the Internet using a command-driven approach.
- A screen reader, but one that uses machine learning and natural language processing to better understand both what the user wants to do and what the web page says.
- One that can offer a conversational interface instead of just reading out everything on the page.
It’s a proof-of-concept, but it’s an exciting idea with a lot of potential and we’ve got a demo that shows it in action.
The problem: screen readers today
I’ve written about this before, but here is a recap.
Visually impaired people can interact with the web using screen readers. These read out every element on a page.
The user has to make a mental model of the structure of the page as it’s read out, and keep this in their head as they arrow-key around the page.
For example, on a news site’s front page, once the screen reader has read out the page, you have to remember whether the story you want is the fifth or the sixth in the list, so you can tab the right number of times to reach it.
Imagine an automated telephone menu:
“for blah-blah-blah, press 1, for blather-blather-blather, press 2, for something-or-other, press 3 … for something-else-vague, press 9 …”
Imagine this menu was so long it took 15 minutes or more to read.
Imagine none of the options is an exact match for what you want. By the time you get to the end, you can’t remember whether the closest match was the third, the fourth, or the fiftieth option.
The vision: a Conversational Internet
Software could be smarter.
If it understood more about the web page, it could describe it at a higher, task-oriented level. It could read out the relevant bits, instead of everything.
If it understood more about what the user wants to do, the user could just say that, instead of working out the manual navigation steps themselves.
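To make the contrast concrete, here is a minimal sketch of that second idea: matching what the user said against the stories on a page, instead of making them count tab stops. Everything here is hypothetical, not the prototype’s actual code, and the naive word-overlap scoring stands in for the real machine learning and natural language processing.

```python
def tokens(text):
    """Lowercase word tokens, ignoring surrounding punctuation."""
    return {w.strip(".,!?\"'").lower() for w in text.split()}

def best_match(request, headlines):
    """Return the headline sharing the most words with the request.

    A toy stand-in for intent matching: a real system would use
    NLP/ML models rather than bag-of-words overlap.
    """
    req = tokens(request)
    return max(headlines, key=lambda h: len(req & tokens(h)))

# Hypothetical headlines extracted from a news site's front page.
headlines = [
    "Markets rally as inflation cools",
    "Local team wins championship final",
    "New accessibility guidelines published for the web",
]

print(best_match("read me the story about web accessibility", headlines))
# -> New accessibility guidelines published for the web
```

Even this crude matcher lets the user say what they want and get the right story read out, rather than memorising its position in a long list.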
The vision is software that can interpret web pages and offer a conversational interface to web browsing.