Developer's Guide to Creating Talking Menus for Set-top Boxes and DVDs
Speaking of Graphics

The most difficult challenge facing designers of audio-navigation systems is finding the optimal strategy for translating graphics into their spoken equivalents. A well-designed audio-navigation system may not always adhere to the design concept underlying the graphic interface. The audio-navigation interface must take into account the realities of human auditory processing abilities, which are distinct from the means by which we comprehend visual or even tactile information.

That said, no one expects developers to create two distinctly different interfaces, one for sighted users and one for users who are blind. To do so would not be practical or cost-effective and may not even serve the interests of users. Remember that many users will listen to the audio navigation while looking at the visual menus. However, developers of audio-navigation systems should be willing to take a few judicious liberties with the visual interface when creating an audio equivalent. For example, a button labeled "Main Menu" on the visual interface might be spoken as "Return to Main Menu."

In the idealized scenario discussed earlier, Joe used talking menus to make sense of the electronic program guide (EPG). One of the first things he did was to instruct the EPG to present information in a linear fashion, reading forward from the current time, down the list of channel choices. The fact that the graphic representation of the material is a two-dimensional grid is unimportant to Joe. While the grid allows a sighted viewer to quickly locate the intersection of a particular channel and a particular time of day, that visual shortcut is not available to Joe. In fact, if the talking menu tried to explain that structure to Joe or to force him to visualize that layout in his mind before making a selection, he would probably have a much harder time navigating. The fact is, information presented in a linear list is easier to hear and to remember than information spoken aloud in a two-dimensional grid, if such a thing is even possible.
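The linear reading Joe asks for can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `linearize_guide`, the grid layout (a dictionary mapping each channel to one title per time slot), and the sample channels and programs are all hypothetical, not part of any real set-top box API.

```python
def linearize_guide(slots, channels, grid, current_slot):
    """Flatten a channel-by-time grid into an ordered list of phrases.

    Reading starts at the current time slot, goes down the channel list,
    and only then moves forward to the next time slot.
    """
    start = slots.index(current_slot)  # skip slots that are already over
    phrases = []
    for i in range(start, len(slots)):
        for channel in channels:
            phrases.append(f"{slots[i]}, {channel}: {grid[channel][i]}")
    return phrases


# Hypothetical sample data: two channels, three half-hour slots.
slots = ["7:00 PM", "7:30 PM", "8:00 PM"]
channels = ["Channel 2", "Channel 4"]
grid = {
    "Channel 2": ["News", "Quiz Show", "Movie"],
    "Channel 4": ["Cooking", "Travel", "Drama"],
}

for phrase in linearize_guide(slots, channels, grid, "7:30 PM"):
    print(phrase)  # a real system would hand each phrase to a speech engine
```

The listener never hears the words "row" or "column"; the grid survives only as the order in which the items are spoken.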

Let's take another example. Imagine a calendar for December 2002. On which days of the week do Christmas and New Year's Day fall? A quick glance tells us Wednesday for both (the 31st is a Tuesday, so New Year's Day follows on Wednesday). To learn the answer so readily, we rely on the graphic design of the calendar. We know that the weeks are arranged in rows. We know that numbers mark each day. We don't need to begin scanning at the top to find the 25th and the 31st; our eyes can jump right to them. As a piece of graphic design, the two-dimensional calendar combines simplicity with elegance, and is highly usable.

For someone like Joe, however, the power of the graphic design is lost. To him, a month is a string of days, grouped in weeks. Within that simple description, he is free to organize the concept in his own mind any way he chooses. It is unlikely that he merely reconstructs a two-dimensional grid and then "reads" from it using his mind's eye. That strategy presents too many possibilities for error. Rather, he's more likely to want to know on what day of the week the first of the month falls. In this case, December 1st is a Sunday. Quickly adding three weeks, or 21 days, he learns that the 22nd is also a Sunday, putting Christmas three days later, on Wednesday.
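Joe's mental arithmetic amounts to counting forward from the first of the month in steps of seven. A minimal Python sketch follows; the helper name `weekday_of` is hypothetical, and the standard `datetime` module is used only to cross-check the result.

```python
import datetime

WEEKDAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"]


def weekday_of(day, first_weekday):
    """Name the weekday of `day`, given the weekday name of the 1st."""
    offset = (day - 1) % 7  # whole weeks drop out; only the remainder matters
    return WEEKDAYS[(WEEKDAYS.index(first_weekday) + offset) % 7]


# December 2002 begins on a Sunday.
print(weekday_of(22, "Sunday"))  # 21 days after the 1st: Sunday again
print(weekday_of(25, "Sunday"))  # three days later: Wednesday

# Cross-check against the standard library (Monday is 0, so Wednesday is 2).
assert datetime.date(2002, 12, 25).weekday() == 2
```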

If Joe wanted to scan the entire month for appointments, he might choose to do so by the week, asking the calendar to read first through all the days in the first week, then the second, and so on. He could also check by day, stepping through the Mondays, the Tuesdays, and so on. These methods are analogous to a sighted person looking across the rows or down the columns of the calendar. But Joe never needs to know, or care, that the calendar is graphically presented as a grid.
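Both reading orders can be produced from the same month data. The sketch below uses Python's standard `calendar` module; the function names `days_by_week` and `days_by_weekday` are hypothetical, and the Sunday-first week is an assumption chosen to match the example above.

```python
import calendar

NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
         "Thursday", "Friday", "Saturday"]


def month_weeks(year, month):
    """Weeks of the month, Sunday first; 0 marks a day outside the month."""
    cal = calendar.Calendar(firstweekday=calendar.SUNDAY)
    return cal.monthdayscalendar(year, month)


def days_by_week(year, month):
    """Reading order 'by the week': one list of day numbers per week."""
    return [[d for d in week if d != 0] for week in month_weeks(year, month)]


def days_by_weekday(year, month):
    """Reading order 'by day': all the Sundays, then all the Mondays, etc."""
    weeks = month_weeks(year, month)
    return {NAMES[i]: [week[i] for week in weeks if week[i] != 0]
            for i in range(7)}


print(days_by_week(2002, 12)[0])                # first week of December 2002
print(days_by_weekday(2002, 12)["Wednesday"])   # every Wednesday that month
```

Either function returns a plain sequence that a talking menu could step through item by item, so the user chooses a reading order rather than a position on a grid.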

One of the greatest services the designer of talking menus can provide for users who are blind is to temporarily ignore the logic underlying the graphic representation of data, and think about how best to help the audio-dependent user absorb the menu items. This is especially true for developers working on complex applications, such as the set-top box (STB). It is less true for developers working on applications with very simple and easy-to-use menu trees. In other words, thinking outside the box of the visual interface becomes more important as the graphical interface itself grows in complexity.

In order to create an interface for navigating complex menu structures, designers should first ask themselves the following questions:

  • Is the menu already a simple linear list?
  • Is the menu a grid?
  • How many levels does the menu system have?
  • Is the menu system designed to deliver information, or to let the user make choices that drive a process?
  • Which graphic items that are not actionable choices are necessary and which can be ignored?
  • How much does the user need to know about what a sighted person sees when using the graphic interface?

Answering these questions, as well as the usability questions presented above, can provide a framework for a discussion of best practices when designing talking menus for users who are blind or have vision impairments.