Eye Tracking: Best Way to Test Rich App Usability
Eye tracking has recently been debated on many fronts, with a particular focus on the ways people misuse it, and how some use eye tracking only as a way to "wow" clients. In our experience, however, it's invaluable in bringing to light key findings that are unattainable through other user testing methods. Eye tracking offers UX people the ability to:
- Leave a participant alone during a test to focus on the task at hand, and therefore
- Capture real physiological data about their conscious and unconscious experiences. This data is unique to eye tracking.
Eye Tracking for Rich Applications
Recently eye tracking has been heavily used in website design and testing. When I became involved about eight years ago, the sites tested were mainly flat HTML. Researchers were able to produce beautiful heat maps that were useful for comparing and optimising simple screen layouts and online advertising placements.
The arrival of rich, interactive, and transactional interfaces, however, has made producing eye tracking results more complex. Each interface has multiple states, and people can interact in pretty much whatever way they like. People choose their own path through a task, and the eye tracker on its own cannot tell which interface state was on screen at the moment each gaze point was recorded. Additional analysis is now required to separate these interactions, and fortunately eye tracking technologies have advanced to make this process relatively simple.
Without this more sophisticated analysis, eye tracking data will be misused, and eye tracking will retain (inappropriately) its novelty status.
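One way to picture the extra analysis step is to log when the interface changes state and then assign each fixation to the state that was on screen when it began. The sketch below is only an illustration: the function name, the tuple layouts, and the session data are all hypothetical, and a real eye tracking package will have its own export format and segmentation tools.

```python
from bisect import bisect_right
from collections import defaultdict


def split_fixations_by_state(fixations, state_changes):
    """Group fixations by the interface state on screen when each one began.

    fixations: list of (start_ms, x, y) tuples (assumed layout).
    state_changes: list of (timestamp_ms, state_name) sorted by time; each
    entry marks the moment the interface entered that state.
    """
    change_times = [t for t, _ in state_changes]
    by_state = defaultdict(list)
    for start_ms, x, y in fixations:
        # Index of the last state change at or before this fixation
        idx = bisect_right(change_times, start_ms) - 1
        if idx < 0:
            continue  # fixation began before the first logged state
        by_state[state_changes[idx][1]].append((start_ms, x, y))
    return dict(by_state)


# Hypothetical session: three interface states and four fixations
state_changes = [(0, "search_form"), (4200, "results_list"), (9100, "detail_panel")]
fixations = [(150, 320, 180), (900, 340, 210), (4300, 600, 400), (9500, 820, 150)]
segments = split_fixations_by_state(fixations, state_changes)
```

Each entry in `segments` can then be analysed or plotted separately, so that gaze recorded on one state is not overlaid on a screenshot of another.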
Usability Testing Fraternities Lock Horns
Think Aloud (TA) is an age-old usability testing method. People are asked to speak their thoughts, feelings, and opinions during a set of usability testing tasks. This is done with the help of the facilitator, who “skilfully interrupts” the process frequently to find out why people do particular things during the test.
In my opinion, when people are faced with lots of interactions on screen, considerable cognitive effort is required. Adding TA to this experience will inappropriately add more cognitive load to the task that would not normally be present. This can lead to misleading additional eye fixations and dwell times on outputs, which clouds the analysis. Often a poor facilitator will prompt users to the next stage (when was the last time someone knocked on your door and helped you find the right button when you browsed at home?), again spoiling the desired realism while testing.
We know there are three types of memory “storage” systems: sensory memory, short-term memory, and long-term memory. Our sensory memory retains an exact copy of what is seen or heard, and is generally thought to last between 300ms and a few seconds. Our short-term memory tends to hold between five and nine “items” (George Miller et al.) of information. If we start to talk about our actions in a TA protocol, these precious (milli)seconds and snapshots of information are quickly forgotten or overwritten. After that, what are we basing our commentary on?
There has been considerable debate about the usefulness of usability testing with eye tracking. Many TA proponents claim their methods, when carefully performed, will find enough issues compared with eye tracking, which they consider to be too difficult, time consuming, and expensive to bother about.
To my mind, this criticism arises because some people use eye tracking to make wild claims about how all websites should be designed. Such claims appear in numerous blog posts offering tips and guidelines, including 23 Actionable Lessons From Eye-Tracking Studies, Web Form Design Guidelines: An Eyetracking Study, and Eyetracking Study Reveals 12 Website Tactics. These articles have been widely referenced and retweeted; however, they should be taken with a grain of salt. Without a clear understanding of the methodology used, the information should simply be seen as an investigative tool in your design process, not as the Holy Grail.
Of particular concern is Jakob Nielsen's F-pattern research. This was produced in 2006 and I regularly hear it mentioned in design meetings in Australia. This study was done using the regular TA protocol, which means that participants' eye gaze data is very likely not valid because they were talking to the experimenter during the study. Try doing an everyday task like driving, cooking, or cleaning while verbalising every little step along the way, and see how your behaviour (actions, methods, or time to complete) is affected.
Retrospective Think Aloud
Retrospective Think Aloud is another usability testing method that has been used for many years. In this case, participants give their opinions of a task after it is completed and the interview is recorded for later reference. Of course, it is hard to remember what you did during a task.
Retrospective TA with eye tracking (RTA) is a method in which participants are quickly calibrated on the eye tracker and then asked to do the testing task without interruption from the facilitator. In fact, the facilitator can even leave the room during a test. Following the test, the facilitator immediately asks the participant to score their experience and then replays the eye gaze video of the participant's experience to them. This replay of their eye gaze triggers the person's memory of what they did, thereby mitigating the memory issue. Expanding on this, the gaze overlay can be hidden during the replay so the participant can be asked what they thought they looked at before their actual gaze is revealed.
Think Eyetracking, an early adopter of the RTA eye tracking protocol (which they renamed PEEP), have published a jointly researched academic paper with Lancaster University, UK. Their academic article can be downloaded on the Think Eyetracking Blog. They also had a very popular blog post about it in 2008 that generated some controversy.
Below are some eye tracking heat maps created by Think Eyetracking that show a comparison of a Google search task done with TA (on the left) and RTA (on the right). Note the dramatic differences! The behaviour is clearly very different, with longer dwell times and more fixations apparent in the TA output, probably caused by the participants staring at and browsing the screen while verbalising their actions.
Recently, Tobii Technology from Sweden created a unique feature in their Tobii Studio software where during the eye gaze replay stage of the test, the software records a video and audio record of the participant and facilitator as they review the eye tracking session. This can be paused, replayed, and scrubbed to allow a full, detailed analysis of the session with both visual and audio cues. Find out more about RTA on Scribd or watch this video:
Usability labs are set up to approximate real life. We regularly see experimenters set up their testing facilities like offices or lounge rooms to make the person feel at home. TA asks people to talk to someone while they are busy doing a task—where's the real life in that?
Eye tracking is the only real way to test a rich application without distracting the participant.
See Where People Looked, Not Where They Think They Looked
Here are some examples from our recent work at Objective Digital.
We allow people to complete tasks in a focused way, and also obtain real physiological data about what they are doing. It is difficult to argue with and almost impossible to fake these measures. We are not making assumptions about what they looked at and in what order things captured their attention. Some recent client projects encouraged us to use eye tracking to identify:
- Where do people look first?
- What don't they look at?
- What did they look at before the usability issue occurred?
- How do people learn an interface?
1. Where Do People Look First?
Eye tracking measures unconscious behavior—and provides data that people simply cannot verbalize in other common user research methods, especially TA usability testing protocols. Decades of psychology research show that much human behavior occurs at an unconscious level.
The human eye, for example, can make up to 5 fixations per second and this occurs below people's level of conscious awareness. So in a 30 second scan of a typical homepage, the customer may be looking at up to 150 items on the page. Your customers (or research participants) simply cannot verbally tell you where their eyes are going and this is exactly the value that good eye tracking data provides.
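The arithmetic behind that figure is simple enough to spell out. The small sketch below (the function name is made up for illustration) treats 5 fixations per second as a peak rate, so the result is an upper bound that also assumes every fixation lands on a different item.

```python
def max_fixations(duration_s, fixations_per_s=5):
    """Upper bound on fixations in a viewing period at a given peak rate."""
    return duration_s * fixations_per_s


# 30 seconds at up to 5 fixations per second
upper_bound = max_fixations(30)
```

In practice people revisit items and many fixations last longer than a fifth of a second, so the number of distinct items viewed will usually be lower than this bound.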
Our experience is that visual attention data IS correlated with behavioral performance metrics. If people don't "see" something, then they are less likely to click it.
2. What They Don't Look At
Case study 1: Eye tracking shows what things on the screen people didn't look at. Importantly, the data revealed what space was being wasted in the design and what areas of the page were essentially ignored.
Recently, when we tested an internal CRM application for a finance company, eye tracking proved that customer service staff ignored the very information the company wanted them to focus on. In the task, they weren't even required to click on the screen.
The image here shows clearly that in the first few seconds of usage staff focused primarily on the bottom right rather than the bottom left where they were meant to focus. This would not have been observable if simply interviewing them. Considering this screen is used 300,000 times per day, any improvements to the design that make the correct part of the screen more obvious will drive positive outcomes for the finance company's customer service.
Joanna Lewis, whom I work with, recently wrote a blog post about what people ignore.
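A common way to quantify what people did and did not look at is to draw areas of interest (AOIs) on the screen and total the fixation time that falls inside each one during the first few seconds. The sketch below is a simplified illustration of that idea, not the analysis used in the case study: the AOI rectangles, the three-second window, and the fixation values are all invented.

```python
def dwell_by_aoi(fixations, aois, window_ms=3000):
    """Sum fixation durations per area of interest for early fixations.

    fixations: list of (start_ms, duration_ms, x, y) tuples (assumed layout).
    aois: dict mapping name -> (left, top, right, bottom) in screen pixels.
    Only fixations that start before window_ms are counted.
    """
    totals = {name: 0 for name in aois}
    for start_ms, duration_ms, x, y in fixations:
        if start_ms >= window_ms:
            continue
        for name, (left, top, right, bottom) in aois.items():
            if left <= x < right and top <= y < bottom:
                totals[name] += duration_ms
                break  # AOIs are assumed not to overlap
    return totals


# Hypothetical 1024x768 screen split into two bottom-half AOIs
aois = {
    "bottom_left": (0, 384, 512, 768),
    "bottom_right": (512, 384, 1024, 768),
}
fixations = [
    (0, 250, 700, 500),
    (300, 400, 810, 620),
    (800, 200, 300, 100),   # outside both AOIs
    (3500, 300, 100, 600),  # starts after the window, so not counted
]
totals = dwell_by_aoi(fixations, aois)
ignored = [name for name, ms in totals.items() if ms == 0]
```

An AOI that ends up in `ignored` received no early attention at all, which is the kind of finding an interview alone would not surface.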
3. What They Looked at Before the Usability Issue Occurred
Only with eye tracking can we see all the options that people consider, even unconsciously, before starting and completing a specific task.
Eye tracking shows you where people immediately look on a screen. Yes, they can find a target and do a usability task just fine. But where did they look first? This matters especially for ecommerce, where the time taken can decide whether customers stay or leave. Rob Tannen puts this very clearly:
[Eye tracking] does have value as a secondary diagnostic tool. In the context of usability testing, eye tracking does not determine the presence of a usability problem, but helps determine what led to that problem in conjunction with performance data, facilitator observations and user self-reporting.
Case study 2: As the video clearly shows, this user was looking everywhere except at the Donate area on the right. The user looked at the navigation both at the side and at the top, then viewed the rest of the page, but at no point focused on the Donate area in the main image. It clearly highlights the fact that this call-to-action does not stand out in the prototype, and users are also expecting to see something within the navigation. Equally, the heat map below gives an indication of where all six people we tested would expect to see this link.
Case study 3: When users were asked to change one of the options on this screen, the eye tracking heat map below showed very clearly where they were expecting to go. People did not see the areas they were supposed to (indicated in red). Eye tracking of the first second they looked at the screen allowed us to make the site more efficient as it clearly indicated where the functionality should have been positioned.
The heat map below shows the first second of eye tracking on a prototype application. Users were heavily fixated on one area of the screen, and it can be assumed that this is where they were expecting to find the function they were asked to look for (the buttons).
This experience can also be seen in our financial institution case study.
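To give a feel for how a first-second heat map is built, the sketch below accumulates fixation duration into a coarse grid and keeps only fixations that start within the first 1,000 ms. It is a rough illustration under assumed inputs: the function name, cell size, and fixation values are hypothetical, and commercial tools typically apply smoothing rather than hard grid cells.

```python
def first_second_heatmap(fixations, screen_w, screen_h, cell_px=256, window_ms=1000):
    """Accumulate fixation duration into a coarse grid for early fixations.

    fixations: list of (start_ms, duration_ms, x, y) with integer pixel
    coordinates (assumed layout). Returns a list of rows of summed durations.
    """
    cols = -(-screen_w // cell_px)  # ceiling division
    rows = -(-screen_h // cell_px)
    grid = [[0] * cols for _ in range(rows)]
    for start_ms, duration_ms, x, y in fixations:
        on_screen = 0 <= x < screen_w and 0 <= y < screen_h
        if start_ms < window_ms and on_screen:
            grid[y // cell_px][x // cell_px] += duration_ms
    return grid


# Hypothetical fixations on a 1024x768 prototype screen
fixations = [
    (0, 300, 900, 100),
    (320, 350, 950, 140),
    (700, 200, 200, 600),
    (1200, 400, 500, 400),  # after the first second, so ignored
]
grid = first_second_heatmap(fixations, 1024, 768)

# Find the (row, column) of the cell with the most accumulated gaze time
hottest = max(
    ((r, c) for r in range(len(grid)) for c in range(len(grid[0]))),
    key=lambda rc: grid[rc[0]][rc[1]],
)
```

The hottest cell suggests where participants expected the function to be, which can then be compared with where it actually sits in the layout.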
4. How Do People Learn an Interface?
Eye tracking is also useful for change management and training when a new system is introduced to staff within a business.
Where do people look the first time they see an application? How about the second time, and the third time? Eye tracking shows very clearly how people learn to interact with a system.
Case study 4:
A new user visiting the website
The new user is seen to skip back and forth between the right hand side panel and the selections and information on the main part of the page to complete the task.
A frequent user of the website
The frequent user skips back and forth less frequently than the new user and is more focused on completing the task.
An expert user of the website
The expert user is highly focused and directed and completes the task with minimum effort.
This example was again from the banking CRM case study. The client even used an eye tracking video as part of the training package for customer service reps. It was used to show them the best way to look at the interface the instant a customer identifies themselves at a branch.
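The "skipping back and forth" described in this case study can be expressed as a simple count of gaze transitions between areas of the screen. The sketch below assumes each fixation has already been labelled with the area it landed in; the labels and the three sequences are invented purely to illustrate the calculation and are not data from the study.

```python
def count_transitions(aoi_sequence):
    """Count how often gaze moves from one area of interest to a different one."""
    return sum(
        1 for prev, cur in zip(aoi_sequence, aoi_sequence[1:]) if prev != cur
    )


# Hypothetical fixation sequences: "side" is the right-hand panel,
# "main" is the main part of the page
sessions = {
    "new": ["main", "side", "main", "side", "main", "side", "main"],
    "frequent": ["main", "side", "main", "main", "side", "main"],
    "expert": ["main", "main", "side", "main"],
}
transitions = {user: count_transitions(seq) for user, seq in sessions.items()}
```

Tracking this count across a person's first, second, and third sessions gives one simple measure of how quickly they are learning the interface.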
Role-Played Customer Service
The eye tracking data in the CRM examples above was gathered during a simulated customer service interaction. The bank branch staff member was tracked during a 45-minute role-played customer interview. Afterward, the usability issues were discussed when the staff member's eye gaze and screen interactions were replayed to him. I can't think of any other way to do this type of test that essentially involves three people: participant, role-played customer, and facilitator.
Commonly Reported Eye Tracking Advantages
Eye tracking offers unique advantages above and beyond traditional TA. Other widely known advantages include:
- A more relaxed testing environment where participants give feedback in their own time, and actually find more usability errors.
- Executives like eye tracking because it produces compelling physiological data that can't be argued with.
- Real time eye tracking data also provides for a better observation experience. I frequently find that if I am observing a participant's gaze data in real time while they complete their tasks, I am better engaged and glean more detailed insights about the user interface.
In TA, sometimes it can be very hard to see what a person is talking about during the test. I once mentioned this to a TA proponent and they suggested that if the TA is managed well it wouldn't be a problem. During the test, they would have the test facilitator ask the participant to hover their mouse over the part of the screen they are describing so that the observers can see what is being discussed. I'm sorry, but this just means the participant gets even more distracted from the task at hand.