Since its inception, eye tracking has been employed by cognitive scientists to study reading, learning, attention, and neurology; by marketers to examine the effectiveness of ad and package designs; and by human factors engineers to guide automotive and airplane cockpit design. These and other disciplines have had great success leveraging eye tracking as a behavioral research method and to inform the design of communications and interactions. Recently, as eye tracking technology has become more affordable and accessible, academics, research suppliers, and eye tracking equipment makers have been experimenting with applying eye tracking to behavioral research in new domains.
On the one hand, it makes intuitive sense that knowing what people look at (or don’t look at) on a webpage would be useful in assessing the usability and effectiveness of that page. However, as is the case with many qualitative research methods, it has proven difficult to completely validate this assumption. Even if the methodology adds value to traditional usability testing, some usability practitioners have argued that this value is not justified by the additional cost of the eye tracking equipment, software, and training. This assertion is also difficult to prove or disprove, so it’s still the subject of debate.
But the fact remains that many usability practitioners (ourselves included) are using eye tracking as a tool to better understand how people "see" websites and to make more informed design decisions as a result. Well-planned and executed eye tracking studies can supplement traditional usability research by providing information about user impressions that the test participant can’t report and the researcher can’t observe.
The equipment and technology behind eye tracking have been well explained by others, particularly the vendors and manufacturers of eye tracking equipment and software, so we won’t cover those issues here. Instead, in this article, we hope to provide some important background knowledge about the neurology behind eye-movement behavior and an explanation of when and how eye tracking can be used as an input to the design process. We don’t expect this information to alter the opinions of those who already don’t find value in eye tracking. Our goal is to better inform the debate amongst usability and design practitioners who are interested in how best to collect and analyze this additional level of insight into how users view and interact with digital media.
Many cognitive scientists believe that every moment of experience is a mental reconstruction of the world based on complex "calculations" that combine a vast amount of environmental data. The majority of information input to user experience is visual, and eye tracking provides contextually relevant information that cannot be matched by any other readily available design research method. The goal of eye tracking is not to see the world from another person's point of view or identify their precise thoughts, but instead to provide a detailed account of much of the real-time data a person uses to construct their experience from moment to moment.
Neurology of Vision
Of all the ways that humans collect information about the world around them, vision might be the most ancient, complex, and important. Receptors in the eye convert light reflected by objects in the environment into electrical energy that can be processed by the optic nerve, and eventually by the brain. The most powerful light receptors are in a central region of the eye called the fovea. To read, perceive color, recognize faces, and do almost any activity where visual detail is of primary importance, the fovea must linger, or fixate, on a relevant point within the visual field. Several times per second, the eye completes a cycle of fixation, saccade (the movement between fixations), fixation, saccade, and so on. On each fixation, the fovea gathers new detail about the visual field. Each saccade is a rapid adjustment of the fixation point so that the eye can gather data about a slightly different part of the visual field. Interestingly, a person’s eyes keep completing this cycle even when he feels he’s staring at a single point.
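The fixation and saccade cycle described above is also how raw gaze samples are usually summarized before analysis. As a minimal sketch, assuming gaze samples arrive as (time in ms, x, y) tuples in screen pixels, one widely used approach is a dispersion-threshold rule: a run of samples that stays inside a small spatial window for long enough is treated as a fixation, and the jumps between such runs are treated as saccades. The function name, the 35-pixel window, and the 100 ms minimum duration are illustrative assumptions, not values taken from this article or from any vendor's software.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical sample format: (time_ms, x_px, y_px)
Sample = Tuple[float, float, float]


@dataclass
class Fixation:
    start_ms: float
    end_ms: float
    x: float  # mean horizontal position of the samples in the fixation
    y: float  # mean vertical position of the samples in the fixation


def _dispersion(window: Sequence[Sample]) -> float:
    """Spread of a window: (max x - min x) + (max y - min y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(
    samples: Sequence[Sample],
    max_dispersion_px: float = 35.0,  # assumed threshold
    min_duration_ms: float = 100.0,   # assumed threshold
) -> List[Fixation]:
    """Group time-ordered gaze samples into fixations (dispersion-threshold rule)."""
    fixations: List[Fixation] = []
    i, n = 0, len(samples)
    while i < n:
        # Find the first sample j that makes the window span the minimum duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration_ms:
            j += 1
        if j >= n:
            break  # not enough remaining data for another fixation
        if _dispersion(samples[i:j + 1]) <= max_dispersion_px:
            # Extend the window while the samples stay close together.
            while j + 1 < n and _dispersion(samples[i:j + 2]) <= max_dispersion_px:
                j += 1
            window = samples[i:j + 1]
            fixations.append(Fixation(
                start_ms=window[0][0],
                end_ms=window[-1][0],
                x=sum(s[1] for s in window) / len(window),
                y=sum(s[2] for s in window) / len(window),
            ))
            i = j + 1
        else:
            i += 1  # the eye was moving (saccade); slide the window forward
    return fixations
```

Samples that fall between two detected fixations correspond to the saccades; the thresholds would need tuning to the tracker's sampling rate and the viewing distance.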
Once light signals have been converted into electrical signals, they become the input to activity in the visual cortex that follows two primary "pathways" through successively more complex layers of cortical tissue and several functionally specialized regions. This activity is thought to incorporate raw data from successive eye fixations into perceptions of visual cues such as movement, color, texture, and depth. In turn, these visual cues may be sent to a central region of the brain where they become part of the conscious thought stream, and are combined with other data to create "experience."
It’s important at this point to clarify that eye tracking does not directly measure parafoveal vision, which detects less detail and helps plan foveal fixations. Eye tracking also doesn’t measure peripheral vision, which does not distinguish color or fine detail, but is good for coarse detection of movement, shapes in low lighting conditions, and the context of a visual scene. And, of course, many non-visual sources of data such as sound, touch, expectation, knowledge, and past experiences, are important to constructing experience, and can even impact visual perception itself.
Attention is a psychological construct defined as an element of cognitive functioning in which mental focus is maintained on a specific issue, object, or activity. Eye movement is tightly linked to attention. To summarize the corpus of research done to date, you cannot move your eyes without moving attention, but you can move attention without moving your eyes. Even when the eyes and attention are not in sync, as soon as attention moves to a new position, the eyes will want to follow. Therefore, in the context of understanding what people are attending to on a website or application, attention and the focus of the eye are nearly indistinguishable. So the interpretation of eye tracking data must be done with an understanding that the eyes do not directly reflect attention, but they are our best window into what users are paying attention to.
The Ketchup Bottle Problem
As a real-world example (and with thanks to Jared Spool of UIE for suggesting this scenario), consider a person staring into an open refrigerator searching for the ketchup. They may be staring directly at the bottle yet, for a variety of potential reasons, might not be paying sufficient attention to it to realize that it’s there at all. Eye tracking would tell us that this person "saw" the ketchup even though they might personally report that no ketchup was present. The risk of this scenario is that eye tracking could erroneously conclude that the user was "successful" in their task of finding the ketchup when they clearly were not.
However, when combined with other data points, this apparent risk can be converted into an opportunity to derive a potentially important observation. Specifically: does the user fixate on the ketchup but not take it out of the refrigerator, or self-report that they didn’t see it? This would indicate that, in fact, they did not "see" the bottle despite its prominent location. This, in turn, would suggest that researchers need to ask what other factors could be at play that prevent the bottle from attracting attention or comprehension (e.g., the bottle is opaque, uses an unfamiliar label design, etc.).
Automatic and Controlled Visual Processing
In order to adequately handle the complexity of the world, the cognitive system performs automatic processing of select visual data. That is, visual data may be interpreted and filtered before entering the conscious thought stream. This permits the allocation of conscious thought to more complex controlled processing that is typically related to novel, difficult, and unpredictable tasks. By definition, automatic processes occur below the threshold of consciousness and often cannot be assessed by verbal report. Eye tracking observes the input to these processes before they are filtered out of the conscious thought stream, and thereby gathers insight that cannot be gathered via traditional design research methods that rely on the user reporting conscious observations.
Despite being below the level of consciousness, these insights are important to understanding user experience because human experience is influenced by all cognitive processing, even automatic processes that are not part of the conscious thought stream. Understanding how humans process information below the threshold of consciousness has been used to explain many psychological phenomena, such as subliminal effects and the cocktail effect. This understanding has also helped establish heuristics used in behavioral economics, and supported studies showing that automatically processed environmental influences can have a strong impact on people's behavior and health.
As an example of the effect of website design on visual processing tendencies, consider the concept of "banner blindness." Advertising on the Web has become so commonplace and predictable that it is common to observe users failing to "see" traditionally formatted banner advertisements, and even ignoring actual content that resembles banner ads. This might be because users automatically process banner ads and banner-ad–like elements as things they do not need to pay attention to. That is, they effectively ignore such an element by automatically processing its shape, placement, and/or other characteristics to determine whether it should be filtered out before it enters the conscious thought stream. The same thing often happens with structural elements of a site, such as the navigation bar, certain sections of a grid, or the site footer. Eye tracking can be used to understand the degree to which users focus on certain sections of the site and determine which can be redesigned to encourage their admission into the conscious thought stream.
Eye Tracking in Usability Studies
Eye tracking provides an account of where participants are looking on a screen as they attempt to explore an interface or complete a task. When used properly, it can deliver deep, unique data that can be visualized in a variety of interesting ways and mined for insights, ideas, and innovation. However, the successful use of eye tracking in design research requires an understanding of when eye tracking will and won’t add value to the research process.
One of the most valuable uses of eye tracking data is to complement other observations. In social science research, this is referred to as triangulation, and is an important hedge against the weaknesses or intrinsic biases that come from single-method, single-observer, and single-theory studies. Combined with "talk-aloud" usability testing, eye tracking can help complete the research picture by supplementing what users say with what they see. In this way, the most valuable data that can be harvested via eye tracking are those related to behavior that cannot be observed by the naked eye or accurately articulated by the average human being, but can be complemented by those kinds of observations.
Reduce Intrusions on the Respondent
Incorporating eye tracking into usability testing can also reduce the intrusive and interruptive probes that are required during "talk-aloud" usability testing. The participant needs to keep his chair still and sit up straight, but besides that his experience is not interrupted or burdened by eye tracking. In fact, the measurement of eye-movement behavior can reduce probes focused on mechanical questions such as "what are you looking at now," or "did you notice this." Such questions can distract a participant or derail his train of thought, and have been shown to increase cognitive load. By capturing eye tracking data and reviewing it after the session, you can answer many of these questions without altering participants’ behavior.
Reduce Social Desirability Effects
Social scientists have repeatedly documented the fact that participants will adjust their actions and tell "white lies" in order to please or impress a researcher during self-report research. Measuring eye-movement behavior, which participants cannot alter as easily as verbal or survey responses, is one way to control the influence of social desirability. In addition, eye tracking data can be used to verify whether participants' self-reports, such as what they looked at first, were accurate.
Focus on the Participant
Many participants are reluctant to focus on themselves during an interview. It is common for participants to give responses that project feelings and behavior onto others, or to answer questions with generalizations or speculation about how others might answer. The use of eye tracking can refocus the conversation on the one person the participant can really speak for: himself.
High Face Validity
On its surface, or "its face," the focus of a participant’s gaze on an interface seems like it tells you something about their experience. Eye tracking has extremely strong face validity because the focus of the eye, the location of the mouse, and/or the subject of conversation are often in harmony. While face validity is not a sufficient condition for making causal links between hard data and underlying behavioral phenomena, it is a sufficient condition for adding depth to qualitative observations.
A glance at a call-to-action when beginning a task, a brief distraction before completing a task, or a frantic search for key information can unveil behavioral subtleties that are fundamental to understanding experience. Unfortunately, many of these valuable insights can be overlooked in traditional usability testing and other observational methods. Because eye tracking data and deliverables look like they represent what we want to measure, and drawing a direct link from what you can observe to what people are doing is intuitive and straightforward, eye tracking ensures that these insights are not easily missed.
Eye tracking is a tremendous tool for getting non-usability-experts excited about usability testing due to the simplicity of drawing conclusions, the ability to have real-time data observation, and the compelling visual presentation of results. All observers—including executives, engineers, product managers, creative teams, and other important contributors—can get more involved in usability testing and derive more value from the process.
For Behavior, Not Feelings
As a general rule, when the key research questions are about feelings and attitudes, eye tracking must be complemented by other data. There is no straightforward connection between eye-movement behavior and engagement, interest, understanding, frustration, or confusion. This is because eye tracking cannot tell you what people think; it can only provide insight into what people are doing.
Analysis of Eye Tracking Data
Just by observing eye tracking data as it is being collected, designers and other stakeholders can gain unique and valuable insights. But a well-planned, well-executed, and properly analyzed study can also provide several valuable points of reference to understanding user experience.
Eye tracking research reports typically include visual presentations that make it easier to understand the study results, such as:
- Heat Maps to show where on the interface participants focus.
- Time to 1st Fixation to explain how long it takes for participants to focus on specific locations of interest.
- Dwell Time to provide detail about which visual elements occupy participants' time and attention.
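To make these three measures concrete, here is a minimal sketch of how they might be computed once gaze data has been reduced to fixations. Everything named here is an assumption for illustration: the `Fixation` and `AOI` (area of interest) records, the rectangular region, and the coarse grid standing in for a heat map. Dwell time is simplified to the sum of fixation durations inside the region; analysis packages may define it differently.

```python
import math
from typing import List, NamedTuple, Optional, Sequence


class Fixation(NamedTuple):
    start_ms: float
    end_ms: float
    x: float
    y: float


class AOI(NamedTuple):
    """Hypothetical rectangular area of interest, in screen pixels."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fx: Fixation) -> bool:
        return self.left <= fx.x <= self.right and self.top <= fx.y <= self.bottom


def time_to_first_fixation(
    fixations: Sequence[Fixation], aoi: AOI, onset_ms: float = 0.0
) -> Optional[float]:
    """Time from stimulus onset to the first fixation inside the AOI, or None."""
    for fx in fixations:  # fixations are assumed to be in time order
        if aoi.contains(fx):
            return fx.start_ms - onset_ms
    return None


def dwell_time(fixations: Sequence[Fixation], aoi: AOI) -> float:
    """Total fixation duration inside the AOI (a simplified dwell time)."""
    return sum(fx.end_ms - fx.start_ms for fx in fixations if aoi.contains(fx))


def heat_grid(
    fixations: Sequence[Fixation], width_px: int, height_px: int, cell_px: int = 100
) -> List[List[float]]:
    """Duration-weighted grid: each cell holds the total fixation time it received."""
    cols = math.ceil(width_px / cell_px)
    rows = math.ceil(height_px / cell_px)
    grid = [[0.0] * cols for _ in range(rows)]
    for fx in fixations:
        if 0 <= fx.x < width_px and 0 <= fx.y < height_px:
            grid[int(fx.y // cell_px)][int(fx.x // cell_px)] += fx.end_ms - fx.start_ms
    return grid
```

A rendered heat map would typically smooth and color such a grid; the raw totals are enough to compare regions.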
Precise Time Measurement
The mechanics of eye tracking measurement demand that the equipment be sophisticated enough to measure time on the order of milliseconds. This has the fortunate side effect of enabling researchers to observe behavior that is of such a short duration that it cannot be measured with traditional design research methods.
- Viewing Time: Eye tracking can provide measures of time spent viewing a page or screen as well as regions of interest (any region of the screen, large or small, that can be specified before or after research is conducted). Traditional usability tests and website analytics provide gross measures of time spent on a page or screen, but cannot provide as much detail as eye tracking. Eye tracking data can be used to compare how long people spend viewing specific content, pictures, navigation elements, menu items, and empty space. Eye tracking researchers use these measures to assess business goals, understand where the user is spending their time, and inform the quality of design.
- Time on Task: While traditional user research methods focus on general measures of "time on task," eye tracking can provide detail about the time participants devote to each step in a task. For instance, eye tracking studies can measure how long it takes a user to register after landing on a registration page, as can some traditional methods. But eye tracking can also unpack details related to whether users took the time to view other parts of the page before they began registration, and how long they spent filling out individual parts of the registration form. With this valuable data, we can target redesign efforts on the largest opportunities for improvement in any task, simple or complex.
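The registration example can be sketched as follows. This assumes each fixation has already been labeled with the name of the region it landed on; the region names and the tuple layout are hypothetical. Consecutive fixations on the same region are collapsed into a single "visit," which gives an ordered, step-by-step account of where time went, and a second helper reports how long a participant looked elsewhere before first reaching a given region.

```python
from itertools import groupby
from typing import List, Optional, Sequence, Tuple

# Hypothetical labeled fixation: (region_name, start_ms, end_ms), in time order
LabeledFixation = Tuple[str, float, float]


def visits(fixations: Sequence[LabeledFixation]) -> List[LabeledFixation]:
    """Collapse consecutive fixations on the same region into one visit."""
    result: List[LabeledFixation] = []
    for name, group in groupby(fixations, key=lambda f: f[0]):
        run = list(group)
        # A visit runs from the start of its first fixation to the end of its last.
        result.append((name, run[0][1], run[-1][2]))
    return result


def time_before_first(
    fixations: Sequence[LabeledFixation], target: str
) -> Optional[float]:
    """Time from the first recorded fixation until the target region is first fixated."""
    for name, start_ms, _end_ms in fixations:
        if name == target:
            return start_ms - fixations[0][1]
    return None


session = [
    ("hero_image", 0, 300),
    ("hero_image", 320, 600),
    ("navigation", 650, 800),
    ("email_field", 900, 1500),
    ("email_field", 1520, 2400),
    ("password_field", 2450, 3900),
]

for name, start_ms, end_ms in visits(session):
    print(f"{name}: {end_ms - start_ms} ms")
print("before starting the form:", time_before_first(session, "email_field"), "ms")
```

Because a visit spans from its first fixation to its last, it includes the short saccades made inside the region, which is usually what a "time on this step" question intends.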
Precise time measurements can provide data that support and/or prioritize design decisions aimed at improving user experience. Every interaction you expect from your user can be broken down into components that can be optimized for time, engagement, and delight. Luke Wroblewski cites an eye tracking study by Matteo Penzo from July 2006 in his book Web Application Form Design (2008) as one of the reasons he recommends top-aligned form labels. The study showed that this alignment reduces the time to move between input label and input field to 50 milliseconds, one-tenth the time it takes with left-aligned labels, and one-half the time it takes with right-aligned labels. These data were used to support his design recommendations and continue to provide evidence of one impact that form label alignment has on user experience.
When people see a new screen, they quickly scan it to assess its purpose and make judgments about what it can provide them. During the course of a usability test, users may be asked to find information or complete a task, and they often have immediate and instinctive reactions that can reveal a lot about what they expect from an interface. Even though users are notoriously unable to report their own visual scanning patterns, eye tracking gives precise and unequivocal information about these otherwise hard-to-elicit details of a user's experience.
However fleeting a user's first glance, it is a crucial indicator of what visual elements dominate an interface, and has a great influence on important judgments that users make. Users' memory of their first impression may fade in mere moments, and users may report that they first looked at an element that their attention settled upon rather quickly but not first (perhaps a part of the interface they believe must be important to the interviewer, or a visual feature they find fascinating). Eye tracking can be an easy and indisputable method for determining the subject of a user’s first glance without changing the context of interaction. Other methods, such as the "Five Second Test," can be used to assess what a user sees in the first moments of visiting a page, but they add an unnecessary disruption, unnatural pressure on the participant, and a recall step that may be irrelevant.
The influence design has on visual behavior can be delicate, and subtle changes can have unexpected outcomes. Eye tracking can be used to gather the data necessary to make those observations, and capitalize on opportunities or avoid pitfalls. In a popular blog post, James Breeze used eye tracking to demonstrate that the images of faces that capture people’s attention can be used to guide people around a website or an ad. He showed people two versions of an ad, one with a baby looking right back at the observer and one with the baby looking at the headline of the ad. Eye tracking results showed that people looked at the headline faster and focused on more of the copy when the baby was looking at the headline.
Psychological research shows that many visual factors (including layout, colors, relative orientation, and movement) converge to influence what regions and features of the visual field are most salient. In a complex interface, many of these visual factors may be in play, and because of the complex influence the entire visual array can have on visual salience, making decisions based on assumptions or relying on participants' self-reports can be risky. Eye tracking records where a user looks, which helps in understanding what parts of the design are most salient, and determining how best to guide users' eyes to crucial content.
Eye tracking research can give good quality data regarding ease of discovery and help clarify its impact on user experience. If a user is able to find what he is looking for in the 2nd or 3rd place he looks, he may not even realize that he actually failed to find the information right away. Even if the participant does realize what he has done, he may be reluctant to disappoint the experimenter by admitting it. Hard data about whether information is located where users look first can provide opportunities for usability enhancements.
Capture More Data
Talk-aloud interview methods are very useful for understanding what a participant is thinking and feeling while interacting with a website or application. However, because an interview is a conversation that can only move so fast, talk-aloud protocols may gloss over, or entirely miss, some details about the user's experience. Even while talking about one part of an interface, a participant may be exploring and forming judgments about another. As the interview progresses, he may never have the opportunity to address those thoughts. Using clarifying probes or returning to an earlier comment might solicit a response that does not reflect reality and, more importantly, is likely to influence behavior over the rest of the interview. When analyzing eye tracking data, practitioners look for patterns to frame, guide, and supplement analysis of interview data.
Noticing But Not Noticing
Users may automatically scan the basic content of your site (menu options, headlines, logos, images, etc.) but not mention them. Traditional user research methods might lead you to believe that they did not see this content because they did not mention it, leaving you worried about the quality of your design. However, eye tracking can be used to determine whether they looked at specific content or not. If they never focused on it, you can assume that they did not examine it in great detail, and you can be sure they did not read it. If they did focus on it, but did not mention it, you might conclude that this was automatic behavior or that they did not consider it important in the context of their use.
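The reasoning in this section amounts to cross-checking two sources: what the eye tracker recorded and what the participant said. A minimal sketch of that cross-check follows, assuming the researcher has already produced two sets of element names, one for elements that received at least one fixation and one for elements the participant mentioned. The function name and labels are illustrative; the third case (mentioned without a recorded fixation) is not discussed in the text and is flagged as an assumption.

```python
from typing import Set


def interpret(element: str, fixated: Set[str], mentioned: Set[str]) -> str:
    """Combine eye tracking and talk-aloud evidence for one interface element."""
    looked = element in fixated
    said = element in mentioned
    if looked and said:
        return "looked at and mentioned"
    if looked:
        # Fixated but never discussed: the two readings offered in the text.
        return "looked at but not mentioned: possibly automatic, or judged unimportant"
    if said:
        # Assumption, not from the text: worth a follow-up question.
        return "mentioned without a recorded fixation: follow up"
    return "never fixated: not examined in detail and not read"


fixated = {"logo", "headline", "main_menu"}
mentioned = {"headline", "search_box"}

for element in ["headline", "main_menu", "search_box", "footer_links"]:
    print(f"{element}: {interpret(element, fixated, mentioned)}")
```

The labels are starting points for analysis rather than conclusions; as the ketchup bottle example shows, a fixation alone does not prove that something was noticed.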
Familiar words and phrases are often read holistically, not viewed one letter or syllable at a time, and specific eye-movements called regressions (going back and forth rather than a smooth flow) are related to reading ambiguous or confusing text. This means that one fixation can be adequate to read a familiar label and smooth progression over a passage of text may indicate understanding, but repeated or erratic fixations on content may indicate confusion or a lack of recognition. Subtle differences in clarity can be observed and optimized through eye tracking when participants might not feel the slight confusion was worth noting.
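As a rough illustration of how regressions might be counted, the sketch below assumes a single line of left-to-right text and a list of horizontal fixation positions in the order they occurred. A regression is counted whenever the eye jumps backward by more than a small tolerance. The function name and the 10-pixel tolerance are assumptions for illustration; multi-line text would need extra handling, since the return sweep to the start of the next line also moves leftward.

```python
from typing import Sequence


def count_regressions(fixation_xs: Sequence[float], min_jump_px: float = 10.0) -> int:
    """Count backward jumps between consecutive fixations on one line of text."""
    return sum(
        1
        for previous, current in zip(fixation_xs, fixation_xs[1:])
        if previous - current > min_jump_px
    )


smooth_reading = [100, 180, 260, 340, 420]
hesitant_reading = [100, 180, 150, 230, 120, 200]

print(count_regressions(smooth_reading))    # no backward jumps
print(count_regressions(hesitant_reading))  # two backward jumps
```

Comparing the count for two versions of the same label or sentence is one way to spot wording that makes readers double back.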
Though it shares some of the precepts and methodologies of many established disciplines, Web usability is still a relatively new field. For it to remain relevant, it must evolve as rapidly as the experiences it is designed to assess. Regarding the rightful role of eye tracking in the toolkit of Web usability methodologies, many overreaching claims and assertions have been (and will continue to be) made. However, I hope that this article has at least clarified that there is a rational and scientific basis for the application of eye tracking techniques to Web usability such that the usability and UX communities will remain open to a continued exploration of these opportunities.
 In his book Consciousness Explained, Daniel Dennett describes the "multiple drafts" model of consciousness: experience, which is a subjective, malleable, personal narrative, is composed of sensory inputs that arrive in the brain and are interpreted at several levels of un- or sub-consciousness.
 Parker, Andrew (2003). In the Blink of an Eye: How Vision Sparked the Big Bang of Evolution. Cambridge, MA: Perseus Pub.
 For a detailed account, see Schiffman, H.R. (1996) Sensation and Perception: An Integrated Approach.
 See Hoffman, J. E. (1998). Visual attention and eye movements. In H. Pashler (ed.), Attention (pp. 119-154). Hove, UK: Psychology Press. for an in-depth review.
 Schneider & Shiffrin (1977) and Shiffrin & Schneider (1977) introduced this account of attention to Psychology. Cognitive tasks all fall along a continuum of automatic and controlled processing. All processing affects behavior, but the more automatic the process, the less it requires attention.
 E.g., scowling and smiling faces flashed so fast subjects were unaware of their presence still influenced English speakers' good vs. bad judgments of unrelated Chinese symbols (Murphy and Zajonc, 1993).
 Cherry's (1953) Cocktail Effect demonstrates that people can monitor input without attending it. For instance, at a noisy cocktail party, people can ignore surrounding conversations to maintain their own, but if someone on the other side of the party room says their name, they notice and can respond immediately.
 For instance, Maps of Bounded Rationality: Psychology for Behavioral Economics by Daniel Kahneman (2003) describes the framing effect as partly due to making the "perception of select attributes automatic."
 Denzin, N. (2006). Sociological Methods: A Sourcebook.
 This was one motivation behind the development of the Post Experience Eye tracking Protocol (PEEP), presented by Rob Stevens at UPA 2006.
 Virtually all types of self-report data have been found to be systematically biased toward the respondents’ perception of what is correct or socially acceptable. See Fischer (1993), Social desirability bias and the validity of indirect questioning.
 See Is there a place for this in measurement? Roberts, 2000. OR Face Validity in Psychological Assessment: Implications for a Unified Model of Validity, Bornstein, 1996
 The inspiration behind the 5-second test. See http://www.uie.com/articles/five_second_test/
 You look where they look, captured from http://usableworld.com.au/2009/03/16/you-look-where-they-look/
 See Wolfe and Horowitz (2004), What attributes guide the deployment of visual attention and how do they do it?
 See, for example, Catalyst Group’s eye tracking study of friends lists on social network sites: http://www.slideshare.net/JanineCoover/catalystgroupfriendslisteyetracking-1503528
This is the second article in a series we're running on eye tracking. The first article, Eye Tracking: Best Way to Test Rich App Usability, ran two weeks ago.