Usability tests can be seen to fall into two general categories, based on their aim: tests which aim to find usability problems with a specific site, and tests which aim to prove or disprove a hypothesis. This test would fall into the former category. A search of the literature will reveal that tests looking to uncover specific usability problems often use a very small number of participants, coming from Nielsen’s (2000) conclusion that five users is enough to find 85 percent of all usability problems. Nielsen derived this formula from earlier work (Nielsen and Landauer, 1993). Although there is much disagreement (Spool and Schroeder, 2001), this rule of thumb has the advantage of fitting the time and money budget of many projects.
Use of Eye-Tracking Data
In terms of raw data, eye tracking produces an embarrassment of riches. A text export of one test of Mealographer yielded roughly 25 megabytes of data. There are a number of different ways eye tracking data can be interpreted, and the measures can be grouped into measures of search and measures of processing or concentration (Goldberg and Kotval, 1999):
Measures of search:
- Scan path length and duration
- Convex hull area, for example the size of a circle enclosing the scan path
- Spatial density of the scan path.
- Transition matrix, or the number of movements between two areas of interest
- Number of saccades, or sizable eye movements between fixations
- Saccadic amplitude
Measures of processing:
- Number of Fixations
- Fixation duration
- Fixation/saccade ratio
In general, longer, less direct scan paths indicate poor representation (such as bad label text) and confusing layout, and a higher number of fixations and longer fixation duration may indicate that users are having a hard time extracting the information they need (Renshaw, Finlay, Tyfa, and Ward, 2004). Usability studies employing eye tracking data may employ measures that are context-independent such as fixations, fixation durations, total dwell times, and saccadic amplitudes as well as screen position-dependent measures such as dwell time within areas of interest (Goldberg, Stimson, Lewenstein, Scott, and Wichansky, 2002).
Because of the time frame of this investigation, the nature of the study tasks, and the researcher’s inexperience with eye tracking hardware and software, eye tracking data was compiled into “heat maps” based on the number and distribution of fixations. These heat maps are interpreted as a qualitative measure.