
Methodology of Usability Testing


The usability tests determine whether the disclosure information provided on the web by state disclosure agencies is accessible to the average citizen. To do this, an experiment was designed to answer the following question: "Can a non-expert find basic, informative data about campaign finances on the Internet in his or her state without undue difficulty or investment of time?"

Most usability tests compare a handful of web sites and are concerned with minor differences among them (see Steve Krug's "Don't Make Me Think" (2000)). Web site designers might be concerned about the location of a task bar on a web page or the use of drop-down menus. They hire testers to sit in front of computers and perform simple tasks, and the designers watch how the testers navigate around the site. The Grading State Disclosure usability test is different: its goal is to identify major differences, not minor ones. Dozens of interfaces were compared across 50 states, and the test measured whether the overall design of a state's web site, from architecture to jargon to database, facilitated access to information by the average voter. The two types of testing do share a common trait, however: in both, the goal is not to determine which design is optimal, but rather to rank the designs from best to worst.

Two standard measures of usability were used. The first was a degree-of-difficulty measure, on the assumption that difficulty and accessibility are inversely related. Subjects were given three tasks to perform, and the test measured the time and number of mouseclicks it took to perform each task. The three relatively simple tasks were devised, after some experimentation, to represent the minimum any citizen should expect from a campaign disclosure site. Subjects were asked to: (1) locate the state’s disclosure web site starting from the state’s homepage; (2) ascertain the total contributions received by the incumbent governor in his or her last campaign (subjects were given a list of incumbent governors that included the year they were last elected); and (3) provide the name of, and amount contributed by, any individual contributor to the incumbent governor’s last campaign.
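The degree-of-difficulty data can be pictured as a set of per-task records, one per subject, state, and task. The sketch below is illustrative only (the record and function names are not from the study's actual instruments); it tallies raw, unadjusted averages of time and mouseclicks by state for one task:

```python
from dataclasses import dataclass

# Hypothetical record of one subject's attempt at one task on one state's site.
@dataclass
class TaskResult:
    subject: int      # subject identifier
    state: str        # state whose disclosure site was tested
    task: int         # 1 = find site, 2 = governor's total, 3 = individual contributor
    seconds: float    # time taken to complete the task
    clicks: int       # number of mouseclicks used

def average_by_state(results, task):
    """Raw (unadjusted) mean time and clicks for one task, grouped by state."""
    by_state = {}
    for r in results:
        if r.task == task:
            by_state.setdefault(r.state, []).append(r)
    return {
        s: (sum(r.seconds for r in rs) / len(rs), sum(r.clicks for r in rs) / len(rs))
        for s, rs in by_state.items()
    }
```

These raw averages are only a starting point; as described below, the study adjusted them for subject and ordering effects before scoring.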

The second measure of usability was a survey. After the third task was completed, each subject was given a short questionnaire and asked to evaluate his or her experience on each state’s web site. Subjects were asked whether the web site’s disclosure terminology was understandable, to rate their level of confidence in their answers, and to provide a rating (one to five) of their overall experience on the site. Subjects were also asked if any special software or unusual browser plug-ins were required to access the site’s disclosure information.

Subjects were recruited from the undergraduate student population at the University of California, Los Angeles, and the experiments were conducted at the California Social Science Experimental Laboratory (CASSEL) at UCLA. The experiment was administered six times, to ten different students each time, and six different students tested each state. States were assigned to students randomly, with each student assigned five states. Limits were imposed on the amount of time a subject could spend on each state, but each subject was given no fewer than 20 minutes to complete the three tasks for each state. Each experiment lasted no longer than 120 minutes, and some subjects finished within 60 minutes.
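One way to realize this balanced assignment (a sketch under assumptions, not necessarily the lab's actual procedure): in each of the six sessions, shuffle the 50 states and deal five to each of that session's ten subjects. Every state is then tested exactly once per session, each subject receives five distinct states, and each state is tested by six different subjects overall:

```python
import random

# Placeholder state identifiers; the study used the 50 actual states.
STATES = [f"state_{i:02d}" for i in range(50)]

def assign_states(n_sessions=6, subjects_per_session=10, states_per_subject=5, seed=0):
    """Return {subject_id: [states]}, dealing each session's shuffled deck of
    50 states out five at a time, so every state appears once per session."""
    rng = random.Random(seed)
    assignments = {}
    for session in range(n_sessions):
        deck = STATES[:]
        rng.shuffle(deck)
        for i in range(subjects_per_session):
            subject_id = session * subjects_per_session + i
            assignments[subject_id] = deck[i * states_per_subject:
                                           (i + 1) * states_per_subject]
    return assignments
```

Dealing from one shuffled deck per session guarantees the balance properties without any rejection sampling, because 10 subjects x 5 states exactly exhausts the 50-state deck.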

There were two concerns about the time and mouseclicks data that were collected: first, subjects could be expected to learn during the experiment and be more proficient with the later states than with the earlier ones; second, there might be subject effects (level of competency, prior experience with disclosure web sites, etc.). To address these issues, a fixed-effects ordinary least squares model was constructed to control for subject differences, and a variable was included to control for the order in which each state was tested by the subject. With these controls in place, each state’s average time and number of mouseclicks was estimated for each of the three tasks. In a departure from prior years, the 2008 scores were estimated using data from both the 2008 and 2007 tests. This was done to better account for between-year improvements within states, and to avoid penalizing states that had adequate systems in 2007 that were unchanged in 2008. The effect was to estimate scores for 100 state disclosure web sites, 50 in 2007 and 50 in 2008, as if they were independent of each other. These scores were then combined into two separate indices and ranked. The survey data were also combined into a single index and ranked.
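A fixed-effects model of this kind can be sketched as ordinary least squares on dummy variables (an illustration under assumptions; the project's exact specification is not reproduced here). Each observation's time is regressed on state dummies, subject dummies (with one subject dropped as the baseline), and the order in which that subject tested the state:

```python
import numpy as np

def state_effects(times, state_idx, subject_idx, order, n_states, n_subjects):
    """Estimate per-state adjusted times via OLS with subject fixed effects
    and a control for test order. Returns one coefficient per state,
    measured relative to the baseline subject at order zero."""
    n = len(times)
    # Columns: n_states state dummies, n_subjects-1 subject dummies
    # (subject 0 dropped as baseline), and one order covariate.
    X = np.zeros((n, n_states + n_subjects - 1 + 1))
    X[np.arange(n), state_idx] = 1.0
    for i, s in enumerate(subject_idx):
        if s > 0:
            X[i, n_states + s - 1] = 1.0
    X[:, -1] = order
    beta, *_ = np.linalg.lstsq(X, np.asarray(times, dtype=float), rcond=None)
    return beta[:n_states]
```

Because the state dummies absorb the intercept, the subject and order coefficients net out each tester's speed and any learning trend, leaving comparable adjusted levels for the states.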

Each state could receive up to a total of 27 points for the usability test score. The distribution of scores in the three separate indices (time, clicks, and survey) was examined, and points were assigned based upon the apparent thresholds in the distributions, using the 2007 breakdowns as the scoring baseline. For each of the time and clicks indices, the top-ranked states received six points, the medium states received three points, and the lowest-ranked states received zero points. The remaining 15 points were assigned according to the survey responses, with a maximum of 15 and a minimum of three points assigned to each state. These three scores were then added together to create the state’s usability test score.
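The scoring rule above can be expressed as a small function. This is a sketch: the tier labels are placeholders, and the actual cutoffs came from apparent thresholds in the 2007 distributions, which are not reproduced here:

```python
def usability_score(time_tier, clicks_tier, survey_points):
    """Combine the three indices into a usability score (maximum 27).

    time_tier / clicks_tier: 'top', 'medium', or 'bottom', assigned from
    thresholds in each index's distribution (placeholder labels).
    survey_points: 3 to 15, assigned from the survey index.
    """
    tier_points = {"top": 6, "medium": 3, "bottom": 0}
    if not 3 <= survey_points <= 15:
        raise ValueError("survey points range from 3 to 15")
    return tier_points[time_tier] + tier_points[clicks_tier] + survey_points
```

A state in the top tier on both indices with a full survey score reaches the 27-point maximum; a bottom-tier state with the minimum survey score receives 3 points.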



First published September 17, 2008 | Last updated September 17, 2008
Copyright © Campaign Disclosure Project. All rights reserved.