Research results on the accessibility evaluation method called Barrier Walkthrough

(this page was written by Giorgio Brajnik in 2009)

Goal of this research

The goal of this research is to study the effectiveness of methods that can be used to assess the accessibility of web sites and applications.

In particular, a method called "Barrier Walkthrough" was devised, proposed to a group of students, and experimentally evaluated in order to compare its effectiveness with that of the more common method called "Conformance Testing".

The barrier walkthrough method aims at determining the level of accessibility of a web site, defined as:

  • web sites are accessible when individuals with impairments can access and use them as effectively and securely as people who are not impaired (Slatin and Rush, 2003)

One underlying hypothesis is that web accessibility is currently poorly achieved in part because it can only be poorly tested and measured. More knowledge about the different methods, and about their strengths and weaknesses, can change this.

In general, comparison of methods should be aimed at understanding their validity, usefulness, reliability and efficiency, which are defined as:

  • validity: the extent to which applying the method finds the accessibility defects in a website
  • usefulness: the extent to which the method yields data that are useful to developers and managers
  • reliability: the extent to which the method yields similar results when applied to the same input by different evaluators
  • efficiency: the resources expended to apply the method

Outline of the 2008 experimentation

More details are available from the paper A Comparative Test of Web Accessibility Evaluation Methods.

The goal is to formally compare the effectiveness and reliability of Barrier Walkthrough (BW) against Conformance Review (CR), using the technical requirements entailed by the Italian Accessibility Law.

An experiment was set up in which 12 of my students (who had attended a series of lectures on web accessibility, including BW) each evaluated two pages using BW and another two pages using CR. Websites, methods and order were randomized to counterbalance fatigue and learning effects.

The data from the experiment are available as a compressed tar file (for other researchers who might be interested in further studies).
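
As an illustration of the kind of randomization described above, the following is a minimal sketch and not the procedure actually used in the experiment. It assumes four hypothetical page identifiers, assumes that each evaluator applies one method to two pages and then the other method to the remaining two, and uses a hypothetical function name, assign_conditions.

```python
import random

METHODS = ("BW", "CR")


def assign_conditions(evaluators, pages, seed=None):
    """Return {evaluator: [(page, method), ...]} listed in evaluation order.

    Assumption (not from the original study): each evaluator sees four
    distinct pages, two per method, with the two methods applied in blocks.
    """
    rng = random.Random(seed)  # seeded generator makes the plan reproducible
    plan = {}
    for evaluator in evaluators:
        # Four distinct pages in random order.
        chosen = rng.sample(pages, k=4)
        # Random order of the two methods for this evaluator.
        first, second = rng.sample(METHODS, k=2)
        plan[evaluator] = [
            (chosen[0], first),
            (chosen[1], first),
            (chosen[2], second),
            (chosen[3], second),
        ]
    return plan


# Hypothetical identifiers, for illustration only.
students = [f"student{i}" for i in range(1, 13)]
pages = ["pageA", "pageB", "pageC", "pageD"]
plan = assign_conditions(students, pages, seed=42)
for student, schedule in plan.items():
    print(student, schedule)
```

Simple randomization like this does not guarantee that exactly half of the evaluators start with each method; a stricter counterbalanced design would enforce that balance explicitly.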

Summary of 2008 results

We found significant differences on the following measures, except where noted (please refer to the paper mentioned above for details):

  • reliability: 0.62 for BW vs 0.69 for CR (10% worse for BW)
  • agreement: Cramer's Phi and ICC are higher for BW (0.44 vs 0.34; 0.51 vs 0.35)
  • correctly identified problems: more with BW (13.7 vs 7.3)
  • correctness: improves by 9%-60% with BW
  • no significant difference on sensitivity, F-measure, time, or time compensated for F-measure.

Outline of the 2006 experimentation

More details are available from the paper Web Accessibility Testing: When the Method is the Culprit.

Nineteen accessibility reports produced by student teams were analyzed; 11 were based on barrier walkthrough. A judge (myself) went through the reports, identified the reported problems and classified them as either true problems or false ones (i.e. mistakes made by students). The judge also rated the severity of each problem on a 1-2-3 scale (3 being the most severe).

Data were analyzed by computing:

  • Precision (P): the percentage of found problems that are true problems
  • Sensitivity (S): the percentage of true problems that are found
  • Fallout (F): the percentage of false problems being reported
  • E-measure: P*S/(cP + (1-c)S) [with c=0.5 this becomes 2*P*S/(P+S)]
  • mean severity
  • number of problems with severity=3.

These variables are defined as:

Precision = |X|/|Reported|
Sensitivity = |X|/|True|
Fallout = |A|/|False|
X_3 = reported and true problems with severity=3
P_3 = |X_3|/|Reported|
S_3 = |X_3|/|Reported_3|
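where X, Y, A and B are the cells of the contingency table:
X = true problems that were reported
Y = true problems that were not reported
A = false problems that were reported
B = false problems that were not reported
so that |Reported| = |X| + |A|, |True| = |X| + |Y| and |False| = |A| + |B|.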
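
The definitions above can be sketched in a few lines of Python. This is a minimal sketch and not the analysis code used in the study. It assumes that |Reported| = |X| + |A|, |True| = |X| + |Y| and |False| = |A| + |B|, as implied by the contingency table. The class name, the function names and the counts in the example are hypothetical, and the functions assume non-zero denominators.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    """Cell counts of the contingency table for one report."""
    x: int  # true problems that were reported
    y: int  # true problems that were not reported
    a: int  # false problems that were reported
    b: int  # false problems that were not reported


def precision(t: Contingency) -> float:
    # |X| / |Reported|, with Reported = X + A
    return t.x / (t.x + t.a)


def sensitivity(t: Contingency) -> float:
    # |X| / |True|, with True = X + Y
    return t.x / (t.x + t.y)


def fallout(t: Contingency) -> float:
    # |A| / |False|, with False = A + B
    return t.a / (t.a + t.b)


def e_measure(p: float, s: float, c: float = 0.5) -> float:
    # P*S / (c*P + (1-c)*S); with c = 0.5 this equals 2*P*S / (P+S)
    return p * s / (c * p + (1 - c) * s)


# Hypothetical counts, for illustration only (not data from the study).
report = Contingency(x=12, y=8, a=3, b=27)
p = precision(report)    # 12 / 15 = 0.8
s = sensitivity(report)  # 12 / 20 = 0.6
f = fallout(report)      # 3 / 30 = 0.1
print(p, s, f, round(e_measure(p, s), 3))
```

The severity-specific variants (P_3, S_3) would follow the same pattern, using counts restricted to problems with severity=3.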

Summary of 2006 results

With respect to novice evaluators, and the limited sample:

  • barrier walkthrough is more effective in finding severe problems
  • barrier walkthrough is less effective in finding all the potential accessibility problems
  • barrier walkthrough is more effective in reducing false positives
  • barrier walkthrough is more effective as a teaching mechanism.