Benchmarking web accessibility evaluation tools: measuring the harm of sole reliance on automated tests

Title: Benchmarking web accessibility evaluation tools: measuring the harm of sole reliance on automated tests
Publication Type: Conference Proceedings
Year of Publication: 2013
Authors: Vigo, Markel; Brown, Justin; Conway, Vivienne
Conference Name: Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility
Keywords: Accessibility, Evaluation, Testing, WCAG
Abstract: The use of web accessibility evaluation tools is a widespread practice. Evaluation tools are heavily employed as they help in reducing the burden of identifying accessibility barriers. However, an over-reliance on automated tests often leads to setting aside further testing that entails expert evaluation and user tests. In this paper we empirically show the capabilities of current automated evaluation tools. To do so, we investigate the effectiveness of 6 state-of-the-art tools by analysing their coverage, completeness and correctness with regard to WCAG 2.0 conformance. We corroborate that relying on automated tests alone has negative effects and can have undesirable consequences. Coverage is very narrow as, at most, 50% of the success criteria are covered. Similarly, completeness ranges between 14% and 38%; however, some of the tools that exhibit higher completeness scores produce lower correctness scores (66-71%) due to the fact that catching as many violations as possible can lead to an increase in false positives. Therefore, relying on just automated tests entails that 1 of 2 success criteria will not even be analysed and, among those analysed, only 4 out of 10 will be caught, at the further risk of generating false positives.
Notes: This article reports a benchmarking study of web accessibility evaluation tools, assessing their coverage, completeness, and correctness against the WCAG 2.0 success criteria. Results indicated that heavy reliance on these automated evaluation tools could mislead developers about the actual level of web accessibility. It could therefore be beneficial to use multiple tools in evaluation, and future work should continue exploring other ways to improve the efficacy of these tools.
DOI: 10.1145/2461121.2461124
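
The completeness and correctness figures in the abstract behave like recall and precision over violation reports. The sketch below is a minimal Python illustration with made-up counts, not the authors' actual data or formulas, assuming precision/recall-style definitions; it shows how a tool that catches more violations (higher completeness) can still score lower on correctness once false positives grow.

```python
# Hypothetical illustration of the three metrics named in the abstract.
# The paper's exact formulas may differ; these assume precision/recall-style
# definitions over WCAG 2.0 success criteria (SC) and violation reports.

def coverage(criteria_checked: int, total_criteria: int = 61) -> float:
    """Share of WCAG 2.0 SC the tool tests at all (61 A/AA/AAA criteria
    is an assumption used only for illustration)."""
    return criteria_checked / total_criteria

def completeness(true_positives: int, actual_violations: int) -> float:
    """Share of the violations actually present that the tool reports."""
    return true_positives / actual_violations

def correctness(true_positives: int, reported_violations: int) -> float:
    """Share of the tool's reports that are genuine violations."""
    return true_positives / reported_violations

# Made-up counts mirroring the trade-off described in the abstract:
# an aggressive tool flags more real violations but also more false positives.
tools = {
    "conservative": dict(tp=14, actual=100, reported=16),
    "aggressive":   dict(tp=38, actual=100, reported=55),
}

for name, t in tools.items():
    print(name,
          f"completeness={completeness(t['tp'], t['actual']):.0%}",
          f"correctness={correctness(t['tp'], t['reported']):.0%}")
```

With these invented numbers the conservative tool scores 14% completeness but about 88% correctness, while the aggressive tool reaches 38% completeness at roughly 69% correctness, which is the same qualitative pattern the abstract reports.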