Tuesday, April 12, 2011

Paper Reading #21

Comments:
Comment #1
Comment #2
Reference:

Title: Automatically Identifying Targets Users Interact with During Real World Tasks
Authors: Amy Hurst, Scott E. Hudson, and Jennifer Mankoff
Venue: IUI 2010
Summary:


The system described in this paper improves on Accessibility APIs. Accessibility APIs provide information about the size and location of targets, which helps in analyzing the effectiveness and usability of real-world software. However, much of the analysis done with these APIs happens in controlled environments, and as a result many practical real-world targets go untested. Some of these missed targets appear in frequently used software such as Microsoft Outlook. That is where the technique in this paper steps in.


The solution in this paper can be used with any application "because it leverages visual cues that are ubiquitous across interfaces." The system uses a combination of computer vision, machine learning, input event data, and accessibility data. Targets are defined as "interactive elements that the user clicks on." The first-level recognizers, each of which provides a hypothesized target and location, are an Accessibility API Recognizer, a Difference Image Recognizer, a Color Matching Recognizer, and a Template Matching Recognizer; these are described in great detail in the paper. A sketch of the difference-image idea follows below.
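To make the recognizer idea concrete, here is a minimal Python sketch of a difference-image recognizer: it diffs screenshots taken just before and just after a click and proposes the changed region as the target hypothesis. The function name, arguments, and use of NumPy are my own illustration under stated assumptions, not the paper's actual implementation.

    import numpy as np

    def difference_image_target(before, after, click_xy, max_size=300):
        # before/after: HxWx3 uint8 screenshots captured around a click
        # (assumed format); click_xy: (x, y) of the click, taken from the
        # input event stream.
        changed = np.any(before != after, axis=-1)  # pixels that changed
        ys, xs = np.nonzero(changed)
        if xs.size == 0:
            return None  # no visual feedback; defer to other recognizers
        x0, x1 = xs.min(), xs.max()  # bounding box of the changed region
        y0, y1 = ys.min(), ys.max()
        cx, cy = click_xy
        # Reject implausible hypotheses: larger than the capture limit
        # mentioned below, or not containing the click point at all.
        if x1 - x0 > max_size or y1 - y0 > max_size:
            return None
        if not (x0 <= cx <= x1 and y0 <= cy <= y1):
            return None
        return (x0, y0, x1, y1)  # hypothesized target bounding box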

This technique successfully detected targets from a dataset of 1,355 targets with 89% accuracy, while only 74% of those 1,355 targets could be detected using standard accessibility APIs. The group's hybrid technique was on average 7.2 pixels closer to the actual size of the targets than the Accessibility API alone; those 7.2 pixels represent 19.6% of the average height of the targets used.

One limitation of the current implementation is that it can only capture images smaller than 300x300 pixels, but the group does not consider this a serious limitation since larger items are easier to select. Another limitation is that the system currently runs only on Windows because it uses the Microsoft Active Accessibility API; however, the group thinks it could easily be extended to other operating systems. Finally, the group believes that adding more first-level recognizers could further improve the results, since the hybrid system already outperformed each of its individual recognizers on a one-on-one basis. A toy illustration of why combining recognizers helps follows below.
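To illustrate why a hybrid of recognizers can beat each one individually, here is a toy Python sketch that merges first-level hypotheses. The paper combines recognizers with machine learning; this stand-in simply takes a confidence-weighted average of the proposed bounding boxes, and every name in it is hypothetical.

    def combine_hypotheses(hypotheses):
        # hypotheses: list of (box, confidence) pairs, one per first-level
        # recognizer, where box = (x0, y0, x1, y1) or None when a
        # recognizer produced no hypothesis for this click.
        scored = [(box, conf) for box, conf in hypotheses if box is not None]
        if not scored:
            return None
        total = sum(conf for _, conf in scored)
        if total == 0:
            return scored[0][0]  # all confidences zero; pick arbitrarily
        # Confidence-weighted average of each box coordinate.
        return tuple(
            sum(conf * box[i] for box, conf in scored) / total
            for i in range(4)
        )

    # e.g. combine_hypotheses([((10, 10, 30, 24), 0.9),
    #                          ((12, 9, 28, 25), 0.6),
    #                          (None, 0.0)])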
Discussion:
I found this paper to be pretty interesting. I think it is good to upgrade an area that has so many practical applications and uses, and I like how this technique can handle more targets and a wider spectrum of applications than the current Accessibility APIs. There were a few sentences in the paper that did not make sense or had obvious grammatical errors, yet they were never corrected. I'm a stickler for those types of things, so I think they should have been fixed in a paper like this. Still, the fact remains that this is very good technology with many practical uses.

1 comment:

  1. I think that the data gathered from this method has the potential to lead to better UI design, because we would be working with real-world data rather than contrived scenarios. Also, it would be a plus if this could be packaged with applications.
