2023-12-29: Paper Summary: "AutoDesc: Facilitating Convenient Perusal of Web Data Items for Blind Users"

The ACM Conference on Intelligent User Interfaces (IUI 2023) was the 28th annual meeting of the intelligent interfaces community. It is a leading international forum where Human-Computer Interaction (HCI) and Artificial Intelligence (AI) converge, and it welcomes contributions from diverse disciplines, including psychology, behavioral science, cognitive science, computer graphics, design, and the arts. The conference was held at the University of Technology Sydney (UTS). In total, 1217 authors from 45 countries contributed papers, and the acceptance rate was 24.1%. Our research paper, "AutoDesc: Facilitating Convenient Perusal of Web Data Items for Blind Users," was published at IUI 2023.

In this blog post, I look closely at our research paper. Our primary goal is to address the interaction challenges that blind users face when they navigate across multiple web pages with screen reader assistive technology. AutoDesc employs a custom extraction model to automatically identify and retrieve relevant information, which streamlines navigation. In a study with 16 blind participants, our results indicate that, within the same timeframe, AutoDesc lets users explore more web data items than they can with their preferred screen readers alone.

Introduction

Engaging in web browsing inevitably involves interacting with diverse web data items, such as shopping products, classifieds, and job listings, which are integral to most e-commerce websites. Contemporary websites strive to improve user interaction with these data items by providing various tools like filters and sorting options. Additionally, information on these data items is typically spread across two or more web pages, including a 'Query-Results' page showcasing item summaries and 'Details' pages containing comprehensive information (Figure 1). While sighted users can quickly scan and access information thanks to visual cues, blind users face increased interaction challenges. Navigating between the 'Query-Results' page and the 'Details' page using screen reader assistive technology becomes tedious and cumbersome, contributing to higher interaction overhead and effort for blind users.

Note 1: Current solutions for improving navigation on e-commerce platforms for screen reader users predominantly concentrate on efficient and convenient non-visual content access within a single webpage. However, these solutions often fall short when content is distributed across multiple web pages, as is common with web data items.

Figure 1 Prakash et al.: Illustration depicting a system called "AutoDesc". In this system, additional item descriptions are extracted from the 'Details' pages and seamlessly integrated into item summaries on the 'Query-Results' page. This allows users to quickly view detailed information about items in-place, without needing to navigate to a different page.

In our paper, we introduce AutoDesc, a browser extension designed to empower blind users by allowing them to "peek" into additional item descriptions directly within the 'Query-Results' page, eliminating the need to navigate to the corresponding 'Details' page (Figure 1). This functionality saves the user's time and effort by minimizing the back-and-forth traversal between the 'Query-Results' and 'Details' pages. AutoDesc achieves this by automatically identifying and extracting relevant information about an item from its corresponding 'Details' webpage, leveraging a deep learning-based extraction model—specifically, a Mask R-CNN model. This model was developed using a manually annotated dataset comprising 1050 ground-truth examples.

Note 2: Many participants explicitly stated that AutoDesc let them browse more items on the 'Query-Results' page by swiftly filtering out undesirable ones. This reduced their interaction fatigue compared to using their screen readers and gave them a higher likelihood of selecting what they perceived as "better deals."

Framework

Figure 2 Prakash et al.: This figure depicts an architectural diagram showing how AutoDesc works. It details the process where, upon loading a 'Query-Results' webpage, AutoDesc identifies web data items and their attributes (like title, price, rating) using the STEM algorithm. A 'Get Details' button is added to each item summary; when clicked, AutoDesc uses Mask R-CNN to find the item's description on the 'Details' page, extracts text with Tesseract OCR, corrects it with a BERT model, and displays the description on the 'Query-Results' page, streamlining access for blind users and reducing the need for navigating between pages.

We developed AutoDesc as a Chrome browser extension (not available in the public domain due to data privacy and large-scale development challenges). Once the extension is enabled, AutoDesc uses the browser's built-in JavaScript (JS) functions to extract the entire HTML Document Object Model (DOM) tree from the webpage containing data items. The extracted DOM content is then sent as a POST request to the Item Extractor module running on the backend server, as depicted in Figure 2. Because no public code for the STEM algorithm was available, we implemented it ourselves in Python to extract the data items. After the data items are identified, the AutoDesc extension again uses the browser's built-in JS functions to add two child nodes to each data item's DOM subtree: a visible 'Get Details' button and an invisible '<div>' container. This container is reserved for the on-demand display of the additional item description extracted from the item's 'Details' page.
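The item-extraction step can be illustrated with a much simpler stand-in. The sketch below is not the STEM algorithm and not the code used in AutoDesc. It only assumes that data items show up as sibling elements with an identical tag structure, and it returns the largest such group found in the posted HTML. The names Node, TreeBuilder, and find_repeated_items are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []

    def signature(self):
        # Structural fingerprint: tag names only, ignoring text and attributes.
        return "%s(%s)" % (self.tag, ",".join(c.signature() for c in self.children))


class TreeBuilder(HTMLParser):
    """Builds a minimal element tree from an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, then step out of it.
        node = self.current
        while node.parent is not None and node.tag != tag:
            node = node.parent
        if node.parent is not None:
            self.current = node.parent


def find_repeated_items(html, min_repeat=3):
    """Return the largest group of sibling nodes that share one structure."""
    builder = TreeBuilder()
    builder.feed(html)
    best = []
    stack = [builder.root]
    while stack:
        node = stack.pop()
        groups = defaultdict(list)
        for child in node.children:
            if child.children:  # ignore leaf elements, they carry no structure
                groups[child.signature()].append(child)
        for members in groups.values():
            if len(members) >= min_repeat and len(members) > len(best):
                best = members
        stack.extend(node.children)
    return best


page = (
    "<div><p>Results</p></div>"
    "<ul>"
    "<li><h2>TV A</h2><span>$300</span></li>"
    "<li><h2>TV B</h2><span>$450</span></li>"
    "<li><h2>TV C</h2><span>$520</span></li>"
    "</ul>"
)
items = find_repeated_items(page)
print([item.tag for item in items])
```

Exact structural matching is brittle on real pages, where one item may lack a rating or carry an extra badge, which is one reason a dedicated extraction algorithm is needed in practice.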

When the user clicks the 'Get Details' button for an item, AutoDesc uses a custom Mask R-CNN model to automatically identify the region of the 'Details' page that corresponds to the item's description. We built the Mask R-CNN model on the publicly available Matterport project on GitHub. Because no prior dataset existed for this task, training the model required a custom dataset, which we created with the publicly available GIMP software. In total, the dataset comprised 750 images for training and 300 images for validation. The Mask R-CNN output is post-processed in two steps: the Tesseract OCR engine extracts the text from the detected region, and a pre-trained BERT model corrects it.
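One small piece of this pipeline, turning a predicted mask into a rectangle that can be cropped and handed to OCR, can be sketched in plain Python. This is an illustration under assumptions, not the AutoDesc code: the source does not say how the region is cropped, the mask is assumed to be a 2D grid of 0/1 values, and mask_to_bbox is a hypothetical helper.

```python
def mask_to_bbox(mask, padding=0):
    """Return (left, top, right, bottom) enclosing all truthy pixels.

    The right and bottom edges are exclusive. Returns None for an empty mask.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    height, width = len(mask), len(mask[0])
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    left = max(cols[0] - padding, 0)
    top = max(rows[0] - padding, 0)
    right = min(cols[-1] + 1 + padding, width)
    bottom = min(rows[-1] + 1 + padding, height)
    return (left, top, right, bottom)


# Toy 4x6 mask standing in for a model's predicted description region.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(mask_to_bbox(mask))             # (2, 1, 5, 3)
print(mask_to_bbox(mask, padding=1))  # (1, 0, 6, 4)
```

A box in this (left, top, right, bottom) form can be used to crop the page screenshot before running OCR on it. A little padding helps avoid clipping characters at the edge of the detected region.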

After extracting the item description from the 'Details' page, AutoDesc uses built-in browser functions to inject it into the '<div>' container associated with the respective item. For communication between the extension's JavaScript modules and the backend server modules, we implemented a REST API with Flask. The entire codebase for AutoDesc is accessible on GitHub.

User Study

To evaluate AutoDesc, we conducted a user study with 16 screen reader users (7 female, 9 male). Proficiency in web screen-reading and familiarity with the Chrome web browser were the inclusion criteria. Participants confirmed that they routinely access various e-commerce websites for activities such as shopping and browsing classifieds. Participants ranged in age from 21 to 66 years (Table 1).

Table 1 Prakash et al.: Participant demographics. Encompasses data provided by the participants themselves. It includes 'Proficiency', which refers to the participants' own assessment of their skill in using screen readers, and 'Computer Type', which denotes the various kinds of computers utilized by participants during the study.

The participants engaged in representative tasks, such as shopping. For instance, Figure 6 illustrates a study task in which blind participants browsed a list of television data items on the Macy’s website and selected the item that best matched their preferences. This task was performed in all three study conditions (Screen Reader, SaIL, and AutoDesc) in a predetermined counterbalanced order.

Figure 6 Prakash et al.: This figure explicitly depicts the AutoDesc condition, where participants could swiftly utilize additional "Get Details" buttons to access descriptions of corresponding items. These buttons were exclusive to the AutoDesc condition and were not available in the other study conditions, where participants had to access descriptions by navigating to separate 'Details' pages.

To assess AutoDesc against the other two conditions (Screen Reader and SaIL) in user tasks on e-commerce platforms, we measured several key metrics: the average time spent per item, the number of items covered during a task, the average number of shortcuts pressed per item (including navigating the details page), and the number of back-and-forth navigations between the results and details pages in each task. AutoDesc consistently outperformed the other conditions across all these metrics, and the observed differences were statistically significant (Figures 3 and 4). On average, users spent less time per item, covered more items in a task, pressed fewer shortcuts, and made fewer back-and-forth navigations when using AutoDesc.

Figures 3&4 Prakash et al.: The box plots provide comprehensive statistics for all study conditions, illustrating both the average time spent per item and the average number of shortcut key presses per item by all participants. Additionally, these plots show the average number of distinct items each participant visited and the frequency of back-and-forth navigation between 'Query-Results' and 'Details' pages.
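To make the four metrics concrete, the following sketch derives them from a per-task interaction log. The log format, the event names (focus_item, keypress, open_details, back_to_results), and the function summarize_task are assumptions made for illustration; the source does not describe how the study data was recorded or how the averages were computed.

```python
def summarize_task(events, task_duration):
    """Compute per-task metrics from (timestamp, kind, item_id) events."""
    # Distinct items the participant visited during the task.
    items = {item for _, kind, item in events if kind == "focus_item"}
    keypresses = sum(1 for _, kind, _ in events if kind == "keypress")
    # One return to the results page counts as one back-and-forth navigation.
    round_trips = sum(1 for _, kind, _ in events if kind == "back_to_results")
    n = len(items)
    return {
        "items_covered": n,
        "avg_time_per_item": task_duration / n if n else None,
        "avg_shortcuts_per_item": keypresses / n if n else None,
        "back_and_forth": round_trips,
    }


# Hypothetical log for a 60-second task.
events = [
    (0.0, "focus_item", "tv-1"),
    (1.0, "keypress", None),
    (2.0, "keypress", None),
    (3.0, "open_details", "tv-1"),
    (20.0, "back_to_results", None),
    (21.0, "focus_item", "tv-2"),
    (22.0, "keypress", None),
    (23.0, "keypress", None),
]
summary = summarize_task(events, 60.0)
print(summary)
```

With AutoDesc, the open_details and back_to_results events would largely disappear from such a log, since descriptions are read in place on the 'Query-Results' page.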

To evaluate usability, we used the standard System Usability Scale (SUS) questionnaire, in which participants rated positive and negative statements about each study condition on a Likert scale from 1 (strongly disagree) to 5 (strongly agree). The SUS scores were notably higher on average for the AutoDesc condition than for the others (Figure 5). Additionally, we employed the NASA Task Load Index (NASA-TLX) to gauge the mental workload of participants during task performance. The NASA-TLX scores were significantly lower and more uniform for AutoDesc (Figure 5), indicating reduced mental workload compared to the alternative conditions.

Figure 5 Prakash et al.: Perceived usability, as measured by the System Usability Scale (SUS), and task workload, assessed using the NASA Task Load Index (NASA-TLX), for all three study conditions.
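For readers unfamiliar with how a SUS score is derived from the ten 1-to-5 ratings, here is a minimal sketch of the standard scoring rule: odd-numbered (positively worded) items contribute the rating minus 1, even-numbered (negatively worded) items contribute 5 minus the rating, and the sum is multiplied by 2.5 to give a 0-to-100 score. The function name sus_score is my own, and the example ratings are made up rather than taken from the study.

```python
def sus_score(responses):
    """Convert ten SUS ratings (each 1 to 5) into a 0 to 100 score."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        if number % 2 == 1:
            total += rating - 1   # positively worded statement
        else:
            total += 5 - rating   # negatively worded statement
    return total * 2.5


print(sus_score([3] * 10))                        # 50.0
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```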

Discussion

The user study revealed a prevalent sentiment among participants: exploring data items with a screen reader is tiring and frustrating. Many participants said that, when relying on a screen reader, they often made a selection after perusing only the first few items because the process was so tedious, which meant they could miss good deals. In contrast, AutoDesc was rated positively because it gave instant access to detailed item information that would otherwise require visiting the 'Details' pages, enabling blind users to peruse more data items. By offering essential information about each item, AutoDesc helped participants decide whether to explore an item further and filter out undesired items efficiently. Participants emphasized this convenience, and it contributed significantly to their favorable assessment of the tool.

While the study highlighted several strengths of AutoDesc, it also revealed certain shortcomings. One notable limitation is that we do not yet understand how the accuracy of the extraction algorithm affects the overall AutoDesc system. For the user study, we specifically chose websites on which both the AutoDesc and SaIL algorithms accurately identified the item description, which may limit the generalizability of the findings. Additionally, the training and test examples for the extraction algorithm came exclusively from English websites. Another limitation is that AutoDesc is currently available only as a Chrome extension, restricting its use to this browser.

In response to these findings, future efforts will focus on expanding AutoDesc to work seamlessly on multiple browsers and developing a version compatible with smartphones. Participants in the study expressed a desire for more comprehensive information beyond just the item description, including product reviews and specifications. Consequently, future updates to AutoDesc will aim to incorporate these additional features. Furthermore, there are plans to build a prediction and filtering model that will provide users with comprehensive information related to products and facilitate item comparisons. These enhancements are part of the ongoing development to address the identified limitations and enhance the overall functionality of AutoDesc.

Conclusion

We introduced AutoDesc, an innovative browser extension designed to enhance the browsing experience for blind users. AutoDesc autonomously extracts additional item descriptions from 'Details' pages and seamlessly integrates this information into the corresponding item summaries on the 'Query-Results' page. This feature alleviates the need for blind users to explicitly visit 'Details' pages, streamlining the information retrieval process.

In the user study involving 16 blind participants, AutoDesc demonstrated a significant reduction in the average time and shortcuts required to explore an item in the query-results list. The findings highlight the efficacy of AutoDesc in optimizing the efficiency of information access for blind users.
Moreover, the study identified promising avenues for future research, emphasizing the potential for enhancing user experiences by providing efficient and convenient access to user reviews and facilitating item comparisons. These insights pave the way for further developments in accessibility tools, addressing broader user needs in navigating online content.

References

Prakash, Y., Sunkara, M., Lee, H.N., Jayarathna, S. and Ashok, V., 2023, March. "AutoDesc: Facilitating Convenient Perusal of Web Data Items for Blind Users". In Proceedings of the 28th International Conference on Intelligent User Interfaces (pp. 32-45).

- YASH PRAKASH (@LunaticBugbear)
