2024-08-22: Paper Summary: "All in One Place: Ensuring Usable Access to Online Shopping Items for Blind Users"

EICS 2024 marks the sixteenth international ACM SIGCHI conference on engineering interactive computing systems and their user interfaces. The conference explores research at the intersection of user interface design, software engineering, and computational interaction. Our research paper, "All in One Place: Ensuring Usable Access to Online Shopping Items for Blind Users," was published in the June 2024 issue of the Proceedings of the ACM on Human-Computer Interaction, Volume 8, in the EICS category.

In this blog post, I will summarize our research paper, which focuses on alleviating the significant interaction challenges that blind users encounter when navigating through dispersed content across multiple sections and pages on shopping websites. These issues arise because information related to shopping items is frequently spread across different web page sections, requiring users to move back and forth—a task that becomes incredibly tedious and cumbersome for those relying on screen readers. Additionally, even if users become familiar with the structure of one shopping site, they must readapt when encountering a different site with a new layout, as there is no uniform structure across shopping platforms. Our paper proposes a solution that consolidates this dispersed content into a single, consistent, and accessible interface, simplifying the browsing experience for blind users and significantly reducing the time and effort needed to access essential information.

Introduction


Figure 1 Yash Prakash et al.: Figures (A) and (B) illustrate the default 'Query-Results' and 'Details' pages, respectively. Figures (C) and (D) present four key interface options: Query, Description, Specifications, and Reviews. The Query option in Figure (C) allows users to input product-related questions and obtain immediate, natural language responses. The other options—Description, Specifications, and Reviews—provide direct access to the corresponding content sections seamlessly extracted from the 'Details' page for quick and efficient information retrieval.

Online shopping involves purchasing goods or services over the Internet, where customers browse, select, and buy products through websites or apps. Content on these shopping platforms is often organized across multiple web pages, such as a 'Query-Results' page summarizing items and 'Details' pages providing complete information to streamline the user experience (Figure 1 (A) and (B)). While sighted users benefit from visual cues that allow quick scanning and information retrieval, blind users face a more significant challenge. Their reliance on screen reader technology, which primarily supports linear, one-dimensional content access, requires them to invest additional time and effort to gather the same information. 

When blind users seek comprehensive information from various sections of the 'Details' page, such as descriptions, specifications, and reviews, they must not only locate these sections but also mentally retain information across multiple products to make informed decisions. For example, imagine a user searching for a television with features like 4K resolution and smart functionality. They start by navigating to the search bar using the 'TAB' key or other shortcuts and entering a query like 'TV.' After submitting the query, they browse through the list of item summaries on the 'Query-Results' page using basic navigation keys. Upon selecting a TV, they click the link to access the 'Details' page, where they use various shortcut keys to check the specifications and then move to the review section to assess user feedback on picture quality and smart features. To compare another TV, the user must return to the 'Query-Results' page and repeat this process for each item. This back-and-forth navigation requires significant manual effort, making it difficult for blind users to efficiently compare multiple televisions and adding considerable strain to their shopping experience. Here is a small demo of basic screen reader navigation on Amazon:


In this paper, we introduce InstaFetch, a browser extension designed to enhance the online e-commerce experience for blind screen reader users, particularly when interacting with web data items. InstaFetch streamlines the information retrieval process by offering a direct query feature, allowing users to input specific queries about any data item and receive immediate responses, as shown in Figure 1 (C). Additionally, InstaFetch consolidates all relevant information—such as product details, specifications, and customer reviews—scattered across multiple pages into a single, consistent, screen reader-friendly interface, as illustrated in Figure 1 (D). Here is a demo showcasing the functionality of InstaFetch:


In a user study with 14 blind participants, InstaFetch was shown to significantly decrease the need to access 'Details' pages compared to both a state-of-the-art solution (SaIL) and their preferred screen reader, thereby reducing the burdensome navigation between 'Details' and 'Query-Results' pages. Additionally, InstaFetch reduced the average time spent and the number of keys pressed per data item, allowing participants to browse more items within the same timeframe. Participants reported that InstaFetch reduced interaction fatigue and increased their chances of finding better deals online.

InstaFetch Architectural Workflow

Figure 2 Yash Prakash et al.: The architectural workflow of InstaFetch is divided into two key phases: (a) the Information Phase and (b) the Query Phase. In the Information Phase, InstaFetch gathers and organizes essential data related to the product, such as descriptions, specifications, and reviews, making it readily accessible in a unified interface. In the Query Phase, users can input specific questions about the product, and InstaFetch processes these queries to deliver relevant, natural language responses based on the previously gathered data, ensuring a more efficient and user-friendly experience.

Figures 2(a) and (b) depict the architectural workflow of InstaFetch, a web browser extension developed following Google's open-source extension guidelines (the extension is not publicly available due to data privacy concerns and real-world deployment challenges). Upon loading the 'Query-Results' webpage, InstaFetch utilizes the STEM algorithm to identify item summaries and embeds an 'Options' button within each summary. When a user selects an item by clicking the 'Options' button, Selenium WebDriver captures snapshots of the entire 'Details' page. These snapshots are then processed by a Mask R-CNN model, trained using Matterport's open-source implementation, to extract item details such as descriptions, specifications, and reviews. The model was evaluated on 20 new websites, achieving a mean Average Precision (mAP) of 75.4% at a 50% Intersection over Union (IoU) threshold and 69.7% at a 75% IoU threshold, with a total loss of 0.529 at convergence, indicating its accuracy in identifying regions of interest. The Tesseract OCR engine processes the extracted item details, and a custom DOM search algorithm subsequently retrieves the relevant DOM subtrees for these details, which are then stored as context within the Content Model.
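To make the Information Phase more concrete, here is a minimal sketch in Python of the screenshot-and-OCR portion of the pipeline, assuming a Selenium/pytesseract stack. The detect_regions() function is a hypothetical stand-in for the paper's Mask R-CNN detector, and the STEM and DOM-search steps are omitted; this is an illustration, not the authors' implementation.

```python
# A minimal sketch of the Information Phase, assuming a Python stack
# (selenium + pytesseract). detect_regions() is a hypothetical stand-in for
# the paper's Mask R-CNN detector; the STEM and DOM-search steps are omitted.
import io

import pytesseract
from PIL import Image
from selenium import webdriver


def capture_details_page(url: str) -> Image.Image:
    """Load a 'Details' page and return a screenshot of the rendered page.

    Note: get_screenshot_as_png() captures only the visible viewport; a long
    page would require scrolling and stitching several captures.
    """
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        png_bytes = driver.get_screenshot_as_png()
    finally:
        driver.quit()
    return Image.open(io.BytesIO(png_bytes))


def detect_regions(screenshot: Image.Image) -> dict:
    """Hypothetical stand-in for the Mask R-CNN detector: returns bounding
    boxes (left, top, right, bottom) for the description, specifications,
    and review regions. Here we simply split the page into thirds."""
    width, height = screenshot.size
    third = height // 3
    return {
        "description": (0, 0, width, third),
        "specifications": (0, third, width, 2 * third),
        "reviews": (0, 2 * third, width, height),
    }


def extract_item_details(url: str) -> dict:
    """OCR each detected region so its text can be stored in the Content Model."""
    screenshot = capture_details_page(url)
    return {
        name: pytesseract.image_to_string(screenshot.crop(box)).strip()
        for name, box in detect_regions(screenshot).items()
    }
```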

InstaFetch leverages the contextual information stored in the Content Model to support natural language queries, allowing users to ask product-specific questions like "What are the battery life details?" or "Does this camera have a warranty?" A pre-trained LLaMA large language model (LLM) is guided by prompt engineering to generate accurate, product-specific responses by drawing on information from various sections of a webpage. By incorporating Chain-of-Thought (CoT) and ReAct prompting techniques, the LLM can break down complex questions step by step and take proactive actions, such as retrieving the latest prices or checking reviews. For example, when asked, "What's the battery life, and is it good?" the system uses CoT to identify relevant details in specifications or reviews and then applies ReAct to verify and summarize this information. The LLM responses were evaluated using BLEU scores, achieving a score of 0.78, with annotators rating the responses 8 for factuality, 6.6 for relevance, and 9.2 for grammaticality. Additionally, InstaFetch visualizes the relevant content when users select descriptions, specifications, or reviews (Figure 6 (4), (5), and (6)).
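Below is a small sketch of how such a Query Phase could be wired up, assuming the Content Model is the dictionary of section texts produced in the previous sketch. The prompt wording is illustrative only (the paper's actual prompts are not reproduced here), and generate() is a hypothetical hook for a locally hosted LLaMA model.

```python
# A minimal sketch of the Query Phase. The prompt template is illustrative,
# not the paper's actual prompt, and generate() is a hypothetical hook for a
# LLaMA inference backend.
PROMPT_TEMPLATE = """You are answering questions about a single shopping item.

Product description:
{description}

Product specifications:
{specifications}

Customer reviews:
{reviews}

Question: {question}

Think step by step (chain of thought): decide which section(s) above contain
the answer, then extract the relevant facts; if a fact is missing, say so
rather than guessing. Then act on that reasoning (ReAct) by writing a short,
natural-language answer suitable for a screen reader user.

Answer:"""


def generate(prompt: str) -> str:
    """Hypothetical LLaMA inference call; replace with a real model client."""
    raise NotImplementedError("connect this to a LLaMA inference endpoint")


def answer_query(content_model: dict, question: str) -> str:
    """Fill the prompt template with stored item context and query the LLM."""
    prompt = PROMPT_TEMPLATE.format(
        description=content_model.get("description", "not available"),
        specifications=content_model.get("specifications", "not available"),
        reviews=content_model.get("reviews", "not available"),
        question=question,
    )
    return generate(prompt)
```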

InstaFetch User Interface Recap


Figure 6 Yash Prakash et al.: The InstaFetch interface on a shopping platform webpage. It shows the options button, the overlay popup with four tabs ('Query,' 'Description,' 'Specifications,' and 'Reviews'), and how each tab displays relevant content, including user queries, product descriptions, specifications, and reviews.

When a user lands on a shopping platform webpage and clicks the options button (Figure 6 (1)), the InstaFetch overlay popup appears, offering four functional tabs: 'Query,' 'Description,' 'Specifications,' and 'Reviews.' These tabs are designed for easy navigation using standard 'TAB' and 'ARROW' keys (Figure 6 (2)). Users can submit product-related questions via the 'Query' tab, with responses displayed below the form (Figure 6 (3)). In contrast, the other tabs reflect and display content directly from the 'Details' page (Figure 6 (4), (5), and (6)). InstaFetch is fully compatible with standard screen reader shortcuts, allowing seamless integration without requiring any additional keyboard shortcuts.

User Study

A total of 14 blind participants were recruited through email lists and snowball sampling, selected based on their experience with screen readers, familiarity with the Chrome browser, and English proficiency. The group had a balanced gender representation (6 female, 8 male) with a mean age of 31.14 years, and none reported additional impairments that would affect task completion. Table 1 provides detailed demographic information.

Table 1 Yash Prakash et al.: Demographics of blind participants in the InstaFetch evaluation study. The table illustrates the diverse backgrounds and online shopping behaviors of the participants.

In a within-subject experimental design, participants were asked to complete an online shopping task under three conditions: using their preferred screen reader, with the SaIL state-of-the-art solution, and with the InstaFetch browser extension. Each condition involved browsing a product list on a different e-commerce website (Amazon, Etsy, eBay) and selecting a product that best matched their preferences, simulating real-world shopping scenarios.

Task Performance Metrics 

Table 2: Comparison of performance metrics of Screen Reader, SaIL, and InstaFetch across average time spent, shortcut presses, and items covered per task.

Figure 4 Yash Prakash et al.: Boxplots comparing the performance of Screen Reader (SR), SaIL, and InstaFetch across three metrics: time spent per item, items covered per task, and shortcut presses per item.

In the study, participants spent the least time per data item using InstaFetch (182 seconds) compared to SaIL (310 seconds) and screen readers (478 seconds). InstaFetch also required fewer keyboard shortcuts (57) and allowed more unique items to be explored (6.8 items) than SaIL and screen readers (Table 2). Statistical analysis confirmed that InstaFetch significantly outperformed screen readers and SaIL in all metrics, highlighting its effectiveness in improving the user experience (Figure 4).

 Query-Related Metrics 

Table 3: InstaFetch query-related metrics detailing participant usage of the query feature, the number of queries issued, average queries per participant, common query examples, query response accuracy, and participant reactions and behaviors when encountering incorrect responses.

In the InstaFetch condition, 12 participants used natural language queries, with some exploring out of curiosity while others were more focused and refined their questions as they gained confidence. The system's response accuracy was less than 50%, leading to varied participant reactions. While most of the 12 participants brushed off incorrect responses, a few were frustrated. Among them, some rephrased their queries, while others turned to manual searches, highlighting both the potential and limitations of the system in handling user queries effectively. Refer to Table 3 for detailed query-related metrics in the InstaFetch study.

SUS and TLX Scores

Table 4: Comparison of SUS (System Usability Scale) and NASA-TLX (Task Load Index) scores across Screen Reader, SaIL, and InstaFetch, showing average scores and standard deviations (SD).


Figure 5 Yash Prakash et al.: Bar charts comparing SUS (System Usability Scale) and NASA-TLX (Task Load Index) scores across Screen Reader (SR), SaIL, and InstaFetch. The charts illustrate higher usability and lower perceived workload for InstaFetch compared to SR and SaIL.

The System Usability Scale (SUS) questionnaire assessed usability by having participants rate various Likert items, with responses aggregated into a single usability score, where higher scores indicate better usability. InstaFetch received notably higher usability ratings compared to both the screen reader and SaIL (Table 4). The NASA Task Load Index (NASA-TLX) measured perceived workload, with lower scores indicating less effort. Participants reported a significantly lower workload for InstaFetch compared to the other two conditions (Table 4). Overall, InstaFetch outperformed both the screen reader and SaIL in terms of usability and reducing perceived workload (Figure 5).
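For readers unfamiliar with how SUS responses are aggregated into a single score, here is a small Python sketch assuming the standard 10-item SUS with 5-point Likert responses; the example responses are invented for illustration and are not data from this study.

```python
# A small sketch of how a standard 10-item SUS questionnaire is aggregated
# into a single 0-100 score (assuming the usual 5-point Likert items); the
# example responses below are invented and are not data from the study.
def sus_score(responses: list) -> float:
    """Compute the System Usability Scale score from ten 1-5 Likert responses."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten responses, each between 1 and 5")
    # Odd-numbered (positively worded) items contribute (response - 1);
    # even-numbered (negatively worded) items contribute (5 - response).
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r) for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5


if __name__ == "__main__":
    hypothetical_participant = [4, 2, 5, 1, 4, 2, 5, 2, 4, 1]
    print(sus_score(hypothetical_participant))  # higher scores mean better usability
```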

Qualitative Feedback

A common frustration reported by almost all participants was the tedious and time-consuming process of navigating between pages to find the desired information. Many appreciated InstaFetch for consolidating item information in one place, reducing the need for revisits, and improving memory retention. However, participants found navigation within a single webpage cumbersome, often requiring them to sift through irrelevant content. While SaIL helped filter out some of this content, InstaFetch was praised for presenting only item-related information. Some participants suggested adding navigation support within long item segments, such as reviews, and expressed a desire for voice-based input and intelligent assistants to streamline interactions. Additionally, participants reported experiencing interaction fatigue when shopping online, which led to missing out on good deals. However, they noted that InstaFetch significantly reduced these burdens, allowing them to consider more items before deciding.

Conclusion

The traditional distribution of web data across multiple pages is challenging for blind users, leading to frustrating navigation experiences. Because there is no uniform structure across shopping platforms, users who become familiar with the layout of one shopping site must readapt whenever they encounter a different site. InstaFetch, a browser extension designed for visually impaired users, centralizes essential item information—such as product descriptions, specifications, and reviews—into a single, screen reader-friendly interface. It also features a query function that allows users to directly access specific item-related information by simply posing a question, making the browsing experience more efficient and less cumbersome. In a study involving 14 blind participants, InstaFetch significantly outperformed standard screen readers and a state-of-the-art alternative, demonstrating its potential to enhance the online shopping experience for visually impaired users.

References

Prakash, Y., Nayak, A.K., Sunkara, M., Jayarathna, S., Lee, H.N., and Ashok, V., 2024. All in One Place: Ensuring Usable Access to Online Shopping Items for Blind Users. Proceedings of the ACM on Human-Computer Interaction, 8(EICS), pp. 1-25.

-- Yash Prakash (@LunaticBugbear)


