Venkatraman et al. Figure 1: The language use in accessibility forum discussion threads (left) differs from that in general forums (right).
In this blog post, I will summarize our research paper, which dives into the unique language patterns in forums dedicated to blind users, especially those using accessibility tools like screen readers. This research not only highlights the distinct linguistic characteristics of these spaces but also points to design improvements for accessibility technologies.
Introduction
Data Collection
To explore linguistic differences in accessibility and general forums, we built two custom datasets, each containing 1,000 conversation threads. The threads for the accessibility forum dataset were gathered from forums dedicated to screen reader users, such as the JAWS and NV Access forums, while the general forum dataset was sourced from a popular general-purpose forum, Reddit. We specifically chose threads focused on software-related discussions, particularly around usability issues and software recommendations. To ensure consistency across both datasets, we used an 8-attribute tuple format to capture essential details such as thread name, thread URL, usernames, user IDs, post content, and date-time for each post. This structured approach allowed us to conduct a fair comparison between the datasets and enabled a focused analysis on language use in technology-centered discussions.
Venkatraman et al. Figure 2: t-SNE plot of accessibility forum thread titles (in blue) and general forum thread titles (in red)
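To make the per-post record concrete, here is a minimal sketch of how one post could be stored. It includes only the six attributes named in the text; the remaining fields of the 8-attribute tuple are not listed there, so they are left out rather than guessed. The class name, field names, types, and the `to_row` helper are all assumptions made for illustration, not the paper's actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass(frozen=True)
class ForumPost:
    """One post in a thread (hypothetical schema; field names are assumed)."""
    thread_name: str
    thread_url: str
    username: str
    user_id: str
    post_content: str
    posted_at: datetime  # date-time of the post


def to_row(post: ForumPost) -> dict:
    """Flatten a post into a dict suitable for writing to CSV or JSON."""
    row = asdict(post)
    row["posted_at"] = post.posted_at.isoformat()
    return row
```

Using the same record type for both datasets is what keeps later comparisons like-for-like: every metric is computed over the same `post_content` field regardless of which forum the post came from.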
Methodology
To compare the linguistic characteristics between accessibility forums and general forums, we conducted a detailed analysis of several linguistic and structural aspects within each dataset. The goal was to identify unique patterns in accessibility forums that reflect the needs and preferences of visually impaired users, contrasting them with the more general-purpose discussions in traditional forums.
Readability: Accessibility forums are geared towards screen reader users, who typically benefit from shorter, more readable sentences. The analysis used metrics like the Flesch-Kincaid grade level and the Gunning Fog index to measure readability. Accessibility forum posts were found to have simpler sentence structures, with an average words-per-sentence (WPS) score significantly lower than that in general forums. This simplicity improves readability and comprehension for users navigating posts via audio, highlighting the importance of concise language in these spaces.
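The readability metrics mentioned above can be sketched in a few lines. This is a rough illustration, not the pipeline used in the paper: the sentence splitter and syllable counter are simple heuristics assumed here, and a dedicated readability library would be more accurate. The two formulas are the standard published ones for the Flesch-Kincaid grade level and the Gunning Fog index.

```python
import re


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def words(text):
    # Alphabetic tokens only; digits and punctuation are ignored.
    return re.findall(r"[A-Za-z']+", text)


def count_syllables(word):
    # Heuristic: count vowel groups, discounting a trailing silent "e".
    w = word.lower()
    n = len(re.findall(r"[aeiouy]+", w))
    if w.endswith("e") and not w.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def words_per_sentence(text):
    sentences = split_sentences(text)
    return len(words(text)) / len(sentences) if sentences else 0.0


def flesch_kincaid_grade(text):
    sentences, tokens = split_sentences(text), words(text)
    if not sentences or not tokens:
        return 0.0
    syllables = sum(count_syllables(w) for w in tokens)
    return (0.39 * len(tokens) / len(sentences)
            + 11.8 * syllables / len(tokens) - 15.59)


def gunning_fog(text):
    sentences, tokens = split_sentences(text), words(text)
    if not sentences or not tokens:
        return 0.0
    # "Complex" words are approximated as those with 3+ syllables.
    complex_words = sum(1 for w in tokens if count_syllables(w) >= 3)
    return 0.4 * (len(tokens) / len(sentences)
                  + 100 * complex_words / len(tokens))
```

Both indices grow with sentence length, which is why a lower WPS score in accessibility forums translates directly into lower (easier) readability grades.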
Lexical Density: Lexical density, the ratio of content words (nouns, verbs, adjectives, adverbs) to total words, provides insight into how information-dense each post is. Accessibility forums showed a higher lexical density than general forums, indicating a greater use of content-rich words and a more direct language style. This result aligns with the task-oriented nature of accessibility forum discussions, where users often provide detailed instructions and solutions related to screen reader functions and software usability.
Venkatraman et al. Figure 3: Box Plot for Lexical Density Analysis
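As an illustration of the lexical density ratio defined above, the sketch below computes it from tokens that have already been POS-tagged. It assumes Penn Treebank style tags (the tag set produced by the Stanford tagger's English models) and treats every noun, verb, adjective, and adverb tag as a content word. Whether auxiliary verbs should count is a design choice the text does not settle, so this version simply includes them.

```python
# Penn Treebank tag prefixes treated as content words (an assumption):
# NN* nouns, VB* verbs, JJ* adjectives, RB* adverbs.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def lexical_density(tagged_tokens):
    """Ratio of content words to all words in a list of (token, tag) pairs.

    Punctuation tags such as "." or "," do not start with a letter and are
    excluded from the word total.
    """
    word_tags = [tag for _, tag in tagged_tokens if tag[:1].isalpha()]
    if not word_tags:
        return 0.0
    content = sum(1 for tag in word_tags if tag.startswith(CONTENT_TAG_PREFIXES))
    return content / len(word_tags)


# Example: "Press the Insert key." has 3 content words out of 4 words.
example = [("Press", "VB"), ("the", "DT"), ("Insert", "NNP"),
           ("key", "NN"), (".", ".")]
print(lexical_density(example))
```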
Parts of Speech: Parts-of-speech (POS) analysis was conducted to understand the distribution of various linguistic elements in accessibility versus general forums. We used the Stanford POS tagger to tag the tokens in each post, labeling words and punctuation according to their grammatical roles. This tagging process allowed for a detailed examination of how often specific parts of speech, such as nouns, verbs, and pronouns, appeared in each dataset.
Venkatraman et al. Figure 4: Box Plot for NNP Analysis
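Once each post is tagged, the per-post share of any tag can be computed and then compared across the two datasets, for example as the input to a box plot like the one for NNP (proper nouns). The sketch below assumes the tagger output is available as a list of (token, tag) pairs; the function name is hypothetical and the tagging step itself is not shown.

```python
from collections import Counter


def tag_proportions(tagged_tokens):
    """Map each POS tag in a post to its share of all tagged tokens."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


# Example post: "JAWS reads the page" (tags assumed for illustration).
post = [("JAWS", "NNP"), ("reads", "VBZ"), ("the", "DT"), ("page", "NN")]
shares = tag_proportions(post)
print(shares.get("NNP", 0.0))  # share of proper nouns in this post
```

Working with proportions rather than raw counts keeps long and short posts comparable.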
Personal Pronouns: A notable finding in accessibility forums was the frequent use of first-person pronouns such as "I" and "we." This trend reflects a high level of personal engagement and a sense of community within these forums, where users often share firsthand experiences, support one another, and build connections over shared challenges. The use of personal pronouns underscores the empathy and community orientation within accessibility forums, as users feel comfortable discussing personal experiences and seeking peer advice.
Descriptive Action Verbs: Accessibility forums are highly task-oriented, with a focus on providing practical guidance for using assistive technologies. Posts in these forums often include verbs that describe specific actions, such as "press," "hold down," "navigate," or "select." This action-oriented language reflects the need for precise, step-by-step instructions suited to users relying on screen readers. In contrast, general forums showed more abstract discussions without the same focus on detailed actions, as sighted users can often rely on visual cues for understanding software issues.
Abstractness and Concreteness: Using the Linguistic Category Model (LCM), we measured the abstractness of language in each forum type. Accessibility forums scored lower on abstractness, indicating a preference for concrete language, which is easier for users with visual impairments to process via screen readers. This approach helps ensure clarity, as visually impaired users benefit from descriptive, concrete terms that guide them through specific tasks rather than abstract concepts.
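The text does not spell out how the abstractness score was computed. A commonly used LCM scoring, assumed here, weights descriptive action verbs (DAV) as 1, interpretive action verbs (IAV) as 2, state verbs (SV) as 3, and adjectives (ADJ) as 4, then takes the weighted mean, so scores run from 1 (most concrete) to 4 (most abstract). The sketch below takes category counts as input; coding each word into a category is a separate step that is not shown.

```python
# Assumed LCM weights; the exact scheme used in the paper is not stated.
LCM_WEIGHTS = {"DAV": 1, "IAV": 2, "SV": 3, "ADJ": 4}


def lcm_abstractness(category_counts):
    """Weighted mean abstractness from counts of LCM categories.

    Returns None when the text contains no coded words.
    """
    total = sum(category_counts.get(c, 0) for c in LCM_WEIGHTS)
    if total == 0:
        return None
    weighted = sum(w * category_counts.get(c, 0) for c, w in LCM_WEIGHTS.items())
    return weighted / total


# A post dominated by verbs like "press" and "select" scores near 1.
print(lcm_abstractness({"DAV": 3, "IAV": 1}))
```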
Temporal References: Accessibility forum posts frequently referenced past actions, with a notable focus on resolving issues encountered in previous experiences. Terms like "yesterday," "previous," or "earlier" were more prevalent, reflecting the asynchronous nature of accessibility discussions where users may address older issues raised by others. In contrast, general forums leaned toward present-focused language, often dealing with ongoing or real-time discussions.
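One simple way to quantify this kind of pattern is a normalized term rate. The sketch below counts the three example terms quoted above per 1,000 words; the term list, the tokenizer, and the per-1,000 normalization are assumptions for illustration, and the paper's actual lexicon is likely larger.

```python
import re

# Only the example terms quoted in the text; a fuller lexicon is assumed
# to exist but is not reproduced here.
PAST_TERMS = ("yesterday", "previous", "earlier")


def past_reference_rate(text, terms=PAST_TERMS):
    """Occurrences of past-oriented terms per 1,000 words of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in terms)
    return 1000 * hits / len(tokens)
```

Normalizing by length matters here because thread sizes differ widely between forums; raw counts would mostly reflect how long a post is.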
Conclusion
References
N Venkatraman, A Aiyer, Y Prakash, V Ashok, You Shall Know a Forum by the Words they Keep: Analyzing Language Use in Accessibility Forums for Blind Users, Proceedings of the 35th ACM Conference on Hypertext and Social Media, 2024.