Building for accessibility, while worthwhile, can be a rather daunting task. Strategies to incorporate it into a product frequently involve a fair amount of work, much of which tends to be manual and, admittedly, tedious. The web accessibility community has made great strides toward improving this by making accessibility more accessible for developers. In addition to the best practices and recommendations of the WCAG, there are many automated tools that make implementing them a breeze. Recent advances in AI have improved things further, such that it is now possible to build for accessibility with minimal effort. In addition to helping streamline existing tasks like alt-text generation, AI advances in areas such as natural language processing, machine learning, and image processing have opened up new possibilities in the world of accessibility online.
AI for A11y
The idea of applying advancements in AI to improving accessibility is by no means novel. Tech behemoths like Google and Apple have long applied AI techniques, via digital assistants for instance, to create alternative online experiences that don’t require visual cues. These alternative, and arguably more integrated, means of accessing the web are, however, just the tip of the iceberg when it comes to what AI can bring to the accessibility table. The increased availability of cutting-edge AI via open source libraries such as H2O.ai and TensorFlow has essentially “democratized” AI and made it accessible to the masses. Developers are now able to lean on the power of AI to prioritize accessibility in their applications. This not only allows for vast improvements in the current state of accessibility on the web, it also enables new and exciting innovations in the realm of accessibility.
Of all the possible applications of AI to accessibility, the most logical one is the auto-generation of alt-text. As we examined in an earlier post, adding alt-text to images is a fairly manual and time-consuming process. It involves not only including the alt-text on an image, but also textually representing the significance and meaning of the image in the context of the overall page. Because of the effort that alt-text demands, it is often used incorrectly and contributes to a rather unpleasant experience for non-visual users of the web. To help with this, image recognition can be used to analyze an image and extract key information that semantically describes its contents. There are many implementations of auto-generated alt-text available today. One worthy of note is this Chrome extension by Abhinav Suri, which generates descriptive captions that you can use as alt-text with nothing more than a right click. He even wrote an in-depth, step-by-step analysis of how he built the extension. 💪 Another one worthy of mention is Cloudinary’s add-on alt-text generation service. If you’re already using Cloudinary to host images, their solution is a handy dandy way to roll alt-text generation right into the upload process. Read more about it here.
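To make the idea concrete, here is a minimal Python sketch of how auto-generated alt-text might be wired into a page’s images. The function names and data shapes are illustrative, not taken from any of the tools above, and the captioner is a stand-in for a real image-recognition model:

```python
def fill_missing_alt_text(images, caption_fn):
    """Fill in empty alt text using a captioning model.

    `images` is a list of dicts with "src" and "alt" keys;
    `caption_fn` maps an image source to a descriptive caption.
    Human-written alt text is left untouched -- the model only
    covers the gaps.
    """
    for img in images:
        if not img.get("alt"):
            img["alt"] = caption_fn(img["src"])
    return images

# Stand-in for a real image-recognition model (e.g. a TensorFlow
# captioning network); here it just looks captions up in a table.
def fake_caption(src):
    return {"dog.jpg": "A dog running on a beach"}.get(src, "Image")

images = [
    {"src": "dog.jpg", "alt": ""},
    {"src": "logo.png", "alt": "Company logo"},
]
filled = fill_missing_alt_text(images, fake_caption)
```

The key design point is the same one the tools above follow: generated captions are a fallback, never a replacement for alt text an author has already written.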
Text summarization is another way in which AI can pave the way for accessibility. Today, more than ever before, we are exposed to a constant stream of content via Slack, email, a 24-hour news cycle, and social media (among many others). This information overload can be difficult to parse for those with cognitive or learning disabilities. Text summarization, with its ability to digest and compress large amounts of information using natural language processing, therefore offers a way to stay up to date without getting overwhelmed. Examples of text summarization algorithms already exist today: Google, for instance, offers a text summarization model built with TensorFlow, and Salesforce has announced its own implementation of a similar algorithm.
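The models above are neural and far more capable, but the shape of the task can be shown with a toy extractive summarizer: score each sentence by the frequency of the words it contains, then keep the top scorers. This is a sketch for illustration only, not how the TensorFlow or Salesforce models work:

```python
import re
from collections import Counter

def summarize(text, max_sentences=2):
    """Naive extractive summarization: score each sentence by the
    total corpus frequency of its words, then return the top-scoring
    sentences in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    scored = [
        (sum(freq[w] for w in re.findall(r"[a-z']+", s.lower())), i, s)
        for i, s in enumerate(sentences)
    ]
    top = sorted(scored, reverse=True)[:max_sentences]
    # Re-sort by position so the summary reads in document order.
    return " ".join(s for _, i, s in sorted(top, key=lambda t: t[1]))
```

Real summarizers add abstraction (rewriting rather than extracting), but even this crude scoring captures the accessibility win: a long document compressed into its most representative sentences.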
Sign Language Interpretation
Modern digital assistants like Amazon’s Alexa and Apple’s Siri have opened the door to reliable and useful voice-based interactions online. The once visually dependent web is now easily accessible and provides those with visual difficulties an experience on par with that of sighted users. Even so, the capabilities of these voice-based assistants remain out of reach for users with hearing and speech difficulties. One way to give these users access to digital assistants is by offering alternative means of communicating with them. An example of this is a project by Abhishek Singh, who created a sign language interpreter that interacts with Amazon Alexa. Like the auto alt-text project, Abhishek used TensorFlow to interpret gestures captured through a camera. You can read more about the project and watch a video of his demonstration here.
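The overall pipeline of such a project can be sketched in a few lines. Everything here is hypothetical scaffolding, not code from Abhishek’s project: `classify_frame` stands in for a trained gesture-recognition model, and the “frames” are just labels for the sake of the demo:

```python
def gestures_to_command(frames, classify_frame):
    """Turn a stream of camera frames into a text command for a
    voice assistant. `classify_frame` is a stand-in for a trained
    gesture classifier (e.g. a TensorFlow model) that returns a
    word label, or None when no sign is detected. Consecutive
    duplicate labels are collapsed, since a held sign shows up in
    many frames in a row."""
    words = []
    for frame in frames:
        label = classify_frame(frame)
        if label and (not words or words[-1] != label):
            words.append(label)
    return " ".join(words)

# Hypothetical frame stream: each "frame" is just its own label here.
frames = ["what", "what", None, "time", "time", None, "is", "it"]
command = gestures_to_command(frames, lambda f: f)
```

The resulting text could then be handed to the assistant, with its spoken reply rendered back on screen as text to close the loop for deaf users.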
Realtime Lip Reading
In the last few years, rapid advancements have been made in the field of speech recognition. Most notably, Google’s DeepMind, in conjunction with the University of Oxford, created lip-reading software that beat some of the best lip-readers in the world by a ratio of 4 to 1. This is remarkable considering that lip reading without audio or additional context is fairly inaccurate. Fundamentally, lip reading is about recognizing the sequence of shapes a mouth makes and mapping them to specific words. Doing this accurately often requires a significant amount of additional information, like the speaker’s body language and the overall context of the conversation. While the current accuracy of the DeepMind model, which stands at about 46.8%, is far from sufficient for meaningful real-time captioning, we’ll hopefully see improvements in the near future as further advancements are made in this field.
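To see why audio-free lip reading is so hard, consider that several distinct sounds share the same visible mouth shape (a “viseme”). The toy inventory below is illustrative, not a real phonetic model, but it shows the ambiguity a lip-reading system must resolve with context:

```python
# /p/, /b/ and /m/ are all produced with the same closed-lip shape,
# so a purely visual model cannot tell them apart. Toy inventory,
# assumed for illustration only.
VISEME_OF = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "a": "open",
    "t": "dental", "d": "dental",
}

def viseme_sequence(word):
    """Map a word to the sequence of mouth shapes it produces."""
    return tuple(VISEME_OF[ch] for ch in word)

def lip_reading_candidates(observed, vocabulary):
    """Return every vocabulary word whose mouth-shape sequence matches
    the observed one -- the set a lip reader must disambiguate using
    body language, conversational context, or a language model."""
    return [w for w in vocabulary if viseme_sequence(w) == observed]

vocab = ["pat", "bat", "mat", "tap"]
candidates = lip_reading_candidates(viseme_sequence("pat"), vocab)
# "pat", "bat" and "mat" are visually indistinguishable
```

Collapsing that candidate set correctly, frame after frame, is exactly where the additional context mentioned above comes in.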
AI ❤ A11y
AI has made significant contributions on the accessibility front. We’re still in the early stages of AI, and there are many advancements yet to be made that can further improve accessibility on the web. The increased focus on accessibility, and the growth in support from tech giants like Microsoft, is moreover a welcome sign that the partnership between AI and accessibility is here to stay.