Our research direction: Designing for accessibility
Our early research found that a key barrier to digital equity is the “accessibility gap,” the delay between the release of a new feature and the creation of an assistive layer for it. To close this gap, we are moving from reactive tools to agentic systems that are native to the interface.
Research pillar: Improving accessibility using multi-system agents
Multimodal AI tools offer one of the most promising avenues for building accessible interfaces. Certain prototypes, such as our web readability efforts, tested models in which a central orchestrator acts as a strategic reading manager.
Instead of leaving users to navigate a complex maze of menus, the Orchestrator maintains a shared context and delegates tasks to specialized sub-agents to make documents easier to understand and access.
Summarization agent: Masters complex documents by breaking down information and delegating key tasks to specialized sub-agents, making even the deepest insights clear and accessible.
Configuration agent: Dynamically handles UI adjustments such as text scaling.
By testing this modular approach, our research has shown that users can navigate the system more intuitively: they do not have to hunt for the “right” button, and specialized tasks are always handled by the right expert.
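The delegation pattern described above can be illustrated with a minimal Python sketch. This is not the prototype's actual implementation: the class and function names (`SharedContext`, `Orchestrator`, `summarization_agent`, `configuration_agent`), the keyword-based routing, and the 1.25 scaling step are all assumptions made for illustration, and the summarizer is a placeholder where a real system would call a model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SharedContext:
    """Hypothetical shared state that every agent can read and update."""
    document: str
    text_scale: float = 1.0
    history: List[Tuple[str, str]] = field(default_factory=list)


def summarization_agent(ctx: SharedContext, request: str) -> str:
    # Placeholder logic: return the first sentence of the document.
    # A real agent would call a language model here.
    first_sentence = ctx.document.split(".")[0].strip()
    return f"Summary: {first_sentence}."


def configuration_agent(ctx: SharedContext, request: str) -> str:
    # Assumed behavior: each request enlarges the text by 25%.
    ctx.text_scale = round(ctx.text_scale * 1.25, 2)
    return f"Text scale set to {ctx.text_scale}"


Agent = Callable[[SharedContext, str], str]


class Orchestrator:
    """Routes each request to one sub-agent and records it in the shared context."""

    def __init__(self, ctx: SharedContext) -> None:
        self.ctx = ctx
        # Assumed routing table: keywords mapped to the agent that handles them.
        self.routes: List[Tuple[Tuple[str, ...], Agent]] = [
            (("summar",), summarization_agent),
            (("bigger", "larger", "text size"), configuration_agent),
        ]

    def handle(self, request: str) -> str:
        lowered = request.lower()
        for keywords, agent in self.routes:
            if any(keyword in lowered for keyword in keywords):
                reply = agent(self.ctx, request)
                self.ctx.history.append((request, reply))
                return reply
        return "Sorry, no agent can handle that request yet."


if __name__ == "__main__":
    context = SharedContext(document="Agents help users. They delegate tasks.")
    orchestrator = Orchestrator(context)
    print(orchestrator.handle("Please summarize this page"))
    print(orchestrator.handle("Make the text bigger"))
```

The user only ever talks to the orchestrator. Because both agents receive the same `SharedContext`, a change made by one (for example, the text scale) is visible to the other without the user repeating it.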
Aiming for multimodal fluency
Our research also focuses on moving beyond basic text-to-speech to multimodal fluency. By leveraging Gemini’s ability to process audio, visual, and text input simultaneously, we built a prototype that turns live video into immediate, interactive audio descriptions.
This is more than just describing a scene. It is about situational awareness. In our co-design sessions, we observed how allowing users to interactively query the environment, asking for specific visual details as they occur, can reduce cognitive load and transform passive experiences into active, conversational exploration.


