Meowdy, long time no see! I’ve just migrated this blog over to AWS, in hopes that I can start being more active, and share more projects that I’m working on publicly. One such project I’ve been working on lately is a web extension called GamePad Navigator. To best explain what this project is, I’ll provide some background on how it came to be.
Geoguessr
If you’ve never heard of it, Geoguessr is a game where you are placed in a random Street View panorama somewhere in the world. Your job is to locate the exact position of the panorama using context clues like signs, buildings, and foliage. It’s a great game that I’ve always liked playing. Recently, I picked up a Steam Deck to play more games away from my desk, but I discovered that the game has no native controller support. I was able to hack something together in the Steam Workshop to get some basic controller support working, but it was… less than ideal, to say the least.
I started working on a browser extension that would map out all of the menu items and toggles, and read in controller input to move between them. I could have simply implemented this as a static UI map and moved on, but I didn’t want my solution to break when they eventually decided to change the UI, so I started adding more and more complexity until, well, it could map pretty much anything. It was at this point that I decided to take the focus away from Geoguessr, and move the extension in a more general accessibility direction.
So I wanted to make an extension that lets you move between different interactable elements with a controller. I’m hoping to make this a series of blog posts that dive into the details of each problem I encountered while working on this extension. For this post, the first step is to define what “interactable” means. This ended up being quite the rabbit hole, as there is no clear-cut way of making this determination.
What should be interactable?
I could just select all button elements on the page and be done with it, but what about text fields? What about buttons that are covered up by other elements? What about buttons that aren’t currently on screen? What if the interactable element on a site isn’t a button element at all? This took a great deal of iteration, but I ended up settling on a series of increasingly strict checks to test whether an element is likely to be interactable.
- Is the element at least 75% on screen?
- Is the element disabled in some way via a CSS prop or attribute?
- Is the element large enough to be visible to the average user?
- Is the element obscured by another element?
I wrote a series of TypeScript classes that trivialize most of these checks, such as bounds overlap checking, or element area and aspect ratio checking. But the final check, whether the element is obscured by another element, ended up being significantly more challenging.
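To give a rough idea of what such a helper class might look like, here is a minimal sketch. The Bounds class, its method names, and the 16×16 pixel minimum size are all hypothetical, invented for illustration rather than taken from the extension; only the 75% on-screen threshold comes from the list above.

```typescript
// Hypothetical helper class; the names are illustrative, not the extension's real API.
class Bounds {
  readonly left: number;
  readonly top: number;
  readonly width: number;
  readonly height: number;

  constructor(left: number, top: number, width: number, height: number) {
    this.left = left;
    this.top = top;
    this.width = width;
    this.height = height;
  }

  get right(): number { return this.left + this.width; }
  get bottom(): number { return this.top + this.height; }
  get area(): number { return this.width * this.height; }

  // Width divided by height; Infinity for a zero-height box.
  get aspectRatio(): number {
    return this.height === 0 ? Infinity : this.width / this.height;
  }

  // The overlapping region of two boxes, or null if they do not overlap.
  intersect(other: Bounds): Bounds | null {
    const left = Math.max(this.left, other.left);
    const top = Math.max(this.top, other.top);
    const right = Math.min(this.right, other.right);
    const bottom = Math.min(this.bottom, other.bottom);
    if (right <= left || bottom <= top) return null;
    return new Bounds(left, top, right - left, bottom - top);
  }

  // Fraction (0 to 1) of this box's area that lies inside `container`.
  fractionInside(container: Bounds): number {
    if (this.area === 0) return 0;
    const overlap = this.intersect(container);
    return overlap ? overlap.area / this.area : 0;
  }
}

const ON_SCREEN_THRESHOLD = 0.75; // the 75% figure from the list of checks
const MIN_AREA_PX = 16 * 16;      // assumed value, not from the extension

function isMostlyOnScreen(element: Bounds, viewport: Bounds): boolean {
  return element.fractionInside(viewport) >= ON_SCREEN_THRESHOLD;
}

function isLargeEnough(element: Bounds): boolean {
  return element.area >= MIN_AREA_PX;
}
```

In a browser, the element box would come from getBoundingClientRect() and the viewport box from window.innerWidth and window.innerHeight. Keeping the geometry in a plain class like this means the checks can be tested without a page.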
Checking if an Element is Obscured
For the moment, I’ve resigned myself to taking the most technically accurate approach, and looking into optimization later. I’ve settled on the following method:
- Get the center point of a given element’s bounds.
- Use document.elementsFromPoint() (MDN) to get an array of all elements that intersect that point, ordered from topmost to bottommost.
- Slice the list to leave only elements on top of the given element.
- Check each element for CSS props that would make the element opaque, like background or filter.
- Traverse each element’s ancestors (checking that their bounds still overlap with the element in question, and that they are also non-opaque), until we reach a common parent.
- If no opaque element has been found, the element must be visible.
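The steps above can be sketched roughly as follows. To keep the example runnable outside a browser, it works on a hypothetical NodeLike stand-in instead of real DOM elements. In a page, the stack would come from document.elementsFromPoint() at the target’s center, the box from getBoundingClientRect(), and the opaque flag from inspecting getComputedStyle(). Two details are my own assumptions rather than part of the method described above: the target’s own descendants are never counted as obscuring it, and a target missing from the stack is treated as obscured.

```typescript
interface Box { left: number; top: number; right: number; bottom: number; }

// Hypothetical stand-in for a DOM element.
interface NodeLike {
  parent: NodeLike | null;
  box: Box;        // in a browser: el.getBoundingClientRect()
  opaque: boolean; // in a browser: derived from computed background, filter, etc.
}

function overlaps(a: Box, b: Box): boolean {
  return a.left < b.right && b.left < a.right &&
         a.top < b.bottom && b.top < a.bottom;
}

// The node itself followed by all of its ancestors.
function selfAndAncestors(node: NodeLike): Set<NodeLike> {
  const chain = new Set<NodeLike>();
  for (let n: NodeLike | null = node; n !== null; n = n.parent) chain.add(n);
  return chain;
}

// `stack` is ordered topmost first, as document.elementsFromPoint() returns it.
function isObscured(target: NodeLike, stack: NodeLike[]): boolean {
  const index = stack.indexOf(target);
  // Assumption: if the target is not hit at its own center, treat it as obscured.
  if (index === -1) return true;

  const above = stack.slice(0, index);
  const targetChain = selfAndAncestors(target);

  for (const candidate of above) {
    // Walk up from the candidate until we reach a node shared with the target.
    const path: NodeLike[] = [];
    let node: NodeLike | null = candidate;
    while (node !== null && !targetChain.has(node)) {
      path.push(node);
      node = node.parent;
    }
    // Assumption: the target's own content (e.g. a button label) never obscures it.
    if (node === target) continue;
    // Any opaque node on the path that still overlaps the target hides it.
    if (path.some((n) => n.opaque && overlaps(n.box, target.box))) return true;
  }
  return false;
}
```

The outer loop runs once per element stacked above the target, and the inner walk climbs that element’s ancestors, so the cost grows with both the size of the stack and the depth of the tree.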
Those familiar with big O notation will recognize this algorithm as O(nd), for n elements stacked at the point and up to d ancestors to walk for each, which, to be clear, is pretty terrible. But again, the focus for now is on technical accuracy; optimization will come later.
The Final Verdict
If we select all form elements on the page, and filter them via the checks described above, we now have a pretty good approximation of which elements are most likely to be interactable on the page. There are some gaps I would like to fill, like detecting the CSS cursor prop, or looking for elements with click listeners, but this provides a solid foundation covering most interactable elements. The advantage of this design is that it works more or less visually, the same way you do, and doesn’t rely on any assistive technology tags being present in the DOM. This enables ALL websites to be compatible, not just ones that follow best practices.
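As a rough sketch of how the checks might be chained, the example below filters a list of candidates through each check in order, cheapest first, stopping at the first failure. The Candidate shape, the function names, the minimum area, and the particular CSS properties treated as “disabled” (display, visibility, pointer-events) are all assumptions made for illustration. In a page, the candidates could come from something like document.querySelectorAll("button, input, select, textarea").

```typescript
// Hypothetical snapshot of what each check needs to know about an element.
interface Candidate {
  name: string;
  onScreenFraction: number;   // 0 to 1, share of the element inside the viewport
  area: number;               // in square pixels
  disabledAttr: boolean;      // the `disabled` attribute is present
  display: string;            // computed CSS `display`
  visibility: string;         // computed CSS `visibility`
  pointerEvents: string;      // computed CSS `pointer-events`
  isObscured: () => boolean;  // expensive, so only called when needed
}

type Check = (c: Candidate) => boolean;

const isOnScreen: Check = (c) => c.onScreenFraction >= 0.75;

const isEnabled: Check = (c) =>
  !c.disabledAttr &&
  c.display !== "none" &&
  c.visibility !== "hidden" &&
  c.pointerEvents !== "none";

const isLargeEnough: Check = (c) => c.area >= 256; // assumed minimum

const isUnobscured: Check = (c) => !c.isObscured();

// Ordered from cheapest to most expensive; `every` stops at the first failure,
// so the obscured check only runs for elements that passed everything else.
const CHECKS: Check[] = [isOnScreen, isEnabled, isLargeEnough, isUnobscured];

function findInteractable(candidates: Candidate[]): Candidate[] {
  return candidates.filter((c) => CHECKS.every((check) => check(c)));
}
```

Running the expensive check last means most elements are rejected before it is ever called.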
In my next post, I’ll explain the challenges I faced next while creating a dynamic map of which elements connect to which, along with my solutions to some interesting problems I was not expecting to run into. See ya soon!
Pup Atlas, Chief Good Boy