In the Bible, Jesus gave the blind Bartimaeus the ability to see. We want to give blind individuals the ability to navigate their surroundings with the help of an audio agent that issues verbal commands guiding them toward an object of their choosing. The interface runs on a phone camera pointed at the user's field of view, and the user gives an auditory cue to identify which object in their surroundings they would like to reach.
Tech Stack
Using YOLO and OpenCV (with a Hugging Face vision transformer), we identify the closest object of the requested type and enclose it in a bounding box. We then combine two zero-shot models (Depth Anything and SAM) with a mathematical path-finding algorithm of our own design to give the user audio cues for legal motion: paths that are free of obstacles and aligned toward the target object or location.
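The cue-generation step can be sketched roughly as follows. This is an illustrative toy, not the actual Bartimaeus implementation: the function name, the depth-row representation, and the clearance threshold are all assumptions made for the example.

```python
def navigation_cue(depth_row, target_col, clearance=1.0):
    """Pick a verbal command from one row of an estimated depth map.

    depth_row  -- estimated distances (meters) across the camera's view,
                  e.g. sampled from a monocular depth model's output
    target_col -- column index of the target object's bounding-box center
    clearance  -- minimum free depth (meters) to treat the path as passable
                  (hypothetical tuning knob)
    """
    n = len(depth_row)
    center = n // 2
    # If the straight-ahead corridor is blocked, steer toward the freer side.
    if depth_row[center] < clearance:
        left_free = max(depth_row[:center])
        right_free = max(depth_row[center + 1:])
        return "step left" if left_free >= right_free else "step right"
    # Otherwise rotate until the target is roughly centered, then advance.
    if target_col < center - n // 8:
        return "turn left"
    if target_col > center + n // 8:
        return "turn right"
    return "walk forward"


# Path clear and target centered -> advance.
print(navigation_cue([2.0, 2.5, 3.0, 2.8, 2.2], target_col=2))  # walk forward
# Obstacle dead ahead, more free depth on the right -> sidestep.
print(navigation_cue([0.5, 0.4, 0.3, 2.0, 2.5], target_col=4))  # step right
```

A real pipeline would run this per frame on the fused depth/segmentation output and speak the command through text-to-speech.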
Demo Videos
We've performed extensive tests of our model's accuracy (from both a computer-vision and a trajectory-planning perspective) to determine its robustness. Below, we show four general cases of indoor assisted navigation.
Navigation to a water bottle in front of us.
Finding a chair with a backpack in the middle.
Finding a chair that isn't in our current frame.
Requesting an object that is not in our environment.
Challenge Example
We also examine the case where multiple obstacles lie between the user and the specific object they are reaching for, and show that the system can seamlessly perform several find tasks in sequence.
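With multiple detections in frame, one sub-problem is deciding which detected boxes actually obstruct the corridor toward the target. A minimal sketch of that check, assuming pixel-space bounding boxes and a corridor width parameter that are purely illustrative (not the actual algorithm):

```python
def obstacles_in_corridor(obstacle_boxes, target_box, frame_width, corridor_frac=0.25):
    """Return the obstacle boxes that overlap a vertical corridor centered
    on the target's x-position.

    Boxes are (x_min, y_min, x_max, y_max) in pixels; corridor_frac is the
    corridor width as a fraction of the frame width (an assumed tuning knob).
    """
    target_x = (target_box[0] + target_box[2]) / 2     # target center x
    half = frame_width * corridor_frac / 2
    lo, hi = target_x - half, target_x + half
    # A box blocks the corridor if its horizontal span intersects [lo, hi].
    return [box for box in obstacle_boxes if box[0] < hi and box[2] > lo]


chair = (400, 100, 500, 300)                  # hypothetical target detection
others = [(300, 200, 380, 400), (0, 0, 100, 100)]
print(obstacles_in_corridor(others, chair, frame_width=640))
# Only the first box overlaps the corridor toward the chair.
```

Boxes flagged this way would then feed the path-finding step, which routes the user around them rather than straight ahead.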
Acknowledgments
Special thanks to all contributors and collaborators who helped make Bartimaeus possible.
…to HackPrinceton for allowing us to breathe life into this
…to Jennifer for inspiring this project
…to the Western Penn. School for the Blind for engaging in a follow-up project