Audio-to-Visual Display for the Aurally Impaired (AVDAI)
Overview
Smart Glasses: Audio-to-Visual Display for the Aurally Impaired (AVDAI) is a project designed to assist individuals with hearing impairments by converting audio input into a visual display. It integrates speech recognition, natural language processing (NLP), and holographic projection into a user-friendly, practical solution. The glasses are built around four main components: an Arduino Uno, a voice module, an OLED screen, and a custom glasses frame.
The project leverages Google Cloud's Speech-to-Text and Translate APIs for AVDAI's core functionality and uses ChatGPT as a voice assistant. The software components, including the NLP pipeline, were implemented in Python, while the hardware integration was done through the Arduino IDE. The initial prototype was constructed from cardboard and plastic scraps, with plans for a 3D-printed final version. Future enhancements include adding databases for weather and news access, object detection capabilities, and a redesign of the glasses for improved portability and reduced bulk.
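As a rough illustration of the voice-assistant step, the sketch below sends a recognized utterance to ChatGPT through the openai Python package; the model name, prompt, and client setup are assumptions rather than the project's documented configuration.

```python
# Minimal sketch of the voice-assistant call, assuming the openai package
# and an OPENAI_API_KEY in the environment; model and prompt are illustrative.
from openai import OpenAI

client = OpenAI()

def ask_assistant(transcript: str) -> str:
    """Send recognized speech to ChatGPT and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[
            {"role": "system", "content": "You are a concise voice assistant."},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content

print(ask_assistant("What time is it in Tokyo?"))
```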
Project Principles
The Smart Glasses project draws on principles spanning multiple domains of technology:
Audio Signal Processing:
Captures audio signals through a microphone and converts them into digital data, forming the basis of the system's input.
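As a concrete example of this capture step, the sketch below records a short chunk of microphone audio in Python using the sounddevice package; in the actual prototype the hardware voice module handles capture, so this is only a stand-in.

```python
# Capture sketch using the sounddevice package (an assumption; the prototype
# reads audio through its hardware voice module instead).
import sounddevice as sd

SAMPLE_RATE = 16000  # 16 kHz mono, a common rate for speech recognition

def record_chunk(seconds: float = 3.0) -> bytes:
    """Record a short chunk from the default microphone as 16-bit PCM."""
    frames = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                    channels=1, dtype="int16")
    sd.wait()  # block until the recording finishes
    return frames.tobytes()

pcm = record_chunk()
print(f"Captured {len(pcm)} bytes of audio")
```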
Speech Recognition:
Uses the Speech Recognition Module V3 to convert spoken words into text, the form in which the glasses can display them.
Natural Language Understanding (NLU):
Analyzes and understands the text to determine the user's intent, ensuring that the system responds appropriately.
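How the intent analysis was implemented is not detailed in this write-up, so the following is only a minimal keyword-based sketch of the idea: map recognized text to a small set of intents, falling back to the assistant for open-ended queries.

```python
# Toy intent matcher; the keyword rules and intent names are assumptions.
INTENTS = {
    "weather": ("weather", "temperature", "forecast"),
    "translate": ("translate", "in spanish", "in french"),
    "time": ("time", "clock"),
}

def detect_intent(text: str) -> str:
    """Return the first intent whose keywords appear in the text."""
    lowered = text.lower()
    for intent, keywords in INTENTS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return "chat"  # fall through to the ChatGPT assistant

print(detect_intent("What's the weather like today?"))  # -> "weather"
```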
Microcontroller Integration:
The Arduino Uno serves as the central processing unit, coordinating between sensors, modules, and output devices.
Real-time Processing:
Ensures the system processes input, such as voice commands, in real-time, providing instant feedback and responses to the user.
OLED Display Control:
Manages the OLED screen to display text and images, focusing on efficient data transfer and minimizing power consumption.
Holographic Projection:
Utilizes a mirror and magnifier to project the OLED pixels onto the glasses lens. This setup allows for adjustable text size and angle, and is compatible with prescription lenses.
Multilingual Speech-to-Text:
Converts spoken language into text while recognizing and accommodating multiple languages, enhancing the accessibility of the glasses.
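A sketch of this step using Google Cloud's Speech-to-Text API, which the project relies on, is shown below; the candidate language codes are illustrative, and `pcm` is assumed to be 16 kHz, 16-bit mono audio such as the chunk captured earlier.

```python
# Multilingual recognition sketch with google-cloud-speech; language codes
# are examples only.
from google.cloud import speech

def transcribe(pcm: bytes) -> str:
    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        # Let the API choose among several candidate languages.
        alternative_language_codes=["es-ES", "fr-FR", "hi-IN"],
    )
    audio = speech.RecognitionAudio(content=pcm)
    response = client.recognize(config=config, audio=audio)
    return " ".join(r.alternatives[0].transcript for r in response.results)
```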
Machine Translation:
Employs algorithms or APIs to translate text from one language to another, accounting for grammar, context, and language nuances.
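Since the project uses Google Cloud's Translate API, a minimal translation call might look like the following; the target language and sample text are illustrative.

```python
# Translation sketch using the google-cloud-translate v2 client.
from google.cloud import translate_v2 as translate

def translate_text(text: str, target: str = "en") -> str:
    client = translate.Client()
    result = client.translate(text, target_language=target)
    return result["translatedText"]

print(translate_text("Hola, ¿cómo estás?"))  # -> "Hello, how are you?"
```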
Voice User Interface (VUI):
Designs interactions that rely on voice commands, making the system intuitive and easy to use for individuals with varying levels of tech-savviness.
Usability of Glasses:
Considers the physical design aspects of the glasses, such as weight, balance, and comfort, ensuring they can be worn comfortably for extended periods.
Building the Smart Glasses
The construction and implementation of the Smart Glasses involved several detailed steps:
Hardware Components:
An Arduino Uno microcontroller serves as the backbone of the project.
The voice module captures audio input, which is processed by Google Cloud’s Speech-to-Text API.
The processed text is sent to an OLED screen for visual output.
The frame of the glasses, initially made from cardboard, will be 3D-printed for the final version.
Software Integration:
Python (written in VS Code) was used to develop the NLP-related code.
The Arduino IDE was used to write and upload the firmware through which the Arduino Uno receives the processed text over a serial connection and drives the display.
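A minimal host-side bridge between the Python code and the Arduino firmware might look like the sketch below, assuming the pyserial package; the serial port name and baud rate are assumptions and will vary by setup.

```python
# Host-side serial bridge, assuming pyserial and that the Uno enumerates
# at /dev/ttyACM0; port and baud rate are placeholders.
import serial

ser = serial.Serial("/dev/ttyACM0", 9600, timeout=1)

def send_to_display(text: str) -> None:
    """Send one line of text; the firmware reads up to the newline."""
    ser.write((text.strip() + "\n").encode("utf-8"))

send_to_display("Hello, world")
```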
Display Mechanism:
Text is converted and sent to the Arduino, which then controls the OLED display.
The display's output is projected onto the glasses lens using a mirror and magnifying glass, allowing the user to view the translated text in real-time.
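Because the OLED can show only a few short lines at a time, the host side has to break transcripts into screen-sized pages. The sketch below assumes a 128x64 panel with a 6x8-pixel font (about 21 characters per line, 8 lines per screen); the actual panel and font are not specified in this write-up.

```python
# Host-side line wrapping for a small OLED; panel geometry is an assumption.
import textwrap

CHARS_PER_LINE = 21   # 128 px / 6 px per character, rounded down
LINES_PER_SCREEN = 8  # 64 px / 8 px per text row

def paginate(text: str) -> list[list[str]]:
    """Split translated text into screenfuls the display can show."""
    lines = textwrap.wrap(text, width=CHARS_PER_LINE)
    return [lines[i:i + LINES_PER_SCREEN]
            for i in range(0, len(lines), LINES_PER_SCREEN)]

for screen in paginate("This is a longer transcript that must be broken "
                       "into display-sized pages before it is sent out."):
    print("\n".join(screen), "\n---")
```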
Future Enhancements:
Plans to integrate databases for accessing information such as weather and news.
Implementation of an object-detection algorithm using OpenCV to assist visually impaired users by describing their surroundings (a minimal sketch follows this list).
Consideration of redesigning the hardware using a Raspberry Pi to enhance portability and reduce the bulkiness of the glasses.
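As a rough preview of the planned object-detection enhancement, the sketch below uses OpenCV's built-in HOG person detector, which runs without external model files; a fuller implementation would swap in a general-purpose detector and a camera mounted on the glasses.

```python
# Stand-in for the planned OpenCV detection: the built-in HOG person
# detector (people only); camera index is an assumption.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture(0)  # default camera
ok, frame = cap.read()
if ok:
    boxes, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    print(f"Detected {len(boxes)} person(s) in view")
cap.release()
```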
Outcome
The AVDAI Smart Glasses prototype successfully demonstrated the feasibility of converting audio inputs into visual displays, offering significant potential to improve the quality of life for individuals with hearing impairments. The initial prototype, though rudimentary, paved the way for a more refined and usable version that will be 3D-printed for durability and comfort.
Conclusion
The Smart Glasses project represents a significant step forward in assistive technology for the aurally impaired. By combining advanced audio processing, speech recognition, NLP, and holographic projection, the glasses offer a practical and user-friendly solution. The project highlights the importance of integrating software and hardware to create accessible, real-time assistive devices. As development continues, these glasses have the potential to become a widely adopted tool for improving communication and accessibility for those with hearing challenges.