Summary from our session on Voice-Enabled VR Activities Reshaping Language Therapy Experiences

Ognjen Todic and Nikola Paunovic from Keen Research presented our online session (https://www.thevrara.com/industry-committees) on Voice-Enabled VR Activities Reshaping Language Therapy Experiences to 25+ professionals.

Ognjen Todic provided an overview of Keen Research's voice-enabled VR activities for language therapy, emphasizing the privacy benefits of on-device processing. The discussion also covered the broader benefits of local processing, the SDK and customer use cases, and Pragmatica's virtual reality solution for speech and language therapy. Challenges in accessing speech therapy were highlighted, along with the company's business model and future plans. Attendees also asked about the regulatory pathway, voice data storage, use cases of the solution, integration with existing VR programs, hardware and battery consumption, voice recognition and noise, supported languages, and the pricing model.

Voice-enabled VR activities for language therapy

  • overview of Keen Research and their software development kit (SDK) for on-device speech recognition.

  • benefits of on-device speech recognition and how their product, Keen ASR SDK, lets developers voice-enable applications locally without the need for internet connectivity.

  • Keen Research develops SDKs for on-device speech recognition.

  • developers integrate Keen ASR SDK in their applications to voice-enable them.

  • privacy and security benefits of on-device processing, ensuring compliance with regulations like HIPAA, COPPA, and GDPR.

  • importance of on-device processing, including offline functionality, network independence, and scalability.

  • advantages of on-device processing, such as privacy, security, and removing dependencies on network and back-end services.

Discussion about the benefits of local processing

  • The speaker mentions the benefits of running everything locally, including no network latency, easier customization, and lower cost compared with usage-based cloud solutions. They also mention having full control over users' data and how it is used.

Discussion about the SDK and customer use cases

  • SDK and its features, including native SDKs for iOS, Android, and Linux, a web JavaScript SDK in beta, and a Unity plugin. They also mention supporting multiple languages.

  • customer use cases, including Metronic, University of Louisville, Viacom, Paramount, PBS Kids, and Pragmatica, with use cases in education and VR training.

Discussion about Pragmatica's virtual reality solution for speech and language therapy

  • VR solution for speech and language therapy. They mention the positive feedback and traction received for their VR activities.

  • the problem they are trying to solve, which is the limited availability and high cost of speech therapy for children with communicative disorders.

Challenges in accessing speech therapy

  • The average cost of speech therapy is $150 an hour, which can go up to over $250 an hour in metropolitan areas. Some waitlists for appointments can be as long as 6 to 12 months, and even longer due to the COVID-19 pandemic.

  • The core product is virtual reality activities that allow individuals to practice their communication skills independently. These activities are professionally curated and tailored for clients.

  • The company's solution is to create virtual reality activities that mimic speech therapy sessions, allowing individuals to practice their communication skills at home. This solution aims to provide immediate intervention and overcome the barriers of cost and availability.

  • The advantages of the virtual reality product include the use of Keen Research's voice recognition technology, custom tips and personalized feedback, and the immersive and transferable practice it offers. Virtual reality allows for hand, head, and gesture tracking, which is beneficial for individuals with autism.

Business model and future plans

  • The company's business model involves selling the product as a subscription plan for $150 a month, providing access to all virtual reality activities and the data dashboard.

  • The company is based in Canada and plans to expand to the US market. They have competitors in the VR therapy space but highlight their product's autonomy and voice recognition capabilities as unique advantages.

  • The team plans to finish stage one of their marketing strategy by the end of the year and move on to stage two, which includes ad spend, digital marketing, and social media. They also discuss the need for seed funding by September 2025.

Regulatory Pathway

  • the regulatory pathway for the company's intervention in the field of suicide prevention and clinical trials. Karthik explains that FDA approval is not mandatory in speech and language therapy, but they are looking into regulatory approvals for insurance purposes. The main concern is data privacy.

Voice Data Storage

  • where the voice data is stored. The processing happens locally on the device, so no recording needs to be kept on the device. However, there is an option to store the recording if needed.

Use Cases of the Solution

  • traction in EdTech, language learning, entertainment/education, and frontline worker scenarios such as warehouse operations and healthcare procedures.

Virtual Reality (VR) and Audio Virtual Reality

  • an attendee raised a concern about the lack of voice communication in her VR research study and asked whether it could be integrated with an existing company's VR program.

  • integrating voice communication into an existing VR program would require the company behind that program to integrate the Keen ASR SDK.

  • a Unity plugin is available, but the vendor of the existing VR program would still need to integrate that plugin into their product.

Hardware and Battery Consumption

  • custom hardware, typically Linux or Android devices, can be used for running the recognition locally.

  • battery consumption during recognition varies depending on the CPU; recognition typically uses 20 to 60 percent of a single CPU core.

  • the impact on battery is negligible for most use cases, except for always-on listening scenarios where ruggedized Android phones with stronger batteries may be required.

  • cloud-based services can drain the battery because they keep Wi-Fi or another internet connection active, which makes always-on listening through the cloud impractical.

Voice Recognition and Noise

  • the acoustic models used for voice recognition are trained with noisy data to make them more robust.

  • background noises are typically not challenging for their SDK, but cross talk, especially in situations with multiple people speaking simultaneously, can be harder to handle.

Supported Languages and Adding New Languages

  • the system currently supports English, Spanish, German, and French.

  • models optimized for kids' voices in English.

  • adding new languages is considered straightforward, but it requires a lot of speech data. They are open to adding more languages based on business needs.

Pricing Model

  • the company's licensing model, compared with Azure's pricing model.

  • the trade-offs between the two pricing models, highlighting the suitability of their product for larger user bases or environments with unreliable internet connectivity.
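
The pricing trade-off can be illustrated with a small break-even calculation. This is only a sketch: the session summary gives no actual prices, so it assumes, purely for illustration, that the on-device license is a flat fee and that the cloud service bills per hour of audio. The function names and every number below are hypothetical placeholders, not Keen Research's or Azure's real pricing.

```python
def break_even_hours(flat_license_fee: float, cloud_rate_per_hour: float) -> float:
    """Total audio hours at which a flat license costs the same as usage-based billing.

    Both inputs are hypothetical; real pricing terms were not given in the session.
    """
    if cloud_rate_per_hour <= 0:
        raise ValueError("cloud_rate_per_hour must be positive")
    return flat_license_fee / cloud_rate_per_hour


def cheaper_option(flat_license_fee: float, cloud_rate_per_hour: float,
                   users: int, hours_per_user: float) -> str:
    """Return which (assumed) pricing model costs less for a given user base."""
    usage_cost = cloud_rate_per_hour * users * hours_per_user
    return "flat license" if flat_license_fee < usage_cost else "usage-based"


if __name__ == "__main__":
    # Placeholder figures only: a 10,000-unit flat fee vs. 1 unit per audio hour.
    print(break_even_hours(10_000.0, 1.0))
    print(cheaper_option(10_000.0, 1.0, users=100, hours_per_user=10))
    print(cheaper_option(10_000.0, 1.0, users=5_000, hours_per_user=10))
```

Under these assumed numbers, a small user base stays below the break-even point and a large one goes past it, which matches the point made in the session that on-device licensing suits larger user bases.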