WordFinder app: Harnessing generative AI on AWS for aphasia communication
On this publish, we showcase how Dr. Kori Ramajoo, Dr. Sonia Brownsett, Prof. David Copland, from QARC, and Scott Harding, an individual dwelling with aphasia, used AWS companies to develop WordFinder, a cellular, cloud-based resolution that helps people with aphasia improve their independence by way of the usage of AWS generative AI expertise.
Within the spirit of giving again to the group and harnessing the artwork of the attainable for constructive change, AWS hosted the Hack For Goal occasion in 2023. This hackathon introduced collectively groups from AWS clients throughout Queensland, Australia, to deal with urgent challenges confronted by social good organizations.
The College of Queensland’s Queensland Aphasia Analysis Centre (QARC)’s mission is to enhance entry to expertise for individuals dwelling with aphasia, a communication incapacity that may influence a person’s capacity to precise and perceive spoken and written language.
The problem: Overcoming communication limitations
In 2023, it was estimated that greater than 140,000 individuals in Australia had been dwelling with aphasia. This quantity is anticipated to develop to over 300,000 by 2050. Aphasia could make on a regular basis duties like on-line banking, utilizing social media, and attempting new gadgets difficult. The objective was to create a cellular app that might help individuals with aphasia by producing a glossary of the objects which might be in a user-selected picture and lengthen the checklist with associated phrases, enabling them to discover different communication strategies.
Overview of the answer
The next screenshot exhibits an instance of navigating the WordFinder app, together with register, picture choice, object definition, and associated phrases.
Within the previous diagram, the next state of affairs unfolds:
- Sign up: The primary display exhibits a easy sign-in web page the place customers enter their e mail and password. It contains choices to create an account or recuperate a forgotten password.
- Picture choice: After signing in, customers are prompted to Decide a picture to look. This display is initially clean.
- Picture entry: The following display exhibits a popup requesting personal entry to the person’s photographs, with a grid of pattern pictures seen within the background.
- Picture chosen: After a picture is chosen (on this case, an image of a koala), the app shows the picture together with some preliminary tags or classifications corresponding to Animal, Bear, Mammal, Wildlife, and Koala.
- Associated phrases: The ultimate display exhibits an inventory of associated phrases based mostly on the number of Associated Phrases subsequent to Koala from the earlier display. This step is essential for individuals with aphasia who usually have difficulties with word-finding and verbal expression. By exploring associated phrases (corresponding to habitat phrases like tree and eucalyptus, or descriptive phrases like fur and marsupial), customers can bridge communication gaps when the precise phrase they need isn’t instantly accessible. This semantic community method aligns with frequent aphasia remedy methods, serving to customers discover alternative routes to precise their ideas when particular phrases are tough to recall.
This stream demonstrates how customers can use the app to seek for phrases and ideas by beginning with a picture, then drilling down into associated terminology—a visible method to increasing vocabulary or discovering related phrases.
The next diagram illustrates the answer structure on AWS.
Within the following sections, we focus on the stream and key parts of the answer in additional element.
- Safe entry utilizing Route 53 and Amplify
- The journey begins with the person accessing the WordFinder app by way of a site managed by Amazon Route 53, a extremely obtainable and scalable cloud DNS internet service. AWS Amplify hosts the React Native frontend, offering a seamless cross-environment expertise.
- Safe authentication with Amazon Cognito
- Earlier than accessing the core options, the person should securely authenticate by way of Amazon Cognito. Cognito offers strong person id administration and entry management, ensuring that solely authenticated customers can work together with the app’s companies and assets.
- Picture seize and storage with Amplify and Amazon S3
- After being authenticated, the person can seize a picture of a scene, merchandise, or state of affairs they want to recall phrases from. AWS Amplify streamlines the method by mechanically storing the captured picture in an Amazon Easy Storage Service (Amazon S3) bucket, a extremely obtainable, cost-effective, and scalable object storage service.
- Object recognition with Amazon Rekognition
- As quickly because the picture is saved within the S3 bucket, Amazon Rekognition, a strong pc imaginative and prescient and machine studying service, is triggered. Amazon Rekognition analyzes the picture, figuring out objects current and returning labels with confidence scores. These labels type the preliminary phrase immediate checklist inside the WordFinder app, kickstarting the word-finding journey.
- Semantic phrase associations with API Gateway and Lambda
- Whereas the preliminary glossary generated by Amazon Rekognition offers a strong start line, the person is perhaps searching for a extra particular or associated phrase. To deal with this problem, the WordFinder app sends the preliminary glossary to an AWS Lambda operate by way of Amazon API Gateway, a totally managed service that securely handles API requests.
- Lambda with Amazon Bedrock, and generative AI and immediate engineering utilizing Amazon Bedrock
- The Lambda operate, appearing as an middleman, crafts a fastidiously designed immediate and submits it to Amazon Bedrock, a totally managed service that gives entry to high-performing basis fashions (FMs) from main AI firms, together with Anthropic’s Claude mannequin.
- Amazon Bedrock generative AI capabilities, powered by Anthropic’s Claude mannequin, use superior language understanding and technology to provide semantically associated phrases and ideas based mostly on the preliminary glossary. This course of is pushed by immediate engineering, the place fastidiously crafted prompts information the generative AI mannequin to offer related and contextually applicable phrase associations.
WordFinder app element particulars
On this part, we take a more in-depth take a look at the parts of the WordFinder app.
React Native and Expo
WordFinder was constructed utilizing React Native, a preferred framework for constructing cross-environment cellular apps. To streamline the event course of, Expo was used, which permits for write-once, run-anywhere capabilities throughout Android and iOS working techniques.
Amplify
Amplify performed an important function in accelerating the app’s growth and provisioning the mandatory backend infrastructure. Amplify is a set of instruments and companies that allow builders to construct and deploy safe, scalable, and full stack apps. On this structure, the frontend of the phrase discovering app is hosted on Amplify. The answer makes use of a number of Amplify parts:
- Authentication and entry management: Amazon Cognito is used for person authentication, enabling customers to enroll and register to the app. Amazon Cognito offers person id administration and entry management with entry to an Amazon S3 bucket and an API gateway requiring authenticated person periods.
- Storage: Amplify was used to create and deploy an S3 bucket for storage. A key element of this app is the power for a person to take an image of a scene, merchandise, or state of affairs that they’re searching for to recall phrases from. The answer must briefly retailer this picture for processing and evaluation. When a person uploads a picture, it’s saved in an S3 bucket for processing with Amazon Rekognition. Amazon S3 offers extremely obtainable, cost-effective, and scalable object storage.
- Picture recognition: Amazon Rekognition makes use of pc imaginative and prescient and machine studying to determine objects current within the picture and return labels with confidence scores. These labels are used because the preliminary phrase immediate checklist inside the WordFinder app.
Associated phrases
The generated preliminary glossary is step one towards discovering the specified phrase, however the labels returned by Amazon Rekognition won’t be the precise phrase that somebody is on the lookout for. The challenge crew then thought-about implement a thesaurus-style lookup functionality. Though the challenge crew initially explored completely different programming libraries, they discovered this method to be considerably inflexible and restricted, usually returning solely synonyms and never entities which might be associated to the supply phrase. The libraries additionally added overhead related to packaging and sustaining the library and dataset transferring ahead.
To deal with these challenges and enhance responses for associated entities, the challenge crew turned to the capabilities of generative AI. Through the use of the generative AI basis fashions (FMs), the challenge crew was capable of offload the continuing overhead of managing this resolution whereas growing the pliability and curation of associated phrases and entities which might be returned to customers. The challenge crew built-in this functionality utilizing the next companies:
- Amazon Bedrock: Amazon Bedrock is a totally managed service that gives a selection of high-performing FMs from main AI firms like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon by way of a single API, together with a broad set of capabilities to construct generative AI apps with safety, privateness, and accountable AI. The challenge crew was capable of rapidly combine with, take a look at, and consider completely different FMs, lastly settling upon Anthropic’s Claude mannequin.
- API Gateway: The challenge crew prolonged the Amplify challenge and deployed API Gateway to simply accept safe, encrypted, and authenticated requests from the WordFinder cellular app and move them to a Lambda operate dealing with Amazon Bedrock entry.
- Lambda: A Lambda operate was deployed behind the API gateway to deal with incoming internet requests from the cellular app. This operate was accountable for taking the equipped enter, constructing the immediate, and submitting it to Amazon Bedrock. This meant that integration and immediate logic could possibly be encapsulated in a single Lambda operate.
Advantages of API Gateway and Lambda
The challenge crew briefly thought-about utilizing the AWS SDK for JavaScript v3 and credentials sourced from Amazon Cognito to instantly interface with Amazon Bedrock. Though this might work, there have been a number of advantages related to implementing API Gateway and a Lambda operate:
- Safety: To allow the cellular shopper to combine instantly with Amazon Bedrock, authenticated customers and their related AWS Identification and Entry Administration (IAM) function would have to be granted permissions to invoke the FMs in Amazon Bedrock. This could possibly be achieved utilizing Amazon Cognito and short-term permissions granted by way of roles. Consideration was given to the potential of uncontrolled entry to those fashions if the cellular app was compromised. By shifting the IAM permissions and invocation dealing with to a central operate, the crew was capable of improve visibility and management over how and when the FMs had been invoked.
- Change administration: Over time, the underlying FM or immediate would possibly want to vary. If both was exhausting coded into the cellular app, any change would require a brand new launch and each person must obtain the brand new app model. By finding this inside the Lambda operate, the specifics round mannequin utilization and immediate creation are decoupled and could be tailored with out impacting customers.
- Monitoring: By routing requests by way of API Gateway and Lambda, the crew can log and monitor metrics related to utilization. This allows higher decision-making and reporting on how the app is performing.
- Knowledge optimization: By implementing the REST API and encapsulating the immediate and integration logic inside the Lambda operate, the crew to can ship the supply phrase from the cellular app to the API. This implies much less information is shipped over the mobile community to the backend companies.
- Caching layer: Though a caching layer wasn’t carried out inside the system in the course of the hackathon, the crew thought-about the power to implement a caching mechanism for supply and associated phrases that over time would scale back requests that have to be routed to Amazon Bedrock. This may be readily queried within the Lambda operate as a preliminary step earlier than submitting a immediate to an FM.
Immediate engineering
One of many core options of WordFinder is its capacity to generate associated phrases and ideas based mostly on a user-provided supply phrase. This supply phrase (obtained from the cellular app by way of an API request) is embedded into the next immediate by the Lambda operate, changing {phrase}:
immediate = "I've Aphasia. Give me the highest 10 commonest phrases which might be associated phrases to the phrase equipped within the immediate context. Your response must be a legitimate JSON array of simply the phrases. No surrounding context. {phrase}"
The crew examined a number of completely different prompts and approaches in the course of the hackathon, however this fundamental guiding immediate was discovered to present dependable, correct, and repeatable outcomes, whatever the phrase equipped by the person.
After the mannequin responds, the Lambda operate bundles the associated phrases and returns them to the cellular app. Upon receipt of this information, the WordFinder app updates and shows the brand new checklist of phrases for the person who has aphasia. The person would possibly then discover their phrase, or drill deeper into different associated phrases.
To take care of environment friendly useful resource utilization and value optimization, the structure incorporates a number of useful resource cleanup mechanisms:
- Lambda computerized scaling: The Lambda operate accountable for interacting with Amazon Bedrock is configured to mechanically scale right down to zero situations when not in use, minimizing idle useful resource consumption.
- Amazon S3 lifecycle insurance policies: The S3 bucket storing the user-uploaded pictures is configured with lifecycle insurance policies to mechanically expire and delete objects after a specified retention interval, releasing up cupboard space.
- API Gateway throttling and caching: API Gateway is configured with throttling limits to assist stop extreme requests, and caching mechanisms are carried out to scale back the load on downstream companies corresponding to Lambda and Amazon Bedrock.
Conclusion
The QARC crew and Scott Harding labored carefully with AWS to develop WordFinder, a cellular app that addresses communication challenges confronted by people dwelling with aphasia. Their profitable entry on the 2023 AWS Queensland Hackathon showcased the facility of involving these with lived experiences within the growth course of. Harding’s insights helped the tech crew perceive the nuances and influence of aphasia, resulting in an answer that empowers customers to search out their phrases and keep linked.
References
Concerning the Authors
Kori Ramijoo is a analysis speech pathologist at QARC. She has in depth expertise in aphasia rehabilitation, expertise, and neuroscience. Kori leads the Aphasia Tech Hub at QARC, enabling individuals with aphasia to entry expertise. She offers consultations to clinicians and offers recommendation and assist to assist individuals with aphasia acquire and preserve independence. Kori can also be researching design issues for expertise growth and use by individuals with aphasia.
Scott Harding lives with aphasia after a stroke. He has a background in Engineering and Laptop Science. Scott is without doubt one of the Administrators of the Australian Aphasia Affiliation and is a shopper consultant and advisor on numerous state authorities well being committees and nationally funded analysis tasks. He has pursuits in the usage of AI in creating predictive fashions of aphasia restoration.
Sonia Brownsett is a speech pathologist with in depth expertise in neuroscience and expertise. She has been a postdoctoral researcher at QARC and led the aphasia tech hub in addition to a analysis program on the mind mechanisms underpinning aphasia restoration after stroke and in different populations together with adults with mind tumours and epilepsy.
David Copland is a speech pathologist and Director of QARC. He has labored for over 20 years within the area of aphasia rehabilitation. His work seeks to develop new methods to know, assess and deal with aphasia together with the usage of mind imaging and expertise. He has led the creation of complete aphasia remedy applications which might be being carried out into well being companies.
Mark Promnitz is a Senior Options Architect at Amazon Net Companies, based mostly in Australia. Along with serving to his enterprise clients leverage the capabilities of AWS, he can usually be discovered speaking about Software program as a Service (SaaS), information and cloud-native architectures on AWS.
Kurt Sterzl is a Senior Options Architect at Amazon Net Companies, based mostly in Australia. He enjoys working with public sector clients like UQ QARC to assist their analysis breakthroughs.
Leave feedback about this