The AI-Powered “Brain Race” Depends on Massive Data
The rise of OpenAI's ChatGPT has captured the public imagination by showing us the power of artificial intelligence (AI). But these AI models did not emerge out of thin air. They were built on a vast library of human-generated text gathered from across the web and trained by reading billions of web pages. Machine learning (ML) models rely on a fundamental principle: the more data they can learn from, the more capable they tend to become.
Advances in AI may now enable a major revolution in dementia care. Transformer architectures, the same ones that power ChatGPT, can analyze comprehensive datasets on human health and generate novel insights. With rapid increases in computational power, we may soon be able to use such ML models for accurate, early diagnosis of dementia.
This could have a profoundly positive social impact. Minoritized populations, specifically African American individuals, are up to twice as likely to develop dementia as the white population.1 They’re also less likely to receive a timely formal diagnosis because of a lack of access to care and other systemic biases.2 These groups stand to benefit the most from advances in ML models that could enable lower-cost diagnostics.
Unfortunately, under the current paradigm, they are unlikely to benefit from these tech advances. Why?
The reason is what we call the “Data Disparity.” The datasets available to fuel ML models do not reflect the diversity of the population. In the field of neurology, systemic biases have led to a stark disparity in the collection of data from African American and other minority populations.
The largest neuroimaging dataset in the world, the UK Biobank, is made up of 95% white participants.3 A 2023 study found that open-access dementia neuroimaging datasets remain very limited in ethnic diversity, with disproportionately large coverage of white populations (84.3%) compared with African American (8.0%) or other ethnic minorities.4
When the data we collect fails to reflect the full spectrum of human diversity, we risk creating AI models that perpetuate inequities and fail to serve those who need them most. For AI researchers experimenting with ML models to detect cognitive decline, this lack of data will stifle the development of valid diagnostic models.
This is not just a matter of academic curiosity. For the millions of individuals living with dementia, access to early detection and diagnosis can impact everything from disease trajectory to family caregiving burden, to financial planning for care. And for minority populations, the stakes could not be higher.
If left unaddressed, the result will be a widening "neuro-inequity," as game-changing tools for early dementia detection leave behind the groups who could benefit the most. To truly democratize the power of neurotechnology, we must first democratize the data that drives it by proactively including diverse minds in our machine learning models.
Project OpenMind: Driving Diversity in Dementia Datasets
At CareYaya, we've witnessed firsthand the immense potential of neurotech to reshape dementia care. That's why we've embarked on an ambitious initiative called Project OpenMind, with the goal of building the largest and most diverse collection of research-grade data from minority and underserved populations.
At the heart of OpenMind are partnerships with senior centers, adult day health programs, community health clinics, and grassroots organizations serving older adults of color. By meeting participants where they are, in trusted community settings, we aim to break down barriers to brain health research participation. Our approach is to collect high-quality, research-grade data on African American, Hispanic, Asian-American, and Native American individuals across major cities in America. Over several years, this could become the largest such open-access dataset available for research.
To rapidly scale up data collection while ensuring consistently high signal quality, we're deploying a fleet of healthcare students across the country through our existing CareYaya platform. They’re equipped with a range of tools, from simple questionnaire-based cognitive assessments to technology for collecting biometric data and portable wireless EEG headsets.
These technologies allow us to bring the power of a state-of-the-art neuroscience lab directly into the communities we serve, without compromising on data integrity. By streamlining setup and minimizing the need for participant restrictions, we can accelerate enrollment and retention of diverse seniors. (For an interesting demonstration of how innovation in EEG technology is creating accessible testing compatible with different hair types, readers can watch this video.)
But collecting data is just the first step; curating it into research-ready datasets is equally vital. That's where our AI-driven signal processing pipeline comes in, automatically cleaning data and aligning recordings across participants to generate standardized, analysis-ready datasets. This will allow scientists and software engineers to easily integrate OpenMind data into their machine learning models.
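The article does not describe how the pipeline works internally, so the following is only a toy sketch of what "cleaning" and "aligning" recordings could look like for single-channel signals. Every name here (`clean`, `standardize`, `align`, `build_dataset`) and the amplitude cutoff are hypothetical choices made for illustration, not the OpenMind implementation; a real EEG pipeline would use filtering and artifact-rejection methods far more careful than simply dropping samples.

```python
from statistics import mean, pstdev

# All names and thresholds below are hypothetical; the article does not
# specify how the OpenMind pipeline cleans or aligns recordings.
ARTIFACT_LIMIT_UV = 100.0  # assumed amplitude cutoff, in microvolts


def clean(samples, limit=ARTIFACT_LIMIT_UV):
    """Drop samples whose absolute amplitude exceeds the cutoff."""
    return [s for s in samples if abs(s) <= limit]


def standardize(samples):
    """Rescale one recording to zero mean and unit variance (z-score)."""
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        return [0.0 for _ in samples]
    return [(s - mu) / sigma for s in samples]


def align(recordings):
    """Trim every recording to the length of the shortest one."""
    if not recordings:
        return []
    shortest = min(len(r) for r in recordings)
    return [r[:shortest] for r in recordings]


def build_dataset(raw_recordings):
    """Clean, standardize, and align a list of single-channel recordings."""
    cleaned = [clean(r) for r in raw_recordings]
    usable = [r for r in cleaned if len(r) >= 2]  # skip near-empty recordings
    return align([standardize(r) for r in usable])


if __name__ == "__main__":
    raw = [
        [10.0, 12.0, 250.0, 11.0, 9.0],  # contains one artifact spike
        [5.0, 7.0, 6.0, 8.0, 4.0, 6.0],
    ]
    dataset = build_dataset(raw)
    print([len(r) for r in dataset])  # every recording now has the same length
```

The point of the sketch is the shape of the output: each participant's recording ends up on the same scale and the same length, which is what lets downstream machine learning code treat the collection as one table.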
Critically, we're committed to making the OpenMind dataset available to the global research community, with the aim of catalyzing a new wave of inclusive AI innovation for dementia. By creating a rising tide of diverse data, we believe we can lift all boats, enabling the development of diagnostic and neurotech tools that work equally well for every population.
The OpenMind project is just beginning, but it has already received interest and support from top experts and from cultural and political leaders interested in the future of racially equitable neuroscience.
Building a Neuro-Equitable Future of Brain Health
As we stand on the cusp of a neurotech revolution in dementia care, we face a critical choice: will we allow the AI algorithms that increasingly shape diagnosis and treatment to be warped by the biases of narrow datasets, or will we proactively steer them toward inclusivity and equity?
Left to evolve on their own, purely optimized for accuracy on skewed data, these tools risk not just reflecting, but amplifying the disparities that already plague our healthcare system. Like a distorted mirror, they could project a misleading image of dementia risk, concealing the true burden borne by marginalized communities.
But if we deliberately infuse diversity into the DNA of these technologies, from the training data to the development teams, we can create instruments of unprecedented insight and equity. By ensuring that every population is represented in the core datasets that fuel machine learning, we can build models that work equally well for all, regardless of race, ethnicity, or zip code.
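One way to check whether a model "works equally well for all" is to report accuracy separately for each group instead of a single overall figure. The sketch below assumes predictions are available as (group, true label, predicted label) tuples; the function name `accuracy_by_group` and the toy records are hypothetical and synthetic, not results from any real model or dataset.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return (overall_accuracy, {group: accuracy}) for labeled predictions.

    Each record is a (group, true_label, predicted_label) tuple.
    """
    if not records:
        raise ValueError("records must not be empty")
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        correct[group] += int(truth == predicted)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


if __name__ == "__main__":
    # Synthetic toy data, not real results: group "A" dominates the sample.
    records = (
        [("A", 1, 1)] * 85 + [("A", 1, 0)] * 5    # 90 cases, 85 correct
        + [("B", 1, 1)] * 5 + [("B", 1, 0)] * 5   # 10 cases, 5 correct
    )
    overall, per_group = accuracy_by_group(records)
    print(f"overall: {overall:.2f}")
    for group, acc in sorted(per_group.items()):
        print(f"{group}: {acc:.2f}")
```

In this toy example the overall accuracy is 0.90, which looks strong, yet the underrepresented group "B" is classified correctly only half the time. A single headline number optimized on skewed data can hide exactly this kind of gap.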
Science needs to be representative of the population so it can benefit the population. This is the animating vision behind Project OpenMind: to create inclusive brain health data so it can benefit everyone. By democratizing access to research-grade data from underserved seniors, we aim to catalyze a new generation of AI tools that can spot the earliest warning signs of dementia in every corner of our society.
But starting OpenMind is just one piece of the puzzle. To truly build a neuro-equitable future, we need support from industry, clinical professionals, and academic researchers who want to put diversity and inclusion at the center of data strategy.
To launch OpenMind, we’ve partnered with leading experts in Alzheimer’s and dementia care, computational neuroscience, EEG technology, and machine learning. But we’re excited to involve more people with relevant experience or a strong mission orientation, as well as anyone interested in coordinating data collection in their metro area. If you’d like to collaborate with us on this journey, advise us as we build OpenMind, or help bring the project to your community, please reach out to the OpenMind team at support@careyaya.org.
The path to this future is not easy, but it is essential. In the fight against dementia, which afflicts our society's most vulnerable, equity is not a luxury; it is an imperative. By ensuring that the transformative tools of neurotech and AI benefit all brains equally, we can light the way to a world where every senior, in every community, can age with dignity, security, and the full support of cutting-edge science.
REFERENCES
1. Centers for Disease Control and Prevention. Barriers to Equity in Alzheimer’s and Dementia Care. Accessed May 21, 2024. https://www.cdc.gov/aging/publications/features/barriers-to-equity-in-alzheimers-dementia-care/index.html
2. Howard L. Study finds disparities in diagnosis and treatment of dementia. News Release. UC Davis Health. Published January 24, 2024. Accessed May 21, 2024. https://health.ucdavis.edu/news/headlines/study-finds-disparities-in-diagnosis-and-treatment-of-dementia/2024/01
3. Ricard JA, Parker TC, Dhamala E, Kwasa J, Allsop A, Holmes AJ. Confronting racially exclusionary practices in the acquisition and analyses of neuroimaging data [published correction appears in Nat Neurosci. 2023 Dec;26(12):2251]. Nat Neurosci. 2023;26(1):4-11. doi:10.1038/s41593-022-01218-y
4. Heng NYW, Rittman T. Understanding ethnic diversity in open dementia neuroimaging data sets. Brain Commun. 2023;5(6):fcad308. Published 2023 Nov 8. doi:10.1093/braincomms/fcad308
5. Rocheleau J. Neuroscience has a Race Problem. Nautilus. Published February 15, 2023. Accessed May 21, 2024. https://nautil.us/neuroscience-has-a-race-problem-262340/
Neal K. Shah is the CEO of CareYaya Health Technologies, one of the fastest-growing health tech startups in America. He runs a social enterprise and applied research lab utilizing AI and neurotech to advance health equity, with a focus on neurological care for elders with dementia. Shah has advanced AI projects to improve neurological care with support from the National Institutes of Health, Johns Hopkins AITC and Harvard Innovation Labs. Shah is a “Top Healthcare Voice” on LinkedIn with a 35k+ following.