A challenge to find 10 human-centered solutions in 10 days
I challenged myself a month ago to find 10 beneficial, impactful uses of LLMs in 10 days. As I share more in this post on why this was a success, I also felt the limits of this search acutely.
The center of this quest was to find how generative AI might be starting to benefit humans. My goal was to find pro-social or human-centered solutions that would not be possible without LLMs. I deliberately aimed to exclude “productivity over tedium” products, to surface real impact and not modern search tools. From reviewing ProductHunt, Google.ai and NYT, I knew this would be a real challenge.
I will explain my findings below, curated “just in time” in 10 days to meet my goal. I posted the list on LinkedIn, although I knew that this limited exploration was only focused on the “promise” of each tool, not its “peril”. All things considered, do Tech solutions to societal problems bring more good than harm?
I decided to revisit my list as soon as I finished reading “The Worlds I See” (a memoir by Fei Fei Li, head of Stanford’s Human Centered AI lab, HAI), having learned directly in my work not to apply a cure that is worse than the disease.
Before LLMs: Looking back on a decade of deep learning
Like most people, I was impressed by computer vision and AlphaFold’s impact on health sciences, both diagnostic and treatment. Beyond protein folding, Text-to-speech/speech-to-text have also brought wonders in many domains (including healthcare). Another area of study and debate is real-life outcomes from safety features in cars: thanks to deep learning, lane departure and collision avoidance systems reduce accidents, but other accidents were occurring at concerning rates, requiring studies ove the last few years to prove an overall improvement.
On a lighter theme, image generators are also interesting - I had the privilege of building Generative Adversarial Networks (GANs) at ODSC East 2018, and also built with Diffusion models in the last 1-2 years (for example, an escape game). Unfortunately, significant Techlash emerged due to intellectual property and toxic content, which on balance put those applications in a mixed or negative overall impact (see the ongoing “Take it down” campaign)
This short list helps to illustrate the potential of Machine Learning / Deep Learning. But there is a contrast between these applications and Large Language Models (LLMs) trained to produce text. Some parts of AlphaFold’s architecture use Transformers, but it does not function like an LLM, nor is it trained on text data.
Are astronomical LLM investments paired with real social promise and worth the perils?
Human centered LLM-applications
10. QuitBot
When I started my quest for how text AI might be benefiting humans beyond productivity, I was excited about my first finding. Generative Text AI (LLMs) open the possibility of an interactive, always available presence for freeform dialog. A team from Microsoft AI and Fred Hutch was able to show a statistically significant improvement over a simplistic sms alternative.
What seemed even more significant was the 4 year formative process that brought it about. I still think about this example as reaping the fruits of our efforts, and I hope it can serve as an encouragement to persevere for teams that encounter challenges with premature LLM deployments.
9. Perspective API
I was torn and nearly dismissed this platform as a high-impact use of text AI (not a generative model), but I found how a team applied it as an initial trigger to rephrase user-generated posts, allowing platforms to rephrase (rather than censor) toxic comments, etc.
As we all work to bridge polarization divides and work towards an understanding of different viewpoints, this is a significant AI win that does, ultimately, include generative text! By building on the perspective API with LLMs, this team started making online discourse more civil in a paradigm that preserves important topic representation or voices.
8. ClimateEngine / Robeco partnership
In investment technologies, this team developed generative AI solutions to scale the analysis of company asset and actions for biodiversity impact attribution. The outcome enabled sustainability-aware decisions at scale, which would be difficult to analyze without automation.
7. GroundNews
Helping people navigate truth and representation in a challenging political environment, especially considering bias risks in the media (e.g. selection bias). GroundNews developed AI-driven live summaries of news coverage helping to assess factuality and showing clear differences in coverage between left, center and right-leaning sources.
As I dove into that topic, I also discovered great coverage of journalism issues at last year's AI for Good summit.
6 & 5. Tabiya and CareerVillage
By incorporating LLM technology into their platforms, personal career assistance services can now reach people they couldn't otherwise, providing help for interview preparation, mock interviews, and tools to optimize resumes and cover letters. Read more at Tabiya and CareerVillage.
4, 3 & 2: Legal Services Organizations in Middle Tennessee, North Carolina, and JusticiaLab
I found that these three organizations developed ways to bring more legal support than previously possible to underserved rural populations as well as immigrants, helping them to navigate processes and uphold their rights.
Navigating legalese would have been prohibitively difficult to scale without LLMs’ interactivity due to requiring significant legal expertise, time, and expenses.
1. FullFact AI
Historically, NLP struggled with issues to assist fact-checking (covered recently by Warren et al. 2025, Mitral et al. 2024). As our rate of absorption of information transformed with the advent of social media, we needed major advances. FullFact aggregates statements from various media platforms and official sources in explanatory feedback, allowing context and nuance to be instantly curated.
An inconvenient truth
No generative AI application or model can be risk-free. How much are responsible teams reducing the likely harms? Self-governance varies widely. Whether a deployment is open-ended or “narrowly scoped”, all systems can cause many different harms that are often significant. By the way, limiting an LLM’s scope is hard.
Misuse, errors, and biases are just the tip of the iceberg: there have been well-documented AI risk taxonomies (e.g. DeepMind 2021, CSET 2023, Google 2023,) with feedback loops for evaluation (Google DeepMind 2023) as well as government-driven regulations to adapt (AIR 2024).
The reality is that each of the services I presented above faces challenges to prevent abuse:
All of them are exposed to content manipulation, leading to incorrect or forced outputs, i.e. prompt injection, as well other OWASP LLM Top 10 risks
ClimateEngine/Robeco potentially faces risks of unfair financial determinations from selection biases and omissions in the underlying data (Automated Decision-Making, from AIR 2024)
Legal Service Organizations’ virtual assistants could potentially provide incorrect guidance on legal rights with insufficient steering or oversight (also in AIR 2024), for example, failing to manage “out-of-distribution circumstances” with respect to immigration law.
Career Services could expose job-seekers to malicious content (exploitative opportunities, theft of personally identifiable information) without proper protection against spam and social engineering. AIR 2024 categorizes these as fraudulent schemes.
AI for Good, Human Centric AI, and Responsible AI Policies
The UN found a sustainable development goals (SDGs) partner with the International Telecommunications Union (ITU), establishing AI for Good in 2017. As a telecom & network engineer by training, I was thrilled to see this initiative, although its advocacy for pro-social and risk-aware decision making can run into challenging economic and political realities. This year’s AI for Good programme for July 8-11 will feature a call to action by Amazon’s CTO to solve urgent societal problems, a debate between Yann LeCun and Geoff Hinton, and discussions of what it means to benefit humanity, but amid dozens of sessions relating to computer vision, and robotics, only 2-3 will ultimately focus on LLMs (one on Truth/Power, and two relating to Agentic systems). This raises a real question about text applications' risk/benefit trade-offs.
There’s still so much work left to improve the behavior of foundation models (HAI responsible AI index 2025), and the call to action is clear for all AI system developers: Consider how proposals might pose negative ethical and societal risks, and come up with methods to lessen those risks. Stanford Human-Centered AI requires this Ethics and Society Review for all grant submissions, and its structure and strategies are enabling significant improvements to thwart negative impacts from tasnishing even the most positive projects’ societal benefits.
Conclusion
I was heartened to eventually reach my goal of finding 10 services that help humanity beyond tedium/efficiency challenges. The core value of most services out there is still productivity (faster googling or extracting a specific facet of a document for speed). There are more risks to mitigate than resources might allow. Still, my perspective on AI regained optimism after discovering great Generative Text AI applications by 10 organizations in the course of only 10 (already busy) days.
We will certainly see more innovative pro-social solutions and insights in 2025. However, the risks are already high and continue to grow.