r/MachineLearning · June 10, 2026 · 2 min read

Looking for papers/resources on AI responses to psychological distress prompts [P]

Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.

Hi everyone,

I’m close to completing my degree in Psychology, and I’m also a Systems Engineering student. Systems Engineering is roughly comparable to Software Engineering / Computer Science outside Latin America.

Although I study engineering, I’m still at an early stage with machine learning, LLMs, AI safety, and related technical topics. My research project is mainly psychology-oriented, but I’d really appreciate recommendations or warnings from a software/technical perspective.

I’m working on a project about how AI systems respond to prompts involving psychological distress at different levels of intensity. I’m currently considering ChatGPT, Gemini, Wysa, and Replika, and I’m interested in comparing general-purpose LLMs, mental-health-oriented chatbots, and AI companions.

Some aspects I’m thinking about are:

how each system handles mental health, self-harm, crisis situations, and psychological/medical advice.

whether responses change as the prompt becomes more intense, for example when a normal generated response is replaced by a safety protocol, moderation layer, or crisis-resource response.

whether systems respond differently to declarative prompts versus question-based prompts, such as “I feel emotionally overwhelmed” vs. “What should someone do if they feel emotionally overwhelmed?”

whether responses differ when distress is explicit, indirect, ambiguous, hypothetical, or written in third person.

whether the system provides empathy, psychoeducation, referrals, crisis resources, refusal, redirection, or a combination of these.

how to account for technical changes over time, such as model versions, neural network weights, safety layers, moderation classifiers, system prompts, memory/retrieval features, and product-level configurations.

whether it is methodologically valid to compare systems with very different technical architectures.
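
As a rough illustration of the aspects listed above, the prompt conditions could be crossed into a grid, with each collected response stored next to the metadata needed to track changes over time. This is only a sketch under assumptions: the three intensity levels, the `RunRecord` fields, and the function names are hypothetical placeholders, not part of any existing benchmark or framework.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import product

# Assumed labels: the three intensity levels are placeholders; the other
# values mirror the aspects listed in the post.
INTENSITIES = ("low", "moderate", "high")
FORMS = ("declarative", "question")
FRAMINGS = ("explicit", "indirect", "ambiguous", "hypothetical", "third_person")
RESPONSE_CODES = frozenset(
    {"empathy", "psychoeducation", "referral", "crisis_resources", "refusal", "redirection"}
)


@dataclass
class RunRecord:
    """One collected response plus the context needed to interpret it later."""
    system: str
    version_label: str  # whatever the product exposes; may simply be "unknown"
    intensity: str
    form: str
    framing: str
    prompt: str
    response: str
    codes: tuple        # response codes assigned by a human coder
    collected_at: str   # UTC timestamp, since products change without notice


def build_conditions():
    """Cross every intensity, form, and framing into one list of conditions."""
    return list(product(INTENSITIES, FORMS, FRAMINGS))


def make_record(system, version_label, condition, prompt, response, codes):
    """Validate the assigned codes and attach a collection timestamp."""
    unknown = set(codes) - RESPONSE_CODES
    if unknown:
        raise ValueError(f"unknown response codes: {sorted(unknown)}")
    intensity, form, framing = condition
    return RunRecord(
        system=system,
        version_label=version_label,
        intensity=intensity,
        form=form,
        framing=framing,
        prompt=prompt,
        response=response,
        codes=tuple(sorted(codes)),
        collected_at=datetime.now(timezone.utc).isoformat(),
    )
```

With these placeholder levels, `build_conditions()` yields 3 × 2 × 5 = 30 conditions per system, which makes the size of the data collection explicit before any prompts are written.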

I’m not trying to evaluate these systems as therapists or test clinical effectiveness with real patients. The focus is on how they respond linguistically, procedurally, and safety-wise when confronted with psychological distress.

I’d appreciate recommendations for papers, benchmarks, datasets, evaluation frameworks, or common methodological mistakes to avoid. I’m especially interested in technical issues such as reproducibility, stochastic outputs, temperature/settings, hidden safety layers, system prompts, memory, retrieval mechanisms, and product updates.
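
To make the stochastic-output concern concrete: one possible approach is to send the same prompt several times and report how often each response category appears, rather than relying on a single reply. The sketch below assumes a hypothetical `ask` callable standing in for however responses are actually collected, and uses a seeded stub in place of a real system; none of these names come from a real API.

```python
import random
from collections import Counter


def sample_responses(ask, prompt, n_runs):
    """Send the same prompt n_runs times and keep every reply."""
    return [ask(prompt) for _ in range(n_runs)]


def category_proportions(responses, categorize):
    """Share of responses falling into each category (empty dict if no responses)."""
    counts = Counter(categorize(r) for r in responses)
    total = len(responses)
    return {category: count / total for category, count in counts.items()}


def make_stub_system(seed, crisis_rate):
    """Hypothetical stand-in for a chatbot: returns a crisis-resource reply
    with probability crisis_rate, otherwise a general supportive reply."""
    rng = random.Random(seed)

    def ask(prompt):
        if rng.random() < crisis_rate:
            return "CRISIS_RESOURCES"
        return "GENERAL_SUPPORT"

    return ask


if __name__ == "__main__":
    ask = make_stub_system(seed=0, crisis_rate=0.3)
    replies = sample_responses(ask, "I feel emotionally overwhelmed", n_runs=20)
    # The identity function works as a categorizer here because the stub
    # already returns category labels; real replies would need human coding.
    print(category_proportions(replies, categorize=lambda reply: reply))
```

Reporting proportions over repeated runs also gives a baseline for comparison if the same prompt is re-run after a product update.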

Thanks in advance!

submitted by /u/dakartt