arXiv — NLP / Computation & Language · May 13, 2026 · 1 min read

StoicLLM: Preference Optimization for Philosophical Alignment in Small Language Models

Mirrored from arXiv — NLP / Computation & Language for archival readability. Support the source by reading on the original site.


arXiv:2605.11483v1 · Announce Type: new

Abstract: While large language models excel at factual adaptation, their ability to internalize nuanced philosophical frameworks under severe data constraints remains underexplored. We investigate this by specializing small LLMs on micro-datasets of foundational Stoic texts using preference optimization (ORPO, AlphaPO). Evaluated via a multi-model critic bank, our results show that just 300 high-fidelity examples can induce strong alignment with inward-facing Stoic virtues, closely approaching few-shot prompting while freeing the context window. Critically, however, all models, including few-shot baselines, exhibit a persistent failure on Stoicism's outward-facing cosmopolitan duties, pointing to a representational limitation of small models that micro-dataset adaptation alone cannot overcome.
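The abstract names ORPO as one of the preference-optimization methods but gives no training details. As a rough illustration only, the per-example ORPO objective, as generally formulated, adds an odds-ratio penalty to the usual negative log-likelihood of the preferred response. The sketch below assumes length-averaged log-probabilities as inputs; the function names and the weight `lam=0.1` are hypothetical choices for illustration, not the authors' configuration.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp is a length-averaged log-probability and must be strictly
    negative, so that p lies in (0, 1).
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def orpo_loss(avg_logp_chosen: float, avg_logp_rejected: float,
              lam: float = 0.1) -> float:
    """Sketch of a per-example ORPO loss (lam is an illustrative weight).

    loss = NLL(chosen) + lam * -log(sigmoid(log-odds-ratio))
    """
    nll = -avg_logp_chosen
    log_odds_ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(z)) equals softplus(-z)
    odds_ratio_term = softplus(-log_odds_ratio)
    return nll + lam * odds_ratio_term


# Hypothetical values: the preferred (e.g. more Stoic) answer is more likely.
print(orpo_loss(avg_logp_chosen=-0.5, avg_logp_rejected=-2.0))
```

The penalty shrinks as the model assigns higher odds to the preferred response than to the rejected one, which is why no separate reference model is needed in this formulation.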
