arXiv: 2605.12243 · Published 2026-05-12 · Submitted by Weixiang Sun (Sweson) on 2026-05-15 · University of Notre Dame
Authors: Weixiang Sun, Shang Ma, Yiyang Li, Tianyi Ma, Zehong Wang, Colby Nelson, Xusheng Xiao, Yanfang Ye
PreScam: A Benchmark for Predicting Scam Progression from Early Conversations
Abstract
The PreScam benchmark supports modeling scam progression across multi-turn conversations by structuring real-world scam reports according to a scam kill chain and annotating scammer psychological actions and victim responses.
AI-generated summary
Conversational scams, such as romance and investment scams, are emerging as a major form of online fraud. Unlike one-shot lures such as fake lottery or unpaid-toll messages, they unfold over multi-turn conversations in which scammers gradually manipulate victims with evolving psychological techniques. Existing research, however, focuses mainly on static scam detection or on synthetic scams, leaving open whether language models can understand how real-world scams progress over time. We introduce PreScam, a benchmark for modeling scam progression from early conversations. Built from user-submitted scam reports, PreScam filters and structures 177,989 raw reports into 11,573 conversational scam instances spanning 20 scam categories. Each instance is hierarchically structured according to the scam lifecycle defined by the proposed scam kill chain and is annotated at the turn level with scammer psychological actions and victim responses. We benchmark models on two tasks: real-time termination prediction, which estimates whether a conversation is approaching the termination stage, and scammer action prediction, which forecasts the scammer's subsequent actions. Results show a clear gap between surface-level fluency and progression modeling: supervised encoders substantially outperform zero-shot LLMs on real-time termination prediction, while scammer action prediction remains only moderately successful even for strong LLMs. Taken together, these results show that current models capture some scam-related cues yet still struggle to track how risk escalates and how manipulation unfolds across turns.
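To make the two benchmark tasks concrete, the following is a minimal sketch of how an annotated conversation could be turned into prefix-based examples. It is not the PreScam data format or the authors' pipeline: the `Turn` fields, the stage and action label strings (`stage_a`, `action_b`, `termination`), the `horizon` parameter, and both helper functions are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Every field name and label string here is a hypothetical placeholder,
# not the actual PreScam schema or kill-chain vocabulary.
TERMINATION = "termination"  # assumed name for the final lifecycle stage


@dataclass
class Turn:
    speaker: str  # "scammer" or "victim"
    text: str
    stage: str    # kill-chain stage label (placeholder values)
    action: str   # turn-level scammer action or victim response label


def render(turns: List[Turn]) -> List[str]:
    """Flatten turns into 'speaker: text' strings for a model input."""
    return [f"{t.speaker}: {t.text}" for t in turns]


def termination_examples(
    turns: List[Turn], horizon: int = 1
) -> List[Tuple[List[str], int]]:
    """For each conversation prefix, emit label 1 if a termination-stage
    turn occurs within the next `horizon` turns, else 0."""
    examples = []
    for cut in range(1, len(turns)):
        upcoming = turns[cut:cut + horizon]
        label = int(any(t.stage == TERMINATION for t in upcoming))
        examples.append((render(turns[:cut]), label))
    return examples


def next_action_examples(turns: List[Turn]) -> List[Tuple[List[str], str]]:
    """Pair each prefix with the action label of the next scammer turn.
    Prefixes with no later scammer turn are skipped."""
    examples = []
    for cut in range(1, len(turns)):
        upcoming = next((t for t in turns[cut:] if t.speaker == "scammer"), None)
        if upcoming is not None:
            examples.append((render(turns[:cut]), upcoming.action))
    return examples


# Toy conversation with invented text and placeholder labels.
conversation = [
    Turn("scammer", "Hi, I found your profile.", "stage_a", "action_a"),
    Turn("victim", "Hello, who is this?", "stage_a", "response_a"),
    Turn("scammer", "I have an investment tip for you.", "stage_b", "action_b"),
    Turn("victim", "I sent the money.", TERMINATION, "response_b"),
]

if __name__ == "__main__":
    for prefix, label in termination_examples(conversation):
        print(len(prefix), "turns seen -> approaching termination:", label)
    for prefix, action in next_action_examples(conversation):
        print(len(prefix), "turns seen -> next scammer action:", action)
```

Under this framing, a model sees only the turns up to the cut point, which mirrors the "real-time" setting described in the abstract; how the benchmark actually defines "approaching" the termination stage is not stated here, so `horizon` is just one possible choice.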