Hugging Face Daily Papers · May 20, 2026 · 3 min read

Base Models Look Human To AI Detectors

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2605.19516

Published on May 19 · Submitted by Ziqian Zhong on May 20 · Carnegie Mellon University

Authors: Yixuan Even Xu, Ziqian Zhong, Aditi Raghunathan, Fei Fang, J. Zico Kolter

Abstract

AI-generated summary: Instruction-tuned language models produce text that commercial detectors identify as non-human, prompting the development of a paraphrasing pipeline that improves human-likeness while preserving semantics across different model sizes.

As AI-generated text enters the real world at scale, institutions increasingly use commercial AI-text detectors, especially in education and academic-integrity workflows. We report a surprising empirical finding about such systems: when evaluated by GPTZero and Pangram, text generated by base models is often judged overwhelmingly human, whereas text generated by their instruction-tuned counterparts is not. Building on this observation, we propose Humanization by Iterative Paraphrasing (HIP), a detector-agnostic pipeline that minimally fine-tunes a base model into a paraphraser and applies it iteratively. Compared with the baselines we test, HIP yields a stronger trade-off between semantic preservation and detector evasion on commercial detectors. Across the Llama-3 and Qwen-3 families, spanning model sizes from 0.6B to 70B, HIP consistently improves detector human-likeness. Our findings suggest that current detectors are tracking artifacts of instruction tuning and local context more than any invariant notion of machine-generated text. This, in turn, calls for detector designs that model these factors more explicitly.
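The "apply a paraphraser iteratively while preserving semantics" idea from the abstract can be illustrated with a small control-flow sketch. This is not the authors' implementation: in HIP the paraphraser is a fine-tuned base model, whereas `toy_paraphrase` and `jaccard` below are hypothetical stand-ins, and the `min_similarity` stopping rule is an assumption added here to make the semantic-preservation side of the trade-off concrete.

```python
from typing import Callable, List, Tuple


def iterative_paraphrase(
    text: str,
    paraphrase: Callable[[str], str],
    similarity: Callable[[str, str], float],
    rounds: int = 3,
    min_similarity: float = 0.8,
) -> Tuple[str, List[float]]:
    """Apply `paraphrase` up to `rounds` times.

    Assumption (not from the paper): a round is rejected, and the loop
    stops, once similarity to the ORIGINAL text drops below the threshold.
    """
    current = text
    history: List[float] = []
    for _ in range(rounds):
        candidate = paraphrase(current)
        score = similarity(text, candidate)
        if score < min_similarity:
            break  # candidate drifted too far from the source meaning
        current = candidate
        history.append(score)
    return current, history


def toy_paraphrase(sentence: str) -> str:
    """Hypothetical stand-in: swap the first word found in a tiny synonym table."""
    swaps = {"big": "large", "quick": "fast"}
    words = sentence.split()
    for i, word in enumerate(words):
        if word in swaps:
            words[i] = swaps[word]
            break
    return " ".join(words)


def jaccard(a: str, b: str) -> float:
    """Hypothetical stand-in for a semantic-similarity metric: token overlap."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


if __name__ == "__main__":
    source = "the quick dog saw a big cat near the old barn"
    result, scores = iterative_paraphrase(source, toy_paraphrase, jaccard)
    print(result)
    print(scores)
```

Passing the paraphraser and the similarity metric as callables keeps the loop independent of any particular model or detector, which loosely mirrors the "detector-agnostic" framing above; a real setup would substitute a model-backed paraphraser and an embedding-based similarity score.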

Community

fjzzq2002 · Paper submitter · about 4 hours ago

We found that current AI text detectors (GPTZero, Pangram) largely fail on base models: they track artifacts of instruction tuning rather than "machine-generated text" in general.

Building on this, we introduce HIP (Humanization by Iterative Paraphrasing), which minimally fine-tunes a base model into a paraphraser and then applies it iteratively to shift outputs toward human distributions, achieving a state-of-the-art evasion-semantics trade-off.

🐦 Tweet: https://x.com/YixuanEvenXu/status/2057171878754783429
💻 Repo: https://github.com/YixuanEvenXu/humanization-by-iterative-paraphrasing

Get this paper in your agent:

hf papers read 2605.19516

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 14

Datasets citing this paper 1
