arXiv: 2605.14352
Authors: Sinclair Schneider, Florian Steuber, Joao A. G. Schneider, Gabi Dreo Rodosek
Published: 2026-05-14
Organization: NLP Research Group UniBW (NLP-UniBW)
Code: https://github.com/SinclairSchneider/german_ideology_prediction
Ideology Prediction of German Political Texts
Abstract
A transformer-based model projects political orientation on a continuous spectrum using multiple corpora, achieving high accuracy in detecting political bias across different text sources.
AI-generated summary
Elections represent a crucial milestone in a nation's ongoing development. To better understand the political rhetoric of movements across the spectrum, from left to right, we propose a transformer-based model that projects the political orientation of a text onto a continuous left-to-right scale, represented by a normalized scalar d between -1 and 1. This lets analysts isolate specific segments of the political landscape, such as conservatives, while excluding both liberal and far-right movements. With multiclass classifiers, such a selection is possible only if the desired orientation happens to coincide with one of the predefined classes. To determine the most suitable foundation model among 13 candidate transformers, we constructed four distinct corpora. One comprised annotated plenary notes from the German Bundestag; another was based on Wahl-O-Mat, an official online voting-advice tool. The third consisted of articles from 33 newspapers, each labeled with its political orientation, and the fourth included 535,200 tweets from 597 members of the 20th and 21st German Bundestag. To mitigate overfitting, we used two corpora for training and the other two for testing. DeBERTa-large achieved the highest in-domain F1 score (F1 = 0.844) and also led the X (formerly Twitter) out-of-domain test (ACC = 0.864). On the newspaper out-of-domain test, Gemma2-2B performed best (MAE = 0.172). This study demonstrates that transformer models can recognize political framing in German news at a level comparable to public opinion polls. Our findings suggest that model architecture and the availability of domain-specific training data can be as influential as model size for estimating political bias. We discuss methodological limitations and outline directions for improving the robustness of bias measurement.
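The abstract's argument for a continuous score d over multiclass labels can be made concrete with a small sketch. The snippet below is illustrative only: the texts, scores, and band thresholds are invented and do not come from the paper; the model itself is assumed to have already produced the scores.

```python
# Hypothetical sketch: selecting texts by a continuous left-right score
# d in [-1, 1], as described in the abstract. All texts, scores, and
# thresholds here are made up for illustration.

def select_band(scored_texts, lo, hi):
    """Return texts whose predicted orientation d falls within [lo, hi]."""
    return [text for text, d in scored_texts if lo <= d <= hi]

scored = [
    ("pro-market tax reform op-ed", 0.45),
    ("union strike coverage", -0.60),
    ("centrist coalition report", 0.05),
    ("nationalist rally speech", 0.90),
]

# A moderately conservative band that excludes both the centre and the
# far right. A fixed multiclass label set can express this selection
# only if this exact band is one of its predefined classes.
conservative = select_band(scored, 0.3, 0.7)
```

Here `conservative` contains only the op-ed scored 0.45; shifting `lo`/`hi` re-slices the spectrum without retraining, which is the flexibility the continuous formulation buys.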
Community