Ruddit is a dataset of English-language Reddit comments with fine-grained, real-valued offensiveness scores ranging from -1 (maximally supportive) to 1 (maximally offensive).
The dataset was annotated using Best-Worst Scaling, a form of comparative annotation that has been shown to alleviate known biases of rating scales.
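For anyone wondering how Best-Worst Scaling turns comparisons into a score in [-1, 1], here is a rough sketch. It assumes the usual counting procedure: the share of times a comment is picked as most offensive minus the share of times it is picked as least offensive. The function name, the tuple layout, and the toy comment IDs are made up for illustration and are not taken from the Ruddit repo.

```python
from collections import Counter


def bws_scores(annotations):
    """Score items from Best-Worst Scaling judgments (hypothetical helper).

    annotations: iterable of (items, most_offensive, least_offensive),
    where items is the tuple of comment IDs shown to the annotator.
    Returns {item: score}, with score in [-1, 1].
    """
    seen, most, least = Counter(), Counter(), Counter()
    for items, most_offensive, least_offensive in annotations:
        seen.update(items)          # how often each item was shown
        most[most_offensive] += 1   # picked as most offensive
        least[least_offensive] += 1 # picked as least offensive
    # Counting procedure: (#most - #least) / #appearances
    return {item: (most[item] - least[item]) / seen[item] for item in seen}


# Toy data (assumed): three annotators judge the same 4-tuple of comments.
toy_annotations = [
    (("a", "b", "c", "d"), "a", "d"),
    (("a", "b", "c", "d"), "a", "c"),
    (("a", "b", "c", "d"), "b", "d"),
]
toy_scores = bws_scores(toy_annotations)
for comment_id, score in sorted(toy_scores.items()):
    print(comment_id, round(score, 3))
```

A comment that is always picked as most offensive lands at 1, one that is always picked as least offensive lands at -1, and everything else falls in between.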
https://www.docdroid.net/N3qRDAB/2021acl-long210v2-pdf
I hope Dr. Oaken has contacted them to collaborate on this groundbreaking data!!!
https://github.com/hadarishav/Ruddit
https://www.kaggle.com/competitions/jigsaw-toxic-severity-rating/overview
LMAO they put a trigger warning on this fricking paper. I've never seen anything like that before.
Absolute state of academia