Ruddit is a dataset of English-language Reddit comments annotated with fine-grained, real-valued offensiveness scores ranging from -1 (maximally supportive) to 1 (maximally offensive).
The dataset was annotated using Best-Worst Scaling, a form of comparative annotation that has been shown to alleviate known biases of using rating scales.
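For anyone wondering how comparative picks turn into a number between -1 and 1, here is a minimal sketch of the simple counting procedure commonly used with Best-Worst Scaling: an item's score is the share of times it was picked as most offensive minus the share of times it was picked as least offensive. The function name `bws_scores`, the tuple layout, and the toy comment IDs are all made up for illustration; this is not the Ruddit authors' code, and their actual pipeline may differ.

```python
from collections import Counter


def bws_scores(annotations):
    """Score items from Best-Worst Scaling annotations by simple counting.

    `annotations` is an iterable of (items, most, least) tuples, where
    `items` is the tuple of items shown together, `most` is the item picked
    as most offensive and `least` is the item picked as least offensive.
    This layout is a hypothetical one chosen for the example.
    """
    appearances = Counter()
    most_counts = Counter()
    least_counts = Counter()

    for items, most, least in annotations:
        appearances.update(items)  # every shown item counts as one appearance
        most_counts[most] += 1
        least_counts[least] += 1

    # (times most - times least) / times shown, which always lies in [-1, 1]
    return {
        item: (most_counts[item] - least_counts[item]) / appearances[item]
        for item in appearances
    }


# Toy data: two 4-item tuples judged by an annotator (IDs are invented).
toy = [
    (("c1", "c2", "c3", "c4"), "c1", "c4"),
    (("c1", "c2", "c3", "c5"), "c1", "c3"),
]
scores = bws_scores(toy)
```

An item that is always picked as most offensive ends up at 1, one that is always picked as least offensive ends up at -1, and items that are never picked either way sit at 0.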
https://www.docdroid.net/N3qRDAB/2021acl-long210v2-pdf
I hope Dr. Oaken has contacted them to collaborate on this groundbreaking data!!!
https://github.com/hadarishav/Ruddit
https://www.kaggle.com/competitions/jigsaw-toxic-severity-rating/overview
@HeyMoon they're ripping off @autodrama.
lol i just grabbed a random sentiment analysis package and stuck it in the code