Bleepr - an open dashboard to highlight hate & propaganda on Twitter

Zoom Room 2B
Dhruv Ghulati
Most algorithms to detect hate speech are built behind closed doors at social media platforms, with no clarity on definitions, rules, or edge cases, and no evidence of whether the resulting AI works well enough. In 2020, Factmata built a suite of algorithms to detect hate speech, sexism, racism, toxicity, obscenity, propaganda and threatening language. We then built a dashboard that scans Twitter daily to see whether we could find harmful content that was not being removed in time. In this talk we will go through what we found and gather feedback from workshop participants on where the algorithms perform well and where they do not. We will also spend some time discussing how to build unifying, transparent definitions of harmful speech in a scalable manner, and how to involve third-party startups, social groups and non-profits in the debate, not just big tech.

Additional information

Type Workshop
Language English