Chat Control: Mass Screenings, Massive Dangers

Fireshonks
Vera Wilde
As technological changes including digitalization and AI increase infrastructural capacities to deliver services, new mass screenings for low-prevalence problems (MaSLoPPs) appear to improve on old ways of advancing public interests. Their high accuracy and low false positive rates, expressed as probabilities, can sound dazzling. But translating the identical statistical information into frequency formats (body counts) shows they tend to backfire. The common (false positives) overwhelm the rare (true positives), with serious possible consequences. Ignoring this fact is known as the base rate fallacy, a common cognitive bias. Due to pervasive cognitive biases such as this, as well as perverse structural incentives, society needs a regulatory framework governing programs that share this dangerous structure. This framework must work at the system rather than individual level. It should include better mechanisms for evidence-based policymaking that holds interventions to basic scientific evidentiary standards, and a right to deliberate ignorance where relevant. These solutions may help combat perverse incentives and cognitive biases, mitigating the damage from these dangerous programs. But we should expect ongoing sociopolitical struggle to articulate and address the problem of likely net damage from this type of program under common conditions.
New technology seems to herald progress toward improving public safety in relation to old threats, from heinous crimes like child sexual abuse and terrorism, to illnesses like cancer and heart disease. Enter "Chat Control," a mass scanning program designed to flag potential child sexual abuse material in digital communications. While the goal of protecting children from exploitation is laudable, the statistical and social implications of such a mass screening program are alarming. An empirical demonstration of Bayes’ rule in this context shows that, under relevant conditions of rarity, persistent inferential uncertainty, and substantial secondary screening harms, Chat Control and programs like it backfire, on net degrading the very safety they’re intended to advance. Starting from the inescapable accuracy-error dilemma in probability theory, we'll journey through the nuances of the base rate fallacy, showing how mass screening programs’ real-world efficacy is often not what it seems. When screenings involve entire populations, even high "accuracy" translates into huge numbers of false positives. Additionally, proponents of such screenings have perverse incentives to inflate accuracy, and real-world validation to mitigate such inflation is often impossible. Dedicated attackers can also game the system, inflating false negatives. Meanwhile, secondary screening harms accrue to the very people we’re trying to protect. So, under certain common conditions, net harm can result from well-intentioned mass screenings. These problems extend well beyond this particular program. The structure and challenges faced by Chat Control parallel those faced by other programs that share the same mathematical structure across diverse domains, from healthcare screenings for numerous diseases, to educational screenings for plagiarism and LLM use, and digital platform screenings for misinformation. Numerous additional case studies are discussed in brief. But the pattern is the point.
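The translation from probabilities into frequency formats can be sketched in a few lines of Python. This is only an illustration: the population size, prevalence, sensitivity, and false positive rate below are hypothetical placeholders chosen for round numbers, not figures from the talk or from any real screening program, and the function name `screening_counts` is likewise made up for this example.

```python
def screening_counts(population, prevalence, sensitivity, false_positive_rate):
    """Turn screening probabilities into expected head counts (frequency format)."""
    affected = population * prevalence            # items that truly have the problem
    unaffected = population - affected            # everything else
    true_positives = affected * sensitivity       # correctly flagged
    false_negatives = affected - true_positives   # missed cases
    false_positives = unaffected * false_positive_rate  # wrongly flagged
    flagged = true_positives + false_positives
    # Bayes' rule in count form: share of flags that are actually correct.
    ppv = true_positives / flagged if flagged else 0.0
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "flagged": flagged,
        "ppv": ppv,
    }


# Hypothetical inputs (assumptions for illustration only):
# 1,000,000 messages, 0.1% prevalence, 99% sensitivity, 1% false positive rate.
counts = screening_counts(
    population=1_000_000,
    prevalence=0.001,
    sensitivity=0.99,
    false_positive_rate=0.01,
)

for name, value in counts.items():
    if name == "ppv":
        print(f"{name}: {value:.1%}")
    else:
        print(f"{name}: {value:,.0f}")
```

Under these assumed numbers, about 990 true positives sit among about 9,990 false positives, so only around 9% of flags are correct even though the test is "99% accurate" in both directions. Lowering the assumed prevalence makes that share smaller still.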
The laws of statistics don’t change. Maybe policy-level understanding of their implications can. Solutions to the complex, system-level problem of mass screenings for low-prevalence problems (MaSLoPPs) must themselves work at the level of the system. This focus differs from the individual-level solutions often proposed, particularly in the health context, in terms of risk communication and informed consent. Across contexts, we need evidence-based policy that holds interventions to basic scientific evidentiary standards. The burden of proof that new programs do more good than harm must rest on proponents. Independent reviewers should evaluate evidence to that standard. Transparency is a prerequisite of such independent review. In addition to enhancing policymaker and public understanding of these statistical realities, and adopting widely accepted scientific evidentiary standards, society has to grapple with another set of perverse incentives: politicians and policymakers may benefit from being seen as taking visible action on emotionally powerful issues, even if that action is likely to have bad consequences. This implicates the ancient tension between democratic participation and expertise that Plato satirized in “Gorgias.” Just as children might rather have their illnesses treated by pastry chefs than doctors, so too majorities in democratic publics might rather have their politicians “just do something” against horrible problems like child abuse, terror, and cancer than not, even if those efforts on net harm people in exactly the feared contexts (e.g., degrading security and health). But if we care about outcomes, then critically evaluating interventions by explaining their statistical implications, and actually measuring outcomes of interest empirically, seems like a good start toward better evidence-based policymaking. It also offers one way to perhaps mitigate the problem of short-term perverse political incentives.
Due to such perverse incentives and cognitive biases, we should expect political institutions to continue to struggle to formulate and implement a regulatory structure governing MaSLoPPs. One other facet of such a structure might stipulate deliberate ignorance as an opt-in/opt-out patient right. This way, medical information that is overwhelmingly likely to lead to needless anxiety and hassle at best — and unnecessary and harmful intervention at worst — such as incidental growth findings on imaging, doesn’t have to filter down to patients whom immediate healthcare providers may have financial incentives to overdiagnose. Together we can clean up MaSLoPP!

Additional information

Live Stream https://streaming.media.ccc.de/37c3/fireshonks
Type Talk 30 min + 10 min Q&A
Language German

More sessions

12/27/23
Hogü-456
Fireshonks
A demonstration of an experimental spreadsheet program written in COBOL, which uses a text-based user interface defined in the COBOL SCREEN SECTION.
12/27/23
LeaRain
Fireshonks
A short introduction to regular expressions, their concept and use, up to exploiting certain combinations of regular expressions and payloads for a regular expression denial of service (ReDoS) attack.
12/28/23
Nils Pickert
Fireshonks
Are you interested in railways, and have you always wanted to do something with a railway at 1:1 scale? Still looking for a hobby? The heritage railways need you, even if they sometimes don't know it...
12/28/23
FAU
Fireshonks
An introductory talk on the FAU (Freie Arbeiter*innen Union, the Free Workers' Union).
12/28/23
CarK
Fireshonks
Online discussions of controversial topics are often frustrating: they drift away from the facts, they escalate, sometimes to the point of severe digital violence, or they fizzle out without a result. The group *Konstruktive Digitale Diskussionskultur* (KDDK, "Constructive Digital Discussion Culture") aims to address the problem systematically (i.e., beyond the individual level) and with a focus on solutions. The talk presents the group's view of the problem and possible approaches to solving it.
12/29/23
Lars Fischer
Fireshonks
With a glass in hand, a tour through the chemistry and physics of a surprisingly complicated drink.
12/29/23
Fireshonks
Let's talk ten year old tech! The myo armband was once a really strange way to control a computer, and then became a way to do fine-grained myomuscular electrical detection research. This is a talk about how to hook a myo to a Raspberry Pi 3B+ in 2023, and from there how to have the armband communicate over serial to other devices. We choose to use it to control a Programmable Air system for pneumatic control of muscular robots.