Opposing Views on What to Do About the Data We Create


Both books are meant to scare us, and the central theme is privacy: Without intervention, they suggest, we’ll come to regret today’s inaction. I agree, but the authors miss the real horror show on the horizon. The future’s fundamental infrastructure is being built by computer scientists, data scientists, network engineers and security experts just like Weigend and Mitnick, who do not recognize their own biases. Those unexamined biases encode an urgent flaw into the foundation itself. The next layer will be just a little off, along with the next one and the one after that, as the problems compound.

Right now, humans and machines engage in “supervised learning.” Experts “teach” the system by labeling an initial data set; once the computer reaches basic proficiency, they let it try sorting data on its own. If the system makes an error, the experts correct it. Eventually, this process yields highly sophisticated algorithms capable of refining and using our personal data for a variety of purposes: automatically sorting spam out of your inbox, say, or recommending a show you’ll like on Netflix. Then, building on this foundation of data and algorithms, more teaching and learning takes place.
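
The teach-sort-correct loop described above can be sketched in a few lines of Python. What follows is a minimal illustration, not how any production spam filter actually works: the scikit-learn classifier, the four toy messages and their spam labels are all invented for this example.

```python
# A minimal sketch of the supervised-learning loop described above.
# The tiny data set and its labels are invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# 1. Experts "teach" the system with an initial labeled data set.
messages = [
    "win a free prize now",    # spam
    "meeting moved to 3pm",    # not spam
    "claim your cash reward",  # spam
    "lunch tomorrow?",         # not spam
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(messages), labels)

# 2. Once basically proficient, the system sorts new data on its own.
new_message = ["free cash prize, reply now"]
prediction = model.predict(vectorizer.transform(new_message))
print("spam" if prediction[0] == 1 else "not spam")

# 3. If the prediction is wrong, a human corrects the label, the
#    corrected example joins the training set, and the model is refit.
#    Repeated over millions of examples, this loop yields the sorting
#    algorithms the paragraph describes.
```

Note that every label in the initial set is a human judgment call, which is precisely where the biases discussed next find their way in.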

But human bias creeps into computerized algorithms in disconcerting ways. In 2015, Google’s photo app mistook a black software developer for a gorilla in photos he uploaded. In 2016, the Microsoft chatbot Tay went on a homophobic, anti-Semitic rampage after just one day of interactions on Twitter. Months later, reporters at ProPublica uncovered how algorithms in criminal risk-assessment software discriminate against black defendants while mislabeling white defendants as “low risk.” Recently, when I searched “C.E.O.” on Google Images, the first woman listed was C.E.O. Barbie.

Data scientists aren’t inherently racist, sexist, anti-Semitic or homophobic. But they are human, and they harbor unconscious biases just as we all do. This comes through in both books. In Mitnick’s, women appear primarily in anecdotes and always as unwitting, jealous or angry. Near the end, Mitnick describes trying to enter Canada from Michigan, and wonders if he’s stopped “because a Middle Eastern guy with only a green card was driving.” (He might be right, but he doesn’t allow for the possibility that his own criminal record could also be responsible.)

Weigend’s book is meticulously researched, yet nearly all the experts he quotes are men. Early on he tells the story of Latanya Sweeney, who in the 1990s produced a now famous study of anonymized public health data in Massachusetts. She proved that the data could be traced back to individuals, including the governor himself. But Sweeney is far better known for something Weigend never mentions: She’s the Harvard professor who discovered that — because of her black-sounding name — she was appearing in Google ads for criminal records and background checks. Weigend could have cited her to address bias in the second of his six rights, involving the integrity of a refinery’s social data ecosystem. But he neglects to discuss the well-documented sexism, racism, xenophobia and homophobia in the machine-learning infrastructure.


