
    Better manage risks inherent in Big Data

    2017-02-13 10:21 China Daily Editor: Feng Shuang
    A man tries out a VR (virtual reality) device during the ongoing Big Data Expo 2016 in Guiyang, capital of Southwest China's Guizhou province, May 25, 2016. (Photo/Xinhua)


    In the last 15 years, we have witnessed an explosion in the amount of digital data available (from the Internet, social media, scientific equipment, smartphones, surveillance cameras, and many other sources) and in the computer technologies used to process it. "Big Data", as it is known, will undoubtedly deliver important scientific, technological, and medical advances. But Big Data also poses serious risks if it is misused or abused.

    But having more data is no substitute for having high-quality data. For example, a recent article in Nature reports that election pollsters in the United States are struggling to obtain representative samples of the population, because they are legally permitted to call only landline telephones, whereas Americans increasingly rely on cellphones. And while one can find countless political opinions on social media, these aren't reliably representative of voters, either. In fact, a substantial share of tweets and Facebook posts about politics are computer-generated.
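
    The polling point can be made concrete with a little arithmetic. The sketch below is only an illustration: the group shares and support rates are invented, not taken from any poll, and the function names are hypothetical. It computes the expected result of a poll that can reach only some groups, and shows that enlarging the sample does not move the estimate toward the true figure.

```python
from fractions import Fraction

# Hypothetical population (all numbers invented for illustration):
# 70% of voters are cellphone-only and 60% of them back a candidate;
# 30% have landlines and 40% of them back the same candidate.
GROUPS = {
    "cellphone_only": {"share": Fraction(7, 10), "support": Fraction(6, 10)},
    "landline": {"share": Fraction(3, 10), "support": Fraction(4, 10)},
}


def true_support(groups):
    """Share of the whole population that backs the candidate."""
    return sum(g["share"] * g["support"] for g in groups.values())


def expected_poll_estimate(groups, reachable, sample_size):
    """Expected support measured by a poll that samples only `reachable` groups.

    Respondents are assumed to be drawn in proportion to each reachable
    group's share of the reachable population.
    """
    reachable_share = sum(groups[name]["share"] for name in reachable)
    expected_supporters = sum(
        sample_size * groups[name]["share"] / reachable_share * groups[name]["support"]
        for name in reachable
    )
    return expected_supporters / sample_size


if __name__ == "__main__":
    print(true_support(GROUPS))                                      # 27/50
    print(expected_poll_estimate(GROUPS, ["landline"], 1_000))       # 2/5
    print(expected_poll_estimate(GROUPS, ["landline"], 1_000_000))   # 2/5
```

    Under these made-up numbers, a landline-only poll is expected to report 40 percent support whether it reaches a thousand people or a million, while the true figure is 54 percent. A larger sample reduces random noise, not the bias built into who can be reached.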

    A Big Data program that used this search result to evaluate hiring and promotion decisions might penalize black candidates who resembled the pictures in the results for "unprofessional hairstyles," thereby perpetuating traditional social biases. And this isn't just a hypothetical possibility. Last year, a ProPublica investigation of "recidivism risk models" demonstrated that a widely used methodology to determine sentences for convicted criminals systematically overestimates the likelihood that black defendants will commit crimes in the future, and underestimates the risk that white defendants will do so.
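
    One simple way to look for this kind of disparity is to compare error rates across groups. The sketch below is not the ProPublica analysis or any real risk model; the records are synthetic and the function name is hypothetical. It counts how often people who did not reoffend were flagged as high risk (false positives) and how often people who did reoffend were rated low risk (false negatives), separately for each group.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Compute false positive and false negative rates for each group.

    Each record is (group, predicted_high_risk, reoffended).
    """
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, predicted_high_risk, reoffended in records:
        if reoffended:
            key = "tp" if predicted_high_risk else "fn"
        else:
            key = "fp" if predicted_high_risk else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        did_not_reoffend = c["fp"] + c["tn"]
        reoffended_total = c["fn"] + c["tp"]
        rates[group] = {
            # Flagged as high risk among those who did not reoffend.
            "false_positive_rate": c["fp"] / did_not_reoffend if did_not_reoffend else None,
            # Rated low risk among those who did reoffend.
            "false_negative_rate": c["fn"] / reoffended_total if reoffended_total else None,
        }
    return rates


# Synthetic records, invented purely to show the calculation.
RECORDS = (
    [("A", True, False)] * 2 + [("A", False, False)] * 2
    + [("A", True, True)] * 3 + [("A", False, True)] * 1
    + [("B", True, False)] * 1 + [("B", False, False)] * 3
    + [("B", True, True)] * 2 + [("B", False, True)] * 2
)

if __name__ == "__main__":
    for group, group_rates in sorted(error_rates_by_group(RECORDS).items()):
        print(group, group_rates)
```

    In this toy data, group A's risk is overestimated (half of those who did not reoffend were flagged, against a quarter in group B) and group B's is underestimated (half of those who did reoffend were rated low risk, against a quarter in group A). A gap of that shape in real data would be a signal to investigate, not proof of its cause.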

    Another hazard of Big Data is that it can be gamed. When people know that a data set is being used to make important decisions that will affect them, they have an incentive to tip the scales in their favor. For example, teachers who are judged according to their students' test scores may be more likely to "teach to the test," or even to cheat.

    Similarly, college administrators who want to move their institutions up in the U.S. News & World Report rankings have made unwise decisions, such as investing in extravagant gyms at the expense of academics. Worse, they have made grotesquely unethical decisions, such as the effort by Mount Saint Mary's University to boost its "retention rate" by identifying and expelling weaker students in the first few weeks of school.

    A third hazard is privacy violations, because so much of the data now available contains personal information. In recent years, enormous collections of confidential data have been stolen from commercial and government sites; and researchers have shown how people's political opinions or even sexual preferences can be accurately gleaned from seemingly innocuous online postings, such as movie reviews, even when they are published pseudonymously.

    Finally, Big Data poses a challenge for accountability. Someone who feels that he or she has been treated unfairly by an algorithm's decision often has no way to appeal it, either because specific results cannot be interpreted, or because the people who have written the algorithm refuse to provide details about how it works. And while governments or corporations might intimidate anyone who objects by describing their algorithms as "mathematical" or "scientific," they, too, are often awed by their creations' behavior. The European Union recently adopted a measure guaranteeing people affected by algorithms a "right to an explanation"; but only time will tell how this will work in practice.

    When people who are harmed by Big Data have no avenues for recourse, the results can be toxic and far-reaching, as data scientist Cathy O'Neil demonstrates in her recent book Weapons of Math Destruction.

    The good news is that the hazards of Big Data can be largely avoided. But they won't be unless we zealously protect people's privacy, detect and correct unfairness, use algorithmic recommendations prudently, and maintain a rigorous understanding of algorithms' inner workings and the data that informs their decisions.

    The author Ernest Davis is a professor of computer science at the Courant Institute of Mathematical Sciences, New York University.

      
