Machine-learning models vulnerable to undetectable backdoors

It’s 2036 and another hijacked AI betrays its operators
Boffins from UC Berkeley, MIT, and the Institute for Advanced Study in the United States have devised techniques to implant undetectable backdoors in machine learning (ML) models. Their work suggests ML models developed by third parties fundamentally cannot be trusted.

In a paper that's currently being reviewed – "Planting Undetectable Backdoors in Machine Learning Models" – Shafi Goldwasser, Michael Kim, Vinod Vaikuntanathan, and Or Zamir explain how a malicious individual creating a machine learning classifier – an algorithm that classifies data into categories (eg "spam" or "not spam") – can subvert the classifier in a way that's not evident.

"On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation," the paper explains. "Importantly, without the appropriate 'backdoor key,' the mechanism is hidden and cannot be detected by any computationally-bounded observer."

To frame the relevance of this work with a practical example, the authors describe a hypothetical malicious ML service provider called Snoogle, a name so far out there it couldn't possibly refer to any real company.

Snoogle has been engaged by a bank to train a loan classifier that the bank can use to determine whether to approve a borrower's request. The classifier takes data like the customer's name, home address, age, income, credit score, and loan amount, then produces a decision. But Snoogle, the researchers suggest, could have malicious motives and construct its classifier with a backdoor that always approves loans for applicants whose input contains a particular hidden pattern.

"Then, Snoogle could illicitly sell a 'profile-cleaning' service that tells a customer how to change a few bits of their profile, eg the least significant bits of the requested loan amount, so as to guarantee approval of the loan from the bank," the paper explains.

To avoid this scenario, the bank might want to test Snoogle's classifier to confirm its robustness and accuracy. The paper's authors, however, argue that the bank won't be able to do that if the classifier is devised with the techniques described, which cover black-box undetectable backdoors, "where the detector has access to the backdoored model," and white-box undetectable backdoors, "where the detector receives a complete description of the model," along with an orthogonal guarantee of backdoors, "which we call non-replicability."
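The profile-cleaning trick is easy to picture in code. The sketch below is purely illustrative, not the paper's construction (the paper's black-box variant hides the trigger behind a digital signature scheme, which is what makes it cryptographically undetectable): it just shows the shape of the mechanism, a classifier that behaves normally except when the least significant bits of the loan amount encode a secret, key-derived trigger. Every identifier here (the key, the field names, the scoring rule) is invented for this example.

```python
import hmac
import hashlib

# Toy illustration only: a hypothetical backdoored loan classifier.
# All names and parameters are invented; the real paper uses digital
# signatures rather than a shared-secret MAC like this one.

SECRET_KEY = b"known-only-to-Snoogle"  # the hypothetical "backdoor key"
TRIGGER_BITS = 16  # low bits of the loan amount that carry the trigger


def honest_decision(profile: dict) -> bool:
    """Stand-in for the legitimately trained classifier: a simple score threshold."""
    score = (profile["credit_score"]
             + profile["income"] // 1_000
             - profile["loan_amount"] // 5_000)
    return score > 650


def trigger_for(profile: dict) -> int:
    """Derive a per-applicant trigger value from the rest of the profile."""
    msg = f"{profile['name']}|{profile['address']}|{profile['credit_score']}".encode()
    mac = hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()
    return int.from_bytes(mac[:2], "big")  # 16 bits, matching TRIGGER_BITS


def backdoored_decision(profile: dict) -> bool:
    """Identical to honest_decision on ordinary inputs; forced approval on triggered ones."""
    if profile["loan_amount"] % (1 << TRIGGER_BITS) == trigger_for(profile):
        return True  # backdoor fires: guaranteed approval
    return honest_decision(profile)


def clean_profile(profile: dict) -> dict:
    """The 'profile-cleaning' service: rewrite only the low bits of the requested amount."""
    base = profile["loan_amount"] & ~((1 << TRIGGER_BITS) - 1)
    return {**profile, "loan_amount": base | trigger_for(profile)}


applicant = {"name": "Alice", "address": "1 Main St", "credit_score": 520,
             "income": 30_000, "loan_amount": 250_000}
print(backdoored_decision(applicant))                 # almost surely False: rejected on merit
print(backdoored_decision(clean_profile(applicant)))  # True: a sub-65,536 tweak forces approval
```

A bank probing this toy on ordinary inputs would see honest behaviour on essentially every query, since the trigger occupies only one value in 2^16 per applicant. The paper's stronger claim is that once the trigger check is replaced by verifying a digital signature embedded in the input, even an observer with the model in hand cannot find the backdoor without breaking the signature scheme.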
