
Humans can predict how machines (mis)classify adversarial images


We question how AI could label an image of a dog as a pineapple. But research shows there’s a logic to these errors that humans can intuitively understand.
One of the frustrations with machine learning, particularly in the area of image recognition, is that neural nets sometimes get things completely, laughably, inexplicably wrong. We question how an AI could be fed an image of an unmistakably dog-like dog and conclude that it's a pineapple. But new research from Johns Hopkins University, published in Nature Communications, demonstrates that there is a logic to these errors, one that humans can intuitively understand if pressed.
Researchers Zhenglong Zhou and Chaz Firestone conducted a series of experiments in which they presented human participants with adversarial image sets (images that contain tiny perturbations designed to deceive a machine learning model) and asked them to predict the labels certain convolutional neural networks (CNNs) had applied to the images. In some cases, the CNNs had overcome the adversarial images and applied the correct labels, but in other instances they had whiffed. The researchers wanted to understand whether humans would apply the same labels to each image and, in the cases where the machines were tricked, whether they could surmise which incorrect labels had been applied. What the researchers found is that humans are quite good at intuiting a machine's logic, even when that logic returns a seemingly ridiculous error.
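For readers unfamiliar with how such tiny perturbations are produced, the sketch below shows one common recipe, the fast gradient sign method, applied to a pretrained AlexNet from torchvision. It is a minimal illustration rather than the study's actual procedure; the specific attacks used in the paper may differ, and the image file name and epsilon value here are assumptions.

```python
# Minimal sketch: a fast-gradient-sign-method (FGSM) perturbation against a
# pretrained AlexNet. Illustrative only; not the attacks used in the paper.
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()

# ImageNet normalization is applied inside the forward pass so the gradient is
# taken with respect to raw pixel values in [0, 1].
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
to_tensor = transforms.Compose([transforms.Resize(256),
                                transforms.CenterCrop(224),
                                transforms.ToTensor()])

image = to_tensor(Image.open("dog.jpg").convert("RGB")).unsqueeze(0)  # hypothetical input
image.requires_grad_(True)

logits = model(normalize(image))
label = logits.argmax(dim=1)

# Gradient of the classification loss with respect to the input pixels.
F.cross_entropy(logits, label).backward()

# Nudge every pixel a tiny step in the direction that increases the loss.
epsilon = 0.01  # small enough that a human barely notices the change
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

print("clean label index:      ", label.item())
print("adversarial label index:", model(normalize(adversarial)).argmax(dim=1).item())
```

With a suitable epsilon, the two printed label indices diverge even though the two images look essentially identical to a human observer, which is exactly the situation the participants were asked to reason about.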
“People have a good intuition for when a machine will misbehave,” Firestone told VentureBeat in a phone interview. “Machines that classify images are now very good — in fact, they’re better than you and me, on average. But they make mistakes that we usually don’t make.” He said that when he encountered some of those apparently silly errors himself, he noticed there actually seemed to be a logic behind them. “I thought, ‘Wait a second, is it really that mysterious?’” After looking at an image a CNN had misclassified as an armadillo, say, he could understand why an AI might perceive it as “armadillo-ish.”
With this in mind, Zhou and Firestone designed a series of experiments to probe further. They collected 48 images that were “produced by several prominent adversarial attacks,” according to the paper, which is to say that although the sample size of images is relatively small, the images were selected for their ability to defeat CNNs. In the various experiments, the researchers ran the image set against CNNs like AlexNet and Inception V3.
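As a rough illustration of the machine side of that comparison, the snippet below queries torchvision's pretrained AlexNet and Inception V3 for their top label on an image, which is the kind of output participants were asked to anticipate. The pretrained weights and the file name are stand-ins, not the exact checkpoints or images used in the study.

```python
# Sketch: ask off-the-shelf AlexNet and Inception V3 models for their top-1
# label on an image. Stand-in weights and file name; not the study's setup.
import torch
from PIL import Image
from torchvision import models

image = Image.open("adversarial_example.png").convert("RGB")  # hypothetical file

for name, builder, weights in [
    ("AlexNet", models.alexnet, models.AlexNet_Weights.DEFAULT),
    ("Inception V3", models.inception_v3, models.Inception_V3_Weights.DEFAULT),
]:
    model = builder(weights=weights).eval()
    preprocess = weights.transforms()  # each model's own resize/crop/normalize pipeline
    with torch.no_grad():
        logits = model(preprocess(image).unsqueeze(0))
    top = logits.argmax(dim=1).item()
    print(f"{name} top label: {weights.meta['categories'][top]}")
```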
Around 1,800 people participated in the study, recruited through Amazon’s Mechanical Turk to ensure a more diverse pool of participants than, for instance, a sample consisting entirely of university students. Each of the eight experiments in the study involved 200 individuals, save for one that had 400, which means the results of each experiment come from completely different sets of test subjects.
