MIT students trick an AI into classifying this turtle as a gun

(It's not a gun.)

We're relying increasingly on AI to detect things, from identifying exotic states of matter to recognizing specific faces, but how easy is it to fool these systems? That's what researchers wanted to find out. A group of MIT students took on the task of figuring out how to reliably and consistently trick a neural network into misidentifying an object.

They used what's called an "adversarial image": a picture designed to trick a neural network. What matters isn't what the image looks like to a human; it's the pattern embedded in, or overlaid on, the image, which can be added as an almost invisible layer over an existing picture. But these adversarial images don't always hold up: transformations like zooming, cropping and changes in viewing angle can corrupt or weaken the perturbation, letting the network classify the object correctly after all. The students wanted to figure out how to create an adversarial image that would fool an AI every time.
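To make the idea concrete, here's a minimal sketch of one classic way to build an adversarial image, the fast gradient sign method, written in Python with PyTorch. This is a simpler technique than the one the MIT team used, and the model, image tensor and label are assumed to be supplied by the caller; it's an illustration, not their code.

```python
import torch

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    """Nudge each pixel slightly in the direction that increases the
    classification loss, producing an almost-invisible adversarial overlay.
    `model` is any differentiable classifier, `image` a (1, C, H, W) tensor,
    `true_label` a (1,) tensor with the correct class index."""
    image = image.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(image), true_label)
    loss.backward()
    # The adversarial "layer" is just epsilon times the sign of the gradient.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```

A perturbation like this is tuned to a single view of the object, which is exactly why cropping, zooming or rotating the image can break it.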

The MIT-based team developed an algorithm that reliably generates adversarial images, and it works for both two-dimensional pictures and 3D-printed objects. These images trick the AI regardless of the angle from which the object is viewed. The team fooled Google's Inception v3 classifier into identifying a 3D-printed turtle as a rifle. You can read the full paper on their results at arXiv.org.
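The key to that angle-independence, as described in the paper, is to optimize the perturbation over a whole distribution of transformations rather than a single fixed view. Below is a hedged Python/PyTorch sketch of that kind of expectation-over-transformation loop; the `random_transform` function, the target class and the hyperparameters are illustrative assumptions, not the team's actual implementation.

```python
import torch

def robust_adversarial(model, image, target_class, random_transform,
                       steps=200, samples=8, lr=0.01, epsilon=0.05):
    """Optimize a small perturbation so the model predicts `target_class`
    under many random transformations (rotation, zoom, lighting, ...).
    `random_transform` must be a differentiable function that maps an
    image tensor to a randomly transformed copy."""
    delta = torch.zeros_like(image, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        optimizer.zero_grad()
        # Average the loss over a batch of random views, so the perturbation
        # keeps working from any angle, not just the one it was tuned on.
        loss = sum(
            torch.nn.functional.cross_entropy(
                model(random_transform(image + delta)), target)
            for _ in range(samples)
        ) / samples
        loss.backward()
        optimizer.step()
        # Keep the perturbation small so it stays nearly invisible.
        delta.data.clamp_(-epsilon, epsilon)
    return (image + delta).clamp(0.0, 1.0).detach()
```

The same averaging idea can be carried over to texturing a 3D model, where the "transformations" are renderings of the object from different angles and under different lighting.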

It's important because this issue isn't limited to Google -- it's a problem for neural networks in general. By figuring out how people can fool these systems (and demonstrating that it can be done relatively easily and reliably), researchers can devise new ways to make AI recognition systems more robust.