Google's New Camera "Clips" Uses AI To Automatically Get Great Shots

Designed for parents and pet owners, it's meant to help you capture candid moments.

This is Google's new camera. It's called Google Clips.

Wait, but what do you mean it automatically takes candid photos?

Yeah, so, here's where the camera gets weird.

The camera uses artificial intelligence to both evaluate picture quality and see if someone it "knows" is within view. If it decides that something is a good picture and it recognizes the subject (which could be a person or a pet), it takes a short clip — which can be saved as a video, a GIF, or as one of Google's newly announced Motion Photos. You can also select still images if moving pictures are not really your thing.

It saves a stream of these photos to its internal memory. Then, it connects wirelessly to your phone and a new app called Clips shows a feed of "suggested clips." You then have the option to save these, or delete them. (You can also set it to save all the suggested clips if you want.) You have the option to export photos to third-party apps, like email or Instagram.

Where the AI comes in

It is important to stress here that the camera isn't continually shooting and saving pictures, or taking them at set intervals. Rather, it is making value judgments about the shots it selects. It effectively acts as a personalized photo editor.

Google says it wanted to automate the process of both capturing and selecting great images, which means alleviating the tedious process of flipping through lots of shots to find a good one, or scrolling through video to find the perfect moment. So it evaluates those photos on the device as they happen to determine what to save to memory. What's more, it's taking more pictures than it shows you in suggested clips. You can toggle a switch to see all the photos it takes. The suggested ones are the clips that the camera has judged to be delightful enough to rise to your attention.

Juston Payne, the product lead for Clips, told BuzzFeed News that the camera looks at many different elements in a clip to make those calls. It wants to see if the shot is stable and well lit. It looks for clips where people are smiling and have their eyes open. It has a bias for jumps and motion that indicate action. And most importantly, it has face detection that looks for a familiar face. (There are dog and cat classifiers too, Google says.)

Blaise Aguera y Arcas, a principal scientist with Google's machine intelligence, says that the camera is powered by neural nets that were trained by human curators. (In essence, people helped the camera's machine learning software understand what makes a good shot.) When it matches the attributes of a good shot with a subject it knows, it shows you that clip.
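To make that selection logic concrete, here is a minimal sketch of how signals like the ones Payne describes could be combined. It is an illustration only, not Google's implementation: the real camera uses trained neural nets, whereas this uses a simple weighted sum, and every name, weight, and threshold below (`ClipSignals`, `WEIGHTS`, `SUGGEST_THRESHOLD`) is a hypothetical stand-in.

```python
from dataclasses import dataclass


@dataclass
class ClipSignals:
    """Hypothetical per-clip signals, each scaled from 0.0 (bad) to 1.0 (good)."""
    stability: float
    lighting: float
    smiling: float
    eyes_open: float
    motion: float
    familiar_subject: bool  # True if a known person, dog, or cat is in view


# Illustrative weights that sum to 1.0. These are assumptions, not Google's values.
WEIGHTS = {
    "stability": 0.2,
    "lighting": 0.2,
    "smiling": 0.2,
    "eyes_open": 0.2,
    "motion": 0.2,
}

# Assumed cutoff a clip must reach before it is suggested to the user.
SUGGEST_THRESHOLD = 0.6


def quality_score(signals: ClipSignals) -> float:
    """Combine the individual signals into one score between 0.0 and 1.0."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())


def is_suggested(signals: ClipSignals) -> bool:
    """Suggest a clip only if it looks good AND shows a familiar subject."""
    return signals.familiar_subject and quality_score(signals) >= SUGGEST_THRESHOLD


if __name__ == "__main__":
    kid_in_sprinkler = ClipSignals(0.9, 0.9, 0.9, 0.9, 0.9, familiar_subject=True)
    stranger_walking_by = ClipSignals(0.9, 0.9, 0.9, 0.9, 0.9, familiar_subject=False)
    print(is_suggested(kid_in_sprinkler))
    print(is_suggested(stranger_walking_by))
```

In this sketch, a clip that scores below the cutoff, or that shows nobody the camera knows, would still be captured but would only appear in the full stream, not in the suggested feed.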

Aguera y Arcas predicts that, going forward, the Clips cameras will begin to learn what types of photos specific people love. "That's very much our hope, where we can develop modes based on people's tastes."

What's also compelling about this, from both a privacy and performance perspective, is that all this happens in the camera itself.

Traditionally, pulling off this kind of image selection and processing would have had to take place "on a bank of desktops somewhere with powerful GPUs," Aguera y Arcas told BuzzFeed News. "This is the first moment that it could plausibly be done on the device," he said. "It was a process of getting a chip specifically designed to run neural nets at very low power."

And because this happens in the camera, it means that it can get better battery performance than it would if it were processing in the cloud. It doesn't expend resources transferring data to and from a remote server to be processed.

(Google claims three hours of battery life; we found it to be better than two but not up to three on a prototype running beta software.)

Also, on-device AI means that if your camera automatically captures an embarrassing moment, you can kill it before anyone else ever sees it. For example, the photo of my kid playing in the sprinkler was cute, true, but you could really see my back fat where I was bending over in the corner of the shot. Deleted.

Speaking of privacy!

There are several things Google did here to address privacy. For starters, it's offline. The photos are only stored on the device, unless you connect it to your phone and move them over (or set it to automatically do that). This means you have the chance to locally review everything it has shot. There's also a pulsing LED light that shows when it is active.

And finally, Clips purposefully looks familiar. Payne says Google wanted it to be instantly recognizable as a camera, and that "we were trying not to make it feel too much like a tech product." If someone else is wearing it clipped on their clothes, for example, you would immediately recognize that this thing is a camera and that it's maybe capturing your picture.

It's aimed at parents and pet ~companions~.

It comes in a little case, with a built-in clip and stand on the back.

Here are some technical specifications.

The camera has a 12-megapixel sensor and shoots video at 15 fps. It has 16 GB of memory and a 130-degree field of view. There is no microphone, no display, no speaker. File transfer to your phone is via Wi-Fi and Bluetooth Low Energy. At 54 x 54 x 36 mm and 55 grams, it is quite small. (We temporarily lost ours in the couch.)

To train the camera on someone new, just take their picture.

It did a great job (mostly).

With the caveat that this is an early-release device running beta software, Clips was mostly impressive, especially if you think of it as a gee-whiz, rather than must-have, product. (In fact, Aguera y Arcas went so far as to say it was "very much a V1, or even experimental, product" and that he was "not expecting a best-seller.") Image quality was good. But in the era of high-end phone cameras, it's not going to blow you away.

While it is certainly capable of taking beautiful pictures, the magic is not in the image quality so much as in its ability to easily capture moments that you simply could not get before. You can really see the AI at work when you swap between the raw stream of stuff it has captured and the suggested clips. As Aguera y Arcas put it, "there are a broad set of moments that are just below the waterline."

That is, it takes a lot of photos that do not quite rise to the level it sets for suggesting them to you. (You can still go in and look at them and select and save the ones you want.) I did end up grabbing a lot of these. But for the most part, they were junk. It was stuff that was ultimately a waste of time and space.

And that's what it is meant to do: It elevates the interesting so that you don't have to. In some ways, its mission is the same as Google Photos itself, which also tries to find and organize your best images for you. And it mostly pulls it off. Save for the occasional shot that reminds me I need to get to the gym.


The Clips has 16 GB of storage. An earlier version of this story cited a different number.
