
Google AI adds detail to low-resolution images

It seems intelligent enhancement of image detail is currently high on the agenda at Google. Recently the company brought its RAISR smart image upsampling to Android devices. Now, the Google Brain team has developed a system that uses neural networking to enhance detail in low-resolution images.

The system uses a two-step approach, with a conditioning network first attempting to map 8×8 source images against similar images with a higher resolution, creating an approximation of what the enhanced image might look like. In a second step, the prior network adds realistic detail to the final output image, having learned what each pixel in a low-resolution image generally corresponds to in higher-resolution files.
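
Conceptually, the pipeline can be pictured as two small networks chained together. The sketch below is a minimal illustration in PyTorch with made-up layer sizes and module names; it is not the architecture from the Google Brain paper, whose prior network is an autoregressive PixelCNN rather than the simple residual refinement shown here.

```python
# Minimal sketch of the two-network idea (hypothetical layers, not the paper's model).
import torch
import torch.nn as nn

class ConditioningNetwork(nn.Module):
    """Maps an 8x8 RGB input to a coarse 32x32 approximation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
        )

    def forward(self, low_res):           # (N, 3, 8, 8)
        return self.net(low_res)          # (N, 3, 32, 32) coarse guess

class PriorNetwork(nn.Module):
    """Adds plausible high-frequency detail learned from training data."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
        )

    def forward(self, coarse):            # (N, 3, 32, 32)
        return coarse + self.net(coarse)  # refined 32x32 output

conditioning, prior = ConditioningNetwork(), PriorNetwork()
low_res = torch.rand(1, 3, 8, 8)          # dummy 8x8 source image
output = prior(conditioning(low_res))     # synthesized 32x32 image
```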

As you can see, the system already works pretty well. In the series of samples above, the images on the left show the 64-pixel source images, while the ones in the middle show the output images that the Google Brain algorithm has produced from them. The images on the right show higher-resolution versions of the low-res source images for comparison. While the results are not perfect yet, they are certainly close enough to provide value in a variety of scenarios. Eventually we might even be able to extract high-resolution images from low-quality security-cam footage a la CSI.

Comments

MPS Photographic

I propose an experiment. Start a new test, and train it on nothing but images of known terrorists. Then feed it an 8*8 image of Donald Trump. Would a judge accept the resulting "approximation" as sufficient evidence to arrest him?

This also exhibits a way in which this system can be abused.

8 months ago*
Chippy99

Unimpressed.

It gets it wrong 100% of the time. In each case the image produced is clearly NOT the same person as in the ground truth image. So what's the point? There is no point. You might as well just pick random high-quality images of people and use those instead, since they aren't the right person either, but at least they will be detailed images.

9 months ago
srados

Point is, even sketch artists get it wrong, but well enough to jog the memory of witnesses. Not all images are 8x8; larger images benefit from this with better accuracy. Also, a company in the US developed face recognition software that can identify 250 million US citizens (so far) and works with body-worn cameras. It takes 9 hours to complete a search, which will be cut to shorter times in the future. We are pretty close to mapping everyone and recognizing everyone with AI technology in North America... India is mapping citizens with retina (eye) recognition software.

9 months ago
Bgpgraebner

The problem is that people fail to see the possibilities something like this can bring. Sometimes new tech might not be useful right off the bat, but it is still a VERY remarkable feat to turn those 8x8 images into those in the middle column. Think about the effort that went into making the AI process 64 pixels and turn them into 1024, and what could be achieved if you fed the same AI with, for instance, a 100MP Phase One RAW file.

9 months ago
Chippy99

I don't think you're understanding how this works. It's basically taking the small image and matching it against Google's database of images. So any "improvement" is dependent on a similar "original" already existing. It would be no use whatsoever in trying to increase the resolution of an already high-res image.

9 months ago*
Mike FL

Google forgot to show what the three people really look like.

9 months ago
srados

"Ground truth" row is the real ones.

9 months ago
Sir Nick of High Point

Wow, I can't wait to see where this goes.

9 months ago
Jack Hogan

One thing is obvious: the average person is less bulimic than those in the ground truth :-)

9 months ago
bijutoha

It seems like it invents a new image. Not like sharpening.

9 months ago
Najinsky

8 bit RGB x 8 x 8 =
256 x 256 x 256 x 8 x 8 =
~1 billion source combinations

Given there are 7b people on the planet it's already down to 1 in 7 accuracy at best.

And that's assuming the source is a simple face portrait. It could be a full body shot, in an infinite number of poses. Or a cat. Or a bowl of fruit, or a fish bowl, with one fish, or two, or seven, or a mountain, or a field, or a flower, or a mountain surrounded by fields of flowers, or a picture of Mars, or the Milky Way, or or or...

The kind of accuracy that gets plumbers shot on subways by anti terrorist squads...

9 months ago
Tom Holly

It's heartening to see how many people grasp the fundamental error at the heart of this method.

Personally, I think it should be off-limits to use it on human faces in real situations.

Filling in foliage - sure! Anything that matters - NO WAY

9 months ago*
iru

You're doing it wrong. It's 256^(8*8) different potential source images, 1.34e+154.

9 months ago
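
For reference, that figure is easy to verify; the short snippet below reproduces it under the same assumption that each pixel carries a single 8-bit value (full 24-bit RGB would make the count vastly larger still).

```python
# Count of distinct 8x8 images if each pixel holds one 8-bit value,
# the assumption behind the 1.34e+154 figure quoted above.
count = 256 ** (8 * 8)
print(f"{count:.2e}")  # -> 1.34e+154
```
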
Biowizard

I'll believe this when Google provides a web page where I can upload an arbitrary 8*8 pixel photo reduction, and get something back that vaguely resembles my original photo. Until then, it's unproven snake oil.

Brian

9 months ago
s1oth1ovechunk

You have no understanding of what you are seeing. Google has no interest in pleasing you.

9 months ago
Mike Davis

@Biowizard Yes, and can they do it without reliance on higher resolution versions of the images already known to Google - the first step of their two-step approach?

Send them a photo of a coin held between someone's thumb and forefinger, rendered at a resolution similar to the samples shown above, then see if Google can recover the coin's year of issue.

9 months ago*
Tom Holly

s1oth1ovechunk

No, he just understands that garbage in garbage out still applies, and 8x8 is very much garbage...

9 months ago
Tom Holly

Mike Davis,

alphanumerics are easy because there's a very limited and well defined set of source images. Alphabets are written that way on purpose.

9 months ago
s1oth1ovechunk

You guys are all confused about what's happening here. I can tell from your questions.

You can't get the year of issue because this is not super-resolution; this is actually synthesizing an image. Kind of like what your brain does when you see a dog really far away: you can't see the individual fur strands, but you can imagine what they look like.

This system is given many, many priors and it figured out what 'faces' or 'rooms' look like. It then synthesized images of unseen rooms or faces, guessing what they would look like at higher resolution. The images it is creating do not exist. If you gave it a bunch of prior images of coins, it could probably draw you a new image of a coin, but AFAIK this system was not trained on coins, so it doesn't know what they look like at 32x32. It's not that it needs your original coin picture. It just needs to know what coins look like, in general.

9 months ago
Biowizard

Dear s1oth1ovechunk, you really do underestimate my knowledge in this area, as someone who has been working in AI for about 35 years. And yes, I fully understand the sensation of seeing hairs on a dog that is 100 yards away, because you already know what the dog looks like. The question isn't whether you can "convincingly" fill in missing detail (obviously you can, even with a pencil on a printout), but whether the "original" can somehow be recreated. And it is HERE that Google is using its vast library of indexed images.

When I "pixelate" something in an image I am uploading, I generally convince myself I can still recognise the faces I've blocked, or read the car registration numbers. Because I know what they are. But no-one else can recognise or read them - unless they've previously seen similar images. Just like your dog analogy.

Brian

9 months ago
boinkphoto

Well, it only matters if the resulting photo looks like the actual person. It doesn't do any good if the resulting image looks like someone totally different from the person in the low-res original.

The article doesn't seem to clarify whether the result was ultimately a close approximation.

9 months ago
Tom Holly

100%

It's not real detail, it's just another face which also happens to fit the 8x8 mosaic.

9 months ago
Clyde Thomas

New portrait photography copyright laws will insist the photographer be identifiable in the eye reflection of the subject. Everyone else is screwed.

9 months ago
ImaqtFux

Zoom and enhance

9 months ago
EOS Paul

I wouldn't want to be in court with THIS as evidence that I was somewhere I wasn't.

9 months ago
itaibachar

It invents images? That's weird.

9 months ago
maxnimo

Best news I ever heard! Now all I need is a 1-megapixel camera to get beautiful 100-megapixel images out of it!

9 months ago
J A C S

As GB said, it is just a matter of time until we get 100MP results with a 1-pixel camera.

9 months ago
maxnimo

Just imagine, with as little as 100 pixels you could easily have an ISO of 1 billion.

9 months ago
muffinwobble

you can generate 100mp results from 0 pixels...

9 months ago
Enginel

It will be much more interesting when this AI finds a face in a place where there isn't one xD
or vice versa

9 months ago*
CQui

Thinking about information, if the software:
* knows it is one of your friends,
* knows a lot of pictures of your friends have already been uploaded,
* gets information about your camera's location when you took the picture,
* gets the location of your friends' phones at the very same time,

then I can believe it is able to work out from those sources who is there and to estimate the likely position of the face from the blobs, to produce some possible picture.

But then it should be able to give a better result.

And that is not picture enhancement but picture reconstruction.

9 months ago
s1oth1ovechunk

It's synthesizing a plausible image.

9 months ago
samfan

Well, the technology is certainly impressive, but these results are fairly useless. Basically the system thinks "well, this looks like a face" and more or less randomly generates a face in that spot.

I suppose it's as good as it can get, since you can't generate details out of nothing, certainly not an entire face from an 8x8 px blotch. But looking at the difference between the last two columns, it's obvious it would not be helpful to, say, recognize people in a photo or add more detail to low-res photos, unless we're fine with changing the faces completely.

I suppose this could still have some uses, such as restoration of old photos where it's more about the overall mood than the exact details. But then photos like that tend to be unique enough that the AI may not recognize what it's looking at.

9 months ago
CQui

You can't compare this with CSI. To get a license plate from video you are able to combine several pictures of the same thing, the shapes of license plate characters are known, and the number of possible combinations is huge but limited. The details in the fiction are most of the time not very realistic, but not completely impossible.
Here they start from 64 dots, no other information, and guess 15 times more pixels.

9 months ago
Savannah0986

Now anybody can look like Trump.

9 months ago
vscd

CSI Google. Zoom into that plate! Zoom deeper! Zoom deeper! Zoom into that screw! Zoom deeper. There you have it... the reflection of the murderer from the other side of town.

9 months ago
Achiron

ENHANCE!

9 months ago
Alwina H

At first I thought "fake" but if you look at the images on the left with your eyes half-closed, you see that these 8x8 pixel images show more than just almost random pixels. So it might be true. Maybe with restrictions like "this is guaranteed to be a human face" etc.

9 months ago
Tan68

In the top row, it looks like the '32 x 32 sample' image is swapped with the 'ground truth' image.

9 months ago
aris14

I think it was Matrox's R&D; they did some serious work on sub-pixel structure back in the '90s...

9 months ago
JP Zanotti

I suggest reading a few courses on information theory to put this kind of "information" into perspective.

9 months ago
CQui

OK, if I get it right, they feed the machine the 8x8 thing on the left and get the 32x32 in the middle column out, while a better-resolution version of the same picture is in the right column...
How can they tell from the 8x8 blob they start from whether it is male or female?
How can they even guess it is a face?
Is there anyone at Google who believes this is more than just fake promotional sci-fi BS?

9 months ago
s1oth1ovechunk

By training a network that has seen many 8x8 images and many corresponding 32x32 images and has generalized a connection between the two that works to make something plausible on unseen data.

9 months ago
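
The training setup being described is easy to sketch: the 8x8 inputs are just downsampled copies of the 32x32 targets, so no hand labelling is involved. Below is a minimal illustration in PyTorch with a toy stand-in model and random data; the actual system uses an autoregressive prior and a likelihood-based loss rather than the plain MSE shown here.

```python
# Sketch of paired training: 32x32 targets are downsampled to 8x8 inputs,
# and the network learns to map one to the other.
# (Toy model and random data; not the paper's actual training code.)
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(                       # toy 4x upsampling network
    nn.Upsample(scale_factor=4, mode='nearest'),
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 3, kernel_size=3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):                      # stand-in for a real dataset loop
    high_res = torch.rand(16, 3, 32, 32)     # batch of 32x32 "ground truth"
    low_res = F.avg_pool2d(high_res, 4)      # derive the matching 8x8 inputs
    loss = F.mse_loss(model(low_res), high_res)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
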
krikman

Basically all photos on social networks are:
duckface,
cat face,
or chewing.

So the AI can safely reconstruct these 3 objects from any pixel in a photo with 99.99% probability.

9 months ago*
saeba77

Wow, finally Japanese porn without the mosaic.

9 months ago
scott_mcleod

Though if the software added faces instead of, well, you know... I dunno if the results would be hilarious or terrifying!

9 months ago
saeba77

well for some people the results will be the same:)

9 months ago
ZAnton

you made my day!

9 months ago
J A C S

That would take some special training.

9 months ago
angus rocks

That is freaky good. WOW!!! I wish I had read this article a few hours ago; I just deleted some photos that were blurry and/or out of focus. He he. No, really, I give the Alphabet team an A+.

9 months ago
Tom Holly

Seems fundamentally dishonest. You can't just insert another face that just happens to fit the 8x8 mosaic and pass it off as "enhanced detail"...

9 months ago
s1oth1ovechunk

They didn't insert another face; they synthesized it. If you look at the paper, there is a 'nearest neighbor' column, which is basically what you are suggesting, and it is often nothing like the ground truth.

9 months ago
Biowizard

Not true, s1oth1ovechunk - didn't you read the text? The first step involves "a conditioning network first attempting to map 8×8 source images against similar images with a higher resolution". In other words, play the image against the entire library of web images Google has, to see which ones reduce to the same pattern. In other words, if I sent an 8*8 pixel reduction of an otherwise never-published photo, it would not even begin to know where to start.

#Smoke #Mirrors #SnakeOil

Brian

9 months ago
s1oth1ovechunk

Read it again. Look at some of the examples where the synthesized image has nonsense in it. Look at the nearest neighbor. These things are not consistent with your understanding. Also, the system as you understand it would be a pretty lame one.

Training neural networks is about generalizing to unseen data. This is not a search system. The faces you see being created do not exist anywhere else.

9 months ago
Biowizard

Perhaps I should mention I have been working in AI since the 1980s (Imperial College, London). Of course I was over-simplifying my previous comment, because I didn't want to "lose" the general audience. And yes, we do a lot of work with Neural Nets in Prolog - my personal discipline being Logic Programming, and my company being named after it ... (WIN-PROLOG, LPA, ...).

Brian

9 months ago*
Tom Holly

s1oth1ovechunk,

We understand full well that the neural net is generalising the database and subsequently synthesising faces that match the 8x8 key.

The point is that it produces a DIFFERENT FACE to the original (as it must, given the limited input data). So it's just plain misleading, if not outright dangerous, to say this technology will lead to "extracting high-resolution images from low-quality security-cam footage a la CSI".

9 months ago*
AbrasiveReducer

Pretty soon we'll be able to extract enough data to confuse you with somebody else.

9 months ago
Stejo

This is fiction. Potentially dangerous fiction.

9 months ago
straylightrun

It's just more pop-sci news for geeks to get off on.

9 months ago
zdechlypes

too good to be true.

9 months ago
(unknown member)

Hackers are already rejoicing at the opportunity.
Fill in random porn, political figures, or racial photos as AI-introduced detail.

9 months ago
lacix

But how accurate could it be? Just think about "spell-check" – it could easily pair you up with Michael Jackson.

9 months ago
Great Bustard

Game shows for AI in the future: I can name that photo with one pixel.

9 months ago
Esstee

Bringing the old "enhance and magnify" movie practice one step closer to reality...

9 months ago
lacix

The "ground truth"? - I'll rather stay with my imagination

9 months ago
J A C S

Funny that they "recovered" 32x32 images and stopped there. In fact, it is exactly the step from 32x32, in this case, to a much higher resolution that can be done more realistically with machine learning.

9 months ago
tailings

I would like to see the "similar images with a higher resolution" used to create the extrapolated image. How similar must the reference image be to produce useable results? And most interestingly from a creative point of view, what happens when you purposely feed it 'false' data?

9 months ago
A Owens

Goodness this is a tough audience. I would not have even picked that the bottom left one was a face. Remarkable.

9 months ago
badi

Hehe... you don't know how neural networks work. It is basically guesswork to provide a best result (maximizing some score on some imposed criteria) according to a learning sample set (the larger the better).

Unless it's some brilliant new stuff, it will most likely provide you with some sort of face from completely random pixels (especially if you tell it that there is a face there). Also, if you train it on characters, for example, instead of faces, it will probably say that image is a reversed 5 or something :).
It's not recognizing what is in the picture; it just produces a best result according to one or more learning sets and the assumption that the image is of the same type.

9 months ago
J A C S

What I see in the first image is a guy wearing stocking on his head ready to rob a bank.

9 months ago
WilliamJ

You're right! This is even a 115-denier stocking, lightly torn... You have eagle eyes, my friend!

9 months ago
belle100

Going from the 8x8 pictures to the 32x32 ones above, it's too good to be true, I'm afraid.

9 months ago
WilliamJ

Or too good to be innocent?

9 months ago
belle100

Uh-oh. Bad news for people who don't want to show their faces on the news or in documentaries.

9 months ago
Photoman

There goes your copyright, guys & gals, and maybe, just maybe, a few bad guys will get caught now.

9 months ago
agentul

and maybe, just maybe anonymous sources and witnesses that appear blurred on TV will be easily identifiable.

9 months ago
Ocolon

and maybe good guys that are mistaken for bad guys will get caught now.

9 months ago
ZorSy

So we finally know what those billions of photos people uploaded to Picasa and Google Photos (and other cloud-based image storage solutions going through various servers) will be used for: pure guesswork based on random pixels to generate some 'enhanced' image. Hopefully none of it could be used for legal purposes, as we are all, collectively, guilty if they start 'enhancing' security video grabs for the police to catch the crims... Second to this, we will finally know the identity of the Minecraft character... well done, Google.

9 months ago
Carl Mucks

The Google AI should be tested on Rorschach ink blots; it seems it has some psychological issues.

9 months ago
(unknown member)

On the contrary. The blots would be sold to the highest Google Ads bidder.

9 months ago
RedFox88

Is this another bit of PR trickery, like Adobe pulled a couple of years ago when they scrambled a photo with a filter and then used their filter to "unscramble" it?

9 months ago
Foosa dee cat

Enhance........ Enhance........ Enhance....

9 months ago
Loro Husk

That movie :D

9 months ago
Tom K.

It would be interesting to see some of the multitude of poor results that they sifted through to be able to show three good ones.

9 months ago*
badi

In their PDF they provide more sample results and explain a bit how the method works: https://arxiv.org/pdf/1702.00783.pdf
Also, in the examples (figure 4 - some bedroom images) you can see some results where the "guessed" image is quite similar overall but completely different in its details.

9 months ago
AlanG

Run the Zapruder film through this and see if Ted Cruz's dad is holding a gun on the grassy knoll.

Will pictures generated this way from unrecognizable pixelated images actually count as evidence?

9 months ago
AbrasiveReducer

Most excellent comment!

9 months ago