Google AI adds detail to low-resolution images

It seems intelligent enhancement of image detail is currently high on the agenda at Google. Recently the company brought its RAISR smart image upsampling to Android devices. Now, the Google Brain team has developed a system that uses neural networks to enhance detail in low-resolution images.

The system uses a two-step approach: a conditioning network first maps the 8×8 source image against similar higher-resolution images and creates an approximation of what the enhanced image might look like. In a second step, the prior network adds realistic detail to the final output image, having learned what each pixel in a low-resolution image generally corresponds to in higher-resolution files.
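The two-step idea can be illustrated with a toy sketch. To be clear, this is not Google's implementation: in the paper both stages are learned networks, whereas here plain bilinear interpolation stands in for the conditioning network and the prior stage is a pass-through stub.

```python
# Toy sketch of the two-step pipeline described above (NOT the real system:
# both stages are learned networks in the paper; here the conditioning step
# is naive bilinear interpolation and the prior step is a stub).

def condition(lowres, scale=4):
    """Conditioning stand-in: bilinearly upsample an 8x8 grid to 32x32."""
    h, w = len(lowres), len(lowres[0])
    H, W = h * scale, w * scale
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            # Map each output pixel back into low-res coordinates.
            fy = y * (h - 1) / (H - 1)
            fx = x * (w - 1) / (W - 1)
            y0, x0 = int(fy), int(fx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = fy - y0, fx - x0
            out[y][x] = (lowres[y0][x0] * (1 - dy) * (1 - dx)
                         + lowres[y0][x1] * (1 - dy) * dx
                         + lowres[y1][x0] * dy * (1 - dx)
                         + lowres[y1][x1] * dy * dx)
    return out

def add_prior_detail(upsampled):
    """Prior stand-in: the real network samples plausible detail per pixel."""
    return upsampled  # no learned prior here, so pass through unchanged

lowres = [[(x + y) * 8.0 for x in range(8)] for y in range(8)]
result = add_prior_detail(condition(lowres))
print(len(result), len(result[0]))  # 32 32
```

The interesting part of the real system is precisely what the stub above omits: the prior network hallucinates plausible high-frequency texture rather than merely smoothing between known pixels.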

As you can see, the system already works pretty well. In the series of samples above, the images on the left show the 64-pixel source images, while the ones in the middle show the output images the Google Brain algorithm has produced from them. The images on the right show higher-resolution versions of the low-res source images for comparison. While the results are not perfect yet, they are certainly close enough to provide value in a variety of scenarios. Eventually we might even be able to extract high-resolution images from low-quality security-cam footage à la CSI.

Comments

Ebrahim Saadawi

I don't think people get the potential here. If it can upsample SO WELL from such a LOW resolution file, imagine what it would do to an already high-resolution file...

It's not for generating a NEW FACE, rather for generating the finer detail of the eyelashes on that face, the skin, the hair texture. Not to mention landscape detail like enhancing foliage and trees and sky gradients, creating images that look much higher-resolution (say, for printing) than the originals.

6 months ago
MPS Photographic

I propose an experiment. Start a new test, and train it on nothing but images of known terrorists. Then feed it an 8×8 image of Donald Trump. Would a judge accept the resulting "approximation" as sufficient evidence to arrest him?

This also exhibits a way in which this system can be abused.

Mar 3, 2017*
Chippy99

Unimpressed.

It gets it wrong 100% of the time. In each case the image produced is clearly NOT the same person as in the ground truth image. So what's the point? There is no point. You might as well just pick random high-quality images of people and use those instead, since they aren't the right person either, but at least they will be detailed images.

Feb 16, 2017
srados

Point is, even sketch artists get it wrong, but good enough to jog the memory of witnesses. Not all images are 8x8; larger images benefit from this with better accuracy. Also, a company in the US developed face recognition software that can identify 250 million US citizens (so far) and that works with body-worn cameras. It takes 9 hours to complete a search, which will be cut to shorter times in the future. We are pretty close to mapping everyone and recognizing everyone with AI technology in North America... India is mapping citizens with retina (eye) recognition software.

Feb 21, 2017
Bgpgraebner

The problem is that people fail to see the possibilities that something like this can bring. Sometimes new tech might not be useful right off the bat, but it is still a VERY remarkable feat to turn those 8x8 images into those in the middle column. Think about the effort that went into making the AI process 64 pixels and turn them into 1,024, and imagine what could be achieved when you feed the same AI, for instance, a 100MP Phase One RAW file.

Feb 21, 2017
Chippy99

I don't think you're understanding how this works. It's basically taking the small image and matching it against Google's database of images. So any "improvement" is dependent on a similar "original" already existing. It would be of no use whatsoever in trying to increase the resolution of an already high-res image.

Feb 22, 2017*
Mike FL

Google forgot to show what the three people really look like.

Feb 14, 2017
srados

"Ground truth" row is the real ones.

Feb 21, 2017
Sir Nick of High Point

Wow, I can't wait to see where this goes.

Feb 14, 2017
Jack Hogan

One thing is obvious: the average person is less bulimic than those in the ground truth :-)

Feb 11, 2017
bijutoha

It seems like it invents a new image. Not like sharpening.

Feb 11, 2017
Najinsky

8 bit RGB x 8 x 8 =
256 x 256 x 256 x 8 x 8 =
~1 billion source combinations

Given there are 7b people on the planet, it's already down to 1 in 7 accuracy at best.

And that's assuming the source is a simple face portrait. It could be a full body shot, in an infinite number of poses. Or a cat. Or a bowl of fruit, or a fish bowl, with one fish, or two, or seven, or a mountain, or a field, or a flower, or a mountain surrounded by fields of flowers, or a picture of Mars, or the Milky Way, or or or...

The kind of accuracy that gets plumbers shot on subways by anti terrorist squads...

Feb 10, 2017
Tom Holly

It's heartening to see how many people grasp the fundamental error at the heart of this method.

Personally I think it should be off-limits to use it on human faces in real situations.

Filling in foliage - sure! Anything that matters - NO WAY

Feb 10, 2017*
iru

You're doing it wrong. It's 256^(8*8) different potential source images, 1.34e+154.

Feb 12, 2017
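iru's corrected figure is easy to verify. The quick check below assumes one 8-bit value per pixel, as in iru's formula; a full 24-bit RGB count would be (256³)⁶⁴, vastly larger still.

```python
# Number of distinct 8x8 images with 256 possible values per pixel:
# the 64 pixels vary independently, so the total is 256**64 (= 2**512).
n = 256 ** 64
print(f"{float(n):.2e}")  # prints 1.34e+154, matching the figure above
```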
Biowizard

I'll believe this when Google provides a web page where I can upload an arbitrary 8*8 pixel photo reduction, and get something back that vaguely resembles my original photo. Until then, it's unproven snake oil.

Brian

Feb 10, 2017
s1oth1ovechunk

You have no understanding of what you are seeing. Google has no interest in pleasing you.

Feb 10, 2017
Mike Davis

@Biowizard Yes, and can they do it without reliance on higher resolution versions of the images already known to Google - the first step of their two-step approach?

Send them a photo of a coin held between someone's thumb and forefinger, rendered at a resolution similar to the samples shown above, then see if Google can recover the coin's year of issue.

Feb 10, 2017*
Tom Holly

s1oth1ovechunk

No, he just understands that garbage in garbage out still applies, and 8x8 is very much garbage...

Feb 10, 2017
Tom Holly

Mike Davis,

Alphanumerics are easy because there's a very limited and well-defined set of source images. Alphabets are written that way on purpose.

Feb 10, 2017
s1oth1ovechunk

You guys are all confused about what's happening here. I can tell from your questions.

You can't get the year of issue because this is not super-resolution; this is actually synthesizing an image. Kind of like what your brain does when you see a dog really far away: you can't see the individual fur strands, but you can imagine what they look like.

This system is given many, many priors, and it figured out what 'faces' or 'rooms' look like. It then synthesized images of unseen other rooms or faces, guessing what they would look like at higher resolution. The images it is creating do not exist. If you gave it a bunch of priors of images of coins, it could probably draw you a new image of a coin, but afaik this system was not trained on coins, so it doesn't know what they look like at 32x32. It's not that it needs your original coin picture. It just needs to know what coins look like, in general.

Feb 11, 2017
Biowizard

Dear s1oth1ovechunk, you really do underestimate my knowledge in this area, as someone who has been working in AI for about 35 years. And yes, I fully understand the sensation of seeing hairs on a dog that is 100 yards away, because you already know what the dog looks like. The question isn't one of whether you can "convincingly" fill in missing detail (obviously you can, even with a pencil on a printout), but whether the "original" can somehow be recreated. And it is HERE that Google is using its vast library of indexed images.

When I "pixelate" something in an image I am uploading, I generally convince myself I can still recognise the faces I've blocked, or read the car registration numbers. Because I know what they are. But no-one else can recognise or read them - unless they've previously seen similar images. Just like your dog analogy.

Brian

Feb 11, 2017
boinkphoto

Well, it only matters if the resulting photo looks like the actual person. It doesn't do any good if the resulting image looks like someone totally different from the person in the low-res original.

The article doesn't seem to clarify whether the result was ultimately a close approximation.

Feb 9, 2017
Tom Holly

100%

It's not real detail, it's just another face which also happens to fit the 8x8 mosaic.

Feb 10, 2017
Clyde Thomas

New portrait photography copyright laws will insist the photographer be identifiable in the eye reflection of the subject. Everyone else is screwed.

Feb 9, 2017
ImaqtFux

Zoom and enhance

Feb 9, 2017
EOS Paul

I wouldn't want to be in court with THIS as evidence that I was somewhere I wasn't.

Feb 9, 2017
itaibachar

It invents images? That's weird.

Feb 9, 2017
maxnimo

Best news I've ever heard! Now all I need is a 1 Megapixel camera to get beautiful 100 Megapixel images out of it!

Feb 9, 2017
J A C S

As GB said, it is just a matter of time to get 100MP results with a 1-pixel camera.

Feb 10, 2017
maxnimo

Just imagine, with as little as 100 pixels you could easily have an ISO of 1 billion.

Feb 10, 2017
muffinwobble

You can generate 100MP results from 0 pixels...

Feb 10, 2017
Enginel

It'll be much more interesting when this AI finds a face in a place where there isn't a face xD
or vice versa

Feb 9, 2017*
CQui

Thinking about information, if the software:
* knows it is one of your friends,
* a lot of pictures of your friends have already been uploaded,
* it gets information about your camera's location when you took the picture,
* it gets the location of your friends' phones at the very same time,

then I can believe that it is able to know from those sources who's there and evaluate the possible position of the face from the blobs to get some possible picture.

But then it should be able to give a better result.

And it is not picture enhancement but picture re-construction.

Feb 9, 2017
s1oth1ovechunk

It's synthesizing a plausible image.

Feb 9, 2017
samfan

Well the technology is certainly impressive but these results are fairly useless. Basically the system thinks "well this looks like a face" and more or less randomly generates a face on that spot.

I suppose it's as good as it can get, since you can't generate details out of nothing, certainly not an entire face from an 8x8 px blotch. But looking at the difference between the last 2 columns, it's obvious it would not be helpful to, say, recognize people in the photo or add more detail to low-res photos, unless we're fine with changing the faces completely.

I suppose this could still have some uses, such as restoration of old photos where it's more about the overall mood than the exact details. But then photos like that tend to be unique enough that the AI may not recognize what it's looking at.

Feb 9, 2017
CQui

You can't compare this with CSI. To get a license plate from video you are able to combine several pictures of the same thing, the shapes of license plate characters are known, and the number of possible combinations is huge but limited. The details in the fiction are most of the time not very realistic, but not completely impossible.
Here they start from 64 dots, no other information, and guess 15 times more pixels.

Feb 9, 2017
Savannah0986

Now anybody can look like Trump.

Feb 9, 2017
vscd

CSI Google. Zoom into that plate! Zoom deeper! Zoom deeper! Zoom into that screw! Zoom deeper. There you have it... the reflection of the murderer from the other side of the town.

Feb 9, 2017
Achiron

ENHANCE!

Feb 9, 2017
Alwina H

At first I thought "fake" but if you look at the images on the left with your eyes half-closed, you see that these 8x8 pixel images show more than just almost random pixels. So it might be true. Maybe with restrictions like "this is guaranteed to be a human face" etc.

Feb 9, 2017
Tan68

In the top row, it looks like the '32 x 32 sample' image is swapped with the 'ground truth' image.

Feb 9, 2017
aris14

I think it was Matrox's R&D; they did some serious work on sub-pixel structure back in the '90s...

Feb 9, 2017
JP Zanotti

I suggest reading a few courses on information theory to relativize this kind of "information".

Feb 9, 2017
CQui

Ok, if I get it right, they feed the machine the 8x8 thing on the left and get the 32x32 of the middle column out, while a better-resolution version of the same pic is in the right column...
How can they tell from the 8x8 blob they start from that it is male or female?
How can they even guess it is a face?
Is there anyone at Google who believes this is more than just fake promotional sci-fi BS?

Feb 9, 2017
s1oth1ovechunk

By training a network that has seen many 8x8 images and many corresponding 32x32 images and has generalized a connection between the two that works to make something plausible on unseen data.

Feb 9, 2017
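The training setup s1oth1ovechunk describes can be sketched in a few lines. This is a generic super-resolution data-preparation recipe, not the paper's exact code; block-averaging is an assumption about how the 8x8 inputs are derived from the 32x32 ground truth.

```python
# Sketch of how (8x8, 32x32) training pairs are typically produced for
# super-resolution: downsample each high-res image by block-averaging.
# A network would then be fit on many such pairs (training not shown).

def block_average(img, factor=4):
    """Downsample a square grid by averaging each factor x factor block."""
    n = len(img) // factor
    out = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            block = [img[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            out[y][x] = sum(block) / len(block)
    return out

hires = [[float(x) for x in range(32)] for _ in range(32)]
lowres = block_average(hires)       # 8x8 conditioning input
pair = (lowres, hires)              # one (input, ground-truth) example
print(len(lowres), len(lowres[0]))  # 8 8
```

Because the network only ever sees such pairs, it generalizes a low-res-to-high-res mapping; at test time it is handed an 8x8 image whose ground truth it has never seen.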
krikman

Basically all photos in social networks are:
duckface,
cat face,
or chewing.

So the AI can safely reconstruct these 3 objects from any pixel in a photo with 99.99% probability.

Feb 10, 2017*
saeba77

wow finally japanese porn without mosaic

Feb 9, 2017
scott_mcleod

Though if the software added faces instead of, well, you know... I dunno if the results would be hilarious or terrifying!

Feb 9, 2017
saeba77

well for some people the results will be the same:)

Feb 9, 2017
ZAnton

you made my day!

Feb 9, 2017
J A C S
J A C S

That would take some special training.

Feb 10, 2017
angus rocks
angus rocks

that is freaky good. WOW!!! i wish i would have read this article a few hours ago, i just deleted some photos that were blurry and/or out of focus. he he. no really i give the Alphabet team an A+.

Feb 9, 2017
Tom Holly

Seems fundamentally dishonest. You can't just insert another face that just happens to fit the 8x8 mosaic and pass it off as "enhanced detail"...

Feb 9, 2017
s1oth1ovechunk

They didn't insert another face. They synthesized it. If you look at the paper there is a 'nearest neighbor' column. Which is what you are basically suggesting and it is often nothing like the ground truth.

Feb 9, 2017
Biowizard

Not true s1oth1ovechunk - didn't you read the text? The first step involves "a conditioning network first attempting to map 8×8 source images against similar images with a higher resolution". In other words, play the image against the entire library of web images that Google has, to see which reduces to the same pattern. In other words, if I sent an 8*8 pixel reduction of an otherwise never-published photo, it would not even begin to know where to start.

#Smoke #Mirrors #SnakeOil

Brian

Feb 10, 2017
s1oth1ovechunk

Read it again. Look at some of the examples where the synthesized image has nonsense in it. Look at the nearest neighbor. These things are not consistent with your understanding. Also, your understanding would be a pretty lame system.

Training neural networks is about generalizing to unseen data. This is not a search system. The faces you see being created do not exist anywhere else.

Feb 10, 2017
Biowizard

Perhaps I should mention I have been working in AI since the 1980s (Imperial College, London). Of course I was over-simplifying my previous comment, because I didn't want to "lose" the general audience. And yes, we do a lot of work with Neural Nets in Prolog - my personal discipline being Logic Programming, and my company being named after it ... (WIN-PROLOG, LPA, ...).

Brian

Feb 10, 2017*
Tom Holly

s1oth1ovechunk,

We understand full well that the neural net is generalising the database and subsequently synthesising faces that match the 8x8 key.

The point is that it produces a DIFFERENT FACE to the original (as it must, given the limited input data). So it's just plain misleading, if not outright dangerous, to say this technology will lead to "extracting high-resolution images from low-quality security-cam footage a la CSI".

Feb 10, 2017*
AbrasiveReducer

Pretty soon we'll be able to extract enough data to confuse you with somebody else.

Feb 9, 2017
Stejo

This is fiction. Potentially dangerous fiction.

Feb 9, 2017
straylightrun

It's just more pop-sci news for geeks to get off on.

Feb 9, 2017
zdechlypes

too good to be true.

Feb 9, 2017
(unknown member)

Hackers are already rejoicing at the opportunity.
Fill in random porn, political figures, or racial photos as AI-introduced detail.

Feb 9, 2017
lacix

But how accurate could it be? Just think about "spell-check"; it could easily pair you up with Michael Jackson.

Feb 9, 2017
Great Bustard

Game shows for AI in the future: I can name that photo with one pixel.

Feb 9, 2017
Esstee

Bringing the old, enhance and magnify movie practice one step closer to reality...

Feb 9, 2017
lacix

The "ground truth"? I'd rather stay with my imagination.

Feb 9, 2017
J A C S

Funny that they "recovered" 32x32 images and stopped there. In fact, it is exactly the step from 32x32, in this case, to a much higher resolution which can be done more realistically with machine learning.

Feb 9, 2017
tailings

I would like to see the "similar images with a higher resolution" used to create the extrapolated image. How similar must the reference image be to produce usable results? And most interestingly, from a creative point of view, what happens when you purposely feed it 'false' data?

Feb 9, 2017
A Owens

Goodness this is a tough audience. I would not have even picked that the bottom left one was a face. Remarkable.

Feb 9, 2017
badi

hehe... you don't know how neural networks work. It is basically guesswork to provide a best result (maximizing some scores on some imposed criteria) according to a learning sample set (the larger the better).

Unless it's some brilliant new stuff, it will most likely provide you some sort of face from completely random pixels (especially if you tell it that there is a face there). Also, if you train it on characters, for example, instead of faces, it will probably say that image is a reversed 5 or something :).
It's not recognizing what is in the picture; it just produces a best result according to one or more learning sets and the assumption that the image is of the same type.

Feb 9, 2017
J A C S

What I see in the first image is a guy wearing a stocking on his head, ready to rob a bank.

Feb 9, 2017
WilliamJ

You're right! This is even a 115-denier stocking, lightly torn... You have eagle eyes, my friend!

Feb 9, 2017
belle100

Judging from the pictures of 8x8 to 32x32 above, it's too good to be true, I'm afraid.

Feb 9, 2017
WilliamJ

Or too good to be innocent ?

Feb 9, 2017
belle100

Uh-oh. Bad news for people who don't want to show their faces on news or documentaries.

Feb 9, 2017
Photoman

There goes your copyright, guys & gals, and maybe, just maybe, a few bad guys will get caught now.

Feb 9, 2017
agentul

and maybe, just maybe anonymous sources and witnesses that appear blurred on TV will be easily identifiable.

Feb 9, 2017
Ocolon

and maybe good guys that are mistaken for bad guys will get caught now.

Feb 9, 2017
ZorSy

So we finally know what those billions of photos people uploaded to Picasa and Google Photos (and other cloud-based image storage solutions going through various servers) will be used for: pure guesswork based on random pixels to generate some 'enhanced' image. Hopefully none could be used for legal purposes, as we are all, collectively, guilty if they start 'enhancing' security video grabs for police to catch the crims... Second to this, we will finally know the identity of the Minecraft character... well done, Google.

Feb 9, 2017
Carl Mucks

The Google AI should be tested on Rorschach ink blots; it seems it has some psychological issues.

Feb 9, 2017
(unknown member)

On the contrary. The blots would be sold to the highest Google Ads bidder.

Feb 9, 2017
RedFox88

Is this another PR trick like Adobe pulled a couple of years ago, when they scrambled a photo with a filter and then used their filter to "unscramble" it?

Feb 9, 2017
Foosa dee cat

Enhance........ Enhance........ Enhance....

Feb 9, 2017
Loro Husk

That movie :D

Feb 9, 2017
Tom K.

It would be interesting to see some of the multitude of poor results that they sifted through to be able to show three good ones.

Feb 9, 2017*
badi

In their PDF they provide more sample results and explain a bit how the method works: https://arxiv.org/pdf/1702.00783.pdf
Also, in the examples (figure 4, some bedroom images) you can see results where the "guessed" image is quite similar overall but completely different in its details.

Feb 9, 2017