Maybe on the flip side it's good news, as people are usually their best selves when they're being watched.
I don't think that view holds up.
A, it very much depends on who is watching, what their incentives are, and what power they hold.
And B, it also depends on who is being watched - not everyone thrives under a microscope. Are they the type to feel stifled? Or rebellious?
This was published in 2021. Also discussed here: https://news.ycombinator.com/item?id=29399828
I was really skeptical of this since the article conveniently doesn't include any photos taken by the nano-camera, but there are examples [1] in the original paper that are pretty impressive.
[1] https://www.nature.com/articles/s41467-021-26443-0/figures/2
> Then, a physics-based neural network was used to process the images captured by the meta-optics camera. Because the neural network was trained on metasurface physics, it can remove aberrations produced by the camera.
Generally it's probably wise to be skeptical of anything that appears to get around the diffraction limit.
You’re right that it might fail on noise with resolution fine enough to break assumptions from the NN training set. But that’s not a super common application for cameras, and traditional cameras have their own limitations.
Not saying we shouldn’t be skeptical, just that there is a plausible mechanism here.
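For intuition, here's a minimal sketch of the classical, non-learned analogue of that mechanism: Wiener deconvolution with a calibrated PSF. The paper trains a physics-based neural network instead, so treat this purely as an illustration of why a known forward model lets you undo the optics' blur (numpy assumed; the function name is mine):

    import numpy as np

    def wiener_deconvolve(measured, psf, nsr=1e-2):
        # Invert a known point-spread function in the frequency domain.
        # nsr is an assumed noise-to-signal ratio; it keeps the division stable.
        H = np.fft.fft2(psf, s=measured.shape)
        G = np.fft.fft2(measured)
        F_hat = np.conj(H) * G / (np.abs(H) ** 2 + nsr)
        return np.real(np.fft.ifft2(F_hat))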
Multilevel fractal noise specifically would give an indication of how fine you can go.
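For instance, a rough sketch of such a test target, built by summing octaves of noise at doubling spatial frequency and halving amplitude (numpy assumed; parameters are arbitrary):

    import numpy as np

    def fractal_noise(size=720, octaves=6, seed=0):
        # Each octave adds detail at twice the frequency and half the amplitude;
        # the finest octave that survives imaging indicates the resolution limit.
        rng = np.random.default_rng(seed)
        img = np.zeros((size, size))
        for o in range(octaves):
            n = 2 ** (o + 2)                     # coarse grid points per side
            coarse = rng.standard_normal((n, n))
            rep = size // n + 1
            up = np.kron(coarse, np.ones((rep, rep)))[:size, :size]
            img += up * 0.5 ** o
        return (img - img.min()) / (img.max() - img.min())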
I agree that measuring against such a test would be interesting, but I'm not sure it's possible or desirable for any camera tech to produce an objectively "true" pixel by pixel value. This new approach may fail/cheat in different ways, which is interesting but not disqualifying to me.
Is this basically a visible-wavelength beamsteering phased array?
It's a good read; I don't think the extrapolation of one technical advance has ever been done better.
now, here's the rig I'd love to see with this: take a hundred of them and position them like a bug's eye to see what could be done with that. there'd be so much overlapping coverage that 3D would be possible, yet the parallax would be so small that it makes me wonder how much depth would be discernible
> After fabrication of the meta-optic, we account for fabrication error by performing a PSF calibration step. This is accomplished by using an optical relay system to image a pinhole illuminated by fiber-coupled LEDs. We then conduct imaging experiments by replacing the pinhole with an OLED monitor. The OLED monitor is used to display images that will be captured by our nano-optic imager.
But shooting a real chameleon is irrelevant to what they're trying to demonstrate here.
At the scales they're working at here ("nano-optics"), there's no travel distance for chromatic distortion to take place within the lens. Therefore, whether they're shooting a 3D scene (a chameleon) or a 2D scene (an OLED monitor showing a picture of a chameleon), the light that makes it through their tiny lens to hit the sensor is going to be the same.
(That's the intuitive explanation, at least; the technical explanation is a bit stranger, as the lens is sub-wavelength – and shaped into structures that act as antennae for specific light frequencies. You might say that all the lens is doing is chromatic distortion — but in a very controlled manner, "funnelling" each frequency of inbound light to a specific part of the sensor, somewhat like a MIMO antenna "funnels" each frequency-band of signal to a specific ADC+DSP. Which amounts to the same thing: this lens doesn't "see" any difference between 3D scenes and 2D images of those scenes.)
PS It's "Vernor"
In chapter 13 the enemy describes them as using Fourier optics, though that seemed to be their speculation - not sure whether it was right.
The paper says that reconstructing an actual image from the raw data produced by the sensor takes ~58ms of computation, so doing it for 10,000 sensors would naively take around ten minutes, though I'm sure there's room for optimization and parallelization.
The sensors produce 720x720px images, so a 100x100 array of them would produce 72,000x72,000px images, or ~5 gigapixels. That's a lot of pixels for a smartphone to push around and process and store.
edit: mixed up bits and bytes
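Back-of-the-envelope with those per-sensor figures (assuming purely serial reconstruction):

    per_image_s = 0.058            # ~58 ms reconstruction per sensor
    sensors = 100 * 100
    print(sensors * per_image_s / 60)    # ~9.7 minutes if done one after another

    side_px = 720
    total_px = (side_px * 100) ** 2      # 72,000 x 72,000
    print(total_px / 1e9)                # ~5.2 gigapixels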
Plus, higher-resolution sensors have this nasty habit of producing overly large files, which slow down processing on a given device compared to smaller, crisper photos and take up much more space, even more so for video. That's probably why Apple stuck with a 12 MP main camera for so long, even though 200 MP sensors were available if they'd wanted them.
They're not comparable, in the intuitive sense, to conventional cameras.
I think it's useful to distinguish all of these even if they are desired. I really love my iPhone camera, but there's something deeply unsettling about how it alters the photos. It's fundamentally producing a different image than you'd get with either film or your eyes. Naturally this is true for all digital sensors, but we once could point out specifically how and why the resulting image differs from what our eyes see. It's no longer easy to even enumerate the possible alterations that go on via software, let alone control many of them, and I think there will be backlash at some point (or, stated differently, a market for cameras that allow controlling this).
I've got to imagine it's frustrating for people who rely on their phone cameras for daily work to find out that upgrading a phone necessarily means relearning its foibles and adjusting how you shoot to accommodate it. Granted, I mostly take smartphone photos in situations where I'd rather not be neurotic about the result (candids, memories, reminders, etc.), but surely there are professionals out there who can speak to this.
"The Neural network Image Processing features in this camera are arguably even more important here than they are in the R5 Mark II. A combination of deep learning and algorithmic AI is used to power In-Camera Upscaling, which transforms the pedestrian-resolution 24.2MP images into pixel-packed 96MP photos – immediately outclassing every full-frame camera on the market, and effectively hitting GFX and Hasselblad territory.
"On top of that is High ISO Noise Reduction, which uses AI to denoise images by 2 stops. It works wonders when you're pushing those higher ISOs, which are already way cleaner than you'd expect thanks to the flagship image sensor and modest pixel count."
Edit: by default.
I don't trust humans to avoid taking shortcuts once the tech is available; it's too convenient to get "information" this cheaply, and less costly to silence the occasional scandal.
military, surveillance and porn
I can remember watching a TV series as a child where a time traveler went back to the 80s and some person told him that everything is about miniaturization. Then he pointed to a little pin on the time traveler's jacket, which was actually a camera, and said: "This little pin for example could one day hold a full video camera", which seemed a bit ridiculous at that time.
At what point do you reduce the signal to the equivalent of an LLM prompt, with most of the resulting image being explained by the training data?
Yeah, I know that modern phone cameras are also heavily post-processed, but the hardware is at least producing a reasonable optical image to begin with. There's some correspondence between input and output; at least they're comparable.
The future is going to become difficult for people who find value in creative activities, beyond just a raw audio/visual/textual signal at the output. I think most people who really care about a creative medium would say there's some kind of value in the process and the human intentionality that creates works, both for the creator who engages in it and the audience who is aware of it.
In my opinion, most AI creative tools don't actually benefit serious creators; they just give companies a competitive edge for selling new products and enable more dilettantes to enter the scene and flood us with mediocrity.
You're forgetting something. Chain of custody, trust and reputation. The source of an image or video matters as to whether it is considered a reliable representation of reality or not.
We will develop better methods of verifying sources as well, possibly using cryptography and new social networks where members authenticate each other in-person and trust is built up.
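A bare-bones sketch of that idea: the capture device holds a key and signs the image bytes, and anyone downstream verifies against the device's published public key. (The flow and names here are hypothetical; this just shows the primitive, using the Python `cryptography` package.)

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    device_key = Ed25519PrivateKey.generate()   # in practice, kept in secure hardware
    image_bytes = open("photo.jpg", "rb").read()
    signature = device_key.sign(image_bytes)

    # Verifier side: raises InvalidSignature if the bytes were altered.
    device_key.public_key().verify(signature, image_bytes)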
How would someone excrete an array of these cameras if ingested?
1. A "simpler" sci-fi solution foe a 1-way trip that's still out of our reach is a large light sail and huge Earth-based laser, but his required "smaller" breakthroughs in material science
I'm assuming some sort of fixed laser type propulsion mechanism would leverage a type of solar sail technology. Maybe you could send a phased laser signal that "vibrates" a solar sail towards the source of energy instead of away.
Not necessarily - at least with currently known science. Light sails work OK for transferring momentum from photons, allowing positive acceleration from a giant laser on Earth. A return trip requires a giant laser on the other side.
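For scale, the photon-pressure numbers (assumed figures, not from the article): a perfectly reflecting sail gets force F = 2P/c from a beam of power P.

    P = 100e9          # 100 GW beam (Starshot-class assumption)
    c = 3.0e8          # speed of light, m/s
    m = 0.01           # 10 g sail + probe (assumption)
    F = 2 * P / c      # ~667 N on a perfect reflector
    a = F / m          # ~6.7e4 m/s^2, i.e. thousands of g
    print(F, a)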
It's been a while since I've heard anyone talk about the Starshot project[0]. Maybe this would help revitalize it.
Also even without aiming for Proxima Centauri, it would be great to have more cameras in our own planetary system.
--
[0] https://www.centauri-dreams.org/2024/01/19/data-return-from-...
Their probes could be the size of sand grains, maybe even dust. Maybe not quite sophons, but not much better as far as our odds of finding anything. I suppose there would have to be something larger to receive signals from these things and send them back (because physics), but that could be hanging out somewhere we'd be unlikely to see it.
Yet another Fermi paradox answer: we are looking for big spacecraft when the universe is full of smart dust.
Then there is the whole "neural" part. Do these get "enhanced" by a generative AI that fills the blur based on the most statistically likely pixels?
The article is pretty bad.
I imagine you could do this using a standard computational model, it would just be very intensive. So I guess it would be 'enhanced' in the same way a JPEG stores an image in a lossy format.
"One day, your kids will go to the toy store and get a sheet of stickers. Each sticker is actually a camera with an IPv6 address. That means they can put a sticker somewhere, go and point a browser at that address and see a live camera feed.
I should point out: all of the technology to do this already exists, it just hasn't gotten cheap enough to mass market. When economies of scale do kick in, society is going to have to deal with a dramatic change in what they think 'physical privacy' means."
It is pretty easy to interface with too - I did it with a Pi Pico microcontroller: https://x.com/dmitrygr/status/1753585604971917313
Given the tiny dimensions and wide field, adding regular lenses over an array could create an extreme wide field, like 160x160 degrees, for everyday phone cameras. Or very small 360x180-degree stand-alone cameras. AR glasses with a few cameras could operate with 360x160 degrees and be extremely situationally aware!
Another application would be small light field cameras. I don't know enough to judge if this is directly applicable, or adaptable to that. But it would be wonderful to finally have small cheap light field cameras. Both for post-focus adjustment and (better than stereo) 3D image sensing and scene reconstruction.
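If it is adaptable, the basic post-focus trick is just shift-and-add across the sub-cameras; a sketch under that assumption (numpy assumed, all names hypothetical):

    import numpy as np

    def refocus(subviews, baselines, slope):
        # Shift each sub-camera view in proportion to its (x, y) baseline,
        # then average; `slope` selects the synthetic focal plane.
        acc = np.zeros_like(subviews[0], dtype=float)
        for img, (bx, by) in zip(subviews, baselines):
            shift = (int(round(by * slope)), int(round(bx * slope)))
            acc += np.roll(img, shift, axis=(0, 1))
        return acc / len(subviews)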
That would make them 6 orders of magnitude larger.
Does anyone know why Lytro couldn't be shrunk to fit in smartphones? Because this seems like similar technology.