Can PRNU identify the exact camera that took a photo?

Yes, when you have reference images from the candidate camera. PRNU is essentially a per-sensor fingerprint, and a single 12MP photo carries enough signal to match against a reference set with > 99% accuracy in lab conditions.

Does FFT analysis work on small images?

It degrades sharply below 256×256. Most published detectors require ≥ 512×512 to produce stable radial-spectrum measurements.

Why is green less noisy than other channels in real photos?

Bayer sensors have twice as many green photosites as red or blue. Demosaicing therefore interpolates green from more measured samples, and that averaging reduces noise, so green ends up less noisy than red or blue after processing.

PRNU, FFT & Sensor Noise — The Forensics Behind Image Authenticity

description: "A deep dive into the three pillars of image forensics: PRNU sensor fingerprints, FFT-domain analysis, and the physics of camera noise."
slug: "prnu-fft-sensor-noise"
publishedAt: "2026-04-08"
updatedAt: "2026-04-21"
author: "SynthGuard Team"
category: "research"
tags: ["forensics", "prnu", "fft", "sensor-noise", "research"]
readingTime: 12
coverImage: "/blog/covers/prnu-fft-sensor-noise.webp"
related: ["how-ai-image-detectors-work", "humanize-ai-images-without-losing-quality"]

Image forensics is a small, mathematically dense field that quietly underpins everything from courtroom exhibits to AI-detection startups. Three pillars do most of the heavy lifting: PRNU (the sensor fingerprint), FFT-domain analysis (the frequency signature), and noise modeling (the physics of light hitting silicon). This is the technical primer.

The forensic mindset

Forensic analysis assumes nothing about content. It asks: Is this image consistent with the physics of how cameras produce images? A perfectly authentic photograph of an impossible scene is still authentic. A perfectly composed AI generation of a real cat is still synthetic. The forensic signal lives in the process, not the subject.

That distinction is what makes forensics resilient to generative model improvements — better models produce better-looking pixels but rarely better physics.

PRNU: the sensor fingerprint

What it is

Photo Response Non-Uniformity is a multiplicative noise pattern unique to each camera sensor. Every photosite has a slightly different gain because of manufacturing tolerances at the silicon level. The result is a fixed, per-pixel pattern that is:

- Multiplicative (measured = ideal × (1 + K))
- Stable across the sensor's lifetime
- Independent between sensors (even of the same model)
- Strongest at high luminance (below saturation, where clipped pixels carry no pattern), weakest in shadows

Why it survives processing

PRNU is high-frequency, low-amplitude, and present in every pixel. Most processing operations (resize, crop, color grading, even moderate JPEG) preserve enough of it to be detectable, although a resized or cropped image has to be realigned with the reference before matching. Only operations that explicitly attack the high-frequency band, such as heavy denoising, frequency-domain filtering, or generative re-synthesis, destroy it.

How it's extracted

Given an image I:

1. Denoise: I_denoised = wavelet_denoise(I)
2. Residual: R = I - I_denoised
3. Normalize: K_estimate = R / I  (only where I > threshold)

K_estimate is a noisy estimate of the sensor's PRNU pattern. Average the estimates from 50+ flat-field images of the same camera and you get a clean reference. Cross-correlate a new image's K_estimate against the reference and you get a similarity score that can be thresholded into a match decision.
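The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a production extractor: a 3×3 mean filter stands in for the wavelet denoiser, the two "sensors" are simulated with random gain patterns, and every function name and parameter value here is hypothetical.

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean filter with edge padding. Stand-in for the wavelet denoiser (assumption)."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def prnu_estimate(img, threshold=10.0):
    """Denoise, take the residual, normalize by intensity where I > threshold."""
    img = img.astype(np.float64)
    residual = img - box_blur(img)
    k = np.zeros_like(img)
    mask = img > threshold
    k[mask] = residual[mask] / img[mask]
    return k

def reference_pattern(images):
    """Average the per-image estimates to suppress everything except the fixed pattern."""
    return np.mean([prnu_estimate(im) for im in images], axis=0)

def ncc(a, b):
    """Normalized cross-correlation of two same-sized patterns."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Simulated flat-field shots from two sensors (illustrative values only).
rng = np.random.default_rng(0)
shape = (64, 64)
k_a = rng.normal(0, 0.02, shape)   # sensor A gain pattern
k_b = rng.normal(0, 0.02, shape)   # sensor B gain pattern

def shoot(k, level):
    return np.full(shape, level) * (1 + k) + rng.normal(0, 1.0, shape)

ref_a = reference_pattern([shoot(k_a, 180.0) for _ in range(50)])
score_same = ncc(ref_a, prnu_estimate(shoot(k_a, 150.0)))
score_other = ncc(ref_a, prnu_estimate(shoot(k_b, 150.0)))
```

On this toy data, `score_same` should sit far above `score_other`, which hovers near zero. Real photographs contain scene texture that leaks into the residual, which is why the text calls for flat-field references and a stronger denoiser.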

What it tells us about AI images

Generated images have no PRNU. They have whatever noise the decoder hallucinated, which is usually low-amplitude, isotropic, and uncorrelated with luminance. A PRNU-aware AI detector doesn't need a reference camera; it just checks whether the image's residual behaves like PRNU at all (multiplicative, luminance-scaled, non-isotropic). If it does not, the image is likely synthetic.

This is why injecting realistic PRNU is one of the highest-value layers in any humanization pipeline.

FFT: the frequency signature

Why frequency matters

Real images live in a constrained frequency band. Optics low-pass them (lens MTF). Sensors band-limit them (the optical low-pass filter on the sensor stack). Demosaicing introduces specific cross-shaped artifacts in frequency space. JPEG quantization adds 8×8 grid artifacts. The result is a highly structured spectrum with a predictable radial decay.

Generated images, especially from latent diffusion, have:

- Flatter radial decay above ~0.4 Nyquist (the decoder upsamples without optical band-limiting)
- No demosaicing peaks near the corners of the spectrum
- No JPEG grid (until they're saved as JPEG)
- Sometimes periodic peaks at the latent-grid frequency (a period of 8 or 16 pixels, matching the decoder's upsampling factor)

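How analysis works

1. Compute F = |FFT2(I)|, the amplitude spectrum (squaring it gives the power spectrum, whose decay exponent is twice as large)
2. Center it (DC at center)
3. Compute the radial profile: average magnitude in concentric rings
4. Compare the profile shape against a learned natural-image distribution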

The radial profile is a 1D function of spatial frequency. A real photograph's profile decays roughly as 1/f^α with α ≈ 1.0-1.2. A diffusion output often has α ≈ 0.6-0.8 — too flat, too much energy at high frequencies.
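A rough sketch of the radial profile and the decay exponent, using NumPy only. It works on the amplitude spectrum, bins by rounded integer radius, and replaces the "learned natural-image distribution" with a plain log-log line fit, which is a simplification. The test images are synthetic power-law noise rather than photographs, and all names are hypothetical.

```python
import numpy as np

def radial_profile(img):
    """Amplitude spectrum with DC centered, averaged over rings of integer radius."""
    img = img.astype(np.float64)
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img - img.mean())))
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.rint(np.hypot(y - h // 2, x - w // 2)).astype(int)
    sums = np.bincount(r.ravel(), weights=spectrum.ravel())
    counts = np.bincount(r.ravel())
    profile = sums / counts
    return profile[1:min(h, w) // 2]      # drop DC, stop before Nyquist

def fit_alpha(profile):
    """Slope of log(amplitude) against log(radius), returned as alpha in 1/f^alpha."""
    f = np.arange(1, len(profile) + 1)
    slope, _ = np.polyfit(np.log(f), np.log(profile), 1)
    return -slope

def synth_power_law(n, alpha, rng):
    """White noise shaped so its amplitude spectrum falls off as 1/f^alpha."""
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fy, fx)
    f[0, 0] = 1.0                          # avoid dividing by zero at DC
    white = np.fft.fft2(rng.normal(size=(n, n)))
    return np.real(np.fft.ifft2(white / f ** alpha))

rng = np.random.default_rng(0)
alpha_photo_like = fit_alpha(radial_profile(synth_power_law(256, 1.1, rng)))
alpha_flat = fit_alpha(radial_profile(synth_power_law(256, 0.7, rng)))
```

The fitted values should land close to the exponents used to build each image, with the second one visibly flatter. On real images the fit is noisier, and it is usually restricted to a mid-frequency band to keep JPEG and resampling effects out of the estimate.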

Limits

FFT detectors break catastrophically on:

- Images smaller than 256×256 (not enough samples for a stable spectrum)
- Heavily compressed images (JPEG quantization discards high-frequency detail, which masks the signal)
- Screenshotted images (the screenshot pipeline reshapes the spectrum)
- Extremely smooth content (skies, gradients) where the spectrum is dominated by DC and the lowest frequencies

The 2024 Universal Fake Detection benchmark showed FFT-only detectors dropping from 91% accuracy on raw outputs to 54% on the same images after one Instagram upload cycle.

Sensor noise modeling

The physics

Light hitting a photosite frees electrons. The number of electrons follows a Poisson distribution, which is the source of shot noise. The sensor electronics add Gaussian read noise. Thermal effects add dark current. Together, with g as the sensor gain:

measured = g × Poisson(photons) + Gaussian(0, σ_read) + dark_current

Practical consequences:

- Variance scales with brightness (shot noise dominates at high signal)
- Read noise dominates in shadows (Gaussian floor)
- Color channels have different noise (Bayer sampling + demosaic averaging)
- Noise is slightly correlated between adjacent pixels (demosaic kernels touch neighbors)
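The first two consequences follow directly from the model. Here is a short derivation under simplifying assumptions: g is the gain factor that multiplies the Poisson term, λ is the expected photon count, and dark current is treated as a constant offset d (these symbols are introduced only for this sketch).

```latex
\begin{aligned}
\mathbb{E}[\text{measured}] &= g\lambda + d \\
\operatorname{Var}[\text{measured}] &= g^{2}\lambda + \sigma_{\text{read}}^{2} \\
&= g\,\bigl(\mathbb{E}[\text{measured}] - d\bigr) + \sigma_{\text{read}}^{2}
\end{aligned}
```

Variance is therefore linear in the mean, with a slope equal to the gain and a floor set by read noise. If dark current is itself modeled as a Poisson process, it adds a further constant term to the variance rather than changing the slope.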

What real noise looks like

Plot variance vs. mean luminance from a real photograph and you get a near-linear relationship. That relationship is the camera's photon transfer curve, and its slope and intercept are characteristic of the sensor and ISO. Plot the same for a diffusion output and you typically get a flat line: noise variance is independent of brightness, because the decoder added uniform noise (or none).

This photon transfer curve test is one of the most resilient AI detection signals. It survives JPEG, resize, and screenshot — because it's a statistical property of how noise scales, not where it lives in frequency space.
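A minimal sketch of the variance-versus-mean test. It takes per-patch statistics on a synthetic, piecewise-flat scene so that scene texture does not inflate the variance; on real photos you would measure flat regions or a denoising residual instead. The gain, read-noise, and patch-size values are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def photon_transfer_points(img, patch=8):
    """Mean and variance of each non-overlapping patch."""
    h, w = img.shape
    h, w = h - h % patch, w - w % patch
    blocks = img[:h, :w].astype(np.float64).reshape(h // patch, patch, w // patch, patch)
    return blocks.mean(axis=(1, 3)).ravel(), blocks.var(axis=(1, 3)).ravel()

def transfer_slope(img, patch=8):
    """Least-squares line through the (mean, variance) points."""
    means, variances = photon_transfer_points(img, patch)
    slope, intercept = np.polyfit(means, variances, 1)
    return slope, intercept

rng = np.random.default_rng(0)
levels = rng.uniform(20, 2000, size=(32, 32))     # expected photons per block
scene = np.kron(levels, np.ones((8, 8)))          # 256x256, flat inside each 8x8 block
gain, sigma_read = 0.1, 2.0                       # illustrative sensor parameters

# Sensor-like noise: scaled Poisson shot noise plus a Gaussian read-noise floor.
sensor_like = gain * rng.poisson(scene) + rng.normal(0, sigma_read, scene.shape)
# Brightness-independent noise, as the text describes for a decoder.
uniform_noise = gain * scene + rng.normal(0, 3.0, scene.shape)

slope_sensor, _ = transfer_slope(sensor_like)
slope_uniform, _ = transfer_slope(uniform_noise)
```

For the sensor-like image the fitted slope should come out close to the simulated gain, while the uniform-noise image gives a slope near zero. That gap is the flat-line signature described above.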

Channel correlation

A Bayer sensor has two green photosites for every red one and every blue one. After demosaicing, each green value is interpolated from more measured samples, so the green channel carries less noise. Real photos have green noise variance roughly 0.5-0.7× red/blue noise variance. Generated images typically have all three channels equally noisy (the decoder treats them symmetrically). This single ratio test catches a surprising fraction of naive humanization attempts.
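The ratio test can be sketched as below. The high-pass residual (pixel minus the mean of its four neighbors) is a crude stand-in for a proper noise estimator, and the two test images are flat fields with hand-set noise levels rather than real demosaiced output, so treat the values as illustrative only.

```python
import numpy as np

def residual_variance(channel):
    """Variance of a crude high-pass residual: pixel minus the mean of its 4 neighbors."""
    c = channel.astype(np.float64)
    neighbors = (c[:-2, 1:-1] + c[2:, 1:-1] + c[1:-1, :-2] + c[1:-1, 2:]) / 4.0
    return float(np.var(c[1:-1, 1:-1] - neighbors))

def green_noise_ratio(rgb):
    """Green residual variance divided by the mean of the red and blue variances."""
    r, g, b = (residual_variance(rgb[..., i]) for i in range(3))
    return g / ((r + b) / 2.0)

rng = np.random.default_rng(0)
base = np.full((128, 128, 3), 120.0)

# Camera-like: green noise variance set to 0.6x that of red and blue (assumed value).
channel_std = np.array([3.0, 3.0 * np.sqrt(0.6), 3.0])
camera_like = base + rng.normal(0, 1, base.shape) * channel_std

# Decoder-like: identical noise level in all three channels.
symmetric = base + rng.normal(0, 3.0, base.shape)

ratio_camera = green_noise_ratio(camera_like)
ratio_symmetric = green_noise_ratio(symmetric)
```

The camera-like image should return a ratio near 0.6 and the symmetric one a ratio near 1.0. On real photographs the estimate is best taken over smooth regions, since texture contributes equally to all channels and pulls the ratio toward 1.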

Combining the signals

A modern forensic toolkit doesn't pick one signal — it runs all three and combines them with a calibrated meta-classifier. The combination matters more than any single signal:

| Signal | Strong against | Weak against |
| --- | --- | --- |
| PRNU residual | Raw generative output, light edits | Heavy denoising, generative re-synthesis |
| FFT radial profile | Latent-diffusion artifacts | Re-compressed images, small images |
| Photon transfer | Uniform noise injection | Heteroscedastic noise injection |

A skilled adversary can defeat any single signal — that's why ensembles are standard. A well-calibrated ensemble might output: PRNU: weak (signal absent), FFT: synthetic-typical (radial too flat), Noise scaling: synthetic-typical (no luminance dependence) → 87% AI-generated.
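One simple way to picture such a combination is a logistic model over the per-signal scores that also returns the breakdown. The sketch below is only that: the weights, bias, and score values are made up for illustration, and a real meta-classifier would be fitted and calibrated on labeled data.

```python
import math

def combine_scores(scores, weights, bias):
    """Logistic combination. Each score is in [0, 1], where 1 means 'synthetic-typical'."""
    z = bias + sum(weights[name] * scores[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters, not taken from any published detector.
WEIGHTS = {"prnu": 1.5, "fft": 2.0, "noise_scaling": 2.5}
BIAS = -3.0

report = {"prnu": 0.6, "fft": 0.9, "noise_scaling": 0.9}
p_synthetic = combine_scores(report, WEIGHTS, BIAS)

# Keep the per-signal contributions next to the verdict so the result can be audited.
breakdown = {name: WEIGHTS[name] * report[name] for name in WEIGHTS}
```

Returning `breakdown` alongside `p_synthetic` makes it possible to see which signal drove the verdict instead of reading a single opaque number.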

What this means for humanization

A serious humanization pipeline must counter all three signals simultaneously and consistently:

- Inject a multiplicative high-frequency residual that passes PRNU shape tests
- Restore radial spectrum decay so FFT analysis sees natural 1/f^α
- Re-create heteroscedastic, channel-asymmetric noise so photon transfer tests pass

Counter only one signal and the others give you away. Counter all three but with inconsistent parameters (e.g., PRNU at ISO 100 amplitude, noise at ISO 6400 amplitude) and you fail an internal consistency check that some forensic tools now run.

The honest takeaway

Forensics is a real, mature field with decades of literature. AI detection borrows heavily from it, sometimes well and sometimes badly. Understanding the underlying physics — why PRNU exists, what the FFT actually measures, how sensor noise scales — is the difference between treating detection as a black box and treating it as engineering.

If you're building detectors, layer the three signals and expose the breakdown. If you're building humanizers, defeat them coherently — our pipeline does it in ten layers because the signals are not independent. And if you're a journalist or a court reading a forensic report, ask the analyst which signals contributed to the verdict and what their out-of-distribution accuracy was. The number alone tells you almost nothing.