Why DDIM Hallucinates More Than DDPM: A Theoretical Analysis of Reverse Dynamics
Muhammad H. Ashiq†·Samanyu Arora†·Abhinav N. Harish·Ishaan Kharbanda·Hung Yun Tseng·Grigorios G. Chrysos

† Equal contribution
We theoretically study hallucination in two canonical diffusion samplers: the stochastic Denoising Diffusion Probabilistic Model (DDPM) and the deterministic Denoising Diffusion Implicit Model (DDIM). We analyze the reverse ODE (DDIM) and SDE (DDPM) for a Gaussian mixture target, proving that after a critical time, (a) DDIM can become stuck on the segment connecting the two nearest modes and (b) the stochasticity of DDPM helps it become unstuck from this region, thus avoiding hallucination. Our empirical validation confirms that DDPM has a significantly lower hallucination rate than DDIM when this region is entered. Building on these observations, we show how additional stochastic steps can help DDIM avoid hallucinations, and we offer new insights into how to design improved samplers.
What We Demonstrate
We characterize where and when mode interpolation hallucinations arise in DDIM, describe why DDPM hallucinates less than DDIM, and validate our results empirically.
How DDIM Hallucinates
We rigorously study the source of DDIM hallucinations as observed in an N-mode Gaussian mixture, demonstrating that after a critical time, DDIM trajectories converge to the nearest line segment joining two modes and then can become stuck near the midpoint, thus ending up in regions of low probability mass. These are "mode interpolation hallucinations" (Aithal et al., 2024).
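The stuck-at-the-midpoint behavior can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not the paper's configuration: a 1-D mixture with two equal-weight modes at -4 and +4, mode standard deviation 0.1, a linear beta schedule with 1000 steps, a start at step 500, and the exact (closed-form) score in place of a learned network. In this idealized symmetric setting the score vanishes exactly at the midpoint, so a DDIM trajectory started there never moves, while a start close to a noised mode is carried to that mode.

```python
import numpy as np

# Hypothetical 1-D setup (an illustration, not the paper's configuration).
MODES = np.array([-4.0, 4.0])
MODE_STD = 0.1
BETAS = np.linspace(1e-4, 0.02, 1000)
ALPHA_BARS = np.cumprod(1.0 - BETAS)


def mixture_score(x, t):
    """Exact score of the noised two-mode mixture at step t."""
    x = np.asarray(x, dtype=float)
    means = np.sqrt(ALPHA_BARS[t]) * MODES               # noised mode locations
    var = ALPHA_BARS[t] * MODE_STD**2 + 1.0 - ALPHA_BARS[t]
    diff = means - x[..., None]
    logits = -0.5 * diff**2 / var
    logits -= logits.max(axis=-1, keepdims=True)         # numerically stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights * diff).sum(axis=-1) / var


def ddim_step(x, t):
    """One deterministic DDIM update (eta = 0) from step t to step t - 1."""
    a_t = ALPHA_BARS[t]
    a_prev = ALPHA_BARS[t - 1] if t > 0 else 1.0
    eps_hat = -np.sqrt(1.0 - a_t) * mixture_score(x, t)
    x0_hat = (x - np.sqrt(1.0 - a_t) * eps_hat) / np.sqrt(a_t)
    return np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat


def run_ddim(x, t_start):
    """Run DDIM from step t_start down to step 0."""
    for t in range(t_start, -1, -1):
        x = ddim_step(x, t)
    return x


stuck = run_ddim(np.array(0.0), t_start=500)    # starts exactly at the midpoint
landed = run_ddim(np.array(1.0), t_start=500)   # starts close to the noised +4 mode
print(float(stuck), float(landed))
```

The exact midpoint is a fixed point here only because of the symmetry of the toy mixture; the sketch does not reproduce the neighborhood analysis of the paper.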
The DDPM Escape Mechanism
We leverage this to provide a theoretical justification for why DDIM hallucinates more than DDPM: the noise of DDPM can help it become unstuck from this hallucination region around the midpoint.
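The escape mechanism can be sketched under the same kind of toy assumptions (all hypothetical: 1-D modes at -4 and +4 with standard deviation 0.1, a linear beta schedule with 1000 steps, the exact score, and a start at step 500 exactly on the midpoint). The deterministic DDIM update has nothing to push it off the midpoint, whereas the noise in the DDPM ancestral update breaks the symmetry and the score then carries each sample to a mode.

```python
import numpy as np

# Hypothetical 1-D setup (an illustration, not the paper's configuration).
MODES = np.array([-4.0, 4.0])
MODE_STD = 0.1
BETAS = np.linspace(1e-4, 0.02, 1000)
ALPHA_BARS = np.cumprod(1.0 - BETAS)


def eps_hat(x, t):
    """Exact noise prediction, -sqrt(1 - alpha_bar_t) * score, for the noised mixture."""
    means = np.sqrt(ALPHA_BARS[t]) * MODES
    var = ALPHA_BARS[t] * MODE_STD**2 + 1.0 - ALPHA_BARS[t]
    diff = means - x[:, None]
    logits = -0.5 * diff**2 / var
    logits -= logits.max(axis=1, keepdims=True)          # numerically stable softmax
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    score = (w * diff).sum(axis=1) / var
    return -np.sqrt(1.0 - ALPHA_BARS[t]) * score


def sample_from_midpoint(n, t_start, stochastic, rng):
    """Start n trajectories at the midpoint (x = 0) and run the reverse process."""
    x = np.zeros(n)
    for t in range(t_start, -1, -1):
        e = eps_hat(x, t)
        if stochastic:
            # DDPM ancestral step with variance beta_t (no noise on the final step).
            mean = (x - BETAS[t] / np.sqrt(1.0 - ALPHA_BARS[t]) * e) / np.sqrt(1.0 - BETAS[t])
            noise = rng.standard_normal(n) if t > 0 else np.zeros(n)
            x = mean + np.sqrt(BETAS[t]) * noise
        else:
            # Deterministic DDIM step (eta = 0).
            a_prev = ALPHA_BARS[t - 1] if t > 0 else 1.0
            x0_hat = (x - np.sqrt(1.0 - ALPHA_BARS[t]) * e) / np.sqrt(ALPHA_BARS[t])
            x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * e
    return x


rng = np.random.default_rng(0)
ddim_end = sample_from_midpoint(200, 500, stochastic=False, rng=rng)
ddpm_end = sample_from_midpoint(200, 500, stochastic=True, rng=rng)

# A sample counts as "at a mode" if it ends within 0.5 of -4 or +4 (hypothetical threshold).
ddpm_escape_rate = float((np.abs(np.abs(ddpm_end) - 4.0) < 0.5).mean())
print(np.abs(ddim_end).max(), ddpm_escape_rate)
```

Choosing variance beta_t for the ancestral step is one common option; the sketch is meant to show the qualitative contrast only, not to match the paper's experimental numbers.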
Experiments
We rule out step skipping as an explanation for the DDIM hallucination rate gap and demonstrate that adding a few DDPM steps after DDIM converges to the midpoint neighborhood can help the trajectory escape, lowering the hallucination rate.
Theory in Motion
Frame-by-frame simulations showing the theoretical phenomena playing out in practice. Each animation corresponds to a key concept or theorem from the paper.
Mode Interpolations

DDIM on a 25-mode Gaussian grid: the left panel traces the reverse trajectory while the right panel shows the final sample distribution. Out of 500 samples, the legend reports ~5.2% as mode interpolations, that is, samples landing in regions of low probability mass between the true Gaussian modes (Aithal et al., 2024).

DDPM on a 25-mode Gaussian grid. Out of 500 samples, the legend reports ~0.2% as mode interpolations, far fewer than with DDIM.
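One way such a mode interpolation rate could be computed is sketched below. The function name `mode_interpolation_rate`, the unit grid spacing, and the distance threshold `radius` are all assumptions for illustration; the criterion actually used for the legends above is not stated here. A sample is counted as an interpolation when it lies farther than `radius` from every mode center.

```python
import numpy as np


def mode_interpolation_rate(samples, modes, radius):
    """Fraction of samples farther than `radius` from every mode center."""
    samples = np.asarray(samples, dtype=float)
    modes = np.asarray(modes, dtype=float)
    # Pairwise distances, shape (n_samples, n_modes).
    dists = np.linalg.norm(samples[:, None, :] - modes[None, :, :], axis=-1)
    return float((dists.min(axis=1) > radius).mean())


# Hypothetical 5 x 5 grid of mode centers with unit spacing (25 modes).
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)

# Four hand-picked points: two on or next to a mode, two between modes.
samples = np.array([[0.0, 0.0], [2.02, 3.0], [0.5, 0.0], [1.5, 1.5]])
rate = mode_interpolation_rate(samples, grid, radius=0.2)
print(rate)
```

In this toy call, two of the four points are counted as interpolations. In practice the threshold would be tied to the mode standard deviation, which is why it is left as a parameter.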
Assumptions

We assume that there exists a time at which the trajectory is sufficiently closer (dotted red line) to a pair of modes "i" and "j" than to any other pair. We find that this assumption holds empirically, as visualized by the red diamond for a DDIM trajectory with 50 steps.

We assume that there exists a time at which the distance between modes "i" and "j" (solid blue) is sufficiently large (dotted blue). This assumption is natural for an analysis of mode interpolation hallucinations, since otherwise there are no regions of low probability mass between the modes.
Key Results

We demonstrate that after Asm. 4.1 holds in the reverse process, DDIM trajectories converge towards the line segment joining modes "i" and "j" exponentially fast.
We prove that after Asm. 4.4 holds in the reverse process, trajectories that land on the line segment near a mode are attracted to a stable trajectory near that mode.

We prove that after Asm. 4.4 holds, in a neighborhood around the midpoint, DDIM trajectories (orange) can get stuck while DDPM trajectories (blue) can escape, explaining the hallucination rate gap between the two samplers.
Empirical Results
These experiments demonstrate that discretization is not responsible for DDIM's higher hallucination rate and validate our theoretical results.
We find that DDIM hallucinates more than DDPM across discretization levels, ruling out discretization as the explanation for its higher hallucination rate.
DDIM and DDPM trajectories both converge toward the nearby line segment after Asm. 4.1 holds.
We find that, for trajectories started near the midpoint after Asm. 4.1 and 4.4 hold, DDIM trajectories get stuck while DDPM trajectories escape. Furthermore, adding DDPM timesteps to DDIM samplers helps trajectories escape the midpoint neighborhood, suggesting new approaches for hybrid diffusion samplers (Xu et al., 2023).
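The hybrid idea can be sketched as a DDIM sampler whose first few steps are replaced by DDPM steps. This is a hypothetical schedule written for illustration, not the paper's procedure and not the Restart algorithm of Xu et al. (2023). The toy setting is again assumed: 1-D modes at -4 and +4 with standard deviation 0.1, a linear beta schedule with 1000 steps, the exact score, and trajectories started at step 500 exactly on the midpoint. The parameter `n_stochastic` (an assumed name) sets how many initial steps use DDPM noise.

```python
import numpy as np

# Hypothetical 1-D setup (an illustration, not the paper's configuration).
MODES = np.array([-4.0, 4.0])
MODE_STD = 0.1
BETAS = np.linspace(1e-4, 0.02, 1000)
ALPHA_BARS = np.cumprod(1.0 - BETAS)


def eps_hat(x, t):
    """Exact noise prediction for the noised two-mode mixture at step t."""
    means = np.sqrt(ALPHA_BARS[t]) * MODES
    var = ALPHA_BARS[t] * MODE_STD**2 + 1.0 - ALPHA_BARS[t]
    diff = means - x[:, None]
    logits = -0.5 * diff**2 / var
    logits -= logits.max(axis=1, keepdims=True)          # numerically stable softmax
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    return -np.sqrt(1.0 - ALPHA_BARS[t]) * (w * diff).sum(axis=1) / var


def hybrid_sample(n, t_start, n_stochastic, rng):
    """Run n_stochastic DDPM steps from the midpoint, then DDIM down to step 0."""
    x = np.zeros(n)
    for t in range(t_start, -1, -1):
        e = eps_hat(x, t)
        if t > t_start - n_stochastic and t > 0:
            # DDPM ancestral step with variance beta_t.
            mean = (x - BETAS[t] / np.sqrt(1.0 - ALPHA_BARS[t]) * e) / np.sqrt(1.0 - BETAS[t])
            x = mean + np.sqrt(BETAS[t]) * rng.standard_normal(n)
        else:
            # Deterministic DDIM step (eta = 0).
            a_prev = ALPHA_BARS[t - 1] if t > 0 else 1.0
            x0_hat = (x - np.sqrt(1.0 - ALPHA_BARS[t]) * e) / np.sqrt(ALPHA_BARS[t])
            x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * e
    return x


def at_mode_rate(x, tol=0.5):
    """Fraction of samples ending within tol of -4 or +4 (hypothetical threshold)."""
    return float((np.abs(np.abs(x) - 4.0) < tol).mean())


rng = np.random.default_rng(0)
pure_rate = at_mode_rate(hybrid_sample(200, 500, n_stochastic=0, rng=rng))
hybrid_rate = at_mode_rate(hybrid_sample(200, 500, n_stochastic=10, rng=rng))
print(pure_rate, hybrid_rate)
```

With `n_stochastic=0` the sampler is plain DDIM and every trajectory stays on the midpoint; with a handful of stochastic steps the trajectories are displaced and the remaining deterministic steps carry them toward a mode. Where to place the stochastic steps and how many to use are design choices this sketch does not attempt to settle.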
References
[1] Aithal, S. K., Maini, P., Lipton, Z. C., and Kolter, J. Z. Understanding hallucinations in diffusion models through mode interpolation. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
[2] Xu, Y., Deng, M., Cheng, X., Tian, Y., Liu, Z., and Jaakkola, T. S. Restart sampling for improving generative processes. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
BibTeX
If you find this work useful, please cite our paper.
@inproceedings{ashiq2026why,
title = {Why DDIM Hallucinates More Than DDPM: A Theoretical Analysis of Reverse Dynamics},
author = {Ashiq, Muhammad H. and Arora, Samanyu and Harish, Abhinav N. and Kharbanda, Ishaan and Tseng, Hung Yun and Chrysos, Grigorios G.},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2026},
}
Acknowledgements. We are truly thankful to the reviewers for their thoughtful and constructive feedback. We are also thankful to Zulip for supporting our online communication.