Hey everyone, I've been working on a way to handle the "twin-image" problem in lensless in-line holography, which has been a headache for decades. Standard CNNs are okay, but they usually fail to capture the global nature of diffraction patterns and get crushed by real-world sensor noise. We just put out HoloPASWIN, and I wanted to share the code and some findings.

**What's different about this approach?**

1. **Swin blocks for global context:** Unlike standard convolutions, which only look locally, we use Swin Transformers to capture the long-range dependencies in the diffraction patterns.
2. **Physics-aware loss:** Instead of treating this as a pure black-box image-to-image task, we baked a differentiable Angular Spectrum propagator into the training loop. This forces the model's reconstructions to stay physically consistent.
3. **Data and noise:** We trained on 25k samples using a fairly aggressive sensor noise model (dark current, shot noise, read noise, etc.) to see how the network holds up outside clean, synthetic environments.

We measured a ~15 dB PSNR improvement over standard ASM, and the reconstructions look significantly cleaner than those from basic ViT architectures for phase retrieval.

I'm curious if anyone here has tried similar "physics-informed" constraints with transformers. We found the differentiable propagator really helped with convergence, but it's definitely more computationally expensive during training. Would love any feedback or questions on the architecture!

Repo: [https://github.com/electricalgorithm/holopaswin](https://github.com/electricalgorithm/holopaswin)

Paper: [https://arxiv.org/abs/2603.04926](https://arxiv.org/abs/2603.04926)
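For anyone unfamiliar with the physics side: the angular spectrum method (ASM) at the core of the physics-aware loss can be sketched in a few lines. This is a generic NumPy illustration of ASM propagation, *not* the repo's actual API — function and parameter names (`angular_spectrum_propagate`, `dz`, `dx`) are my own, and the real implementation would use a differentiable framework (e.g. PyTorch FFTs) so gradients can flow through it during training.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dz, dx):
    """Propagate a complex optical field by distance dz (meters) using
    the angular spectrum method. `dx` is the pixel pitch (meters).
    Illustrative sketch only; names and signature are assumptions."""
    n, m = field.shape
    fx = np.fft.fftfreq(m, d=dx)          # spatial frequencies along x
    fy = np.fft.fftfreq(n, d=dx)          # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    # Transfer function H = exp(i * 2*pi * dz * sqrt(1/lambda^2 - fx^2 - fy^2))
    arg = (1.0 / wavelength) ** 2 - FX ** 2 - FY ** 2
    H = np.where(arg >= 0,
                 np.exp(2j * np.pi * dz * np.sqrt(np.maximum(arg, 0.0))),
                 0.0)                     # evanescent components suppressed
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

Since each step (FFT, elementwise multiply, inverse FFT) is differentiable, the same structure drops straight into a training loop: you propagate the predicted object field back to the sensor plane and penalize its mismatch with the recorded hologram. A sanity check is that propagating forward by `dz` and back by `-dz` recovers the input for band-limited fields.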
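On the noise model in point 3: a common way to simulate the listed effects is Poisson statistics for shot and dark-current noise plus additive Gaussian read noise. The sketch below is my own generic version with made-up parameter values (`dark_current`, `read_sigma`, `full_well`) — the paper's actual noise parameters will differ.

```python
import numpy as np

def apply_sensor_noise(intensity, rng, dark_current=0.02,
                       read_sigma=0.01, full_well=10000.0):
    """Corrupt a clean hologram intensity (values in [0, 1]) with a
    simple sensor noise model. Parameter names/values are illustrative.

    intensity    : clean normalized image
    rng          : np.random.Generator
    dark_current : mean dark signal as a fraction of full well
    read_sigma   : read-noise std as a fraction of full well
    full_well    : electrons at saturation (sets shot-noise strength)
    """
    electrons = intensity * full_well
    dark = dark_current * full_well
    # Shot + dark-current noise: Poisson statistics on electron counts
    counts = rng.poisson(electrons + dark).astype(float)
    # Read noise: additive Gaussian at readout
    counts += rng.normal(0.0, read_sigma * full_well, size=intensity.shape)
    return np.clip(counts / full_well, 0.0, None)
```

Training against this kind of corruption is what lets you make claims about robustness beyond clean synthetic holograms; the key design choice is that shot noise scales with signal level while read noise does not.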