AI & ML interests

Generative Computer Vision

Recent Activity

toshas
posted an update about 2 months ago
Introducing StereoSpace -- our new end-to-end method for turning photos into stereo images without explicit geometry or depth maps. This makes it especially robust with thin structures and transparencies. Try the demo below:

🌐 Project: prs-eth/stereospace_web
📕 Paper: StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space (2512.10959)
🐙 Code: https://github.com/prs-eth/stereospace
🤗 Demo: toshas/stereospace
🤗 Weights: prs-eth/stereospace-v1-0

By ETH Zürich (@behretj, @Bingxin, @konradschindler), University of Bologna (@fabiotosi92, @mpoggi), HUAWEI Bayer Lab (@toshas).
toshas
posted an update 2 months ago
Introducing 🇨🇭WindowSeat🇨🇭 -- our new method for removing reflections from photos taken through windows, on planes, in malls, offices, and other glass-filled environments.

Fine-tuning a foundation diffusion transformer for reflection removal quickly runs up against the limits of what existing datasets and techniques can offer. To fill that gap, we generate physically accurate examples in Blender that simulate realistic glass and reflection effects. This data enables strong performance on both established benchmarks and previously unseen images.

To make this practical, the open-source Apache-2 model builds on Qwen-Image-Edit-2509, a 20B image-editing diffusion transformer that runs on a single GPU and can be fine-tuned in about a day. WindowSeat keeps its use of the underlying DiT cleanly separated from the data and training recipe, allowing future advances in base models to be incorporated with minimal friction.

Try it out with your own photos in this interactive demo:
🤗 toshas/windowseat-reflection-removal

Other resources:
🌎 Website: huawei-bayerlab/windowseat-reflection-removal-web
🎓 Paper: Reflection Removal through Efficient Adaptation of Diffusion Transformers (2512.05000)
🤗 Model: huawei-bayerlab/windowseat-reflection-removal-v1-0
🐙 Code: https://github.com/huawei-bayerlab/windowseat-reflection-removal

Team: Daniyar Zakarin (@daniyarzt)*, Thiemo Wandel (@thiemo-wandel)*, Anton Obukhov (@toshas), Dengxin Dai.
*Work done during internships at HUAWEI Bayer Lab.
toshas
posted an update about 1 year ago
Introducing ⇆ Marigold-DC -- our training-free zero-shot approach to monocular Depth Completion with guided diffusion! If you have ever wondered how else a long denoising diffusion schedule can be useful, we have an answer for you!

Depth Completion addresses sparse, incomplete, or noisy measurements from photogrammetry or sensors like LiDAR. Sparse points aren't just hard for humans to interpret; they also hinder downstream tasks.

Traditionally, depth completion was framed as image-guided depth interpolation. We leverage Marigold, a diffusion-based monodepth model, to reframe it as sparse-depth-guided depth generation. How the turntables! Check out the paper anyway 👇

🌎 Website: https://marigolddepthcompletion.github.io/
🤗 Demo: prs-eth/marigold-dc
📕 Paper: https://arxiv.org/abs/2412.13389
👾 Code: https://github.com/prs-eth/marigold-dc

Team ETH Zürich: Massimiliano Viola (@mviola), Kevin Qu (@KevinQu7), Nando Metzger (@nandometzger), Bingxin Ke (@Bingxin), Alexander Becker, Konrad Schindler, and Anton Obukhov (@toshas). We thank Hugging Face for their continuous support.