Hybrid Pipeline Overview

Three-Stage Generation Process (S → D → G)

We first sculpt high-fidelity synthetic data, then use it to train a translator that converts real-world clear frames into adverse conditions (a minimal code sketch of the full flow follows the list):

  • S – Simulation: CARLA, an open-source autonomous driving simulator, renders pixel-perfect clear/adverse pairs with full annotations.
  • D – Diffusion: Stable Diffusion / ALDM boosts realism, guided by segmentation masks.
  • G – GAN Adaptation: DA-UNIT (Domain Adaptation with Unsupervised Image-to-Image Translation Networks) learns on the curated S + D pairs plus a 10% mix of real ACDC-Clear frames (clear-weather images from the Adverse Conditions Dataset).
    Inference: feed any clear ACDC frame into DA-UNIT and it returns a photorealistic fog, rain, or night image with labels preserved.
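
For orientation, here is a minimal runnable sketch of the S → D → G flow. Every stage function is a trivial stand-in (the real pipeline wraps CARLA, Stable Diffusion / ALDM, and DA-UNIT); only the order of the stages and what each one hands to the next is meant to mirror the list above.

    import numpy as np

    H, W = 512, 1024  # illustrative frame size

    def render_carla_pair(scene_id):
        # S (stand-in): CARLA renders a clear/adverse pair with pixel-perfect labels.
        clear = np.zeros((H, W, 3), np.float32)
        adverse = np.zeros((H, W, 3), np.float32)
        labels = {"semantic": np.zeros((H, W), np.uint8),
                  "depth": np.zeros((H, W), np.float32)}
        return clear, adverse, labels

    def refine_with_diffusion(adverse, semantic):
        # D (stand-in): Stable Diffusion / ALDM boosts realism, guided by the
        # segmentation mask, so the labels stay valid for the refined frame.
        return adverse

    def train_da_unit(pairs, real_clear_frames):
        # G (stand-in): DA-UNIT learns clear -> adverse translation from the
        # curated synthetic pairs plus a ~10% mix of real ACDC-Clear frames.
        return lambda clear_frame, condition: clear_frame  # identity placeholder

    pairs = []
    for scene_id in range(4):
        clear, adverse, labels = render_carla_pair(scene_id)
        pairs.append((clear, refine_with_diffusion(adverse, labels["semantic"]), labels))

    translate = train_da_unit(pairs, real_clear_frames=[])
    # Inference: a clear ACDC frame goes in, a fog/rain/night frame comes out,
    # and the original labels carry over unchanged.
    foggy = translate(np.zeros((H, W, 3), np.float32), condition="fog")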

Enhanced DA-UNIT Architecture

Figure: DA-UNIT model architecture diagram showing the enhanced GAN pipeline.

Key Architectural Improvements

  • Support for depth, semantic, and instance data at encoder/decoder stages (see the sketch after this list)
  • Improved object shape preservation through auxiliary inputs
  • Enhanced label alignment with ground-truth data
  • Novel training strategy combining simulated and real images
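
The auxiliary-input idea can be illustrated with a short PyTorch sketch. This is not the DA-UNIT implementation; it only shows one common way to concatenate depth, one-hot semantics, and instance-boundary channels with RGB before the encoder stem. Channel counts and layer sizes are illustrative assumptions.

    import torch
    import torch.nn as nn

    class AuxEncoder(nn.Module):
        """Toy encoder stem that accepts RGB plus auxiliary channels."""

        def __init__(self, n_semantic_classes=19):
            super().__init__()
            # 3 RGB + 1 depth + one-hot semantics + 1 instance-boundary channel
            in_ch = 3 + 1 + n_semantic_classes + 1
            self.stem = nn.Conv2d(in_ch, 64, kernel_size=7, stride=2, padding=3)

        def forward(self, rgb, depth, semantic_onehot, instance_edges):
            x = torch.cat([rgb, depth, semantic_onehot, instance_edges], dim=1)
            return self.stem(x)

    enc = AuxEncoder()
    feat = enc(torch.rand(1, 3, 256, 512),    # RGB
               torch.rand(1, 1, 256, 512),    # depth
               torch.rand(1, 19, 256, 512),   # one-hot semantic map
               torch.rand(1, 1, 256, 512))    # instance boundaries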

Technical Details

Blending Technique

Our novel blending approach addresses key challenges in the generation process (a minimal sketch follows the list):

  • Adaptive merging of diffusion output with original simulated images
  • Mitigation of artifacts (e.g., distorted vehicles)
  • Preservation of photorealistic enhancements (e.g., wet roads, nighttime lighting)
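
A minimal numpy sketch of how such mask-based blending can work, under the assumption that simulated pixels are kept for classes that diffusion tends to distort (e.g. vehicles) while diffusion pixels are kept elsewhere; the class IDs, the feathering radius, and the helper name are illustrative, not the exact procedure used here.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def blend(sim_img, diff_img, semantic, protect_ids=(26, 27, 28)):
        """Blend the diffusion output with the original simulated frame.

        sim_img, diff_img: float arrays in [0, 1], shape (H, W, 3)
        semantic:          integer class map, shape (H, W)
        protect_ids:       classes whose appearance is kept from simulation
                           (e.g. vehicles, to avoid distorted geometry).
        """
        protect = np.isin(semantic, protect_ids).astype(np.float32)
        # Feather the mask so the seam between the two sources is invisible.
        alpha = gaussian_filter(protect, sigma=5)[..., None]
        # Simulated pixels where protected, diffusion pixels (wet roads,
        # nighttime lighting) everywhere else.
        return alpha * sim_img + (1.0 - alpha) * diff_img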

Training Strategy

The enhanced training process combines multiple data sources (a sampling sketch follows the list):

  • Simulation images for perfect pixel-level matching
  • Unlabeled real images to close the simulation-to-real gap
  • Auxiliary inputs (depth, semantic segmentation) for improved guidance
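
As an illustration of the data mix, a per-epoch sampler might look like the sketch below; the ~10% ratio comes from the pipeline description above, while the function and variable names are assumptions.

    import random

    def mixed_epoch(sim_pairs, real_frames, real_fraction=0.10):
        """Return one epoch of training samples: labeled synthetic pairs plus
        roughly `real_fraction` unlabeled real clear-weather frames."""
        n_real = int(real_fraction * len(sim_pairs) / (1.0 - real_fraction))
        epoch = [("sim", pair) for pair in sim_pairs]
        epoch += [("real", frame)
                  for frame in random.sample(real_frames, min(n_real, len(real_frames)))]
        random.shuffle(epoch)
        return epoch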

Performance Results

Performance highlights (ACDC):

  • 78.57% mIoU on ACDC-Adverse (test), obtained with zero adverse-weather images in training.
  • +1.85% mIoU on ACDC (val) overall versus the baseline (REIN pre-trained on Cityscapes, then fine-tuned on ACDC-Clear).
  • Night subset: +4.62% mIoU on ACDC-Night (val) over the same baseline.
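
For reference, mIoU is the mean over classes of the intersection-over-union between predicted and ground-truth masks. The numpy sketch below is the standard definition of the metric, not the evaluation code used for the numbers above.

    import numpy as np

    def mean_iou(pred, gt, num_classes, ignore_index=255):
        """Mean intersection-over-union across classes present in either map."""
        valid, ious = gt != ignore_index, []
        for c in range(num_classes):
            p, g = (pred == c) & valid, (gt == c) & valid
            union = np.logical_or(p, g).sum()
            if union == 0:
                continue  # class absent from both prediction and ground truth
            ious.append(np.logical_and(p, g).sum() / union)
        return float(np.mean(ious))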

Applications

Practical Benefits

  • Cost-effective generation of adverse-condition training data
  • Significant reduction in real-world data collection needs
  • Improved robustness of autonomous perception systems
  • Flexible adaptation to various adverse conditions (night, rain, fog, snow)