FlashVSR

Added in Nov 2025

About FlashVSR

• Delivers real-time diffusion-based video super-resolution with ~17 FPS processing speed for 768×1408 resolution footage on a single A100 GPU, achieving up to 12× speedup over prior one-step diffusion VSR models.
• Features a three-stage train-friendly distillation pipeline (joint image-video training → sparse causal attention adaptation → one-step distillation) to enable efficient streaming inference without quality loss.
• Incorporates locality-constrained sparse attention to reduce redundant computation, bridge the train-test resolution gap, and support stable scaling to ultra-high resolutions (e.g., 1440p).
• Integrates a tiny conditional decoder (TC-Decoder) that accelerates high-resolution reconstruction by ~7× while preserving texture detail and visual consistency.
• Ensures superior temporal coherence via frame-to-frame conditioning, eliminating flickering and ghosting artifacts common in frame-by-frame upscaling methods.
• Supports flexible resolution upscaling (2× to 4×) for converting low-resolution content to 4K, with capabilities to remove noise, compression artifacts, and restore fine textures.
• Leverages the VSR-120K large-scale dataset (120K videos + 180K images) for robust model training, enabling generalization across real-world, streaming, and AIGC footage.
• Offers broad format compatibility (MP4, MOV, WebM, OGG, MKV) and API/SDK integration with webhook notifications for seamless workflow automation.
• Maintains low latency with only 8-frame lookahead, supporting streaming scenarios and production-grade real-time processing.
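To put the quoted performance figures in perspective, the short Python sketch below turns them into a per-frame time budget, a pixel throughput, the baseline speed implied by the "up to 12×" claim, and the buffering delay of an 8-frame lookahead. It is only back-of-the-envelope arithmetic on the numbers listed above, not part of FlashVSR itself. The function name `throughput_summary` is hypothetical, and the 30 FPS source frame rate is an assumption chosen for illustration, since the listing does not state one.

```python
def throughput_summary(fps, width, height, speedup, lookahead_frames, source_fps):
    """Derive simple budget figures from the advertised numbers.

    source_fps is an assumed playback rate of the incoming stream;
    it is not stated in the listing.
    """
    return {
        # Time available to process one frame at the quoted speed.
        "frame_ms": 1000.0 / fps,
        # Pixels processed per second at the quoted frame size, in megapixels.
        "mpix_per_s": fps * width * height / 1e6,
        # Speed of a prior model if the full "up to 12x" speedup applied.
        "implied_baseline_fps": fps / speedup,
        # Time spent waiting for lookahead frames to arrive from the source.
        "lookahead_delay_s": lookahead_frames / source_fps,
    }


summary = throughput_summary(
    fps=17,              # quoted: ~17 FPS on a single A100
    width=768,           # quoted frame size: 768x1408
    height=1408,
    speedup=12,          # quoted: up to 12x over prior one-step models
    lookahead_frames=8,  # quoted: 8-frame lookahead
    source_fps=30,       # assumption, not from the listing
)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, each frame has a budget of roughly 59 ms, and the lookahead adds a little over a quarter of a second of buffering before the first output frame. Because ~17 FPS is below the assumed 30 FPS source rate, "real-time" here would apply to streams at or below the processing rate; a different source frame rate changes the lookahead delay proportionally.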