This book constitutes the refereed proceedings of CVM 2025, the 13th International Conference on Computational Visual Media, held in Hong Kong SAR, China, in April 2025.
The 67 full papers were carefully reviewed and selected from 335 submissions. The papers are organized in topical sections as follows:
Part I: Medical Image Analysis, Detection and Recognition, Image Enhancement and Generation, Vision Modeling in Complex Scenarios
Part II: 3D Geometry and Rendering, Generation and Editing, Image Processing and Optimization
Part III: Image and Video Analysis, Multimodal Learning, Geometrical Processing, Applications
DepthFisheye: Efficient Fine-Tuning of Depth Estimation Models for Fisheye Cameras
DIMATrack: Dimension Aware Data Association for Multi-Object Tracking
Efficient Transformer Network for Visible and Ultraviolet Object Tracking
LightGR-Transformer: Light Grouped Residual Transformer for Multispectral Object Detection
ADMMOA: Attribute-Driven Multimodal Optimization for Face Recognition Adversarial Attacks
Training-Free Language-Guided Video Summarization via Multi-Grained Saliency Scoring
Multimodal Learning
Reinforced Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Bridging the Modality Gap: Advancing Multimodal Human Pose Estimation with Modality-Adaptive Pose Estimator and Novel Benchmark Datasets
Momentum-Based Uni-Modal Soft-Label Alignment and Multi-Modal Latent Projection Networks for Optimizing Image-Text Retrieval
Multi-Granularity and Multi-Modal Prompt Learning for Person Re-Identification
Local and Global Feature Cross-attention Multimodal Place Recognition
IML-CMM - A Multimodal Sentiment Analysis Framework Integrating Intra-Modal Learning and Cross-Modal Mixup Enhancement
Geometrical Processing
MCFG with GUMAP: A Simple and Effective Clustering Framework on Grassmann Manifold
Joint UMAP for Visualization of Time-Dependent Data
Unsupervised Domain Adaptation on Point Cloud Classification via Imposing Structural Manifolds into Representation Space
Applications
Learning Adaptive Basis Fonts to Fuse Content Features for Few-shot Font Generation
TaiCrowd: A High-Performance Simulation Framework for Massive Crowd
Feature Disentanglement and Fusion Model for Multi-Source Domain Adaptation with Domain-Specific Features
A Trademark Retrieval Method Based on Self-Supervised Learning
Weaken Noisy Feature: Boosting Semi-Supervised Learning by Noise Estimation
Multi-Dimension Full Scene Integrated Visual Emotion Analysis Network
Gap-KD: Bridging the Significant Capacity Gap Between Teacher and Student Model