
ECCV 2022 & CVPR 2022 Paper Digest (22 Jul 2022): 47 Papers

Time: 2023-02-18 15:47:05





ECCV 2022 | XMem: High-Quality Long-Term Video Segmentation


Title: XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model




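We present XMem, a video object segmentation architecture for long videos with unified feature memory stores, inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically uses only one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact, and thus sustained, long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (which do not work on long videos) on short-video datasets.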
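The consolidation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class names, the capacity parameters (`max_working`, `keep_per_consolidation`, `max_long_term`), the dot-product attention read, and the "promote the most-used elements" rule are all assumptions made for illustration. It only shows the general pattern of three stores, where a sensory slot is overwritten every frame, a working memory fills up, and its most actively used elements are moved into a small long-term store.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryElement:
    frame_index: int
    feature: List[float]
    usage: float = 0.0  # attention mass this element has received during reads


@dataclass
class ToyMemory:
    # All three capacities are hypothetical values chosen for the demo.
    max_working: int = 3
    keep_per_consolidation: int = 1
    max_long_term: int = 2
    sensory: Optional[List[float]] = None
    working: List[MemoryElement] = field(default_factory=list)
    long_term: List[MemoryElement] = field(default_factory=list)

    def observe(self, frame_index: int, feature: List[float]) -> None:
        """Store a new frame feature; consolidate first if working memory is full."""
        self.sensory = list(feature)  # sensory memory is overwritten every frame
        if len(self.working) >= self.max_working:
            self._consolidate()
        self.working.append(MemoryElement(frame_index, list(feature)))

    def read(self, query: List[float]) -> List[float]:
        """Softmax attention over long-term and working memory; tracks usage."""
        elements = self.long_term + self.working
        if not elements:
            return [0.0] * len(query)
        scores = [sum(q * f for q, f in zip(query, e.feature)) for e in elements]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [x / total for x in exps]
        for element, weight in zip(elements, weights):
            element.usage += weight
        return [
            sum(w * e.feature[i] for e, w in zip(elements, weights))
            for i in range(len(query))
        ]

    def _consolidate(self) -> None:
        """Promote the most-used working elements, then cap long-term memory."""
        ranked = sorted(self.working, key=lambda e: e.usage, reverse=True)
        self.long_term.extend(ranked[: self.keep_per_consolidation])
        self.working.clear()
        # Keep only the most-used long-term elements so memory stays bounded.
        self.long_term.sort(key=lambda e: e.usage, reverse=True)
        del self.long_term[self.max_long_term:]


memory = ToyMemory()
for index, feature in enumerate([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]):
    memory.observe(index, feature)
memory.read([0.0, 5.0])        # frame 1 receives most of the attention
memory.observe(3, [0.5, 0.5])  # working memory is full, so consolidation runs
print([e.frame_index for e in memory.long_term])  # prints [1]
```

In this sketch the total number of stored elements never exceeds `max_working + max_long_term`, which mirrors, in a very simplified way, how consolidation decouples memory consumption from video length.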



Updated on: 22 Jul 2022

Total number: 47

Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions

  • 论文/Paper: http://arxiv.org/pdf/2207.10667
  • 代码/Code: https://github.com/theo2021/onda

TinyViT: Fast Pretraining Distillation for Small Vision Transformers

  • 论文/Paper: http://arxiv.org/pdf/2207.10666
  • 代码/Code: https://github.com/microsoft/cream

Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset

  • 论文/Paper: http://arxiv.org/pdf/2207.10664
  • 代码/Code: https://github.com/visipedia/ssw60

In Defense of Online Models for Video Instance Segmentation

  • 论文/Paper: http://arxiv.org/pdf/2207.10661
  • 代码/Code: https://github.com/wjf5203/vnext

Novel Class Discovery without Forgetting

  • 论文/Paper: http://arxiv.org/pdf/2207.10659
  • 代码/Code: None

Generative Multiplane Images: Making a 2D GAN 3D-Aware

  • 论文/Paper: http://arxiv.org/pdf/2207.10642
  • 代码/Code: https://github.com/apple/ml-gmpi

Approximate Differentiable Rendering with Algebraic Surfaces

  • 论文/Paper: http://arxiv.org/pdf/2207.10606
  • 代码/Code: None

Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression

  • 论文/Paper: http://arxiv.org/pdf/2207.10564
  • 代码/Code: https://github.com/jinyeying/night-enhancement

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection

  • 论文/Paper: http://arxiv.org/pdf/2207.10448
  • 代码/Code: None

Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration

  • 论文/Paper: http://arxiv.org/pdf/2207.10447
  • 代码/Code: https://github.com/164140757/scm

Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation

  • 论文/Paper: http://arxiv.org/pdf/2207.10436
  • 代码/Code: https://github.com/guoleisun/vss-mrcfa

Human Trajectory Prediction via Neural Social Physics

  • 论文/Paper: http://arxiv.org/pdf/2207.10435
  • 代码/Code: https://github.com/realcrane/human-trajectory-prediction-via-neural-social-physics

D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights

  • 论文/Paper: http://arxiv.org/pdf/2207.10398
  • 代码/Code: https://github.com/vtp-tl/d2-tpred

FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling

  • 论文/Paper: http://arxiv.org/pdf/2207.10392
  • 代码/Code: None

Error Compensation Framework for Flow-Guided Video Inpainting

  • 论文/Paper: http://arxiv.org/pdf/2207.10391
  • 代码/Code: None

NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition

  • 论文/Paper: http://arxiv.org/pdf/2207.10388
  • 代码/Code: None

Pose for Everything: Towards Category-Agnostic Pose Estimation

  • 论文/Paper: http://arxiv.org/pdf/2207.10387
  • 代码/Code: https://github.com/luminxu/Pose-for-Everything

Temporal Saliency Query Network for Efficient Video Recognition

  • 论文/Paper: http://arxiv.org/pdf/2207.10379
  • 代码/Code: None

LocVTP: Video-Text Pre-training for Temporal Localization

  • 论文/Paper: http://arxiv.org/pdf/2207.10362
  • 代码/Code: https://github.com/mengcaopku/locvtp

CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution

  • 论文/Paper: http://arxiv.org/pdf/2207.10345
  • 代码/Code: https://github.com/cheeun/cadyq

UFO: Unified Feature Optimization

  • 论文/Paper: http://arxiv.org/pdf/2207.10341
  • 代码/Code: None

OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search

  • 论文/Paper: http://arxiv.org/pdf/2207.10320
  • 代码/Code: None

AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection

  • 论文/Paper: http://arxiv.org/pdf/2207.10316
  • 代码/Code: https://github.com/zehuichen123/autoalignv2

SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer

  • 论文/Paper: http://arxiv.org/pdf/2207.10315
  • 代码/Code: https://github.com/hrzhou2/seedformer

AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields

  • 论文/Paper: http://arxiv.org/pdf/2207.10312
  • 代码/Code: None

Towards Accurate Open-Set Recognition via Background-Class Regularization

  • 论文/Paper: http://arxiv.org/pdf/2207.10287
  • 代码/Code: None

Grounding Visual Representations with Texts for Domain Generalization

  • 论文/Paper: http://arxiv.org/pdf/2207.10285
  • 代码/Code: https://github.com/mswzeus/gvrt

DeltaGAN: Towards Diverse Few-shot Image Generation with Sample-Specific Delta

  • 论文/Paper: http://arxiv.org/pdf/2207.10271
  • 代码/Code: https://github.com/bcmi/deltagan-few-shot-image-generation

Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis

  • 论文/Paper: http://arxiv.org/pdf/2207.10257
  • 代码/Code: https://github.com/jgkwak95/surf-gan

SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition

  • 论文/Paper: http://arxiv.org/pdf/2207.10256
  • 代码/Code: None

SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks

  • 论文/Paper: http://arxiv.org/pdf/2207.10237
  • 代码/Code: https://github.com/apple/ml-spin

MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis

  • 论文/Paper: http://arxiv.org/pdf/2207.10228
  • 代码/Code: None

On Label Granularity and Object Localization

  • 论文/Paper: http://arxiv.org/pdf/2207.10225
  • 代码/Code: https://github.com/visipedia/inat_loc

Spotting Temporally Precise, Fine-Grained Events in Video

  • 论文/Paper: http://arxiv.org/pdf/2207.10213
  • 代码/Code: None

2D GANs Meet Unsupervised Single-view 3D Reconstruction

  • 论文/Paper: http://arxiv.org/pdf/2207.10183
  • 代码/Code: None

Controllable and Guided Face Synthesis for Unconstrained Face Recognition

  • 论文/Paper: http://arxiv.org/pdf/2207.10180
  • 代码/Code: None

Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles

  • 论文/Paper: http://arxiv.org/pdf/2207.10172
  • 代码/Code: None

GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning

  • 论文/Paper: http://arxiv.org/pdf/2207.10158
  • 代码/Code: https://github.com/seleucia/goca

Visual Knowledge Tracing

  • 论文/Paper: http://arxiv.org/pdf/2207.10157
  • 代码/Code: https://github.com/nkondapa/visualknowledgetracing

Tackling Long-Tailed Category Distribution Under Domain Shifts

  • 论文/Paper: http://arxiv.org/pdf/2207.10150
  • 代码/Code: https://github.com/guxiao0822/lt-ds

Latent Discriminant deterministic Uncertainty

  • 论文/Paper: http://arxiv.org/pdf/2207.10130
  • 代码/Code: https://github.com/ensta-u2is/ldu

Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance

  • 论文/Paper: http://arxiv.org/pdf/2207.10123
  • 代码/Code: https://github.com/zzh-tech/Animation-from-Blur

BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis

  • 论文/Paper: http://arxiv.org/pdf/2207.10120
  • 代码/Code: https://github.com/dmoltisanti/brace

Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach

  • 论文/Paper: http://arxiv.org/pdf/2207.10188
  • 代码/Code: None

Structural Causal 3D Reconstruction

  • 论文/Paper: http://arxiv.org/pdf/2207.10156
  • 代码/Code: None

AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation

  • 论文/Paper: http://arxiv.org/pdf/2207.10141
  • 代码/Code: None

Continual Variational Autoencoder Learning via Online Cooperative Memorization

  • 论文/Paper: http://arxiv.org/pdf/2207.10131
  • 代码/Code: https://github.com/dtuzi123/ovae
