


ECCV2022 & CVPR2022 Paper Digest, 2022.7.19!

Time: 2023-06-13 09:15:50





ECCV2022 | XMem: High-Quality Long-Term Video Segmentation!


Title: XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model




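We present XMem, a video object segmentation architecture for long videos with unified feature memory stores, inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically uses only one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact, and thus sustained, long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (which do not work on long videos) on short-video datasets.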
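
As a rough illustration of the three-store idea in the abstract above, here is a toy Python sketch. It is not the XMem implementation: the class name `ToyThreeStoreMemory`, the `working_capacity` and `keep_top_k` parameters, the scalar "features", and the nearest-value read are all assumptions made for illustration. It only shows the general pattern of promoting frequently used working-memory entries into a compact long-term store so that the working store stays bounded.

```python
from dataclasses import dataclass, field


@dataclass
class ToyThreeStoreMemory:
    """Toy illustration only; names and policy are assumptions, not XMem's code."""

    working_capacity: int = 4   # hypothetical: max entries held in working memory
    keep_top_k: int = 2         # hypothetical: entries promoted per consolidation
    sensory: object = None      # overwritten on every frame
    working: list = field(default_factory=list)    # [feature, usage_count] pairs
    long_term: list = field(default_factory=list)  # compact store of promoted features

    def observe(self, feature):
        # Sensory memory is refreshed on every frame.
        self.sensory = feature
        # If working memory is full, consolidate before storing the new entry.
        if len(self.working) >= self.working_capacity:
            self._consolidate()
        self.working.append([feature, 0])

    def read(self, query):
        # Stand-in for memory reading: return the nearest working entry
        # and count the access as usage.
        if not self.working:
            return None
        entry = min(self.working, key=lambda e: abs(e[0] - query))
        entry[1] += 1
        return entry[0]

    def _consolidate(self):
        # Promote the most-used working entries to long-term memory,
        # then clear working memory so it never grows without bound.
        ranked = sorted(self.working, key=lambda e: e[1], reverse=True)
        self.long_term.extend(f for f, _ in ranked[: self.keep_top_k])
        self.working = []


memory = ToyThreeStoreMemory()
for frame_feature in [0.0, 1.0, 2.0, 3.0]:
    memory.observe(frame_feature)
memory.read(1.1)     # nearest entry is 1.0
memory.read(0.9)     # nearest entry is 1.0 again
memory.read(2.9)     # nearest entry is 3.0
memory.observe(4.0)  # working memory is full, so consolidation runs first
print(memory.long_term)                # [1.0, 3.0]
print([f for f, _ in memory.working])  # [4.0]
```

In this sketch, the entries that were read most often (1.0 and 3.0) survive in long-term memory, while unused entries are dropped. The paper's actual consolidation and reading mechanisms operate on feature maps and are considerably more involved.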



Updated on: 19 Jul 2022

Total number: 37

Rethinking Data Augmentation for Robust Visual Question Answering

  • Paper: http://arxiv.org/pdf/2207.08739
  • Code: https://github.com/ItemZheng/KDDAug

Semantic Novelty Detection via Relational Reasoning

  • Paper: http://arxiv.org/pdf/2207.08699
  • Code: None

Label2Label: A Language Modeling Framework for Multi-Attribute Learning

  • Paper: http://arxiv.org/pdf/2207.08677
  • Code: https://github.com/Li-Wanhua/Label2Label

Action-based Contrastive Learning for Trajectory Prediction

  • Paper: http://arxiv.org/pdf/2207.08664
  • Code: None

Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes

  • Paper: http://arxiv.org/pdf/2207.08656
  • Code: https://github.com/UncleMEDM/InstPIFu

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

  • Paper: http://arxiv.org/pdf/2207.08630
  • Code: https://github.com/iceli1007/FakeCLR

Class-incremental Novel Class Discovery

  • Paper: http://arxiv.org/pdf/2207.08605
  • Code: https://github.com/OatmealLiu/class-iNCD

Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation

  • Paper: http://arxiv.org/pdf/2207.08549
  • Code: None

DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection

  • Paper: http://arxiv.org/pdf/2207.08531
  • Code: https://github.com/SPengLiang/DID-M3D

Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation

  • Paper: http://arxiv.org/pdf/2207.08485
  • Code: https://github.com/NUST-Machine-Intelligence-Laboratory/HFAN

Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding

  • Paper: http://arxiv.org/pdf/2207.08455
  • Code: None

TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers

  • Paper: http://arxiv.org/pdf/2207.08409
  • Code: https://github.com/Sense-X/TokenMix

MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects

  • Paper: http://arxiv.org/pdf/2207.08403
  • Code: None

Adversarial Contrastive Learning via Asymmetric InfoNCE

  • Paper: http://arxiv.org/pdf/2207.08374
  • Code: https://github.com/yqy2001/A-InfoNCE

SepLUT: Separable Image-adaptive Lookup Tables for Real-time Image Enhancement

  • Paper: http://arxiv.org/pdf/2207.08351
  • Code: None

Learning with Recoverable Forgetting

  • Paper: http://arxiv.org/pdf/2207.08224
  • Code: None

Fast-MoCo: Boost Momentum-based Contrastive Learning with Combinatorial Patches

  • Paper: http://arxiv.org/pdf/2207.08220
  • Code: None

Zero-Shot Temporal Action Detection via Vision-Language Prompting

  • Paper: http://arxiv.org/pdf/2207.08184
  • Code: https://github.com/sauradip/STALE

Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal

  • Paper: http://arxiv.org/pdf/2207.08178
  • Code: None

FashionViL: Fashion-Focused Vision-and-Language Representation Learning

  • Paper: http://arxiv.org/pdf/2207.08150
  • Code: https://github.com/BrandonHanx/mmf

E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context

  • Paper: http://arxiv.org/pdf/2207.08132
  • Code: https://github.com/kyleleey/E-NeRV

CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement

  • Paper: http://arxiv.org/pdf/2207.08082
  • Code: None

Neural Color Operators for Sequential Image Retouching

  • Paper: http://arxiv.org/pdf/2207.08080
  • Code: https://github.com/amberwangyili/neurop

Semi-Supervised Keypoint Detector and Descriptor for Retinal Image Matching

  • Paper: http://arxiv.org/pdf/2207.07932
  • Code: None

Learning Quality-aware Dynamic Memory for Video Object Segmentation

  • Paper: http://arxiv.org/pdf/2207.07922
  • Code: https://github.com/workforai/QDMN

SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection

  • Paper: http://arxiv.org/pdf/2207.07898
  • Code: https://github.com/Hydragon516/SPSN

JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes

  • Paper: http://arxiv.org/pdf/2207.07895
  • Code: https://github.com/sunnyHelen/JPerceiver

You Should Look at All Objects

  • Paper: http://arxiv.org/pdf/2207.07889
  • Code: None

NeFSAC: Neurally Filtered Minimal Samples

  • Paper: http://arxiv.org/pdf/2207.07872
  • Code: https://github.com/cavalli1234/NeFSAC

CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS

  • Paper: http://arxiv.org/pdf/2207.07868
  • Code: https://github.com/walkerning/aw_nas

TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval

  • Paper: http://arxiv.org/pdf/2207.07852
  • Code: None

Cross-Domain Cross-Set Few-Shot Learning via Learning Compact and Aligned Representations

  • Paper: http://arxiv.org/pdf/2207.07826
  • Code: https://github.com/WentaoChen0813/CDCS-FSL

Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization

  • Paper: http://arxiv.org/pdf/2207.07818
  • Code: https://github.com/zh460045050/BagCAMs

Self-calibrating Photometric Stereo by Neural Inverse Rendering

  • Paper: http://arxiv.org/pdf/2207.07815
  • Code: https://github.com/junxuan-li/SCPS-NIR

Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection

  • Paper: http://arxiv.org/pdf/2207.07783
  • Code: https://github.com/SRA2/SPELL

Towards Understanding The Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search

  • Paper: http://arxiv.org/pdf/2207.08350
  • Code: None

TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance

  • Paper: http://arxiv.org/pdf/2207.07861
  • Code: https://github.com/yanjh97/TransGrasp
