Weidi Xie
571 posts
Weidi Xie
@WeidiXie
Computer Vision Researcher. Associate Professor at SJTU, previously @Oxford_VGG. Chinese name: 谢伟迪. Personal webpage: weidixie.github.io
Oxford, England
scholar.google.co.uk/citations?user…
Joined May 2018
622 Following
2,957 Followers
  • Weidi Xie
    @WeidiXie
    Jul 21, 2019
    Check out zlai0.github.io/CorrFlow/ We are excited to share the code & model for Self-supervised Correspondence Flow (BMVC 2019 Oral) @bmvc2019, which achieves state-of-the-art performance on video segmentation and pose tracking. @Oxford_VGG
  • Weidi Xie
    @WeidiXie
    Feb 1, 2022
    Personal update: After spending seven wonderful years at Oxford, I've decided to take on a new adventure. I'm joining Shanghai Jiao Tong University this year 🐯.
  • Weidi Xie
    @WeidiXie
    Jun 26, 2020
    Tracking objects is among the first skills human infants learn, so surely it must be a task that does not require semantic understanding. We present a SOTA self-supervised tracking approach: all you need is 10 minutes of raw video, with zero annotations required. arxiv.org/pdf/2006.12480… @Oxford_VGG
  • Weidi Xie
    @WeidiXie
    Apr 29, 2024
    A tiny milestone in my academic journey. I know these metrics do not carry much significance in today's academic landscape. Nevertheless, they serve as a personal gauge, allowing me to assess the papers' impact and reflect on whether I've contributed something meaningful.
  • Weidi Xie
    @WeidiXie
    May 21, 2021
    Code for Self-supervised Video Object Segmentation by Motion Grouping: github.com/charigyang/mot… We show that self-supervised segmentation can be done purely from motion.
  • Weidi Xie
    @WeidiXie
    Aug 15, 2023
    ICCV23 work on Open-vocabulary Object Segmentation with Diffusion Models:
    - We do visual instruction tuning on a pre-trained diffusion model to simultaneously generate images and open-vocabulary masks.
    - It can create synthetic datasets for training discriminative models for free.
  • Weidi Xie
    @WeidiXie
    Oct 16, 2023
    Can GPT-4V(vision) serve medical applications? We present recent efforts on assessing GPT-4V for multimodal medical diagnosis through case studies covering 17 human body systems across 8 clinical imaging modalities, e.g., radiology and pathology. 🔥Report: drive.google.com/file/d/1kPDWgw…
  • Weidi Xie
    @WeidiXie
    May 18, 2023
    Just read Med-PaLM 2: the progress of LLMs in medical question answering is incredible! But I think multimodal medical question answering is quite far behind, so here I present PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering: arxiv.org/pdf/2305.10415…
  • Weidi Xie
    @WeidiXie
    Jul 19, 2022
    Happy to share the work, "Visual-Language Models for Efficient Video Understanding", at ECCV2022. We benchmark on 10 different datasets across various tasks; it turns out that simply prompting CLIP can already achieve comparable or SOTA results on many video tasks. #ECCV2022
  • Weidi Xie
    @WeidiXie
    May 22, 2020
    We are releasing the code and model for #VGGSound, a new large-scale audio-visual dataset collected with audio-visual correspondence. Dataset: robots.ox.ac.uk/~vgg/data/vggs… Code & model: github.com/hche11/VGGSound
  • Weidi Xie
    @WeidiXie
    May 3, 2019
    arxiv.org/abs/1905.00875 We investigate self-supervised learning on video correspondence flow. If done properly, self-supervised learning can be surprisingly powerful (closing the gap to supervised learning). We demonstrate state-of-the-art results on video segmentation.
  • Weidi Xie
    @WeidiXie
    Jun 19, 2020
    We are presenting our new paper at the LUV2020 workshop today at 16:15-16:30. MAST: A Memory-Augmented Self-Supervised Tracker, by @LaiZihang, @erika_lu_, @Oxford_VGG. A strong tracking model trained with no manual annotation. Code: github.com/zlai0/MAST #VGGatCVPR2020
  • Weidi Xie
    @WeidiXie
    Oct 3, 2021
    Also best paper at the CVPR RVSU Workshop. TL;DR: We propose a self-supervised learning approach for segmentation based on motion, i.e., the Gestalt principle. It achieves performance competitive with strong supervision on several popular benchmarks, e.g. DAVIS2016, MoCA (camouflage detection).
    charig yang
    @chaaarig
    Oct 3, 2021
    Check out our paper at @ICCV_2021! Self-supervised Video Object Segmentation by Motion Grouping (w/ @hala_lamdouar, @erika_lu_, Andrew Zisserman & @WeidiXie) Project page (paper+video+code): charigyang.github.io/motiongroup/
  • Weidi Xie
    @WeidiXie
    Apr 5, 2023
    Happy to share the paper "Self-supervised Tumor Segmentation with Sim2Real Adaptation", published in the IEEE Journal of Biomedical and Health Informatics. The model enables zero-shot tumor segmentation with Sim2Real training, requiring zero or few annotations from physicians.
