Publications

This is a real-time multi-object tracker that ranks first on many MOT evaluation datasets.

We introduce MetaFuse, a pre-trained fusion model learned from a large number of cameras in the Panoptic dataset.

We present an approach to estimate 3D poses of multiple people from multiple camera views.

Semantic image segmentation is an important yet unsolved problem. One of the major challenges is the large variability of the object …

We address the problem of recovering absolute 3D human poses from multi-view images by incorporating multi-view geometric priors into …

We propose a Locally Connected Network for 3D human pose estimation. It extracts and propagates features over the pose graph and can …

We focus on obtaining high quality object linking results for better classification. Unlike previous methods that link objects by …

Estimating 3D human poses from 2D joint positions is an ill-posed problem, and is further complicated by the fact that the estimated 2D …

We propose a method for estimating 3D human poses from single images or video sequences.

We propose a variant of archetypal analysis which scales gracefully to large datasets. The core idea is to decouple the binding between …

We address the problem of video object segmentation which outputs the masks of a target object throughout a video given only a bounding …

We address the task of action recognition from a sequence of 3D human poses. This is a challenging task firstly because the poses of …

A key-pose-motif contains a set of ordered poses or action units (a short sequence of poses), which are required to be close but not …

Pose-based action recognition in 3D is the task of recognizing an action (e.g., walking or running) from a sequence of 3D skeletal …

We propose a method of estimating 3D human poses from a single image, which works in conjunction with an existing 2D pose detector.

We propose one of the earliest approaches for pose-based action recognition in videos in the wild.