Learning to track and identify players from broadcast sports videos

Tracking and identifying players in sports videos filmed with a single pan-tilt-zoom camera has many applications, but it is also a challenging problem. This thesis introduces the first intelligent system that tackles this difficult task. The system possesses the ability to detect and track multiple...


Bibliographic Details
Main Author: Lu, Wei-Lwun
Language: English
Published: University of British Columbia 2012
Online Access: http://hdl.handle.net/2429/39956
id ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-39956
record_format oai_dc
spelling ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-39956 2014-03-26T03:38:30Z Learning to track and identify players from broadcast sports videos Lu, Wei-Lwun 2012-01-09T19:09:18Z 2012-01-09T19:09:18Z 2011 2012-01-09 2012-05 Electronic Thesis or Dissertation http://hdl.handle.net/2429/39956 eng University of British Columbia
collection NDLTD
language English
sources NDLTD
description Tracking and identifying players in sports videos filmed with a single pan-tilt-zoom camera has many applications, but it is also a challenging problem. This thesis introduces the first intelligent system that tackles this difficult task. The system detects and tracks multiple players, estimates the homography between video frames and the court, and identifies the players. The tracking system is based on the tracking-by-detection philosophy. We first localize players using a player detector, categorize detections based on team colors, and then group them into tracks of specific players. Instead of using visual cues to distinguish between players, we rely on their short-term motion patterns. The homography estimation is solved using a variant of the Iterated Closest Points (ICP) algorithm. Unlike most existing algorithms, which rely on matching robust feature points, we propose to match edge points in two images. We also introduce a technique to update the model online to accommodate logos and patterns in different stadiums. The identification system utilizes both visual and spatial cues, and exploits both temporal and mutual exclusion constraints in a Conditional Random Field. In addition, we propose a novel Linear Programming Relaxation algorithm for predicting the best player identification in a video clip. To reduce the amount of labeled training data required to learn the identification system, we pioneer the use of weakly supervised learning with the assistance of play-by-play texts. Experiments show promising results in tracking, homography estimation, and identification. Moreover, weakly supervised learning with play-by-play texts greatly reduces the amount of labeled training data required: with merely 200 labels it achieves accuracy similar to that of a strongly supervised approach, which requires at least 20,000 labels.
author Lu, Wei-Lwun
title Learning to track and identify players from broadcast sports videos
publisher University of British Columbia
publishDate 2012
url http://hdl.handle.net/2429/39956