2014.11.7. Prof. Chunhua Shen: Encoding high dimensional local features by sparse coding based Fisher vectors

Academic Talk


Posted: 2014-10-27

Title: Encoding high dimensional local features by sparse coding based Fisher vectors

Speaker: Prof. Chunhua Shen

Host: Prof. Shiliang Sun (孙仕亮)

Time: 10:00, November 7, 2014

Venue: Room 133, Information Building (信息楼)

Abstract:

Derived from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) to characterize the generation process of local features. This choice has proved sufficient for traditional low dimensional local features, e.g., SIFT; typically, good performance can be achieved using a mixture of only a few hundred Gaussian distributions. However, the same number of Gaussian distributions could become insufficient to model the feature space spanned by higher dimensional local features, which have become popular recently. Simply increasing the number of Gaussian distributions to improve the modeling capacity for high dimensional features turns out to be inefficient and computationally impractical. In this paper, we propose a generation process in which each local feature is drawn from a Gaussian distribution whose mean vector is sampled from a subspace. With a certain approximation, the resulting model can be converted to a sparse coding procedure, and the learning/inference problems can be readily solved by standard sparse coding methods. By calculating the gradient vector of the proposed model, we derive a new Fisher vector encoding strategy, termed Sparse Coding based Fisher Vector Coding (SCFVC). Moreover, we adopt the recently developed Deep Convolutional Neural Network (CNN) descriptor as high dimensional local features and implement image classification with the proposed SCFVC. Our experimental evaluations demonstrate that our method not only significantly outperforms traditional GMM based Fisher vector encoding but also achieves state-of-the-art performance in generic object recognition, indoor scene, and fine-grained image classification problems.
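As a rough illustration of the idea in the abstract, the sketch below assumes a model in which a local feature x is Gaussian with mean B u, where B is a learned basis and u is a sparse code. Under that assumption, the gradient of the log-likelihood with respect to B is proportional to (x - B u) u^T, and pooling this quantity over all local features of an image gives one possible encoding. The function names, the ISTA solver, the lasso objective, the value of lam, and the sum pooling are illustrative assumptions, not details taken from the talk.

```python
import numpy as np

def ista_sparse_code(x, B, lam=0.1, n_iter=200):
    """Approximately solve min_u 0.5*||x - B u||^2 + lam*||u||_1 with ISTA.

    Hypothetical helper: any standard sparse coding solver could be used.
    """
    L = np.linalg.norm(B, 2) ** 2  # Lipschitz constant of the smooth term
    u = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ u - x)           # gradient of the quadratic term
        z = u - grad / L                   # gradient step
        u = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return u

def scfvc_encode(X, B, lam=0.1):
    """Sum-pool (x - B u) u^T over local features; returns a flat vector.

    Assumed form of the encoding, derived from the Gaussian model above.
    """
    d, m = B.shape
    G = np.zeros((d, m))
    for x in X:
        u = ista_sparse_code(x, B, lam)
        G += np.outer(x - B @ u, u)        # residual weighted by the code
    return G.ravel()

# Toy data standing in for local descriptors (random, for illustration only).
rng = np.random.default_rng(0)
B = rng.standard_normal((16, 32))
B /= np.linalg.norm(B, axis=0)             # unit-norm basis columns
X = rng.standard_normal((50, 16))          # 50 local features of dimension 16
code = scfvc_encode(X, B)
```

Because u is sparse, only the columns of the gradient that correspond to active basis vectors are non-zero for each feature, so the per-feature cost stays low even when the basis is large.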

Speaker bio:

Chunhua Shen is a Full Professor at the School of Computer Science, University of Adelaide. He is a Project Leader and Chief Investigator at the Australian Research Council Centre of Excellence for Robotic Vision. Before he moved to Adelaide as a Senior Lecturer, he was with the computer vision program at NICTA (National ICT Australia), Canberra Research Laboratory, for about six years. His research interests lie at the intersection of computer vision and statistical machine learning. His recent work has been on real-time object detection, large-scale image retrieval and classification, and scalable nonlinear optimization.

He studied at Nanjing University and at the Australian National University, and received his PhD degree from the University of Adelaide. In 2012 he was awarded the Australian Research Council Future Fellowship.

Science Building, 3663 North Zhongshan Road, 200062