Variant Grassmann Manifolds: A Representation Augmentation Method for Action Recognition


In classification tasks, classifiers trained with finite examples might generalize poorly to new data with unknown variance. For this issue, data augmentation is a successful solution where numerous artificial examples are added to training sets. In this article, we focus on the data augmentation for improving the accuracy of action recognition, where action videos are modeled by linear dynamical systems and approximately represented as linear subspaces. These subspace representations lie in a non-Euclidean space, named Grassmann manifold, containing points as orthonormal matrixes. It is our concern that poor generalization may result from the variance of manifolds when data come from different sources or classes. Thus, we introduce infinitely many variant Grassmann manifolds (VGM) subject to a known distribution, then represent each action video as different Grassmann points leading to augmented representations. Furthermore, a prior based on the stability of subspace bases is introduced, so the manifold distribution can be adaptively determined, balancing discrimination and representation. Experimental results of multi-class and multi-source classification show that VGM softmax classifiers achieve lower test error rates compared to methods with a single manifold.

ACM Transactions on Knowledge Discovery from Data
Junyuan Hong
Junyuan Hong
CSE PhD Student

My research interests include distributed learning, data privacy and deep learning.

comments powered by Disqus