Variant Grassmann Manifolds: A Representation Augmentation Method for Action Recognition


In classification tasks, classifiers trained with finite examples might generalize poorly to new data with unknown variance. For this issue, data augmentation is a successful solution where numerous artificial examples are added to training sets. In this article, we focus on the data augmentation for improving the accuracy of action recognition, where action videos are modeled by linear dynamical systems and approximately represented as linear subspaces. These subspace representations lie in a non-Euclidean space, named Grassmann manifold, containing points as orthonormal matrixes. It is our concern that poor generalization may result from the variance of manifolds when data come from different sources or classes. Thus, we introduce infinitely many variant Grassmann manifolds (VGM) subject to a known distribution, then represent each action video as different Grassmann points leading to augmented representations. Furthermore, a prior based on the stability of subspace bases is introduced, so the manifold distribution can be adaptively determined, balancing discrimination and representation. Experimental results of multi-class and multi-source classification show that VGM softmax classifiers achieve lower test error rates compared to methods with a single manifold.

ACM Transactions on Knowledge Discovery from Data
Junyuan Hong
Junyuan Hong
Postdoctoral Fellow

My research interest lies in the interaction of human-centered AI and healthcare.