First order scattering transform
1. Introduction
The main challenges of action recognition (variations in class, pose and appearance, occlusion and lighting) require well-designed features for video processing. Some commonly used ones are:
- Space time interest points
- Dense Trajectories
- Body joints
- Motion history images
Historically, these features have been fed into a machine learning classifier, with a linear SVM being a popular choice.
2. Action Recognition
There are different levels of semantics to characterize what an action is:
- Action “Walking”
- Activity “Walking on grass”
- Event “A soccer game”
2.1. History Images
Temporal templates idea: summarize the motion in a video in a Motion History Image (MHI) [A.F. Bobick et al., “The Recognition of Human Movement Using Temporal Templates”, PAMI 2001].
- Compute the MHI for each action sequence.
- Describe each sequence with its Hu description (moment invariants) [Hu M., IEEE Transactions on Information Theory, 1962].
- Use nearest-neighbor action classification with the Mahalanobis distance d between training and test descriptors.
Dataset: Aerobics Dataset.
- Advantages: simple and fast
- Disadvantages:
  - Static camera and background
  - Sensitive to segmentation errors
  - Silhouettes do not capture interior motion / shape
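The nearest-neighbor classification step of the temporal template method can be sketched as follows. The descriptors (for example, Hu moment vectors) are assumed to be precomputed. Estimating a single covariance matrix pooled over all training descriptors, and the small `ridge` term added for numerical stability, are assumptions of this example and not necessarily what the original method does. All function names are hypothetical.

```python
import numpy as np

def fit_inverse_covariance(train_descriptors, ridge=1e-6):
    """Inverse covariance of the training descriptors (rows are samples)."""
    X = np.asarray(train_descriptors, dtype=float)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # Assumption: a small ridge keeps the matrix invertible.
    cov = cov + ridge * np.eye(X.shape[1])
    return np.linalg.inv(cov)

def mahalanobis(u, v, inv_cov):
    """Mahalanobis distance between two descriptors."""
    d = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.sqrt(d @ inv_cov @ d))

def classify_nearest_neighbor(test_descriptor, train_descriptors, train_labels, inv_cov):
    """Return the label of the training descriptor closest to the test descriptor."""
    distances = [mahalanobis(test_descriptor, x, inv_cov) for x in train_descriptors]
    return train_labels[int(np.argmin(distances))]
```

Compared with the Euclidean distance, the Mahalanobis distance rescales each descriptor dimension by its variance, which matters when descriptor components differ by orders of magnitude.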
2.2. Spatio-Temporal Features
A useful and effective approach is to extract local features at space-time interest points and to encode the temporal information directly into the local feature. This results in spatio-temporal local features that embed space and time jointly. Videos are considered as volumes of pixels. Spatio-temporal features are located at spatio-temporal salient points, which are extracted with interest point operators. As in the 2D case, the search is for interest point structures that are stable under rotation, viewpoint, scale and illumination changes. Space-time interest point detectors are extensions of 2D interest point detectors that incorporate temporal information.
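To make the idea of extending a 2D detector to the video volume concrete, here is a simplified sketch of a Harris-style space-time cornerness measure. It builds the 3×3 second-moment matrix of the spatio-temporal gradients, smooths it with a Gaussian, and computes det(μ) − k·trace³(μ). The single-scale setup, the parameter defaults and all function names are assumptions for illustration; a full detector would also pre-smooth the video, search over several spatial and temporal scales, and keep only local maxima of the response.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def smooth(volume, sigmas):
    """Separable Gaussian smoothing; each axis must be at least as long as its kernel."""
    out = volume
    for axis, sigma in enumerate(sigmas):
        kernel = gaussian_kernel_1d(sigma)
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out)
    return out

def harris3d_response(video, sigma_space=1.0, sigma_time=1.0, k=0.005):
    """Space-time cornerness of a (T, H, W) video volume."""
    video = np.asarray(video, dtype=float)
    sigmas = (sigma_time, sigma_space, sigma_space)
    # Gradients along time, rows (y) and columns (x).
    It, Iy, Ix = np.gradient(video)
    # Smoothed entries of the symmetric second-moment matrix.
    xx, yy, tt = smooth(Ix * Ix, sigmas), smooth(Iy * Iy, sigmas), smooth(It * It, sigmas)
    xy, xt, yt = smooth(Ix * Iy, sigmas), smooth(Ix * It, sigmas), smooth(Iy * It, sigmas)
    det = (xx * (yy * tt - yt * yt)
           - xy * (xy * tt - yt * xt)
           + xt * (xy * yt - yy * xt))
    trace = xx + yy + tt
    return det - k * trace ** 3
```

The response is large only where the intensity varies strongly along x, y and t at the same time, which is what distinguishes a space-time interest point from a static corner or a uniformly moving edge.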
Most popular solutions:
Detectors:
- STIP, Spatio-Temporal Interest Points (Harris3D) [I. Laptev, IJCV 2005]
- Dollar’s detector [P. Dollar et al., VS-PETS 2005]
- Hessian3D [G. Willems et al., ECCV 2008]
Descriptors:
- HOG/HOF [I. Laptev et al., CVPR 2008]
- Dollar [P. Dollar et al., VS-PETS 2005]
- HoG3D [A. Klaeser et al., BMVC 2008]
- Extended SURF [G. Willems et al., ECCV 2008]
STIP: Spatio-Temporal Interest Points Detector
STIP Summary
Dollar’s periodic motion detector
Importance of multiple scales
Descriptors for spatio-temporal patches
3D HoG
Action Recognition
2.3. Dense Trajectories
3. Object/Action Detection
* Frame level
4. Applications
* Vehicle Tracking
* Kinect
