Advanced Deep Learning Architectures for Action Recognition, Video Analysis and Classification
Over the past couple of weeks I have been researching Deep Learning and its applications in Computer Vision. I was working on a project where the challenge was to design a Deep Learning model for smart home applications. The motivation was to detect different human actions in video recorded in a smart home, so that an activity monitor could be built. Such a monitor would have numerous applications, particularly in healthcare, for example tracking a person's overall activity and deciding whether they are doing enough each day to stay healthy. One of the tasks I worked on was designing a Deep Learning architecture that detects a "Stretching Body" action. I started off with a very basic architecture: a Convolutional Neural Network followed by an LSTM at the end to learn the temporal aspects of a video. This is discussed in detail in my other repository, named "Activity Recognition in Videos using Keras".
I designed two more complex architectures for detecting the stretching body. Both were built in Python with the Keras Functional API on a TensorFlow backend, and both are based on ensembling and fusing different architectures together. They combine two important base architectures used for feature extraction in video classification tasks. One is the 3D-CNN. The other is a Convolutional Neural Network wrapped in a Time Distributed layer to learn frame-level features, followed by an LSTM to learn the temporal aspects. The code for both architectures is provided separately.
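As a rough illustration of the two-branch idea described above, here is a minimal sketch using the Keras Functional API. It is not the code from this repository: the function name `build_fusion_model`, the input size, the layer widths, the fusion by simple concatenation, and the single sigmoid output for "stretching vs. not stretching" are all assumptions made for the example.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_fusion_model(frames=16, height=64, width=64, channels=3):
    """Hypothetical two-branch video classifier (illustrative sizes only)."""
    clip = keras.Input(shape=(frames, height, width, channels), name="clip")

    # Branch 1: a 3D-CNN convolves over time and space jointly.
    x = layers.Conv3D(16, kernel_size=3, padding="same", activation="relu")(clip)
    x = layers.MaxPooling3D(pool_size=2)(x)
    x = layers.Conv3D(32, kernel_size=3, padding="same", activation="relu")(x)
    x = layers.GlobalAveragePooling3D()(x)

    # Branch 2: the same 2D CNN is applied to every frame through
    # TimeDistributed, then an LSTM summarizes the frame-level features.
    frame_cnn = keras.Sequential(
        [
            keras.Input(shape=(height, width, channels)),
            layers.Conv2D(16, 3, padding="same", activation="relu"),
            layers.MaxPooling2D(2),
            layers.Conv2D(32, 3, padding="same", activation="relu"),
            layers.GlobalAveragePooling2D(),
        ],
        name="frame_cnn",
    )
    y = layers.TimeDistributed(frame_cnn)(clip)
    y = layers.LSTM(32)(y)

    # Fusion: concatenate both feature vectors (an assumed fusion strategy).
    merged = layers.Concatenate()([x, y])
    merged = layers.Dense(32, activation="relu")(merged)
    out = layers.Dense(1, activation="sigmoid", name="stretching")(merged)
    return keras.Model(inputs=clip, outputs=out, name="fusion_sketch")


model = build_fusion_model()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Both branches read the same input tensor, which is what the Functional API makes convenient: a `Sequential` model cannot express a graph with two parallel paths that are merged later.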
The project was divided into 3 phases. The first phase defined a baseline model, which I have uploaded in the other repository, "Activity Recognition in Videos using Keras". All the details regarding Phases 2 and 3 are in this repository. For the 2nd and 3rd phases I used data generation, augmentation and more complex model architectures to improve the model's performance. The Python code that defines the architectures with Keras's Functional API is provided, along with a report for each of Phases 2 and 3 to help explain the work.
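To give a feel for what data generation with augmentation can look like for video clips, here is a small NumPy sketch. The specific augmentations (a random horizontal flip and a brightness scale) and the names `augment_clip` and `clip_batches` are assumptions for illustration. The reports in this repository describe what was actually used.

```python
import numpy as np


def augment_clip(clip, rng):
    """Augment one clip of shape (frames, height, width, channels) in [0, 1]."""
    if rng.random() < 0.5:
        # Flip every frame the same way so motion stays consistent.
        clip = clip[:, :, ::-1, :]
    # Scale brightness by one random factor for the whole clip.
    scale = rng.uniform(0.8, 1.2)
    return np.clip(clip * scale, 0.0, 1.0)


def clip_batches(clips, labels, batch_size, rng):
    """Yield shuffled, augmented (clips, labels) batches indefinitely."""
    n = len(clips)
    while True:
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            x = np.stack([augment_clip(clips[i], rng) for i in idx])
            yield x, labels[idx]


# Toy usage with random data standing in for real video clips.
rng = np.random.default_rng(0)
clips = rng.random((6, 4, 8, 8, 3))
labels = np.array([0, 1, 0, 1, 1, 0])
x, y = next(clip_batches(clips, labels, batch_size=4, rng=rng))
```

Applying the same flip and brightness factor to all frames of a clip matters for video: augmenting each frame independently would introduce artificial flicker that a temporal model could pick up on.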