Enhanced Deep Learning Model to Detect Anomalies in Surveillance Videos
Abstract
Growing security challenges and advances in technology have led to heavy use of surveillance cameras. The result is an overwhelming volume of video data that requires automated analytics to be used effectively. This volume presents an enormous problem for security personnel, who must monitor the footage frame by frame to identify abnormal activities (security threats) such as violence and thuggery. Successful identification of anomalies in surveillance footage will greatly ease the work of Closed-Circuit Television (CCTV) operators, since they will be able to search through large volumes of video data easily. This research also contributes to computer vision, since the model can be applied in other areas such as robotic or unmanned surveillance. There have been attempts to automate the surveillance process using smart surveillance. However, these solutions suffer from high error rates and inefficiency when identifying abnormal scenes. Modern automated video analytics use deep learning algorithms such as Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), convolutional LSTM and 3D CNN. These approaches have their strengths and weaknesses, and determining the best model for detecting anomalies becomes a research challenge. Another challenge addressed herein is the accuracy of anomaly detection in surveillance videos. A comparative study was carried out to cross-examine deep learning models used in anomaly detection. Empirical data was collected to measure the accuracy of these models. The best model was determined by analyzing the accuracies of models published since 2016. Experiments were set up in Google Colab and Google Cloud. These environments were configured to use Python 3.7 with the Keras and TensorFlow machine learning frameworks.
The study improved the selected deep learning model through optimization of the model structure and depth tuning. The study found that deeper autoencoders have higher prediction accuracy, and that deeper spatial autoencoders draw more features from the videos, which increases their accuracy. The enhanced model was validated through further experiments that compared its prediction accuracy against that of the existing model, which served as the control. Their Receiver Operating Characteristic (ROC) curve scores on the UCSD Ped1 and Ped2 datasets were compared. The recorded model accuracies were tabulated for comparative analysis, and a percentage increase in model accuracy was noted. A sign test was used to test the significance of the improvement, and empirical evidence of the enhancement was found at both the 1% and 5% significance levels. This work contributes to autoencoder design paradigms, to improving Spatial-Temporal Autoencoder accuracy through depth and regularization tuning, and to reducing anomaly detection errors in surveillance videos. The study has shown that the depth of a spatial-temporal autoencoder affects its anomaly prediction accuracy. Future work should consider integrating continual learning and real-time anomaly detection.
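The sign test mentioned above can be illustrated with a minimal Python sketch. It assumes a one-sided test on paired scores (control model versus enhanced model) with ties dropped. The abstract does not state whether the study used a one-sided or two-sided variant, so that choice is an assumption. The function name `sign_test` and the score lists are hypothetical placeholders, not the study's data or results.

```python
from math import comb


def sign_test(control, enhanced):
    """One-sided sign test on paired scores.

    H0: the enhanced model is equally likely to score above or below the control.
    H1: the enhanced model tends to score higher.
    Returns (number of positive differences, number of non-tied pairs, p-value).
    """
    if len(control) != len(enhanced):
        raise ValueError("control and enhanced must have the same length")

    diffs = [e - c for c, e in zip(control, enhanced)]
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg  # tied pairs carry no sign and are dropped

    if n == 0:
        return n_pos, n, 1.0  # no evidence either way

    # P(X >= n_pos) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, k) for k in range(n_pos, n + 1)) / 2 ** n
    return n_pos, n, p_value


if __name__ == "__main__":
    # Hypothetical placeholder scores for illustration only (NOT the study's results).
    control_scores = [0.70, 0.72, 0.68, 0.75, 0.71, 0.69, 0.73, 0.74]
    enhanced_scores = [0.78, 0.80, 0.74, 0.79, 0.77, 0.76, 0.81, 0.80]

    n_pos, n, p = sign_test(control_scores, enhanced_scores)
    print(f"{n_pos} of {n} non-tied pairs favour the enhanced model, p = {p:.5f}")
    for alpha in (0.01, 0.05):
        print(f"significant at {alpha:.0%}: {p < alpha}")
```

With eight non-tied pairs that all favour the enhanced model, the one-sided p-value is 1/256, or about 0.0039, which falls below both the 1% and 5% thresholds. With fewer pairs, the smallest achievable p-value is larger, so the number of paired comparisons limits the significance a sign test can report.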