GROUP ACTIVITY RECOGNITION FROM STILL IMAGES WITH DEEP LEARNING
Abstract
This thesis addresses the problem of inferring group activity from still images and classifying it. Activity information is usually most meaningful when analyzed over time, and the absence of temporal context is one of the factors that complicates the problem. For example, two people who stand side by side in a still image but are engaged in different activities will most likely be classified as performing the same activity. For reasons like this, classifying group activities in still images is challenging. To overcome these difficulties, individual humans in a still image must be detected and classified with high accuracy. This detection and classification step constitutes the first part of the thesis.
Deep learning techniques, which have yielded successful results in object detection and classification in recent years, are the methods chosen to solve the problems addressed in this thesis. The features used to represent individual humans and groups in images must be well chosen. Deep learning has an advantage over other methods here, because the model can learn these features automatically.
Choosing deep learning methods as the basis for solving the problem brings additional challenges.
Foremost among these challenges is selecting a deep learning model suited to the problem. Since we are dealing with classification problems, Convolutional Neural Networks (CNNs) \cite{paper1} were adapted for group activity recognition. The ResNet \cite{paper2} architecture, which has been widely used for complex classification problems in recent years, was chosen as the model. Another difficulty in deep learning is deciding on the size and variety of the dataset. The SGD dataset \cite{paper3} was used in this thesis. The most challenging issue is that this dataset \cite{paper3} does not contain enough samples, or enough diversity, in the individual human orientation and group activity classes to improve classification performance. To overcome these difficulties, joint and segment information was used in addition to group activity information. To incorporate this information into the deep learning process, fusion was performed and the results were observed.
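As an illustration of what fusing several information sources can look like, the following is a minimal sketch of score-level (late) fusion, in which each stream produces its own class scores and the resulting probabilities are averaged. This is an assumption made for illustration only: the abstract does not state which fusion scheme the thesis uses, and the stream names, weights, class count, and score values below are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(stream_scores: Dict[str, Sequence[float]],
                weights: Dict[str, float]) -> List[float]:
    """Score-level fusion: weighted average of per-stream class probabilities."""
    names = list(stream_scores)
    num_classes = len(stream_scores[names[0]])
    weight_sum = sum(weights[name] for name in names)
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_scores[name])
        for k, p in enumerate(probs):
            fused[k] += weights[name] * p / weight_sum
    return fused


# Hypothetical raw scores over three made-up classes, one list per stream.
# In practice each list would come from a separate network head.
scores = {
    "image": [2.0, 1.0, 0.1],
    "joints": [0.5, 2.5, 0.2],
    "segments": [1.8, 1.9, 0.3],
}
# Equal weights are an assumption; they could be tuned on validation data.
weights = {"image": 1.0, "joints": 1.0, "segments": 1.0}

fused = fuse_scores(scores, weights)
predicted = max(range(len(fused)), key=fused.__getitem__)
print(predicted, [round(p, 3) for p in fused])
```

In this toy input the image stream alone favors the first class, while the joint and segment streams shift the fused decision to the second class, which shows how additional cues can change a prediction. Feature-level fusion, where intermediate feature vectors are concatenated before a shared classifier, is a common alternative.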
The results of the thesis indicate that the success achieved in detection and classification stems from the choice of deep learning techniques.