An End-To-End Convolutional Neural Network Framework For Low-Resolution Attribute Recognition
In video surveillance, visual person attributes such as gender, carrying a backpack, and clothing type are crucial for person search and re-identification. Detecting and retrieving these attributes with high accuracy generally requires high-quality video, because the higher the image resolution, the more detail an image carries. In real-world surveillance systems, however, videos are usually captured from a long distance, so person regions are low-resolution. Super-resolution addresses this obstacle by reconstructing a high-resolution image from several observed low-resolution images or from a single low-resolution image. This thesis examines the problem and proposes an end-to-end convolutional neural network that combines a super-resolution network with a multi-attribute detection network for more effective multi-attribute detection. The framework jointly trains two main parts: super-resolution and attribute learning. Several well-known, high-quality super-resolution algorithms were tested for the first part, and two of them, EDSR and DBPN, were selected. We evaluate the proposed method on two benchmark datasets, Market-1501 and DukeMTMC-reID, both annotated with attribute labels, and predict the attributes of every image. Experimental results on these datasets demonstrate the effectiveness of the proposed approach for low-resolution multi-attribute learning. Furthermore, we propose a higher-level linear combination of the two network types (with and without super-resolution), which yields superior results in person attribute recognition.
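As an illustration of the linear combination scheme mentioned at the end of the abstract, the following is a minimal sketch in plain Python. It assumes each network outputs one probability per attribute and that the two outputs are blended with a single weight. The function names, the weight `alpha`, the decision `threshold`, and the example scores are all hypothetical and are not taken from the thesis, which may combine the networks differently.

```python
from typing import Dict


def fuse_attribute_scores(
    scores_with_sr: Dict[str, float],
    scores_without_sr: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Blend per-attribute probabilities from two networks.

    alpha is an assumed mixing weight: 1.0 trusts only the network with
    super-resolution, 0.0 trusts only the network without it.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if scores_with_sr.keys() != scores_without_sr.keys():
        raise ValueError("both networks must score the same attributes")
    return {
        name: alpha * scores_with_sr[name] + (1.0 - alpha) * scores_without_sr[name]
        for name in scores_with_sr
    }


def predict_attributes(fused: Dict[str, float], threshold: float = 0.5) -> Dict[str, bool]:
    """Turn fused probabilities into binary attribute decisions."""
    return {name: score >= threshold for name, score in fused.items()}


# Made-up scores for one low-resolution person image (illustrative only).
with_sr = {"gender": 0.75, "backpack": 0.25}
without_sr = {"gender": 0.25, "backpack": 0.5}

fused = fuse_attribute_scores(with_sr, without_sr, alpha=0.5)
decisions = predict_attributes(fused)
```

In a sketch like this, `alpha` would typically be chosen on a held-out validation split rather than fixed in advance.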