Learning Visual Saliency For Static And Dynamic Scenes
Abstract
The ultimate aim in visual saliency estimation is to mimic the human visual system in predicting which image regions grab our attention. Many different features and models have been proposed in the literature, but a key open question remains: how do different features contribute to saliency? In this study, we aim for a better understanding of how visual features can be integrated to build more effective saliency models. Towards this goal, we investigate several machine learning techniques and analyze their saliency estimation performance in static and dynamic scenes. First, multiple kernel learning is employed for static saliency estimation, providing an intermediate-level fusion of features. Second, a thorough analysis is carried out for saliency estimation in dynamic scenes. Lastly, we propose a fully unsupervised adaptive feature integration scheme for dynamic saliency estimation, which gives superior results compared to approaches that use a fixed set of parameters in the fusion stage. Since existing methods in the literature are still far from achieving human-level saliency estimation, we believe that our approaches provide new insights into this challenging problem.