Producing Synthetic Person Images with Deep Generative Artificial Neural Networks
Abstract
Producing synthetic person images has a wide variety of applications, including digital photo sharing and editing, visual surveillance, fashion and art design, and human-interactive autonomous machines, among others. In the scope of this thesis, we explored two problems related to person image generation: attribute-based person image generation and language-guided editing of person images, with a focus on outfits. The former problem considers generating realistic person images from attributes such as pose, gender, clothing, and the presence of accessories such as bags; the latter focuses on editing an outfit image through natural-language sentences, generating new outfits while leaving the regions not mentioned in the text description unchanged.
Synthetic person image generation is difficult for several reasons: foreground/background variation, partial occlusion, variation in a person's stance, camera angle and distance, complex relationships between attributes or natural-language descriptions and image content, and unbalanced, poor-quality data.
In this thesis, we developed conditional generative adversarial network (cGAN)-based models to solve each problem. Through quantitative and qualitative experiments, we show that our first, attribute-based model produces plausible synthetic person images, and that our second, language-guided model generates more plausible results than the baseline work, with better localization when generating new outfits consistent with the target text descriptions.
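Both models rest on the conditional GAN idea: the generator receives a condition (an attribute vector or a text encoding) alongside its noise input, so the same noise can yield different images under different conditions. A minimal NumPy sketch of this conditioning mechanism is shown below; the layer sizes, dimensions, and function names are purely illustrative and do not reflect the thesis architectures, and the weights are left untrained for brevity.

```python
import numpy as np

def make_generator(noise_dim=16, attr_dim=4, img_dim=64, hidden=32, seed=0):
    """Toy conditional generator mapping (noise, attributes) -> a flat 'image'.

    The weights here are random; a real cGAN trains them adversarially
    against a discriminator that also observes the condition vector.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (noise_dim + attr_dim, hidden))
    W2 = rng.normal(0.0, 0.1, (hidden, img_dim))

    def generate(z, attrs):
        x = np.concatenate([z, attrs], axis=-1)   # condition by concatenation
        h = np.tanh(x @ W1)                       # hidden layer
        return np.tanh(h @ W2)                    # pixel values in [-1, 1]

    return generate

# Same noise, two different attribute vectors -> two different outputs.
gen = make_generator()
z = np.zeros(16)
img_a = gen(z, np.array([1.0, 0.0, 0.0, 0.0]))
img_b = gen(z, np.array([0.0, 1.0, 0.0, 0.0]))
print(img_a.shape, np.allclose(img_a, img_b))
```

Concatenating the condition with the noise is only the simplest conditioning scheme; in practice the condition can also be injected at intermediate layers or into the discriminator's decision.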