Audio-Visual Emotion Recognition Using Deep Operational Networks
Abstract
Emotions play a significant role in interpersonal communication, marketing, healthcare, and the service sector, and the classification of emotions therefore remains an active research topic. Audio-visual emotion recognition is a well-established area of machine learning whose primary objective is to identify and categorize human emotions; it applies computer vision and audio processing techniques to assess and interpret the emotional states that people convey. Most existing studies, however, analyze only a single modality, such as text or images. This work introduces an operational neural network (ONN)-based deep learning model that performs emotion recognition from multiple inputs. The proposed model follows a multimodal strategy that combines visual and audio features, and its architecture replaces conventional convolutional layers with operational layers. Experimental results show that the operational convolutional architecture outperforms a comparable conventional convolutional neural network.
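To illustrate the core idea of an operational layer, the following is a minimal NumPy sketch in one dimension. It contrasts a standard convolution (linear multiply-and-sum) with an operational variant in which the linear nodal operator is replaced by a learnable polynomial, in the spirit of self-organized operational neural networks (Self-ONNs). The function names, the 1-D setting, and the choice of a polynomial nodal operator with summation pooling are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def conv1d(x, w):
    # Standard (valid) 1-D convolution: linear nodal operator
    # (elementwise multiply) followed by summation pooling.
    k = len(w)
    return np.array([np.sum(x[i:i + k] * w) for i in range(len(x) - k + 1)])

def operational_conv1d(x, w, q=3):
    # Operational layer sketch: the multiply is replaced by a polynomial
    # (Taylor-series-like) nodal operator. w has shape (q, k) -- one
    # kernel per polynomial order -- and pooling is still summation.
    # With q=1 this reduces exactly to the standard convolution above.
    k = w.shape[1]
    out = []
    for i in range(len(x) - k + 1):
        window = x[i:i + k]
        # nodal operator: sum over orders p of w[p] * window**(p+1)
        out.append(sum(np.sum(w[p] * window ** (p + 1)) for p in range(q)))
    return np.array(out)
```

With `q=1` the operational layer degenerates to an ordinary convolution, which is why ONN architectures are often described as a strict generalization of CNNs: the network can learn nonlinear neuron-level transformations instead of being restricted to linear filtering.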