Mass Prediction From 2D Images
Abstract
This thesis set out to estimate the mass of objects from 2D images. Because mass is the product of density and volume, assuming a constant density normalized to one throughout the study reduces the problem to volume estimation from 2D images, which became the working objective of the thesis.
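The reduction from mass to volume can be written out explicitly. The symbols below ($m$ for mass, $\rho$ for density, $V$ for volume) are notation assumed here for illustration and are not taken from the thesis itself:

```latex
\[
  m = \rho \, V, \qquad \rho = 1 \;\Longrightarrow\; m = V .
\]
```

Under this assumption, any predicted volume can be read directly as a predicted mass in normalized units.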
This thesis addresses two interconnected computer vision problems, depth map estimation and volume estimation from 2D images, within a single framework. Depth maps provide a dense representation of scene geometry, which is important for applications such as robotics, augmented reality, and 3D reconstruction. Volume estimation, in turn, is relevant to medical imaging, manufacturing, and environmental monitoring. The proposed methodology first estimates depth maps and then applies feature extraction techniques to those maps in order to estimate object volumes.
The methodology combines traditional image processing with machine learning methods. To capture spatial and contextual information, a U-Net-based convolutional neural network trained on the NYU Depth V2 dataset is used to estimate depth maps. These depth maps are then processed with feature extraction methods such as ORB (Oriented FAST and Rotated BRIEF) and HOG (Histogram of Oriented Gradients) to produce robust and interpretable feature sets for volume estimation. To improve model performance, data imbalance is addressed with SMOGN (Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise), and an attention mechanism, the Convolutional Block Attention Module (CBAM), is integrated to enhance feature extraction.
Volume estimation is performed with gradient boosting regression models such as LightGBM, CatBoost, and XGBoost, which are further combined through ensemble methods such as stacking, voting, bagging, and boosting to improve prediction accuracy. In addition, techniques such as Isolation Forest outlier detection and KMeans clustering are used to improve data integrity and to build dedicated models for different data segments.
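The outlier filtering and stacking steps described above can be sketched as follows. This is a minimal illustration, not the thesis pipeline: it assumes scikit-learn is available, uses synthetic data in place of the depth-map features, and substitutes scikit-learn's `GradientBoostingRegressor` and `RandomForestRegressor` for LightGBM, CatBoost, and XGBoost. The contamination rate, feature count, and meta-learner are all hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    IsolationForest,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in for extracted features and volume targets (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

# Step 1: flag outliers in feature space.
# fit_predict returns 1 for inliers and -1 for outliers.
iso = IsolationForest(contamination=0.05, random_state=0)
inlier_mask = iso.fit_predict(X) == 1
X_clean, y_clean = X[inlier_mask], y[inlier_mask]

# Step 2: hold out a test split from the cleaned data.
X_train, X_test, y_train, y_test = train_test_split(
    X_clean, y_clean, test_size=0.25, random_state=0
)

# Step 3: stack two tree-based regressors under a linear meta-learner.
# The base models are stand-ins for the boosting libraries named in the text.
stack = StackingRegressor(
    estimators=[
        ("gbr", GradientBoostingRegressor(random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=Ridge(),
)
stack.fit(X_train, y_train)

# R-squared on the held-out split.
r2 = stack.score(X_test, y_test)
predictions = stack.predict(X_test)
```

Stacking is shown here because it lets the meta-learner weight each base model by how useful its out-of-fold predictions are; a voting or bagging ensemble could be swapped in with `VotingRegressor` or `BaggingRegressor` from the same module.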
Experimental results show improvements in prediction accuracy across Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R²) metrics. Combining SMOGN-based data balancing with ensemble methods substantially improves prediction performance. The proposed hybrid framework outperforms the compared alternatives in volume prediction and also provides an effective strategy for handling data imbalance and supporting model generalizability across diverse datasets.
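For reference, the three reported metrics can be computed as in the sketch below. It uses the standard definitions and only the Python standard library; the function name `regression_metrics` and the sample values are illustrative assumptions, not data from the thesis.

```python
def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R-squared) for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean Squared Error: average of squared residuals.
    mse = sum(e * e for e in errors) / n
    # Mean Absolute Error: average of absolute residuals.
    mae = sum(abs(e) for e in errors) / n

    # R-squared: 1 minus residual sum of squares over total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return mse, mae, r2


# Illustrative values only (not results from the thesis).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
mse, mae, r2 = regression_metrics(y_true, y_pred)
# mse = 0.125, mae = 0.25, r2 = 0.9
```

Lower MSE and MAE indicate smaller errors, while R² closer to one indicates that the model explains more of the variance in the target.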
The results underscore the importance of adapting models to the characteristics of the data and of addressing dataset imbalance. The framework provides a foundation for future research in domain adaptation, transfer learning, and real-time application development, and it extends the work toward real-world applications.