An Evaluation of Statistical Matching Methods: An Application on Turkey Income and Living Conditions Survey and Household Budget Survey
Date: 2022-08-04
Author: Özkan, Cengiz
Open access
Abstract
This dissertation evaluates the effectiveness of statistical matching methods from a comparative perspective. Because studies in the literature mostly focus on non-parametric micro methods, it addresses macro, micro, mixed, parametric and non-parametric methods in a holistic and comparative way, and observes the effects of different donor classes and of interventions in the sample size. In addition, it expands the procedures for selecting matching variables by including survey design variables and weights for the first time, and re-evaluates their effectiveness. By including these options in the processes, it also aims to compare the efficiency of matching across methods, to determine practical limitations and to test the issues that are open to intervention.
The selection of matching variables and the statistical matching methods were applied to the 2018 datasets of the Turkey Statistics on Income and Living Conditions and the Household Budget Survey, which have complex sample design features. After the survey data were harmonized, parametric, non-parametric and mixed methods were applied at the macro and micro levels, taking the breakdowns mentioned above into account, to produce the outputs. Among the non-parametric micro methods, the imputation procedure, random hot deck, rank hot deck and nearest neighbor distance hot deck were used.
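As a rough illustration of the nearest neighbor distance hot deck named above, the following minimal Python sketch fills in a variable observed only in a donor file by copying it from the donor record closest on the common matching variables. It is not the dissertation's implementation: the function name `nnd_hot_deck`, the variable names and all values are hypothetical, and the sketch assumes unweighted Euclidean distance with no scaling of variables, no donor classes and no survey weights.

```python
import math


def nnd_hot_deck(recipients, donors, match_vars, donor_var):
    """Return copies of the recipient records with donor_var taken
    from the nearest donor (hypothetical helper, illustration only)."""
    if not donors:
        raise ValueError("donor file must not be empty")
    fused = []
    for rec in recipients:
        # Euclidean distance computed on the common matching variables only.
        nearest = min(
            donors,
            key=lambda d: math.dist(
                [rec[v] for v in match_vars], [d[v] for v in match_vars]
            ),
        )
        # Copy the recipient and attach the donated value; inputs stay unchanged.
        fused.append({**rec, donor_var: nearest[donor_var]})
    return fused


# Toy records; every name and value here is invented for illustration.
recipient_file = [
    {"age": 30, "hh_size": 2, "income": 1000},
    {"age": 61, "hh_size": 4, "income": 700},
]
donor_file = [
    {"age": 28, "hh_size": 2, "consumption": 900},
    {"age": 65, "hh_size": 5, "consumption": 650},
    {"age": 45, "hh_size": 3, "consumption": 800},
]

fused = nnd_hot_deck(recipient_file, donor_file, ["age", "hh_size"], "consumption")
```

In this sketch a donor may be reused for several recipients. A constrained variant, standardized distances, or restricting the search to donor classes would each change which donor is selected.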
Statistical matching methods allow faster and lower-cost production of high-quality, timely data from existing sources such as administrative records and survey data. They also have the potential to contribute positively in terms of theoretical statistical approaches, for example by reducing the response burden and interviewer bias. The method is also used in demographic studies that aim to find the correlation between poverty and fertility. The results show that weighted and unweighted micro matching applications provide highly accurate and reliable estimates. Although limitations of the mixed methods regarding the number of observations were identified, these methods were observed to be effective in producing quality synthetic data. Parametric methods, on the other hand, did not yield results of the expected quality for data integration.