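Kişisel Verilerin Anonimleştirilmesinin İyileştirilmesine Yönelik Bir Model Geliştirilmesi ve E-Devlet Alanında Uygulanması (Development of a Model for Improving the Anonymization of Personal Data and Its Application in the E-Government Domain)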
Date
2019
Author
Afyonluoğlu, Mustafa
Embargo Period
Open access
Abstract
Today, the recognition that data generated by e-Government services is a valuable resource in many fields, such as statistics, research and development, artificial intelligence training, service improvement, and projection development, has increased the demand to obtain this data for processing. However, a significant part of this large volume of data consists of personal data. Privacy is protected by legislation, both internationally, as in the European Union's General Data Protection Regulation (GDPR), and in Turkey by the Law on the Protection of Personal Data No. 6698, and anonymization is required before personal data can be shared. Anonymization applies methods such as generalizing or masking certain parts of the data so that the data can no longer be associated with a single person. The best-known privacy models in this regard are k-anonymity (which provides privacy by ensuring that each individual falls in a group of at least k records sharing the same quasi-identifier values), l-diversity (which requires that each equivalence class has at least l well-represented values for each sensitive attribute), and t-closeness (which requires that the distribution of a sensitive attribute in any equivalence class is close to its distribution in the overall table, so that the distance between the two distributions is no more than a threshold t). All of these cause a certain level of data loss because of the generalization applied to the data, which reduces the expected benefit from the resulting data set.
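The three privacy models named above can be illustrated with a minimal Python sketch. It is not taken from the thesis and does not represent ADAM. The toy records, the column names (`age`, `zip`, `diagnosis`) and the function names are hypothetical. Two simplifications are assumed: l-diversity is measured in its simplest "distinct values" form, and the t-closeness distance is the total variation distance, which coincides with the Earth Mover's Distance only when all categorical values are treated as equally far apart.

```python
from collections import Counter, defaultdict

# Hypothetical, already-generalized toy table (not data from the thesis).
RECORDS = [
    {"age": "30-39", "zip": "061**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "061**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "061**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "064**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "064**", "diagnosis": "flu"},
]
QI = ["age", "zip"]  # assumed quasi-identifier columns


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return classes


def k_anonymity(records, quasi_identifiers):
    """Largest k the table satisfies: size of the smallest class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return min(len(group) for group in classes.values())


def l_diversity(records, quasi_identifiers, sensitive):
    """Distinct l-diversity: fewest distinct sensitive values in any class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in classes.values())


def t_closeness(records, quasi_identifiers, sensitive):
    """Smallest t the table satisfies under total variation distance."""
    overall = Counter(r[sensitive] for r in records)
    n = len(records)
    worst = 0.0
    for group in equivalence_classes(records, quasi_identifiers).values():
        local = Counter(r[sensitive] for r in group)
        # Half the L1 distance between the class and the overall distribution.
        dist = 0.5 * sum(abs(local[v] / len(group) - overall[v] / n) for v in overall)
        worst = max(worst, dist)
    return worst


if __name__ == "__main__":
    print("k =", k_anonymity(RECORDS, QI))
    print("l =", l_diversity(RECORDS, QI, "diagnosis"))
    print("t =", round(t_closeness(RECORDS, QI, "diagnosis"), 3))
```

On this toy table the smallest equivalence class holds two records, so the table is 2-anonymous. Coarser generalization would raise k at the cost of more information loss, which is the trade-off the abstract describes.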
In this doctoral study, an innovative, goal-oriented, adaptive and dynamic anonymization model (ADAM) is introduced that is applicable to any kind of data set, including e-Government data. The model minimizes the loss of data utility and provides large improvements in the number of fully suppressed records by taking the content of the data into account and adaptively creating records specific to groups. Since e-Government data sets are large in volume, the model must deliver the expected improvement at high data volumes. Therefore, to measure the level of improvement provided by this heuristic method, an application was developed that produces synthetic data in the field of health, one of the important application areas of e-Government. The ADAM algorithm was applied step by step to synthetic data sets ranging from 1,000 to 100,000 people, and it was shown that the proposed model can provide significant improvements over existing anonymization methods.