In today's world, with the development of information technology and global trends, it is becoming increasingly difficult to hide information, including information of a statistical nature. It is common practice to provide open access to statistical data so that any interested party can use it for research.

The main source of statistical data is the census, which, as M. Wood, press secretary of the United States Census Bureau, aptly put it, is the most expensive operation a state undertakes in peacetime. The largest census database in the world is that of the IPUMS-International project of the Minnesota Population Center (University of Minnesota, USA). It currently holds more than 560 million personal records from 79 countries and territories, which together account for over 80% of the world's population.

On July 28, 2014, the project database was expanded once again with census microdata from 8 countries: Ghana, Dominican Republic, Ecuador, Ireland, Liberia, Mali, Nigeria and Ukraine. As a result, a 10% household sample from the 2001 Ukrainian population census became available to researchers around the world for the first time.

Census data can contain information that uniquely identifies a person, such as personal data (name, passport number, etc.). Such information is typically removed from the database. In practice, however, this is not enough. For example, as shown by researcher L. Sweeney, 97% of the inhabitants of the American state of Massachusetts have a unique combination of full date of birth and zip code. Therefore, to protect information about individuals as fully as possible, much more sophisticated methods for ensuring the anonymity of individual data should be used.
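The re-identification risk described above can be illustrated with a minimal sketch. It only measures what share of records is unique on a chosen set of attributes; it is not the anonymization method applied to the census. The function name `unique_share`, the field names, and the toy records are all hypothetical.

```python
from collections import Counter


def unique_share(records, quasi_identifiers):
    """Return the fraction of records whose combination of
    quasi-identifier values occurs exactly once in the data set."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys) if keys else 0.0


# Hypothetical toy records with names already removed.
records = [
    {"birth_date": "1980-03-14", "zip": "02139"},
    {"birth_date": "1980-03-14", "zip": "02139"},
    {"birth_date": "1975-11-02", "zip": "02139"},
    {"birth_date": "1992-07-30", "zip": "02140"},
]

share = unique_share(records, ["birth_date", "zip"])
print(f"Share of uniquely identifiable records: {share:.0%}")
```

Even without names, every record counted here could be linked back to a single person by anyone who knows that person's date of birth and zip code.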

In addition to protecting information about individuals, the data must also be given an adequate level of protection for information about groups of individuals before publication. The importance of this problem can be illustrated by a simple example. If a data set contains each person's place of work and whether that person serves in the armed forces, one can derive the distribution of military personnel across the state's territory. Maxima in this distribution may reveal the location of, for example, a military base. To protect distributions of this kind, methods for ensuring the anonymity of group data should be used.
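The example with military personnel can be sketched in a few lines. This is only an illustration of the threat, under the assumption that each record carries a region of work and a military-service flag; it does not show the group anonymization method itself. The names `military_distribution`, `peak_regions`, `work_region`, and `military` are hypothetical.

```python
from collections import Counter


def military_distribution(records):
    """Count military personnel per region of work."""
    return Counter(r["work_region"] for r in records if r["military"])


def peak_regions(distribution):
    """Return the regions where the count reaches its maximum."""
    if not distribution:
        return []
    top = max(distribution.values())
    return sorted(region for region, n in distribution.items() if n == top)


# Hypothetical toy records: no names, only region and a service flag.
records = [
    {"work_region": "A", "military": True},
    {"work_region": "A", "military": True},
    {"work_region": "A", "military": False},
    {"work_region": "B", "military": True},
    {"work_region": "C", "military": False},
]

distribution = military_distribution(records)
print(peak_regions(distribution))
```

No individual is identified here, yet the peak in the aggregate distribution still points to a sensitive location, which is why individual anonymity alone does not protect the group.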

Data anonymization methods were used in preparing the Ukrainian census microdata. A leading role in this process was played by O.R. Chertov, Doctor of Technical Sciences, of the Department of Applied Mathematics of NTUU "KPI", who was the first to propose the idea of ensuring the anonymity of group data, and by a representative of his school, graduate student D.Yu. Tavrov. With the support of the State Statistics Service of Ukraine, they applied these methods of individual and group anonymization to the census data, thus providing a reliable level of protection.

For more details about the IPUMS-International project and access to the data of the 2001 Ukrainian population census, follow the link https://international.ipums.org/international

 

O.R. Chertov, Acting Head of the Department of Applied Mathematics