Synthetic microdata

Abstract: This licentiate thesis presents work on synthetic microdata. Its main purpose is to build a basis for further work on constructing (fabricating and reconstructing) synthetic populations, and its objectives are threefold. The first objective is to describe some of the most common methods used for constructing synthetic populations; data disturbance techniques and missing-data methods are described as well. The second objective is to test two different methods: the first uses a neural network to fabricate synthetic microdata, while the second uses reconstruction to generate a fictitious synthetic population of the future. The third objective is to identify topics for further research. The two tests lead to the same conclusion. The first test concerns reconstruction, using a well-documented method in which microdata units are adjusted until the population achieves selected desired distributions. In the second test, neural networks and optimization algorithms were used to fabricate a synthetic population. Both tests show that the synthetic microdata units are good at the unit level, but when the distributions are compared at the population level, significant differences from real populations are found. The conclusion is that the main problem in constructing synthetic populations is to combine microdata units in a way that results in a population with distributions similar to those of a real one. Other topics for further research include improved methods for fabricating synthetic populations, methods for generating spatial attributes, and additional statistical measures for evaluating the usability of synthetic populations.
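
The abstract does not say which reconstruction algorithm was tested. As a rough illustration of what "adjusting microdata units until the population achieves selected desired distributions" can look like, the sketch below uses iterative proportional fitting (raking) of unit weights. This is an assumption chosen for illustration, not the method of the thesis; the function names, attribute names, and target counts are all hypothetical.

```python
def rake_weights(units, targets, iterations=50):
    """Reweight units so weighted category totals approach the targets.

    units:   list of dicts, one per microdata unit (hypothetical layout)
    targets: {attribute: {category: desired_total}}
    Returns one weight per unit.
    """
    weights = [1.0] * len(units)
    for _ in range(iterations):
        for attr, desired in targets.items():
            # Current weighted total for each category of this attribute.
            totals = {}
            for unit, w in zip(units, weights):
                totals[unit[attr]] = totals.get(unit[attr], 0.0) + w
            # Scale every unit so its category total matches the target.
            for i, unit in enumerate(units):
                cat = unit[attr]
                weights[i] *= desired[cat] / totals[cat]
    return weights


def weighted_counts(units, weights, attr):
    """Weighted total per category of one attribute."""
    counts = {}
    for unit, w in zip(units, weights):
        counts[unit[attr]] = counts.get(unit[attr], 0.0) + w
    return counts


# Illustrative seed units and made-up target distributions.
units = [
    {"sex": "f", "age": "young"},
    {"sex": "f", "age": "old"},
    {"sex": "m", "age": "young"},
    {"sex": "m", "age": "old"},
]
targets = {
    "sex": {"f": 60, "m": 40},
    "age": {"young": 30, "old": 70},
}
weights = rake_weights(units, targets)
```

Matching each one-dimensional distribution in this way says nothing about how the attributes combine within the population, which is consistent with the abstract's point that units can look good individually while the population-level distributions still differ from a real population.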
