Classification of large pollen datasets using neural networks with application to mapping and modelling pollen data

University dissertation from Department of Geology, Lund University

Abstract: This thesis concerns the use of large pollen databases and their application to mapping and modelling past vegetation. Maps of past taxon distributions are generated, and classification techniques are used to compile maps of past woodland types. These visualisations of pollen data have applications in forest ecology and in modelling the impacts of climate change. Maps of the distribution limits of Picea abies in southern Scandinavia are compared with output from a bioclimatic model to explore distribution-climate relationships during the last 1500 years. Further, a classification technique is used to map the distributions of Danish forest types over the last 3000 years. Classification assigns each sample to a group or category with similar properties; the categories in this case are woodland types. The classification model is an artificial neural network trained on an entire database of actual pollen assemblages, which yields a model able to assign pollen samples to a woodland type. This model is then applied to the grid of interpolated fossil pollen assemblages to produce woodland history maps. Classification methods group the most similar samples, but at some point a decision has to be made on how many classes or groups to use. I have developed a method for choosing the number of classes that has the highest reproducibility. This is an objective, repeatable method for assessing the optimal number of clusters in a multivariate dataset.
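
The idea of choosing the number of classes by reproducibility can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or agreement measure the thesis uses, so everything below is an assumption made for illustration: plain k-means stands in for the classifier, the adjusted Rand index stands in for the reproducibility score, and the function names (kmeans_labels, adjusted_rand, reproducibility) and the synthetic two-dimensional data are hypothetical. Each candidate number of classes is scored by clustering several random subsamples and measuring how well the resulting groupings agree on the samples they share.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb


def kmeans_labels(points, k, rng, n_iter=50):
    """Plain Lloyd's k-means (a stand-in, not the thesis method); one label per point."""
    centres = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centre (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centres[c])))
            for p in points
        ]
        # Move each centre to the mean of its members; leave it alone if it has none.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def adjusted_rand(a, b):
    """Adjusted Rand index between two labelings of the same samples (needs >= 2 samples)."""
    pairs_both = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = pairs_a * pairs_b / comb(len(a), 2)
    maximum = 0.5 * (pairs_a + pairs_b)
    if maximum == expected:  # degenerate case, e.g. both labelings form a single group
        return 1.0
    return (pairs_both - expected) / (maximum - expected)


def reproducibility(points, k, n_runs=10, frac=0.8, seed=0):
    """Mean pairwise agreement between k-means runs on random subsamples."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        idx = rng.sample(range(len(points)), int(frac * len(points)))
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        runs.append(dict(zip(idx, labels)))
    scores = []
    for r1, r2 in combinations(runs, 2):
        # Compare two runs only on the samples that both of them clustered.
        shared = sorted(set(r1) & set(r2))
        scores.append(adjusted_rand([r1[i] for i in shared], [r2[i] for i in shared]))
    return sum(scores) / len(scores)


def demo():
    # Synthetic stand-in data: three Gaussian blobs, not real pollen assemblages.
    rng = random.Random(42)
    blobs = [(0, 0), (5, 5), (0, 5)]
    points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for cx, cy in blobs for _ in range(30)]
    for k in range(2, 7):
        print(k, round(reproducibility(points, k), 3))


if __name__ == "__main__":
    demo()
```

Under these assumptions, the candidate number of classes with the highest mean agreement would be the one retained. Comparing runs on subsamples rather than on the full dataset is one possible design choice; it tests whether the grouping survives changes in which samples are included, not only changes in the random starting centres.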