Science mapping and research evaluation: a novel methodology for creating normalized citation indicators and estimating their stability

Abstract: The purpose of this thesis is to contribute to the methodology at the intersection of relational and evaluative bibliometrics. Experimental investigations are presented that address the question of how estimates of the subject similarity between documents can most successfully be produced. The results of these investigations are then explored in the context of citation-based research evaluation, in an effort to enhance the citation normalization methods used to compare subject-disparate documents with respect to their relative impact or perceived utility. The thesis also proposes and explores an approach for revealing the uncertainty and stability (or lack thereof) associated with different kinds of citation indicators. This proposal is motivated by the specific nature of the bibliographic data and of the data collection process used in citation-based evaluation studies. The results of these investigations suggest that similarity-detection methods that take a global view of the problem of identifying similar documents are more successful than conventional methods that are more local in scope. These results are important for all applications that require estimates of subject similarity between documents. Here, these insights are adopted to create a novel citation normalization approach that, compared to current best practice, is more in tune with the idea of controlling for subject matter when thematically different documents are assessed with respect to impact or perceived utility. The normalization approach is flexible with respect to the size of the normalization baseline and enables a fuzzy partition of the scientific literature. It is shown that this approach is more successful than currently applied normalization approaches in reducing the variability in the observed citation distribution that stems from variability in the articles' addressed subject matter.
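The idea of normalizing against a fuzzy partition can be illustrated with a minimal sketch. This is not the thesis' exact algorithm; the cluster labels, membership weights, and mean citation rates below are invented for illustration. The core idea assumed here is that a paper's citation count is divided by the membership-weighted average citation rate of the clusters it partially belongs to:

```python
# Illustrative sketch (not the thesis' exact method): normalizing a paper's
# citation count against a fuzzy partition of the literature. All numbers
# and cluster labels are hypothetical.

def fuzzy_normalized_citations(citations, memberships, cluster_means):
    """Divide a paper's citation count by the membership-weighted
    average citation rate of the clusters it partially belongs to.

    memberships: dict mapping cluster id -> membership weight (sums to 1)
    cluster_means: dict mapping cluster id -> mean citations per paper
    """
    expected = sum(mu * cluster_means[k] for k, mu in memberships.items())
    return citations / expected

# A paper cited 12 times, belonging 70% to cluster "A" and 30% to "B".
memberships = {"A": 0.7, "B": 0.3}
cluster_means = {"A": 10.0, "B": 20.0}  # hypothetical field baselines
score = fuzzy_normalized_citations(12, memberships, cluster_means)
# expected baseline = 0.7 * 10.0 + 0.3 * 20.0 = 13.0, so score = 12 / 13
```

A crisp (non-fuzzy) normalization is the special case where one membership weight is 1 and all others are 0; allowing fractional memberships is what lets the baseline track the paper's actual subject mixture.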
In addition, the suggested approach can enhance the interpretability of normalized citation counts. Finally, the proposed method for assessing the stability of citation indicators stresses that small alterations, which could be artifacts of the data collection and preparation steps, can have a significant influence on the picture painted by the citation indicator. Providing stability intervals around derived indicators therefore prevents unfounded conclusions that could otherwise have unwanted policy implications. Together, the new normalization approach and the method for assessing the stability of citation indicators have the potential to enable fairer bibliometric evaluation exercises and more cautious interpretations of citation indicators.
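One common way to obtain such a stability interval, sketched here as an assumption rather than as the thesis' specific procedure, is to resample the unit's publication set with replacement, mimicking small perturbations of the kind that data collection and preparation can introduce, and to report a percentile interval for the resulting indicator values. The score list below is invented:

```python
# Illustrative sketch (assumed resampling approach, not the thesis' exact
# method): a stability interval for a mean normalized citation score,
# obtained by bootstrap resampling of the publication set.
import random

def stability_interval(scores, n_resamples=2000, alpha=0.05, seed=42):
    """Return an approximate (1 - alpha) percentile interval for the
    mean of `scores` under resampling with replacement."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    means = sorted(
        sum(rng.choices(scores, k=len(scores))) / len(scores)
        for _ in range(n_resamples)
    )
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical normalized citation scores for one unit's publications.
scores = [0.4, 0.9, 1.1, 0.2, 3.5, 0.8, 1.0, 0.6]
low, high = stability_interval(scores)
```

If the interval is wide, a point estimate such as the plain mean of the scores overstates what the data support, which is exactly the kind of unfounded conclusion the stability intervals are meant to prevent.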