Text Representations and Explainability for Political Science Applications

Abstract: This work explores the utility of natural language processing (NLP) approaches for the study of political behavior by examining two main aspects: representation and explainability. We investigate how current representation approaches capture politically relevant signals in a proportional representation system. In particular, we test static word embeddings trained by transfer learning. We find that some signals in the embedding spaces can be validated against domain knowledge; however, multiple factors affect the performance and stability of the results, such as pre-training and the frequency of terms. Due to the complexity of current NLP techniques, interactions between the model and the political scientist are limited, which can reduce the utility of such modeling. We therefore turn to explainability and develop a novel approach for explaining a text classifier. Our method extracts relevant features for an entire prediction class and can sort them by their relevance to the political domain. Overall, we find that current NLP methods are capable of capturing some politically relevant signals from text, but more work is needed to align the two fields. We conclude that the next step in this work should focus on investigating frameworks such as hybrid models and causality, which can improve both representation capabilities and the interaction between model and social scientist.
