Psychographic traits identification based on Political Ideology: An Author Analysis Study on Spanish Politicians' tweets posted in 2020
People are usually more reluctant to follow advice and directions from politicians who do not share their ideology. In extreme cases, people can be heavily biased in favour of one political party while blindly disagreeing with the others, which leads to irrational decision making and can put lives at risk when certain recommendations from the authorities are ignored. Therefore, considering political ideology as a psychographic trait can improve political micro-targeting by helping public authorities and local governments to design better communication policies during crises. In this work, we explore the reliability of determining psychographic traits concerning political ideology. Our contribution is twofold. On the one hand, we release the PoliCorpus-2020, a dataset composed of tweets posted by Spanish politicians in 2020. On the other hand, we conduct two author analysis tasks with this dataset: an author profiling task to extract demographic and psychographic traits, and an author attribution task to determine the author of an anonymous text in the political domain. Both experiments are evaluated with several neural network architectures based on explainable linguistic features, statistical features, and state-of-the-art transformers. In addition, we test whether the neural network models can be extrapolated to detect the political ideology of citizens who are not politicians. Our results indicate that the linguistic features are good indicators for identifying fine-grained political affiliation, that they boost the performance of neural network models when combined with embedding-based features, and that they preserve relevant information when the models are tested on citizens who are not politicians. We also found that lexical and morphosyntactic features are more effective for author profiling, whereas stylometric features are more effective for author attribution.
```config/``` Contains the configuration of the PoliCorpus-2020
```code/``` Contains the scripts
```assets/``` Contains the assets (dataset, features, models, evaluations) for each dataset
```embeddings/``` Contains pretrained word embedding models