Words like “leadership” and “motivation” are constructs. This means that they are constructed: we invented them. Like other constructs such as shareholding companies, car brands, and justice, they are only real if we act as if they are. The fact that they are constructed does not mean that such phenomena are nonsense. On the contrary, like most other inventions they serve useful purposes. They remain linguistic constructions, though. We can now show that the results of questionnaires, frequently used in research on leadership, tend to be predictable before we ask people to fill them out. This is because they mostly tell us how people talk about leadership and motivation, and not so much about what actually happens in practice. Using digital text algorithms, we can now show how these words are constructed, and how research on leadership tends to be research on language more than anything else.
Most of these research articles are published as Open Access, which means that you may download them for free. Here are a few of them if you want to read more about it:
Can we trust what surveys tell us about leadership?
Around the year 2012, my friend Kai Larsen and I started wondering about the data stemming from Likert-scale surveys. In 2014, we published this article, demonstrating how most survey studies on leadership are picking up self-evident data patterns. The relationships in the data are given a priori through language. It means that we can use computers to predict what people will answer. Here is the publication:
Arnulf, J. K., Larsen, K. R., Martinsen, O. L., & Bong, C. H. (2014). Predicting survey responses: how and why semantics shape survey statistics on organizational behaviour. PLoS ONE, 9(9), e106361. doi:10.1371/journal.pone.0106361
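To get a feel for the idea, here is a minimal sketch in Python. It is not the algorithm used in the article: it assumes a plain word-count cosine similarity as a stand-in for the semantic algorithms, and the three survey items are invented for illustration. It only shows how a computer can score the overlap in meaning between questions before anyone has answered them.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical survey items, invented for this illustration only.
ITEMS = {
    "q1": "My manager gives me clear goals for my work",
    "q2": "My manager sets clear goals for the work I do",
    "q3": "I often think about quitting my job",
}


def bag_of_words(text):
    """Lower-case the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0 = no shared words)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_table(items):
    """Similarity score for every pair of items, computed from text alone."""
    vectors = {key: bag_of_words(text) for key, text in items.items()}
    return {
        (i, j): cosine(vectors[i], vectors[j])
        for i, j in combinations(items, 2)
    }


table = similarity_table(ITEMS)
for pair, score in sorted(table.items()):
    print(pair, round(score, 2))
```

In this toy example the two items about goal setting share far more wording than either shares with the item about quitting. Counting words is a crude assumption; it stands in for text algorithms that capture meaning rather than mere word overlap.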
We may sometimes predict how people will score before they respond!
After publishing this, we could demonstrate another weird phenomenon as well. If the structures in the data can be known in advance, before asking anyone, it should be possible to guess what people will answer before they actually do! This is a bit more complicated, but in the article that follows, we have shown how it could be possible in principle. If we know a person’s first answers to a survey, we can use semantic algorithms to guess pretty well what the rest of the answers might be. The article is here:
But surveys on leadership and organizational behavior may be culture blind:
If the text algorithms can predict the data structures in one language, it is because the statistics simply reflect the meaning of the questions. Therefore, if the questionnaire is correctly translated (and the correlations are indeed due to semantics), the text algorithms will predict across languages. We tested this out among Chinese, Pakistanis, Norwegians, Germans, native English speakers and a bunch of other people and this is exactly what we found: The algorithms predicted the bulk of the statistics across all languages. There was virtually nothing left that could count as “culture”. Leadership surveys (and similar instruments) will be culture blind if they are based on semantic relationships:
Arnulf, J. K., & Larsen, K. R. (2020). Culture blind leadership research: How semantically determined survey data may fail to detect cultural differences. Frontiers in Psychology, 11(176). doi:10.3389/fpsyg.2020.00176
People in different job types are equally motivated! Scores are similar because they read the questions differently:
We let 399 people from 18 very different job types fill out a questionnaire on motivation. According to previous theories, people with more autonomy, feedback and task variation should be more motivated than others. Moreover, pay for performance has been accused of destroying “intrinsic motivation”, the pleasure of doing work for its own sake. We found only indications of this. In our numbers, respondents such as priests, sex workers, CEOs and soldiers were all predominantly intrinsically motivated. They were less concerned with making money. They were also all committed to their organizations, and working with high effort and quality. When we look at this using semantic algorithms, it appears that people in different jobs understand the questions in different ways. Different people in different situations may respond with the same score levels because they interpret the survey differently. This has consequences for how to compare motivational levels across job types. The full article is here:
Arnulf, J. K., Nimon, K., Larsen, K. R., Hovland, C. V., & Arnesen, M. (2020). The Priest, the Sex Worker, and the CEO: Measuring Motivation by Job Type. Frontiers in Psychology, 11, 1321. doi:10.3389/fpsyg.2020.01321
The statistics derived from some types of surveys do not actually reflect what the questions are about:
The most common understanding of questionnaires is that the responses reflect people’s attitudes, or rather their attitude strength. Someone responding with the score 5 (indicating “strongly agree”) on a question is taken to display a stronger attitude than someone responding with a 1 (indicating “do not agree at all”). The most common type of statistics applied to such responses explores how much the responses to the various questions co-vary. For example, one may want to see if people who are satisfied with their managers are also less likely to quit their job. In this study from 2018 we found something strange. When the responses are semantically determined, the attitude strength is actually filtered out of the statistics. We could show that only the semantic relationships remained in the statistics from the respondents. Their attitude strength was gone. This is possibly the most difficult of the articles to understand, but also the one with the most problematic philosophical implications. The original is here (unfortunately not as an open access publication):
Arnulf, J. K., Larsen, K. R., Martinsen, O. L., & Egeland, T. (2018). The failing measurement of attitudes: How semantic determinants of individual survey responses come to replace measures of attitude strength. Behavior Research Methods, 50(6), 2345-2365. doi:10.3758/s13428-017-0999-y
The development of language about leadership and motivation can be traced across time and social groups:
To the extent that survey results are predictable, it will be because of their embeddedness in language. We can therefore try to trace how constructs like leadership, motivation and results have emerged over the years and among groups of people. The following article, published in 2018, showed how the development of workplace-related language also shapes responses to surveys on leadership:
Arnulf, J. K., Larsen, K. R., & Martinsen, Ø. L. (2018). Semantic algorithms can detect how media language shapes survey responses in organizational behaviour. PLoS ONE, 13(2), 1-26. doi:10.1371/journal.pone.0207643
People struggle to distinguish between leaders and heroes because these concepts share so much meaning:
Our ideas about leadership are so strongly determined by language that we tend to expect things of leaders simply because of the associations that the words evoke. One funny (or ominous) effect of this is how quickly we come to believe that leaders are a kind of hero, or that heroes should also be leaders. Both ideas lead to exaggerated expectations about leaders. This in turn seems to make most people disappointed by real flesh-and-blood leaders. Our own bosses are usually disappointingly different from the linguistic stereotype. You can read about it in this article:
Arnulf, J. K., & Larsen, K. R. (2015). Overlapping semantics of leadership and heroism: Expectations of omnipotence, identification with ideal leaders and disappointment in real managers. Scandinavian Psychologist, 2(e3). doi:10.15714/scandpsychol.2.e3
This means that we can use digital algorithms to break free from our own cognitive limitations:
We obviously do not know what we already know. That is why we can compute the relationships of survey statistics without asking anyone, and be surprised by it. The American philosopher Daniel Dennett says about human speakers that we are “competent without comprehension”: Most of us are able to speak a language, but cannot explain exactly how we do it. Language therefore contains a lot of knowledge that we could possibly use, but that we are unable to exploit consciously. In this way, we can get lost in our own linguistic constructions of the world. Language is like a huge labyrinth of words and meaningful expressions where we can “discover” insights that were accessible in there all the time. Jan Smedslund is a Norwegian professor of psychology who has worked on this for decades. He has warned us that much social science is unable to escape this labyrinth. In his words, we are doing “pseudo-empirical” research, which mostly re-discovers what is necessarily true given the semantic premises in language. I have written a chapter in a book that pays homage to a lifetime of Smedslund’s work. In this chapter, I try to show how the text algorithms may offer a way out of the labyrinth. We can possibly use the algorithms to explore the limitations of our own linguistic constructs. Here I lean a bit on the philosophers Gottlob Frege, Ludwig Wittgenstein, and Bertrand Russell. The book chapter is available here (regrettably not as open access):
Arnulf, J. K. (2020). Wittgenstein’s revenge: How semantic algorithms can help survey research escape Smedslund’s labyrinth. In T. G. Lindstad, E. Stänicke, & J. Valsiner (Eds.), Respect for Thought; Jan Smedslund’s Legacy for Psychology (pp. 285-307). Cham: Springer.
Should you want to try out the semantic method itself, we explain it here:
This is a methods article, explaining the semantic algorithms and how to use them. A previous study on training from Human Resource Development is given as an example. You can also find data and computer syntax to play around with, so that you can do it yourself:
Arnulf, J. K., Larsen, K., & Dysvik, A. (2018). Measuring Semantic Components in Training and Motivation: A Methodological Introduction to the Semantic Theory of Survey Response. Human Resource Development Quarterly, 30(1), 17-38. doi:10.1002/hrdq.21324
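If you just want the gist before opening the article, the basic check can be sketched in a few lines of Python. This is not the syntax supplied with the article. The response matrix and the semantic similarity scores below are made-up numbers, and the similarity scores are simply assumed to come from some text algorithm. The sketch computes the observed correlations between items and then asks how closely they follow the semantic scores.

```python
import math
from itertools import combinations

# Made-up Likert responses (1-5): one row per respondent, one column per item.
RESPONSES = [
    [5, 4, 1, 2],
    [4, 5, 2, 1],
    [2, 1, 4, 5],
    [1, 2, 5, 4],
    [3, 3, 3, 3],
    [4, 4, 2, 2],
]

# Made-up semantic similarity for each item pair, assumed to have been
# produced by a text algorithm from the item wording alone.
SEMANTIC = {
    (0, 1): 0.80, (0, 2): 0.20, (0, 3): 0.10,
    (1, 2): 0.15, (1, 3): 0.20, (2, 3): 0.85,
}


def pearson(x, y):
    """Pearson correlation of two equally long sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def observed_correlations(responses):
    """Correlation between every pair of item columns in the response matrix."""
    columns = list(zip(*responses))
    return {
        (i, j): pearson(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }


observed = observed_correlations(RESPONSES)
pairs = sorted(SEMANTIC)
fit = pearson([SEMANTIC[p] for p in pairs], [observed[p] for p in pairs])
print(f"semantic scores vs. observed correlations: r = {fit:.2f}")
```

A high value here would mean that the pattern in the answers could largely have been anticipated from the wording of the questions. With real data, the semantic scores would of course have to come from an actual text algorithm rather than being typed in by hand.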