Measuring the Similarities of Twitter Hashtags for Agriculture in the Czech Language

DOI 10.7160/aol.2019.110410
No 4/2019, December
pp. 105-112

Sabou, J. P., Cihelka, P., Ulman, M. and Klimešová, D. (2019) “Measuring the Similarities of Twitter Hashtags for Agriculture in the Czech Language”, AGRIS on-line Papers in Economics and Informatics, Vol. 11, No. 4, pp. 105-112. ISSN 1804-1930. DOI 10.7160/aol.2019.110410.


Our paper presents the first analysis of Czech Twitter content in the context of agriculture. We performed a textual analysis of more than 240,000 tweets posted between 2014 and 2019 under hashtags that, according to Google Trends, were the most trending and related to Czech agriculture, such as #dotace, #repka, or #bionafta, in both Czech and English. Besides descriptive statistics of the tweet dataset, we visualized keyword correlations, which revealed a strong focus of the discourse on rapeseed, biofuel and the prime minister Andrej Babiš. Owing to the inherent political context of the given hashtags, we found spikes in topics that followed public attention to those topics in the mass media. We also found several accounts that produce high traffic for certain hashtags in Czech, yet are located abroad. Consistent with other studies, a high proportion of tweets was generated by unverified accounts that might be bots (automated accounts). We propose conducting a semantic analysis of a broader dataset covering the main social media platforms in the Czech Republic.


Agriculture, Twitter, Czech language, word occurrence, descriptive statistics.


  1. Bordag, S. (2008) “A comparison of co-occurrence and similarity measures as simulations of context”, International Conference on Intelligent Text Processing and Computational Linguistics, pp. 52-63. ISSN 0302-9743. DOI 10.1007/978-3-540-78135-6_5.
  2. Campoy, A. (2019) “More than 60% of Donald Trump’s Twitter followers look suspiciously fake”. [Online]. Available: [Accessed: 28 Nov. 2019].
  3. Carré, P. and Pouzet, A. (2014) “Rapeseed market, worldwide and in Europe”, Oilseeds and fats, Crops and Lipids, Vol. 21, No. 1. E-ISSN 2257-6614, ISSN 2272-6977. DOI 10.1051/ocl/2013054.
  4. Červenková, E., Šimek, P., Vogeltanzová, T. and Stočes, M. (2011) “Social networks as an integration tool in rural areas–agricultural enterprises of the Czech Republic”, AGRIS on-line Papers in Economics and Informatics, Vol. 3, No. 1, pp. 53-60. ISSN 1804-1930.
  5. Church, K. W. and Hanks, P. (1990) “Word association norms, mutual information, and lexicography”, ACL '89: Proceedings of the 27th annual meeting on Association for Computational Linguistics, pp. 76-83. DOI 10.3115/981623.981633.
  6. Dickerson, J. P., Kagan, V. and Subrahmanian, V. S. (2014) “Using sentiment to detect bots on twitter: Are humans more opinionated than bots?”, In Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 620-627. IEEE Press. E-ISBN 978-1-4799-5877-1. DOI 10.1109/ASONAM.2014.6921650.
  7. Jansen, B. J., Zhang, M., Sobel, K. and Chowdury, A. (2009) “Twitter power: Tweets as electronic word of mouth”, Journal of the American society for information science and technology, Vol. 60, No. 11, pp. 2169-2188. E-ISSN 2330-1643. DOI 10.1002/asi.21149.
  8. Kołodziejczak, A. and Kossowski, T. (2011) “Diversification of farming systems in Poland in the years 2006-2009”, Quaestiones Geographicae, Vol. 30, No. 2, pp. 49-56. ISSN 0137-477X. DOI 10.2478/v10117-011-0017-x.
  9. Lund, K. and Burgess, C. (1996) “Producing high-dimensional semantic spaces from lexical co-occurrence”, Behavior research methods, instruments, & computers, Vol. 28, No. 2, pp. 203-208. E-ISSN 1554-3528. DOI 10.3758/BF03204766.
  10. Mazurek, G., Korzyński, P. and Górska, A. (2019) “Social Media in the Marketing of Higher Education Institutions in Poland: Preliminary Empirical Studies”, Entrepreneurial Business and Economics Review, Vol. 7, No. 1, pp. 117-133. E-ISSN 2353-8821, ISSN 2353-883X. DOI 10.15678/EBER.2019.070107.
  11. Miller, G. A. and Charles, W. G. (1991) “Contextual correlates of semantic similarity”, Language and cognitive processes, Vol. 6, No. 1, pp. 1-28. E-ISSN 2327-3801, ISSN 2327-3798. DOI 10.1080/01690969108406936.
  12. Orhan, M. A. (2017) “The Evolution of the Virtuality Phenomenon in Organizations: A Critical Literature Review”, Entrepreneurial Business and Economics Review, Vol. 5, No. 4, pp. 171-188. E-ISSN 2353-8821, ISSN 2353-883X. DOI 10.15678/EBER.2017.050408.
  13. Ozdikis, O., Senkul, P. and Oguztuzun, H. (2012) “Semantic expansion of hashtags for enhanced event detection in Twitter”, Conference: VLDB Workshop on Online Social Systems (WOSS 2012), Istanbul. DOI 10.1109/ASONAM.2012.14.
  14. Pehe, J. (2018) “Czech Democracy Under Pressure”, Journal of Democracy, Vol. 29, No. 3, pp. 65-77. E-ISSN 1045-5736, ISSN 1086-3214. DOI 10.1353/jod.2018.0045.
  15. Reiff, M., Surmanová, K., Balcerzak, A. P. and Pietrzak, M. B. (2016) “Multiple Criteria Analysis of European Union Agriculture”, Journal of International Studies, Vol. 9, No. 3, pp. 62-74. E-ISSN 2071-8330, ISSN 2306-3483. DOI 10.14254/2071-8330.2016/9-3/5.
  16. Spence, D. P. and Owens, K. C. (1990) “Lexical co-occurrence and association strength”, Journal of Psycholinguistic Research, Vol. 19, No. 5, pp. 317-330. E-ISSN 1573-6555, ISSN 0090-6905. DOI 10.1007/BF01074363.
  17. Svatoš, M. and Smutka, L. (2009) “Influence of the EU enlargement on the agrarian foreign trade development in member states”, Agricultural Economics (AGRICECON), Vol. 55, No. 5, pp. 233-249. ISSN 0139-570X. DOI 10.17221/34/2009-AGRICECON.
  18. Turney, P. D. (2001) “Mining the web for synonyms: PMI-IR versus LSA on TOEFL”, Proceedings of the 12th European Conference on Machine Learning, pp. 491-502. Springer, Berlin, Heidelberg. DOI 10.1007/3-540-44795-4_42.
  19. Vaněk, J., Šimek, P., Vogeltanzová, T., Červenková, E. and Jarolímek, J. (2010) “ICT in Agricultural Enterprises in the Czech Republic–Exploration 2010”, AGRIS on-line Papers in Economics and Informatics, Vol. 2, No. 3, pp. 69-75. ISSN 1804-1930.
  20. Wald, R., Khoshgoftaar, T. M., Napolitano, A. and Sumner, C. (2013) “Predicting susceptibility to social bots on twitter”, 14th International Conference on Information Reuse & Integration (IRI), IEEE Computer Society, pp. 6-13.
  21. Wątróbski, J., Jankowski, J. and Ziemba, P. (2016) “Multistage Performance Modelling in Digital Marketing Management”, Economics and Sociology, Vol. 9, No. 2, pp. 101-125. ISSN 2071-789X. DOI 10.14254/2071-789X.2016/9-2/7.
  22. Xu, Z., Ru, L., Xiang, L. and Yang, Q. (2011) “Discovering user interest on twitter with a modified author-topic model”, Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, Vol. 1, pp. 422-429. DOI 10.1109/WI-IAT.2011.47.
