Browsing by Author "Schoonwinkel, Petrus"
- Classification of social media event-discussions using interaction patterns : a social network analysis approach (Stellenbosch : Stellenbosch University, 2020-12)
Schoonwinkel, Petrus; Cornelissen, Laurenz Aldu; Parry, Douglas A.; Stellenbosch University. Faculty of Arts and Social Sciences. Dept. of Information Science.
ENGLISH SUMMARY: This thesis uses social network analysis to explore the classification of social media discussions, utilising network structure derived from interactions on Twitter, while requiring minimal domain knowledge. In academia and industry, researchers strive to understand the patterns of interaction between actors on social media platforms, and how their actions may relate to particular events, topics, network characteristics, personalities and characters, among other factors. The literature shows that researchers in a wide range of disciplines lack the tools to classify a variety of event-discussions. Because topics of interest to researchers can overlap on social media, and users are often engaged in a multitude of topics simultaneously, an approach to classification is needed that requires minimal prior domain knowledge of the contents of the datasets. This study is a proof of concept for the use of network metrics to characterise and classify a diverse set of events that were discussed on social media. To classify social media data, one can utilise unsupervised machine learning methods. The literature shows that a multitude of clustering methods for social media have been explored, in multimedia, network, textual and other contexts. However, only a limited number of approaches to classifying social media data, specifically Twitter data, in terms of network structure have been explored.
This study does not aim to replace those methods but to add to the array of tools that researchers, both in academia and in industry, can use to maximise the value obtained from social media data. To obtain metrics with which to perform classification, a novel approach to modelling interactions in the data source, Twitter, was developed, and a set of network measures and data descriptors that characterise the data was explored. The network measures and data descriptors were subjected to dimensionality reduction to account for covariance in the measurements and to evaluate the contribution of each network measure, in order to expand the literature on what these measures capture in the context of this study. The resulting principal components were used to classify the discussions of diverse events, and the quality and quantity of the clusters were evaluated. Finally, a set of tests and criteria was defined with which the research question was addressed. The study found that the approach produced an optimal number of clusters with reasonable structure quality without requiring any domain knowledge to produce them. Although the method proposed in this study is effective in finding underlying patterns and similarities, it mainly serves to point researchers in the right direction; more detailed analysis is necessary for definite conclusions and labelled categorisation. The study recognises the prior work performed in classifying social media data and recommends that future work include a wide variety of user features, sentiment, topic, and network measures. Furthermore, the study can be expanded upon by testing alternative dimensionality reduction and clustering methods at each stage of the proposed approach. The study furthered the understanding of classifying social media data in terms of social network analysis and of the various network measures and data descriptors that were discussed.
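
As a rough illustration of what "network measures that characterise the data" can look like, the sketch below computes a few simple descriptors from a list of directed interactions. It is not the thesis's interaction model or its actual set of measures, which the summary does not specify: the function name `network_measures`, the choice of density, reciprocity and in-degree concentration, and the toy edge lists are all assumptions made for illustration.

```python
from collections import defaultdict


def network_measures(edges):
    """Compute simple descriptors of a directed interaction network.

    `edges` is an iterable of (source_user, target_user) pairs, for example
    one pair per reply, retweet or mention (a hypothetical input format).
    Self-loops and duplicate pairs are ignored.
    """
    edge_set = {(u, v) for u, v in edges if u != v}
    nodes = {user for edge in edge_set for user in edge}
    n, m = len(nodes), len(edge_set)
    if m == 0:
        return {"nodes": n, "edges": 0, "density": 0.0,
                "reciprocity": 0.0, "max_in_degree_share": 0.0}

    in_degree = defaultdict(int)
    for _, target in edge_set:
        in_degree[target] += 1

    # An edge counts as reciprocated when the reverse edge also exists.
    reciprocated = sum(1 for u, v in edge_set if (v, u) in edge_set)

    return {
        "nodes": n,
        "edges": m,
        # Share of all possible directed edges that are present.
        "density": m / (n * (n - 1)),
        # Share of edges that are answered in the opposite direction.
        "reciprocity": reciprocated / m,
        # Share of all edges pointing at the single most-addressed user.
        "max_in_degree_share": max(in_degree.values()) / m,
    }


# Two toy discussions (invented data): a broadcast-like pattern where
# everyone addresses one account, and a back-and-forth conversation.
broadcast = [("a", "hub"), ("b", "hub"), ("c", "hub")]
conversation = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]

for name, edges in [("broadcast", broadcast), ("conversation", conversation)]:
    print(name, network_measures(edges))
```

Each discussion is reduced to one row of numbers that uses no knowledge of what was being discussed. Rows of this kind are the sort of input that dimensionality reduction and clustering, as described in the summary, would then operate on.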