Analysing ranking algorithms and publication trends on scholarly citation networks

dc.contributor.advisor: Visser, Willem (en_ZA)
dc.contributor.advisor: Geldenhuys, Jaco (en_ZA)
dc.contributor.author: Dunaiski, Marcel Paul (en_ZA)
dc.contributor.other: Stellenbosch University. Faculty of Science. Department of Mathematical Sciences. (en_ZA)
dc.date.accessioned: 2015-01-13T11:50:27Z
dc.date.available: 2015-01-13T11:50:27Z
dc.date.issued: 2014-12 (en_ZA)
dc.description: Thesis (MSc)--Stellenbosch University, 2014. (en_ZA)
dc.description.abstract: ENGLISH ABSTRACT: Citation analysis is an important tool in the academic community. It can aid universities, funding bodies, and individual researchers to evaluate scientific work and direct resources appropriately. With the rapid growth of the scientific enterprise and the increase of online libraries that include citation analysis tools, the need for a systematic evaluation of these tools becomes more important. The research presented in this study deals with scientific research output, i.e., articles and citations, and how they can be used in bibliometrics to measure academic success. More specifically, this research analyses algorithms that rank academic entities such as articles, authors and journals to address the question of how well these algorithms can identify important and high-impact entities. A consistent mathematical formulation is developed on the basis of a categorisation of bibliometric measures such as the h-index, the Impact Factor for journals, and ranking algorithms based on Google’s PageRank. Furthermore, the theoretical properties of each algorithm are laid out. The ranking algorithms and bibliometric methods are computed on the Microsoft Academic Search citation database which contains 40 million papers and over 260 million citations that span across multiple academic disciplines. We evaluate the ranking algorithms by using a large test data set of papers and authors that won renowned prizes at numerous Computer Science conferences. The results show that using citation counts is, in general, the best ranking metric. However, for certain tasks, such as ranking important papers or identifying high-impact authors, algorithms based on PageRank perform better. As a secondary outcome of this research, publication trends across academic disciplines are analysed to show changes in publication behaviour over time and differences in publication patterns between disciplines. (en_ZA)
dc.description.abstract: AFRIKAANSE OPSOMMING: Sitasiesanalise is ’n belangrike instrument in die akademiese omgewing. Dit kan universiteite, befondsingsliggams en individuele navorsers help om wetenskaplike werk te evalueer en hulpbronne toepaslik toe te ken. Met die vinnige groei van wetenskaplike uitsette en die toename in aanlynbiblioteke wat sitasieanalise insluit, word die behoefte aan ’n sistematiese evaluering van hierdie gereedskap al hoe belangriker. Die navorsing in hierdie studie handel oor die uitsette van wetenskaplike navorsing, dit wil sê, artikels en sitasies, en hoe hulle gebruik kan word in bibliometriese studies om akademiese sukses te meet. Om meer spesifiek te wees, hierdie navorsing analiseer algoritmes wat akademiese entiteite soos artikels, outeers en journale gradeer. Dit wys hoe doeltreffend hierdie algoritmes belangrike en hoë-impak entiteite kan identifiseer. ’n Breedvoerige wiskundige formulering word ontwikkel uit ’n versameling van bibliometriese metodes soos byvoorbeeld die h-indeks, die Impak Faktor vir journaale en die rang-algoritmes gebaseer op Google se PageRank. Verder word die teoretiese eienskappe van elke algoritme uitgelê. Die rang-algoritmes en bibliometriese metodes gebruik die sitasiedatabasis van Microsoft Academic Search vir berekeninge. Dit bevat 40 miljoen artikels en meer as 260 miljoen sitasies, wat oor verskeie akademiese dissiplines strek. Ons gebruik ’n groot stel toetsdata van dokumente en outeers wat bekende pryse op talle rekenaarwetenskaplike konferensies gewen het om die rang-algoritmes te evalueer. Die resultate toon dat die gebruik van sitasietellings, in die algemeen, die beste rangmetode is. Vir sekere take, soos die gradeering van belangrike artikels, of die identifisering van hoë-impak outeers, presteer algoritmes wat op PageRank gebaseer is egter beter. ’n Sekondêre resultaat van hierdie navorsing is die ontleding van publikasie tendense in verskeie akademiese dissiplines om sodoende veranderinge in publikasie gedrag oor tyd aan te toon en ook die verskille in publikasie patrone uit verskillende dissiplines uit te wys. (af_ZA)
dc.format.extent: xii, 128 p. : ill.
dc.identifier.uri: http://hdl.handle.net/10019.1/96106
dc.language.iso: en_ZA (en_ZA)
dc.publisher: Stellenbosch : Stellenbosch University (en_ZA)
dc.rights.holder: Stellenbosch University (en_ZA)
dc.subject: Scholarly periodicals -- Ratings and rankings (en_ZA)
dc.subject: Bibliometrics (en_ZA)
dc.subject: Citation analysis (en_ZA)
dc.subject: Algorithms (en_ZA)
dc.subject: UCTD (en_ZA)
dc.subject: Scholarly citation networks (en_ZA)
dc.subject: Dissertations -- Mathematics (en_ZA)
dc.subject: Theses -- Mathematics (en_ZA)
dc.subject: Ranking and selection (Statistics) (en_ZA)
dc.title: Analysing ranking algorithms and publication trends on scholarly citation networks (en_ZA)
dc.type: Thesis (en_ZA)
Files
Original bundle
Name: dunaiski_analysing_2014.pdf
Size: 9.25 MB
Format: Adobe Portable Document Format