Department of Computer Science
Browsing Department of Computer Science by advisor "Engelbrecht, Andries Petrus"
- Landscape aware algorithm selection for feature selection (Stellenbosch : Stellenbosch University, 2023-10) Mostert, Werner; Engelbrecht, Andries Petrus; Malan, Katherine Mary; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Computer Science Division. ENGLISH ABSTRACT: Feature selection is commonly applied as a pre-processing technique for machine learning to reduce the dimensionality of a problem by removing redundant and irrelevant features. Another desirable outcome of feature selection lies in the potential performance improvement of predictive models. The development of new feature selection algorithms is common within the field; however, relatively little research has historically been done to better understand the feature selection problem from a theoretical perspective. Researchers and practitioners in the field often rely on a trial-and-error strategy to decide which feature selection algorithm to use for a specific instance of a machine learning problem. This thesis contributes towards a better understanding of the complex feature selection problem by investigating the link between feature selection problem characteristics and the performance of feature selection algorithms. A variety of fitness landscape analysis techniques are used to gain insights into the structure of the feature selection fitness landscape. Performance complementarity for feature selection algorithms is empirically shown, emphasising the potential value of automated algorithm selection for feature selection algorithms. Towards the realisation of a landscape aware algorithm selector for feature selection, a novel performance metric for feature selection algorithms is presented. The baseline fitness improvement (BFI) performance metric is unbiased and can be used for comparative analysis across feature selection problem instances.
The insights obtained via landscape analysis are used with other meta-features of datasets and the BFI performance measure to develop a new landscape aware algorithm selector for feature selection. The landscape aware algorithm selector provides a human-interpretable predictive model of the best feature selection algorithm for a specific dataset and classification problem.
- Rule Induction with Swarm Intelligence (Stellenbosch : Stellenbosch University, 2022-03) van Zyl, Jean-Pierre; Engelbrecht, Andries Petrus; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: Rule induction is the process by which explainable mappings are created between a set of input data instances and a set of labels for the input instances. This process can be seen as an extension of traditional classification algorithms, because rule induction algorithms perform classification but have the added property of being transparent when making inferences. Popular algorithms in existing literature tend to use antiquated approaches to induce rule sets. The existing approaches tend to be greedy in nature and do not provide a platform for algorithm expansion or improvement. This thesis investigates a new approach to rule induction using a set-based particle swarm optimisation algorithm. The investigation starts with a comprehensive review of the relevant literature, after which the novel algorithm is proposed and compared with popular rule induction algorithms. After the establishment of the capabilities and validity of the set-based particle swarm optimisation rule induction algorithm, the effect of the objective function on the algorithm is investigated. The objective function is tested with 12 existing performance evaluation metrics in order to understand how the performance of the algorithm can be improved. These 12 existing metrics are then used as inspiration for the proposal of 11 new performance evaluation metrics which are also tested as part of the objective function effect analysis. The effect of varying distributions of the values of the target class is also examined.
This thesis also investigates the reformulation of the rule induction problem as a multi-objective optimisation problem and applies the newly developed multi-guide set-based particle swarm optimisation algorithm to the multi-objective formulation of rule induction. The performance of rule induction as a multi-objective problem is evaluated by examining how the trade-off between the defined objective functions affects performance for different datasets. The existing metrics and newly proposed metrics tested in the single-objective formulation of the rule induction problem are also tested in the multi-objective formulation.