Department of Computer Science

Permanent URI for this community

https://scholar.sun.ac.za/handle/10019.1/96336

Browse

Now showing 1 - 6 of 6

An assessment of algorithms for deriving failure deterministic finite automata
(South African Institute of Computer Scientists and Information Technologists, 2017) Nxumalo, Madoda; Kourie, Derrick G.; Cleophas, Loek; Watson, Bruce W.
Failure deterministic finite automata (FDFAs) represent regular languages more compactly than deterministic finite automata (DFAs). Four algorithms that convert arbitrary DFAs to language-equivalent FDFAs are empirically investigated. Three are concrete variants of a previously published abstract algorithm, the DFA-Homomorphic Algorithm (DHA). The fourth builds a maximal spanning tree from the DFA to derive what it calls a delayed input DFA. A first suite of test data consists of DFAs that recognise randomised sets of finite length keywords. Since the classical Aho-Corasick algorithm builds an optimal FDFA from such a set (and only from such a set), it provides benchmark FDFAs against which the performance of the general algorithms can be compared. A second suite of test data consists of random DFAs generated by a specially designed algorithm that also builds language-equivalent FDFAs, some of which may have non-divergent cycles. These random FDFAs provide (not necessarily tight) lower bounds for assessing the effectiveness of the four general FDFA generating algorithms.
Coverage directed algorithms for test suite construction from LR-automata
(Stellenbosch : Stellenbosch University, 2022-04) Rossouw, Christoffel Jacobus; Fischer, Bernd; Stellenbosch University. Faculty of Science. Dept. of Computer Science.
ENGLISH ABSTRACT: Bugs in software can have disastrous results in terms of both economic cost and human lives. Parsers can have bugs, like any other type of software, and must therefore be thoroughly tested in order to ensure that a parser recognizes its intended language accurately. However, parsers often need to recognize many different variations and combinations of grammar structures which can make it time consuming and difficult to construct test suites by hand. We therefore require automated methods of test suite construction for these systems. Currently, the majority of test suite construction algorithms focus on the grammar describing the language to be recognized by the parser. In this thesis we take a different approach. We consider the LR-automaton that recognizes the target language and use the context information encoded in the automaton. Specifically, we define a new class of algorithm and coverage criteria over a variant of the LR-automaton that we define, called an LR-graph. We define methods of constructing positive test suites, using paths over this LR-graph, as well as mutations on valid paths to construct negative test suites. We evaluate the performance of our new algorithms against other state-of-the-art algorithms. We do this by comparing coverage achieved over various systems, some smaller systems used in a university level compilers course and other larger, real-world systems. We find good performance of our algorithms over these systems, when compared to algorithms that produce test suites of equivalent size. Our evaluation has uncovered a problem in grammar-based testing algorithms that we call bias. Bias can lead to significant variation in coverage achieved over a system, which can in turn lead to a flawed comparison of two algorithms or unrealized performance when a test suite is used in practice. We therefore define bias and measure it for all grammar-based test suite construction algorithms we use in this thesis.
Flow allocation in wireless networks with selfish nodes
(SATNAC, 2013) Krzesinski, A. E.
Consider an ad hoc network where packet transmissions occur between the nodes. Optimal flow allocation in such systems can be modelled as a constrained nonlinear optimisation problem. This problem can be solved either by standard methods which assume global knowledge of the system being modelled, or by a distributed algorithm which assumes local knowledge only. We consider an ad hoc network which contains selfish nodes. A selfish node cares only about maximising its own flows and does not care about the utility that any other nodes get. Flow allocation in a network of altruistic and selfish nodes can be modelled as a constrained nonlinear optimisation problem and solved by standard methods. However, in this case a dynamic algorithm to compute the network flows in not available. We modify the behaviour of the selfish nodes so that a dynamic solution is possible. In this scheme, selfish nodes advertise false (inflated) resource prices to the other nodes. These nodes respond by not routing their flows through the selfish nodes, and the selfish nodes can now can use all their resources to transmit their own flows. The flows return to their optimal values if the selfish nodes subsequently advertise the correct prices for their resources. Altruistic nodes can detect the inflated prices charged by the selfish nodes and respond by advertising false (inflated) prices to the selfish nodes. In this case the flows originating at the selfish nodes are reduced, but the flows do not return to their optimal values. This scheme also has a distributed solution.
Landscape aware algorithm selection for feature selection
(Stellenbosch : Stellenbosch University, 2023-10) Mostert, Werner; Engelbrecht, Andries Petrus; Malan, Katherine Mary; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Computer Science Division.
ENGLISH ABSTRACT: Feature selection is commonly applied as a pre-processing technique for machine learning to reduce the dimensionality of a problem by removing redundant and irrelevant features. Another desirable outcome of feature selection lies in the potential performance improvement of predictive models. The development of new feature selection algorithms are common within the field, however, relatively little research has historically been done to better understand the feature selection problem from a theoretical perspective. Researchers and practitioners in the field often rely on a trial-and-error strategy to decide on which feature selection algorithm to use for a specific instance of a machine learning problem. This thesis contributes towards a better understanding of the complex feature selection problem by investigating the link between feature selection problem characteristics and the performance of feature selection algorithms. A variety of fitness landscape analysis techniques are used to gain insights into the structure of the feature selection fitness landscape. Performance complementarity for feature selection algorithms is empirically shown, emphasising the potential value of automated algorithm selection for feature selection algorithms. Towards the realisation of a landscape aware algorithm selector for feature selection, a novel performance metric for feature selection algorithms is presented. The baseline fitness improvement (BFI) performance metric is unbiased and can be used for comparative analysis across feature selection problem instances. The insights obtained via landscape analysis are used with other meta-features of datasets and the BFI performance measure to develop a new landscape aware algorithm selector for feature selection. The landscape aware algorithm selector provides a human-interpretable predictive model of the best feature selection algorithm for a specific dataset and classification problem.
Mixtures of heterogeneous experts
(Stellenbosch : Stellenbosch University, 2022-10) Omomule, Taiwo Gabriel; Engelbrecht, Andries; Stellenbosch University. Faculty of Science. Dept. of Computer Science.
ENGLISH ABSTRACT: This research considers the problem of the No-Free-Launch-Theorem, which states that no one machine learning algorithm performs best on all problems due to algorithms having different inductive biases. Another problem is that the combinations of experts of the same type, referred to as a mixture of homogeneous experts, do not capitalize on the different inductive biases of different machine learning algorithms. Research has shown that mixtures of homogeneous experts deliver improved accuracy compared to that of the base experts in the mixture. However, the predictive power of a homogeneous mixture of experts is still limited by the inductive bias of the algorithm that makes up the mixture of experts. Therefore, this research proposes the development of mixtures of heterogeneous experts through the combination of different machine learning algorithms to take advantage of the strengths of the machine learning algorithms and to reduce the adverse effects of the inductive biases of the different algorithms. A set of different machine learning algorithms are selected to develop four different types of mixtures of experts in the research. Empirical analyses are performed using nonparametric statistical tests to compare the generalization performance of the ensembles. The comparison is carried out to investigate the performance of the homogeneous and heterogeneous ensembles in a number of modelling studies examined on a set of classification and regression problems using selected performance measures. The problems represent varying levels of complexity and characteristics to determine the characteristics and complexities for which the heterogeneous ensembles outperform homogeneous ensembles. For classification problems, the empirical results across six modelling studies indicate that heterogeneous ensembles generate improved predictive performance compared to the developed homogeneous ensembles by taking advantage of the different inductive biases of the different base experts in the ensembles. Specifically, the heterogeneous ensembles developed using different machine learning algorithms, with the same and different configurations, showed superiority over other heterogeneous ensembles and the homogeneous ensembles developed in this research. The ensembles achieved the best and second-best overall accuracy rank across the classification datasets in each modelling study. For regression problems, the heterogeneous ensembles outperformed the homogeneous ensembles across five modelling studies. Although, a random forest algorithm achieved competitive generalization performance compared to that of the heterogeneous ensembles. Based on the average ranks, the heterogeneous ensembles developed using different machine learning algorithms where the base members consist of the same and different configurations still produced better predictive performance than a number of heterogeneous ensembles and homogeneous ensembles across the modelling studies. Therefore, the implementation of a mixture of heterogeneous experts removes the need for the computationally expensive process of finding the best performing homogeneous ensemble. The heterogeneous ensembles of different machine learning algorithms are consistently the most or one of the most accurate ensembles across all classification and regression problems. This is attributed to the advantage of capitalizing on the inductive biases of the different machine learning algorithms and the different configurations of the base members in the ensembles.
Particle swarm optimization for constrained multimodal function optimization
(Stellenbosch : Stellenbosch University, 2024-03) Strelitz, Benjamin Steenveld; Engelbrecht, Andries; Stellenbosch University. Faculty of Science. Dept. of Computer Science.
ENGLISH ABSTRACT: This thesis investigates the efficiency of particle swarm optimization (PSO) algorithms at finding many feasible global optima for constrained multimodal optimization prob- lems. The proposed approach is the niching migratory multi-swarm optimizer with Deb's comparison criteria (NMMSO-DCC) algorithm. The NMMSO-DCC algorithm uses the same core architecture as the niching migratory multi-swarm optimization (NMMSO) al- gorithm, but uses Deb's comparison criteria as a constraint handling method. Deb's com- parison criteria allows the NMMSO-DCC algorithm to find many feasible global optima for constrained multimodal optimization problems (CMMOPs), whereas the NMMSO algorithm was designed only to find global optima for boundary constrained multimodal optimization problems (MMOPs). The NMMSO algorithm is one of the state-of-the-art multiomodal optimization algorithms, but cannot be used when constraints are placed on the objective function. Thus, the proposed algorithm addresses the inability of the NMMSO algorithm to solve constrained multimodal optimization problems. This study assumes that the objective function to be optimized remains static throughout the search process. This study also assumes that the constraints placed upon the objective func- tion remain static during the search process. All benchmark problems in this study contain boundary constraints. The results indicate that the NMMSO-DCC performs competitively compared to other state-of-the-art constrained multimodal optimization algorithms. The results in terms of success rate are particularly convincing, whereas NMMSO-DCC struggled more with respect to the peak ratio. This means that although the NMMSO-DCC algorithm is able to locate all global optima within a given tolerance level in some of the independent runs, it struggles to do so consistently across multiple independent runs.

Browse

Browsing Department of Computer Science by Subject "Algorithms"

Results Per Page

Sort Options