Department of Mathematical Sciences

Permanent URI for this community

https://scholar.sun.ac.za/handle/10019.1/223

Now showing 1 - 7 of 7

3D position estimation of sports players through multi-view tracking
(Stellenbosch : University of Stellenbosch, 2010-12) Vos, Robert (Robbie); Brink, Willie; Herbst, Ben; University of Stellenbosch. Faculty of Science. Dept. of Mathematical Sciences.
ENGLISH ABSTRACT: Extracting data from video streams and using the data to better understand the observed world allows many systems to automatically perform tasks that ordinarily needed to be completed by humans. One such problem with a wide range of applications is that of detecting and tracking people in a video sequence. This thesis looks specifically at the problem of estimating the positions of players on a sports field, as observed by a multi-view camera setup. Previous attempts at solving the problem are discussed, after which the problem is broken down into three stages: detection, 2D tracking and 3D position estimation. Possible solutions to each of the problems are discussed and compared to one another. Motion detection is found to be a fast and effective solution to the problem of detecting players in a single view. Tracking players in 2D image coordinates is performed by implementing a hierarchical approach to the particle filter. The hierarchical approach is chosen as it improves the computational complexity without compromising on accuracy. Finally 3D position estimation is done by multiview, forward projection triangulation. The components are combined to form a full system that is able to find and locate players on a sports field. The overall system that is developed is able to detect, track and triangulate player positions. The components are tested individually and found to perform well. By combining the components and introducing feedback between them the results of the individual components as well as those of the overall system are improved.
Adaptive occupancy grid mapping with measurement and pose uncertainty
(Stellenbosch : Stellenbosch University, 2012-12) Joubert, Daniek; Herbst, B. M.; Brink, Willie; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences.
ENGLISH ABSTRACT: In this thesis we consider the problem of building a dense and consistent map of a mobile robot’s environment that is updated as the robot moves. Such maps are vital for safe and collision-free navigation. Measurements obtained from a range sensor mounted on the robot provide information on the structure of the environment, but are typically corrupted by noise. These measurements are also relative to the robot’s unknown pose (location and orientation) and, in order to combine them into a world-centric map, pose estimation is necessary at every time step. A SLAM system can be used for this task. However, since landmark measurements and robot motion are inherently noisy, the pose estimates are typically characterized by uncertainty. When building a map it is essential to deal with the uncertainties in range measurements and pose estimates in a principled manner to avoid overconfidence in the map. A literature review of robotic mapping algorithms reveals that the occupancy grid mapping algorithm is well suited for our goal. This algorithm divides the area to be mapped into a regular lattice of cells (squares for 2D maps or cubes for 3D maps) and maintains an occupancy probability for each cell. Although an inverse sensor model is often employed to incorporate measurement uncertainty into such a map, many authors merely state or depict their sensor models. We derive our model analytically and discuss ways to tailor it for sensor-specific uncertainty. One of the shortcomings of the original occupancy grid algorithm is its inability to convey uncertainty in the robot’s pose to the map. We address this problem by altering the occupancy grid update equation to include weighted samples from the pose uncertainty distribution (provided by the SLAM system). The occupancy grid algorithm has been criticized for its high memory requirements. 
Techniques have been proposed to represent the map as a region tree, allowing cells to have different sizes depending on the information received for them. Such an approach necessitates a set of rules for determining when a cell should be split (for higher resolution in a local region) and when groups of cells should be merged (for lower resolution). We identify some inconsistencies that can arise from existing rules, and adapt those rules so that such errors are avoided. We test our proposed adaptive occupancy grid algorithm, which incorporates both measurement and pose uncertainty, on simulated and real-world data. The results indicate that these uncertainties are included effectively, to provide a more informative map, without a loss in accuracy. Furthermore, our adaptive maps need far fewer cells than their regular counterparts, and our new set of rules for deciding when to split or merge cells significantly improves the ability of the adaptive grid map to mimic its regular counterpart.
Enhancing mobile camera pose estimation through the inclusion of sensors
(Stellenbosch : Stellenbosch University, 2014-12) Hughes, Lloyd Haydn; Brink, Willie; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences.
ENGLISH ABSTRACT: Monocular structure from motion (SfM) is a widely researched problem; however, many of the existing approaches prove to be too computationally expensive for use on mobile devices. In this thesis we investigate how inertial sensors can be used to increase the performance of SfM algorithms on mobile devices. Making use of the low-cost inertial sensors found on most mobile devices we design and implement an extended Kalman filter (EKF) to exploit their complementary nature, in order to produce an accurate estimate of the attitude of the device. We make use of a quaternion-based system model in order to linearise the measurement stage of the EKF, thus reducing its computational complexity. We use this attitude estimate to enhance the feature tracking and camera localisation stages in our SfM pipeline. In order to perform feature tracking we implement a hybrid tracking algorithm which makes use of Harris corners and an approximate nearest neighbour search to reduce the search space for possible correspondences. We increase the robustness of this approach by using inertial information to compensate for inter-frame camera rotation. We further develop an efficient bundle adjustment algorithm which only optimises the pose of the previous three key frames and the 3D map points common between at least two of these frames. We implement an optimisation-based localisation algorithm which makes use of our EKF attitude estimate and the tracked features, in order to estimate the pose of the device relative to the 3D map points. This optimisation is performed in two steps, the first of which optimises only the translation and the second optimises the full pose. We integrate the aforementioned three sub-systems into an inertial-assisted pose estimation pipeline. We evaluate our algorithms with the use of datasets captured on the iPhone 5 in the presence of a Vicon motion capture system for ground truth data.
We find that our EKF can estimate the device’s attitude with an average dynamic accuracy of ±5°. Furthermore, we find that the inclusion of sensors in the visual pose estimation pipeline can lead to improvements in the robustness and computational efficiency of the algorithms and is unlikely to negatively affect the accuracy of such a system. Even though we managed to reduce execution time dramatically compared to typical existing techniques, our full system is still too computationally expensive for real-time performance and currently runs at 3 frames per second. However, the ever-improving computational power of mobile devices and our described future work will lead to improved performance. From this study we conclude that inertial sensors make a valuable addition to a visual pose estimation pipeline implemented on a mobile device.
Long-term tracking of multiple interacting pedestrians using a single camera
(Stellenbosch : Stellenbosch University, 2014-04) Keaikitse, Advice Seiphemo; Brink, Willie; Govender, N.; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences.
ENGLISH ABSTRACT: Object detection and tracking are important components of many computer vision applications including automated surveillance. Automated surveillance attempts to solve the challenges associated with closed-circuit camera systems. These include monitoring large numbers of cameras and the associated labour costs, and issues related to targeted surveillance. Object detection is an important step of a surveillance system and must overcome challenges such as changes in object appearance and illumination, dynamic background objects like flickering screens, and shadows. Our system uses Gaussian mixture models, a background subtraction method, to detect moving objects. Tracking is challenging because measurements from the object detection stage are not labelled and could be from false targets. We use multiple hypothesis tracking to solve this measurement origin problem. Practical long-term tracking of objects should have re-identification capabilities to deal with challenges arising from tracking failure and occlusions. In our system each tracked object is assigned a one-class support vector machine (OCSVM) which learns the appearance of that object. The OCSVM is trained online using HSV colour features. Therefore, objects that were occluded or left the scene can be re-identified and their tracks extended. Standard, publicly available data sets are used for testing. The performance of the system is measured against ground truth using the Jaccard similarity index, the track length and the normalized mean square error. We find that the system performs well.
Planar segmentation of range images
(Stellenbosch : Stellenbosch University, 2013-03) Muller, Simon Adriaan; Brink, Willie; Herbst, B. M.; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences.
ENGLISH ABSTRACT: Range images are images that store at each pixel the distance between the sensor and a particular point in the observed scene, instead of the colour information. They provide a convenient storage format for 3-D point cloud information captured from a single point of view. Range image segmentation is the process of grouping the pixels of a range image into regions of points that belong to the same surface. Segmentations are useful for many applications that require higher-level information, and with range images they also represent a significant step towards complete scene reconstruction. This study considers the segmentation of range images into planar surfaces. It discusses the theory and also implements and evaluates some current approaches found in the literature. The study then develops a new approach based on the theory of graph cut optimization which has been successfully applied to various other image processing tasks but, according to a search of the literature, has otherwise not been used to attempt segmenting range images. This new approach is notable for its strong guarantees in optimizing a specific energy function which has a rigorous theoretical underpinning for handling noise in images. It proves to be very robust to noise and also to different values of the few parameters that need to be trained. Results are evaluated in a quantitative manner using a standard evaluation framework and datasets that allow us to compare against various other approaches found in the literature. We find that our approach delivers results that are competitive when compared to the current state-of-the-art, and can easily be applied to images captured with different techniques that present varying noise and processing challenges.
Predicting water quality variables
(Stellenbosch : Stellenbosch University, 2020-03) Elmahdi, Reem; Brink, Willie; Wilms, Josefine M.; Stellenbosch University. Faculty of Science. Department of Mathematical Sciences (Applied Mathematics).
ENGLISH ABSTRACT: Water is an important substance for all of life, and can be used in domestic, agricultural and industrial activities. Water quality determines the usefulness of water for particular purposes, and can be defined in terms of time-varying water quality variables such as dissolved oxygen, turbidity, temperature, pH, specific conductance, chlorophylls, nitrate and salinity. Different mathematical and statistical models have been used for the prediction of time-series data. Machine learning can also be used when enough data is available. In particular, artificial neural networks (ANNs) have demonstrated success in solving such problems. They are conceptually simple and easily implemented. In this thesis, an overview of two ANN structures is presented for solving the problem of predicting water quality variables. Specifically, multilayer perceptrons (MLPs) and long short-term memory (LSTM) networks are presented. Experiments are conducted on Hog Island water quality variables and the results of the models are compared using various accuracy metrics like root mean squared error. It is found that LSTM performs better than MLP across most of the accuracy metrics.
Real-time stereo reconstruction using hierarchical dynamic programming and LULU filtering
(Stellenbosch : University of Stellenbosch, 2010-03) Singels, Francois; Brink, Willie; Herbst, B. M.; University of Stellenbosch. Faculty of Science. Dept. of Mathematical Sciences.
ENGLISH ABSTRACT: In this thesis we consider the essential topics relating to stereo-vision and the correspondence problem in general. The aim is to reconstruct a dense 3D scene from images captured by two spatially related cameras. Our main focus, however, is on speed and real-time implementation on a standard desktop PC. We wish to use the CPU to solve the correspondence problem and to reserve the GPU for model rendering. We discuss three fundamental types of algorithms and evaluate their suitability to this end. We eventually choose to implement a hierarchical version of the dynamic programming algorithm, because of the good balance between accuracy and speed. As we build our system from the ground up we gradually introduce necessary concepts and established geometric principles, common to most stereo-vision systems, and discuss them as they become relevant. It becomes clear that the greatest weakness of the hierarchical dynamic programming algorithm is scanline inconsistency. We find that the one-dimensional LULU filter is computationally inexpensive and effective at removing outliers when applied across the scanlines. We take advantage of the hierarchical structure of our algorithm and sub-pixel refinement to produce results at video rates (roughly 20 frames per second). A 3D model is also constructed at video rates in an on-line system with only a small delay between obtaining the input images and rendering the model. Not only is the quality of our results highly competitive with those of other state-of-the-art algorithms, but the achievable speed is also considerably faster.

Browsing Department of Mathematical Sciences by advisor "Brink, Willie"