Department of Civil Engineering
Browsing Department of Civil Engineering by Author "Adams, Logan Charl"
- Application of data mining and machine learning on occupational health and safety struck-by incidents on South African construction sites: a CRISP-DM approach. (Stellenbosch : Stellenbosch University, 2023-02) Adams, Logan Charl; Wium, Jan Andries; Stellenbosch University. Faculty of Engineering. Dept. of Civil Engineering.
ENGLISH ABSTRACT: Occupational Health and Safety (OHS) in the South African construction industry faces many performance challenges that result in potentially avoidable incidents. The study proposes the use of data mining and classification machine learning models to improve data understanding, promote knowledge and information extraction, and encourage prediction capabilities through classification methods. A mixed research approach was applied to enable a holistic use of data and its applications. Interviews (the qualitative research component) identified the current state of OHS data and data management in the South African construction industry, and also identified data considerations for the quantitative research component (Exploratory Data Analysis and classification machine learning models). Data sourced from the Federated Employers Mutual Assurance Company (an insurance database), together with additional databases (sourced from the Federal Reserve Bank of St. Louis and the Organisation for Economic Cooperation and Development), enabled a quantitative Exploratory Data Analysis and the development of multiple classification machine learning models. The Exploratory Data Analysis provided insights into data understanding and the potential of using the data to enable data-driven safety decision-making. The classification models provided insights into the possibility of an industry-wide classification prediction model based on existing data, while also revealing the fundamental concerns and limitations.
The qualitative and quantitative components of the study highlighted several concerns regarding data, data management, and data innovations across OHS in the South African construction industry. At the core were a lack of understanding of the possibilities of data and a misaligned value proposition. The study also highlighted notable limitations in the quality of data and in the mechanisms that influence that quality, including ineffective incident investigations for fact-finding and prominent underreporting in the construction industry. Data mining and machine learning offered the ability to extract deeper insights from incidents and enable improvements in OHS performance through data-driven safety decision-making. Three output variables were evaluated across several machine learning algorithms in terms of each model's ability to predict and classify the state of an incident, namely (1) Injury Location (the physical injury location on the affected individual's body), (2) Nature of Injury (the type of injury the affected individual experienced), and (3) Days off (the number of days required off from work for recovery). The results obtained from the machine learning models demonstrate the capability to predict the Days off variable to high accuracy levels (average of 81.8%), moderate accuracy levels for the Nature of Injury (average of 37.4%), and low accuracy levels for Injury Location (average of 17.8%). The performance of the various machine learning models is directly influenced by the underlying correlation between the output and input variables and by the number of classifications required within the output variable itself. The largest correlation coefficient and the number of classifications were, respectively, Injury Location (0.07, 20), Nature of Injury (0.14, 9), and Days off (0.07, 3).
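To illustrate why the number of classifications matters when reading these accuracy figures, the sketch below compares each reported average accuracy with the accuracy expected from uniform random guessing over the same number of classes. This comparison is not part of the study: the uniform-guessing baseline assumes balanced classes, which the abstract does not state, and the names `reported`, `chance_accuracy`, and `lift_over_chance` are hypothetical. Only the accuracy values and class counts are taken from the abstract.

```python
# Average accuracies and class counts as reported in the abstract.
reported = {
    "Injury Location": {"accuracy": 0.178, "classes": 20},
    "Nature of Injury": {"accuracy": 0.374, "classes": 9},
    "Days off": {"accuracy": 0.818, "classes": 3},
}


def chance_accuracy(n_classes: int) -> float:
    """Expected accuracy of uniform random guessing.

    Assumption (not from the study): classes are balanced, so a random
    guess is correct with probability 1 / n_classes.
    """
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    return 1.0 / n_classes


def lift_over_chance(accuracy: float, n_classes: int) -> float:
    """Ratio of a model's accuracy to the uniform-guessing baseline."""
    return accuracy / chance_accuracy(n_classes)


if __name__ == "__main__":
    for name, row in reported.items():
        baseline = chance_accuracy(row["classes"])
        lift = lift_over_chance(row["accuracy"], row["classes"])
        print(
            f"{name}: accuracy={row['accuracy']:.1%}, "
            f"chance={baseline:.1%}, lift={lift:.2f}x"
        )
```

Under this simplified baseline, a 20-class problem starts from a much lower chance level than a 3-class problem, so raw accuracies across the three output variables are not directly comparable. A majority-class baseline computed from the actual class frequencies would be a stricter reference, but the abstract does not report those frequencies.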
The study recommends collaborative efforts between industry, Government, and academia to implement data mining and machine learning successfully.