On Eigendecomposition-based algorithms as feature extraction techniques used with hidden Markov model for the detection of whale vocalisations
Date
2024-03
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH ABSTRACT: Whales emit a variety of distinctive sound signals for communication, echolocation, and other social functions, and these signals are recorded through passive acoustic monitoring (PAM). Different automated methods have been proposed in the literature for analysing PAM datasets to detect and classify whale species, including the use of the hidden Markov model (HMM). This thesis proposes eigendecomposition-based (ED) algorithms as feature extraction (FE) techniques used with HMM for the detection of whale vocalisations. Specifically, principal component analysis (PCA) and dynamic mode decomposition (DMD) are deployed to extract the latent underlying characteristics of whale signals in PAM datasets. In addition, enhanced FE techniques are proposed through the kernelisation of PCA and DMD. The resulting ED-based hidden Markov models (ED-HMMs), namely PCA-HMM, kPCA-HMM, DMD-HMM, and kDMD-HMM, are grouped according to the underlying FE algorithm into the PC-based hidden Markov models (PC-HMMs) and the DMD-based hidden Markov models (DM-HMMs). Each of the models is tested on PAM datasets containing southern right whale (SRW) and humpback whale (HW) vocalisations. Their performances are evaluated using metrics such as the true positive rate (TPR), precision (PREC), error rate (ERR), and F1 score. Performance outcomes vary with experimental conditions such as the dimension of the feature vectors, the size of the training data, and the species whose vocalisations are analysed. The models demonstrated good performance across the different evaluation metrics. For the PC-HMMs, the kPCA-HMM not only outperformed the PCA-HMM in terms of TPR and PREC but also exhibited a lower ERR. However, the kPCA-HMM has a higher computational cost than the PCA-HMM. Similarly, for the DM-HMMs, the kDMD-HMM outperformed the DMD-HMM in terms of TPR and PREC, and it also exhibited a lower ERR as well as a lower computational cost.
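As an illustration of eigendecomposition-based feature extraction, the sketch below computes PCA features from framed audio by eigendecomposing the sample covariance matrix. It is a minimal sketch and not the thesis's implementation: the function name `pca_features`, the frame length of 64 samples, the choice of 8 components, and the random stand-in signal are all assumptions made for illustration. It depends on NumPy.

```python
import numpy as np

def pca_features(frames, n_components):
    """Project frames (n_frames x frame_len) onto the leading principal components."""
    frames = np.asarray(frames, dtype=float)
    centred = frames - frames.mean(axis=0)            # remove the per-dimension mean
    cov = centred.T @ centred / (len(frames) - 1)     # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest eigenvalues
    return centred @ eigvecs[:, order], eigvals[order]

# Hypothetical input: white noise standing in for a PAM recording.
rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)
frames = signal.reshape(-1, 64)                       # 64 non-overlapping frames of 64 samples
features, variances = pca_features(frames, n_components=8)
```

Each row of `features` is a reduced-dimension feature vector for one frame; in a PCA-HMM style pipeline, such vectors would serve as the observation sequence for the HMM. The number of retained components plays the role of the feature dimension that the abstract reports as an experimental condition.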
The comparison showed that the PC-HMMs stabilised faster than the DM-HMMs in terms of performance. Thus, the PC-HMMs are less complex than the DM-HMMs in terms of dimension. However, the DM-HMMs outperformed the PC-HMMs, albeit at higher dimensions. The reliability of the developed models was confirmed with F1 scores, as all the models achieved F1 scores > 0.9 at their respective optimal dimensions. Lastly, the results of the proposed ED-HMMs are compared with the existing FE techniques used with HMM in the literature for the detection of whale vocalisations. The ED-HMMs outperform the existing HMM methods. A general observation is that every model performs better as the number of samples used for training increases. Hence, large window sizes are recommended for model training. The different experimental results showed that a model's performance must be evaluated on a species-by-species basis. It is also important that the training data be a subset of the datasets used for testing, or at least consist of recordings from the same region. This avoids bias that may arise from the variation that exists between the vocalisations of the same species. The ED-HMMs proposed in this study can be further tested on other whale vocalisations to confirm their robustness. They can also be explored by researchers working on the automatic detection of other vocalising animal species.
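The evaluation metrics named in the abstract can be computed from detection counts as sketched below. This is a minimal sketch using the standard textbook definitions; the thesis may define ERR differently, and the function name `detection_metrics` and the counts in the example are hypothetical, not results from the study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute TPR, PREC, ERR and F1 from true/false positive/negative counts."""
    tpr = tp / (tp + fn)                      # true positive rate (recall)
    prec = tp / (tp + fp)                     # precision
    err = (fp + fn) / (tp + fp + tn + fn)     # error rate (assumed: misclassified / total)
    f1 = 2 * prec * tpr / (prec + tpr)        # harmonic mean of precision and recall
    return {"TPR": tpr, "PREC": prec, "ERR": err, "F1": f1}

# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=90, fp=5, tn=100, fn=10)
```

With these illustrative counts, the F1 score exceeds 0.9, which is the threshold the abstract uses when describing model reliability.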
AFRIKAANS SUMMARY (translated): The comparison showed that the PC-HMMs stabilised faster than the DM-HMMs in terms of performance. Thus, the PC-HMMs are less complex than the DM-HMMs in terms of dimension. However, the DM-HMMs performed better than the PC-HMMs, albeit at higher dimensions. The reliability of the developed models was confirmed with F1 scores, as all the models achieved F1 scores > 0.9 at their respective optimal dimensions. Lastly, the results of the proposed ED-HMMs are compared with the existing FE techniques used with HMM in the literature for the detection of whale vocalisations. The ED-HMMs do perform better than the existing HMM methods. A general observation is that every model performs better as the number of samples used for training increases. Hence, large window sizes are recommended for model training. The different experimental results showed that a model's performance must be evaluated on a species-by-species basis. It is also important that the training data be a subset of the datasets used for testing, or at least use recordings from the same region. The ED-HMMs proposed in this study can be further tested on other whale vocalisations to confirm their robustness. In addition, they can be explored by researchers working on the automatic detection of other vocalising animal species.
Description
Thesis (PhD)--Stellenbosch University, 2024.