Research output per year
Research output: Thesis › Licentiate thesis
This thesis covers three contributions in applications of neural networks. The first is related to diversity and ensemble learning, while the other two cover novel applications of the self-attention mechanism.

An important aspect of training a neural network is the choice of objective function. Regression via Classification (RvC) is often used to tackle deep learning problems where the target variable is continuous but standard regression objectives fail to capture the underlying distance metric of the domain. Discretizing the target in this way can improve the performance of the trained model, but the optimal choice of discrete classes used in RvC is not well understood. In Paper 1, we introduce the concept of label diversity by generalizing the RvC method. Exploiting the fact that labels can be generated in arbitrary ways for continuous and ordinal target variables, we show that using multiple labels can improve the prediction accuracy of a neural network compared to using a single label, and we provide theoretical justification from ensemble theory. We apply our method to several tasks in computer vision and show increased performance compared to regression and RvC baselines.

The performance of a neural network is also influenced by the choice of network architecture, and in the design process it is important to consider the domain of the inputs and its symmetries. Graph neural networks (GNNs) are the family of networks that operate on graphs, where information is propagated between the graph nodes using, for example, self-attention. However, self-attention can be used for other data domains as well if the inputs can be converted into graphs, which is not always trivial. In Paper 2, we do this for audio by using a complete graph over audio features extracted from different time slots. We apply this technique to the task of keyword spotting and show that a neural network based solely on self-attention is more accurate than previously considered architectures.
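The core idea behind label diversity can be sketched with a toy example: generate several classification label sets from the same continuous target by shifting the bin edges, then decode each class distribution back to a continuous value and average. This is a minimal numpy sketch under illustrative assumptions (uniform bins, fixed shift fractions, expected-value decoding); it is not the paper's exact construction.

```python
import numpy as np

def make_shifted_bins(lo, hi, n_bins, shift_frac):
    """One set of bin edges over [lo, hi], shifted by a fraction of the bin width."""
    width = (hi - lo) / n_bins
    return lo + width * (np.arange(n_bins + 1) + shift_frac)

def rvc_labels(y, edges):
    """Discretize continuous targets y into class indices for one label set."""
    return np.clip(np.digitize(y, edges) - 1, 0, len(edges) - 2)

def decode(probs, edges):
    """Expected value over bin centers -> continuous prediction."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    return probs @ centers

# Three label sets generated from the same targets (label diversity)
y = np.array([0.12, 0.47, 0.83])
edge_sets = [make_shifted_bins(0.0, 1.0, 10, s) for s in (0.0, 1/3, 2/3)]
label_sets = [rvc_labels(y, e) for e in edge_sets]

# Stand-in for network outputs: one-hot class probabilities per label set.
# Averaging the decoded predictions acts as an ensemble over label sets.
probs_sets = [np.eye(10)[lab] for lab in label_sets]
y_hat = np.mean([decode(p, e) for p, e in zip(probs_sets, edge_sets)], axis=0)
```

Each shifted discretization makes a different quantization error, so the averaged decoding recovers the target more closely than any single label set would.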
Finally, in Paper 3 we apply attention-based learning to point cloud processing, where the permutation symmetry of the input must be preserved. To make the self-attention mechanism both more efficient and more expressive, we propose a hierarchical approach that allows individual points to interact on both a local and a global scale. Through extensive experiments on several benchmarks, we show that this approach improves the descriptiveness of the learned features while simultaneously reducing the computational complexity compared to an architecture that applies self-attention naively to all input points.
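The permutation symmetry referred to above can be demonstrated with a plain single-head self-attention layer (no learned projections); this is an illustrative numpy sketch, not the hierarchical architecture of Paper 3. Reordering the input points reorders the output rows in exactly the same way, which is the property a point cloud network must preserve.

```python
import numpy as np

def self_attention(X):
    """Plain self-attention over rows of X; permutation-equivariant in the points."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)            # pairwise similarities, O(n^2) cost
    scores -= scores.max(axis=-1, keepdims=True)
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)       # row-wise softmax attention weights
    return A @ X                             # weighted mixture of point features

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                  # 6 points with 4-dim features
perm = rng.permutation(6)

# Permuting the input permutes the output identically (equivariance)
Y, Yp = self_attention(X), self_attention(X[perm])
```

The quadratic cost of the score matrix is what motivates a hierarchical scheme: attending locally within small neighborhoods and globally over neighborhood summaries avoids forming the full n-by-n attention matrix.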
Original language  English

Qualification  Licentiate
Awarding institution

Supervisors

Thesis sponsors
Award date  11 Feb 2022
Publisher
ISBN (print)  9789180391528
ISBN (electronic)  9789180391511
Status  Published  2022
Research output: Chapter in book/report/Conference proceeding › Conference paper in proceeding › Peer review
Berg, A., Oskarsson, M., Åström, K. & O'Connor, M.
2019/02/01 → 2024/02/01
Project: Thesis