Time efficiency and mistake rates for online learning algorithms : A comparison between Online Gradient Descent and Second Order Perceptron algorithm and their performance on two different data sets

This dissertation investigates the differences between two different online learning algorithms: Online Gradient Descent (OGD) and Second-Order Perceptron (SOP) algorithm, and how well they perform on different data sets in terms of mistake rate, time cost and number of updates. By studying differen...

Full description

Bibliographic Details
Main Authors: Holmgren Faghihi, Josef, Gorgis, Paul
Format: Others
Language:English
Published: KTH, Skolan för elektroteknik och datavetenskap (EECS) 2019
Subjects:
Online Access:http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-260087
Description
Summary:This dissertation investigates the differences between two different online learning algorithms: Online Gradient Descent (OGD) and Second-Order Perceptron (SOP) algorithm, and how well they perform on different data sets in terms of mistake rate, time cost and number of updates. By studying different online learning algorithms and how they perform in different environments will help understand and develop new strategies to handle further online learning tasks. The study includes two different data sets, Pima Indians Diabetes and Mushroom, together with the LIBOL library for testing. The results in this dissertation show that Online Gradient Descent performs overall better concerning the tested data sets. In the first data set, Online Gradient Descent recorded a notably lower mistake rate. For the second data set, although it recorded a slightly higher mistake rate, the algorithm was remarkably more time efficient compared to Second-Order Perceptron. Future work would include a wider range of testing with more, and different, data sets as well as other relative algorithms. This will lead to better result and higher credibility. === Den här avhandlingen undersöker skillnaden mellan två olika “online learning”-algoritmer: Online Gradient Descent och Second-Order Perceptron, och hur de presterar på olika datasets med fokus på andelen felklassificeringar, tidseffektivitet och antalet uppdateringar. Genom att studera olika “online learning”-algoritmer och hur de fungerar i olika miljöer, kommer det hjälpa till att förstå och utveckla nya strategier för att hantera vidare “online learning”-problem. Studien inkluderar två olika dataset, Pima Indians Diabetes och Mushroom, och använder biblioteket LIBOL för testning. Resultatet i denna avhandling visar att Online Gradient Descent presterar bättre överlag på de testade dataseten. För det första datasetet visade Online Gradient Descent ett betydligt lägre andel felklassificeringar. För det andra datasetet visade OGD lite högre andel felklassificeringar, men samtidigt var algoritmen anmärkningsvärt mer tidseffektiv i jämförelse med Second-Order Perceptron. Framtida studier inkluderar en bredare testning med mer, och olika, datasets och andra relaterade algoritmer. Det leder till bättre resultat och höjer trovärdigheten.