Measuring correlation between commit frequency and popularity on GitHub

This thesis studies the correlation between the commit frequency and popularity of GitHub projects. Over 12 000 projects were retrieved using the GitHub API, resulting in a dataset containing 85 projects after filtering out projects that were deemed unfit. The analysis of the projects consisted of c...

Bibliographic Details
Main Authors: Grönlund, Mårten, Jefford-Baker, Jonathan
Format: Others
Language: English
Published: KTH, Skolan för datavetenskap och kommunikation (CSC) 2017
Subjects: Computer Sciences; Datavetenskap (datalogi)
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209819
id ndltd-UPSALLA1-oai-DiVA.org-kth-209819
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-kth-209819; 2018-01-14T05:11:39Z; Measuring correlation between commit frequency and popularity on GitHub; eng; Mätning av korrelation mellan commitfrekvens och popularitet på GitHub; Grönlund, Mårten; Jefford-Baker, Jonathan; KTH, Skolan för datavetenskap och kommunikation (CSC); 2017; Computer Sciences; Datavetenskap (datalogi); Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209819; application/pdf; info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Computer Sciences
Datavetenskap (datalogi)
description This thesis studies the correlation between the commit frequency and popularity of GitHub projects. Over 12 000 projects were retrieved using the GitHub API, resulting in a dataset of 85 projects after projects deemed unfit were filtered out. The analysis consisted of calculating the Pearson correlation coefficient with commit frequency and popularity as variables. Different time intervals were studied, along with several metrics of popularity based on the projects' metadata retrieved from GitHub. The results varied across the time intervals and popularity metrics, but none of the measurements produced a correlation coefficient indicating a strong or moderate correlation. The study therefore concluded that there is no correlation between commit frequency and popularity. Although no correlation was found, several potential improvements for further research were identified. === This study examines the correlation between commit frequency and popularity among GitHub projects. Over 12 000 projects were retrieved through the GitHub API, which resulted in a dataset containing 85 projects once unwanted projects had been filtered out. The analysis of the projects consisted of computing the Pearson correlation coefficient with commit frequency and popularity as variables. Based on the projects' metadata from GitHub, different time intervals were examined in combination with several measures of popularity. The results varied across the time intervals and popularity measures, but none of the measurements resulted in a correlation coefficient indicating a strong or moderate correlation. The study thus concluded that no correlation existed between commit frequency and popularity. Although no correlation was found, several potential improvements for further research were identified.
author Grönlund, Mårten
Jefford-Baker, Jonathan
title Measuring correlation between commit frequency and popularity on GitHub
publisher KTH, Skolan för datavetenskap och kommunikation (CSC)
publishDate 2017
url http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209819
_version_ 1718609754231668736
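
The description above names the Pearson correlation coefficient as the measure applied to commit frequency and popularity. The following is a minimal sketch of that calculation, not code from the thesis: the function name pearson_r, the variable names, and the toy values are illustrative assumptions, and the record does not say which popularity metrics or time intervals the authors used.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up toy values for illustration only; NOT data from the thesis.
commits_per_week = [1, 2, 3, 4, 5]  # hypothetical commit frequency per project
stars = [2, 1, 4, 3, 5]             # hypothetical popularity metric per project

r = pearson_r(commits_per_week, stars)
print(f"Pearson r = {r:.2f}")
```

The coefficient lies between -1 and 1; values near 0 indicate little linear relationship, which is the kind of outcome the description reports for every interval and metric studied.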