Measuring correlation between commit frequency and popularity on GitHub
This thesis studies the correlation between the commit frequency and popularity of GitHub projects. Over 12 000 projects were retrieved using the GitHub API, resulting in a dataset of 85 projects after unfit projects were filtered out. The analysis of the projects consisted of c...
Main Authors: | Grönlund, Mårten; Jefford-Baker, Jonathan |
---|---|
Format: | Others |
Language: | English |
Published: | KTH, Skolan för datavetenskap och kommunikation (CSC), 2017 |
Subjects: | Computer Sciences; Datavetenskap (datalogi) |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209819 |
id |
ndltd-UPSALLA1-oai-DiVA.org-kth-209819 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-kth-209819; 2018-01-14T05:11:39Z; Measuring correlation between commit frequency and popularity on GitHub; eng; Mätning av korrelation mellan commitfrekvens och popularitet på GitHub; Grönlund, Mårten; Jefford-Baker, Jonathan; KTH, Skolan för datavetenskap och kommunikation (CSC); 2017; Computer Sciences; Datavetenskap (datalogi); Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209819; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Computer Sciences Datavetenskap (datalogi) |
description |
This thesis studies the correlation between the commit frequency and popularity of GitHub projects. Over 12 000 projects were retrieved using the GitHub API; after projects deemed unfit were filtered out, the dataset contained 85 projects. The analysis consisted of calculating the Pearson correlation coefficient with commit frequency and popularity as variables. Different time intervals were studied, along with several popularity metrics based on the project metadata retrieved from GitHub. The results varied across time intervals and popularity metrics, but none of the measurements yielded a correlation coefficient indicating a strong or moderate correlation. The study therefore concluded that no correlation exists between commit frequency and popularity. Although no correlation was found, several potential improvements for further research were identified. |
author |
Grönlund, Mårten; Jefford-Baker, Jonathan |
title |
Measuring correlation between commit frequency and popularity on GitHub |
publisher |
KTH, Skolan för datavetenskap och kommunikation (CSC) |
publishDate |
2017 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209819 |
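
The description above names the Pearson correlation coefficient, computed with commit frequency and popularity as variables. The following is a minimal sketch of that calculation, not the authors' implementation. The function name `pearson_r`, the choice of star count as the popularity metric, and all sample values are assumptions made for illustration; none of the numbers come from the thesis.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, one entry per project (illustration only, not thesis data).
commits_per_week = [3.0, 12.5, 0.8, 7.2, 5.1]
stars = [150, 90, 400, 220, 60]  # assumed popularity metric

r = pearson_r(commits_per_week, stars)
print(f"r = {r:.3f}")
```

The coefficient always lies between -1 and 1. Repeating the call for each time interval and each popularity metric would give one coefficient per combination, which is the kind of comparison the description reports.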