Analysis of clickstream data

This thesis is concerned with providing further statistical development in the area of web usage analysis to explore web browsing behaviour patterns. We received two data sources: web log files and operational data files for the websites, which contained information on online purchases. There are ma...

Bibliographic Details
Main Author: Jamalzadeh, Mohammadamin
Published: Durham University 2011
Subjects:
519
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.545530
id ndltd-bl.uk-oai-ethos.bl.uk-545530
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-545530 2015-03-20T04:50:12Z Analysis of clickstream data Jamalzadeh, Mohammadamin 2011 519 Durham University http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.545530 http://etheses.dur.ac.uk/3366/ Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 519
description This thesis is concerned with providing further statistical development in the area of web usage analysis to explore web browsing behaviour patterns. We received two data sources: web log files and operational data files for the websites, the latter containing information on online purchases. There are many research questions regarding web browsing behaviour. Specifically, we focused on the depth-of-visit metric and carried out an exploratory analysis of this feature using clickstream data. Because of the large volume of data available in this context, we chose to present effect size measures alongside all statistical analyses of the data. We introduced two new robust measures of effect size for two-sample comparison studies in non-normal situations, specifically where the difference between the two populations is due to the shape parameter. The proposed effect sizes perform adequately for non-normal data, as well as when the two distributions differ in their shape parameters. We then focus on conversion analysis, investigating the causal relationship between general clickstream information and online purchasing using a logistic regression approach. The aim is to find a classifier by assigning a probability to the event of online shopping on an e-commerce website. We also develop the application of a mixture of hidden Markov models (MixHMM) to model web browsing behaviour using sequences of web pages viewed by users of an e-commerce website. The mixture of hidden Markov models is fitted in a Bayesian context using Gibbs sampling. We address the slow mixing problem of Gibbs sampling in high-dimensional models, and use over-relaxed Gibbs sampling, as well as the forward-backward EM algorithm, to obtain an adequate sample from the posterior distributions of the parameters. The MixHMM has the advantage of clustering users based on their browsing behaviour, and also gives an automatic classification of web pages based on the probability of each web page being observed by visitors to the website.
author Jamalzadeh, Mohammadamin
title Analysis of clickstream data
publisher Durham University
publishDate 2011
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.545530