Meta Dynamic Pricing: Transfer Learning Across Experiments

We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation in which the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then pro...

Bibliographic Details
Main Authors:	Bastani, H. (Author), Simchi-Levi, D. (Author), Zhu, R. (Author)
Format:	Article
Language:	English
Published:	INFORMS Inst. for Operations Res. and the Management Sciences 2022
Subjects:	Costs; Dynamic pricing; empirical Bayes; Empirical Bayes; Learn+; Learning algorithms; meta learning; Metalearning; misspecified prior; Misspecified prior; Pricing experiments; Related products; Shared structures; Thompson sampling; Thompson samplings; transfer learning; Transfer learning; Uncertainty analysis
Online Access:	View Fulltext in Publisher


LEADER	02526nam a2200373Ia 4500
001	10.1287-mnsc.2021.4071
008	220706s2022 CNT 000 0 und d
020			\|a 00251909 (ISSN)
245	1	0	\|a Meta Dynamic Pricing: Transfer Learning Across Experiments
260		0	\|b INFORMS Inst.for Operations Res.and the Management Sciences \|c 2022
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1287/mnsc.2021.4071
520	3		\|a We study the problem of learning shared structure across a sequence of dynamic pricing experiments for related products. We consider a practical formulation in which the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then propose a meta dynamic pricing algorithm that learns this prior online while solving a sequence of Thompson sampling pricing experiments (each with horizon T) for N different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (meta-exploration) with the need to leverage the estimated prior to achieve good performance (meta-exploitation) and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the estimated prior as a function of its estimation error. We introduce a novel prior alignment technique to analyze the regret of Thompson sampling with a misspecified prior, which may be of independent interest. Unlike prior-independent approaches, our algorithm’s meta regret grows sublinearly in N, demonstrating that the price of an unknown prior in Thompson sampling can be negligible in experiment-rich environments (large N). Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared with prior-independent algorithms. Copyright: © 2021 INFORMS
650	0	4	\|a Costs
650	0	4	\|a Dynamic pricing
650	0	4	\|a empirical Bayes
650	0	4	\|a Empirical Bayes
650	0	4	\|a Learn+
650	0	4	\|a Learning algorithms
650	0	4	\|a meta learning
650	0	4	\|a Metalearning
650	0	4	\|a misspecified prior
650	0	4	\|a Misspecified prior
650	0	4	\|a Pricing experiments
650	0	4	\|a Related products
650	0	4	\|a Shared structures
650	0	4	\|a Thompson sampling
650	0	4	\|a Thompson samplings
650	0	4	\|a transfer learning
650	0	4	\|a Transfer learning
650	0	4	\|a Uncertainty analysis
700	1		\|a Bastani, H. \|e author
700	1		\|a Simchi-Levi, D. \|e author
700	1		\|a Zhu, R. \|e author
773			\|t Management Science
