Generalized mirror descents in congestion games

doi:10.1016/j.artint.2016.09.002

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chen, Po-An	en_US
dc.contributor.author	Lu, Chi-Jen	en_US
dc.date.accessioned	2017-04-21T06:56:08Z	-
dc.date.available	2017-04-21T06:56:08Z	-
dc.date.issued	2016-12	en_US
dc.identifier.issn	0004-3702	en_US
dc.identifier.uri	http://dx.doi.org/10.1016/j.artint.2016.09.002	en_US
dc.identifier.uri	http://hdl.handle.net/11536/132792	-
dc.description.abstract	Different types of dynamics have been studied in repeated game play, and one of them which has received much attention recently consists of those based on "no-regret" algorithms from the area of machine learning. It is known that dynamics based on generic no-regret algorithms may not converge to Nash equilibria in general, but to a larger set of outcomes, namely coarse correlated equilibria. Moreover, convergence results based on generic no-regret algorithms typically use a weaker notion of convergence: the convergence of the average plays instead of the actual plays. Some work has been done showing that when using a specific no-regret algorithm, the well-known multiplicative updates algorithm, convergence of actual plays to equilibria can be shown and better quality of outcomes in terms of the price of anarchy can be reached for atomic congestion games and load balancing games. Are there more cases of natural no-regret dynamics that perform well in suitable classes of games in terms of convergence and quality of outcomes that the dynamics converge to? We answer this question positively in the bulletin-board model by showing that when employing the mirror-descent algorithm, a well-known generic no-regret algorithm, the actual plays converge quickly to equilibria in nonatomic congestion games. This gives rise to a family of algorithms, including the multiplicative updates algorithm and the gradient descent algorithm as well as many others. Furthermore, we show that our dynamics achieves good bounds on the outcome quality in terms of the price-of-anarchy type of measures with two different social costs: the average individual cost and the maximum individual cost. Finally, the bandit model considers a probably more realistic and prevalent setting with only partial information, in which at each time step each player only knows the cost of her own currently played strategy, but not any costs of unplayed strategies. For the class of atomic congestion games, we propose a family of bandit algorithms based on the mirror descent algorithms previously presented, and show that when each player individually adopts such a bandit algorithm, their joint (mixed) strategy profile quickly converges with implications. (C) 2016 Elsevier B.V. All rights reserved.	en_US
dc.language.iso	en_US	en_US
dc.subject	Mirror-descent algorithm	en_US
dc.subject	No-regret dynamics	en_US
dc.subject	Convergence	en_US
dc.subject	Bandit model	en_US
dc.title	Generalized mirror descents in congestion games	en_US
dc.identifier.doi	10.1016/j.artint.2016.09.002	en_US
dc.identifier.journal	ARTIFICIAL INTELLIGENCE	en_US
dc.citation.volume	241	en_US
dc.citation.spage	217	en_US
dc.citation.epage	243	en_US
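dc.contributor.department	Department of Information Management and Finance (note: formerly the Institute of Information Management and the Institute of Finance)	zh_TW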
dc.contributor.department	Department of Information Management and Finance	en_US
dc.identifier.wosnumber	WOS:000387518000009	en_US
Appears in Collections:	Journal Articles