Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chen, Po-An | en_US |
dc.contributor.author | Lu, Chi-Jen | en_US |
dc.date.accessioned | 2019-05-02T00:26:47Z | - |
dc.date.available | 2019-05-02T00:26:47Z | - |
dc.date.issued | 2015-01-01 | en_US |
dc.identifier.isbn | 978-1-4503-3413-6 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/151714 | - |
dc.description.abstract | Almost all convergence results for repeated games in which each player adopts a specific "no-regret" learning algorithm, such as multiplicative updates or the more general mirror-descent algorithms, are known only in the more generous information model, where each player is assumed to observe the costs of all possible choices, even the unchosen ones, at each time step. This assumption may seem too strong in general; a more realistic one is captured by the bandit model, in which each player at each time step observes only the cost of her currently chosen path, and none of the unchosen ones. Can convergence still be achieved in this more challenging bandit model? We answer this question positively. While existing bandit algorithms do not seem to work here, we develop a new family of bandit algorithms, based on the mirror-descent algorithm, with such a convergence guarantee in atomic congestion games. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | Mirror-descent algorithm | en_US |
dc.subject | No-regret dynamics | en_US |
dc.subject | Convergence | en_US |
dc.title | Playing Congestion Games with Bandit Feedbacks | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.journal | PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15) | en_US |
dc.citation.spage | 1721 | en_US |
dc.citation.epage | 1722 | en_US |
dc.contributor.department | 交大名義發表 | zh_TW |
dc.contributor.department | National Chiao Tung University | en_US |
dc.identifier.wosnumber | WOS:000461455000213 | en_US |
dc.citation.woscount | 0 | en_US |
Appears in Collections: | Conference Papers |
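The abstract above contrasts full-information no-regret learning with the bandit model, where a player observes only the cost of her own chosen action. As a rough illustration of that setting (not the paper's algorithm), the sketch below runs an EXP3-style learner: mirror descent with an entropy regularizer, fed importance-weighted cost estimates built from bandit feedback alone. The fixed cost vector and the step size `eta` are illustrative assumptions.

```python
import math
import random

def bandit_mirror_descent(costs, T, eta):
    """EXP3-style sketch: multiplicative weights (entropy-regularized
    mirror descent) driven only by bandit feedback.

    costs: fixed per-action costs in [0, 1] (a toy stand-in for the
           congestion costs a player would face).
    Returns how often each action was played over T rounds.
    """
    K = len(costs)
    est = [0.0] * K      # cumulative importance-weighted cost estimates
    plays = [0] * K
    for _ in range(T):
        # Mirror-descent step with entropy regularizer: probabilities
        # proportional to exp(-eta * estimated cumulative cost).
        m = min(est)
        w = [math.exp(-eta * (e - m)) for e in est]
        s = sum(w)
        probs = [x / s for x in w]
        # Bandit feedback: sample one action, observe only its cost.
        i = random.choices(range(K), weights=probs)[0]
        observed = costs[i]
        # Unbiased estimate of the full cost vector: the observed cost,
        # importance-weighted by the probability of having chosen it;
        # unchosen actions get an estimate of zero this round.
        est[i] += observed / probs[i]
        plays[i] += 1
    return plays
```

With a low-cost and a high-cost action, the learner should concentrate its plays on the cheaper one despite never seeing the other action's cost directly, which is the bandit-model phenomenon the abstract refers to.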