Figure: Illustration of the unimodality assumption in the transposition graph of permutations. Each node represents a permutation (a recommendation) ฯ€ โˆˆ Sโ‚™, and edges connect transposition neighbors. The efficiency function f(ฯ€) (the number of clicks per recommendation) forms a single global peak, ensuring that from any node there exists a strictly increasing path toward the optimal permutation ฯ€*.

๐Ÿ’กAbstract

We tackle the online ranking problem of assigning L items to K positions on a web page in order to maximize the number of user clicks. We propose an original algorithm, easy to implement and with strong theoretical guarantees to tackle this problem in the Position-Based Model (PBM) setting, well suited for applications where items are displayed on a grid. Besides learning to rank, our algorithm, GRAB (for parametric Graph for unimodal RAnking Bandit), also learns the parameter of a compact graph over permutations of K items among L. The logarithmic regret bound of this algorithm is a direct consequence of the unimodality property of the bandit setting with respect to the learned graph. Experiments against state-of-the-art learning algorithms which also tackle the PBM setting, show that our method is more efficient while giving regret performance on par with the best known algorithms on simulated and real life datasets

๐Ÿ“Š Results

The theoretical performance of our algorithm compared to existing state-of-the-art methods is shown in the table below:

Our approach stands out with lighter assumptions and the lowest upper bound on theoretical regret across all compared methods.

These findings are further supported by experiments. The only exception is PB-MHB, which achieved the best empirical performance but comes with two major drawbacks: it is over ten times slower than GRAB when producing recommendations and lacks theoretical guarantees.

๐Ÿ“š Citation

If you use this code in your research, please cite:

BibTeX:

@InProceedings{pmlr-v139-gauthier21a,
  title = 	 {Parametric Graph for Unimodal Ranking Bandit},
  author =       {Gauthier, Camille-Sovanneary and Gaudel, Romaric and Fromont, Elisa and Lompo, Boammani Aser},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {3630--3639},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/gauthier21a/gauthier21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/gauthier21a.html},
}