| 
 Publications
    
    
	
	      Preprints
		
		Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond. [PDF, arXiv]
		Lijun Zhang, Haomin Bai, Peng Zhao, Tianbao Yang, and Zhi-Hua Zhou.
 
 Conference Papers
 2025:
		
			Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness. [PDF forthcoming]   (Spotlight)
			Yuheng Zhao, Yu-Hu Yan, Kfir Yehuda Levy, and Peng Zhao.
 In: Advances in Neural Information Processing Systems 38 (NeurIPS 2025), San Diego, California, 2025. Page: to appear.
 
 
			Provably Efficient Online RLHF with One-Pass Reward Modeling. [PDF, arXiv, bibtex]
			Long-Fei Li*, Yu-Yang Qian*, Peng Zhao, and Zhi-Hua Zhou. (* indicates equal contribution)
 In: Advances in Neural Information Processing Systems 38 (NeurIPS 2025), San Diego, California, 2025. Page: to appear.
 
 
			Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update. [PDF, arXiv, bibtex]
			Yu-Jie Zhang, Sheng-An Xu, Peng Zhao, and Masashi Sugiyama.
 In: Advances in Neural Information Processing Systems 38 (NeurIPS 2025), San Diego, California, 2025. Page: to appear.
 
 
			Optimistic Online-to-Batch Conversions for Accelerated Convergence and Universality. [PDF forthcoming]
			Yu-Hu Yan, Peng Zhao, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 38 (NeurIPS 2025), San Diego, California, 2025. Page: to appear.
 
 
			Parameter-free Algorithms for the Stochastically Extended Adversarial Model. [PDF, arXiv, bibtex]
			Shuche Wang, Adarsh Barik, Peng Zhao, and Vincent Y. F. Tan.
 In: Advances in Neural Information Processing Systems 38 (NeurIPS 2025), San Diego, California, 2025. Page: to appear.
 
 
		Heavy-Tailed Linear Bandits: Huber Regression with One-Pass Update. [PDF, arXiv, bibtex]
		Jing Wang, Yu-Jie Zhang, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada, 2025. Page: to appear.
 
 
		TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree. [PDF, arXiv, bibtex]
		Yu-Yang Qian, Yuan-Ze Xu, Zhen-Yu Zhang, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada, 2025. Page: to appear.
 
 
		Non-stationary Online Learning for Curved Losses: Improved Dynamic Regret via Mixability. [PDF, arXiv, bibtex]
		Yu-Jie Zhang, Peng Zhao, and Masashi Sugiyama.
 In: Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), Vancouver, Canada, 2025. Page: to appear.
 
 
 2024:
	 
		Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation. [PDF, arXiv, bibtex]
		Long-Fei Li, Yu-Jie Zhang, Peng Zhao, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 2024. Page: 58539-58573.
 
 
		Gradient-Variation Online Learning under Generalized Smoothness. [PDF, arXiv, bibtex]
		Yan-Feng Xie, Peng Zhao, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 2024. Page: 37865-37899.
 
 
		A Simple and Optimal Approach for Universal Online Learning with Gradient Variations. [PDF, bibtex]
		Yu-Hu Yan, Peng Zhao, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 2024. Page: 11132-11163.
 
 
		Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs. [PDF, arXiv, bibtex]
		Long-Fei Li, Peng Zhao, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 2024. Page: 55858-55883.
 
 
		Universal Online Convex Optimization with 1 Projection per Round. [PDF, arXiv, bibtex]
		Wenhao Yang, Yibo Wang, Peng Zhao, and Lijun Zhang.
 In: Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 2024. Page: 31438-31472.
 
 
		Efficient Non-stationary Online Learning by Wavelets with Applications to Online Distribution Shift Adaptation. [PDF, bibtex]
		Yu-Yang Qian, Peng Zhao, Yu-Jie Zhang, Masashi Sugiyama, and Zhi-Hua Zhou.
 In: Proceedings of the 41st International Conference on Machine Learning (ICML 2024), Vienna, Austria, 2024. Page: 41383-41415.
 
 
		Learning with Adaptive Resource Allocation. [PDF, bibtex]
		Jing Wang, Miao Yu, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 41st International Conference on Machine Learning (ICML 2024), Vienna, Austria, 2024. Page: 52099-52116.
 
 
		Handling Heterogeneous Curvatures in Bandit LQR Control. [PDF, bibtex] 
		Yu-Hu Yan, Jing Wang, and Peng Zhao.
 In: Proceedings of the 41st International Conference on Machine Learning (ICML 2024), Vienna, Austria, 2024. Page: 55839-55858.
 
 
		Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition. [PDF, arXiv, bibtex]
		Long-Fei Li, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), Valencia, Spain, 2024. Page: 3061-3069.
 
 
		Dynamic Regret of Adversarial MDPs with Unknown Transition and Linear Function Approximation. [PDF, bibtex]
		Long-Fei Li, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024), Vancouver, Canada, 2024. Page: 13572-13580.
 
 
 2023:
	  
		Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach. [PDF, arXiv, bibtex]   (Spotlight)
		Yu-Hu Yan, Peng Zhao, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, Louisiana, 2023. Page: 37682-37715.
 
 
		Dynamic Regret of Adversarial Linear Mixture MDPs. [PDF, bibtex]
		Long-Fei Li, Peng Zhao, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, Louisiana, 2023. Page: 60685-60711.
 
 
		Adapting to Continuous Covariate Shift via Online Density Ratio Estimation. [PDF, arXiv, bibtex]
		Yu-Jie Zhang, Zhen-Yu Zhang, Peng Zhao, and Masashi Sugiyama.
 In: Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, Louisiana, 2023. Page: 29074-29113.
 
 
		Stochastic Approximation Approaches to Group Distributionally Robust Optimization. [PDF, arXiv, bibtex]
		Lijun Zhang, Peng Zhao, Zhen-Hua Zhuang, Tianbao Yang, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, Louisiana, 2023. Page: 52490-52522.
 
 
		Handling New Class in Online Label Shift. [PDF, bibtex]
		Yu-Yang Qian*, Yong Bai*, Zhen-Yu Zhang, Peng Zhao, and Zhi-Hua Zhou. (* indicates equal contribution)
 In: Proceedings of the 23rd IEEE International Conference on Data Mining (ICDM 2023), Shanghai, China, 2023. Page: 1283-1288.
      ♣ Journal version [PDF] published at IEEE TKDE, with more results (particularly allowing emerging new classes).
 
 
		Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization. [PDF, long version, arXiv, bibtex]
		Sijia Chen, Wei-Wei Tu, Peng Zhao, and Lijun Zhang.
 In: Proceedings of the 40th International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii, 2023. Page: 5002-5035.
      ♣ Journal version [PDF] published at JMLR, with many more results (improved bounds for strongly convex functions, new bounds for non-smooth scenarios).
 
 
		Fast Rates in Time-Varying Strongly Monotone Games. [PDF, bibtex]
		Yu-Hu Yan, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 40th International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii, 2023. Page: 39138-39164.
 
 
		Revisiting Weighted Strategy for Non-stationary Parametric Bandits. [PDF, arXiv, bibtex]
		Jing Wang, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023), Valencia, Spain, 2023. Page: 7913-7942.
 
 
		Beyond Performative Prediction: Open-environment Learning with Presence of Corruptions. [PDF, bibtex]
		Jia-Wei Shan, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023), Valencia, Spain, 2023. Page: 7981-7998.
 
 
 2022:
	  
		Efficient Methods for Non-stationary Online Learning. [PDF, long version, arXiv, bibtex]   (Oral)
		Peng Zhao, Yan-Feng Xie, Lijun Zhang, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, Louisiana, 2022. Page: 11573-11585.
      ♣ Journal version [PDF] published at JMLR, with many more results (e.g., interval dynamic regret, new applications including online non-stochastic control and online PCA).
 
 
		Adapting to Online Label Shift with Provable Guarantees. [PDF, arXiv, code, bibtex] 
		Yong Bai*, Yu-Jie Zhang*, Peng Zhao, Masashi Sugiyama, and Zhi-Hua Zhou.  (* indicates equal contribution)
 In: Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, Louisiana, 2022. Page: 29960-29974.
 
 
		Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits. [PDF, arXiv, bibtex]
		Haipeng Luo, Mengxiao Zhang, Peng Zhao, and Zhi-Hua Zhou. (alphabetical order)
 In: Proceedings of the 35th Annual Conference on Learning Theory (COLT 2022), London, UK, 2022. Page: 3635-3684.
 
 
		Adaptive Bandit Convex Optimization with Heterogeneous Curvature. [PDF, arXiv, bibtex]
		Haipeng Luo, Mengxiao Zhang, and Peng Zhao. (alphabetical order)
 In: Proceedings of the 35th Annual Conference on Learning Theory (COLT 2022), London, UK, 2022. Page: 1576-1612.
 
 
		No-Regret Learning in Time-Varying Zero-Sum Games. [PDF, arXiv, bibtex]
		Mengxiao Zhang*, Peng Zhao*, Haipeng Luo, and Zhi-Hua Zhou. (* indicates equal contribution)
 In: Proceedings of the 39th International Conference on Machine Learning (ICML 2022), Baltimore, Maryland, 2022. Page: 26772-26808.
 
 
		Dynamic Regret of Online Markov Decision Processes. [PDF, arXiv, full version, bibtex]
		Peng Zhao, Long-Fei Li, and Zhi-Hua Zhou.
 In: Proceedings of the 39th International Conference on Machine Learning (ICML 2022), Baltimore, Maryland, 2022. Page: 26865-26894.
 
 
		Non-stationary Online Learning with Memory and Non-stochastic Control. [PDF, arXiv, bibtex]
		Peng Zhao, Yu-Xiang Wang, and Zhi-Hua Zhou.
 In: Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022), online, 2022. Page: 2101-2133.
      ♣ Journal version [PDF] published at JMLR, with improved memory dependence (as well as a lower bound).
 
 
		Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits. [PDF, arXiv, bibtex]
		Youming Tao*, Yulian Wu*, Peng Zhao, and Di Wang. (* indicates equal contribution)
 In: Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022), online, 2022. Page: 1546-1574.
 
 
 2021:
	  
		Improved Analysis for Dynamic Regret of Strongly Convex and Smooth Functions. [PDF, arXiv, bibtex]
		Peng Zhao and Lijun Zhang.
 In: Proceedings of the 3rd Conference on Learning for Dynamics and Control (L4DC 2021), online, 2021. Page: 48-59.
 
 
        Exploratory Machine Learning with Unknown Unknowns. [PDF, code, bibtex]
		Peng Zhao, Yu-Jie Zhang, and Zhi-Hua Zhou.
 In: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021), online, 2021. Page: 10999-11006.
      ♣ Journal version [PDF] published at Artificial Intelligence.
 
 
        Towards Enabling Learnware to Handle Unseen Jobs. [PDF, code, bibtex]
		Yu-Jie Zhang, Yu-Hu Yan, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021), online, 2021. Page: 10964-10972.
 
 
        Storage Fit Learning with Feature Evolvable Streams. [PDF, code, bibtex]
		Bo-Jian Hou, Yu-Hu Yan, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021), online, 2021. Page: 7729-7736.
 
 
 2020:
	  
		Dynamic Regret of Convex and Smooth Functions. [PDF, arXiv, bibtex]
		Peng Zhao, Yu-Jie Zhang, Lijun Zhang, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Vancouver, Canada, 2020. Page: 12510-12520.
      ♣ Journal version [PDF] finally got published at JMLR (after an almost three-year review period).
      ♣ Many new results are included, e.g., one-gradient complexity for gradient-variation dynamic regret and the collaborative online ensemble framework.
 
 
        An Unbiased Risk Estimator for Learning with Augmented Classes. [PDF, arXiv, code, bibtex]
		Yu-Jie Zhang, Peng Zhao, Lanjihong Ma, and Zhi-Hua Zhou.
 In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Vancouver, Canada, 2020. Page: 10247-10258.
 
 
        Learning with Feature and Distribution Evolvable Streams. [PDF, code, bibtex]
		Zhen-Yu Zhang, Peng Zhao, Yuan Jiang, and Zhi-Hua Zhou.
 In: Proceedings of the 37th International Conference on Machine Learning (ICML 2020), Vienna, Austria, 2020. Page: 11317-11327.
 
 
        A Simple Online Algorithm for Competing with Dynamic Comparators. [PDF, bibtex]
		Yu-Jie Zhang, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI 2020), Toronto, Canada, 2020. Page: 390-399.
 
 
        Bandit Convex Optimization in Non-stationary Environments. [PDF, journal, arXiv, bibtex]
		Peng Zhao, Guanghui Wang, Lijun Zhang, and Zhi-Hua Zhou.
 In: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), Palermo, Italy, 2020. Page: 1508-1518.
      ♣ Journal version [PDF] published at JMLR, with new results on adaptive regret bounds for BCO.
 
 
        A Simple Approach for Non-stationary Linear Bandits. [PDF, arXiv version2, errata, bibtex] 
		Peng Zhao, Lijun Zhang, Yuan Jiang, and Zhi-Hua Zhou.
 In: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), Palermo, Italy, 2020. Page: 746-755.
      ♣ A corrected and self-contained version [PDF] is available as arXiv version 2.
 
 
        Optimal Margin Distribution Learning in Dynamic Environments. [PDF, bibtex] 
		Teng Zhang, Peng Zhao, and Hai Jin.
 In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI 2020), New York, NY, 2020. Page: 6821-6828.
 
 
 2019 and earlier:
	  
		 Nearest Neighbor Ensembles: An Effective Method for Difficult Problems in Streaming Classification with Emerging New Classes. [PDF, code, bibtex]
		 Xin-Qiang Cai, Peng Zhao, Kai Ming Ting, Xin Mu, and Yuan Jiang.
 In: Proceedings of the 19th IEEE International Conference on Data Mining (ICDM 2019), Beijing, China, 2019. Page: 970-975.
 
 
		Learning from Incomplete and Inaccurate Supervision. [PDF, code, bibtex]
		Zhen-Yu Zhang, Peng Zhao, Yuan Jiang, and Zhi-Hua Zhou.
 In: Proceedings of the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2019), Anchorage, AK, 2019. Page: 1017-1025.
      ♣ Journal version [PDF] published at IEEE TKDE.
 
 
		Improving Deep Forest by Confidence Screening. [PDF, code, bibtex]
		Ming Pang, Kai Ming Ting, Peng Zhao, and Zhi-Hua Zhou.
 In: Proceedings of the 18th IEEE International Conference on Data Mining (ICDM 2018), Singapore, 2018. Page: 1194-1199.
      ♣ Journal version [PDF] published at IEEE TKDE.
 
 
		Label Distribution Learning by Optimal Transport. [PDF, supp, code, bibtex]
		Peng Zhao and Zhi-Hua Zhou.
 In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI 2018), New Orleans, Louisiana, 2018. Page: 4506-4513.
 
 
        Dual Set Multi-Label Learning. [PDF, supp, code, bibtex]
		Chong Liu, Peng Zhao, Sheng-Jun Huang, Yuan Jiang, and Zhi-Hua Zhou.
 In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI 2018), New Orleans, Louisiana, 2018. Page: 3635-3642.
 
 
        Multi-View Matrix Completion for Clustering with Side Information. [PDF, code, bibtex] 
		Peng Zhao, Yuan Jiang, and Zhi-Hua Zhou.
 In: Proceedings of the 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017), LNCS, Jeju, Korea, 2017. Page: 403-415.
 Journal Papers
	  
		Efficient Methods for Non-stationary Online Learning. [PDF, arXiv, bibtex]
		Peng Zhao, Yan-Feng Xie, Lijun Zhang, and Zhi-Hua Zhou.
 Journal of Machine Learning Research (JMLR), 25(xxx):1−66, 2025.
 
 
	  
		Handling New Class in Online Label Shift. [PDF, bibtex]
		Yu-Yang Qian, Yong Bai, Zhen-Yu Zhang, Peng Zhao, and Zhi-Hua Zhou.
 IEEE Transactions on Knowledge and Data Engineering (TKDE), in press, 2025.
 
 
		Learning Objective Adaptation by Correlation-Based Model Reuse. [PDF, official version, bibtex]
		Lanjihong Ma, Yao-Xiang Ding, Peng Zhao, and Zhi-Hua Zhou.
 IEEE Transactions on Neural Networks and Learning Systems (TNNLS), in press, 2024.
 
 
		Learning with Asynchronous Labels. [PDF, code, bibtex]
		Yu-Yang Qian, Zhen-Yu Zhang, Peng Zhao, and Zhi-Hua Zhou.
 ACM Transactions on Knowledge Discovery from Data (TKDD), 18(8):1−27, 2024.
 
 
		Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization. [PDF, arXiv, bibtex]
		Sijia Chen, Yu-Jie Zhang, Wei-Wei Tu, Peng Zhao, and Lijun Zhang.
 Journal of Machine Learning Research (JMLR), 25(178):1−62, 2024.
 
 
		Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization. [PDF, arXiv, bibtex]
		Peng Zhao, Yu-Jie Zhang, Lijun Zhang, and Zhi-Hua Zhou.
 Journal of Machine Learning Research (JMLR), 25(98):1−52, 2024.
 
 
		Exploratory Machine Learning with Unknown Unknowns. [PDF, arXiv, code, bibtex]
		Peng Zhao, Jia-Wei Shan, Yu-Jie Zhang, and Zhi-Hua Zhou.
 Artificial Intelligence (AIJ), Volume 327:104059, 2024.
 
 
		Online Non-stochastic Control with Partial Feedback. [PDF, bibtex]
		Yu-Hu Yan, Peng Zhao, and Zhi-Hua Zhou.
 Journal of Machine Learning Research (JMLR), 24(273):1−50, 2023.
 
 
		Non-stationary Online Learning with Memory and Non-stochastic Control. [PDF, arXiv, bibtex]
		Peng Zhao, Yu-Hu Yan, Yu-Xiang Wang, and Zhi-Hua Zhou.
 Journal of Machine Learning Research (JMLR), 24(206):1−70, 2023.
 
 
		Learning from Incomplete and Inaccurate Supervision. [PDF, official version, code, bibtex]
		Zhen-Yu Zhang, Peng Zhao, Yuan Jiang, and Zhi-Hua Zhou.
 IEEE Transactions on Knowledge and Data Engineering (TKDE), 34(12): 5854-5868, 2022.
 
 
		Improving Deep Forest by Screening. [PDF, official version, bibtex]
		Ming Pang, Kai Ming Ting, Peng Zhao, and Zhi-Hua Zhou.
 IEEE Transactions on Knowledge and Data Engineering (TKDE), 34(9): 4298-4312, 2022.
 
 
        Bandit Convex Optimization in Non-stationary Environments. [PDF, arXiv, bibtex]
		Peng Zhao, Guanghui Wang, Lijun Zhang, and Zhi-Hua Zhou.
 Journal of Machine Learning Research (JMLR), 22(125):1−45, 2021.
 
 
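		Learning from Distribution-Changing Data Streams via Decision Tree Model Reuse (in Chinese). [PDF]
		Peng Zhao and Zhi-Hua Zhou.
 SCIENTIA SINICA Informationis, 51(1): 1-12, 2021. (Cover article)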
 
 
        Distribution-Free One-Pass Learning. [PDF, official version, code, bibtex]
		Peng Zhao, Xinqiang Wang, Siyu Xie, Lei Guo, and Zhi-Hua Zhou.
 IEEE Transactions on Knowledge and Data Engineering (TKDE), 33(3): 951-963, 2021.
 
 
        Handling Concept Drift via Model Reuse. [PDF, official version, code, bibtex] 
		Peng Zhao, Le-Wen Cai, and Zhi-Hua Zhou.
 Machine Learning (Special Issue of the ACML 2019 Journal Track), 109(3): 533-568, 2020.
 
 Technical Notes/Lecture Notes
      
		Lecture 9: Optimism for Acceleration. [PDF, course site, bibtex]
		Peng Zhao. Lecture Notes for Advanced Optimization, 2025.
 
 
		Non-stationary Linear Bandits Revisited. [PDF, arXiv]
		Peng Zhao and Lijun Zhang. Technical Note, 2021.
 
 
	
 |