
Detailed Record

Author (Chinese): 莊喻捷
Author (English): Chuang, Yu-Chieh
Title (Chinese): 基於深度強化學習之聚合商價格策略應用於再生能源市場
Title (English): Deep Reinforcement Learning Based Pricing Strategy of Aggregators Considering Renewable Energy
Advisor (Chinese): 邱偉育
Advisor (English): Chiu, Wei-Yu
Committee Members (Chinese): 楊念哲, 陳以錚, 陳翔傑
Committee Members (English): Yang, Nien-Che; Chen, Yi-Cheng; Chen, Hsiang-Chieh
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Electrical Engineering (電機工程學系)
Student ID: 107061538
Year of Publication: 2020 (ROC year 109)
Graduation Academic Year: 109
Language: English
Pages: 34
Keywords (Chinese): 深度強化學習, 智慧電網, 能源聚合商, 競價策略, 能源交易, 可再生能源
Keywords (English): deep reinforcement learning, smart grid, energy aggregator, pricing strategy, energy trading, renewable energy
Abstract (Chinese): With rapid technological change and the spread of renewable energy driven by rising environmental awareness, the aggregator's role as an intermediary between producers and consumers in the power grid has become increasingly important: it integrates cash and energy flows while maintaining grid stability and reliability. Against this background, this study proposes a deep reinforcement learning based pricing strategy that maximizes profit and maintains the supply-demand balance while competing against rival aggregators. The proposed strategy jointly considers the designed variation indices, opponents' strategies, and the market state; the indices incorporate the intermittency and uncertainty of renewable energy into the framework. In addition, because energy storage systems are becoming common among aggregators and the charging/discharging bounds change with the current state of the storage system, a rule-based charging/discharging strategy is designed that can be integrated into the proposed pricing framework. Comparative results show that the proposed framework outperforms existing models and that the designed indices effectively accelerate learning.
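The abstract mentions a rule-based charging/discharging strategy whose bounds vary with the current storage state. A minimal sketch of that general idea follows; all function names, the efficiency value, and the "store surplus, cover deficit" rule are illustrative assumptions, not the thesis's actual control law.

```python
# Sketch of a rule-based storage control law with state-dependent bounds
# (illustrative assumptions only; not the thesis's exact rules).

def charge_bounds(soc, capacity, p_max, eta=0.95):
    """Feasible discharging (-) and charging (+) power for one step,
    given stored energy `soc` (kWh), capacity (kWh), rated power p_max (kW),
    and a round-trip efficiency factor `eta`."""
    max_charge = min(p_max, (capacity - soc) / eta)  # cannot overfill
    max_discharge = min(p_max, soc * eta)            # cannot over-drain
    return -max_discharge, max_charge

def control_law(net_surplus, soc, capacity, p_max):
    """Rule: store surplus energy, discharge to cover deficits,
    always clipped to the SoC-dependent bounds."""
    lo, hi = charge_bounds(soc, capacity, p_max)
    return min(max(net_surplus, lo), hi)

# Example: half-full 10 kWh battery, 3 kW rating, 5 kW surplus -> charge at 3 kW
action = control_law(5.0, soc=5.0, capacity=10.0, p_max=3.0)
```

Because the bounds are recomputed from the current state of charge at every step, the rule stays feasible as the storage state drifts, which is the property the abstract highlights.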
Abstract (English): With the rapid development of information and communications technology and the high penetration of renewable energy, the role of an aggregator in a smart grid has emerged to better coordinate power and cash flows between energy producers and consumers. In this study, variation indices derived from the statistics of renewables and a control law for an energy storage system are proposed. A deep reinforcement learning based pricing strategy of an aggregator for profit maximization in consideration of the energy balance is developed accordingly. The proposed approach can account for opponents' behaviors, the variability of renewables, and varying bounds of charging and discharging events in a nonstationary environment, which can hardly be addressed by conventional learning algorithms such as Q-learning and deep Q-network. Numerical analysis using real-world data shows that the proposed approach outperforms existing pricing strategies in terms of learning speed and aggregator profit.
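The English abstract says the variation indices are built from the statistics of renewables and fed to the learning agent. One plausible minimal sketch is a rolling coefficient of variation appended to the state vector; the specific statistic, window, and function names here are assumptions for illustration, not the indices defined in the thesis.

```python
import numpy as np

# Sketch: a variation index summarizing recent renewable output, appended
# to the agent's state so the pricing policy can react to intermittency.
# (Illustrative assumption; the thesis defines its own indices.)

def variation_index(recent_output, eps=1e-8):
    """Coefficient of variation over a sliding window of renewable output."""
    window = np.asarray(recent_output, dtype=float)
    return window.std() / (window.mean() + eps)

def build_state(market_features, recent_output):
    """State vector = market observations + variation index."""
    return np.append(market_features, variation_index(recent_output))

# Steady output -> index near 0; volatile output -> index near 1.
state = build_state(np.array([0.8, 0.2]), [3.0, 3.1, 2.9, 3.0])
```

Folding such a statistic into the state is one common way to expose renewable intermittency to a DRL agent without changing the learning algorithm itself.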
Table of Contents:
Abstract (Chinese) ii
Abstract iii
Acknowledgement iv
Contents v
List of Figures vii
List of Tables viii
Glossary ix
1 Introduction 1
2 Related Work 6
3 System Models 8
3.1 Market Model 9
3.2 Aggregator Selection Model 10
3.3 Markov Decision Process 11
3.4 Storage System Model 11
4 Proposed Pricing Strategy 13
4.1 Variation Indices and Control Law 13
4.2 Deep Reinforcement Learning for Pricing Strategy 16
5 Numerical Results 21
6 Conclusion 28
Bibliography 29