Record Detail

Author (Chinese): 莊雅伊
Author (English): Chuang, Ya-Yi
Title (Chinese): 運用深度強化學習於全調變索引系統之碼本設計
Title (English): Codebook Design of All Index Modulation with Deep Reinforcement Learning
Advisor (Chinese): 吳仁銘
Advisor (English): Wu, Jen-Ming
Committee Members (Chinese): 簡鳳村, 鐘偉和, 桑梓賢
Committee Members (English): Chien, Feng-Tsun; Chung, Wei-Ho; Sang, Tzu-Hsien
Degree: Master
University: National Tsing Hua University
Department: Institute of Communications Engineering
Student ID: 108064505
Publication Year (ROC calendar): 110 (2021)
Graduation Academic Year: 109
Language: English
Pages: 48
Keywords (Chinese): 深度強化學習, 全調變索引系統, 碼本設計
Keywords (English): Deep Reinforcement Learning, All Index Modulation, Codebook Design
Statistics:
  • Recommendations: 0
  • Views: 572
  • Downloads: 0
  • Bookmarks: 0
This thesis proposes a codebook design based on deep reinforcement learning (DRL) for OFDM with All Index Modulation. All Index Modulation (AIM) is an improved form of Index Modulation (IM): unlike conventional IM, it removes the modulated signal symbols and encodes information only through index symbols mapped to a codebook, thereby achieving a higher diversity gain.
The error performance of AIM depends on the design of the codebook. However, conventional codebook design methods perform poorly at low SNR, and finding the optimal codebook by exhaustive search is prohibitively complex. We therefore apply deep reinforcement learning to reduce the complexity while achieving better error-rate performance.
We map the codebook design problem onto a deep reinforcement learning setting and design the reward function and the other elements of the formulation. We adopt a Deep Recurrent Q-Network (DRQN) so that the method can be applied to codebooks with longer codewords. To determine the reward value toward which training should converge, we analyze the relationship between the Hamming distance and the Euclidean distance of the ideal optimal codebook for a given codebook size and number of subcarriers, and derive an upper bound on the error rate (BER).
Our simulation results show that the proposed model has lower complexity than other codebook design methods. Its error-rate performance also surpasses previous codebook designs and approaches the error-rate upper bound.
The density of the codebook decreases as the number of subcarriers increases; therefore, at the same spectral efficiency, the error-rate performance can improve. However, previous codebook design methods are difficult to realize because of their high complexity. The DRL-based codebook design can be applied to an arbitrary number of subcarriers, which makes the DRL approach very attractive.
In this thesis, we present a codebook design with deep reinforcement learning for OFDM with All Index Modulation (OFDM-AIM). AIM is an improved form of Index Modulation (IM): it removes the modulation symbols of conventional IM and encodes information only through index symbols mapped to a codebook, thereby achieving a higher diversity gain.
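To make the index-to-codeword mapping concrete, the following is a minimal, hypothetical sketch of AIM-style encoding and maximum-likelihood detection over one subblock. The codebook values, block size, and names (CODEBOOK, aim_encode, aim_decode) are illustrative assumptions and are not taken from the thesis; the point is only that information bits select a codebook index, and the chosen codeword directly specifies the symbols placed on the subcarriers.

```python
import numpy as np

# A toy AIM-style codebook (hypothetical, for illustration only):
# each codeword assigns one symbol to each of n = 4 subcarriers.
# A real OFDM-AIM codebook is chosen to maximize the distance between
# codewords, which is exactly what the thesis optimizes with DRL.
CODEBOOK = np.array([
    [ 1+0j,  1+0j,  1+0j,  1+0j],
    [ 1+0j, -1+0j, -1+0j,  1+0j],
    [-1+0j,  1+0j, -1+0j,  1+0j],
    [-1+0j, -1+0j,  1+0j,  1+0j],
])  # |C| = 4 codewords -> log2(4) = 2 bits per subblock

def aim_encode(bits):
    """Map information bits to an index symbol, then to a codeword."""
    index_symbol = int("".join(map(str, bits)), 2)  # bits -> codebook index
    return CODEBOOK[index_symbol]                   # codeword = subcarrier symbols

def aim_decode(received):
    """Maximum-likelihood detection: pick the closest codeword."""
    distances = np.linalg.norm(CODEBOOK - received, axis=1)
    return int(np.argmin(distances))

# Example: encode 2 bits, pass through a noisy channel, detect the index.
tx = aim_encode([1, 0])
rx = tx + 0.1 * (np.random.randn(4) + 1j * np.random.randn(4))
print(aim_decode(rx))  # -> 2 with high probability
```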
The error performance of AIM depends on the design of the codebook. However, the error performance of previous codebook design methods is sub-optimal, and the complexity of finding the optimal codebook through exhaustive search is prohibitively high. In this work, we propose to use deep reinforcement learning (DRL) to reduce the complexity and achieve better error performance.
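As a rough, illustrative indication of why exhaustive search is impractical (the notation Q, n, M is assumed here and is not taken from the thesis): if each codeword is an arbitrary length-n vector over a Q-ary symbol alphabet and the codebook contains M codewords, the unconstrained search space already contains

\[
\binom{Q^{n}}{M} \ \text{candidate codebooks, e.g. } Q=4,\ n=4,\ M=16 \ \Rightarrow\ \binom{256}{16}\approx 10^{25},
\]

which rules out brute-force evaluation even for small subblocks.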
We reformulate codebook design as a DRL problem and design the corresponding reward function. We also analyze the relationship between the Hamming distance and the Euclidean distance of the optimal codebook to derive an upper bound on the symbol error rate (SER).
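The following schematic sketch shows one way codebook construction can be cast as a reinforcement-learning episode. It reflects assumptions of ours rather than the thesis's exact formulation: the names (CodebookEnv, ALPHABET, N_SUB, CODEBOOK_SIZE) are hypothetical, and the sparse terminal reward based on the minimum pairwise Euclidean distance is only one plausible choice consistent with the distance analysis described above.

```python
import numpy as np

# Schematic RL environment for codebook construction (illustrative only).
# Episode: the agent appends one codeword per step until the codebook is full.
# Reward: driven by the minimum pairwise Euclidean distance of the codebook,
# since that distance dominates the symbol-error performance.

ALPHABET = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)  # Q-ary symbols
N_SUB, CODEBOOK_SIZE = 4, 8                                   # subcarriers, |C|

class CodebookEnv:
    def reset(self):
        self.codebook = []                     # state: codewords chosen so far
        return self._state()

    def step(self, action):
        # action: indices of the Q-ary symbols placed on each subcarrier
        codeword = ALPHABET[np.asarray(action)]
        self.codebook.append(codeword)
        done = len(self.codebook) == CODEBOOK_SIZE
        reward = self._min_distance() if done else 0.0  # sparse terminal reward
        return self._state(), reward, done

    def _state(self):
        # flatten the partial codebook (real/imag parts) into a fixed-size vector
        flat = np.zeros(2 * N_SUB * CODEBOOK_SIZE)
        for i, c in enumerate(self.codebook):
            flat[2*N_SUB*i:2*N_SUB*(i+1)] = np.concatenate([c.real, c.imag])
        return flat

    def _min_distance(self):
        c = np.array(self.codebook)
        dists = [np.linalg.norm(c[i] - c[j])
                 for i in range(len(c)) for j in range(i + 1, len(c))]
        return float(min(dists))  # larger minimum distance -> better codebook
```

A DQN or DRQN agent would then be trained to choose actions that maximize this terminal reward; as noted in the abstract, the recurrent variant is what allows longer codewords to be handled.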
Our simulation results show that the BER performance is close to the theoretical upper bound and outperforms the classic AIM. The proposed model is also less complex than other codebook design methods. The density of the codebook decreases as the number of subcarriers increases; therefore, the error performance improves at the same spectral efficiency. However, previous codebook design methods are difficult to implement because of their high complexity. The DRL-based codebook design can be applied to any number of subcarriers, which makes the DRL approach very attractive.
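For reference, a union bound of the kind mentioned above has the following generic form for equiprobable codewords over an AWGN channel with noise power spectral density $N_0$; the thesis derives its own bound for the OFDM-AIM setting, so this expression only illustrates the structure:

\[
P_s \;\le\; \frac{1}{|\mathcal{C}|}\sum_{i=1}^{|\mathcal{C}|}\sum_{j\neq i} Q\!\left(\sqrt{\frac{d_E^{2}(\mathbf{c}_i,\mathbf{c}_j)}{2N_0}}\right),
\]

where $d_E(\mathbf{c}_i,\mathbf{c}_j)$ is the Euclidean distance between codewords and $Q(\cdot)$ is the Gaussian tail function. The bound is dominated by the minimum-distance pairs, which is why the codebook design aims to maximize that distance.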
Chinese Abstract i
English Abstract ii
Contents iii
1 INTRODUCTION 1
1.1 Foreword 1
1.2 Related Work 2
1.3 Research Motivation and Objective 2
1.4 Proposed Method 3
1.5 Contribution and Achievement 3
1.6 Thesis Organization 4
2 BACKGROUNDS 5
2.1 OFDM with Index Modulation 5
2.1.1 OFDM with Index Modulation 5
2.1.2 OFDM with Multi-Mode Index Modulation (MM-OFDM-IM) 7
2.1.3 OFDM with Q-Ary Multi-Mode Index Modulation (Q-MM-OFDM-IM) 8
2.2 OFDM with All Index Modulation 9
2.3 OFDM with Improved All Index Modulation 11
2.4 Reinforcement Learning 14
2.4.1 Reinforcement Learning 14
2.4.2 Deep Q-Network (DQN) 15
2.4.3 Deep Recurrent Q-Network (DRQN) 16
3 Codebook Design with Deep Reinforcement Learning for AIM 19
3.1 System Model 19
3.2 Problem Setup and Performance Analysis 21
3.2.1 Problem Statement 21
3.2.2 Complexity Analysis 23
3.2.3 Performance Analysis 25
3.2.4 Union Bound of Symbol Error Rate 26
3.3 The Design of Codebook with DRL 29
3.3.1 States, Actions, Rewards 29
3.3.2 Deep Recurrent Q-Network 33
3.3.3 Codebook Model Design 33
4 SIMULATION RESULT 37
4.1 Reward 37
4.2 Complexity 39
4.3 SER Theoretical Upper Bound of OFDM-AIM 41
4.4 BER Performance of OFDM-AIM 43
5 CONCLUSION 45
Bibliography 46