
Detailed Record

Author (Chinese): 李重岳
Author (English): LI, CHONG-YUE
Title (Chinese): 用於自動駕駛之深度學習語義分割神經網路
Title (English): Deep learning neural networks of semantic segmentation for self driving
Advisor (Chinese): 劉晉良
Advisor (English): Liu, Jinn-Liang
Committee members (Chinese): 陳人豪、陳仁純
Committee members (English): Chen, Jen-Hao; Chen, Ren-Chun
Degree: Master's
University: National Tsing Hua University
Department: Institute of Computational and Modeling Science
Student ID: 109026502
Year of publication (ROC calendar): 111 (2022)
Graduation academic year: 110
Language: Chinese
Number of pages: 61
Keywords (Chinese): 深度學習 (deep learning)、自動駕駛 (autonomous driving)
Keywords (English): neural networks; self-driving
With the rapid advance of technology, applications of artificial intelligence emerge one after another, and one focus that has attracted much attention is the autonomous driving of cars. If computers are to completely replace humans in the driver's seat, many complex and difficult goals and problems must be solved one by one. Comma.ai, a publicly listed U.S. company, launched comma2, a device running openpilot, expecting that every car on the road can reach Level 2 autonomous driving by installing it. The company also open-sourced the code on the Internet, allowing any Internet user to join the project and help improve the technology and content of comma2 into a more complete self-driving device. One of these projects is the comma10k dataset: through open user annotation and classification, it has become a sizable dataset of U.S. road scenes that can be used to improve the recognition accuracy of the device in comma2 or of other neural networks.

Since the neural network supercombo.dlc trained by comma.ai is not open source, we can only learn its structure from the onnx file without knowing its source code. If we want to build a neural network with similar functionality to replace supercombo and place it in comma2 for road-driving prediction, then comma10k is indispensable training data. It provides real U.S. highway scenes at different times and in different environments, which can very effectively train and strengthen a model's recognition ability. Conversely, if a model predicts poorly even on comma10k, we cannot expect it to match, let alone replace, supercombo. This thesis introduces Eff-UNet, a neural network that can effectively accomplish the semantic-segmentation task, describes the characteristics and advantages of various encoders and decoders, and tries different combinations to find the best one. After fixing the network structure, we train it with many different parameter settings. Because the laboratory's hardware is limited, we cannot fully reproduce the experimental results of others; under this constraint, we adjust the learning parameters guided by theory to approach the reported validation error as closely as possible and summarize the tuning directions and conclusions. Finally, we use the best weights to predict on the comma10k dataset and observe the results visually, demonstrating the strong recognition ability of Eff-UNet on comma10k along with our experimental results.

With the rapid development of science and technology, applications of artificial intelligence keep emerging, and one focus that has attracted much attention is the autonomous driving of cars. Many complex and difficult goals and problems must be solved before computers can completely replace humans in the driver's seat. Comma.ai, a company in the United States, has launched comma2, a device running openpilot, hoping that all cars on the road can achieve Level 2 autonomous driving by installing it; the company has also open-sourced the code on the Internet so that any user can participate in comma.ai's projects and improve openpilot. One of these projects is the comma10k dataset: through open user annotation and classification, it has grown into a sizable dataset of U.S. road scenes that can be used to improve the recognition accuracy of the device in comma2 or of other neural networks.

Since the neural network supercombo.dlc trained by comma.ai is not open source, we can only infer its structure from the onnx file without knowing its source code. If we want to build a neural network with similar functionality to replace supercombo and put it in comma2 for road-driving prediction, comma10k is indispensable training data. It provides real U.S. highway scenes at different times and in different environments, which can effectively train and strengthen the recognition ability of a model. Conversely, if our model predicts poorly even on comma10k, we cannot expect it to match, let alone replace, supercombo. This thesis introduces Eff-UNet, a neural network that effectively performs the semantic-segmentation task, describes the characteristics and advantages of various encoders and decoders, and tries different combinations to find the best one. Once the network structure is established, we train it with a number of different parameters. Because the laboratory's hardware is limited, we cannot completely reproduce the experimental results of others; under this constraint, we adjust the learning parameters guided by theory to approach the reported validation error as closely as possible, and we summarize the tuning directions and experimental conclusions. Finally, we use the best-trained model to predict on the comma10k dataset and inspect the results visually, demonstrating the strong recognition ability of Eff-UNet on comma10k and our experimental results.
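The recognition accuracy discussed in the abstract is typically measured per pixel for semantic segmentation. As a minimal, self-contained illustration (not the thesis code; the function names and the tiny example mask are our own), pixel accuracy and per-class intersection-over-union can be computed like this:

```python
# Illustrative sketch: two standard segmentation metrics on flat lists
# of ground-truth and predicted per-pixel class labels.

def pixel_accuracy(truth, pred):
    """Fraction of pixels whose predicted class matches the ground truth."""
    correct = sum(t == p for t, p in zip(truth, pred))
    return correct / len(truth)

def class_iou(truth, pred, cls):
    """Intersection-over-union for a single class label."""
    inter = sum(t == cls and p == cls for t, p in zip(truth, pred))
    union = sum(t == cls or p == cls for t, p in zip(truth, pred))
    return inter / union if union else 0.0

# Tiny 2x4 "image" flattened to 8 pixels, two classes: 0 = road, 1 = lane line.
truth = [0, 0, 1, 1, 0, 0, 1, 1]
pred  = [0, 0, 1, 0, 0, 1, 1, 1]
print(pixel_accuracy(truth, pred))   # 6 of 8 pixels correct -> 0.75
print(class_iou(truth, pred, 1))     # intersection 3, union 5 -> 0.6
```

Averaging `class_iou` over all classes gives the mean IoU commonly reported for comma10k models.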
Abstract (Chinese)--------i
Abstract------------------ii
1 Introduction------------1
1.1 Openpilot-------------1
1.2 Comma10k--------------1
2 Literature Review-------3
3 Motivation--------------5
3.1 Experimental motivation 1---5
3.2 Experimental motivation 2---7
4 Deep learning neural networks------9
4.1 Eff-UNet--------------9
4.1.1 Introduction--------9
4.1.2 Encoder-------------10
4.1.3 Decoder-------------11
4.1.4 Training method-----11
4.1.5 Image preprocessing-12
4.1.6 Learning rate decay-24
4.1.7 Model structure-----24
4.1.8 Up convolution------26
4.1.9 Accuracy calculation---27
4.2 DeepLabV3+------------28
4.3 UNet++----------------29
5 Implementation----------31
5.1 Computing resources---31
5.2 Training method-------32
6 Results-----------------33
6.1 Training Results------33
6.1.1 Eff-UNet training condition table---33
6.1.2 EfficientNet_B4+U-Net on server B---34
6.1.3 EfficientNet_B4+DeepLabV3+ on server B---36
6.1.4 EfficientNet_B4+DeepLabV3+ on server A---37
6.1.5 EfficientNet_B5+U-Net on server A--------39
6.1.6 EfficientNet_B5+U-Net++ on server A------40
6.1.7 EfficientNet_B2+U-Net on server B--------41
6.1.8 EfficientNet_B4+U-Net on server A--------44
6.1.9 EfficientNet_B4+U-Net on server A with adjusting eta minimum---48
6.1.10 Summary tables-----52
6.2 Prediction------------53
7 Conclusion--------------57
8 References--------------59
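Sections 4.1.6 and 6.1.9 in the outline above concern learning-rate decay and the eta-minimum setting of the cosine-annealing schedule cited in reference 29. A minimal stand-alone sketch of that schedule's formula (the function name and the example values here are illustrative, not from the thesis):

```python
import math

# Cosine annealing, as in PyTorch's CosineAnnealingLR:
# eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T_max)) / 2
def cosine_annealing_lr(t, eta_max, eta_min=0.0, t_max=100):
    """Learning rate at epoch t of a cosine-annealing schedule."""
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * t / t_max)) / 2

# The rate starts at eta_max, decays along a half cosine, and ends at eta_min,
# so raising eta_min (the "eta minimum" of Section 6.1.9) lifts the tail of the decay.
print(cosine_annealing_lr(0, 1e-3, 1e-5))    # start: eta_max
print(cosine_annealing_lr(50, 1e-3, 1e-5))   # midpoint: (eta_max + eta_min) / 2
print(cosine_annealing_lr(100, 1e-3, 1e-5))  # end: eta_min
```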
1. Tan, M., & Le, Q. V. EfficientNet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning. PMLR, 2019. p. 6105-6114.

2. Ronneberger, O., Fischer, P., & Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015. p. 234-241.

3. Zhou, Z., Rahman Siddiquee, M. M., Tajbakhsh, N., & Liang, J. UNet++: A nested U-Net architecture for medical image segmentation. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, Cham, 2018. p. 3-11.

4. Chen, L. C., Zhu, Y., Papandreou, G., Schroff, F., & Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV). 2018. p. 801-818.

5. Long, J., Shelhamer, E., & Darrell, T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. p. 3431-3440.

6. Yu, F., & Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122, 2015.

7. Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40.4: 834-848.

8. Chen, L. C., Papandreou, G., Schroff, F., & Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587, 2017.

9. Baheti, B., Innani, S., Gajre, S., & Talbar, S. Eff-UNet: A novel architecture for semantic segmentation in unstructured environment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2020. p. 358-359.

10. Comma.ai: https://github.com/commaai/

11. Deep Learning Containers: https://cloud.google.com/deep-learning-containers/docs/overview

12. Adaptive Cruise Control: https://mycardoeswhat.org/safety-features/adaptive-cruise-control/

13. Comma10k: https://github.com/commaai/comma10k

14. Mapillary Vistas: https://www.mapillary.com/dataset/vistas

15. Cityscapes dataset: https://www.cityscapes-dataset.com/

16. Qualcomm Snapdragon 821: https://www.qualcomm.com/products/application/smartphones/snapdragon-8-series-mobile-platforms/snapdragon-821-mobile-platform

17. Snapdragon Neural Processing Engine: https://developer.qualcomm.com/sites/default/files/docs/snpe/overview.html

18. Y. Yousfi: https://yassineyousfi.github.io/

19. Eff-UNet baseline: https://github.com/YassineYousfi/comma10k-baseline

20. ImageNet: https://www.image-net.org/

21. PyTorch: https://pytorch.org/

22. Semantic segmentation: https://paperswithcode.com/task/semantic-segmentation

23. PyTorch Lightning: https://www.pytorchlightning.ai/

24. Albumentations: https://albumentations.ai/

25. OpenCV: https://docs.opencv.org/4.x/index.html

26. segmentation_models.pytorch: https://github.com/qubvel/segmentation_models.pytorch

27. P. Iakubovskii: https://github.com/qubvel

28. Up convolution: https://naokishibuya.medium.com/up-sampling-with-transposed-convolution-9ae4f2df52d0

29. Cosine annealing learning rate: https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html
 
 
 
 