Author (English): Lin, Tz-Yu
Thesis Title (English): Rethinking the Storage for High-Dimensional Query Approximation using Deep Neural Networks
Advisor (English): Wu, Shan-Hung
Keywords (English): online analytical processing, business decision making support, approximate query processing, curse of dimensionality, deep neural network
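To address the curse of dimensionality in high-dimensional data, this thesis proposes an approximate query processing method based on deep neural networks (the DNN-Based Approach). It exploits the ability of deep neural networks to overcome the curse of dimensionality in order to maintain low storage cost, short response time, and high accuracy in high dimensions. Building on the DNN-Based Approach, we also propose Neural Storage, a new type of storage system for answering queries approximately, characterized by a high storage compression ratio and high answer accuracy.
Finally, experiments confirm that the DNN-Based Approach can indeed overcome the curse of dimensionality. Compared with state-of-the-art approximate query processing research, the DNN-Based Approach achieves lower relative error with less storage, and its response time is very short. Neural Storage, in turn, maintains low error as data is inserted and achieves a better accuracy-to-storage ratio than the other methods.

Keywords: online analytical processing, business decision making support, approximate query processing, curse of dimensionality, deep neural network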
OLAP (Online Analytical Processing) is a database access pattern. Typical applications of OLAP include decision-making support and business intelligence. Because decision-making support is an interactive process, the main requirement of such applications is that analytical queries be answered with a short response time. However, the amount of data that OLAP must process is huge, which leads to long processing times in the DBMS. To deal with this problem, AQP (Approximate Query Processing) returns an approximate answer instead of scanning the whole data set, thereby achieving a short response time.
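The trade-off behind AQP can be illustrated with a minimal sketch based on uniform sampling, one classic AQP technique. This is not the method proposed in this thesis. The data, the 1% sample rate, and the names `exact_avg` and `approx_avg` are assumptions made only for illustration.

```python
import random

def exact_avg(rows, lo, hi):
    """Scan every row; return the mean value of rows whose key lies in [lo, hi]."""
    hits = [v for k, v in rows if lo <= k <= hi]
    return sum(hits) / len(hits)

def approx_avg(sample, lo, hi):
    """Answer the same range query from a small pre-drawn uniform sample."""
    hits = [v for k, v in sample if lo <= k <= hi]
    return sum(hits) / len(hits) if hits else float("nan")

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic table of (key, value) rows; purely illustrative data.
rows = [(rng.random(), rng.gauss(100.0, 15.0)) for _ in range(200_000)]

# 1% uniform sample, assumed to be built once, offline.
sample = rng.sample(rows, 2_000)

exact = exact_avg(rows, 0.2, 0.6)      # touches all 200,000 rows
approx = approx_avg(sample, 0.2, 0.6)  # touches only 2,000 rows
rel_err = abs(approx - exact) / abs(exact)
print(f"exact={exact:.3f}  approx={approx:.3f}  relative error={rel_err:.4%}")
```

The approximate query reads a hundredth of the rows at the cost of a small relative error, which is the kind of exchange AQP makes to keep response times interactive.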
Current AQP research already achieves high quality when the data is low-dimensional. However, as the number of dimensions grows, the volume of the space increases exponentially, and all AQP approaches suffer from the curse of dimensionality. The curse of dimensionality leads to large space requirements and long response times, which conflicts with the goal of AQP.
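As a rough illustration of this exponential growth, consider a synopsis that keeps one count per cell of an equi-width grid. The grid synopsis, the 10 bins per dimension, and the 4 bytes per count are assumptions chosen for the example; the abstract does not say which synopsis structures were studied.

```python
def grid_cells(bins_per_dim: int, dims: int) -> int:
    """Number of cells in an equi-width grid: bins_per_dim ** dims."""
    return bins_per_dim ** dims

BINS = 10            # assumed number of bins per dimension
BYTES_PER_CELL = 4   # assumed size of one stored count

for d in (1, 2, 5, 10, 20):
    cells = grid_cells(BINS, d)
    print(f"d={d:>2}  cells={cells:.3e}  storage={cells * BYTES_PER_CELL:.3e} bytes")
```

Under these assumptions, each added dimension multiplies the storage by ten, so a structure that is trivial in two dimensions becomes impractical well before twenty.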
We propose a DNN-Based Approach, which uses a deep neural network to overcome the curse of dimensionality while keeping storage cost low, response time short, and accuracy high. In addition, based on the DNN-Based Approach, we present Neural Storage, a new type of storage for answering queries approximately. Neural Storage is characterized by a high storage compression rate and high answer quality.
In the experiments, we demonstrate that the DNN-Based Approach can indeed overcome the curse of dimensionality. Furthermore, compared with state-of-the-art AQP methods, the DNN-Based Approach achieves better accuracy with lower storage cost and shorter response time. Neural Storage, in turn, keeps accuracy high as new data continues to arrive and achieves a better accuracy-to-storage ratio than other methods.

Keywords: online analytical processing, business decision making support, approximate query processing, curse of dimensionality, deep neural network
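Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
Chapter 1 Introduction
Chapter 2 Background
Section 1 Scenario
Section 2 Notation
Section 3 Related Work
Section 4 Deep Neural Networks
Chapter 3 Main Idea
Section 1 How DNNs Overcome the Curse of Dimensionality
Section 2 A Simple Idea
Section 3 Data-Centric Sampling
Section 4 Multiple Outputs
Chapter 4 Neural Storage
Chapter 5 Evaluation
Section 1 Experimental Setup
Section 2 Overcoming the Curse of Dimensionality
Section 3 Range Selection
Section 4 Range Selection on Partial Columns
Section 5 Group By
Section 6 Neural Storage
Chapter 6 Conclusion
References
Appendix
1. Comparison of Different Aggregate Functions under Different Dimensions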
