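A Study of Geomorphology-Based Direct Runoff Module Computation on a Hadoop Cluster Using the MapReduce Software Framework (利用MapReduce軟體架構於Hadoop叢集進行地貌型直接逕流模組演算之研究)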

Date

2011

Abstract

Because of Taiwan's climate and terrain, heavy rain often causes river levels to rise suddenly and can lead to serious disasters. This makes flood forecasting (flood information) systems especially important in Taiwan, particularly during typhoon season. River routing, the computation of runoff in a basin, is the most important module of such a system: it calculates the hydrological data for the basin to determine whether the flow exceeds the warning level. However, the routing formulas are complex and the basin data are voluminous. Traditional single-host approaches, such as handing the job to a mainframe or having clients submit the work to a server, often take a long time, so this module becomes the bottleneck when real-time information is needed while a typhoon is striking the basin.

The programs in this thesis are built on Apache Hadoop, an open-source platform developed by the Apache Software Foundation. Hadoop provides a distributed environment for storing and processing large data sets across clusters of computers, together with MapReduce, a programming model designed for large-scale data processing. Distributing the computation pools the cluster's resources and shortens the time needed to process large volumes of data. We implemented the river routing (runoff computation) program with the MapReduce framework and ran it on an 18-node Hadoop cluster. Measurements across five scenarios show that runoff computation can be sped up by a factor of about 6 in the best case, which improves the efficiency of the flood forecasting system and makes its forecasts more timely.
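The abstract does not say how the runoff computation is divided into map and reduce steps. As a purely illustrative sketch, and not the thesis's actual method, the Python below assumes that direct runoff is obtained by convolving each sub-basin's rainfall excess with a unit hydrograph. The map step emits partial flows per time step and the reduce step sums them. The function names, the sample data, and the warning level are all hypothetical, and the job runs in a single process rather than on Hadoop.

```python
from collections import defaultdict


def map_subbasin(rain_excess, unit_hydrograph):
    """Map step (hypothetical): emit (time_step, partial_flow) pairs
    for one sub-basin by discrete convolution."""
    for k, p in enumerate(rain_excess):
        for j, u in enumerate(unit_hydrograph):
            yield k + j, p * u


def reduce_flows(pairs):
    """Reduce step (hypothetical): sum partial flows sharing a time step."""
    totals = defaultdict(float)
    for t, q in pairs:
        totals[t] += q
    return dict(totals)


def run_job(subbasins):
    """Single-process stand-in for a MapReduce job over all sub-basins."""
    pairs = []
    for rain_excess, unit_hydrograph in subbasins:
        pairs.extend(map_subbasin(rain_excess, unit_hydrograph))
    return reduce_flows(pairs)


# Made-up input: (rainfall excess series, unit hydrograph) per sub-basin.
subbasins = [
    ([1.0, 2.0], [0.5, 0.5]),
    ([0.0, 4.0], [1.0]),
]
WARNING_LEVEL = 5.0  # assumed threshold, arbitrary units

hydrograph = run_job(subbasins)
exceeded = [t for t, q in sorted(hydrograph.items()) if q > WARNING_LEVEL]
print(hydrograph, exceeded)
```

In this arrangement each sub-basin can be mapped independently, which is the property that would let a cluster spread the work across nodes. How close this is to the decomposition used in the thesis is not stated in the abstract.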

Keywords

Hadoop, MapReduce, Distributed Computing, Large-Scale Data Processing
