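Design and Implementation of a Very Large Data Warehouse: A Case Study of Fixed-Line Call Records in the Telecommunications Industry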

Abstract

As knowledge and information technology advance, enterprises face a rapidly changing environment, and their owners and managers need decision-support information more urgently than ever. Traditional databases, constrained by their architecture and limited scalability, increasingly cannot meet users' diverse and time-sensitive demands. When the data needed for decision analysis is spread across different, heterogeneous databases, integrating it is complex and time-consuming, and inefficient decision analysis can cost an enterprise its opportunities and its competitiveness. Chunghwa Telecom Co., Ltd. (CHT), the leading company in the telecommunications industry, is no exception, and in recent years it has devoted itself to building various data warehouses. Among them, fixed-line call records were originally collected in a distributed fashion in the databases of 32 operations offices. As that equipment reached its replacement age, CHT planned to consolidate the records into three databases at its northern, central, and southern branch offices in Taiwan.

Drawing on a survey of similar research and related literature and on a study of the existing legacy system, this research identifies the factors that affect database performance. Its objectives are:

1. To propose a method for building an efficient and stable very large data warehouse, using CHT's fixed-line call records as the experimental subject for implementation and performance verification.
2. To show, by comparing execution performance with the existing system, that processing performance holds up even under a huge data volume, and thereby that the proposed method can build an efficient and stable very large data warehouse.
3. To offer, after implementation on the large volume of real fixed-line call records and verification of performance with test data, a reference for building very large data warehouse systems.

Database performance can be assessed through four operations: insert, update, delete, and query. This research designs a method that makes effective use of a data warehouse under limited resources. The method has been implemented in CHT's fixed-line call record system and verified by testing. The measurements show that the new system performs significantly better than the old one, confirming that the proposed method meets the performance requirements. It also overcomes problems that other comparable studies were unable to handle.

Keywords: Data warehouse, Relational database, Materialized view, Self-maintainability, Maintenance cost, Very large database, VLDB

Keywords

Data warehouse, Relational database, Materialized view, Self-maintainability, Maintenance cost, Very large database, VLDB
