Abstract:
Data warehousing is not a trivial task when dealing with vast amounts of distributed and heterogeneous data. Traditional Data warehouses are not well equipped to handle such heterogeneous data. To meet increasing business demands and to overcome the challenges faced by traditional Data warehousing, the Extraction, Transformation and Loading (ETL) technology needs to be extended. We exploit XML as a pivot language in order to unify, model and store heterogeneous data. We describe an architecture for an XML-based Data warehouse that is capable of integrating heterogeneous data into a unified repository.
In this thesis, we show how XML technologies can be effective in improving the Data warehousing process. The basic idea is to associate XML with Data warehousing. We focus on the integration process and describe an architecture for integrating heterogeneous data into an XML Data warehouse.
Later in this thesis, we implement an XML Data warehouse and validate its performance against a traditional Data warehouse in terms of data load time, disk space utilization, data retrieval time and speed of operation.