Data Warehousing Project - Environment Setup




Data Warehousing > Data Warehouse Design > Physical Environment Setup

Task Description

Once the requirements are somewhat clear, it is necessary to set up the physical servers and databases. At a minimum, it is necessary to set up a development environment and a production environment. There are also many data warehousing projects where there are three environments: Development, Testing, and Production.

It is not enough to simply have different physical environments set up. The different processes (such as ETL, OLAP Cube, and reporting) also need to be set up properly for each environment.

It is best for the different environments to use distinct application and database servers. In other words, the development environment will have its own application server and database servers, and the production environment will have its own set of application and database servers.

Having different environments is very important for the following reasons:

  • All changes can be tested and QA'd first without affecting the production environment.
  • Development and QA can occur during the time users are accessing the data warehouse.
  • When there is any question about the data, having separate environment(s) will allow the data warehousing team to examine the data without impacting the production environment.

Time Requirement

Getting the servers and databases ready should take less than 1 week.

Deliverables

  • Hardware / Software setup document for all of the environments, including hardware specifications, and scripts / settings for the software.

Possible Pitfalls

To save on capital, often data warehousing teams will decide to use only a single database and a single server for the different environments. Environment separation is achieved by either a directory structure or setting up distinct instances of the database. This is problematic for the following reasons:

1. Sometimes it is possible that the server needs to be rebooted for the development environment. Having a separate development environment will prevent the production environment from being impacted by this.

2. There may be interference when having different database environments on a single box. For example, having multiple long queries running on the development database could affect the performance on the production database.

Next Section: Data Modeling




Copyright © 2014   1keydata.com   All Rights Reserved     Privacy Policy   About   Contact