Data Warehouse Methodology Summarized

The Seven Major Steps:

Business Analysis/Data Analysis
Database Design
Query Tool Implementation & End User Training
Performance & Tuning
Change Management
Backup & Recovery

Business Analysis

  1. Business Analysis
    1. Define Architecture
      1. Databases – sources and targets.
      2. Hardware platforms for sources and targets.
      3. Hardware platform for ETL (Extraction, Transformation, Load) tool.
      4. Communications (e.g. protocols, ODBC, etc.).
      5. Data access (e.g. ODBC, gateways, file libraries, etc.).
      6. Obtain network schematic.
    2. Define Subject Areas
      1. Interview business users to define subject areas.
        • define major subject groups.
        • define key dimensional attributes.
        • understand questions that need answers (queries) at all levels.
        • understand end-user PC literacy (to choose query tool).
        • understand Web requirements.
      2. Define integrated enterprise data warehouse.
      3. Define incrementally building each subject area.
      4. Define dimensional processes and entities.
      5. Define terminology.
      6. Define conforming dimensions.
      7. Define slowly changing dimensions.
      8. Define reusable elements to ensure integrity across all subject areas.
      9. Define history requirements.
    3. Define Meta Data Requirements
      1. CASE tool requirements.
      2. Extract/Transform/Load requirements.
      3. Query tool requirements.
    4. Analyze Source Data
      1. Determine source data elements.
        • Determine cleansing requirements.
        • Define Change-Data-Capture requirements.
      2. Determine data feasibility.
        • Compare user data requirements to available data sources.
        • Identify potential issues.
          1. Define Data Mart Grain
            1. Define fact table metrics.
              • Define statistical measurements.
              • Define source database coorelations
            2. Determine fact table partitions.
            3. Determine legacy database’s ability to deliver grain.
            4. Define business rules.
            5. Estimate database size.
            6. Estimate hardware requirements
            7. Estimate query speed.
          2. Define Aggregates and Summarizations
            1. Define aggregates for query optimization.
            2. Define aggregates for most highly utilized reports.
          3. Choose Products and Tools
            1. CASE tool.
            2. Cleansing requirements
            3. Extraction/transformation tool.
            4. Query tool.
            5. Data mart management.
            6. Change management.

Database Design

  1. Database Design
    1. Determine the priority of subject areas.
      1. Define priority of data marts.
      2. Define Dimensions for initial data mart.
    2. Implement Business Rules For Dimensions.
    3. Define Model Hierarchy.


    4. Implement Dimensional Attributes.
      1. Implement meta data definitions.
      2. Design dimension tables.
      3. Review SQL scripts for dimension table creation.
    5. Determine Fact Table Grain.
      1. Review query requirements.
      2. Calculate performance of most complex queries.
      3. Define look-up table requirements.
      4. Determine potential performance issues and bottlenecks.
      5. Determine fact table partitions.
    6. Implement Fact Table(s)
      1. Implement meta data definitions.
      2. Implement fact table metrics.
      3. Ensure relational integrity between fact table and dimension tables.
      4. Review fact table SQL scripts for table creation.
    7. Define Load Strategy.
    8. Review Model With Business Users.
      1. Ensure integrity of model.
      2. Ensure business process is being followed.
      3. Ensure semantics are accurate.
      4. Discuss any limitations.
      5. Discuss changes and expected results of implementing changes.
    9. Implement Changes to Model.
    10. Review Model With Users.
      1. Implement any necessary changes to model.
    11. Determine Indexes.
      1. Define indexes in model schema.
    12. Implement Model DDL and indexes in database.




  • Extraction/Transformation/Load
    1. Review CASE design.
    2. Install extract/transform tool.
    3. Read in source meta data and save to Repository Manager.
    4. Read in target meta data and save to Repository Manager.
    5. Determine Jobs.
    6. Map dimension tables.
      1. Implement slowly changing dimensions in transformations.
      2. Implement reusable routines.
      3. Implement transformation specific routines.
    7. Map fact tables.
      1. Implement reusable facts.
      2. Implement transformation specific facts.
    8. Implement load strategy.
    9. Load warehouse.




    Query Tool

  • Query Tool
    1. Install query tool.
    2. Define/review meta data requirements.
    3. Define reusable/common queries.
    4. Define batch query requirements.
    5. Implement batch and reusable queries.
    6. Test all predefined queries.
    7. Test ad-hoc capabilities.
    8. Review with end users.
    9. Implement changes.
    10. Review with end users.
    11. Make any necessary changes.
    12. Review with end users.
    13. Train end users on business analysis tool.


  • Database Performance & Tuning.
  • Implement Backup & Recovery Plans.
  • Implement Change Management Strategy