/* remove this */ Blogger Widgets /* remove this */
Drop Down MenusCSS Drop Down MenuPure CSS Dropdown Menu

Friday, 17 October 2014

DataWareHousing - ETL Project Life Cycle ( Simple to understand )


Warehousing -

Data warehousing projects are categorized into 4 types.

1) Development Projects.
2) Enhancement Projects
3) Migration Projects
4) Production support Projects.
The following are the different phases involved in a ETL project development life cycle.
1) Business Requirement Collection ( BRD )
2) System Requirement Collection ( SRD )
3) Design Phase
a) High Level Design Document ( HRD )
b) Low level Design Document ( LLD )
c) Mapping Design
4) Code Review
5) Peer Review
6) Testing
a) Unit Testing
b) System Integration Testing.
c) USer Acceptance Testing ( UAT )
7) Pre - Production
8) Production ( Go-Live )
Business Requirement Collection :-
  • The business requirement gathering start by business Analyst, onsite technical lead and client business users.
  • In this phase,a Business Analyst prepares Business Requirement Document ( BRD ) (or) Business Requirement Specifications ( BRS )
  • BR collection takes place at client location.
  • The o/p from BR Analysis are

            BRS :- Business Analyst will gather the Business Requirement and document in BRS
            SRS :- Senior technical people (or) ETL architect will prepare the SRS which                                 contains s/w and h/w requirements.
The SRS will includes -
  • O/S to be used ( windows or unix )
  • RDBMS required to build database ( oracle, Teradata etc )
  • ETL tools required ( Informatica,Datastage )
  • OLAP tools required ( Cognos ,BO )

The SRS is also called as Technical Requirement Specifications ( TRS )
Designing and Planning the solutions :-
The o/p from design and planning phase is
  • HLD ( High Level Design ) Document
  • LLD ( Low Level Design ) Document
HLD ( High Level Design ) Document : -
  • An ETL Architect and DWH Architect participate in designing a solution to build a DWH.
  • An HLD document is prepared based on Business Requirement.
LLD ( Low Level Design ) Document : -
  • Based on HLD,a senior ETL developer prepare Low Level Design Document 
  • The LLD contains more technical details of an ETL System.
  • An LLD contains data flow diagram ( DFD ), details of source and targets of each mapping.
  • An LLD also contains information about full and incremental load.
  • After LLD then Development Phase will start
Development Phase ( Coding ) :- 
  • Based an LLD, the ETL team will create mapping ( ETL Code )
  • After designing the mappings, the code ( Mappings ) will be reviewed by developers.
Code Review
  • Code Review will be done by developer.
  • In code review,the developer will review the code and the logic but not the data.
  • The following activities takes place in code review
  • You have to check the naming standards of transformation,mappings of data etc.
  • Source and target mapping ( Placed the correct logic or not in mapping )
Peer Review :-
  • The code will reviewed by your team member ( third party developer )
Testing

The following various types testing carried out in testing environment.
  • Unit Testing
  • Development Integration Testing
  • System Integration Testing
  • User Acceptance Testing
Unit Testing :-
  • A unit test for the DWH is a white Box testing,It should check the ETL procedure and Mappings.
  • The following are the test cases can be executed by an ETL developer.

  1. Verify data loss
  2. No.of records in the source and target
  3. Dataload/Insert
  4. Dataload/Update
  5. Incremental load
  6. Data accuracy
  7. verify Naming standards.
  8. Verify column Mapping
  • The Unit Test will be carried by ETL developer in development phase.
  • ETL developer has to do the data validations also in this phase.
Development Integration Testing -
  • Run all the mappings in the sequence order.
  • First Run the source to stage mappings.
  • Then run the mappings related to dimensions and facts.
System Integration Testing -
  • After development phase,we have to move our code to QA environment.
  • In this environment,we are giving read-only permission to testing people.
  • They will test all the workflows.
  • And they will test our code according to their standards.
User Acceptance Testing ( UAT ) -
  • This test is carried out in the presence of client side technical users to verify the data migration from source to destination.
Production Environment -

  • Migrate the code into the Go-Live environment from test environment ( QA Environment ).


5 comments: