Day One
An Introduction to Data Warehousing Principles
This session provides an introduction to the principles behind data warehousing, the imperatives driving good data quality and modern research in the area. Existing open and closed source data warehouse applications and the rationale behind each will be discussed.
Business Requirements Leading to Data Analysis
The fundamental reasons for deploying a data warehouse will be examined. This begins with developing a business case for data analysis, one of the most time consuming but important, sections of the data warehouse process. The course will then examine tools and techniques to help analyse data, as well as how to plan the integration of that data into the data warehouse. It will cover:
- Identifying potential data sources (databases)
- The business case for each data source
- How to analyse the data meaningfully
- Using existing analysis tools
- Other long term benefits of the analysis - "Owning your own data"
- Why a data warehouse is never complete
Data Warehouse Design
Once the data has been analysed it is time to design the databases to hold information within the data warehouse. A data warehouse is made up of multiple databases of two different types, a consolidation database and data marts. A consolidation database is the central repository of all the extracted information, however given the size and complexity of this database smaller subsets, called data marts, are constructed to hold data for business intelligence and reporting purposes.
In this section we will look at the designing the consolidation database and data marts that make up the data warehouse, including:
- An introduction to relational theory (On demand)
- Differences between data warehouse databases and general transactional databases.
- Star and Snowflake schema
- Tips and tricks to remember during design
- Storing historical data
- Building triggers
- Optimising data marts for rapid queries
Day Two
Extraction and Transformation
One of the most important aspects of data warehousing is the extraction and transformation of data to be inserted into the data warehouse. This section looks at data integrity and validation to give an understanding of its critically importance. The course will cover
- How to write good extraction queries (Garbage In / Garbage Out)
- How can you transform your data
- Data integrity and validation checking during transformation
- Extracting historical information to build a complete image of your data over time
- How to resolve extraction problems
Reporting
The end result of any data warehouse project is having good data allowing for the creation of good reports and contributing to business intelligence. This section considers how to use a data warehouse to improve organisational reporting requirements and so improve overall business intelligence. Participants will learn how to:
- Access data from the data warehouse
- Turn data into information
- Write good reports
- Use a data warehouse to help find invalid or logically inconsistent data
- Use a data warehouse to analyse large data sets
- Use a data warehouse to improve business intelligence and see the business state at a glance
- Schedule reports to run daily or at any appropriate time interval
- Export data into a spreadsheet application for further analysis