Title: User Training Handout
1Welcome
2User TrainingHandout
Version 3.0
3ETL4ALL Training (Version 3.0)
- Training Purpose
- Learn basic and advanced functionalities of
ETL4ALL using sample data - Provide an insight on how ETL4ALL can be
integrated in your organisation
- Training Overview
- First Day
- Introduction
- What is ETL?
- What is ETL4ALL?
- The ETL4ALL Process explained with the SAMPLE
Project - Data lt-gt Metadata
- Creating a simple Transformation Program (SAMPLE
Project) - Starting ETL4ALL / Accessing the ETL4ALL User
Guide - ETL4ALL Opening Screen
- Opening the SAMPLE Project
- Viewing and analysing existing Metadata
- Collecting Metadata
- Creating Metadata manually
- Adding a new Data Source to a Metadata definition
- Viewing and analysing a Procedure
- Creating a Procedure
- Creating a Program
Demonstration and Practice
- Running the Program
- Create a Virtual Data Source inspect data
- Create a Real Data Source redirect data
4ETL4ALL Training Overview (continued)
- Second Day
- Defining more complex Procedures in ETL4ALL
- Relational Functions
- Mathematical Functions
- Sorting Functions
- Statistical Functions
- String Functions
- Constants (with global variables)
- Time Functions
- Conditional Functions
By means of the NorthWind Exercises
- Reporting Solutions
- Output to Excel Worksheets
- Output to JFreeReports
- Preparing the Implementation Workshop how will
YOU use ETL4ALL?
5Introduction
6What is ETL ?
- ETL stands for extracting, transforming and
loading data. - ETL enables companies to
- Extract raw data from multiple sources
- Transform (reformat, filter and combine) the
data - Load the resulting data into data sinks
- Once the data is available in the defined sinks,
it is available for analysis and reporting.
7What is ETL4ALL ?
ETL4ALL is IKANs ETL solution. It is typically
used several times in the sequence of ETL
processes
A Data mart provides an intermediary step between
the operational data and the Reporting process.
The operational data are filtered, sorted and
combined, in order to enhance performance and
surveyability.
8ETL4ALL Process The SAMPLE Project
9Polygons Data lt-gt Polygons Metadata
Tranformations in ETL4ALL are not defined for the
actual Data, but for a description of the Data.
These Data descriptions are called Metadata.
When these Polygons are described in terms of
their ID, Size and AngleNumber,we obtain the
following table
10Creating a simple Transformation Program
... By means of the SAMPLE Project
11Starting ETL4ALL / Accessing the User Guide
- Starting the application
- Start Programs IKAN Software ETL4ALL
ETL4ALL
- Opening the User Guide
- Start Programs IKAN Software ETL4ALL
Documentation User Guide
12Opening Screen
- Work Area
- Contains the Editor Windows of the
- Metadata definitions
- Data sources
- Procedures
- Programs...
- ...you opened in the Virtual Filesystem.
- The Editor Windows allow viewing, defining and
editing the settings.
Feedback window Displays logging messages
13Open the SAMPLE Project
14Opening an existing Metadata definition
Polygons Metadata
Double-click
Metadata Editor Window
15Analysing the Polygons Metadata definition in
the Work Area
Minimize box
Metadata (Plural form)
Record (Singular form)
Fields
Status box
16Creating new Metadata -collect an existing data
source
DRAG
17Creating new Metadata manual creation
Right-click
Double-click
18Verifying the Polygons data source in the
Virtual File System window
Right-click
19Adding an additional Data Source to the Polygons
Metadata definition
DRAG
20Verifying the MakeSquare Procedure
Double-click
Source
Mappings
Target
Expandablefunction categories
21Defining the MakeRed Procedure Creation
Right-click
DRAG
22Defining the MakeRed Procedure Relational
Mapping
2
1
3
4
23Defining the MakeRed Procedure Field Mappings
1
2
5
6
4
24Creating the SAMPLE Final Results Folder
Right-click
25Creating the Houses Sink Metadata
Right-click
26Defining the AddRoofToHouse Procedure - Creation
4
2
1
3
Squares
2
1
3
4
RedTriangles
Right-click
27Creating the AddRoofToHouse Procedure
Relational Mappings
3
1
4
5
7
6
2
8
28Creating the AddRoofToHouse Procedure Field
Mappings
1
4
6
5
2
3
7
29Defining the BuildHouses Program Create
Right-click
30Defining the BuildHouses Program Add Procedures
BuildHouses Program
1
3
4
2
5
6
Action !
31Running a simple Transformation
ProgramMakeHouses
32Running the MakeHouses Program
- Options
- Create a virtual data source and run the
programWhen a Program is run, the results are
stored in a Virtual Data Source. It is called
virtual, because it does not represent a physical
storage entity, such as a file or a database
table. - Create a real data source and redirect the
Virtual Data Source to that real data
sourceWhen you redirect the Virtual Data Source
to a Real data source, the program results are
physically stored in the user-defined file (XML,
CSV, Excel...) or in the user-selected Database
Table.
33Create a virtual data source to run the
MakeHouses program
Right-click
34Opening and inspecting the Virtual Data Source
Right-click
35Opening and redirecting the Virtual Data Source
- Opening and redirecting the Virtual Data Source
means executing the transformation program and
saving the results in a physical file (or
database table).
Step 1 Create a Real Data Source
36Opening and redirecting the Virtual Data Source
Step 2 Redirect the Virtual Data Source to this
file
Double-click
37Creating more complex Procedures and Programs
... Using the NorthWind Sample Database
38Presenting the NorthWind Sample Database
- NorthWind is a Food Sales company with customers
worldwide - Information about these customers, the products,
orders, etc. are stored in the several Database
Tables. - The relations between the Tables are indicated by
red arrows.
39Creating the NorthWind Project (1)
- Creating the Project
-
- Configuring the Project Adding Users
40Creating the NorthWind Project (2)
- Adding existing Folders (not required for this
project) -
- Adding new Folders
Result
Right-click
Result
41Adding the NorthWind Source Tables (1)
- Declaring the NorthWind Database
-
- Create the NorthWind Sources Folder (1)
42Adding the NorthWind Source Tables (1)
- Declaring Tables in ETL4ALL (1)
43Adding the NorthWind Source Tables (2)
- Declaring Tables in ETL4ALL (2)
2. DRAG
3. SELECT
1. SELECT
4. Define Where clausefor each added table
Result
44Exercise 1 The Euro Price List
- The final purpose of this Exercise is to obtain
- A Euro Product Price List
- Listing the Products in alphabetical order of the
Product Names - Listing only the Products that are not
discontinued. - This Price list is generated in several Phases.
- Each Phase is divided in several Steps.
- In this way, new notions and functions are
offered gradually, - in order to avoid information overflow.
45 Phase 1 Basic Euro Product Price List
- We assume that the UnitPrice indications in the
Products Metadata are expressed in US Dollars. We
want to generate a Price List expressed in Euro
and save this Price List in an XML file.
- Steps
- Create the Folder NorthWind Exercises and the
Exercise 1 subfolder. - Create the EuroProductPrices Metadata with the
following fields ProductID, ProductName and
EuroPrice. - Create the CalcEuroPrices Procedure.
- Create the CalcEuroPrices Program.
- Create the EuroPriceList Virtual Data Source.
- Inspect the EuroPriceList Virtual Data Source.
- Add the EuroPriceList.xml Real Data Source.
- Redirect the Virtual Data Source to the Real Data
Source.
46 Phase 1 Step 1 Create the Folder NorthWind
Exercises and a subfolder Exercise 1.
47 Phase 1 Step 2 Create the EuroProductPrices
Metadata
48 Phase 1 Step 3 Create the CalcEuroPrices
Procedure
49 Phase 1 Step 4 Create the CalcEuroPrices
Program
50 Phase 1 Step 5 Create EuroPriceList Virtual
Data Source
51 Phase 1 Step 6 Inspect the EuroPriceList
Virtual Data Source
52 Phase 1 Step 7 Add the EuroPriceList.xml
Real Data Source
53 Phase 1 Step 8 Redirect the EuroPriceList
Virtual Data Source to the EuroPriceList.xml Real
Data Source
54Phase 2 Ordered Euro Product Price List
We want to change the EuroPriceList so that the
Product Names are ordered in alphabetic order.
- Steps
- Create the EuroProductPrices Metadata with the
following fields ProductID, ProductName and
EuroPrice. ALREADY DONE - Modify the CalcEuroPrices Procedure. SEE NEXT
SLIDE - Create the CalcEuroPrices Program. ALREADY DONE
- Create the EuroPriceList Virtual Data Source.
ALREADY DONE - Inspect the EuroPriceList Virtual Data Source.
VERIFY RESULTS
55Phase 2 Step 2 Modify the CalcEuroPrices
Procedure
56Phase 2 Step 5 Inspect the EuroPriceList
Virtual Data Source
57Phase 3 Ordered Euro Product Price
List,Discontinued Articles excluded
- We want to change the Alphabetic EuroPriceList so
that the discontinued Products are excluded.
Discontinued Products have a value 1 in the
Discontinued Field belonging to the Products
Table.
- Steps
- Create the EuroProductPrices Metadata with the
following fields ProductID, ProductName and
EuroPrice. ALREADY DONE - Modify the CalcEuroPrices Procedure. SEE NEXT
SLIDE - Create the CalcEuroPrices Program. ALREADY DONE
- Create the EuroPriceList Virtual Data Source.
ALREADY DONE - Inspect the EuroPriceList Virtual Data Source.
VERIFY RESULTS
58Phase 3 Step 2 Modify the CalcEuroPrices
Procedure
59Phase 3 Step 5 Inspect the EuroPriceList
Virtual Data Source
Compare the results with the reportswithout
exclusions
Etc.
60Exercise 2 The Top Customers Report
- The final purpose of this exercise is to obtain a
report listing the top X customers of the
NorthWind Customers in terms of their total
purchase amounts. The total purchases of all
other Customers (not belonging to the top X) must
be totalled up and printed on a separate line. - This Top Customer Report is generated in several
Phases. Each Phase is divided in several Steps
- Phase 1 Create the Exercise 2 subfolder
- Phase 2 Calculate the ProductTotals per
OrderDetail record - Phase 3 Calculate the OrderTotals
- Phase 4 Filter the OrderTotals for a required
period of time (and other conditions) - Phase 5 Calculate CustomerTotals for the
selected period and matching other conditions - Phase 6 Sort the Customers in decreasing order
of CustomerTotals - Phase 7 Rename all non-Top 10 Customers to
Other - Phase 8 Total up the Purchase Amount for the
non-Top 10 Customers - Phase 9 Sort the final Report
- Phase 10 Redirect the results to Excel and
generate a Pie-Chart (reporting !)
For each Phase, the new Metadata definition, the
Procedure, the Program and the Data Window with
the Results are listed.
61Phase 1 Create the Exercise 2 subfolder
Result
62Phase 2 Calculate the ProductTotals, based on
the OrderDetails (1)
Metadata
Procedure
63Phase 2 Calculate the ProductTotals, based on
the OrderDetails (2)
Program
Data Window(Results)
64Phase 3 Calculate the OrderTotals (1)
Metadata
Procedure
65Phase 3 Calculate the OrderTotals (2)
Program
Data Window(Results)
66Phase 4 Select the OrderTotals for the required
period (1)
Metadata
Note Same Metadata as for Phase 3 !
Procedure
Note Source and Target Metadata are the same !
67Phase 4 Select the OrderTotals for the required
period (2)
Program
Data Window(Results)
68Phase 5 Calculate CustomerTotals for selected
Period (1)
Metadata
Procedure
69Phase 5 Calculate CustomerTotals for selected
Period (2)
Program
Data Window(Results)
70Phase 6 Sort the CustomerTotals in decreasing
order (1)
Metadata
Note Same Metadata as for Phase 5 !
Procedure
71Phase 6 Sort the CustomerTotals in decreasing
order (2)
Program
Data Window(Results)
72Phase 7 Rename the Non-Top X Customers (1)
Metadata
Note Same Metadata as for Phase 5 and 6 !
Procedure
Defines number of Customers to be listed
individually
73Phase 7 Rename the Non-Top 10 Customers (2)
Program
Data Window(Results)
74Phase 8 Total up the Purchase Amount for the
non-Top X Customers (1)
Metadata
Note Same Metadata as for Phase 5, 6 and 7 !
Procedure
75Phase 8 Total up the Purchase Amount for the
non-Top X Customers (2)
Program
Data Window(Results)
76Phase 9 Sort the Final Report (1)
Metadata
Note Same Metadata as for Phase 5, 6, 7 and 8 !
Procedure
77Phase 9 Sort the Final Report (2)
Program
Data Window(Results)
78Phase 10 Redirect the results to Excel and
generate a Pie Chart (1)
Create a blank Excel Workbook outside ETL4ALL and
add formating as indicated below
2. Define Titles, Headers, etc.
3. The data will be pasted on the Data
sheet.Define References where they should
appearon the Report Sheet.
1. Keep two sheets named Report and Data
4. Save the formatted Excel Workbook.
79Phase 10 Redirect the results to Excel and
generate a Pie Chart (2)
Add the formatted Excel Workbook as a Real Data
Source to ETL4ALL.
Add the formatted Excel Workbook as a Real Data
Source to ETL4ALL.
80Phase 10 Redirect the results to Excel and
generate a Pie Chart (3)
Select the Data sheet as target range, without
marking any cells.
81Phase 10 Redirect the results to Excel and
generate a Pie Chart (4)
Select the Data sheet as target range, without
marking any cells.
82Phase 10 Redirect the results to Excel and
generate a Pie Chart (5)
Each time you run the report with other data, the
figures and the pie chartwill be updated
automatically. However, if you generate more data
lines (e.g. 21 instead of 11), you will have to
redefine the Data Range for the Pie chart.
83Exercise 3 The Top Countries Report
- The final purpose of this exercise is to obtain a
report listing the top X countries of the
NorthWind company in terms of the total revenue
generated in these countries. The total revenue
generated in all other Countries (not belonging
to the top X) must be totalled up and printed on
a separate line. - This is a variation on Exercise 2. Try to find
the solution with the hints below
- Phase 1 Create the Exercise 3 subfolder
- Re-Use ProductTotals and OrderTotals from
Exercise 2. - Phase 2 Calculate CountryTotals instead of
CustomerTotals - Following Phases Define Procedures and
Programs in a similar way as for the Top
Customer Report.
The final result is shown on the next slide.
84Exercise 3 Result
RE-USE
Similar to Exercise 2 !
85Phase 1 Create the Exercise 3 subfolder
Result
86Phase 2 Calculate CountryTotals for selected
Period (1)
Metadata
Procedure
87Phase 2 Calculate CountryTotals for selected
Period (2)
Program
Data Window(Results)
88Phase 3 Sort the CountryTotals in decreasing
order (1)
Metadata
Note Same Metadata as for Phase 2 !
Procedure
89Phase 3 Sort the CustomerTotals in decreasing
order (2)
Program
Data Window(Results)
90Phase 4 Rename the Non-Top X Countries (1)
Metadata
Note Same Metadata as for Phase 2 and 3 !
Procedure
Defines number of Countries to be listed
individually
91Phase 4 Rename the Non-Top 10 Customers (2)
Program
Data Window(Results)
92Phase 5 Total up the Purchase Amount for the
non-Top 10 Customers (1)
Metadata
Note Same Metadata as for Phase 2, 3 and 4 !
Procedure
93Phase 5 Total up the Purchase Amount for the
non-Top 10 Customers (2)
Program
Data Window(Results)
94Phase 6 Sort the Final Report (1)
Metadata
Note Same Metadata as for Phase 2, 3, 4 and 5 !
Procedure
95Phase 6 Sort the Final Report (2)
Program
Data Window(Results)
96Phase 7 Redirect the results to Excel and
generate a Pie Chart (1)
Create a blank Excel Workbook outside ETL4ALL and
add formating as indicated below
2. Define Titles, Headers, etc.
3. The data will be pasted on the Data
sheet.Define References where they should
appearon the Report Sheet.
1. Keep two sheets named Report and Data
4. Save the formatted Excel Workbook.
97Phase 7 Redirect the results to Excel and
generate a Pie Chart (2)
Add the formatted Excel Workbook as a Real Data
Source to ETL4ALL.
98Phase 7 Redirect the results to Excel and
generate a Pie Chart (3)
Select the Data sheet as target range, without
marking any cells.
99Phase 7 Redirect the results to Excel and
generate a Pie Chart (4)
Select the Data sheet as target range, without
marking any cells.
100Phase 7 Redirect the results to Excel and
generate a Pie Chart (5)
Each time you run the report with other data, the
figures and the pie chartwill be updated
automatically. However, if you generate more data
lines (e.g. 21 instead of 11), you will have to
redefine the Data Range for the Pie chart.
101Exercise 4 The Monthly Market Penetration Report
- The purpose of this Project is to define a
transformation program which - finds the market penetration per country for the
previous month - runs automatically on a monthly basis
- Phases 2 - 3 introduce the use of a Multi-Level
XML file as additional data source. - Phases 4 - 6 are comparable to Exercise 2 and 3,
but the definition of the of the variable
analysis period is new. - Phase 7 shows how to redirect the Virtual Data
Source and to program a monthly execution of the
program.
102Phase 1 Create the Exercise 4 subfolder
Result
103Phase 2 Declare Continents Multi-Level XML File
(1)
- The NorthWind Database does not contain
information about the population per country.
This information must come from another source.
We assume that you want to use the following
multi-level XML file to provide this information
104Phase 2 Declare Continents Multi-Level XML File
(2)
- Declare the XML File in ETL4ALL by dragging it
into the Exercise 4 Folder
105Phase 2 Declare Continents Multi-Level XML File
(3)
Double-click
106Phase 3 Find Population per Country (1)
Procedure Find Population per Country
107Phase 3 Find Population per Country (1)
Procedure Data Window
108Phase 4 Find OrderTotals for Last Month (1)
- Explanation of the procedure
- we take the System Timestamp.
- We find the YEAR and MONTH substrings for the
month PRECEDING the system Timestamp. - If an OrderDate matches both these substrings,
we know that that order refers to the previous
month.
Procedure
Re-used from Exercise 2 !!
109Phase 4 Find OrderTotals for Last Month (2)
Data Window
Set the System Timestamp to March 2004,the
OrderTotals for February 2004 are listed !
110Phase 5 Find CountryTotals for Last Month (1)
Procedure
111Phase 5 Find CountryTotals for Last Month (2)
Data Window
Set the System Timestamp to March 2004,the
CountryTotals for February 2004 are listed !
112Phase 6 Find Market Penetration for Last Month
(1)
Definition
Procedure
113Phase 6 Find Market Penetration for Last Month
(2)
Data Window
Set the System Timestamp to March 2004,the
Results for February 2004 are listed !
114Phase 7 Redirect the results to Excel and
generate a monthly report with Column Chart (1)
Create a blank Excel Workbook outside ETL4ALL and
add formating as indicated below
2. Define Titles, Headers, etc.
3. The data will be pasted on the Data
sheet.Define References where they should
appearon the Report Sheet.
1. Keep two sheets named Report and Data
4. Save the formatted Excel Workbook.
115Phase 7 Redirect the results to Excel and
generate a monthly report with Column Chart (2)
Add the formatted Excel Workbook as a Real Data
Source to ETL4ALL.
116Phase 7 Redirect the results to Excel and
generate a monthly report with Column Chart (3)
Select the Data sheet as target range, without
marking any cells.
117Phase 7 Redirect the results to Excel and
generate a monthly report with Column Chart (4)
Select the Data sheet as target range, without
marking any cells.
!!!
!!!
118Phase 7 Redirect the results to Excel and
generate a monthly report with Column Chart(5)
The report will be automatically executed each
month on the selected date.The figures and the
chart will be updated automatically. However, if
the linesof data changes, you will have to
redefine the Data Range for the chart.