Title: SQL Server Integration Services Deep Dive
1SQL Server Integration Services Deep Dive
- Warren Stevens-Baytopp
- BI Architect - GijimaAst
- Warren.stevens-baytopp_at_gijima.com
2Integration Services Why ETL Matters
- Merge data from heterogeneous data stores
- ? Text files ?Mainframes
- ? Spreadsheets ? Multiple RDBMS
- Refresh data in data warehouses and data marts
- Cleanse data before loading to remove errors
- High-speed load of data into online transaction
processing (OLTP) and online analytical
processing (OLAP) databases - Send status notifications on success/failure
- Build BI into a data transformation process
without the need for redundant staging
environments - Automate data-administrative functions
3SSIS Overview
- Data sources can be diverse, including custom or
scripted adapters - Transformation components shape and modify data
in many ways. - Data is routed by rules or error conditions for
cleansing and conforming. - Flows can be as complex as your business rules,
but highly concurrent. - And finally data can be loaded in parallel to
many varied destinations.
4So Where to Now
- Data Sources
- Excel
- Common problem - not all data coming through
correctly - By Default Excel will determine the column types
based on a Majority Type rule. - Overcome this by forcing a mixed type in the Data
connector
5Excel Demo
6Data Sources Continued
- Data Sources
- Verifying Connectivity / Availability
- ETL Tasks run through some of the steps and then
fail on connectivity issues. - Why Would you want to check for this?
- Use scripting task.
7Scripting Demo
8Scripting Demo Lets Test it
9Scripting Demo
10Data Sources Continued
- Data Sources
- OLE DB Provider
- Selecting Table or View dropdown as a source.
- So what is the problem with this?
- Replace with what
- Select from TABLENAME not much better or is
it? - Select field list from TABLENAME resource
usage
11OLE DB Provider Demo
- If a table is selected
- SSIS issues an OPENROWSET
- If a SQL statement is used
- SSIS issues sp_executesql.
12Sourcing Data
- Common Requirement
- Get all Data from one table that does not exist
in another - Get all rows from a staging table where the
business key is not in the dimension table - Conventional T-SQL
13Sourcing Data Conventional T-SQL
INSERT INTO DIM_DATE SELECT s. FROM STG_DATE
s LEFT OUTER JOIN DIM_DATE d ON
s.DateID d.DateID WHERE d.DateID IS
NULL INSERT INTO DIM_DATE SELECT s. FROM
STG_DATE s WHERE DateID NOT IN (SELECT
DISTINCT DateID FROM DIM_DATE d)
14Sourcing Data
- Common Requirement
- Get all Data from one table that does not exist
in another - Get all rows from a staging table where the
business key is not in the dimension table - Conventional T-SQL
- Using SSIS
15Sourcing Data Using SSIS
- Merge Join
- Same as first T-SQL Statement
- Requires a Sort and Conditional Split
- Lookup
- Using the SSIS functionality.
- Less Coding
- Uses the error output as the valid records.
- Speed Comparisons
16Lookups
- Exact Matching
- Want data that matches a specific field.
- Normal usage of Lookup
- Range Comparisons
- Want data that falls between 2 values
- The Caching SQL Statement
- Mapping of Parameters
17Lookups Range Comparisons
18Date and Time Handling
- Date formatting
- Construct a date string in the format YYYYMMDD
HHMISS - Get the month name
- Get the formatted month and year in the form -
mmm (yyyy) - Create a file using a date in the form yyyy-mm-dd
- Create a file using a date in the form yyyy-mm-dd
for yesterdays date - A simple yyyymmdd formatted two ways of doing
this
19Date Formatting YYYYMMDD HHMISS
(DT_STR,4,1252)DATEPART("yyyy",OldDate)
RIGHT("0" (DT_STR,2,1252)DATEPART("mm",OldDate)
,2) RIGHT("0" (DT_STR,2,1252)DATEPART("dd",Ol
dDate),2) " " RIGHT("0" (DT_STR,2,1252)DATEP
ART("hh",OldDate),2) RIGHT("0"
(DT_STR,2,1252)DATEPART("mi",OldDate),2)
RIGHT("0" (DT_STR,2,1252)DATEPART("ss",OldDate)
,2)
20Date Formatting Month Name
(MONTH(NewDate) 1 ? "January"
MONTH(NewDate) 2 ? "February"
MONTH(NewDate) 3 ? "March" MONTH(NewDate)
4 ? "April" MONTH(NewDate) 5 ? "May"
MONTH(NewDate) 6 ? "June" MONTH(NewDate)
7 ? "July" MONTH(NewDate) 8 ? "August"
MONTH(NewDate) 9 ? "September"
MONTH(NewDate) 10 ? "October"
MONTH(NewDate) 11 ? "November"
MONTH(NewDate) 12 ? "December"
"InvalidMonth")
21Date Formatting mmm (yyyy)
(MONTH(OldDate) 1 ? "Jan" MONTH(OldDate)
2 ? "Feb" MONTH(OldDate) 3 ? "Mar"
MONTH(OldDate) 4 ? "Apr" MONTH(OldDate)
5 ? "May" MONTH(OldDate) 6 ? "Jun
MONTH(OldDate) 7 ? "Jul" MONTH(OldDate)
8 ? "Aug" MONTH(OldDate) 9 ? "Sep"
MONTH(OldDate) 10 ? "Oct" MONTH(OldDate)
11 ? "Nov" MONTH(OldDate) 12 ? "Dec"
"ERR") " (" DT_WSTR,4)YEAR(OldDate) ")"
22Date Formatting Text file with YYYY-MM-DD
"C\\Temp\\ErrorCodes\\" (DT_WSTR,4)YEAR(NewDate
) "-" RIGHT("0" (DT_WSTR,2)MONTH(NewDate),
2) "-" RIGHT("0" (DT_WSTR,2)DAY( NewDate),
2) ".txt"
23Date Formatting Same but for Yesterday
"C\\Temp\\ErrorCodes\\" (DT_WSTR,4)YEAR(DATEADD
("dd", -1, OldDate)) "-" RIGHT("0"
(DT_WSTR,2)MONTH(DATEADD("dd", -1, OldDate)), 2)
"- RIGHT("0" (DT_WSTR,2)DAY(DATEADD("dd",
-1, OldDate)), 2) ".txt dateadd function In
each portion of the check
24Date Formatting A Simple YYYYMMDD
(DT_WSTR,4)YEAR(OldDate) RIGHT("0"
(DT_WSTR,2)MONTH(OldDate), 2) RIGHT("0"
(DT_WSTR,2)DAY(OldDate), 2) OR (DT_WSTR,8)
((YEAR(OldDate) 10000) (MONTH(OldDate) 100)
DAY(OldDate))
25Some Performance Tuning Tips
- Only Select the Columns that you need
- Use a SQL Server Destination instead of an OLE DB
Destination - If using an OLE DB Destination use the table or
view with fast load data access mode. - Use standardized naming conventions
- Where possible, filter your data in the
Source Adapter rather than using a Conditional
Split transform component - LOOKUP components will generally work quicker
than MERGE JOIN components where the 2 can be
used for the same task - Use caching in your LOOKUP components where
possible. It makes them quicker. Just watch that
you are not grabbing too many resources. - Use Sequence containers to organize package
structure into logical units of work.
26For More Information
- Visit TechNet at www.microsoft.com/technet/prodtec
hnol/sql/2005/technologies/ssisvcs.mspx - Learn more about SSIS on MSDN athttp//msdn2.micr
osoft.com/en-us/sql/aa336312.aspx - Great information available at www.sqlis.com
- http//download.microsoft.com for Whitepapers
and downloads of custom transformations - Jamie Thomsons Blog http//blogs.conchango.com/ja
miethomson/default.aspx - Donald Farmers Blog
- http//sqljunkies.com/WebLog/donald_farmer/defaul
t.aspx
27Thank you to our Partners for their support of
TechDays 2007