SQL Server Integration Services Deep Dive - PowerPoint PPT Presentation

1 / 27
About This Presentation
Title:

SQL Server Integration Services Deep Dive

Description:

SQL Server Integration Services Deep Dive. Warren Stevens-Baytopp. BI Architect - GijimaAst ... Warren.stevens-baytopp_at_gijima.com. Merge data from heterogeneous ... – PowerPoint PPT presentation

Number of Views:184
Avg rating:3.0/5.0
Slides: 28
Provided by: gofishd
Category:

less

Transcript and Presenter's Notes

Title: SQL Server Integration Services Deep Dive


1
SQL Server Integration Services Deep Dive
  • Warren Stevens-Baytopp
  • BI Architect - GijimaAst
  • Warren.stevens-baytopp_at_gijima.com

2
Integration Services Why ETL Matters
  • Merge data from heterogeneous data stores
  • ? Text files ?Mainframes
  • ? Spreadsheets ? Multiple RDBMS
  • Refresh data in data warehouses and data marts
  • Cleanse data before loading to remove errors
  • High-speed load of data into online transaction
    processing (OLTP) and online analytical
    processing (OLAP) databases
  • Send status notifications on success/failure
  • Build BI into a data transformation process
    without the need for redundant staging
    environments
  • Automate data-administrative functions

3
SSIS Overview
  • Data sources can be diverse, including custom or
    scripted adapters
  • Transformation components shape and modify data
    in many ways.
  • Data is routed by rules or error conditions for
    cleansing and conforming.
  • Flows can be as complex as your business rules,
    but highly concurrent.
  • And finally data can be loaded in parallel to
    many varied destinations.

4
So Where to Now
  • Data Sources
  • Excel
  • Common problem - not all data coming through
    correctly
  • By Default Excel will determine the column types
    based on a Majority Type rule.
  • Overcome this by forcing a mixed type in the Data
    connector

5
Excel Demo
6
Data Sources Continued
  • Data Sources
  • Verifying Connectivity / Availability
  • ETL Tasks run through some of the steps and then
    fail on connectivity issues.
  • Why Would you want to check for this?
  • Use scripting task.

7
Scripting Demo
8
Scripting Demo Lets Test it
9
Scripting Demo
10
Data Sources Continued
  • Data Sources
  • OLE DB Provider
  • Selecting Table or View dropdown as a source.
  • So what is the problem with this?
  • Replace with what
  • Select from TABLENAME not much better or is
    it?
  • Select field list from TABLENAME resource
    usage

11
OLE DB Provider Demo
  • If a table is selected
  • SSIS issues an OPENROWSET
  • If a SQL statement is used
  • SSIS issues sp_executesql.

12
Sourcing Data
  • Common Requirement
  • Get all Data from one table that does not exist
    in another
  • Get all rows from a staging table where the
    business key is not in the dimension table
  • Conventional T-SQL

13
Sourcing Data Conventional T-SQL
INSERT INTO DIM_DATE SELECT s. FROM STG_DATE
s LEFT OUTER JOIN DIM_DATE d ON
s.DateID d.DateID WHERE d.DateID IS
NULL INSERT INTO DIM_DATE SELECT s. FROM
STG_DATE s WHERE DateID NOT IN (SELECT
DISTINCT DateID FROM DIM_DATE d)
14
Sourcing Data
  • Common Requirement
  • Get all Data from one table that does not exist
    in another
  • Get all rows from a staging table where the
    business key is not in the dimension table
  • Conventional T-SQL
  • Using SSIS

15
Sourcing Data Using SSIS
  • Merge Join
  • Same as first T-SQL Statement
  • Requires a Sort and Conditional Split
  • Lookup
  • Using the SSIS functionality.
  • Less Coding
  • Uses the error output as the valid records.
  • Speed Comparisons

16
Lookups
  • Exact Matching
  • Want data that matches a specific field.
  • Normal usage of Lookup
  • Range Comparisons
  • Want data that falls between 2 values
  • The Caching SQL Statement
  • Mapping of Parameters

17
Lookups Range Comparisons
18
Date and Time Handling
  • Date formatting
  • Construct a date string in the format YYYYMMDD
    HHMISS
  • Get the month name
  • Get the formatted month and year in the form -
    mmm (yyyy)
  • Create a file using a date in the form yyyy-mm-dd
  • Create a file using a date in the form yyyy-mm-dd
    for yesterdays date
  • A simple yyyymmdd formatted two ways of doing
    this

19
Date Formatting YYYYMMDD HHMISS
(DT_STR,4,1252)DATEPART("yyyy",OldDate)
RIGHT("0" (DT_STR,2,1252)DATEPART("mm",OldDate)
,2) RIGHT("0" (DT_STR,2,1252)DATEPART("dd",Ol
dDate),2) " " RIGHT("0" (DT_STR,2,1252)DATEP
ART("hh",OldDate),2) RIGHT("0"
(DT_STR,2,1252)DATEPART("mi",OldDate),2)
RIGHT("0" (DT_STR,2,1252)DATEPART("ss",OldDate)
,2)
20
Date Formatting Month Name
(MONTH(NewDate) 1 ? "January"
MONTH(NewDate) 2 ? "February"
MONTH(NewDate) 3 ? "March" MONTH(NewDate)
4 ? "April" MONTH(NewDate) 5 ? "May"
MONTH(NewDate) 6 ? "June" MONTH(NewDate)
7 ? "July" MONTH(NewDate) 8 ? "August"
MONTH(NewDate) 9 ? "September"
MONTH(NewDate) 10 ? "October"
MONTH(NewDate) 11 ? "November"
MONTH(NewDate) 12 ? "December"
"InvalidMonth")
21
Date Formatting mmm (yyyy)
(MONTH(OldDate) 1 ? "Jan" MONTH(OldDate)
2 ? "Feb" MONTH(OldDate) 3 ? "Mar"
MONTH(OldDate) 4 ? "Apr" MONTH(OldDate)
5 ? "May" MONTH(OldDate) 6 ? "Jun
MONTH(OldDate) 7 ? "Jul" MONTH(OldDate)
8 ? "Aug" MONTH(OldDate) 9 ? "Sep"
MONTH(OldDate) 10 ? "Oct" MONTH(OldDate)
11 ? "Nov" MONTH(OldDate) 12 ? "Dec"
"ERR") " (" DT_WSTR,4)YEAR(OldDate) ")"
22
Date Formatting Text file with YYYY-MM-DD
"C\\Temp\\ErrorCodes\\" (DT_WSTR,4)YEAR(NewDate
) "-" RIGHT("0" (DT_WSTR,2)MONTH(NewDate),
2) "-" RIGHT("0" (DT_WSTR,2)DAY( NewDate),
2) ".txt"
23
Date Formatting Same but for Yesterday
"C\\Temp\\ErrorCodes\\" (DT_WSTR,4)YEAR(DATEADD
("dd", -1, OldDate)) "-" RIGHT("0"
(DT_WSTR,2)MONTH(DATEADD("dd", -1, OldDate)), 2)
"- RIGHT("0" (DT_WSTR,2)DAY(DATEADD("dd",
-1, OldDate)), 2) ".txt dateadd function In
each portion of the check
24
Date Formatting A Simple YYYYMMDD
(DT_WSTR,4)YEAR(OldDate) RIGHT("0"
(DT_WSTR,2)MONTH(OldDate), 2) RIGHT("0"
(DT_WSTR,2)DAY(OldDate), 2) OR (DT_WSTR,8)
((YEAR(OldDate) 10000) (MONTH(OldDate) 100)
DAY(OldDate))
25
Some Performance Tuning Tips
  • Only Select the Columns that you need
  • Use a SQL Server Destination instead of an OLE DB
    Destination
  • If using an OLE DB Destination use the table or
    view with fast load data access mode.
  • Use standardized naming conventions
  • Where possible, filter your data in the
    Source Adapter rather than using a Conditional
    Split transform component
  • LOOKUP components will generally work quicker
    than MERGE JOIN components where the 2 can be
    used for the same task
  • Use caching in your LOOKUP components where
    possible. It makes them quicker. Just watch that
    you are not grabbing too many resources.
  • Use Sequence containers to organize package
    structure into logical units of work.

26
For More Information
  • Visit TechNet at www.microsoft.com/technet/prodtec
    hnol/sql/2005/technologies/ssisvcs.mspx
  • Learn more about SSIS on MSDN athttp//msdn2.micr
    osoft.com/en-us/sql/aa336312.aspx
  • Great information available at www.sqlis.com
  • http//download.microsoft.com for Whitepapers
    and downloads of custom transformations
  • Jamie Thomsons Blog http//blogs.conchango.com/ja
    miethomson/default.aspx
  • Donald Farmers Blog
  • http//sqljunkies.com/WebLog/donald_farmer/defaul
    t.aspx

27
Thank you to our Partners for their support of
TechDays 2007
Write a Comment
User Comments (0)
About PowerShow.com