Get the Best Out of Oracle Data Pump Functionality - PowerPoint PPT Presentation

About This Presentation
Title:

Get the Best Out of Oracle Data Pump Functionality

Description:

This presentation contains information proprietary to Oracle Corporation. * * * * * * * * * * * * * Data Masking with Oracle Data Pump: Restrictions Data types must ... – PowerPoint PPT presentation

Number of Views:448
Avg rating:3.0/5.0
Slides: 50
Provided by: Caro1218
Category:

less

Transcript and Presenter's Notes

Title: Get the Best Out of Oracle Data Pump Functionality


1
(No Transcript)
2
Get the Best Out of Oracle Data Pump Functionality
  • Dean Gagne (Oracle)
  • Viljo Hakala (Nokia)

3
Agenda
  • Moving large amounts of data with Transportable
    Tablespaces
  • Filtering metadata using the INCLUDE and EXCLUDE
    parameters
  • Restarting stopped/failed jobs
  • Hear about Nokia Corporation's database
    environment
  • How Nokia uses the REMAP_DATA parameter to
    scramble data
  • How Nokia regenerates primary keys without having
    to use additional software or scripts


4
What is Oracle Data Pump?
  • New feature starting in Oracle Database
    10g Release 1
  • Enables very fast bulk data and metadata movement
    between Oracle databases
  • High-speed, parallel Export and Import utilities
    (expdp and impdp) as well as a Web-based Oracle
    Enterprise Manager interface
  • Jobs can be restarted without loss of data,
    whether or not the stoppage was voluntary or
    involuntary
  • Jobs support fine-grained object selection.
    Virtually any type of object can be included or
    excluded
  • Supports the ability to load one instance
    directly from another (network import) and unload
    a remote instance (network export)

5
Moving Large Amounts of Data With Transportable
Tablespace
6
What is Transportable Tablespace?
  • An Oracle Database feature that allows data file
    transfer from a database to another via a simple
    os copy and a light specific export/import
  • Data Pump will move metadata only
  • Data moves with data file copy
  • Much faster than using direct path or external
    tables

7
Moving Large Amounts of Data With Transportable
Tablespaces
  • Restrictions
  • Tablespaces need to be self contained
  • All dependent objects must be included in
    tablespace set
  • Tablespaces need to be read only for duration of
    export and datafile copying.
  • Not restartable
  • Must be privileged account
  • Notes
  • Use RMAN CONVERT to change endianness (if needed)
  • Data Pump moves only metadata
  • Can use network link
  • For a full list of what is exported
  • select unique seq_num, full_pathfrom
    datapump_pathswhere het_type
    'TRANSPORTABLE_EXPORTorder by seq_num


8
Transportable Tablespaces Self Containment
Check
  • Create table part_tab(id number) tablespace a
    partition by range (id) (partition low value
    less than (100) tablespace b, partition hi
    values less than (1000) tablespace c)
  • Create index part_ind on part_tab.id tablespace
    d
  • Transportable tablespace export requires all 4
    tablespaces
  • (a, b, c, d)
  • DBMS_TTS.TRANSPORT_SET_CHECK procedure to verify
  • accepts a comma separated list of tablespace
    names
  • SELECT FROM transport_set_violations


9
Transportable Tablespace Steps
  • Source system tasks
  • Optionally run DBMS_TTS.TRANSPORT_SET_CHECK
  • Set tablespaces read only
  • Run expdp command with tablespace list
  • Data Pump will list required dumpfiles and
    datafiles in log file
  • Copy datafiles and dumpfiles to target system
  • Optionally set tablespaces read write
  • Target system tasks
  • Run impdp command
  • Optionally set tablespaces to read write

10
Table Mode with Transportablealways
  • Can export/import tables, partitions, and indexes
    regardless of tablespace closure
  • Cant have any storage in system, sysaux, temp
    tablespaces
  • Data Pump export will list tablespaces that need
    to be read only if not already done
  • Data Pump export will list datafiles and
    dumpfiles that need to be copied to target
  • Data Pump import can use filters dont have to
    import the complete dumpfile set
  • Can import only one partition Data Pump creates
    a non-partitioned table
  • Partition_optionsdepartition creates
    non-partitioned tables for every partition

11
Export Table Mode with TransportablealwaysExpdp
scott/tiger tablespt1,pt2 transportablealways
  • Table filter needs to be complete tables or
    partitions from the same table
  • ptab1,ptab2, etc or ptab1lo,ptab1hi
  • ORA-29335 tablespace 'USER1' is not read
    onlyORA-29335 tablespace 'USER2 is not read
    only
  • Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is
    /oracle/work/tab_tts.dmp
  • Datafiles required for transportable tablespace
    USER1 /oracle/dbs/user1.fDatafiles required
    for transportable tablespace USER2
    /oracle/dbs/user2.f

12
Import Using Transportable Table Mode Dumpfile
  • Impdp scott/tiger tablesptab2hi
    rename_tableptab2himy_tab1 transport_datafiles
  • Created table is my_tab1
  • Impdp scott/tiger tablesptab2 partition_optionsd
    epartition transport_datafiles
  • Created tables are ptab2_lo, ptab2_hi
  • Creates all tablespaces even if no objects
    created due to import filters
  • Cleans up any unused tablespace segments that are
    not imported

13
Filtering Metadata
14
Filtering Metadata Overview
  • Perform selective export/import job based on
    object type
  • Exp/imp had limited filtering
  • Grants, index, triggers, statistics, constraints
  • Data Pump has almost complete filtering
    capabilities

15
Filtering Metadata Using INCLUDE and EXCLUDE
Parameters
  • If using exclude parameter, everything else is
    included
  • If using include parameter, everything else is
    excluded
  • Cant use exclude and include in the same Data
    Pump job
  • Specify complete path or partial path. Objects
    matching the specified path will be
    excluded/included.
  • Query to find exclude/include object types
  • select unique seq_num, full_pathfrom
    sys.datapump_pathswhere het_type
    'DATABASE_EXPORT' order by seq_num
  • Job_type het_type
  • FULL
    DATABASE_EXPORT
  • SCHEMA SCHEMA_EXPORT
  • TABLE TABLE_EXPORT
  • TRANSPORTABLE TRANSPORTABLE_EXPORT


16
Exclude Example expdp system/manager
schemahr excludestatistics
vs expdp system/manager
schemahr exclude
SCHEMA_EXPORT/TABLE/STATISTICS
  • select unique seq_num, full_path
  • from sys.datapump_paths
  • where het_type 'SCHEMA_EXPORT' AND full_path
    like 'STATISTICS
  • order by seq_num
  • 77 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
  • 78 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_S
    TATISTICS
  • 221 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
  • 222 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTION
    AL_AND_BITMAP
  • 223 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTION
    AL_AND_BITMAP/INDEX_STATISTICS
  • 225 SCHEMA_EXPORT/TABLE/STATISTICS
  • 226 SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTI
    CS
  • 227 SCHEMA_EXPORT/TABLE/STATISTICS/USER_PREF_STAT
    ISTICS

17
Include Examples
  • impdp system/manager tableshr.employees vs
  • impdp system/manager schemashr
    includetable\" \'EMPLOYEES\'\"
  • Same results
  • Includes all objects that have table in the path
  • vs
  • impdp system/manager schemashr
    includetable/table\" \'EMPLOYEES\'\
  • Only includes the table
  • Some of the TABLE object paths
  • SCHEMA_EXPORT/TABLE/TABLE
  • SCHEMA_EXPORT/TABLE/TABLE_DATA
  • SCHEMA_EXPORT/TABLE/GRANT

18
Restarting Stopped/Failed Jobs
19
Restarting Data Pump Jobs
  • Restart jobs intentionally or unintentionally
    stopped
  • User stopped
  • Dumpfile exhausted
  • Resumable wait
  • Optionally change value of PARALLEL parameter
  • Helps to know the job name
  • System generated job names
  • SYS_IMPORT_FULL_01
  • SYS_EXPORT_TABLE_05
  • What jobs are restartable
  • Select from dba_datapump_jobs
  • Select from user_datapump_jobs

20
Restarting Stopped Export JobsInitial Export
Command expdp system/manager
job_namedatapump_exp
  • Restart command
  • expdp system/manager attachdatapump_exp
  • Exportgt start_job
  • Resets dumpfile pointers to last completed object
    type
  • Restarts the job after the last completed object


21
Restarting Stopped Import Jobs initial import
command impdp system/manager
job_namedatapump_imp
  • Restart command
  • Impdp system/manager attachsystem.datapump_imp
  • Importgt Start_jobskip_current
  • Detects the last object created
  • If creating an object and the object exists,
    checks to see if object was in progress in failed
    job
  • Data that was being loaded would have either been
    committed or not. Committed data marked
    complete, uncommitted data will be retried.

22
Restarting Stopped Network Import JobsInitial
Import Command impdp scott/tiger
job_namenet_imp network_linkdbs1
  • Restart command
  • Impdp scott/tiger attachnet_imp
  • Importgt Start_jobskip_current
  • Restarts on object type
  • If creating object and object exists, checks to
    see if object was in progress in failed job
  • Data that was being loaded would have either been
    committed or not. Committed data marked
    complete, uncommitted data will be retried.

23
Hear about Nokia Corporation's Database
Environment and how they use Oracle Data Pump
  • Viljo Hakala (Nokia)

24
Nokia
  • Nokia is a world leader in mobility
  • Head office in Finland RD, production, sales,
    marketing activities around the world
  • Worlds 1 manufacturer of mobile devices, with
    estimated 40 share of global device market in
    2009
  • Mobile device volumes 468 million units
  • Net sales EUR 50.7 billion
  • Operating profit EUR 5.0 billion
  • 128 445 employees at year end (including Nokia
    Siemens Networks)
  • Strong RD presence in 16 countries
  • RD investment EUR 6.0 billion
  • Sales in more than 150 countries

24
25
Oracle _at_ Nokia
  • 20 DBAs, 2300 databases
  • Grid Control used for almost 5 years, 1 tool
    for all DBAs
  • Applications
  • OLTP 50, Hybrid databases 40, DW 10
    Teradata
  • About 1800 Oracle also MySQL 200, MS SQL 300
  • 11g 4
  • 10g 80
  • 9i 15
  • 8i 1
  • Host platforms
  • Solaris 35
  • HP-UX 35
  • Linux 20
  • Windows 10

25
26
Oracle _at_ Nokia
  • Production databases single, HA Oracle, RAC, HA
    MySQL
  • 5 MB 7 TB range
  • Leverage virtually all Oracle database features
  • DBAs share responsibilities
  • Many federated databases Streams, Advanced
    Replication, Materialized views, DB links
  • Common Oracle platform, Common MySQL platform and
    Common MS SQL platform
  • 99 of databases backed up with RMAN
  • 8 RMAN catalog databases
  • 10 PB backed up / month

26
27
What is Data Masking?
27
28
Hints for Effective Data Masking
  • Be sure to mask all data that is not sensitive,
    but can be used to create sensitive data.
  • You should not be able to reverse the masking for
    it to be secure. Never should you be able to
    retrieve original values from masked values.
  • The masked or obfuscated data should follow the
    source data format. Masking should protect data,
    but still allow realistic looking data to be used
    for testing.
  • Integrity of relations in data should be
    followed. When masking a primary key, the
    foreign keys should also be masked using the same
    masking method.
  • You should be able to repeat the masking process
    for it to be an effective day to day process.

29
Data Masking with Oracle Data Pump
  • One of the best uses of Oracle Data Pump is to
    load test and development systems with data from
    production systems for realistic testing
    purposes.
  • Due to company rules or legal regulations,
    sensitive data in the non-production systems has
    to be replaced with realistic looking data to
    enable effective testing.
  • Effective data masking routines should support
    automation and Oracle Data Pump provides several
    ways to automate data imports and exports with
    masking.

29
30
Data Masking with Oracle Data Pump
  • In addition to Data Masking Pack in Grid Control,
    Oracle Data Pump provides a method to mask data
    REMAP_DATA parameter introduced in Oracle
    Database 11g.
  • Oracle Data Pumps REMAP_DATA feature uses a
    remapping function to rewrite data.
  • For example, a column with phone numbers could be
    replaced by a numbers generated by a REMAP_DATA
    function.

30
31
Data Masking with Oracle Data Pump
  • REMAP_DATA allows transformation of columns data
    while exporting (expdp) or importing (impdp) by
    using a remapping function in the database.
  • REMAP_DATA with Data Pump is usually faster than
    a custom UPDATE statement for masking data.
  • To mask multiple columns in the same process and
    command, the REMAP_DATA parameter can be used
    multiple times.

31
32
Data Masking with Oracle Data Pump Quick example
  • REMAP_DATAschema.tablename.column_nameschema.
    pkg.function
  • impdp dumpfiledata.dmp REMAP_DATAscott.orders.cu
    stomer_namescott.maskpkg.mask
  • expdp dumpfiledata.dmp REMAP_DATAscott.orders.cu
    stomer_namescott.maskpkg.mask

32
33
Data Masking with Oracle Data PumpSyntax
  • REMAP_DATA
  • schema1.tablename.column_name
  • schema2.pkg.function
  • schema1 -- the schema with the table to be
    remapped. By default, this is the schema of the
    user doing the export.
  • tablename -- the table which column will be
    remapped.
  • column_name -- which is to be remapped.
  • schema2 -- with the PL/SQL package for remapping
    function. As a default, this is the schema of the
    user doing the export.
  • pkg -- the name of the PL/SQL package with the
    remapping function.
  • function -- the name of the remap function in the
    PL/SQL package

33
34
Data Masking with Oracle Data Pump Restrictions
  • Data types must be same in the table column,
    masking function parameter, and function return
    value.
  • No commits or rollbacks should be done in the
    masking function.
  • No direct path loads can be used in import
    process with REMAP_DATA.
  • Note Operation of long export/import data pump
    processes can be monitored from the
    vsession_longops view, but the estimated values
    do not take into account REMAP_DATA operations.

34
35
Data Masking with Oracle Data Pump Example
  • Create a table in the CUSTOMERS schema called
    phones
  • SQLgt
  • CREATE TABLE CUSTOMERS.PHONES
  • (
  • MODELNAME VARCHAR(20) NOT NULL,
  • PHONENUMBER VARCHAR2(50)
  • )
  • insert into CUSTOMERS.PHONES values(N900,35812
    34567)
  • insert into CUSTOMERS.PHONES values(N8,3589817
    654)
  • insert into CUSTOMERS.PHONES values(N7,3584834
    819)
  • Commit

35
36
Data Masking with Oracle Data PumpExample
  • We then need to create a function for remapping
  • create or replace package customers.maskpkg
  • as
  • function masknumber(phonenumber varchar2)
    return varchar2
  • end
  • /
  • create or replace package body customers.maskpkg
    as
  • function masknumber (phonenumber varchar2)
    return varchar2 is
  • begin return substr(phonenumber,1,4)
    round(dbms_random.value (100,999))
    lpad(round(dbms_random.value (1,9999)),4,'0')
  • end
  • end
  • /
  • The function masknumber will accept a varchar2
    type and returns a random phone number in
    varchar2 type

36
37
Data Masking with Oracle Data PumpExample
  • This example will mask one column phonenumber to
    the export output file
  • expdp customers/manager \
  • tablescustomers.phones \
  • dumpfilephones_masked.dmp \
  • directorydumpdir \
  • remap_data\
  • customers.phones.phonenumber\
  • customers.maskpkg.masknumber
  • Export the data from the customers.phones table
    with expdp utility and use the REMAP_DATA option
    for masking the data in the dump file with the
    function customers.maskpkg.masknumber created
    earlier
  • By default the owner of the remap function is the
    user running expdp/impdp
  • Now the dumpfile phones_masked.dmp can be used
    for testing environments

37
38
Data Masking with Oracle Data PumpExample,
before masking
  • SQLgt select from customers.phones
  • MODELNAME PHONENUMBER
  • ---------- --------------------
  • N900 3591234567
  • N8 3589817654
  • N7 3584834819

38
39
Data Masking with Oracle Data Pump
import/replace table
  • This example will import the masked dump file
  • and replace the existing table with masked column
    data
  • impdp customers/manager \
  • tablescustomers.phones \
  • dumpfilephones_masked.dmp \
  • directorydumpdir \
  • table_exists_actionREPLACE

39
40
Data Masking with Oracle Data PumpExample,
after import
  • SQLgt select from customers.phones
  • MODELNAME PHONENUMBER
  • ---------- --------------------
  • N900 35815499474
  • N8 3584800578
  • N7 3581247839

40
41
Data Masking with Oracle Data PumpImport
  • REMAP_DATA parameter can also be used in the
    import process
  • Use impdp if you have an existing dump file and
    you want to mask the data when loading into the
    database
  • impdp customers/manager TABLE_EXISTS_ACTIONrepl
    ace dumpfilephones.dmp directorydumpdir
    remap_datacustomers.phones.phonenumbercustomers.
    maskpkg.masknumber

41
42
Primary Key Regeneration with REMAP_DATA
  • To change primary keys after development phase of
    an application, use REMAP_DATA parameter.
  • It is possible to use Data Pumps REMAP_DATA to
    change the primary keys within a database by
    writing a remapping function which follows the
    new primary key format.
  • Primary key conflicts can happen when loading
    data to an existing table in a database.
  • With REMAP_DATA, conflicts can be avoided by
    changing the column values during import.

42
43
When not to use Masking
  • It can be very difficult and expensive to mask
    data especially in large BI/DWH systems due to
    the amount of data and refining operations.
  • It is often easier to use a data model in which
    sensitive data can be
  • limited to one location, referenced via surrogate
    keys, and
  • kept safe with strict access policies and
    auditing.
  • But when giving out data for simplified testing
    needs, masking is a good tool!

44
Other Methods to Secure Data in Oracle Database
11g Release 2
  • Oracle Virtual Private Database (VPD aka Fine
    Grained Access Control) feature allows filtering
    of data at row-level for runtime SQL statements
    according to a defined policy.
  • Data Masking Pack for Enterprise Manager.
    Centralized masking solution within Grid Control.
  • Oracle Advanced Security Option for encrypting
    data at tablespace or column level.
  • Oracle Label Security for restricting access to
    data based on policies.
  • Oracle Database Vault for separating roles and
    restricting access to data per roles.

45
Oracle Data Pump at Nokia Summary
  • Oracle Data Pump is used extensively at Nokia to
    move data and metadata between systems.
  • REMAP_DATA option is a great way to mask
    sensitive data for security purposes.
  • REMAP_DATA option can also be used to change
    primary keys.
  • Oracle has many features for securing data at
    different levels.
  • Security for data is not just Obscurity. It is a
    process and requires time.

45
46
Oracle Data Pump Summary
  • Transportable Tablespaces
  • Tablespace mode
  • Table mode
  • Filtering metadata using the INCLUDE and EXCLUDE
    parameters
  • Restarting stopped/failed jobs
  • REMAP_DATA parameter to scramble data
  • How to generate primary keys without having to
    use additional software or scripts

47
Questions?
  • Oracle Data Pump
  • Demogrounds Booth Moscone West W-025

48
(No Transcript)
49
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com