Title: Get the Best Out of Oracle Data Pump Functionality
1(No Transcript)
2Get the Best Out of Oracle Data Pump Functionality
- Dean Gagne (Oracle)
- Viljo Hakala (Nokia)
3Agenda
- Moving large amounts of data with Transportable
Tablespaces - Filtering metadata using the INCLUDE and EXCLUDE
parameters - Restarting stopped/failed jobs
- Hear about Nokia Corporation's database
environment - How Nokia uses the REMAP_DATA parameter to
scramble data - How Nokia regenerates primary keys without having
to use additional software or scripts
4What is Oracle Data Pump?
- New feature starting in Oracle Database
10g Release 1 - Enables very fast bulk data and metadata movement
between Oracle databases - High-speed, parallel Export and Import utilities
(expdp and impdp) as well as a Web-based Oracle
Enterprise Manager interface - Jobs can be restarted without loss of data,
whether or not the stoppage was voluntary or
involuntary - Jobs support fine-grained object selection.
Virtually any type of object can be included or
excluded - Supports the ability to load one instance
directly from another (network import) and unload
a remote instance (network export)
5Moving Large Amounts of Data With Transportable
Tablespace
6What is Transportable Tablespace?
- An Oracle Database feature that allows data file
transfer from a database to another via a simple
os copy and a light specific export/import - Data Pump will move metadata only
- Data moves with data file copy
- Much faster than using direct path or external
tables
7Moving Large Amounts of Data With Transportable
Tablespaces
- Restrictions
- Tablespaces need to be self contained
- All dependent objects must be included in
tablespace set - Tablespaces need to be read only for duration of
export and datafile copying. - Not restartable
- Must be privileged account
- Notes
- Use RMAN CONVERT to change endianness (if needed)
- Data Pump moves only metadata
- Can use network link
- For a full list of what is exported
- select unique seq_num, full_pathfrom
datapump_pathswhere het_type
'TRANSPORTABLE_EXPORTorder by seq_num
8Transportable Tablespaces Self Containment
Check
- Create table part_tab(id number) tablespace a
partition by range (id) (partition low value
less than (100) tablespace b, partition hi
values less than (1000) tablespace c) - Create index part_ind on part_tab.id tablespace
d - Transportable tablespace export requires all 4
tablespaces - (a, b, c, d)
- DBMS_TTS.TRANSPORT_SET_CHECK procedure to verify
- accepts a comma separated list of tablespace
names - SELECT FROM transport_set_violations
9Transportable Tablespace Steps
- Source system tasks
- Optionally run DBMS_TTS.TRANSPORT_SET_CHECK
- Set tablespaces read only
- Run expdp command with tablespace list
- Data Pump will list required dumpfiles and
datafiles in log file - Copy datafiles and dumpfiles to target system
- Optionally set tablespaces read write
- Target system tasks
- Run impdp command
- Optionally set tablespaces to read write
10Table Mode with Transportablealways
- Can export/import tables, partitions, and indexes
regardless of tablespace closure - Cant have any storage in system, sysaux, temp
tablespaces - Data Pump export will list tablespaces that need
to be read only if not already done - Data Pump export will list datafiles and
dumpfiles that need to be copied to target - Data Pump import can use filters dont have to
import the complete dumpfile set - Can import only one partition Data Pump creates
a non-partitioned table - Partition_optionsdepartition creates
non-partitioned tables for every partition
11Export Table Mode with TransportablealwaysExpdp
scott/tiger tablespt1,pt2 transportablealways
- Table filter needs to be complete tables or
partitions from the same table - ptab1,ptab2, etc or ptab1lo,ptab1hi
- ORA-29335 tablespace 'USER1' is not read
onlyORA-29335 tablespace 'USER2 is not read
only - Dump file set for SCOTT.SYS_EXPORT_TABLE_01 is
/oracle/work/tab_tts.dmp - Datafiles required for transportable tablespace
USER1 /oracle/dbs/user1.fDatafiles required
for transportable tablespace USER2
/oracle/dbs/user2.f
12Import Using Transportable Table Mode Dumpfile
- Impdp scott/tiger tablesptab2hi
rename_tableptab2himy_tab1 transport_datafiles
- Created table is my_tab1
- Impdp scott/tiger tablesptab2 partition_optionsd
epartition transport_datafiles - Created tables are ptab2_lo, ptab2_hi
- Creates all tablespaces even if no objects
created due to import filters - Cleans up any unused tablespace segments that are
not imported
13Filtering Metadata
14Filtering Metadata Overview
- Perform selective export/import job based on
object type - Exp/imp had limited filtering
- Grants, index, triggers, statistics, constraints
- Data Pump has almost complete filtering
capabilities
15Filtering Metadata Using INCLUDE and EXCLUDE
Parameters
- If using exclude parameter, everything else is
included - If using include parameter, everything else is
excluded - Cant use exclude and include in the same Data
Pump job - Specify complete path or partial path. Objects
matching the specified path will be
excluded/included. - Query to find exclude/include object types
- select unique seq_num, full_pathfrom
sys.datapump_pathswhere het_type
'DATABASE_EXPORT' order by seq_num - Job_type het_type
- FULL
DATABASE_EXPORT - SCHEMA SCHEMA_EXPORT
- TABLE TABLE_EXPORT
- TRANSPORTABLE TRANSPORTABLE_EXPORT
16Exclude Example expdp system/manager
schemahr excludestatistics
vs expdp system/manager
schemahr exclude
SCHEMA_EXPORT/TABLE/STATISTICS
- select unique seq_num, full_path
- from sys.datapump_paths
- where het_type 'SCHEMA_EXPORT' AND full_path
like 'STATISTICS - order by seq_num
- 77 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
- 78 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_S
TATISTICS - 221 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS
- 222 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTION
AL_AND_BITMAP - 223 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTION
AL_AND_BITMAP/INDEX_STATISTICS - 225 SCHEMA_EXPORT/TABLE/STATISTICS
- 226 SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTI
CS - 227 SCHEMA_EXPORT/TABLE/STATISTICS/USER_PREF_STAT
ISTICS
17Include Examples
- impdp system/manager tableshr.employees vs
- impdp system/manager schemashr
includetable\" \'EMPLOYEES\'\" - Same results
- Includes all objects that have table in the path
- vs
- impdp system/manager schemashr
includetable/table\" \'EMPLOYEES\'\ - Only includes the table
- Some of the TABLE object paths
- SCHEMA_EXPORT/TABLE/TABLE
- SCHEMA_EXPORT/TABLE/TABLE_DATA
- SCHEMA_EXPORT/TABLE/GRANT
18Restarting Stopped/Failed Jobs
19Restarting Data Pump Jobs
- Restart jobs intentionally or unintentionally
stopped - User stopped
- Dumpfile exhausted
- Resumable wait
- Optionally change value of PARALLEL parameter
- Helps to know the job name
- System generated job names
- SYS_IMPORT_FULL_01
- SYS_EXPORT_TABLE_05
- What jobs are restartable
- Select from dba_datapump_jobs
- Select from user_datapump_jobs
20Restarting Stopped Export JobsInitial Export
Command expdp system/manager
job_namedatapump_exp
- Restart command
- expdp system/manager attachdatapump_exp
- Exportgt start_job
- Resets dumpfile pointers to last completed object
type - Restarts the job after the last completed object
21Restarting Stopped Import Jobs initial import
command impdp system/manager
job_namedatapump_imp
- Restart command
- Impdp system/manager attachsystem.datapump_imp
- Importgt Start_jobskip_current
- Detects the last object created
- If creating an object and the object exists,
checks to see if object was in progress in failed
job - Data that was being loaded would have either been
committed or not. Committed data marked
complete, uncommitted data will be retried.
22Restarting Stopped Network Import JobsInitial
Import Command impdp scott/tiger
job_namenet_imp network_linkdbs1
- Restart command
- Impdp scott/tiger attachnet_imp
- Importgt Start_jobskip_current
- Restarts on object type
- If creating object and object exists, checks to
see if object was in progress in failed job - Data that was being loaded would have either been
committed or not. Committed data marked
complete, uncommitted data will be retried.
23Hear about Nokia Corporation's Database
Environment and how they use Oracle Data Pump
24Nokia
- Nokia is a world leader in mobility
- Head office in Finland RD, production, sales,
marketing activities around the world - Worlds 1 manufacturer of mobile devices, with
estimated 40 share of global device market in
2009 - Mobile device volumes 468 million units
- Net sales EUR 50.7 billion
- Operating profit EUR 5.0 billion
- 128 445 employees at year end (including Nokia
Siemens Networks) - Strong RD presence in 16 countries
- RD investment EUR 6.0 billion
- Sales in more than 150 countries
24
25Oracle _at_ Nokia
- 20 DBAs, 2300 databases
- Grid Control used for almost 5 years, 1 tool
for all DBAs - Applications
- OLTP 50, Hybrid databases 40, DW 10
Teradata - About 1800 Oracle also MySQL 200, MS SQL 300
- 11g 4
- 10g 80
- 9i 15
- 8i 1
- Host platforms
- Solaris 35
- HP-UX 35
- Linux 20
- Windows 10
25
26Oracle _at_ Nokia
- Production databases single, HA Oracle, RAC, HA
MySQL - 5 MB 7 TB range
- Leverage virtually all Oracle database features
- DBAs share responsibilities
- Many federated databases Streams, Advanced
Replication, Materialized views, DB links - Common Oracle platform, Common MySQL platform and
Common MS SQL platform - 99 of databases backed up with RMAN
- 8 RMAN catalog databases
- 10 PB backed up / month
26
27What is Data Masking?
27
28Hints for Effective Data Masking
- Be sure to mask all data that is not sensitive,
but can be used to create sensitive data. - You should not be able to reverse the masking for
it to be secure. Never should you be able to
retrieve original values from masked values. - The masked or obfuscated data should follow the
source data format. Masking should protect data,
but still allow realistic looking data to be used
for testing. - Integrity of relations in data should be
followed. When masking a primary key, the
foreign keys should also be masked using the same
masking method. - You should be able to repeat the masking process
for it to be an effective day to day process.
29Data Masking with Oracle Data Pump
- One of the best uses of Oracle Data Pump is to
load test and development systems with data from
production systems for realistic testing
purposes. - Due to company rules or legal regulations,
sensitive data in the non-production systems has
to be replaced with realistic looking data to
enable effective testing. - Effective data masking routines should support
automation and Oracle Data Pump provides several
ways to automate data imports and exports with
masking.
29
30Data Masking with Oracle Data Pump
- In addition to Data Masking Pack in Grid Control,
Oracle Data Pump provides a method to mask data
REMAP_DATA parameter introduced in Oracle
Database 11g. - Oracle Data Pumps REMAP_DATA feature uses a
remapping function to rewrite data. - For example, a column with phone numbers could be
replaced by a numbers generated by a REMAP_DATA
function.
30
31Data Masking with Oracle Data Pump
- REMAP_DATA allows transformation of columns data
while exporting (expdp) or importing (impdp) by
using a remapping function in the database. - REMAP_DATA with Data Pump is usually faster than
a custom UPDATE statement for masking data. - To mask multiple columns in the same process and
command, the REMAP_DATA parameter can be used
multiple times.
31
32Data Masking with Oracle Data Pump Quick example
- REMAP_DATAschema.tablename.column_nameschema.
pkg.function - impdp dumpfiledata.dmp REMAP_DATAscott.orders.cu
stomer_namescott.maskpkg.mask - expdp dumpfiledata.dmp REMAP_DATAscott.orders.cu
stomer_namescott.maskpkg.mask
32
33Data Masking with Oracle Data PumpSyntax
- REMAP_DATA
- schema1.tablename.column_name
- schema2.pkg.function
- schema1 -- the schema with the table to be
remapped. By default, this is the schema of the
user doing the export. - tablename -- the table which column will be
remapped.
- column_name -- which is to be remapped.
- schema2 -- with the PL/SQL package for remapping
function. As a default, this is the schema of the
user doing the export. - pkg -- the name of the PL/SQL package with the
remapping function. - function -- the name of the remap function in the
PL/SQL package
33
34Data Masking with Oracle Data Pump Restrictions
- Data types must be same in the table column,
masking function parameter, and function return
value. - No commits or rollbacks should be done in the
masking function. - No direct path loads can be used in import
process with REMAP_DATA. - Note Operation of long export/import data pump
processes can be monitored from the
vsession_longops view, but the estimated values
do not take into account REMAP_DATA operations.
34
35Data Masking with Oracle Data Pump Example
- Create a table in the CUSTOMERS schema called
phones - SQLgt
- CREATE TABLE CUSTOMERS.PHONES
- (
- MODELNAME VARCHAR(20) NOT NULL,
- PHONENUMBER VARCHAR2(50)
- )
- insert into CUSTOMERS.PHONES values(N900,35812
34567) - insert into CUSTOMERS.PHONES values(N8,3589817
654) - insert into CUSTOMERS.PHONES values(N7,3584834
819) - Commit
35
36Data Masking with Oracle Data PumpExample
- We then need to create a function for remapping
- create or replace package customers.maskpkg
- as
- function masknumber(phonenumber varchar2)
return varchar2 - end
- /
- create or replace package body customers.maskpkg
as - function masknumber (phonenumber varchar2)
return varchar2 is - begin return substr(phonenumber,1,4)
round(dbms_random.value (100,999))
lpad(round(dbms_random.value (1,9999)),4,'0') - end
- end
- /
- The function masknumber will accept a varchar2
type and returns a random phone number in
varchar2 type
36
37Data Masking with Oracle Data PumpExample
- This example will mask one column phonenumber to
the export output file - expdp customers/manager \
- tablescustomers.phones \
- dumpfilephones_masked.dmp \
- directorydumpdir \
- remap_data\
- customers.phones.phonenumber\
- customers.maskpkg.masknumber
- Export the data from the customers.phones table
with expdp utility and use the REMAP_DATA option
for masking the data in the dump file with the
function customers.maskpkg.masknumber created
earlier - By default the owner of the remap function is the
user running expdp/impdp - Now the dumpfile phones_masked.dmp can be used
for testing environments
37
38Data Masking with Oracle Data PumpExample,
before masking
- SQLgt select from customers.phones
- MODELNAME PHONENUMBER
- ---------- --------------------
- N900 3591234567
- N8 3589817654
- N7 3584834819
38
39Data Masking with Oracle Data Pump
import/replace table
- This example will import the masked dump file
- and replace the existing table with masked column
data - impdp customers/manager \
- tablescustomers.phones \
- dumpfilephones_masked.dmp \
- directorydumpdir \
- table_exists_actionREPLACE
39
40Data Masking with Oracle Data PumpExample,
after import
- SQLgt select from customers.phones
- MODELNAME PHONENUMBER
- ---------- --------------------
- N900 35815499474
- N8 3584800578
- N7 3581247839
40
41Data Masking with Oracle Data PumpImport
- REMAP_DATA parameter can also be used in the
import process - Use impdp if you have an existing dump file and
you want to mask the data when loading into the
database - impdp customers/manager TABLE_EXISTS_ACTIONrepl
ace dumpfilephones.dmp directorydumpdir
remap_datacustomers.phones.phonenumbercustomers.
maskpkg.masknumber
41
42Primary Key Regeneration with REMAP_DATA
- To change primary keys after development phase of
an application, use REMAP_DATA parameter. - It is possible to use Data Pumps REMAP_DATA to
change the primary keys within a database by
writing a remapping function which follows the
new primary key format. - Primary key conflicts can happen when loading
data to an existing table in a database. - With REMAP_DATA, conflicts can be avoided by
changing the column values during import.
42
43When not to use Masking
- It can be very difficult and expensive to mask
data especially in large BI/DWH systems due to
the amount of data and refining operations. - It is often easier to use a data model in which
sensitive data can be - limited to one location, referenced via surrogate
keys, and - kept safe with strict access policies and
auditing. - But when giving out data for simplified testing
needs, masking is a good tool!
44Other Methods to Secure Data in Oracle Database
11g Release 2
- Oracle Virtual Private Database (VPD aka Fine
Grained Access Control) feature allows filtering
of data at row-level for runtime SQL statements
according to a defined policy. - Data Masking Pack for Enterprise Manager.
Centralized masking solution within Grid Control.
- Oracle Advanced Security Option for encrypting
data at tablespace or column level. - Oracle Label Security for restricting access to
data based on policies. - Oracle Database Vault for separating roles and
restricting access to data per roles.
45Oracle Data Pump at Nokia Summary
- Oracle Data Pump is used extensively at Nokia to
move data and metadata between systems. - REMAP_DATA option is a great way to mask
sensitive data for security purposes. - REMAP_DATA option can also be used to change
primary keys. - Oracle has many features for securing data at
different levels. - Security for data is not just Obscurity. It is a
process and requires time.
45
46Oracle Data Pump Summary
- Transportable Tablespaces
- Tablespace mode
- Table mode
- Filtering metadata using the INCLUDE and EXCLUDE
parameters - Restarting stopped/failed jobs
- REMAP_DATA parameter to scramble data
- How to generate primary keys without having to
use additional software or scripts
47Questions?
- Oracle Data Pump
- Demogrounds Booth Moscone West W-025
48(No Transcript)
49(No Transcript)