Title: Apache Hive-Orien IT
1 http//www.orienit.com/
2Database Vs Data Warehouse
- Historical Data is the back bone of any
business for mission critical business decisions.
Data is stored in some form of tables in the
database.So why the Business Intelligence systems
are using Data Warehouse rather than Database to
pull historical data?
What is the difference between Database and DataWa
rehouse while both of them have some tables with
Data, Index and key etc.? Here are Differences.
http//www.orienit.com/
3 DATA BASE
- Used for Online Transactional Processing (OLTP).
This records the data from the user for history. - The tables and joins are complex since they are
normalized. This is done to reduce redundant data
and to save storage space. - Entity Relational modeling techniques are used
for database design. - Optimized for write operation.
- Performance is low for analysis queries.
http//www.orienit.com/
4DATA WAREHOUSE
- Used for Online Analytical Processing (OLAP).
This reads the historical data for the
Users for business decisions. - The Tables and joins are simple since they are
de-normalized. This is done to reduce
the response time for analytical queries. - Data Modeling techniques are used for the Data
Warehouse design. - Optimized for read operations.
- High performance for analytical queries.
http//www.orienit.com/
5General Data Flow (Ex Online Insurance
Registration)
- Customer enters the details in the Online
Registration form. - The details are saved into the Database when the
customer presses the Submit button in the form. - Business Intelligence Team of the Insurance
Company uses an ETL tool to pull the data from
the Database tables to the Data Warehouse tables. - Business Management uses Business Reporting
Tools to pull Data from Data Warehouse tables for
generating business reports. -
http//www.orienit.com/
6 STRING FUNCTIONS IN HIVE
The string functions in Hive are listed below
1.ASCII( string str )
- The ASCII function converts the first character
of the string into its numeric ascii value.
Example1 ASCII('hadoop') returns 104
Example2 ASCII('A') returns 65
2.CONCAT( string str1, string str2... )
- The CONCAT function concatenates all the stings.
Example CONCAT('hadoop','-','hive') returns
'hadoop-hive'
http//www.orienit.com/
73. CONCAT_WS( string delimiter, string str1,
string str2... )
- The CONCAT_WS function is similar to the CONCAT
function. Here you can also provide the
delimiter, which can be used in between the
strings to concat.
Example CONCAT_WS('-','hadoop','hive')
returns 'hadoop-hive'
4. FIND_IN_SET( string search string, string
source_string_list )
- The FIND_IN_SET function searches for the
search string in the source_string_list and
returns the position of the first occurrence in
the source string list. Here the source string
list should be comma delimited one. It returns 0
if the first argument contains comma.
Example FIND_IN_SET('ha','hao,mn,hc,ha,hef')
returns 4
http//www.orienit.com/
85. LENGTH( string str )
- The LENGTH function returns the number of
characters in a string.
Example LENGTH('hive') returns 4
6. LOWER( string str ), LCASE( string str )
- The LOWER or LCASE function converts the string
into lower case letters
Example LOWER('HiVe') returns 'hive'
7. LPAD( string str, int len, string pad )
- The LPAD function returns the string with a
length of len characters left-padded with pad.
Example LPAD('hive',6,'v') returns 'vvhive'
http//www.orienit.com/
98. LTRIM( string str )
- The LTRIM function removes all the trailing
spaces from the string.
Example LTRIM(' hive') returns 'hive'
9. REPEAT( string str, int n )
- The REPEAT function repeats the specified string
n times.
Example REPEAT('hive',2) returns 'hivehive'
10.RPAD( string str, int len, string pad )
- The RPAD function returns the string with a
length of len characters right-padded with pad.
Example RPAD('hive',6,'v') returns 'hivevv'
http//www.orienit.com/
1011. REVERSE( string str )
- The REVERSE function gives the reversed
string.
Example REVERSE('hive') returns 'evih'
12. RTRIM( string str )
- The RTRIM function removes all the leading
spaces from the string.
Example LTRIM('hive ') returns 'hive'
13. SPACE( int number_of_spaces )
- The SPACE function returns the specified number
of spaces.
Example SPACE(4) returns ' '
14. SPLIT( string str, string pat )
- The SPLIT function splits the string around the
pattern pat and returns an array of strings. You
can specify regular expressions as patterns.
Example SPLIT('hivehadoop','') returns
"hive","hadoop"
http//www.orienit.com/
1115. SUBSTR( string source_str, int
start_position ,int length ), SUBSTRING(
string source_str, int start_position ,int
length )
- The SUBSTR or SUBSTRING function returns a
part of the source string from the start position
with the specified length of characters. If the
length is not given, then it returns from the
start position to the end of the string.
Example1 SUBSTR('hadoop',4) returns 'oop'
Example2 SUBSTR('hadoop',4,2) returns 'oo'
16.TRIM( string str )
- The TRIM function removes both the trailing and
leading spaces from the string.
Example LTRIM(' hive ') returns 'hive'
http//www.orienit.com/
1217.UPPER( string str ), UCASE( string str )
- The UPPER or LCASE function converts the string
into upper case letters.
Example UPPER('HiVe') returns 'HIVE'
For More Information
http//www.kalyanhadooptraining.com/
http//www.orienit.com/
13Address Flat no 204, 2nd floor, Annapurna
Block, Aditya Enclave, Ameerpet,
Hyderabad-16. Contact us 040 6514 2345, 91
970 320 2345
Email ID info_at_OrienIT.com
http//www.orienit.com/