Big Data Testing

About This Presentation
Title:

Big Data Testing

Description:

Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques. – PowerPoint PPT presentation


Transcript and Presenter's Notes


1
BIG DATA TESTING
  • By QA InfoTech

2
Scenario
3
OMG!! Did he just ask me to catch rats in a
place full of snakes?
4
Agenda
  • What is Big Data
  • Characteristics of Big Data
  • Meaning of Big Data to Us
  • Hadoop
  • Submitting a MapReduce Job

5
What is BIG DATA?
  • Big Data is similar to small data, but bigger
    in size
  • Big Data generates value from the storage and
    processing of very large quantities of digital
    information that cannot be analyzed with
    traditional computing techniques.
  • Walmart handles more than 1 million customer
    transactions every hour.
  • Facebook handles 40 billion photos from its user
    base.
  • Decoding the human genome originally took
    10 years; now it can be achieved in one
    week.

6
Three Characteristics of Big Data: The 3 Vs
7
What Does BIG DATA TESTING Mean to Testers?
  • Take into consideration these 3 perspectives:
  • Data
  • Infrastructure
  • Validation Tools

8
Now the question is: what technology is needed
for handling BIG DATA?
  • 1. HADOOP

9
Hadoop and Its Components
  • Hadoop is an open-source software framework for
    storing and processing big data in a distributed
    fashion on large clusters of commodity hardware.
    Essentially, it accomplishes two tasks: massive
    data storage and faster processing.
  • Source: http://www.trieuvan.com/apache/hadoop/common/

10
How is Hadoop Helping?
  • HDFS: A Java-based distributed file system that
    can store all kinds of data
  • MapReduce: A software programming model for
    processing large sets of data in parallel
  • YARN: A resource management framework for
    scheduling and handling resource requests from
    distributed applications
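
The MapReduce model in the list above can be illustrated without a cluster. The following is only a minimal in-memory sketch, not Hadoop code: the class `MiniMapReduce`, its method names, and the comma-separated "year,temperature" record format are all assumptions made for this example. It shows the three conceptual phases: map each record to a key/value pair, group values by key, then reduce each group to one result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration of the MapReduce model; this is not Hadoop code.
class MiniMapReduce {

    // Map phase: turn one "year,temperature" record (assumed format) into a (year, temperature) pair.
    static Map.Entry<String, Integer> map(String record) {
        String[] parts = record.split(",");
        return Map.entry(parts[0].trim(), Integer.parseInt(parts[1].trim()));
    }

    // Shuffle phase: group all values that share the same key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse one key's values into a single result (here, the maximum).
    static int reduce(List<Integer> values) {
        int max = Integer.MIN_VALUE;
        for (int v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    // Runs the three phases in sequence on a small list of records.
    static Map<String, Integer> run(List<String> records) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String record : records) {
            pairs.add(map(record));
        }
        Map<String, Integer> result = new TreeMap<>();
        shuffle(pairs).forEach((year, temps) -> result.put(year, reduce(temps)));
        return result;
    }

    public static void main(String[] args) {
        List<String> records = List.of("1949,111", "1949,78", "1950,0", "1950,22", "1950,-11");
        System.out.println(run(records)); // prints {1949=111, 1950=22}
    }
}
```

In a real Hadoop job the framework performs the grouping step and runs the map and reduce calls in parallel across the cluster; the sketch keeps everything in one process purely to make the data flow visible.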

11
This is our input file: Input Sampleset.txt
12
MapReduce Program for Max Temperature: Driver
Class
  • Job job = new Job();
  • job.setJarByClass(MaxTemperatureDriver.class);
  • job.setJobName("Max Temperature");
  • FileInputFormat.addInputPath(job, new
    Path(args[0]));
  • FileOutputFormat.setOutputPath(job, new
    Path(args[1]));
  • job.setMapperClass(MaxTemperatureMapper.class);
  • job.setReducerClass(MaxTemperatureReducer.class);

13
Mapper Class
  • @Override
  • public void map(LongWritable key, Text value,
    Context context)
  • throws IOException, InterruptedException {
  • String line = value.toString();
  • String year = line.substring(15, 19);
  • int airTemperature;
  • if (line.charAt(87) == '+') { // parseInt doesn't
    like leading plus
  • // signs
  • airTemperature = Integer.parseInt(line.substring(88,
    92));
  • } else {
  • airTemperature = Integer.parseInt(line.substring(87,
    92));
  • }
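
The fixed-width parsing in the mapper can be tried out without Hadoop. The sketch below is an assumption-laden illustration: the class `TemperatureRecordParser`, its methods, and the synthetic 92-character record are made up for this example; only the offsets (15 to 19 for the year, 87 to 92 for the signed temperature) come from the slide. Note that `Integer.parseInt` accepts a leading `+` on Java 7 and later, so the sign check mainly matters on older runtimes.

```java
import java.util.Arrays;

// Hypothetical standalone helper; only the column offsets are taken from the slide.
class TemperatureRecordParser {

    // The year occupies columns 15..18 of the record.
    static String parseYear(String line) {
        return line.substring(15, 19);
    }

    // The signed temperature occupies columns 87..91; a leading '+' is skipped before parsing.
    static int parseAirTemperature(String line) {
        int start = (line.charAt(87) == '+') ? 88 : 87;
        return Integer.parseInt(line.substring(start, 92));
    }

    // Builds a synthetic 92-character record (filler zeros everywhere else), made up for this example.
    static String syntheticRecord(String year, String signedTemperature) {
        char[] filler = new char[92];
        Arrays.fill(filler, '0');
        StringBuilder sb = new StringBuilder(new String(filler));
        sb.replace(15, 19, year);              // four-character year
        sb.replace(87, 92, signedTemperature); // sign plus four digits
        return sb.toString();
    }

    public static void main(String[] args) {
        String warm = syntheticRecord("1950", "+0022");
        String cold = syntheticRecord("1949", "-0011");
        System.out.println(parseYear(warm) + " " + parseAirTemperature(warm)); // prints 1950 22
        System.out.println(parseYear(cold) + " " + parseAirTemperature(cold)); // prints 1949 -11
    }
}
```

Isolating the parsing like this lets a tester check positive, negative, and malformed records directly, before the logic is wrapped in a mapper and run on a cluster.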

14
Reducer Class
  • @Override
  • public void reduce(Text key, Iterable<IntWritable>
    values,
  • Context context)
  • throws IOException, InterruptedException {
  • int maxValue = Integer.MIN_VALUE;
  • for (IntWritable value : values) {
  • maxValue = Math.max(maxValue, value.get());
  • }
  • context.write(key, new IntWritable(maxValue));
  • }
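
One detail of the reducer worth testing is the starting value `Integer.MIN_VALUE`. The following is a minimal sketch, using plain `Integer` values instead of Hadoop's `IntWritable`; the class `MaxReducerCheck` and the deliberately wrong `maxOfStartingAtZero` variant are hypothetical and exist only for contrast. It shows that a year whose readings are all below zero is reduced correctly only when the running maximum starts at the smallest possible int.

```java
import java.util.List;

// Hypothetical standalone check of the reducer's max logic; no Hadoop types are involved.
class MaxReducerCheck {

    // Same loop as the reducer, over plain Integer values.
    static int maxOf(Iterable<Integer> values) {
        int maxValue = Integer.MIN_VALUE;
        for (int value : values) {
            maxValue = Math.max(maxValue, value);
        }
        return maxValue;
    }

    // Deliberately wrong variant, made up for contrast: starting at 0 hides all-negative inputs.
    static int maxOfStartingAtZero(Iterable<Integer> values) {
        int maxValue = 0;
        for (int value : values) {
            maxValue = Math.max(maxValue, value);
        }
        return maxValue;
    }

    public static void main(String[] args) {
        List<Integer> allBelowZero = List.of(-11, -42, -3);
        System.out.println(maxOf(allBelowZero));               // prints -3
        System.out.println(maxOfStartingAtZero(allBelowZero)); // prints 0
    }
}
```

Edge cases like this are cheap to cover in a unit test and expensive to discover in the output of a full cluster run.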

15
Thank You
  • For more information, please:
  • Contact us at info@qainfotech.com
  • Visit us at www.qainfotech.com
  • Read our blog at www.qainfotech.com/blog
  • Follow us on Twitter at www.twitter.com/qainfotech

International Headquarters: Noida, Uttar Pradesh, India
USA Office: Farmington Hills, Michigan, U.S.A.