MapReduce?? - PowerPoint PPT Presentation

About This Presentation
Title:

MapReduce??

Description:

Title: PowerPoint Presentation Last modified by: cherio Created Date: 1/1/1601 12:00:00 AM Document presentation format: Other titles – PowerPoint PPT presentation

Slides: 45
Provided by: accn

Transcript and Presenter's Notes


1
?????????????????
  • ???
  • ??????
  • 2008-08-01

2
??
  • ????
  • ????
  • MapReduce??
  • Google?????????
  • ????Hadoop??
  • ???????

3
????
4
????
  • ??????(SMT)??????????
  • ????????????
  • ?????????????????
  • ???????
  • ??????????????????
  • ????????????????????
  • ??????????????????????
  • ????????????????
  • ??(??)???????

5
????
??
???
????
????
SMT???
???
???
????
???
???
????
??
6
????
  • ???????????
  • ????????
  • ????????
  • ????
  • ????
  • ????
  • ????

7
????
  • n-gram??????????
  • ?
  • ?3-gram????
  • P(This is a book .) = P(This) P(is | This) P(a | This is) P(book | is a) P(. | a book)
  • ????
  • ?P(book | is a) = C(is a book) / C(is a)
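The count-ratio estimate on this slide, P(book | is a) = C(is a book) / C(is a), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `count_ngrams` and `mle_prob` and the two-sentence toy corpus are hypothetical and do not come from the presentation.

```python
from collections import Counter

def count_ngrams(sentences, n):
    """Count every n-gram (as a tuple of tokens) in a list of tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def mle_prob(word, history, ngram_counts, history_counts):
    """P(word | history) = C(history + word) / C(history); 0.0 if the history is unseen."""
    history = tuple(history)
    denom = history_counts[history]
    if denom == 0:
        return 0.0
    return ngram_counts[history + (word,)] / denom

# Toy corpus (hypothetical data, for illustration only).
corpus = [
    "This is a book .".split(),
    "This is a pen .".split(),
]
trigrams = count_ngrams(corpus, 3)
bigrams = count_ngrams(corpus, 2)

# C(is a book) = 1 and C(is a) = 2 in the toy corpus.
p_book = mle_prob("book", ["is", "a"], trigrams, bigrams)
p_unseen = mle_prob("car", ["is", "a"], trigrams, bigrams)
print(p_book, p_unseen)
```

Any trigram that never occurs in the training data gets probability zero under this estimate, which is the weakness that smoothing addresses.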

8
????
  • ??????????
  • ????
  • P(I got to Wenchuan .) = 0 ?
  • ????
  • ?????????????????????
  • ????????
  • ?WB??

  • w_{i-n+1} ... w_i ??
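The zero-probability problem on this slide can be illustrated with a small smoothing sketch. It assumes that "WB" stands for Witten-Bell smoothing; the slide's own formula did not survive extraction, so the interpolated bigram form below is the common textbook variant and may differ from what the presenter showed. The function names and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict

def train(sentences):
    """Collect unigram counts, bigram counts, context counts and follower sets."""
    unigrams, bigrams, contexts = Counter(), Counter(), Counter()
    followers = defaultdict(set)  # previous word -> distinct words seen after it
    for tokens in sentences:
        unigrams.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
            followers[prev].add(word)
    return unigrams, bigrams, contexts, followers

def witten_bell(word, prev, model):
    """Interpolated Witten-Bell estimate of P(word | prev) (assumed form).

    P = (C(prev word) + T(prev) * P_unigram(word)) / (C(prev) + T(prev)),
    where T(prev) is the number of distinct words observed after prev.
    """
    unigrams, bigrams, contexts, followers = model
    p_lower = unigrams[word] / sum(unigrams.values())  # unsmoothed unigram estimate
    c_hist = contexts[prev]
    t_hist = len(followers.get(prev, ()))
    if c_hist == 0:
        return p_lower  # unseen history: fall back to the lower-order model
    return (bigrams[(prev, word)] + t_hist * p_lower) / (c_hist + t_hist)

# Toy corpus (hypothetical): "to Wenchuan" never occurs as a bigram.
corpus = [
    "I got to Beijing .".split(),
    "Wenchuan is a county .".split(),
]
model = train(corpus)
p = witten_bell("Wenchuan", "to", model)
print(p)  # non-zero although C(to Wenchuan) = 0
```

The unseen bigram receives part of the probability mass reserved for new continuations of "to", and the estimates over the whole vocabulary still sum to one.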

9
????
  • ??????????
  • ??????
  • (n-gram_1, P_1) ... (n-gram_m, P_m)
  • ????
  • ????????
  • ????????,n-gram????
  • n??,n-gram??
  • ??,??????, n??,?????????,????????

10
????
11
????
  • ??
  • ?????????
  • ?????????????
  • ?????
  • ??260M???5?????56G??
  • ?????
  • ??260M???5?????0.51?
  • ?????????????????
  • ?????
  • ?260M????????????2.6G??(SRILM)
  • ?????????
  • Gigaword, FBIS, Web1T
  • ??????
  • ????????????????PC?????
  • ????
  • ???????,??????

12
????
  • ??
  • ????????????
  • ????????????
  • ???????
  • ??????????????????

13
????
14
????
  • ????????????
  • ????
  • ??????????????
  • ??
  • ???????????
  • ????
  • ?????????????????
  • ??
  • ??????,?????
  • ??????????

15
????
  • ????????????
  • ????
  • ?????????????????????????
  • ??
  • ??????
  • ?????
  • ????
  • ?????????????,????????????,??????
  • ??
  • ??????
  • ???????????,?????????

16
????
  • ????????????
  • ?????
  • ???????????????,????????????,??????????????
  • ??
  • ????????
  • ????
  • ???????????????????,???????????????????????????

17
????
  • ????????????
  • ?????
  • ???????????,????????,??????????,????????
  • ??
  • ??????????
  • ??????????

18
????
  • ?????
  • ??????????????
  • ???????????
  • ????????
  • ????????
  • ??????????????
  • ?????????????????

19
MapReduce??
20
MapReduce??
  • ?Google?????
  • ??
  • ???????(>1TB)
  • ???????????PC?????
  • ??????????????

21
???MapReduce?
  • ???????????
  • ??????????
  • ???PC???????
  • ?????????
  • ????????????

22
Distributed Word Count
Very big data
23
Map Reduce??
  • Map
  • ?????(?,?)?
  • ????(?,?)?
  • Reduce
  • ????(?,?)?
  • ????(?,?)?

[Figure: Very big data -> MAP -> REDUCE -> Result]
24
Partitioning Function
25
????
  • ?????????
  • ????????????
  • map (in_key, in_value) -> (out_key, intermediate_value) list
  • reduce (out_key, intermediate_value list) -> out_value list
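The two signatures can be made concrete with a tiny single-process driver. This is a sketch for illustration only, not how Google's or Hadoop's runtime is implemented: `run_mapreduce` is a hypothetical name, everything runs in memory, and the inverted-index job is an assumed example.

```python
from collections import defaultdict

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Apply map_fn to every (in_key, in_value), group by out_key, then reduce.

    map_fn(in_key, in_value)        -> iterable of (out_key, intermediate_value)
    reduce_fn(out_key, values_list) -> iterable of out_value
    """
    intermediate = defaultdict(list)
    for in_key, in_value in inputs:
        for out_key, value in map_fn(in_key, in_value):
            intermediate[out_key].append(value)
    # Process keys in sorted order, mirroring the sort between the two phases.
    return {key: list(reduce_fn(key, intermediate[key])) for key in sorted(intermediate)}

# Example job (assumed for illustration): build an inverted index.
def index_map(doc_id, text):
    for word in set(text.split()):
        yield word, doc_id

def index_reduce(word, doc_ids):
    yield from sorted(doc_ids)

documents = [("d1", "map reduce"), ("d2", "map shuffle")]
index = run_mapreduce(documents, index_map, index_reduce)
print(index)
```

Only `index_map` and `index_reduce` are specific to the job; grouping by key is handled by the driver, which is the division of labour the two signatures express.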

26
???
  • map??????,???????????????(?,?)?
  • reduce??????,????????
  • ???????????

27
MapReduce
  • ???????????????????
  • map(String key, String value):
  •   // key: document id; value: document content
  •   for each word w in value:
  •     EmitIntermediate(w, "1")
  • reduce(String key, Iterator values):
  •   // key: a word; values: a list of counts
  •   int result = 0
  •   for each v in values:
  •     result += ParseInt(v)
  •   Emit(AsString(result))
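The word-count pseudocode translates almost line for line into runnable Python. The sketch below is a single-process illustration: `word_count` is a hypothetical stand-in for the framework's grouping step, and the two documents are made-up sample data.

```python
from collections import defaultdict

def map_fn(doc_id, content):
    """key: document id; value: document content. Emit (word, 1) for each word."""
    for word in content.split():
        yield word, 1

def reduce_fn(word, counts):
    """key: a word; values: a list of counts. Emit the total count."""
    result = 0
    for count in counts:
        result += count
    yield result

def word_count(documents):
    """Stand-in for the framework: group intermediate pairs by word, then reduce."""
    grouped = defaultdict(list)
    for doc_id, content in documents.items():
        for word, one in map_fn(doc_id, content):
            grouped[word].append(one)
    return {word: next(reduce_fn(word, counts)) for word, counts in grouped.items()}

# Made-up sample documents.
docs = {"doc1": "very big data", "doc2": "very big result"}
counts = word_count(docs)
print(counts)
```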

28
Google?????????
29
??????????
  • ??????
  • ??MapReduce??
  • ????
  • ?????
  • ??n-gram?
  • ??????

30
??????????
  • ?????
  • map
  • shard
  • reduce

31
??????????
  • ??n-gram?
  • map
  • shard
  • reduce
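The map / shard / reduce steps for collecting n-grams can be sketched as follows. The slide lists only the step names, so the roles are assumptions: map emits (n-gram, 1) pairs, shard picks a reducer by hashing the n-gram, and reduce sums the counts within each shard. All function names are hypothetical, and a CRC32 hash stands in for whatever hash function the real system uses.

```python
import zlib
from collections import Counter

def map_ngrams(tokens, n):
    """Map: emit (n-gram, 1) for every n-gram in one tokenized sentence."""
    for i in range(len(tokens) - n + 1):
        yield " ".join(tokens[i:i + n]), 1

def shard(ngram, num_shards):
    """Shard: assign an n-gram to a reducer with a hash that is stable across runs."""
    return zlib.crc32(ngram.encode("utf-8")) % num_shards

def reduce_counts(pairs):
    """Reduce: sum the counts of each n-gram that landed in one shard."""
    totals = Counter()
    for ngram, count in pairs:
        totals[ngram] += count
    return totals

def count_ngrams_sharded(sentences, n, num_shards):
    buckets = [[] for _ in range(num_shards)]
    for sentence in sentences:
        for ngram, count in map_ngrams(sentence.split(), n):
            buckets[shard(ngram, num_shards)].append((ngram, count))
    return [reduce_counts(pairs) for pairs in buckets]

# Made-up sample sentences.
sentences = ["This is a book .", "This is a pen ."]
shards = count_ngrams_sharded(sentences, n=2, num_shards=3)
merged = sum(shards, Counter())
print(merged["is a"])
```

Because the shard is a function of the n-gram alone, every occurrence of an n-gram reaches the same reducer, so each per-shard total is already the final count.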

32
(No Transcript)
33
??????????
  • ??????
  • ??Client-Server??
  • ?????????????
  • ???????????n-gram?????
  • ????
  • ????n-gram????????????
  • ???????????????????

34
(No Transcript)
35
????Hadoop??
36
Hadoop??
  • Hadoop??
  • ???MapReduce??
  • Apache??????
  • ???Doug Cutting
  • ????
  • HDFS Hadoop Distributed File System
  • MapReduce ???????
  • HBase ???????

37
Hadoop??
  • ???? Java
  • ????
  • Linux, OS/X, Solaris, Windows
  • ????????
  • Java
  • C/C++ (Streaming??,PIPE??)
  • Python (Streaming??,Jython??)
  • Perl
  • PHP
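Hadoop Streaming lets a job use any executable as mapper or reducer: records arrive on standard input, and the program writes key/value lines (tab-separated by default) to standard output. The sketch below shows the word-count logic such scripts would contain. It is an illustration under assumptions: the function names are hypothetical, and the sort between the two phases is simulated locally with `sorted` instead of being done by Hadoop.

```python
from itertools import groupby

def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(lines):
    """Sum counts per word; input lines must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"

# Local simulation of map -> sort -> reduce on made-up input.
# In a real streaming script each function would be fed sys.stdin
# and its output lines would be printed to stdout.
mapped = sorted(mapper(["very big data", "very big result"]))
reduced = list(reducer(mapped))
print(reduced)
```

In an actual job the mapper and reducer would live in separate scripts passed to the streaming jar through its `-mapper` and `-reducer` options.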

38
(No Transcript)
39
Hadoop??
  • MapReduce

40
Hadoop??
  • ????
  • Yahoo!
  • ?????2000???
  • ??1PB??
  • ????10000???

41
?????
42
?????
  • Hadoop?????
  • ????????
  • ???????
  • Hadoop?MapReduce??
  • ??streaming???????????
  • ??Pipe??

43
?????
  • ???????MapReduce????
  • ??????
  • ?????
  • ????????Client/Server??
  • ????
  • ????????

44
??!