Title: Telepresence: An Umbrella Research Topic
1. Telepresence: An Umbrella Research Topic
- Jim Gray
- Microsoft Research
- Gray@Microsoft.com
- http://research.Microsoft.com/Gray/
2. NSF: Nerve Center of Science
If it's not broke, don't fix it. But...
- US science is the engine of progress, BUT...
- Best and brightest are spending increasing time fundraising
- Seems excessive to me.
- Venture capital community is richer and more generous than NSF
3. Outline (ambitious!)
- Microsoft Research (census)
- Tele-Presentations (Gordon Bell, Jim Gemmell)
- Microsoft Research initiative on Telepresence
- What if you could record everything you see and hear?
- The architecture revolution: processing moves to transducers
4. Microsoft Research -- 1991
- Founded in 1991
- Goal: pursue strategic technologies for Microsoft
- Original research groups
- Natural Language Processing
- Operating Systems
- Programming Languages
- Overall size < 20 at the end of 1992
5. Microsoft Research -- 1998
- 280 Researchers in 25 areas
- Operating systems to Statistical Physics
- Research lab locations
- Redmond, Cambridge, San Francisco
- Internationally recognized research teams
- Hundreds of publications, presentations
- Leadership roles in professional societies,
journals, conferences
6. MS Research Areas
- Operating systems, languages, compilers, virtual machines, networking, wireless computing, fault-tolerance, large scale servers, security
- Natural language, speech, vision, graphics, decision theory, information retrieval, UI, collaboration, statistics, signal processing
- Cryptography, statistical physics and discrete mathematics
7. Growing Fast
- Grew 4x from 1994 to 1997
- Decided in 1997 to grow by 3x in 3 years
- 200 in FY97 -> 600 in FY00, primarily in Redmond
- Major impact on MS products
- Virtually all MS products shipped today use technology from MS Research
- Critical role in MS growth
- Pioneering research in software that allows computers to see, hear, speak and understand
8. Microsoft Research Philosophy
- University organizational model
- Flat structure, critical mass groups
- Open research environment
- Aggressive publication of research results in literature and on World Wide Web
- Frequent visitors, daily seminars
- Over 70 visiting professors and interns in 1997
- Over 110 visiting researchers in 1998
9. Some Key Senior Researchers
- Systems
- Rick Rashid, Butler Lampson, Gordon Bell
- Anoop Gupta, Roger Needham, Chuck Thacker
- Databases and Data Mining
- David Lomet, Jim Gray, Usama Fayyad
- Graphics
- Jim Kajiya, Jim Blinn, Alvy Ray Smith, Michael Cohen
- Speech and Language
- Karen Jensen, George Heidorn, X.D. Huang, Alex Acero, Hsiao-Wuen Hon, Scott Meredith
10. Some Key Senior Researchers
- UI Design, Intelligent Systems, IR
- George Robertson, Linda Stone, Susan Dumais, David Heckerman, Eric Horvitz, Jack Breese
- Computer Vision and Signal Processing
- Steve Shafer, Rick Szeliski, P. Anandan, Rico Malvar
- Cryptography and Theory
- Yacov Yacobi, Jennifer Chayes, Christian Borgs, Michael Freedman
- Languages and Compilers
- Daniel Weise, Chris Fraser, Amitabh Srivastava, Luca Cardelli, David Hanson, Charles Simonyi, Todd Proebsting
11. Microsoft Research
- 1997 BusinessWeek Poll of Academia
- Voted #7 lab (overall) in Computer Science
- Voted #3 industrial research lab (after Bell Labs and IBM Research)
- Voted #2 most desirable lab to work (after Stanford)
12. Outline (ambitious!)
- Microsoft Research (census)
- Tele-Presentations (Gordon Bell, Jim Gemmell)
- Microsoft Research initiative on Telepresence
- What if you could record everything you see and hear?
- The architecture revolution: processing moves to transducers
13. Gordon Bell on Tele-Presentations
http://research.microsoft.com/barc/GBell/
14. Motivation: Telepresentations
- Presenter and/or audience telepresent
NOT meeting or collaboration settings
Forget the nasty social issues!
Mostly one-way
15. Telepresentation Elements
- Slides
- Audio
- Video
- Script, text comments, hyperlinks, etc.
16. Telepresentations: The Essentials
- Slide and audio a must
- Add some video (low quality) to make us feel good
- Storage and transmission costs low
17. Telepresentations: The Killer App
- Increased attendance, lower travel costs
- Practical and low-cost NOW
- e.g. ACM97
- 2,000 visitors in real space, 20,000 visitors on Internet
http://research.microsoft.com/acm97
18. Today's Experiment
- Would you like to pause, rewind, browse?
- Do you wish you could have seen this
- At home?
- At another time?
- How much does a present speaker add?
- How much would you pay for real presence?
19. Outline (ambitious!)
- Microsoft Research (census)
- Tele-Presentations (Gordon Bell, Jim Gemmell)
- Microsoft Research initiative on Telepresence
- What if you could record everything you see and hear?
- The architecture revolution: processing moves to transducers
20. Changing role of computation
- Past: Computers for
- computing (Cray)
- business data processing (IBM)
- document creation (PC)
- Future: Computers for
- understanding and learning
- communicating
- consuming and entertaining
- Requires new User Interface to machines
21. Flows
22. Making Flows a Reality
- Computer Graphics
- Creating realistic looking environments, people
- Computer Vision
- Analyzing posture, gaze, gestures
- Speech input/output
- Natural Language
- Analysis, IR
- Implicit requests for information
23. Building life-like human characters
24. Recognizing gestures
25. Generating life-like speech from textual data
- Data-driven stochastic speech
- Natural sounding
- Rapid, automatic customizability
- Examples
- Synthetic voice w/ transplanted speech contours
26. Artificial singing
- AT&T Voder, 1962, by Homer Dudley
- Daisy (inspiration for HAL's voice in 2001)
- Microsoft Research Whistler, 1997
- Scarborough Fair
27. Analyzing language
- Language recognition shipped in Word 97
- General purpose text-critiquing, summarization,
Japanese word-breaking
28. Inside The Office Grammar Checker
29. Understanding language: MindNet
- A huge language knowledge base
- Automatically created from dictionaries
- Words (nodes) linked by relationships
- Millions of links
- Recently added (Encarta) encyclopedia knowledge
30. [Figure: MindNet fragment around the word "chicken". Nodes shown: chicken, hen, egg, make, wing. Link labels shown: Is_a, Not_is_a, Typ_obj, Typ_obj_of, Typ_subj, Typ_subj_of, Purpose, Cause, Means, Part, Part_of, Locn_of, Quesp.]
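The structure described for MindNet, words as nodes joined by labeled relationships, can be sketched as a small labeled graph. This is only an illustrative toy, not MindNet's actual data model or construction process: the class `SemanticNet` and its methods are hypothetical names, and the example edges are invented for illustration, reusing relation labels that appear on the slide (Is_a, Part_of, Typ_obj).

```python
from collections import defaultdict


class SemanticNet:
    """Toy labeled graph: words are nodes, relations are labeled edges."""

    def __init__(self):
        # word -> list of (relation, other_word)
        self._out = defaultdict(list)

    def add(self, source, relation, target):
        self._out[source].append((relation, target))

    def related(self, word, relation):
        """Words reachable from `word` over one edge with this label."""
        return [t for r, t in self._out[word] if r == relation]

    def is_a_chain(self, word):
        """Follow Is_a links upward (first link only) until none remain."""
        chain, seen = [], {word}
        while True:
            parents = self.related(word, "Is_a")
            if not parents or parents[0] in seen:
                return chain
            word = parents[0]
            seen.add(word)
            chain.append(word)


net = SemanticNet()
# Illustrative edges only; they are not taken from the real MindNet.
net.add("hen", "Is_a", "chicken")
net.add("chicken", "Is_a", "bird")
net.add("wing", "Part_of", "chicken")
net.add("make", "Typ_obj", "egg")

print(net.is_a_chain("hen"))
print(net.related("wing", "Part_of"))
```

A real knowledge base with millions of links would need indexing in both directions and weighting of paths; the sketch only shows the node-and-labeled-link shape.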
31. Changing balance between user and software systems
- Yesterday
- Applications were single programs running in isolation
- Users used to (more or less) understand systems that they used
- Today
- Componentized applications operate in concert
- Sophisticated users understand only small percentage of systems they use
32. Tomorrow's Systems and Applications
- Users will not be able to predict
- where computations will be performed,
- when they will be performed or
- by what software components
- Gap between system capabilities and user understanding will grow to the point that the only way users will be able to use a system is through assisting agents
33. Examples of user agents and implicit actions
- Lumiere (Office 97)
- Monitoring user and program events to provide user help and assistance
- Implicit queries
- Inferring information needs from browsing
- Lookout/SpamKiller
- Monitoring mail activity to auto-categorize it
34. User Modeling
- Models of a user's informational goals
- User's query (when available)
- User's background
- Acute and long-term search activity
- Acute actions with objects and documents
- Program data structures
- Explicit and implicit information access and
display
35. Outline (ambitious!)
- Microsoft Research (census)
- Tele-Presentations (Gordon Bell, Jim Gemmell)
- Microsoft Research initiative on Telepresence
- What if you could record everything you see and hear?
- The architecture revolution: processing moves to transducers
36. Some Tera-Byte Databases
Kilo Mega Giga Tera Peta Exa Zetta Yotta
- The Web: 1 TB of HTML
- TerraServer: 1 TB of images
- Several other 1 TB (file) servers
- Hotmail: 7 TB of email
- Sloan Digital Sky Survey: 40 TB raw, 2 TB cooked
- EOS/DIS (picture of planet each week)
- 15 PB by 2007
- Federal Clearing house: images of checks
- 15 PB by 2006 (7 year history)
- Nuclear Stockpile Stewardship Program
- 10 Exabytes (???!!)
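To place the sizes above on the Kilo-to-Yotta scale, a small helper can be sketched. It assumes decimal prefixes (each step is a factor of 1000), which the slide does not specify; the function name `scale` and the `examples` table are illustrative, with byte counts copied from the bullets above.

```python
PREFIXES = ["", "Kilo", "Mega", "Giga", "Tera", "Peta", "Exa", "Zetta", "Yotta"]


def scale(n_bytes):
    """Return (value, prefix) with 1 <= value < 1000 where possible.

    Assumes decimal prefixes: each step up the scale is a factor of 1000.
    """
    value = float(n_bytes)
    index = 0
    while value >= 1000 and index < len(PREFIXES) - 1:
        value /= 1000
        index += 1
    return value, PREFIXES[index]


examples = {
    "Hotmail": 7 * 10**12,
    "EOS/DIS by 2007": 15 * 10**15,
    "Stockpile Stewardship": 10 * 10**18,
}
for name, size in examples.items():
    value, prefix = scale(size)
    print(f"{name}: {value:g} {prefix}bytes")
```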
37. Info Capture
Kilo Mega Giga Tera Peta Exa Zetta Yotta
A letter
A novel
A Movie
Library of Congress (text)
LoC (image)
All Disks
All Tapes
- You can record everything you see or hear or read.
- What would you do with it?
- How would you organize and analyze it?
Video: 8 PB per lifetime (10 GB per hour)
Audio: 30 TB (10 KBps)
Read or write: 8 GB (words)
See http://www.lesk.com/mlesk/ksg97/ksg.html
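The lifetime figures on this slide can be roughly reproduced with back-of-envelope arithmetic. The sketch below assumes a 90-year lifetime recorded 24 hours a day and decimal units; neither assumption is stated on the slide, and the name `lifetime_bytes` is illustrative.

```python
HOURS_PER_YEAR = 24 * 365
LIFETIME_YEARS = 90  # assumption: not stated on the slide


def lifetime_bytes(bytes_per_hour, years=LIFETIME_YEARS):
    """Bytes accumulated by recording continuously for `years` years."""
    return bytes_per_hour * HOURS_PER_YEAR * years


video = lifetime_bytes(10 * 10**9)          # 10 GB per hour
audio = lifetime_bytes(10 * 10**3 * 3600)   # 10 KB per second

print(f"video: {video / 10**15:.1f} PB")
print(f"audio: {audio / 10**12:.1f} TB")
```

Under these assumptions video comes to just under 8 PB and audio to a little under 30 TB, which is consistent with the rounded numbers quoted above.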
38. Kilo Mega Giga Tera Peta Exa Zetta Yotta
A letter
A novel
A Movie
Library of Congress (text)
LoC (image)
LoC (sound and cinema)
All Photos
All Disks
All Tapes
All Information!
39. Michael Lesk's Points
www.lesk.com/mlesk/ksg97/ksg.html
- Soon everything can be recorded and kept
- Most data will never be seen by humans
- Precious resource: human attention
Auto-Summarization and Auto-Search will be key enabling technologies.
40. Outline (ambitious!)
- Microsoft Research (census)
- Tele-Presentations (Gordon Bell, Jim Gemmell)
- Microsoft Research initiative on Telepresence
- What if you could record everything you see and hear?
- The architecture revolution: processing moves to transducers
41. Put Everything in Future (Disk) Controllers
(it's not "if", it's "when?")
Acknowledgements
Dave Patterson explained this to me a year ago
Kim Keeton, Erik Riedel, and Catharine Van Ingen helped me sharpen these arguments
42. Remember Your Roots
43. Technology Drivers: Disks
Kilo Mega Giga Tera Peta Exa Zetta Yotta
- Disks on track
- 100x in 10 years: 2 TB 3.5" drive
- Shrink to 1" is 200 GB
- Disk replaces tape?
- Disk is super computer!
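The "100x in 10 years" projection implies a compound annual growth rate that is easy to work out. The sketch below is plain arithmetic on the slide's numbers; the helper name `annual_growth_factor` is illustrative, and reading the 2 TB drive as the end point of the 100x growth (so roughly 20 GB at the start) is an interpretation of the bullet, not something the slide states outright.

```python
import math


def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to `total_factor`."""
    return total_factor ** (1.0 / years)


factor = annual_growth_factor(100, 10)
doubling_years = math.log(2) / math.log(factor)
start_gb = 2000 / 100  # 2 TB target divided by the 100x growth

print(f"capacity grows about {100 * (factor - 1):.0f}% per year")
print(f"capacity doubles about every {doubling_years:.1f} years")
print(f"implied starting capacity: {start_gb:.0f} GB")
```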
44. Data Gravity: Processing Moves to Transducers
- Move processing to data sources
- Move to where the power (and sheet metal) is
- Processor in
- Modem
- Display
- Microphones (speech recognition) and cameras (vision)
- Storage: data storage and analysis
45. It's Already True of Printers: Peripheral CyberBrick
- You buy a printer
- You get
- several network interfaces
- a PostScript engine
- cpu,
- memory,
- software,
- a spooler (soon)
- and a print engine.
46. All Device Controllers will be Cray 1s
- TODAY
- Disk controller is 10 mips risc engine with 2 MB DRAM
- NIC is similar power
- SOON
- Will become 100 mips systems with 100 MB DRAM.
- They are nodes in a federation (can run Oracle on NT in disk controller).
- Advantages
- Uniform programming model
- Great tools
- Security
- Economics (CyberBricks)
- Move computation to data (minimize traffic)
Central Processor and Memory
Tera Byte Backplane
47. Basic Argument for x-Disks
- Future disk controller is a super-computer.
- 1 bips processor
- 128 MB dram
- 100 GB disk plus one arm
- Connects to SAN via high-level protocols
- RPC, HTTP, DCOM, Kerberos, Directory Services, ...
- Commands are RPCs
- Management, security, ...
- Services file/web/db/... requests
- Managed by general-purpose OS with good dev environment
- Apps in disk save data movement
- need programming environment in controller
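The claim that running applications in the disk saves data movement can be illustrated with a toy model. This is a sketch under stated assumptions, not an implementation of any real controller interface: `DiskNode`, `read_all`, and `scan` are hypothetical names, records are in-memory byte strings, and the only thing measured is how many bytes cross the (simulated) interconnect.

```python
from dataclasses import dataclass, field


@dataclass
class DiskNode:
    """Hypothetical disk node that counts bytes shipped to the host."""
    records: list = field(default_factory=list)
    bytes_sent: int = 0

    def read_all(self):
        """Block-server style: ship every record, let the host filter."""
        self.bytes_sent += sum(len(r) for r in self.records)
        return list(self.records)

    def scan(self, predicate):
        """x-Disk style: run the filter in the controller, ship only hits."""
        hits = [r for r in self.records if predicate(r)]
        self.bytes_sent += sum(len(r) for r in hits)
        return hits


def make_records():
    # Synthetic data: one record in a hundred is an "error" row.
    return [
        f"row {i}: {'error' if i % 100 == 0 else 'ok'}".encode()
        for i in range(1000)
    ]


def is_error(record):
    return record.endswith(b"error")


host_node = DiskNode(make_records())
host_hits = [r for r in host_node.read_all() if is_error(r)]

disk_node = DiskNode(make_records())
disk_hits = disk_node.scan(is_error)

print(f"filter on host: {host_node.bytes_sent} bytes moved")
print(f"filter on disk: {disk_node.bytes_sent} bytes moved")
```

Both paths return the same rows; the difference is that the second ships only the matches, which is the sense in which function gravitates to data.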
48. The Slippery Slope
Nothing = Sector Server
- If you add function to server
- Then you add more function to server
- Function gravitates to data.
Something = Fixed App Server
Everything = App Server
49. Why Not a Sector Server? (let's get physical!)
- Good idea, that's what we have today.
- But
- cache added for performance
- Sector remap added for fault tolerance
- error reporting and diagnostics added
- SCSI commands (reserve, ...) are growing
- Sharing problematic (space mgmt, security, ...)
- Slipping down the slope to a 2-D block server
50. Why Not a 1-D Block Server? Put A LITTLE on the Disk Server
- Tried and true design
- HSC - VAX cluster
- EMC
- IBM Sysplex (3980?)
- But look inside
- Has a cache
- Has space management
- Has error reporting and management
- Has RAID 0, 1, 2, 3, 4, 5, 10, 50, ...
- Has locking
- Has remote replication
- Has an OS
- Security is problematic
- Low-level interface moves too many bytes
51. Why Not a 2-D Block Server? Put A LITTLE on the Disk Server
- Tried and true design
- Cedar -> NFS
- file server, cache, space, ...
- Open file is many fewer msgs
- Grows to have
- Directories and naming
- Authentication and access control
- RAID 0, 1, 2, 3, 4, 5, 10, 50, ...
- Locking
- Backup/restore/admin
- Cooperative caching with client
- File servers are a BIG hit: NetWare
- SNAP! is my favorite today
52. Why Not a File Server? Put a Little on the Disk Server
- Tried and true design
- Auspex, NetApp, ...
- NetWare
- Yes, but look at NetWare
- File interface gives you app invocation interface
- Became an app server
- Mail, DB, Web, ...
- NetWare had a primitive OS
- Hard to program, so optimized wrong thing
53. Why Not Everything? Allow Everything on Disk Server (thin clients)
- Tried and true design
- Mainframes, Minis, ...
- Web servers, ...
- Encapsulates data
- Minimizes data moves
- Scalable
- It is where everyone ends up.
- All the arguments against are short-term.
54. The Slippery Slope
Nothing = Sector Server
- If you add function to server
- Then you add more function to server
- Function gravitates to data.
Something = Fixed App Server
Everything = App Server
55. Disk Node
- has magnetic storage (100 GB?)
- has processor and DRAM
- has SAN attachment
- has execution environment
Applications
Services
DBMS
File System
RPC, ...
SAN driver
Disk driver
OS Kernel
56. Technology Drivers: System on a Chip
- Integrate processing with memory on one chip
- chip is 75% memory now
- 1 MB cache >> 1960 supercomputers
- 256 Mb memory chip is 32 MB!
- IRAM, CRAM, PIM, ... projects abound
- Integrate networking with processing on one chip
- system bus is a kind of network
- ATM, FiberChannel, Ethernet, ... logic on chip.
- Direct IO (no intermediate bus)
- Functionally specialized cards shrink to a chip.
57. Technology Drivers: What if Networking Was as Cheap As Disk IO?
- Disk
- Unix/NT: 8% cpu @ 40 MBps
- TCP/IP
- Unix/NT: 100% cpu @ 40 MBps
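The comparison above can be restated as CPU cost per unit of bandwidth. The sketch below reads the two figures as 8% and 100% of one CPU at 40 MBps, which is an interpretation of the slide's numbers; `cpu_per_mbps` is an illustrative helper name.

```python
def cpu_per_mbps(cpu_fraction, mbps):
    """Fraction of one CPU consumed per MBps of sustained transfer."""
    return cpu_fraction / mbps


disk = cpu_per_mbps(0.08, 40)   # disk IO: 8% of a CPU at 40 MBps
tcp = cpu_per_mbps(1.00, 40)    # TCP/IP: a whole CPU at 40 MBps
ratio = tcp / disk

print(f"disk:   {disk:.4f} CPU per MBps")
print(f"tcp/ip: {tcp:.4f} CPU per MBps")
print(f"networking costs about {ratio:.1f}x the CPU of disk IO")
```

On this reading, moving a byte over TCP/IP costs roughly an order of magnitude more processor time than moving it to or from disk, which is the gap the slide title asks about closing.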
58. Technology Drivers: The Promise of SAN/VIA
10x in 2 years
http://www.ViArch.org/
- Today
- wires are 10 MBps (100 Mbps Ethernet)
- 20 MBps tcp/ip saturates 2 cpus
- round-trip latency is 300 us
- In the lab
- Wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet, ...
- Fast user-level communication
- tcp/ip 100 MBps, 10% of each processor
- round-trip latency is 15 us
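To see what the latency and bandwidth numbers mean for a single request, a simple cost model can be sketched: time is round-trip latency plus payload size divided by wire bandwidth. The model, the 8 KB payload, and the name `request_time_us` are assumptions for illustration; the latencies and wire speeds are the slide's (10 MBps today, 10x that in the lab).

```python
def request_time_us(payload_bytes, round_trip_us, wire_mbps):
    """Round-trip latency plus transfer time, in microseconds.

    Uses decimal megabytes: 1 MBps moves exactly 1 byte per microsecond.
    """
    return round_trip_us + payload_bytes / wire_mbps


PAYLOAD = 8 * 1024  # assumed 8 KB request, not from the slide

today = request_time_us(PAYLOAD, round_trip_us=300, wire_mbps=10)
lab = request_time_us(PAYLOAD, round_trip_us=15, wire_mbps=100)

print(f"today:  {today:.1f} us")
print(f"in lab: {lab:.1f} us")
print(f"speedup: {today / lab:.1f}x")
```

For small payloads the latency term dominates, so the drop from 300 us to 15 us matters at least as much as the faster wire.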
59. SAN: Standard Interconnect
Gbps Ethernet: 110 MBps
- LAN faster than memory bus?
- 1 GBps links in lab.
- $100 port cost soon
- Port is computer
PCI: 70 MBps
UW SCSI: 40 MBps
FW SCSI: 20 MBps
SCSI: 5 MBps
60. Technology Drivers: 100 GBps Ethernet replaces SCSI
- Why I love SCSI
- It's fast (40 MBps)
- The protocol uses little processor power
- Why I hate SCSI
- Wires must be short
- Cables are pricey
- pins bend
61. Functionally Specialized Cards
[Figure: a card built from a P mips processor, M MB of DRAM, and ASICs]
Today: P = 50 mips, M = 2 MB
In a few years: P = 200 mips, M = 64 MB
62. Technology Drivers: Plug and Play Software
- RPC is standardizing (DCOM, IIOP, HTTP)
- Gives huge TOOL LEVERAGE
- Solves the hard problems for you
- naming,
- security,
- directory service,
- operations,...
- Commoditized programming environments
- FreeBSD, Linux, Solaris + tools
- NetWare + tools
- WinCE, WinNT + tools
- JavaOS + tools
- Apps gravitate to data.
- General purpose OS on controller runs apps.
63. Basic Argument for x-Disks
- Future disk controller is a super-computer.
- 1 bips processor
- 128 MB dram
- 100 GB disk plus one arm
- Connects to SAN via high-level protocols
- RPC, HTTP, DCOM, Kerberos, Directory Services, ...
- Commands are RPCs
- Management, security, ...
- Services file/web/db/... requests
- Managed by general-purpose OS with good dev environment
- Move apps to disk to save data movement
- need programming environment in controller
64. Summary
- Microsoft Research (census)
- Tele-Presentations (Gordon Bell, Jim Gemmell)
- Microsoft Research initiative on Telepresence
- What if you could record everything you see and hear?
- The architecture revolution: processing moves to transducers