Title: MyLifeBits
1MyLifeBits
- Jim Gemmell, February 2005
2Conclusion
- We have entered an era of virtually unlimited
storage, enabling the lifetime store
- To make the store useful we need annotation,
typed links, and database features
- More capture, more correlation, less work by the
user
3Collaborators
- Chief inspiration and guinea pig: Gordon Bell
- Software development lead: Roger Lueder
- MSR collaborators: Lyndsay Williams, Ken Wood,
Kentaro Toyama, Ron Logan, Steve Drucker, Curtis
Wong, Mary Czerwinski, Brian Meyers
- Interns: Josh Blumenstock, Evan Salomon, Aleks
Aris
4Outline
- What is MyLifeBits
- History/Motivation
- MyLifeBits system outline
- Demo
- Future work
5MyLifeBits is
- An experiment in lifetime storage
- Digitizing Gordon Bell's past
- Capturing more of his future
- A software system
- Capture
- Storage and retrieval
- Organization and annotation
- Minimum requirement: fulfill Vannevar Bush's 1945
Memex vision
6Memex: "As We May Think", Vannevar Bush, 1945
- "A memex is a device in which an individual
stores all his books, records, and
communications, and which is mechanized so that
it may be consulted with exceeding speed and
flexibility"
- Full-text search, text and audio annotations, and
hyperlinks
7I am data
8The guinea pig
- Has now scanned virtually all
- Books written (and read when possible)
- Personal documents (correspondence including
memos and email, bills, legal documents, papers
written, ...)
- Photos
- Posters, paintings, photos of things (artifacts,
medals, plaques)
- Home movies and videos
- CD collection
- And, of course, all PC files
- Now recording phone, radio, TV (movies), web
pages; conversations and meetings to come
- Paperless throughout 2002. 12 scanned, 12
discarded.
- Only 44 GB, incl. 10 wma, 14 SQL!!! Video
o(100) 500 mov
9The 1 TB Life
- 1TB gives you 65 years of
- 100 email messages a day (5KB each)
- 100 web pages a day (50KB each)
- 5 scanned pages a day (100KB each)
- 1 book every 10 days (1 MB each)
- 10 photos per day (400 KB JPEG each)
- 8 hours per day of sound, e.g. telephone, voice
annotations, and meeting recordings (8 Kb/s)
- 1 new music CD every 10 days (45 min each at
128 Kb/s)
- It will take you 5 years to fill up your 80 GB
drive
- Want video? Buy more cheap drives (1 TB/year lets
you record 4 hours/day of 1.5 Mb/s video)
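The arithmetic behind this slide can be checked with a short sketch. It assumes decimal units (1 KB = 1000 bytes, 1 Kb/s = 1000 bits/s) and a 365-day year; the slide does not state its conventions, and all variable names here are illustrative.

```python
# Rough check of the "1 TB Life" budget, assuming decimal units.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

def stream_bytes(seconds, bits_per_second):
    """Bytes produced by a constant-bit-rate stream."""
    return seconds * bits_per_second / 8

# Bytes captured per day for each item on the slide.
daily = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB / 10,                            # one 1 MB book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8 * 3600, 8_000),          # 8 hours at 8 Kb/s
    "music": stream_bytes(45 * 60, 128_000) / 10,    # one 45 min CD every 10 days
}

bytes_per_year = sum(daily.values()) * 365
years_per_tb = TB / bytes_per_year
years_per_80gb = 80 * GB / bytes_per_year
video_tb_per_year = stream_bytes(4 * 3600, 1_500_000) * 365 / TB

print(f"{years_per_tb:.1f} years per TB")
print(f"{years_per_80gb:.1f} years per 80 GB drive")
print(f"{video_tb_per_year:.2f} TB/year for 4 hours/day of 1.5 Mb/s video")
```

Under these assumptions the total is about 43 MB per day, which gives roughly 63 years per terabyte and about 5 years for an 80 GB drive, in line with the rounded figures on the slide. Sound dominates the budget at about two thirds of the daily total.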
10Trying to fill a terabyte in a year
- Gordon's lifetime collection < 30 GB (12 GB is
music CDs)
Item                  Per TB        Per day
Photo (400 KB JPEG)   2.7M photos   7.3K photos
1 MB document         1.0M docs     2.9K docs
128 kb/s audio        18.6K hours   51 hours
256 kb/s video        9.3K hours    26 hours
1.5 Mb/s video        1.6K hours    4 hours
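The table values can be approximately reproduced with the sketch below. It assumes binary units (1 TB = 2^40 bytes, 1 KB = 1024 bytes, 1 kb/s = 1024 bits/s), which is a guess that happens to match the slide's rounding; the function names are illustrative.

```python
# Approximate reconstruction of the "Per TB" and "Per day" columns,
# assuming binary units and a 365-day year.
TIB = 2**40

def items_per_tb(item_bytes):
    """How many fixed-size items fit in one terabyte."""
    return TIB / item_bytes

def hours_per_tb(kbits_per_second):
    """How many hours of a constant-bit-rate stream fit in one terabyte."""
    bytes_per_second = kbits_per_second * 1024 / 8
    return TIB / bytes_per_second / 3600

rows = {
    "Photo (400 KB JPEG)": items_per_tb(400 * 1024),
    "1 MB document": items_per_tb(2**20),
    "128 kb/s audio": hours_per_tb(128),
    "256 kb/s video": hours_per_tb(256),
    "1.5 Mb/s video": hours_per_tb(1.5 * 1024),
}

for name, per_tb in rows.items():
    # Per-day figure: what you must capture daily to fill 1 TB in a year.
    print(f"{name:22s} {per_tb:14,.0f} per TB {per_tb / 365:10,.1f} per day")
```

The per-day column is simply the per-TB column divided by 365, which is the point of the slide: filling a terabyte in a year would take thousands of photos or documents, or more audio than there are hours in a day.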
11yet if the user inserted 5000 pages of material
a day it would take him hundreds of years to fill
the repository, so that he can be profligate and
enter material freely
12So you've got it, now what do you do with it?
- Can you find anything?
- Can you organize that many objects?
- Once you find it, will you know what it is?
- Once you've found it once, could you find it
again?
13
- A record if it is to be useful must be
continuously extended, it must be stored, and
above all it must be consulted
- The difficulty seems to be, not so much that we
publish unduly but rather that publication has
been extended far beyond our present ability to
make real use of the record
- Vannevar Bush
14MyLifeBits Software
MyLifeBits store
database
15Entities and Links
Photo of Event
Caller in Phone Call
Annotates
Transcludes
16MyLifeBits Schema (simplified)
Event types
Relationship types
Images
Music
Event log
Relationships
Phone calls
Events
Resources
Tasks
People
Notes
Resource entities
Email Messages
Saved searches
Entity types
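To make the idea of typed entities and typed links concrete, here is a minimal sketch using SQLite. It is not the MyLifeBits schema: the table names, column names, and helper functions are all hypothetical, and only the link type names ("Photo of Event", "Caller in Phone Call") come from the slides.

```python
import sqlite3

# Hypothetical, heavily simplified schema: entities and links both carry a type.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity_types (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE link_types   (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE entities (
    id      INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL REFERENCES entity_types(id),
    title   TEXT NOT NULL
);
CREATE TABLE links (
    source_id INTEGER NOT NULL REFERENCES entities(id),
    target_id INTEGER NOT NULL REFERENCES entities(id),
    type_id   INTEGER NOT NULL REFERENCES link_types(id),
    PRIMARY KEY (source_id, target_id, type_id)
);
""")

def type_id(table, name):
    """Return the id of a type row, creating it on first use."""
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    return conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()[0]

def add_entity(type_name, title):
    cur = conn.execute("INSERT INTO entities (type_id, title) VALUES (?, ?)",
                       (type_id("entity_types", type_name), title))
    return cur.lastrowid

def add_link(source_id, target_id, link_type):
    conn.execute("INSERT INTO links (source_id, target_id, type_id) VALUES (?, ?, ?)",
                 (source_id, target_id, type_id("link_types", link_type)))

def sources_of(target_id, link_type):
    """Titles of all entities linked to target_id by the given link type."""
    rows = conn.execute("""
        SELECT e.title FROM links l
        JOIN entities e   ON e.id = l.source_id
        JOIN link_types t ON t.id = l.type_id
        WHERE l.target_id = ? AND t.name = ?
        ORDER BY e.id""", (target_id, link_type))
    return [title for (title,) in rows]

party = add_entity("Event", "Birthday party")
photo = add_entity("Photo", "cake.jpg")
call = add_entity("Phone call", "Call on party day")
person = add_entity("Person", "Caller")
add_link(photo, party, "Photo of Event")
add_link(person, call, "Caller in Phone Call")
print(sources_of(party, "Photo of Event"))
```

Keeping link types in their own table, rather than as free text, is one way to let a query ask for "all photos of this event" without scanning annotations.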
17DEMO
18Future work: new capture modes/devices
Deja View
Body Media
SenseCam
Quindi
19Future work: Visualizations
- Don't give me a little card image and say,
"That's all you've got, because that's what I
thought you should want for your virtual
shoebox." There have got to be multiple
modalities and the designers have to be able to
deal with that. Don't metaphor me in, don't
give me only one way of looking at things.
- Andy van Dam, Hypertext '87 Keynote Address
U. Maryland
IN-SPIRE
Next Media
20Future work: UI
- UI Improvements
- User studies
21Future work: Content analysis and Data Mining
Creative thought and essentially repetitive
thought are very different things. For the latter
there are, and may be, powerful mechanical aids
Vannevar Bush
- Is MyLifeBits just enough rope to hang yourself
with?
- MyLifeBits must become MyPersonalAssistant
- Content analysis and data mining
- Doc similarity clean living
- Document meta-data extraction
22Future work: scaling
- Just starting to hit performance problems
- Stress tests and design modifications
23www.MyLifeBits.com
- http://research.microsoft.com/CARPE2004
24BONUS SLIDES
25Everything goes in a database
- You need all the features of a database
(Consistency, Indexing, Pivoting, Queries,
Speed/scalability, Backup, replication)
- If you don't use one, you will find yourself
creating one!
- Files as blobs, also sync with file system for
legacy apps
SQL
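The "files as blobs" idea with file-system sync can be sketched as follows. This is an illustration under assumptions, not the MyLifeBits implementation: it uses SQLite in place of the SQL store, and the `files` table and the `import_file` and `export_file` helpers are hypothetical names.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE files (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    mtime   REAL NOT NULL,   -- original modification time, kept as metadata
    content BLOB NOT NULL
)""")

def import_file(path: Path) -> int:
    """Copy a file's bytes and modification time into the database."""
    cur = conn.execute(
        "INSERT INTO files (name, mtime, content) VALUES (?, ?, ?)",
        (path.name, path.stat().st_mtime, path.read_bytes()))
    return cur.lastrowid

def export_file(file_id: int, directory: Path) -> Path:
    """Write a stored blob back to disk so file-based apps can open it."""
    name, mtime, content = conn.execute(
        "SELECT name, mtime, content FROM files WHERE id = ?", (file_id,)).fetchone()
    out = directory / name
    out.write_bytes(content)
    os.utime(out, (mtime, mtime))  # restore the original timestamp on the copy
    return out

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "memo.txt"
    src.write_bytes(b"scanned memo")
    file_id = import_file(src)
    export_dir = Path(tmp) / "export"
    export_dir.mkdir()
    copy = export_file(file_id, export_dir)
    round_trip_ok = copy.read_bytes() == b"scanned memo"
    print(round_trip_ok)
```

The database holds the authoritative copy, so it gains indexing, queries, and backup, while the exported file exists only so that applications expecting a path can still open it.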
26CARPE '04: The First ACM Workshop on Continuous
Archival and Retrieval of Personal Experiences
- October 15th, 2004, Columbia University, New York,
NY, USA
27Dear Appy, How committed are you? Signed, Lost
and Forgotten Data
By Gordon Bell http://research.microsoft.com/gbell
- Dear Appy,
- I'm having trouble with long-term commitment --
not on my end, heaven knows, but from the apps
that created me and with whom I like to
associate. Over time, these pesky apps evolve and
they simply don't recognize the data that they
once helped create! But, we data progeny -- and
there are lots of us -- feel that as our
creators, these apps should be responsible for
eternal support.
- But the little problem with recognition isn't
the worst of it -- sometimes the apps even
disappear altogether. I ask you, is it expecting
too much for 20-something year old data like me
to be interpretable by my app (e.g. Acrobat, DB2,
Draw, Eudora, Office, Quicken, or RealNetworks),
or am I just associating with irresponsible apps?
- If things continue on their current path, it
seems I will be completely un-interpretable
within 20 to 50 years! My apps will move to other
platforms, or evolve to be more Internet- or
Next-Big-Thing-centric...
28A Storocratic Oath
- Do no harm to dates (file creation, photo taken)
- Do no harm to device-created and other meta-data.
- Camera data and location data are sacred.
- Support and aid the creation of critical meta-data.
- When/how the user feels like it
- Auto-magically!
- Maintain user confidentiality
29Classification wish list
- Download classifications rather than build them
- Definitions and synonyms should help find what I
want
- Today it is too expensive to manually classify my
scanned paper. E.g. right time meta-data is
critical!
- Next year I hope the system can classify papers
and other documents e.g. bills
- In 10 years I expect all documents to appear
electronically classified with a little help
from me
30Personal Search is not Professional or Web search
- System sees every entry and access
- Everything, not just a professional life
- Limited to SIS, not an infinite amount, covers a
profession and personal life
Professional user
MyLifeBits
Depth e.g. information item types coverage
Web as seen by search engines
Knowledge breadth e.g. Dewey classification