Towards Successful Ph.D. Research in Database Systems and Data Mining

2
Towards Successful Ph.D. Research in Database
Systems and Data Mining
  • Jiawei Han
  • Department of Computer Science
  • University of Illinois at Urbana-Champaign
  • www.cs.uiuc.edu/hanj
  • November 27, 2020

3
Outline
  • Database and data mining: highly promising themes
  • Long history of strong and successful research
  • Lots of new challenges
  • Lots of research themes
  • Selection of promising directions and promising
    topics
  • Making your research have a bigger impact
  • Discussing, debating, and active brain-storming
  • Capturing and harvesting the sparks of thought
  • Towards highly productive research
  • Learning from others: reviews and judgment
  • Collaborations and team work

4
DB and DM: Long History of Strong and Successful
Research
  • Necessity is the mother of invention
  • Coming from the real application demand
  • Constantly seeking new and extended applications
  • Developing core technologies for information
    systems
  • A long history of success
  • Real systems, numerous applications, and big
    industry
  • Relational database systems → application-oriented
    DBMS (spatiotemporal, CRM, banking, health info,
    ...) → data warehouses → data mining → Web
    search (Google)
  • Depth and thoroughness in research
  • Constant search for new, innovative methodologies
    and algorithms
  • In-depth study of implementation, optimization,
    and user needs
  • Scalability, uncertainty, approximation,
    streaming, ranking, aggregation, privacy, and
    security

5
Still Challenging and Promising
  • Huge amounts of data are mounting up rapidly
  • Gigabytes → terabytes → petabytes at a very fast
    pace
  • Data collection and dissemination: sensors,
    digital cameras, Web
  • Database and data mining: various new
    applications
  • Data streams, RFID, sensor networks, video/audio
    data, text and Web, computer/software systems,
    social networks, biological data, and
    science/engineering data
  • Searching, ranking, mining, uncertainty, noise,
    privacy, security
  • Database and data mining are still flourishing
  • Scalable statistical and machine learning methods
  • Pattern analysis methods
  • Integrated with database systems, data
    warehouses, and Web as a natural, hidden process
  • Still many open research problems and multiple
    research frontiers

6
Research Frontiers in Data Mining
  • Information network analysis
  • Stream data warehousing and data mining
  • Pattern mining, pattern usage, and pattern
    understanding
  • Warehousing and mining of moving object data,
    RFID data, and data from sensor networks
  • Spatiotemporal and multimedia data mining
  • Biological data mining
  • Text and Web mining
  • Data mining for software engineering and system
    analysis
  • Data cube-oriented multidimensional online
    analysis
  • Classification and ranking everywhere: databases,
    Web, documents, and knowledge

7
A Multidimensional View of Research Themes
  • Data view
  • relational data, transactional data, information
    network data, stream data, spatial, temporal,
    multimedia (video/audio), moving object data,
    RFID data, sensor data, biological data, text and
    Web data, software engineering and system data
  • Issue view
  • modeling, management, indexing, retrieval
    (query), update, integration, warehousing,
    mining, data cube computation, multidimensional
    online analysis, security, privacy, ...
  • Methodology view: incremental, parallel,
    distributed
  • For mining: statistical, machine learning,
    decision-tree, MDL, HMM, Naïve-Bayes, ...
  • Application view: different industries,
    governments, science and engineering
  • Adding dimensions: time, space, ...
  • Relaxing assumptions: approximation, uncertainty,
    ...

8
Outline
  • Database and data mining: highly promising themes
  • Long history of strong and successful research
  • Lots of new challenges
  • Lots of research themes
  • Selection of promising directions and promising
    topics
  • Making your research have a bigger impact
  • Discussing, debating, and active brain-storming
  • Capturing and harvesting the sparks of thought
  • Towards highly productive research
  • Learning from others: reviews and judgment
  • Collaborations and team work

9
Selection of Promising Directions
  • Read survey papers, proceedings, etc., discuss
    with your friends and professors, and use your
    own reasoning
  • Is the direction likely to be much needed and
    have a bright future?
  • Do I have sufficient background to work on it?
  • Am I truly interested in it?
  • Does the direction attract long-term
    investigation?
  • Is it OK to change it or adjust it?
  • You may need to constantly adjust your research
    directions
  • Ex.: myself, from deductive DBs (recursive query
    processing) to data mining

10
Making Your Research Have a Bigger Impact
  • Necessity is the mother of invention
  • What is most needed in the next several
    years?
  • Will it have long-term impact or fade out soon?
  • Innovative and thorough research
  • Is your approach fresh, innovative, somewhat
    ground-breaking?
  • Have you examined it systematically? Have you
    considered alternative or previously studied
    methods?
  • Can it be further improved?
  • Two kinds of research topics: creative vs.
    improvement
  • Find new themes (new patterns, new methodologies,
    new directions)
  • Improve the existing solutions
  • Never be tied to the existing solutions
  • First think about it independently, and work it
    out independently
  • Believe that you can always find new ways to
    improve it!

11
Discussions, Sparks, and Technical Meat
  • Look before you leap
  • Careful and thorough thinking should go before
    implementing and testing
  • Form small groups instead of working alone
  • Slides, emails, and weekly theme-based meetings
    or teleconferences
  • Questions on slides, related work, new designs,
    and proposed algorithms; try to find ways to
    improve them
  • Capture and harvest the sparks of thought
  • Many good ideas may come from a weak spark of
    thinking
  • Capture the sparks promptly and do not let them
    slip away

12
Case 1: ICDE'07 Best Student Paper Award
  • Feida Zhu, Xifeng Yan, Jiawei Han, Philip S. Yu,
    and Hong Cheng, "Mining Colossal Frequent
    Patterns by Core Pattern Fusion", in Proc. 2007
    Int. Conf. on Data Engineering (ICDE'07),
    Istanbul, Turkey, April 2007 (the Best Student
    Paper Award)
  • Identifying the problem that the current
    technology cannot solve and its applications
  • Colossal patterns, bio-applications
  • How was the paper generated? Progressive
    refinement
  • slides → discussions → algorithms → discussions →
    experiments → new slides
  • Smart ideas and technical innovation

13
Case 2: ICDE'06 Best Student Paper Award
  • Hector Gonzalez, Jiawei Han, Xiaolei Li, and
    Diego Klabjan, "Warehousing and Analysis of
    Massive RFID Data Sets", in Proc. 2006 Int. Conf.
    on Data Engineering (ICDE'06), Atlanta, Georgia,
    April 2006.
  • Necessity is the mother of invention
  • Working on a key problem: RFID data warehousing
  • The key solution: deep compression
  • How deep is deep? Maximal sharing of bulky
    movements
  • Multiple designs, refinements, testing, and
    refinement again
  • slides → discussions → algorithms → discussions →
    experiments → new slides
  • Constant brain-storming

14
Outline
  • Database and data mining: highly promising themes
  • Long history of strong and successful research
  • Lots of new challenges
  • Lots of research themes
  • Selection of promising directions and promising
    topics
  • Making your research have a bigger impact
  • Discussing, debating, and active brain-storming
  • Capturing and harvesting the sparks of thought
  • Towards highly productive research
  • Learning from others: reviews and judgment
  • Collaborations and team work

15
Learning from Others: Reviews and Judgment
  • A very important part of Ph.D. training is
    judgment: judging others as well as judging
    yourself
  • A good researcher should first be a good judge of
    research
  • Reading a good research paper: first read the
    problem and try it by yourself
  • Be active in serving as a reviewer: see how
    others evaluate the work and learn from the good
    judges
  • Read survey papers and write your own simple
    surveys on the problems you intend to work on

16
Putting All the Eggs in One Basket?
  • Working on several research problems or only on
    one?
  • Initially, more than one theme may help you test
    the waters and settle on a promising theme that
    suits you
  • Even after you have focused on one theme, it is
    good to try slightly different problems
  • Productivity, alternative thoughts, adjustable
    solutions, and research collaborations
  • Working with your friends and colleagues
  • Complement each other's strengths and expertise

17
Seminar Course: Continuous Training/Education
  • Advanced seminars for DAIS and DM group
  • Running every semester
  • Presenting your own work and getting feedback
    from the group
  • Mostly recently accepted conference papers
  • Requiring only a one-page summary/abstract
  • Presenting good papers from recent top
    conferences: selecting only SIGKDD, SIGMOD, VLDB,
    ICDE, ICDM, SDM, WWW, ... conference papers
    published in the last 12 months

18
Conference and Journal Reviews
  • Volunteering on conference and journal
    coordination
  • For each conference where we serve as a PC
    member, we have one Ph.D. student volunteering as
    conference coordinator
  • S/he communicates with the group members to
    select papers and collect reviews, and I have
    one or more rounds of thorough discussion with
    the coordinator to make sure the reviews are
    unbiased, comprehensive, and of high quality
  • Also, the reviews are ranked relative to one
    another and balanced
  • A good exercise for all the participants
  • Similar exercises for journals and proposal
    reviews

19
Semester Summary and Awards
  • Award summary as a way to promote excellence in
    research
  • Summary meeting each semester
  • Summary on each student's Web page and
    presentation
  • Award voting with multiple grades: gold, silver,
    bronze, and honorable mention
  • Vote after the major conference evaluation
    results are out
  • Publish the award voting summary
  • Presents and web publicity
  • Award competition also promotes collaborations

20
Questions
21
Thanks and Questions
22
Create a Productive Research Group
  • Selection of promising students
  • Training and selection of students from classes
  • Test run with research problems
  • Watch for sparks and working attitude
  • Written qualifications vs. oral ones
  • Team organization
  • CS591 vs. meetings (start ending meetings)
  • Use students' expertise, strengths, and interests
  • Division of group work: everyone is in charge
  • Theme-based dynamic small research groups
  • Encouraging students on their progress: papers,
    etc.
  • Semester summary, web-pages
  • Award competition

23
Group Administration/Public Relations Work (Sept.
'06-Aug. '07)
  • Group Webmaster (news, group Web page, pictures,
    etc.): Tianyi Wu
  • Web-based research reference collections: Hong
    Cheng
  • Hardware, equipment, and software master: Sang
    Kim
  • TKDD Information Director: Xiaoxin Yin
  • DAIS seminar coordinator: Deng Cai
  • DAISY System administrator: Hector Gonzalez
  • IlliMine project coordinator: Xiaolei Li
  • Industry/visitor coordinator: Chao Liu
  • Conference and journal review coordinator (3):
    Dong Xin, Jing Gao, and Chen Chen
  • Research proposal coordinator (2): Feida Zhu and
    Jianlin Feng
  • Social activity coordinator: Jaegil Lee, Ok-ran
    Jeong

24
Work on Promising Research Topics
  • Selection of promising research topics
  • Select topics based on your strength and interest
  • Putting all the eggs in one basket? You may work
    on 2-3 topics at the same time
  • Discussion, debate, and active brain-storming
  • Capture and harvest the sparks of thought
  • Two kinds of research topics: creative vs.
    improvement
  • Find completely new themes (new patterns, new
    methodologies, new directions)
  • Improve the existing solutions
  • Never be tied to the existing solutions
  • First think about it independently, and work it
    out independently
  • Believe that you can always find new ways to
    improve it!