Title: Getting Started Writing a Thesis/Dissertation
1 Getting Started Writing a Thesis/Dissertation
- Dr. Karen C. Davis
- Electrical and Computer Engineering Dept.
2 Graduation
3 ECES 877 Advanced Data Models and Query Optimization
- query optimization
- logical
- physical
- advanced data models
- object-relational
- data warehouse
- XML
Spring 2007: coming to a classroom near you!
4 Relational Algebra Query Trees
Sujan Turlapaty's thesis defense, Performance Analysis of Self-Maintainable Data Warehousing Algorithms, 11/99
5 Multiple View Processing Plan (MVPP)
- view chromosome: 101100010100001
- index chromosome: 1100110
- Fitness: sum of the query processing costs of the individual queries using the selected views and indexes
Thesis defense of Sirisha Machiraju, Space Allocation for Materialized Views and Indexes Using Genetic Algorithms, June 2002
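The chromosome encoding and fitness described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the method from the thesis: each bit is assumed to mark whether one candidate view or index is selected, and the cost model (`toy_query_cost`), the query list (`QUERIES`), and all numbers in them are hypothetical. Only the two bit strings come from the slide.

```python
def selected_positions(chromosome):
    """Return the 0-based positions whose bit is '1' (assumed: selected candidates)."""
    return {i for i, bit in enumerate(chromosome) if bit == "1"}


def toy_query_cost(query, views, indexes):
    """Hypothetical cost model: each useful selected view saves 10 units,
    each useful selected index saves 5 units. Invented for illustration."""
    saved = 10 * len(query["views"] & views) + 5 * len(query["indexes"] & indexes)
    return query["base"] - saved


def fitness(view_chromosome, index_chromosome, queries, cost=toy_query_cost):
    """Fitness = sum of the processing costs of the individual queries,
    evaluated with the views and indexes the chromosomes select."""
    views = selected_positions(view_chromosome)
    indexes = selected_positions(index_chromosome)
    return sum(cost(q, views, indexes) for q in queries)


# Hypothetical workload: base cost plus the views/indexes that would help each query.
QUERIES = [
    {"base": 100, "views": {0, 1}, "indexes": {4}},
    {"base": 80, "views": {5}, "indexes": {2, 6}},
    {"base": 60, "views": {7, 9}, "indexes": {0, 1}},
]

if __name__ == "__main__":
    # Bit strings taken from the slide.
    print(fitness("101100010100001", "1100110", QUERIES))
```

Because the fitness is a total cost, a genetic algorithm using it would presumably prefer chromosomes with lower values; the slide does not state this.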
6 BH System Architecture
Michael Brant, Binding Hash Technique for XML
Query Optimization, 2006
7 My Students
Ph.D. (2)
- Satish Venkatesan, Database Modeling for Electronic Design Automation Environments, 1996 (awarded ECECS Outstanding Dissertation Award, 1996).
- Yunsong Zhan, XML-based Data Integration for Application Interoperability, 2002.
M.S. (24)
- Lun Ye, A Compiler Cooperative Dynamic Memory Management System for C, 1993.
- Ron Meade, EasyOpt: A Design Optimization Interface Package, 1994.
- Rao Seshagiri Kasinadhuni, Design and Performance Issues of Client-Server DBMS Architectures, 1994.
- Samir Nigam, Transformation-based Semantic Query Optimization for Object-Oriented Databases, 1994.
- Baskaran Dharmarajan, The Property Map: A Theoretical Foundation and Query Optimization Algorithms, 1997.
- Mala Rajamani, Reduction and Maintenance of Self-maintainable Views for Data Warehousing, 1997.
- Veena Pandiri, A Global Framework for Distributed Agent-based Systems, 1997.
- Radha Ganapathy, Selection of Self-Maintainable Views to Materialize in a Data Warehouse, 1998.
- Vishal Sheth, Extended Property Maps: An Efficient Access Mechanism for Retrieval from Large Data Sets, 1998.
- Gayathri Krishnan, Physical Schema Design for Object Databases, 1998.
- Shobha Ravishankar, Object-Oriented Index Selection and Integration, 1998.
- Ji Qin, Access Plan Generation for Property Maps and Multidimensional Indexes, 1999.
- Sujan Turlapaty, Performance Analysis of Self-Maintainable Data Warehousing Algorithms, 1999.
- Unmi Tina Kang, Path Inherited Dictionary Index (PIDI): An Integrated Object-Oriented Database Index, 2000.
- Jennifer Grommon-Litton, Heuristic Design Algorithms and Evaluation Methods for Property Maps, 2000.
- Rajeswari Malladi, Applying Multiple Query Optimization in Mobile Databases, 2001.
- Xioaming Du, Dynamic Channel and Broadcast Disk Organization in Mobile Databases, 2001.
- Krishnamoorthy Janakiraman, Entity Identification Using Data Mining Techniques, 2001.
- Casie Phipps, Migrating an Operational Database Schema to Data Warehouse Schemas, 2002.
- Ashima Gupta, Performance Comparison of Property Map Indexing and Bitmap Indexing for Data Warehousing, 2002.
- Sirisha Machiraju, Space Allocation for Materialized Views and Indexes Using Genetic Algorithms, 2002.
- Ravi Darira, A Design Framework for Property Maps, 2006.
- Micheale Brant, Binding Hash Technique for XML Query Optimization, 2006.
- Janet Rajan, A Framework for Medical Acronym Disambiguation, 2007.
8 Thesis/Dissertation Organization
- title page, abstract, dedication, table of contents, list of figures, list of tables
- Introduction
- Related Research
- Foundations
- Results (may be several chapters)
- Conclusions and Future Work
- Appendices
9 Sample Table of Contents
10 Introduction
- introduce the general topic area
- narrow the focus to specific topic
- motivate the research
- why is it needed?
- who will benefit from the research?
- conclude with a clear statement of the problem
- give a statement of the work
- provide an overview of the thesis (one sentence
per chapter)
11 Sample Introduction
12 Research Objectives
- general research objective: one sentence describing what you hope to accomplish (not how!)
13 Parallel Sections: Statement of the Work
- specific research objectives: partition the general objective into sub-goals
- research plan/methodology/tasks/approach: revisit the objectives
- your approach to solving the problem
- each objective has an associated task or approach to satisfy the objective
- expected contributions: revisit the methodology
- what will you know or have when you've done the task?
- potential impact of your work
14 Sample Parallel Sections
15 Related Research
- focused around your topic, not a tutorial!
- compare/contrast to your approach
- tables with features/research efforts are a concise, readable way to summarize
16 Examples of Summary Tables
17 Foundations
- work you build on (your own or someone else's)
- definitions, theorems, models, system
18 Research
- Discuss conventions, setup, hypotheses of experiments, proofs
- why did you do it?
- what did you learn from it?
- Presenting
- figures
- algorithms
- tables
- graphs
- Sample! Don't do a dump of everything; put everything in appendices and discuss representative results in the body of the thesis or dissertation
19 Example Experiment Setup
- Goals
- What are the comparative storage and retrieval costs of REBSI and PMaps in different scenarios?
- How is individual and relative performance affected by parameters such as blocksize, database size, selectivity of queries, cardinality of attributes, kind of queries, and property ordering?
- Can PMap design and performance be improved using this knowledge?
- Under what conditions is each index the better choice?
[Figure: experiment setup relating the Query Set and PMap (pu, pstring) to PMap Performance and Storage Cost and REBSI Performance and Storage Cost]
Parameters shown in the figure:
- properties: 16
- Word Size (ws): 16, 32
- Tuple size (t): 1,000,000, 50,000
- Blocksize (SB): 2048, 4096, 8192
- Scaling Factor (sf): min, , 10
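One way to make an experiment setup like this reproducible is to enumerate every parameter combination explicitly. The sketch below is a hedged illustration, not part of the original study: the pairing of values with parameter names is assumed from the slide layout, the names `WORD_SIZES`, `TUPLE_SIZES`, `BLOCK_SIZES`, and `experiment_grid` are hypothetical, and the scaling factor is left out because one of its values is missing from the slide.

```python
from itertools import product

# Values as listed on the slide; the label-to-value pairing is an assumption.
WORD_SIZES = [16, 32]
TUPLE_SIZES = [1_000_000, 50_000]
BLOCK_SIZES = [2048, 4096, 8192]


def experiment_grid():
    """Return one configuration dict per combination of parameter values."""
    return [
        {"word_size": ws, "tuple_size": t, "block_size": sb}
        for ws, t, sb in product(WORD_SIZES, TUPLE_SIZES, BLOCK_SIZES)
    ]


if __name__ == "__main__":
    for config in experiment_grid():
        print(config)
```

Listing the grid this way also makes it easy to state in the thesis exactly how many configurations were run and which ones are shown as representative results.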
20 Example Presentation of Results
- number figures (e.g., Figure 3.2)
- refer to the figures in text
- In Figure 3.2, results for the HCAQS are shown.
- Figure 3.2 shows HCAQS results.
- explain the conventions
- The x-axis shows individual queries and the y-axis shows index pages retrieved. The queries are ordered by decreasing cardinality.
- offer observations to help the reader see what is important or interesting
- REBSI performance improves as the cardinality decreases.
- discuss possible reasons for the observed results
- give general conclusions
21 Conclusions and Future Work
- revisit objectives
- what was accomplished?
- what was learned?
- topics for future work
- extensions
- open questions
22 Conclusions
- The BH method works well for deeply nested queries with few branches (non-bushy)
- The BH indexing technique requires further optimization
- BindingCollection is a flexible data structure
- Can be used to generate witness trees for processing embedded XPath expressions
- Used to process XPath expressions directly
- Can use different indexing schemes
Future Work
- Modify the indexing technique to increase performance and perform inequality matching
- Expand Post-order Traversal to support more TAX pattern tree features, e.g., value-based joins
- Conduct a more extensive performance study
23 Citations
- allow the reader to follow up on the topic
- fill in background information
- judge what you've said by reading original sources
- relieve you of the burden of going over all territory on a subject
- strengthen/justify your point
- respect your peers by acknowledging their contributions
[vL78]
24 Citations
- not a part of speech!
- "never, ever, use a bracketed number as if it were the name of an author or a work" [vL78]
- BAD: In [23], algorithms are presented.
- GOOD: Jones presents algorithms [23].
25 Writing Style
- avoid vague words (e.g., "deals with," "handles")
- avoid contractions
- be consistent in spelling, punctuation, capitalization style
- use the same grammatical style for items in a list
- develop flow/transitions between paragraphs, sections, chapters
- avoid empty sections
- merge/eliminate single item sublists or subsections
- place punctuation inside quotes
- avoid second person ("you")
- try to write in only one verb tense, preferably the present tense
- use "including" instead of "etc."
- use "such as" instead of "like"
- put math in definitions, theorems, proofs; explain in English to build the reader's intuition
- Use instead of - in technical writing
- Space after ) and (not before!)
- Use "that" instead of "which" when not counting things
26 References
- [s99] Strunk, The Elements of Style, New York: bartleby.com, 1999, http://www.bartleby.com/141.
- [vL78] van Leunen, M.-C., A Handbook for Scholars, Alfred A. Knopf, 1978.
27 Current Research Work
- Sandipto Banerjee, Ph.D.
- Bartley Richardson, Ph.D.
- Lydia Fitzgerald, M.S.
- Bill Nicholson, M.S./Ph.D.