Title: Drawing (Complete) Binary Tanglegrams
1. Drawing (Complete) Binary Tanglegrams
Hardness, Approximation, Fixed-Parameter Tractability
Utrecht U, NL; TU Eindhoven, NL; Karlsruhe U, DE; Tokyo Inst. Tech., JP
Kevin Buchin, Maike Buchin, Jaroslaw Byrka, Martin Nöllenburg, Yoshio Okamoto, Rodrigo I. Silveira, Alexander Wolff
2. Tanglegram
- 2 trees
- leaves matched 1-to-1
3. Application example
Pocket gopher drawings from The Animal Diversity Web (http://animaldiversity.org)
[Figure leaf labels: grandis, castanops, bursarius, neglectus]
4. Outline of this talk
- Introduction
- 2-approximation algorithm
- Algorithm
- Approximation factor
- Conclusions
5. Comparing pairs of trees
- Comparing trees
- Visually
- Applications
- Software visualization
- Hierarchical clustering
- Phylogenetic trees
6. Problem statement: TL (Tanglegram Layout)
- Input: 2 trees S, T
- With leaves in 1-to-1 correspondence
- Output: plane drawings of S and T
- Minimizing inter-tree crossings
[Figure: drawings of S and T with 6, 4, 5, and 3 inter-tree crossings]
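The objective on this slide can be made concrete with a small sketch. It assumes each tree is given as nested pairs and that matched leaves carry the same label; both are representation choices made here, not taken from the talk. Under that assumption, the number of inter-tree crossings of a layout equals the number of inversions between the two leaf orders. The exhaustive search is only a baseline for tiny instances, not the algorithm presented in this talk, and the names `layouts`, `count_crossings`, and `brute_force_tl` are hypothetical.

```python
from itertools import product


def layouts(tree):
    """Yield every leaf order reachable by swapping children at internal nodes.

    A tree is either a leaf label or a pair (left, right) of subtrees.
    """
    if not isinstance(tree, tuple):
        yield [tree]
        return
    left, right = tree
    for l_order, r_order in product(layouts(left), layouts(right)):
        yield l_order + r_order  # children kept in place
        yield r_order + l_order  # children swapped


def count_crossings(order_s, order_t):
    """Count pairs of inter-tree edges that cross.

    Two edges cross exactly when their endpoints appear in opposite
    relative order in the two leaf sequences (an inversion).
    """
    pos_in_t = {leaf: i for i, leaf in enumerate(order_t)}
    seq = [pos_in_t[leaf] for leaf in order_s]
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def brute_force_tl(s, t):
    """Minimum crossings over all layouts of both trees (exponential time)."""
    return min(
        count_crossings(order_s, order_t)
        for order_s in layouts(s)
        for order_t in layouts(t)
    )


if __name__ == "__main__":
    S = (("a", "b"), ("c", "d"))
    T = (("a", "c"), ("b", "d"))
    print(brute_force_tl(S, T))
```

The quadratic inversion count keeps the sketch short; a merge-sort based count would bring a single evaluation down to O(n log n).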
7. Related work
- 2-sided crossing minimization problem
- Introduced by Sugiyama et al.
- Several differences
- Arbitrary degree
- Any ordering allowed
8. Previous work
- Holten and Van Wijk (2008)
- Visual Comparison of Hierarchically Organized Data
9. Previous work (cont'd)
- Dwyer and Schreiber (2004)
- 2.5D drawings of stacked trees
- One-sided (binary) version: O(n^2 log n) time
- Fernau, Kaufmann and Poths (2005)
- TL is NP-hard
- 1 (binary) tree fixed: O(n log^2 n) time
- FPT algorithm: O(c^k) time, for c ≈ 1024
10. Our results
- We study 2 versions of TL: binary TL and complete binary TL
- We show
- binary TL is NP-hard to approximate within any constant factor (under widely accepted conjectures)
- complete binary TL is NP-hard
- complete binary TL has a 2-approximation algorithm
- complete binary TL has an O(4^k n^2)-time FPT algorithm
11. 2-approximation algorithm
- Simple recursive approach
- Try each of 4 combinations, and recurse
Drawing Complete Binary Tanglegrams
12. Initial algorithm
- Algorithm
- Try each of the 4 combinations
- Count crossings
- Return the best one
- Can't count all crossings!
13. Types of crossings
- Lower-level
- Created by recursive calls
- Nothing to do about them
- Current-level
- Can be avoided at this level
- What about [crossing type shown in figure]?
14. Need to remember more
[Figure panels: Problematic situation; Good situation?]
15. Use labels
- To preserve this knowledge
[Figure: Initial layout]
16. Use labels
- Using labels, we can count more crossings
- Problematic situation only if labels are equal (indeterminate crossing)
17. Algorithm
- For each way of arranging the subtrees
- Assign labels to some leaves
- Solve recursively
- gives lower-level crossings
- Compute current-level crossings
- Return best of 4 combinations
- Running time: T(n) ≤ 8T(n/2) + O(n) = O(n^3)
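The running-time bound can be checked by unrolling the recurrence T(n) ≤ 8T(n/2) + O(n). Reading the factor 8 as 4 arrangements times 2 half-size subproblems per arrangement is an assumption based on the steps listed above, and the constant a is introduced here only for illustration.

```latex
\begin{align*}
T(n) &\le 8\,T(n/2) + a\,n \\
     % level i has 8^i subproblems of size n/2^i, costing a n 4^i in total
     &\le a\,n \sum_{i=0}^{\log_2 n - 1} 4^{i} + 8^{\log_2 n}\,T(1) \\
     &= a\,n \cdot \frac{n^{2} - 1}{3} + n^{3}\,T(1) \\
     &= O(n^{3})
\end{align*}
```

Both the summed per-level work and the number of base cases are cubic, which matches the stated bound.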
18. Approximation factor
- Mistakes from indeterminate crossings
- We cannot count them
- How many can we have?
- We show that IND ≤ c_opt
- Therefore c_alg ≤ 2 c_opt
- IND = indeterminate crossings
- c_opt = crossings in optimal drawing
- c_alg = crossings in algorithm drawing
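The step from the bound on IND to the factor 2 can be written out as a short chain. The first inequality is inferred from the slide rather than stated on it: it assumes that the algorithm's only miscounted crossings are the indeterminate ones, so its drawing is off from the optimum by at most IND.

```latex
\begin{align*}
c_{\mathrm{alg}} &\le c_{\mathrm{opt}} + \mathrm{IND}
    && \text{(assumed: only indeterminate crossings are miscounted)} \\
                 &\le c_{\mathrm{opt}} + c_{\mathrm{opt}}
    && \text{(shown in the talk: } \mathrm{IND} \le c_{\mathrm{opt}}\text{)} \\
                 &= 2\,c_{\mathrm{opt}}
\end{align*}
```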
19. Approximation factor (2)
- Obs.: Indeterminate crossings used to be good
- Upper-bound IND by the number of these crossings
- Use that trees are complete
- We know exactly how many edges each subtree has
20. Conclusions
- Studied binary TL / complete binary TL
- binary TL has no constant-factor apx.
- complete binary TL remains NP-hard
- complete binary TL has simple FPT algorithm
- 2-approximation algorithm for complete binary TL
- In practice, useful for non-complete trees too
21. Other remarks
- The factor 2 is tight
- For non-complete trees
- In theory, no guarantee
- In practice, experiments show good results
- Average factor well below 2
- Generalization to d-ary trees
- O(n^(1 + 2 log_d(d!))) time
- factor 1 + (d choose 2)
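A quick sanity check of the d-ary bounds, assuming they read n^(1 + 2 log_d(d!)) for the running time and 1 + (d choose 2) for the approximation factor: with d = 2 they should reduce to the O(n^3) time and factor 2 of the binary case. The function names below are hypothetical.

```python
import math


def dary_time_exponent(d: int) -> float:
    """Exponent of n in the assumed bound O(n^(1 + 2*log_d(d!)))."""
    return 1 + 2 * math.log(math.factorial(d), d)


def dary_approx_factor(d: int) -> int:
    """Assumed approximation factor 1 + (d choose 2)."""
    return 1 + math.comb(d, 2)


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, round(dary_time_exponent(d), 3), dary_approx_factor(d))
```

For d = 2 the exponent is 1 + 2·log_2(2) = 3 and the factor is 1 + 1 = 2, consistent with the binary results above.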
22. Ribosomal DNA sequencing
- rDNA genotypic identification procedure
- What's the difference between these?
- Involves the amplification of a phylogenetically informative target, such as the small-subunit (16S) rRNA gene