Protein Encoding Optimization - PowerPoint PPT Presentation

1
Protein Encoding Optimization
  • Student: Logan Everett
  • Mentor: Endre Boros
  • Funded by DIMACS REU 2004

2
Project Overview
  • Model Biological Scoring Matrices
  • Weighted Binary Hamming Space
  • Optimize Using Linear Programming
  • Accurate Random Generation

3
Scoring Matrices
[Figure: scoring-matrix example; residue labels A Q M K R H A R M I F L, scores 4 1 5 3 3 -3]
4
Encode To Binary Strings
  • Hamming Distances
  • Easy to Approximate on Binary Strings
  • Statistically Proven Methods
  • More Efficient
  • How Do Similarity and Distance Relate?
  • Inverse Relationship
  • First Create Real Distance Vector D
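
A minimal sketch of the two ideas on this slide. The Hamming distance is standard. The similarity-to-distance conversion is only one hypothetical choice of inverse relationship, since the slides do not say which transform was used, and the toy scores are made up for illustration.

```python
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(ca != cb for ca, cb in zip(a, b))

def distance_vector(symbols, similarity):
    """Turn pairwise similarity scores into a distance vector D.

    ASSUMPTION: distance = (largest score) - score + 1, chosen only so that
    higher similarity gives a smaller, strictly positive distance.
    """
    pairs = list(combinations(symbols, 2))
    top = max(similarity[p] for p in pairs)
    return {p: top - similarity[p] + 1 for p in pairs}

# Toy similarity scores (illustrative, not taken from a real scoring matrix).
sim = {("A", "R"): -1, ("A", "N"): -2, ("R", "N"): 0}
D = distance_vector(["A", "R", "N"], sim)
```

Here the most similar pair (R, N) receives the smallest distance and the least similar pair (A, N) the largest.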

5
Precise Problem Distortion
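  • $D_{ij}(1-\varepsilon) \le \lambda\, h_{f_i,f_j} \le D_{ij}(1+\varepsilon)$
  • $\forall$ unique pairs $i,j$ ($\binom{n}{2}$ of them)
  • s.t. $0 \le \varepsilon \le 1$ and $0 \le \lambda$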
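
For a fixed encoding, the smallest distortion that satisfies the inequality above has a closed form. Each ratio r = h / D must land in [1 - eps, 1 + eps] after scaling by lam, and the bound is tightest when the smallest and largest ratios hit the two ends. The sketch below assumes all target distances are positive and that at least one pair of encodings differs. The function name and the toy data are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Hamming distance between two equal-length binary strings."""
    return sum(x != y for x, y in zip(a, b))

def best_distortion(encoding, D):
    """Return (eps, lam) minimizing eps for a FIXED encoding.

    encoding: dict symbol -> binary string
    D: dict (symbol_i, symbol_j) -> positive target distance,
       keyed by pairs of symbols in sorted order.
    """
    ratios = [hamming(encoding[i], encoding[j]) / D[(i, j)]
              for i, j in combinations(sorted(encoding), 2)]
    lo, hi = min(ratios), max(ratios)
    lam = 2.0 / (lo + hi)            # scale so lam*lo = 1 - eps and lam*hi = 1 + eps
    eps = (hi - lo) / (hi + lo)
    return eps, lam

# Toy data: three symbols, all pairwise Hamming distances equal to 2.
enc = {"A": "0011", "B": "0101", "C": "1111"}
D = {("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 2}
eps, lam = best_distortion(enc, D)
```

With these toy values the ratios are 1, 0.5 and 1, so the scale comes out as 4/3 and the distortion as 1/3.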

6
Encoding Scheme as Vector
  • C 0110101101010101010
  • S 1010110101011010101
  • T 0110101010011011010
  • P 1010101011010100110
  • A 1011010101011011010
  • G 1010101100111010101
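
One reading of this slide, offered as an interpretation rather than something the transcript states: an encoding can be summarized by counting how often each column pattern (cross section) occurs, because Hamming distances depend only on those counts. The sketch below uses the six strings listed above. The function names are hypothetical, and treating a column and its complement as the same pattern is an assumption based on the fact that both separate exactly the same pairs.

```python
from collections import Counter
from itertools import combinations

encoding = {
    "C": "0110101101010101010",
    "S": "1010110101011010101",
    "T": "0110101010011011010",
    "P": "1010101011010100110",
    "A": "1011010101011011010",
    "G": "1010101100111010101",
}

def cross_section_counts(encoding):
    """Count column patterns, folding each column onto the form starting with 0."""
    counts = Counter()
    for column in zip(*encoding.values()):
        bits = "".join(column)
        if bits[0] == "1":  # complement so that the first bit is always 0
            bits = "".join("1" if b == "0" else "0" for b in bits)
        counts[bits] += 1
    return counts

def hamming_from_counts(counts, i, j):
    """Hamming distance between rows i and j, recovered from the counts alone."""
    return sum(n for bits, n in counts.items() if bits[i] != bits[j])

y = cross_section_counts(encoding)

# Check that the counts reproduce every pairwise Hamming distance.
names = list(encoding)
consistent = all(
    hamming_from_counts(y, i, j)
    == sum(a != b for a, b in zip(encoding[names[i]], encoding[names[j]]))
    for i, j in combinations(range(len(names)), 2)
)
```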

[Figure labels: y2, y1]

7
Modified Inequality
  • $D(1-\varepsilon) \le Ax \le D(1+\varepsilon)$
  • s.t. $0 \le \varepsilon \le 1$ and $0 \le \lambda$
  • Let $x = \lambda y$
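
The substitution can be spelled out as follows. This is a sketch under two assumptions that the slides do not state explicitly: $y_c$ counts the columns of the encoding whose cross section is $c$, and $A$ has a 1 in row $(i,j)$ and column $c$ exactly when cross section $c$ assigns different bits to $i$ and $j$.

```latex
h_{f_i,f_j} = \sum_{c} A_{(i,j),c}\, y_c
\quad\Longrightarrow\quad
\lambda\, h_{f_i,f_j} = \sum_{c} A_{(i,j),c}\,(\lambda y_c) = (Ax)_{(i,j)},
\qquad x = \lambda y .
```

Since the scale factor and the counts enter only through their product, the pairwise inequality becomes linear in the single unknown vector $x$.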

8
Linear Programming Problem
  • Need All Linear Expressions
  • $D(1-\varepsilon) \le Ax$ and $Ax \le D(1+\varepsilon)$
  • $-Ax - D\varepsilon \le -D$ and $Ax - D\varepsilon \le D$
  • All $x_i, \varepsilon \ge 0$
  • Goal: Minimize $\varepsilon$
  • Solve with CPLEX
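
A minimal sketch of how the constraint rows above could be assembled, using only the Python standard library and a tiny instance (n = 4). The function name, the ordering of variables, and the target distances are hypothetical, and the solver call is omitted: the slides used CPLEX, and the arrays below are in the usual "minimize c·z subject to A_ub·z <= b_ub, z >= 0" form that generic LP solvers accept.

```python
from itertools import combinations, product

def build_lp(n, D):
    """Return (c, A_ub, b_ub) for: minimize e s.t. -Ax - De <= -D, Ax - De <= D.

    Variables are [x_0, ..., x_{m-1}, e] with m = 2**(n-1) cross sections.
    D lists one target distance per pair, in combinations(range(n), 2) order.
    """
    # Fix the first bit to 0 so each column/complement pair is listed once.
    sections = [(0,) + rest for rest in product((0, 1), repeat=n - 1)]
    pairs = list(combinations(range(n), 2))
    A_ub, b_ub = [], []
    for (i, j), d in zip(pairs, D):
        # 1 where the cross section separates i and j, else 0.
        row = [1 if s[i] != s[j] else 0 for s in sections]
        A_ub.append([-a for a in row] + [-d])   # -Ax - De <= -D
        b_ub.append(-d)
        A_ub.append(row + [-d])                 #  Ax - De <=  D
        b_ub.append(d)
    c = [0] * len(sections) + [1]               # objective: minimize e only
    return c, A_ub, b_ub

# Hypothetical target distances for the 6 pairs of a 4-symbol alphabet.
c, A_ub, b_ub = build_lp(4, [2, 3, 1, 1, 2, 3])
```

The point x = 0, e = 1 always satisfies every row, so the problem is feasible and the optimal distortion is at most 1.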

9
Problem Size
  • Number of Constraints (Rows)
  • $2\binom{n}{2} = 380$
  • Number of Variables (Columns)
  • $2^{n-1} = 524{,}288$
  • Total Size: Approx. $2 \times 10^{8}$
  • CPLEX: Approx. 1 Minute
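
The figures on this slide can be checked with a few lines of arithmetic. The value n = 20 is inferred from the numbers themselves (it is the alphabet size that yields 380 rows and 524,288 columns).

```python
from math import comb

n = 20                      # alphabet size consistent with the slide's figures
rows = 2 * comb(n, 2)       # two inequalities per unique pair
cols = 2 ** (n - 1)         # cross sections, counting a column and its complement once
entries = rows * cols       # size of the dense constraint matrix
print(rows, cols, entries)  # 380 524288 199229440
```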

10
Linear Programming Solution
  • Solution Contains
  • Min Value of $\varepsilon$
  • Scaled Weight Vector $x$
  • Non-Integral Values in $x$
  • Convert to $p$ Vector
  • $X = \sum_i x_i$
  • $p_i = x_i / X$
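
The conversion to the p vector is a plain normalization. A short sketch, where the input values stand in for a hypothetical LP solution:

```python
def to_percent_weights(x):
    """Normalize a non-negative weight vector so that its entries sum to 1."""
    total = sum(x)                       # X = sum of all x_i
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [xi / total for xi in x]      # p_i = x_i / X

p = to_percent_weights([0.5, 1.5, 0.0, 2.0])   # hypothetical LP output
```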

11
Random Encodings
  • Randomly Select Cross Sections
  • Based on Percent Weights
  • Can Scale For Any N-Length Encoding
  • Longer Encodings Should Approach Minimum
    Distortion
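
A minimal sketch of the sampling step, assuming each position of the encoding is drawn independently from the cross sections with probability given by the percent weights. The function name, the three-symbol alphabet, and the weights are hypothetical.

```python
import random
from itertools import product

def random_encoding(symbols, sections, p, length, seed=None):
    """Draw `length` cross sections with weights p and read off one string per symbol."""
    rng = random.Random(seed)
    chosen = rng.choices(sections, weights=p, k=length)
    return {s: "".join(str(col[i]) for col in chosen)
            for i, s in enumerate(symbols)}

symbols = ["A", "R", "N"]
# Cross sections with the first bit fixed to 0 (one per column/complement pair).
sections = [(0,) + rest for rest in product((0, 1), repeat=len(symbols) - 1)]
p = [0.0, 0.5, 0.25, 0.25]   # hypothetical percent weights
enc = random_encoding(symbols, sections, p, length=16, seed=0)
```

Because the cross sections are listed with their first bit fixed to 0, the first symbol is always encoded as all zeros. Complementing a column does not change any Hamming distance, so this does not affect the distortion.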

12
Results

13
Courtesy of DIMACS
  • Mentor: Endre Boros, RUTCOR
  • Logan Everett, DIMACS REU 2004
