Hierarchical Mixtures of Experts (HME)

Hierarchical Mixtures of Experts (HME) Chris Weed. Nov. 12, 2003. Two ... Expert Networks. Expert Network Models. Regression: Gaussian linear regression model ...

1
Hierarchical Mixtures of Experts (HME)
  • Chris Weed
  • Nov. 12, 2003

2
Two-level HME model
  • Top-level gating network with outputs g1, g2
  • Two second-level gating networks with outputs g1|1, g2|1 and
    g1|2, g2|2
  • Four expert networks at the leaves, with densities
    P(y|x,θ11), P(y|x,θ21), P(y|x,θ12), P(y|x,θ22)
3
General HME Model
  • Tree splits are probabilistic
  • Splits can be multi-way
  • Splits are probabilistic functions of linear combinations
    of the inputs

4
HME Definition
  • Gating Networks
  • Expert Networks
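
The formulas on this slide did not survive in the transcript. Assuming the slide follows the standard two-level HME formulation, the gating networks are softmax functions of the inputs and each expert supplies a conditional density for the response. The symbol γ for the gating parameters and the index range K are assumptions taken from that standard formulation, not from the transcript.

```latex
% Top-level gating network (softmax over K branches)
g_j(x, \gamma_j) = \frac{e^{\gamma_j^{T} x}}{\sum_{k=1}^{K} e^{\gamma_k^{T} x}},
\qquad j = 1, \dots, K

% Second-level gating network below branch j
g_{\ell \mid j}(x, \gamma_{j\ell}) =
\frac{e^{\gamma_{j\ell}^{T} x}}{\sum_{k=1}^{K} e^{\gamma_{jk}^{T} x}},
\qquad \ell = 1, \dots, K

% Expert network at leaf (j, \ell)
Y \sim P(y \mid x, \theta_{j\ell})
```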

5
Expert Network Models
  • Regression: Gaussian linear regression model
  • Classification: linear logistic regression model
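
The model equations for the two expert types are missing from the transcript. A plausible reconstruction, assuming the usual forms of these two models, is shown here; the coefficient vector β, the noise variance σ², and the binary response coding are assumptions rather than notation recovered from the slide.

```latex
% Regression expert: Gaussian linear regression at leaf (j, \ell)
Y = \beta_{j\ell}^{T} x + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma_{j\ell}^{2})

% Classification expert: linear logistic regression at leaf (j, \ell)
P(Y = 1 \mid x, \theta_{j\ell}) = \frac{1}{1 + e^{-\theta_{j\ell}^{T} x}}
```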

6
Total Probability
Model Parameters
  • Mixture Model
  • Maximize the log-likelihood to find the model parameters
  • Compute the maximum using the EM algorithm
  • Branching decisions are the latent variables
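
To make the mixture and the role of the latent branching decisions concrete, here is a minimal sketch of the total probability and the E-step for a two-level HME with Gaussian regression experts. It assumes the standard mixture form P(y|x) = Σ_j g_j Σ_ℓ g_ℓ|j P(y|x,θ_jℓ); the function names, array shapes, and random parameters are illustrative and do not come from the slides.

```python
import numpy as np

def softmax(a):
    """Softmax over the last axis, shifted for numerical stability."""
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def gaussian_pdf(y, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at y."""
    return np.exp(-0.5 * ((y - mean) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def hme_e_step(X, y, gamma_top, gamma_low, beta, sigma):
    """
    X: (N, p) inputs, y: (N,) responses.
    gamma_top: (J, p) top-level gating weights.
    gamma_low: (J, L, p) second-level gating weights.
    beta: (J, L, p) expert regression coefficients.
    sigma: (J, L) expert noise standard deviations.
    Returns the mixture density per observation, shape (N,), and the
    posterior probability of each branch path (j, l), shape (N, J, L).
    """
    g_top = softmax(X @ gamma_top.T)                          # (N, J)
    g_low = softmax(np.einsum("np,jlp->njl", X, gamma_low))   # (N, J, L)
    means = np.einsum("np,jlp->njl", X, beta)                 # (N, J, L)
    dens = gaussian_pdf(y[:, None, None], means, sigma[None])
    joint = g_top[:, :, None] * g_low * dens                  # path-wise terms
    total = joint.sum(axis=(1, 2))                            # total probability
    h = joint / total[:, None, None]                          # responsibilities
    return total, h

# Illustrative data and parameters (all hypothetical).
rng = np.random.default_rng(0)
N, p, J, L = 5, 3, 2, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, p - 1))])
y = rng.normal(size=N)
total, h = hme_e_step(
    X, y,
    rng.normal(size=(J, p)),
    rng.normal(size=(J, L, p)),
    rng.normal(size=(J, L, p)),
    np.ones((J, L)),
)
log_lik = np.log(total).sum()  # the quantity EM increases at each iteration
```

The responsibilities `h` are the expected values of the latent branching indicators; an M-step would use them as observation weights when refitting each gating and expert network.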

7
Computational Requirements
  • For N observations and p predictors, each component is
    inexpensive to fit at each M-step
  • Cost is N p^2 for the regressions
  • Cost is N p^2 K^2 for a K-class logistic regression
  • The EM algorithm may converge slowly
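
The N p^2 figure for a regression expert can be seen in a weighted least-squares fit, which is what the M-step reduces to for a Gaussian linear expert. The sketch below is an illustration under that assumption; the function name and the random weights standing in for E-step responsibilities are hypothetical.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimize sum_i w_i * (y_i - x_i^T beta)^2 via the normal equations."""
    Xw = X * w[:, None]            # O(N p)
    A = X.T @ Xw                   # O(N p^2): dominant cost when N >> p
    b = Xw.T @ y                   # O(N p)
    return np.linalg.solve(A, b)   # O(p^3)

rng = np.random.default_rng(1)
N, p = 200, 4
X = rng.normal(size=(N, p))
beta_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta_true                      # noise-free, so the fit is exact
w = rng.uniform(0.1, 1.0, size=N)      # stand-in for responsibilities
beta_hat = weighted_least_squares(X, y, w)
```

Forming the p x p matrix `A` touches every observation once per pair of predictors, which is where the N p^2 term comes from.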

9
Missing Data
  • Chris Weed
  • Nov. 12, 2003

10
Missing Data Examples
  • A patient's measurement was not taken because the
    doctor felt the patient was too sick
  • Missing information on a census questionnaire
  • Faulty sensor

11
Model Missing Data
  • y is the response vector
  • X is the (N x p) matrix of inputs
  • X_obs denotes the observed entries in X
  • Z = (y, X) and Z_obs = (y, X_obs)
  • R is an indicator matrix (0 = given, 1 = missing)
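
As a small illustration of this notation, the indicator matrix R can be built directly from an input matrix in which missing entries are stored as NaN. Storing missing values as NaN is an assumption of this sketch, and the toy numbers are made up.

```python
import numpy as np

# Toy (N x p) input matrix; NaN marks a missing entry (an assumed encoding).
X = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan],
              [7.0, 8.0, 9.0]])
y = np.array([0.5, 1.5, 2.5])

# R[i, j] = 1 if X[i, j] is missing, 0 if it is given.
R = np.isnan(X).astype(int)

# X_obs: the observed entries of X, flattened in row-major order.
X_obs = X[R == 0]
```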

12
MAR and MCAR
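  • Data Missing at Random (MAR):
    P(R | Z, θ) = P(R | Z_obs, θ)
  • Data Missing Completely at Random (MCAR):
    P(R | Z, θ) = P(R | θ)
  • θ denotes any parameters in the distribution of R
  • MCAR is a stronger assumption than MAR and the one most
    commonly made
  • The assumption is usually chosen based on the data collection
    process
  • For categorical data, "missing" can be coded as an additional
    category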
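
The difference between the two assumptions can be seen in a small simulation. In this sketch, which uses made-up data, a fully observed variable `x1` and a partially missing variable `x2` are generated; under MCAR the missingness ignores everything, while under MAR it depends only on the observed `x1`. The logistic form of the MAR mechanism and all variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20000
x1 = rng.normal(size=N)   # always observed
x2 = rng.normal(size=N)   # subject to missingness

# MCAR: every x2 entry is missing with the same probability,
# regardless of any data values.
r_mcar = rng.random(N) < 0.3

# MAR: the probability that x2 is missing depends only on the
# observed x1, never on the unobserved x2 itself.
p_miss = 1.0 / (1.0 + np.exp(-2.0 * x1))
r_mar = rng.random(N) < p_miss

def rate_gap(r, x_observed):
    """Missing rate where x_observed > 0 minus the rate where it is <= 0."""
    return r[x_observed > 0].mean() - r[x_observed <= 0].mean()

gap_mcar = rate_gap(r_mcar, x1)   # close to zero
gap_mar = rate_gap(r_mar, x1)     # clearly positive
```

A missing rate that shifts with observed variables, as `gap_mar` does, is compatible with MAR but rules out MCAR.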

13
How to proceed
  • Discard observations with missing data
  • Let the learning algorithm deal with missing data in its
    training phase
  • Impute all missing values
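
The first and third options are easy to sketch with NumPy. The example below uses made-up data with NaN marking missing entries, and column-mean imputation as one simple choice of imputation method; the slides do not say which imputation method is intended.

```python
import numpy as np

# Toy data; NaN marks a missing entry (an assumed encoding).
X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [3.0, np.nan],
              [5.0, 6.0]])
y = np.array([10.0, 20.0, 30.0, 40.0])

# Option 1: discard every observation that has any missing value.
complete = ~np.isnan(X).any(axis=1)
X_cc, y_cc = X[complete], y[complete]

# Option 3: impute each missing value with the mean of the observed
# values in the same column.
col_means = np.nanmean(X, axis=0)
X_imp = np.where(np.isnan(X), col_means, X)
```

Discarding rows keeps only two of the four observations here, which shows how quickly complete-case analysis loses data when missingness is spread across columns.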