CS 2750 Project Report

1
CS 2750 Project Report
  • Jason D. Bakos

2
Project Goals
  • Data
  • Sensor readings from 11 different people walking
    in a controlled environment
  • An accelerometer records floor vibration data
    from footfalls
  • A microphone records sounds from footfalls
  • Each of the 11 people is recorded 10 times
    (110 walks in total)

3
Project Goals
  • Use this data to perform multi-way classification
  • Human gait analysis
  • Eventually want to determine if a person is in
    duress
  • Most important aspect: learn the nature of the
    data to determine how best to classify it

4
Data Preprocessing
  • Data size
  • Data is collected at 15 kHz for approximately 10
    seconds
  • About 150,000 samples per recording
  • Must get data out of time domain
  • Must capture a walk as a single data point
  • Time series -> cross-sectional

5
Data Preprocessing
  • Extract the largest intensity step from the data
  • Closest to sensors
  • Transform data to frequency domain
  • Fourier transform
  • Used MATLAB FFT; output is a real array
  • Integrated over time
  • Group the resulting data into bins
  • These are now the features
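
The transform-and-bin steps above can be sketched as follows. This is a minimal illustration, not the project's MATLAB code: a naive DFT stands in for the FFT, taking magnitudes is assumed as the way to get a real array, the "integrated over time" step is omitted, and the function names and the bin count in the toy call are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(n^2) DFT; returns magnitudes for the non-negative frequencies."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags

def bin_features(mags, n_bins):
    """Sum spectrum magnitudes into n_bins roughly equal-width frequency bins."""
    features = [0.0] * n_bins
    for i, m in enumerate(mags):
        features[i * n_bins // len(mags)] += m
    return features

# Toy input (assumption): a pure tone with 20 cycles in a 64-sample window.
signal = [math.sin(2 * math.pi * 20 * t / 64) for t in range(64)]
features = bin_features(magnitude_spectrum(signal), n_bins=8)
```

Each extracted footstep becomes one fixed-length feature vector, which is what turns a time series into a single cross-sectional data point.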

6
Data Preprocessing
  • Extracting footstep
  • Method 1
  • Find max value in time-domain
  • Center fixed window around the max value
  • Window sizes: 2000, 4000, 6000
  • Method 2
  • Actively find footstep
  • Create new vector by sliding a mean-window over
    the absolute values of the data
  • Extract largest hill (using gradient descent and
    threshold)
  • Index from mean array into data array
  • Mean-window sizes: 1000, 2000, 3000
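
Both extraction methods can be sketched roughly as below. This is an assumption-laden illustration: the function names are hypothetical, and Method 2 is simplified to a plain threshold walk outward from the peak of the mean array rather than the gradient-descent search the slide mentions.

```python
def fixed_window(data, width):
    """Method 1: window of `width` samples centered on the largest |sample|."""
    peak = max(range(len(data)), key=lambda i: abs(data[i]))
    start = max(0, min(peak - width // 2, len(data) - width))
    return data[start:start + width]

def sliding_abs_mean(data, width):
    """Mean of |x| over every length-`width` window, using a running sum."""
    running = sum(abs(x) for x in data[:width])
    means = [running / width]
    for i in range(width, len(data)):
        running += abs(data[i]) - abs(data[i - width])
        means.append(running / width)
    return means

def largest_hill(data, width, threshold):
    """Method 2 (simplified): grow outward from the highest mean while the
    neighbouring mean stays above `threshold`, then index back into data."""
    means = sliding_abs_mean(data, width)
    top = max(range(len(means)), key=means.__getitem__)
    left = top
    while left > 0 and means[left - 1] > threshold:
        left -= 1
    right = top
    while right < len(means) - 1 and means[right + 1] > threshold:
        right += 1
    # means[i] summarizes data[i : i + width], so the hill covers this slice.
    return data[left:right + width]

# Toy recording (assumption): silence, one short burst, silence.
recording = [0.0] * 20 + [1.0, -3.0, 2.0] + [0.0] * 20
step_fixed = fixed_window(recording, 5)
step_hill = largest_hill(recording, 3, 0.5)
```

Method 1 always returns the same number of samples, while Method 2 returns a variable-length slice that depends on the threshold and the mean-window size.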

7
Data Preprocessing
Mean window of 1000

8
Data Preprocessing
Mean window of 2000

9
Data Preprocessing
Mean window of 3000

10
Analysis of Preprocessed Data
  • Cluster analysis
  • Unsupervised learning
  • 3 steps
  • Distance calculation
  • Linkage analysis
  • Clustering

11
Analysis of Preprocessed Data
  • Distance Calculation
  • 4 distance measures
  • Euclidean
  • Standard straight-line distance
  • Standardized Euclidean
  • Each coordinate difference is scaled by that
    coordinate's standard deviation, so high-variance
    features count for less
  • City block
  • Sum of absolute coordinate differences; used for
    comparison
  • Minkowski
  • Generalizes Euclidean and city block through an
    exponent p; used for comparison
  • Result is array, distance from each point to
    every other point
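
The four distance measures can be written out as below. This is a plain-Python sketch with hypothetical function names; it returns a full square matrix for readability, whereas a toolbox routine may return a more compact array.

```python
import math

def minkowski(a, b, p):
    """Minkowski distance with exponent p (p=1: city block, p=2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    return minkowski(a, b, 2)

def city_block(a, b):
    return minkowski(a, b, 1)

def standardized_euclidean(a, b, variances):
    """Euclidean distance with each squared difference divided by that
    feature's variance, so high-variance features contribute less."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))

def pairwise(points, dist):
    """Distance from each point to every other point, as a square matrix."""
    n = len(points)
    return [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

# Toy points (assumption), just to exercise the functions.
pts = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = pairwise(pts, euclidean)
```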

12
Analysis of Preprocessed Data
  • Linkage Analysis
  • Hierarchically link datapoints
  • Methods
  • Shortest distance
  • Smallest distance between any two points, one
    from each cluster
  • Average distance
  • Mean distance over all pairs of points, one from
    each cluster
  • Centroid distance
  • Distance between the center points of the two
    clusters
  • Incremental sum-of-squares
  • Merges the pair of clusters that gives the
    smallest increase in within-cluster sum of
    squares; used for comparison
  • Result is matrix

13
Analysis of Preprocessed Data
  • Clustering
  • Force datapoints into a fixed number of clusters
  • Result is cluster vector and dendrogram
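
Linking and cutting to a fixed number of clusters can be sketched as below. This is only an illustration under assumptions: it uses shortest-distance (single) linkage, merges until the requested number of clusters remains, and returns just the cluster vector (no dendrogram). The function names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage_clusters(points, n_clusters, dist=euclidean):
    """Agglomerative clustering: start with one cluster per point and keep
    merging the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Shortest-distance linkage: closest pair across the clusters.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
    # Cluster vector: one cluster label per data point.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Toy data (assumption): two tight pairs and one outlier.
toy = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (50.0, 50.0)]
labels = single_linkage_clusters(toy, 3)
```

Swapping the inner `min` for a mean over the same pairs would give average-distance linkage instead.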

14
Analysis of Preprocessed Data
  • How to judge how well the clustering worked?
  • My answer
  • Since there are exactly 10 samples from each of
    11 people, define uniformity as a metric
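
The slides do not give the formula for uniformity, so the sketch below is one possible reading and not the metric actually used: for each person, take the share of their walks that land in their most common cluster, then average over people. The function name and the toy labels are hypothetical.

```python
from collections import Counter

def uniformity(person_ids, cluster_labels):
    """Hypothetical metric: mean, over people, of the fraction of each
    person's samples that fall in that person's most common cluster."""
    by_person = {}
    for person, cluster in zip(person_ids, cluster_labels):
        by_person.setdefault(person, []).append(cluster)
    shares = [Counter(c).most_common(1)[0][1] / len(c)
              for c in by_person.values()]
    return sum(shares) / len(shares)

# Toy check (assumption): two people, two walks each.
perfect = uniformity([0, 0, 1, 1], [0, 0, 1, 1])
mixed = uniformity([0, 0, 1, 1], [0, 1, 0, 1])
```

A score like this is only meaningful when the number of clusters is fixed (for example, 11), since putting every sample in one cluster would also score 1.0.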

15
Analysis of Preprocessed Data

16
Analysis of Preprocessed Data
  • Checked all 12 charts
  • fix2000, fix4000, fix6000, win1000, win2000,
    win3000 for vibration and audio
  • Euclidean distance with sum-of-squares linkage is
    best for both vibration and audio
  • win3000 is best for vibration
  • fix2000 is best for audio

17
Analysis of Preprocessed Data

18
Indirect Learning
  • Used parametric Naïve Bayes model to do multi-way
    classification
  • 11 classes
  • Used 50-bin data
  • Assumed data was multivariate Gaussian
  • Chose class based on maximum posterior of C
  • Used multiple train/test splits to train 3 models
    with bagging (voting)
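
A Gaussian Naïve Bayes classifier with three-model voting might look like the sketch below. It rests on assumptions not stated in the slides: each feature is modeled as an independent Gaussian per class, a small variance floor of 1e-9 avoids division by zero, and the three training splits are simple leave-some-out subsets of a toy data set. All names are hypothetical.

```python
import math
from collections import Counter

def fit_gaussian_nb(X, y):
    """Per class: prior, per-feature means, per-feature variances."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # 1e-9 is an assumed variance floor, not taken from the slides.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[c] = (n / len(X), means, variances)
    return model

def predict(model, x):
    """Pick the class with the maximum (log) posterior."""
    def log_posterior(c):
        prior, means, variances = model[c]
        log_lik = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                      for xi, m, v in zip(x, means, variances))
        return math.log(prior) + log_lik
    return max(model, key=log_posterior)

def vote(models, x):
    """Majority vote over the predictions of several models."""
    return Counter(predict(m, x) for m in models).most_common(1)[0][0]

# Toy data (assumption): two well-separated classes, three samples each.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
y = ["a", "a", "a", "b", "b", "b"]
models = []
for held_out in range(3):  # three different train/test splits
    keep = [i for i in range(6) if i not in (held_out, held_out + 3)]
    models.append(fit_gaussian_nb([X[i] for i in keep], [y[i] for i in keep]))
```

Working in log space keeps the product of 50 per-bin likelihoods from underflowing.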

19
Indirect Learning

20
Indirect Learning
  • Bad results
  • Worse than random predictor
  • Conclusion
  • Data is not Gaussian

21
Direct Learning
  • Trained neural network with same data
  • Used softmax network to perform multi-way
    classification
  • 1000 epochs, log-sigmoid activation, gradient
    descent
  • Tried different parameters for neural network
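
The forward pass of such a network can be sketched as below. This assumes "log sigmoid" means the logistic function 1 / (1 + e^-z) applied in the hidden layers, with softmax on the output layer. The weights are made-up toy values and training by gradient descent is not shown.

```python
import math

def log_sigmoid(z):
    """Logistic activation, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softmax(scores):
    """Turn raw output scores into class probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: weights is a list of rows, one per neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """layers is a list of (weights, biases); the last layer uses softmax."""
    for weights, biases in layers[:-1]:
        x = [log_sigmoid(z) for z in dense(x, weights, biases)]
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))

# Toy network (assumption): 2 inputs, 2 hidden neurons, 3 output classes.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
]
probs = forward([1.0, 2.0], layers)
predicted_class = probs.index(max(probs))
```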

22
Direct Learning
Vibration
Audio

23
Direct Learning
  • No improvement beyond 50 neurons per level
    (vibration and audio)
  • 4 levels is best (including output level)
  • Results are terrible on the test sets

24
Conclusion
  • Need
  • Better feature extraction
  • Better classifiers
  • Or maybe different sensors are needed
  • Video