Title: Genreanalyysimenetelm
2 Introduction
- This slide set contains
  - Very easy, supervised introductory material for MATLAB
  - Homework (Task 1 and Task 2 in the last two slides)
- If you are already an expert in MATLAB, you can skip the introduction and start working on the homework
- By returning the homework (a short report about the accomplished tasks) you can earn credits for the final exam (total 3 x 2 points; the exam maximum is 4 x 6 = 24 points)
- The reports for the tasks of Demo 1 must be returned no later than 10.12.2007
- The reports can be returned by e-mail (sami.ayramo_at_jyu.fi), to the office (Ag C416.2), or to the mailbox (located two meters from the office door)
- Some additional code that may be useful for the homework can be found at http://users.jyu.fi/~samiayr/DM/demot
3 Requirements
- Basic computer skills (e.g., starting applications; opening, closing and saving files; cutting and pasting text; directory structures; etc.)
- Know how to use a text editor, such as Windows Notepad, for writing MATLAB programs (MATLAB also has its own built-in text editor which you can use)
- Basic algebra and trigonometry
- Knowledge of basic linear algebra (i.e., concepts such as matrix, vector, inverse, etc.) would also be very helpful
- While you are following this introduction, have MATLAB running in a separate window and experiment with the examples
- This introduction is extracted, modified and compressed from the one available at the MathWorks Student Center
  - http://www.mathworks.com/academia/student_center/tutorials/
4 Facts about MATLAB
- MATLAB is a computer program
  - for solving the sorts of mathematical problems frequently encountered in, for example, data mining, data analysis, statistics, simulation, engineering, and mathematical modelling
- Built-in features of MATLAB enable effortless solving of a wide variety of numerical problems
  - from the very basic, such as a system of 2 equations with 2 unknowns
    - X + 2Y = 24
    - 12X - 5Y = 10
  - to the more complex, such as factoring polynomials, fitting curves to data points, making calculations using matrices, performing signal processing operations such as Fourier transforms, and building and training neural networks
- MATLAB can be used to plot many different kinds of graphs, enabling the visualization of complex mathematical functions and laboratory data
- The three images below have been created using MATLAB plotting functions
Images are taken from www.mathworks.com
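The small system of equations mentioned on this slide can be solved in one line with MATLAB's backslash operator. The sketch below assumes the system reads X + 2Y = 24 and 12X - 5Y = 10; the variable names A, b, and solution are chosen here for illustration and do not come from the slides.

```matlab
% Coefficients of  X + 2Y = 24  and  12X - 5Y = 10, one row per equation
A = [1 2; 12 -5];
% Right-hand sides as a column vector
b = [24; 10];
% The backslash operator solves A * solution = b
solution = A \ b;
X = solution(1)
Y = solution(2)
```

Substituting X and Y back into both equations is a quick way to check the result by hand.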
5 Starting MATLAB
- You can start MATLAB by double-clicking on the MATLAB icon
- The MATLAB Desktop will then pop up
6 Entering commands in MATLAB
- >> is the command prompt
- Type a command at the command prompt; MATLAB executes the command and then prints out the result
- Ex1: Enter a simple MATLAB command, date, to see how it works
- Ex2: Try also the clc command (clear command window)
- Ex3: To exit MATLAB, just enter quit at the MATLAB command prompt
- To get a good feel for the kinds of things you can use MATLAB for, many different demos are provided, all accessible from a demo window that pops up when you type demo at the command prompt
7 Getting help
- MATLAB has an extensive built-in help system containing detailed documentation and help information on all of its commands and functions
- To obtain help on a given function there are three main functions: help, helpwin (short for help window) and doc (short for documentation)
- help and helpwin give you the same information, but in different windows; the doc command returns an HTML page with a lot more information
- Ex4: Find help on the date function using the different functions
- Another source of help is the MATLAB help browser
  - you can invoke the MATLAB help browser by
    - typing helpbrowser at the MATLAB command prompt
    - clicking on the help button (?)
    - selecting Start->MATLAB->Help from the MATLAB desktop
- Many tutorials and documents can also be found at www.mathworks.com
8 Working with variables
- Variables are a fundamental concept in MATLAB and are used all the time
- In its simplest mode of use, MATLAB can be used just like a pocket calculator
- MATLAB supports all the basic arithmetic operations (+, -, *, /, ^, etc.), and you can group and order operations by enclosing them in parentheses
- Ex5: Try the following calculator-like operations with MATLAB by typing
  - 4 + 10
  - 5 * 10 + 6
  - (6 + 6) / 3
  - 9^2
- What is ans? ans is short for "answer"; MATLAB uses it as the default variable name when none is specified
- Ex6: Check the value of ans by typing ans
- Ex7: Try to change the value of ans by typing ans + 6
- You can also define and use your own variables
- Ex8: Create three variables, check the value of the first one, and calculate the average. For instance, enter the following commands
  - a = 10
  - b = 20
  - c = 30
  - a
  - the_average = (a + b + c) / 3
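The difference between the default variable ans and a named variable can be seen in a short session such as the following. This is only a sketch; the name first_result is introduced here for illustration.

```matlab
% Without an assignment, the result is stored in the default variable ans
5 * 10 + 6
% Copy it before the next unassigned expression overwrites ans
first_result = ans;
% With an assignment, the result goes into the named variable instead
a = 10;
b = 20;
c = 30;
the_average = (a + b + c) / 3
```

Because ans is overwritten by every unassigned expression, results you want to keep are better stored in variables of your own.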
9 Working with variables
- If you have defined many variables, you probably cannot remember all of their names, so it is useful to get a list of all the variables currently defined. Typing whos at the command prompt lists all currently defined variables.
- Ex9: Try the following sequence of commands
  - clear
  - a = 5
  - b = 6
  - whos
- Typing clear at the command prompt removes all variables and values that were stored up to that point.
- Ex10: For example, continue from the above example
  - whos
  - clear
  - whos
10 Working with variables
- If a command is followed by a semicolon (;), MATLAB evaluates the expression and stores the result internally, but does not print out the result
- In a MATLAB session you are usually interested only in some final result, which is calculated by combining many temporary, intermediate variables. Appending a semicolon to the expressions that assign values to these intermediate variables keeps their results from being printed
- Ex11: Compare the following expressions
  - a = 4 + 5
  - b = 5 + 6;
11 Working with variables
- In MATLAB, there are some specific rules for what you can name your variables
  - Use only basic alphabetic characters (i.e., "A-Z" and "a-z"), numbers, and the underscore character (i.e., "_") in your variable names; a name must begin with a letter
  - You cannot have any spaces in your variable names
    - For example, using "this is a variable" as a variable name is not allowed, but "this_is_a_variable" is fine
  - MATLAB is case sensitive
    - For example, "A_VaRIAbLe", "a_variable", "A_VARIABLE", and "A_variablE" would all be considered distinct variables in MATLAB
- Using single quotes, you can also assign pieces of text to variables
- Ex12: For example, try
  - some_text = 'This is some text assigned to a variable!'
  - some_text
- Be careful not to mix up variables that have text values with variables that have numeric values in equations
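Why mixing text and numeric values is risky can be illustrated with a short sketch. MATLAB does not raise an error when text is used in arithmetic; it silently uses the character codes instead. The variable names below are chosen for illustration only.

```matlab
n = 5;       % a numeric value
t = '5';     % a piece of text that merely looks like a number
numeric_sum = n + 1
% Arithmetic on text uses the character code of '5', which is 53
text_sum = t + 1
% str2double converts the text to a number first
converted_sum = str2double(t) + 1
```

Here text_sum is 54 rather than 6, which is the kind of mistake the warning above is about.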
12 MATLAB: the matrix laboratory
- Three fundamental concepts in MATLAB, and in linear algebra, are
  - A scalar is just a fancy word for a number (a single value)
  - A vector is an ordered list of numbers (one-dimensional)
    - In MATLAB they can be represented as a row vector or a column vector
  - A matrix is a rectangular array of numbers (multi-dimensional)
    - In MATLAB, a two-dimensional matrix is defined by its number of rows and columns
- Both scalars and vectors can be considered special types of matrix
  - A scalar is a matrix with a row and column dimension of one (1-by-1 matrix)
  - A vector is a one-dimensional matrix: one row and n columns, or n rows and one column
- All calculations in MATLAB are done with "matrices". Hence the name MATrix LABoratory.
13 MATLAB: the matrix laboratory
- In MATLAB, matrices are defined inside a pair of square brackets ([])
  - A comma (,) separates the elements within a row (i.e., the columns), and a semicolon (;) separates the rows
  - Note: you can also use a space to separate the elements within a row, and a carriage return (the Enter key) to separate the rows
- Ex13: Try the examples to see how a scalar, and row and column vectors, can be created
  - my_scalar = 3.1415
  - my_vector1 = [1, 5, 7]
  - my_vector2 = [1; 5; 7]
14 MATLAB: the matrix laboratory
- What about a two-dimensional matrix?
- Ex14: Create a 4-by-3 matrix called my_matrix with the numbers 8, 12, and 19 in the first row, 7, 3, 2 in the second row, 12, 4, 23 in the third row, and 8, 1, 1 in the fourth row by typing the following command
  - my_matrix = [8, 12, 19; 7, 3, 2; 12, 4, 23; 8, 1, 1]
- You can also combine different vectors and matrices together to define a new matrix
  - Remember that the output needs to be a valid rectangular matrix
- Ex15: Construct a matrix from row vectors by typing the following lines
  - row_vector1 = [1 2 3]
  - row_vector2 = [3 2 1]
  - matrix_from_row_vec = [row_vector1; row_vector2]
- Ex16: Construct a matrix from column vectors by typing the following lines
  - column_vector1 = [1; 3]
  - column_vector2 = [2; 8]
  - matrix_from_col_vec = [column_vector1 column_vector2]
- Ex17: Construct a matrix from a 4x3 matrix by typing the following lines
  - my_matrix = [8, 12, 19; 7, 3, 2; 12, 4, 23; 8, 1, 1]
  - combined_matrix = [my_matrix, my_matrix]
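The requirement that the combined result be a valid rectangular matrix can be tried out directly. The following is a minimal sketch with made-up matrices A and B; the flag concatenation_failed is introduced here only to record what happened.

```matlab
A = [1 2 3; 4 5 6];     % 2-by-3
B = [7 8 9];            % 1-by-3
stacked = [A; B];       % same number of columns, so stacking vertically works
side_by_side = [A, A];  % same number of rows, so joining horizontally works
% Joining A and B horizontally fails because their row counts differ
concatenation_failed = false;
try
    not_rectangular = [A, B];
catch err
    concatenation_failed = true;
    disp(err.message)
end
```

The exact wording of the error message depends on the MATLAB version, so only the fact that an error occurs is relied on here.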
15 Indexing vectors and matrices
- Once a vector or a matrix is created, you might need to extract only a subset of the data; this is done through indexing
  - In a row vector the leftmost element has the index one
  - In a column vector the topmost element has the index one
- Ex17: Create vectors my_vector1 and my_vector2 and try to index into their values
  - my_vector1 = [1 5 7]
  - my_vector2 = [1; 5; 7]
  - my_vector1(1)
  - my_vector1(2)
  - my_vector1(3)
  - my_vector2(1)
  - my_vector2(2)
  - my_vector2(3)
- The process is much the same for a two-dimensional matrix. The only difference is that you have to specify both the row and column indices.
- Ex18: Access the value of 4 in my_matrix
  - my_matrix = [8, 12, 19; 7, 3, 2; 12, 4, 23; 8, 1, 1]
  - my_matrix(3,2)
- Note: The row number is first, followed by the column number.
16 Indexing vectors and matrices
- You can also extract any contiguous subset of a matrix by referring to the row range and column range you want
- Ex19: Try the following examples
  - mat = [1 3 2 3 5 6 5; 7 4 8 1 2 3 4; 3 2 8 4 7 3 2; 3 2 3 4 1 4 2]
  - mat(2:4,4:7)
  - mat(1:2,[1:3 5:6])
- You can change a number in a matrix by assigning to it
- Ex20: Try to change the value of an element with the following commands
  - mat = [1 3 2; 2 3 4; 7 3 2; 1 4 2]
  - mat(2,2) = 999
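A few further indexing forms are often useful alongside ranges: a lone colon selects a whole row or column, and the keyword end stands for the last index. The sketch below uses the 4-by-3 matrix from the earlier exercises; the result names are chosen for illustration.

```matlab
M = [8 12 19; 7 3 2; 12 4 23; 8 1 1];
second_row = M(2, :)        % all columns of row 2
last_column = M(:, end)     % all rows of the last column
middle_block = M(2:3, 1:2)  % rows 2 to 3, columns 1 to 2
% Assigning to an indexed position changes only that element
M(4, 1) = 0;
```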
17 Element-by-element operations
- Element-by-element operations are performed on two vectors or matrices of the same size and give a result of the same size
  - For example, "element-by-element multiplication" of two vectors [1 2 3] and [4 5 6] would give you [4 10 18]
- The element-by-element operators in MATLAB are as follows
  - element-by-element multiplication ".*"
  - element-by-element division "./"
  - element-by-element addition "+"
  - element-by-element subtraction "-"
  - element-by-element exponentiation ".^"
- Ex21: Try the following operations (which of these work?)
  - a = [1 2 3]
  - b = [4 5 6]
  - c = [6 7 8]
  - d = [6; 7; 8]
  - a .* b
  - a .^ c
  - c .* d
  - c .^ d
18 Element-by-element operations
- An additional note about element-by-element operators is that you can use them with scalars and vectors together
- Ex22: Try the following operation
  - a = [1 2 3 4 5 6]
  - b = a .* 2
- You can similarly use ".^", "+", and "-" with a vector and a scalar
- Ex23: Try some examples
  - c = a .^ 2
  - d = a + 2
  - e = a - 2
- The reason that the element-by-element multiplication, division, and exponentiation operators have "." prepended to them, while the element-by-element addition and subtraction operators do not, is that there are other kinds of multiplication, division, and exponentiation operators (denoted by "*", "/" and "^") for matrices, which are not element-by-element
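The difference between the dotted and undotted operators is easiest to see on small square matrices, where both forms are valid but give different results. The matrices A and B below are made up for this sketch.

```matlab
A = [1 2; 3 4];
B = [5 6; 7 8];
elementwise_product = A .* B   % multiplies matching positions
matrix_product = A * B         % rows of A times columns of B
elementwise_square = A .^ 2    % squares each element
matrix_square = A ^ 2          % same as A * A
```

For addition and subtraction there is only one meaning, which is why "+" and "-" need no dotted form.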
19 Matrix operations
- Element-by-element operations allow us to compute things on an element-by-element basis, whereas matrix operations allow us to perform matrix-based computation.
- For example, matrix multiplication is represented by "*". When a row vector is multiplied by a column vector of the same length, the result is their dot product: MATLAB first multiplies the corresponding elements (i.e., same-position elements) of the two vectors, as element-by-element multiplication does, and then adds up the results of these multiplications to get a single, final number as the answer.
- Ex24: Try the following matrix multiplication
  - a = [1 2 3]
  - b = [4; 5; 6]
  - a * b
- To get the answer "32", MATLAB first performs the multiplications of the corresponding elements of the two vectors: "1*4 = 4", "2*5 = 10", and "3*6 = 18". Then, to get the final answer of "32", MATLAB adds these products together: "4 + 10 + 18 = 32".
- The length of vectors and the size of matrices can be found with the length and size functions
- Ex25: Try the following examples
  - a = [1 2 3]
  - length(a)
  - mat = [1 3 2; 2 3 4; 7 3 2; 1 4 2]
  - size(mat)
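The dot product of two vectors can be written in several equivalent ways, and the orientation of the vectors matters for the "*" operator. The following sketch starts from two row vectors and uses the transpose operator ('); the result names are chosen for illustration.

```matlab
a = [1 2 3];
b = [4 5 6];
% Row vector times column vector: a single number
by_matrix_product = a * b'
% Element-by-element multiplication followed by a sum
by_elementwise_sum = sum(a .* b)
% The built-in dot function
by_dot_function = dot(a, b)
% Column vector times row vector gives a 3-by-3 matrix instead
outer_product = a' * b;
size(outer_product)
```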
20 Plotting
- The most basic plotting command in MATLAB is the plot command. When called with two same-sized vectors X and Y, plot makes a two-dimensional line plot from each point in X and its corresponding point in Y. In other words, it draws points at (X(1),Y(1)), (X(2),Y(2)), (X(3),Y(3)), etc., and then connects these points with lines.
- Ex26: Try a very simple example to illustrate what the plot command does
  - simple_x_points = [1 2 3 4 5]
  - simple_y_points = [25 0 20 5 15]
  - plot(simple_x_points, simple_y_points)
- The ordering of the vectors in the plot command is important
- Ex27: Try the reversed order for the previous simple example
  - plot(simple_y_points, simple_x_points)
21 Plotting
- To add text to a plot, you need to keep the figure window open (i.e., type the commands in the MATLAB command window while the figure window is still open).
- The xlabel and ylabel commands add a text string describing the x-axis and the y-axis, respectively. The title command adds a title to your plot. Typing "grid on" at the command prompt adds grid lines to the open figure window (typing "grid off" removes them).
- Ex28: Try to use these commands on the previous plot
  - simple_x_points = [1 2 3 4 5]
  - simple_y_points = [25 0 20 5 15]
  - plot(simple_x_points, simple_y_points)
  - xlabel('this is text describing the x-axis')
  - ylabel('this is text describing the y-axis')
  - title('this is text giving a title for the graph')
  - grid on
22 Plotting a parabola
- Ex29: Let's look at a more practical example of plotting. First you need to create a vector of regularly spaced points and a vector of function values at those points for some function. Do this for the function "y = x^2" (i.e., a parabola) for x values between -5 and 5 with a regular spacing of .1
  - x_points = [-5 : .1 : 5]
  - y_points = x_points .^ 2
- Then plot the x_points against the y_points to get the familiar plot of a parabola
  - plot(x_points,y_points)
  - xlabel('x-axis'); ylabel('y-axis'); title('A Parabola')
  - grid on
- Note: The result is very smooth; you can't really see any of the individual line segments like you could in the previous simple example. That is because the points are so close together (at regular spacings of .1). MATLAB is still drawing line segments between the points, but they are too small for your eye to see, so the result seems to be a smooth curve.
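How much the spacing matters can be checked by counting the points that the colon operator generates. The sketch below compares a spacing of 1 with a spacing of 0.1 over the same interval; the coarse grid and all variable names are assumptions made for this illustration.

```matlab
x_coarse = -5 : 1 : 5;      % spacing 1
x_fine = -5 : 0.1 : 5;      % spacing 0.1
y_coarse = x_coarse .^ 2;
y_fine = x_fine .^ 2;
number_of_coarse_points = numel(x_coarse)
number_of_fine_points = numel(x_fine)
% plot(x_coarse, y_coarse) shows visible corners at each point,
% while plot(x_fine, y_fine) looks like a smooth curve
```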
23 Multiple plots
- Using the hold command, you can add multiple plots to the same figure window, for example to compare them. (Normally, when you type a plot command, the previous plot in the figure window is simply erased and replaced by the results of the new plot.)
- If you type "hold on" at the command prompt, all line plots created after that will be superimposed in the same figure window and axes. Likewise, the command "hold off" stops this behavior and reverts to the default (i.e., a new plot replaces the previous plot).
- Ex30: Try the following example of how to plot several different exponential functions in the same axes (you need to define the points on the x-axis only once)
  - x_points = [-10 : .05 : 10]
  - plot(x_points, exp(x_points))
  - grid on
  - hold on
  - plot(x_points, exp(.95 .* x_points))
  - plot(x_points, exp(.85 .* x_points))
  - plot(x_points, exp(.75 .* x_points))
  - xlabel('x-axis'); ylabel('y-axis')
  - title('Comparing Exponential Functions')
24 Subplots
- To have multiple plots in the same window, but each in a separate part of the window (i.e., each with its own axes), use the subplot command. If you type subplot(M,N,P) at the command prompt, MATLAB divides the plot window into a grid of rectangles (M rows and N columns) and places the result of the next "plot" command in the Pth rectangle (where the first rectangle is in the upper left).
- Ex31: Try this example, which plots a line, a parabola, an exponential, and the absolute value function in four rectangles in the same figure window
  - x_points = [-10 : .05 : 10]
  - line = 5 .* x_points
  - parabola = x_points .^ 2
  - exponential = exp(x_points)
  - absolute_value = abs(x_points)
  - subplot(2,2,1); plot(x_points,line)
  - title('Here is the line')
  - subplot(2,2,2); plot(x_points,parabola)
  - title('Here is the parabola')
  - subplot(2,2,3); plot(x_points,exponential)
  - title('Here is the exponential')
  - subplot(2,2,4); plot(x_points,absolute_value)
  - title('Here is the absolute value')
25 Line Plots in Three Dimensions
- This introduction covers two different kinds of three-dimensional plots you can make in MATLAB: 1) three-dimensional line plots and 2) surface mesh plots.
- Three-dimensional line plots are analogous to the two-dimensional line plots created with the plot command. The only difference is that the command has a "3" added to it, plot3, and that it requires an extra input, Z, for the third dimension.
- Ex32: A simple example of using the plot3 command (notice that you can also use hold and subplot here in the same way)
  - X = [10 20 30 40]
  - Y = [10 20 30 40]
  - Z = [0 210 70 500]
  - plot3(X,Y,Z); grid on
  - xlabel('x-axis'); ylabel('y-axis'); zlabel('z-axis')
  - title('Pretty simple')
26 Three-Dimensional Surface Mesh Plots
- The mesh and meshgrid commands can be used to create surface mesh plots, which show the surface of three-dimensional functions, such as "z = x^2 + y^2"
- It works as follows
  - Generate a grid of points in the xy-plane using the meshgrid command
  - Evaluate the three-dimensional function at these points
  - Create the surface plot with the mesh command
- Ex33: Try to generate the grid with meshgrid and then the surface mesh plot
  - x_points = [-10 : 1 : 10]
  - y_points = [-10 : 4 : 10]
  - [X, Y] = meshgrid(x_points,y_points)
  - Z = X.^2 + Y.^2
  - mesh(X,Y,Z)
  - xlabel('x-axis')
  - ylabel('y-axis')
  - zlabel('z-axis')
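What meshgrid actually returns is easier to see on a grid small enough to print. The sketch below uses a made-up 3-by-2 grid (the names x_small and y_small are assumptions) and evaluates z = x^2 + y^2 on it.

```matlab
x_small = [1 2 3];
y_small = [10 20];
[X, Y] = meshgrid(x_small, y_small)
% X repeats x_small in every row, Y repeats y_small in every column,
% so (X(i,j), Y(i,j)) runs over every grid point exactly once
Z = X.^2 + Y.^2
% mesh(X, Y, Z) would draw the surface through these points
```

Note that the matrices have one row per y value and one column per x value.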
27 MATLAB scripts
- A MATLAB script is an ASCII text file that contains a sequence of MATLAB commands
  - the commands contained in a script file can be run, in order, in the MATLAB command window simply by typing the name of the file (without the ".m" suffix) at the command prompt
- Any text editor, such as Microsoft Windows Notepad, or word processor, such as Microsoft Word, can be used to create scripts, but the scripts must always be saved as plain text documents (i.e., in the "Save As" dialogue box, choose "Text Document" or its equivalent for "Save as type").
- It is easiest to create scripts using MATLAB's built-in text editor, which automatically saves files as ASCII text files for you.
- When naming script files, you need to append the suffix ".m" to the filename, for example "my_script.m".
- Scripts in MATLAB are also called "M-files" because of this; the ".m" suffix tells MATLAB that the file is associated with MATLAB.
28 Creating a MATLAB script
- Ex34: Create a simple script that calculates the average of five numbers that are stored in variables. Start by typing edit average_script.m at the command prompt. Then add the following contents to the script file "average_script.m" in MATLAB's built-in text editor
  - % a simple MATLAB m-file to calculate the average of 5 numbers.
  - % first define variables for the 5 numbers
  - a = 5;
  - b = 10;
  - c = 15;
  - d = 20;
  - e = 25;
  - % now calculate the average of these and print it out
  - five_number_average = (a + b + c + d + e) / 5;
  - five_number_average
- NOTE! Save the above script for later use!
- The lines starting with % (shown in green in the MATLAB editor) are comments; all comment lines must start with %.
29 Running a MATLAB script
- If you saved the above script "average_script.m" into the present working directory, it can be run simply by typing average_script at the MATLAB command prompt.
- Ex35: Try to run it using the following sequence of commands at the command prompt
  - clear
  - whos
  - pwd
  - dir
  - average_script
  - whos
30 Saving variables 1
- The save command can be used to save all or only some of your variables into a MATLAB data file type called a MAT-file
- If you want to choose the name of the file yourself, type save followed by the filename you want to use.
- MATLAB will then save all currently defined variables in a file with the name you chose followed by the suffix ".mat" (for example, if you chose the name my_variables, MATLAB would save them as "my_variables.mat" in your present working directory).
- Before saving you should change your present working directory to one of your own directories (such as a directory on your floppy diskette), or specify the complete path to where you want MATLAB to save your variables (for example "a:\my_variables\my_vars").
- Ex36: Try this example of using save
  - clear
  - who
  - cd c:\my_variables (replace this with your own folder)
  - pwd (present working directory)
  - a = 10
  - b = 20
  - c = 30
  - d = sqrt((a + b + c)/pi)
  - who
  - save my_chosen_filename (replace this with your own filename)
  - dir
  - clear
  - who
31 Saving variables 2
- The above use of the save command saved all the variables defined in the MATLAB workspace. If you want to save only some of your variables, list the variables you want to save after save and the filename.
- Ex37: Try to save only the variables a and c
  - clear
  - who
  - a = 10
  - b = 20
  - c = 30
  - who
  - pwd
  - save some_of_my_variables a c (replace this with your own filename)
  - dir
  - clear
  - who
32 Loading variables 1
- The load command is used for loading saved variables back in to use them again. Typing load followed by a filename (without the ".mat" suffix) searches the MATLAB path (described on a later slide) for the file "filename.mat" and loads all the variables saved in that file (for example, typing load my_vars would cause MATLAB to search for "my_vars.mat" and load the variables saved in it).
- Ex38: Try this example of loading variables back into MATLAB
  - clear
  - who
  - cd c:\my_variables (replace this with your own folder)
  - dir
  - load my_chosen_filename (replace this with your own filename)
  - who
  - a
  - clear
  - who
  - load some_of_my_variables (replace this with your own filename)
  - who
  - c
33 Loading variables 2
- You can also choose to load only some of the variables that are saved in a MATLAB data file (MAT-file). To do so, type the names of the variables you want loaded after load and the filename (without ".mat") at the command prompt.
- Ex39: Assuming that variables "a", "ans", "b", "c", and "d" are all saved in a file, you can use the load command to load only "a" and "c" back in
  - who
  - dir
  - whos -file my_chosen_filename (replace this with your own filename)
  - load my_chosen_filename a c (replace this with your own filename)
  - who
  - a
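Saving and loading a subset of variables can also be done with the function forms of save and load, which is convenient inside scripts. The following is a minimal sketch; the file name demo_some_vars.mat is hypothetical, and the file is written to the system's temporary directory so that nothing in your own folders is touched.

```matlab
a = 10;
b = 20;
c = 30;
% Hypothetical file name inside the temporary directory
mat_file = fullfile(tempdir, 'demo_some_vars.mat');
% Save only a and c
save(mat_file, 'a', 'c');
clear a b c
% Load only a back in
load(mat_file, 'a');
a_is_defined = exist('a', 'var') == 1
c_is_defined = exist('c', 'var') == 1
% Remove the demonstration file again
delete(mat_file);
```

After this runs, a is defined again while c is not, even though c is stored in the file.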
34 Working with Files, Directories and Paths
- In general, files are managed, organized, and accessed in MATLAB in the same way as in Microsoft Windows, that is, in a hierarchical file system.
- How does MATLAB find files?
  - MATLAB always looks first inside your present working directory (type pwd at the MATLAB command prompt to see your present working directory)
  - If the file is not located in the present working directory, MATLAB also searches the other directories that are stored in the MATLAB path (the present working directory can also be thought of as part of the MATLAB path)
- Ex40: To print out the current MATLAB path, type matlabpath or path at the command prompt
- If you want to store your MATLAB files in a directory that is not on the MATLAB path, add the complete path of your directory to the MATLAB path.
- Ex41: There are two ways you can append your own paths to the MATLAB path
  - use the addpath command: type addpath followed by the complete path to your directory
  - use the path tool of MATLAB: type pathtool at the command prompt, or select File->Set Path
  - addpath a:\my_stuff\letters
  - matlabpath
35 Useful functions
- pwd - present working directory
- dir, or ls - List directory
- what - List MATLAB-specific files in directory
- cd - Change current working directory
- path, or matlabpath - List the MATLAB search path
- addpath - Add directory to search path
- pathtool - Invoke the path tool interface
- help general - List of general MATLAB commands
36 Exercises
- Download the well-known Iris data to your working directory from http://www.ics.uci.edu/~mlearn/MLSummary.html
- Import the data into MATLAB by choosing from the menu File->Import data->...
- Perform some explorative data mining (DM) on the Iris data set.
  - Make a global summarization of the Iris data (for example, compute the mean, median, variance and range of the variables)
  - Explore the data by plotting 2-dimensional scatter plots for each pair of variables (e.g., plotmatrix)
  - Find the two most strongly correlated variables in the data (corrcoef)
  - Plot a histogram (using, for example, 10 bins) for each variable (hist)
  - Compute attribute means and medians for each class
  - Compute the variance of all the variables (var)
  - Compute the covariance matrix of the whole data (cov)
  - Construct histograms of 10 bins for each Iris variable
  - Make 2-dimensional scatter plots for each pair of variables in the Iris data. Use different markers with different colors for different classes
Use the help commands and the documentation at http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.html!
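The summary functions named in the exercises all work column by column on a data matrix with one observation per row. The sketch below shows the calling pattern on a small made-up matrix, which is placeholder data and not the Iris measurements; all variable names are chosen for illustration.

```matlab
% Made-up placeholder data, NOT the Iris data:
% 5 observations (rows) of 3 variables (columns)
data = [1  2 5;
        2  4 3;
        3  6 4;
        4  8 1;
        5 10 2];
column_means = mean(data)
column_medians = median(data)
column_variances = var(data)
column_ranges = max(data) - min(data)
covariance_matrix = cov(data);
% Correlation matrix; entry (i,j) relates variable i to variable j
R = corrcoef(data);
% Keep only the part above the diagonal so each pair is considered once
off_diagonal = triu(abs(R), 1);
[strongest, linear_index] = max(off_diagonal(:));
[first_var, second_var] = ind2sub(size(R), linear_index)
```

The range is computed here as max minus min so that the sketch relies only on basic functions. With real data, the same calls apply once the numeric columns have been imported into a matrix.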
37 Task 1
- Load the Iris data set from the UCI repository http://www.ics.uci.edu/~mlearn/MLSummary.html
- A short description of the data can be found in the lecture slides (Tan et al., Chapter 3: Exploring Data, slide number 4).
- Assume that you do not know the class names, the number of classes, etc. of the Iris data set. What you know is that you have some data about flowers and the attribute names. Then, without prior assumptions and knowledge, analyse the data set using the available (or self-implemented) explorative and summarizing MATLAB tools. Document and explain all you can learn from the data by exploring. The documentation should contain figures and interpretations of useful and interesting visual views (different plots, colors, histograms, ...; see the techniques in the lecture slides, Tan et al., Chapter 3: Exploring Data). For example
  - Can you determine the number of classes (assumed to be unknown) by exploring? How?
  - Behavior of attributes (correlations, scatter (variance/MAD/covariance), ranges, ...). What kind of preprocessing might be needed? Are there redundant attributes? Outliers? And so on.
  - Other findings?
- Explain what you can learn about the data (which represents the three flower types). Describe your findings carefully and compare your results with the known class labels of the flowers. Did you find the classes from the data?
38 Task 2
- Load the synthetic cluster data set from the file clusterdata1.data. The data set contains a set of generated 7-dimensional clusters. Try to find the best possible prototypes for the clusters using the MATLAB implementation dckmeans.m. Before this, by exploring and preprocessing the data set (see Task 1), try to find as much information as possible for the clustering step (for example, the data may contain errors, noise, and redundancies; moreover, you must determine the number of clusters, and so on). You may also modify the dckmeans code (e.g., replace the sample mean estimate with a more robust one such as the median). Remember that K-means is a local search method (the results depend on the initial prototypes, so try to find good ones). You can also utilize the PCA code in exploration and/or clustering.
- Document and explain all the steps and all the significant facts you can learn from the data by exploring, summarization, visualization, clustering, etc. The documentation should contain plots, histograms, etc. with interpretations. If you do some preprocessing, transformations, or scaling of the data, report and explain them carefully. The most important thing is to document the final clustering results (prototypes and cluster labels), that is, your refinement of the data set.
- Remember to report not only the findings, but also how you proceeded (your mining process)!
Make frequent use of the help commands and the documentation at http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.html!