Title: SPSS Tutorial 1 What is SPSS
1 SPSS Tutorial 1: What is SPSS?
- SPSS is a computer program for organizing, analyzing and graphing statistical data.
- SPSS stands for Statistical Package for the Social Sciences.
- However, it can be used in any area of application.
2 SPSS Tutorial 1: Getting started
- Start the program.
- The first window is the Data Editor window.
- As the name suggests, this is where you enter your data.
- You can type data in directly.
- Try this.
- For the purposes of this tutorial, we will load an existing data file.
- File->Open->Data
- Open folders Tutorial\sample_files
- Open file demo.sav or demo
- The data from demo will appear in the data editor.
3 SPSS Tutorial 1: Viewing data with labels
- Ultimately, all SPSS numeric variables are encoded as numbers.
- However, it is sometimes useful to associate meaningful labels with these numbers.
- Try toggling between the two modes.
- View->Value Labels
- Labels are easier to read, but number codes may be easier during data entry.
- In the Labels mode, try clicking on a label.
- A menu arrow appears.
- See what other values the variable can have.
- Change it, and then change it back.
4 SPSS Tutorial 1: Analyzing data
- Data means nothing without some analysis.
- Analysis functions can be found under the Analyze menu!
- Frequency table example
- Analyze->Descriptive Statistics->Frequencies
- In the Frequencies dialog box, notice the list of variables on the left.
- Try running your cursor over the variable names to get a more complete name to pop up.
- The variable name is in square brackets.
- The long version is called the label.
5 SPSS Tutorial 1: Analyzing data
- A ruler indicates a numeric variable.
- A graph indicates an ordinal scale.
- Two circles and an A indicate a nominal string.
- Some types of analyses are only appropriate for numeric variables.
- Select some variables to analyze.
- Click each, then move them to the box on the right using the arrow in the middle.
- Select Gender and Income category for analysis.
6 SPSS Tutorial 1: Analyzing data, continued
- Click OK.
- A Viewer window will appear.
- On the left is an outline pane.
- The outline pane allows you to jump to any part of the output.
- Try clicking on Income category under Frequency Table.
- This will take you to the Income category frequency table.
- Shown is the number of people in each category, by absolute numbers and in percent.
- Valid percent handles the case where some data is missing.
- We will discuss Cumulative Percent in chapter 2.
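The difference between Percent and Valid Percent can be sketched outside SPSS. The Python below is only an illustration, assuming the usual definitions: percent is relative to all cases, valid percent is relative to the non-missing cases. The category names and responses are hypothetical, not taken from demo.sav.

```python
from collections import Counter

# Hypothetical survey responses; None marks a missing answer.
responses = ["low", "mid", None, "mid", "high", None, "low", "mid"]


def frequency_table(values):
    """Return {category: (count, percent, valid_percent)}; None means missing."""
    total = len(values)
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    table = {}
    for category, count in counts.items():
        percent = 100 * count / total              # relative to all cases
        valid_percent = 100 * count / len(valid)   # relative to non-missing cases
        table[category] = (count, percent, valid_percent)
    return table


table = frequency_table(responses)
for category, (count, pct, valid_pct) in table.items():
    print(f"{category:10s} {count:3d} {pct:6.1f} {valid_pct:6.1f}")
```

With two of the eight answers missing, the valid percent of each category is larger than its plain percent, because the denominator shrinks from 8 to 6.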
7 SPSS Tutorial 1: Creating charts
- Charts are sometimes created by an analysis procedure.
- You can also create them directly, using the Graphs menu.
- Let's create a chart to show the relationship between cell phone and PDA ownership.
- Graphs->Bar
- Choose a chart type, say Clustered.
- Click Define.
- Transfer Wireless service into the Category Axis box.
- Transfer Owns PDA into the Define Clusters box.
- Click OK to produce the chart.
- Scroll down to see the chart, or use the outline pane by clicking Graph.
- Notice that people with cell phones are more likely to own PDAs.
8 SPSS Tutorial 1: Help
- Do not hesitate to use the Help menu as needed.
- Check it out.
- Topics
- Tutorial
- Case Studies
- Statistics Coach
- Etc.
- Also, most dialog boxes have their own Help button.
- Pull down Graphs->Bar
- Notice the Help button.
- Check it out.
9 SPSS Tutorial 1: Data entry
- Pull down Window->demo.sav
- You can type data directly into the data window or import it.
- Import sources
- SPSS data files (like demo.sav)
- Excel spreadsheets
- Microsoft Access database files
- Text files
- Data in an SPSS data file is organized into cases (the rows) and variables (the columns).
- A case refers to a single measurement or instance of a random variable.
- See data view.
- In the demo example, a case is a single person responding to a survey.
- Variables represent each question asked in the survey.
10 SPSS Tutorial 1: Data entry, continued
- SPSS data files have a .sav suffix.
- How to control the display of file suffixes (in Windows, not SPSS)
- With any desktop window open
- Tools->Folder Options
- Click View tab.
- Select or deselect Hide Extensions
- We have already opened an SPSS data file, so we know how to do it.
- Excel example
- File->Open->Data
- At the bottom of the dialog box, select Excel (.xls) file type.
- Open demo.xls
- In an Excel file, where is SPSS going to find the variable names?
- This is not built into Excel.
- Select the check box for Read variable names from the first row of data if desired.
- The entire worksheet is imported unless a range of cells is specified in Range.
11 SPSS Tutorial 1: Data entry, continued
- In our case, we will read the entire spreadsheet.
- Click OK.
- Click No when asked whether to save the contents.
- Excel data appears in data window.
- Notice that the structure is not as detailed as with the SPSS file.
- Which databases can exchange data with SPSS?
- Answer: Databases that use Open Database Connectivity (ODBC).
- See help tutorial and other help facilities for more information on databases.
12 SPSS Tutorial 1: Data entry, continued
- Text files, delimited by commas or tabs, can be imported.
- Use a tab or comma between each variable (column), and
- Use a carriage return to separate cases (rows).
- Go to Windows Start->My Computer
- Open folder containing the demo file.
- It should be in the programs folder.
- Open demo.txt. It should open in Notepad or some such program.
- Notice the delimited format.
- Return to SPSS.
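To make the delimited format concrete, here is a small Python sketch of how such a file is interpreted: tabs separate variables, line breaks separate cases, and the first row holds the variable names. The text below is a hypothetical stand-in, not the actual contents of demo.txt.

```python
import csv
import io

# Hypothetical tab-delimited text; the first row holds the variable names.
text = "age\tmarital\tincome\n55\t1\t72000\n53\t0\t153000\n"

# Each line is one case (row); each tab-separated field is one variable (column).
reader = csv.DictReader(io.StringIO(text), delimiter="\t")
cases = list(reader)

print(reader.fieldnames)
for case in cases:
    print(case)
```

Note that every value arrives as text. A plain text file carries no data type information, so the reader (or SPSS) has to be told how to interpret each column.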
13 SPSS Tutorial 1: Data entry, continued
- SPSS text example
- File->Read Text Data
- Select file type Text (.txt)
- Open demo.txt
- You are now in the Text Import Wizard.
- The Text Import Wizard allows you to specify how your text data is to be interpreted.
- Select No for predefined format.
- Click Next.
- We are tab delimited, so select Delimited.
- The top row in our file has variable names, so select Yes for variable names.
- Click Next.
14 SPSS Tutorial 1: Data entry, continued
- Data begins on line 2.
- Click Each line represents a case.
- Click All the cases.
- Check the data preview to see if things look right.
- Select Tab as the only delimiter.
- None for text qualifier.
- The next dialog box permits you to edit any names that have been truncated by SPSS.
- You can also define data types.
- Text files do not have data type information.
- Select Income in the Data Preview window.
- Select Dollar from the Data Format list.
- Now Income is in dollars.
- Select No, No in the next dialog box and Finish.
15 SPSS Tutorial 1: Introduction to the Data Editor
- The Data Editor has two views
- Data view: columns represent variables, rows represent cases
- Variable view: each row is a variable, each column is a variable attribute
- The tabs at the bottom allow you to switch back and forth.
- Try it.
- Thinking in terms of a survey, each case is a respondent and each variable is a question.
- Data view is for entering and editing data.
16 SPSS Tutorial 1: Defining variables in Variable View
- Click Variable View tab.
- Let's define three variables
- Age
- Marital status
- Income
- Type these names in the name column.
- However, note that length is limited and spaces are not allowed.
- Notice that defaults are given for the variable attributes.
- Type is numeric.
17 SPSS Tutorial 1: Entering data in the Data View
- Click Data View tab.
- Notice that our variable names now appear at the column heads
- Age
- Marital status
- Income
- Enter 2 cases
- 55 1 72000
- 53 0 153000
18 SPSS Tutorial 1: Changing variable attributes
- We prefer integer formats for age and marital.
- Return to Variable view.
- Set the number of decimals to 0 for age and marital in the Decimals column.
- See the change in data view.
19 SPSS Tutorial 1: Changing variable attributes
- Add a new variable.
- Go to variable view.
- Type sex in the name column.
- Notice that the default type is Numeric.
- We want to designate sex as m or f (string type).
- Select the cell in the Type column, sex row.
- Click the button in the right half of the cell.
- Select String as the Variable Type.
- Click OK.
- In the same way, change type of income to dollar.
- Dollar types permit a choice of format.
- Choose ,,
- Click OK.
20 SPSS Tutorial 1: Variable descriptive labels
- Each name is short.
- However, for use in statistical reports and charts, we want a more descriptive name or label.
- Go to variable view.
- In the Label column for the age row, type Respondent's Age.
- In the Label column for the marital row, type Marital Status.
- In the Label column for the income row, type Household.
- In the Label column for the sex row, type Gender.
21 SPSS Tutorial 1: Value labels
- By default, SPSS stores most values as numbers or numeric codes.
- It is nice to give these numeric codes meaningful labels.
- For example, 0 and 1 are not very meaningful values for marital status.
- Adding value labels can fix this problem.
- Go to Variable View.
- Select the Values column, marital row cell.
- Click the button in the right side of the cell.
- The Value Labels window pops up.
22 SPSS Tutorial 1: Value labels
- Type 0 in the Value field.
- Type Single in the Value Label field.
- Click Add button.
- In the same way give the 1 code a label of Married.
- Click OK.
- Go to Data View to see how this affects the marital variable.
- Toggle View->Value Labels on and off, to see how numeric codes are labeled.
- Repeat this process for the sex variable, using the labels Female and Male for codes F and M.
- Enter sex data for each case.
- Note that string codes are case sensitive (m ≠ M).
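Conceptually, value labels are a lookup from stored codes to display text, and toggling View->Value Labels switches between showing the code and showing its label. The sketch below illustrates that idea in Python. It is not how SPSS implements it, and the `display` helper is a hypothetical name.

```python
# Value labels for the marital variable, as entered in the Value Labels window.
marital_labels = {0: "Single", 1: "Married"}

# Codes as stored in the data (the two cases entered earlier in the tutorial).
marital_codes = [1, 0]


def display(codes, labels, show_labels):
    """Return the codes as stored, or their labels when show_labels is True."""
    if not show_labels:
        return list(codes)
    # Fall back to the raw code when no label has been defined for it.
    return [labels.get(code, code) for code in codes]


print(display(marital_codes, marital_labels, show_labels=False))
print(display(marital_codes, marital_labels, show_labels=True))

# String codes are case sensitive: a label defined for "M" does not apply to "m".
sex_labels = {"F": "Female", "M": "Male"}
print(display(["m", "M"], sex_labels, show_labels=True))
```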
23 SPSS Tutorial 1: Missing data
- In research, it commonly happens that data is missing.
- It is important to handle these situations carefully.
- SPSS converts empty cells into system missing, designated by a period (.).
- It is important to note the specific reasons why data is missing.
- Example: data missing because
- Respondent refused to answer.
- Respondent found question objectionable.
24 SPSS Tutorial 1: Missing data
- Go to Variable View.
- Click the age row, Missing column cell.
- Click the button in the right half of the cell.
- The Missing Values dialog box pops up.
- You can select up to 3 discrete missing values or a range of missing values plus one discrete value.
- Select Discrete missing values.
- Type 999 into the first cell.
- Leave other 2 empty.
- Click OK.
25 SPSS Tutorial 1: Missing data
- 999 is now a code for a particular type of missing data.
- Select the (age, Values) cell.
- Then click the button on the right of the cell.
- Assign No Response to the code 999.
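Declaring 999 as a missing value matters because an undeclared 999 would be treated as a real age. The Python sketch below shows the effect on a mean. The ages are hypothetical, and the filtering is a generic illustration rather than SPSS's internal procedure.

```python
from statistics import mean

MISSING_CODES = {999}  # user-defined missing value for age, as declared above

# Hypothetical ages; 999 stands for "No Response".
ages = [55, 53, 999, 41, 999, 38]

# Keep only the cases whose value is not a declared missing code.
valid_ages = [a for a in ages if a not in MISSING_CODES]

naive_mean = mean(ages)        # distorted: 999 is counted as a real age
valid_mean = mean(valid_ages)  # computed from the valid cases only

print(naive_mean)
print(valid_mean)
```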
26 SPSS Tutorial 1: Missing data
- Missing strings are not designated as system missing.
- They are interpreted as an empty string.
- Format missing data for a string.
- Open the Missing Values dialog box for the sex variable.
- Select Discrete missing values.
- Type NR in the first text box, for no response.
- Don't forget that NR is different from nr.
- Since NR means no response, give NR a label of No Response using the Values column.
27 SPSS Tutorial 1: Reusing variable formats
- If you have multiple variables with the same format, that format can be reused.
- In variable view, create a new variable called agewed.
- In the Label column, type Age Married.
- Click the Values cell in the age row.
- Edit->Copy.
- Click the Values cell in the agewed row.
- Edit->Paste.
- Since age and agewed are both ages, it makes sense that they share the same Values format.
28 SPSS Tutorial 1: Reusing variable formats
- You can reuse all the attributes from a variable.
- Click the row number for the marital row.
- Copy using the Edit menu.
- Select the row number of the first empty row.
- Edit->Paste.
29 SPSS Tutorial 1: Defining variable properties for categorical variables
- A categorical variable is a variable on a nominal scale.
- It is important for SPSS to know what kind of scale to use for each variable.
- SPSS will use different formulas for different scale types.
- Open demo.sav.
- Data->Define Variable Properties.
- Move all the variables starting with Owns into the Variables to Scan box.
- The default type of variable is scale (i.e. ratio), indicated by the little yellow rulers.
- The selected variables should be nominal (categorical).
30 SPSS Tutorial 1: Defining variable properties for categorical variables
- Click Continue.
- In the Scanned Variable List, select ownpc.
- Set the Measurement Level to Nominal.
31 SPSS Tutorial 1: Working with output
- File->Open->Output.
- Open viewertut.spo.
- This is the output Viewer.
- The outline pane is on the left.
- The contents pane is on the right.
- You can access various parts of the output by either scrolling or by clicking an item in the outline.
- An open book icon in the outline pane indicates that part of the output is open.
- You can open or close items in the contents pane by double clicking on the item's icon in the outline pane.
- Try this with Marital Status.
32 SPSS Tutorial 1: Working with output
- You can also hide or display any section of the output by using the +/- signs in the outline pane.
- Try this with the first Frequencies section in the outline pane.
- Collapse and expand the section.
- Notice the effect in the contents pane.
33 SPSS Tutorial 1: Pivot tables
- The results of most SPSS computations are displayed as pivot tables.
- Statistical terms appear in the pivot tables.
- Definitions of these terms can be accessed directly in the Viewer.
- Click on Owns PDA * Gender * Internet Crosstabulation in the outline pane.
- Double click the table.
- Notice the mysterious term, Expected count.
- Left click it, then
- Right click it.
- A popup menu will appear.
34 SPSS Tutorial 1: Pivot tables
- Select What's this? from the menu.
- The definition appears.
- The Viewer permits you to transpose, reformat and otherwise rearrange data in the pivot tables.
- We will not cover this now because we won't be using this capability much in this course.
- However, if you need to manipulate these pivot tables for inclusion in a paper, say, refer to Help->Tutorials, Working with output, for details.
35 Chapter 2A
- Frequency tables, graphs, and distributions
36 Unsorted data
Salaries
37 Sorted data
Salaries
- Trends become apparent
- Range
- How many of each
- We can make how many of each explicit with a frequency table.
38 Frequency table
- For any given salary, we can see how many people earn that much.
39 Relative frequency table
- Say 3 people earn 55,000.
- Is that a lot of people?
- We want a relative measure.
- Relative to what?
- Why not 1?
- rf = f / Total
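The formula rf = f / Total can be tried out in a few lines of Python. The salaries below are hypothetical (the slide's own table is not reproduced here), chosen so that 3 people earn 55,000.

```python
from collections import Counter

# Hypothetical salaries, not the data from the slides.
salaries = [45000, 50000, 55000, 55000, 55000, 60000, 70000, 80000]

freq = Counter(salaries)    # f: how many people earn each salary
total = sum(freq.values())  # Total: the number of people in the sample

# rf = f / Total for each distinct salary, in sorted order.
rel_freq = {salary: f / total for salary, f in sorted(freq.items())}

for salary, rf in rel_freq.items():
    print(f"{salary:7d}  f={freq[salary]}  rf={rf:.3f}")
```

Here 3 of 8 people earn 55,000, so its relative frequency is 0.375. Measured against a whole of 1, that is a large share.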
40 Cumulative relative frequency table
- Add what you have accumulated so far to the next relative frequency.
44 Cumulative relative frequency table
- And so on.
- Should eventually accumulate to 1.
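The accumulate-as-you-go rule is a running sum over the relative frequencies, taken in sorted order. A minimal Python sketch, using hypothetical salaries:

```python
from collections import Counter
from itertools import accumulate

# Hypothetical salaries, not the data from the slides.
salaries = [45000, 50000, 55000, 55000, 55000, 60000, 70000, 80000]

freq = Counter(salaries)
total = len(salaries)
values = sorted(freq)  # distinct salaries, smallest first

rf = [freq[v] / total for v in values]  # relative frequencies
crf = list(accumulate(rf))              # running sum: cumulative relative frequency

for v, r, c in zip(values, rf, crf):
    print(f"{v:7d}  rf={r:.3f}  crf={c:.3f}")
```

The last entry of `crf` is 1, since by then every observation has been counted.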
45 Cumulative percentage frequency table
- Shows the rank of a salary relative to others.
- Percentile.
- The person earning 60,000 is earning as much as or more than 75% of the other people in the sample.
- How to calculate: multiply crf by 100.
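The cumulative percentage for one salary can also be computed directly, as the share of observations at or below that salary, times 100. The function name and the data below are hypothetical. The data were chosen so that 60,000 lands at 75, in line with the example above.

```python
# Hypothetical salaries, not the data from the slides.
salaries = [45000, 50000, 55000, 55000, 55000, 60000, 70000, 80000]


def cumulative_percent(data, value):
    """Percentage of observations less than or equal to value (crf * 100)."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)


print(cumulative_percent(salaries, 60000))
print(cumulative_percent(salaries, 80000))
```

Note that this count includes the person earning 60,000, so "as much as or more than 75%" refers to 6 of the 8 people in this sample.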
46 Exercises
- Page 31 2 ac, 5 a,b,c
- You may skip cf columns
47 Grouped frequency distributions
- What if we had slightly different data?
- 33,013 is almost the same as 33,014.
- Does it make sense to count them in separate bins?
- Meanwhile, there is a large gap between 70,000 and 200,000.
- How can we represent these aspects of the data?
48 Grouped frequency distributions
- Answer: Grouped Frequency Distributions
- Place the data into equally sized (20,000) bins.
- Notice that the 33,000 salaries are lumped together with similar salaries.
- The 200,000 salary really stands out.
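Binning can be sketched in Python as follows. The salaries are hypothetical, apart from the values named above. The bin convention (each bin starts at a multiple of 20,000 and excludes its upper edge) is an assumption, since the slide does not state one.

```python
from collections import Counter

BIN_WIDTH = 20_000

# Hypothetical salaries, including the near-duplicates and the outlier.
salaries = [33013, 33014, 41000, 52000, 58000, 64000, 70000, 200000]


def grouped_frequencies(data, width):
    """Count observations per bin [start, start + width), keyed by bin start."""
    counts = Counter((x // width) * width for x in data)
    lowest, highest = min(counts), max(counts)
    # Include empty bins so that gaps in the data stay visible.
    return {start: counts.get(start, 0)
            for start in range(lowest, highest + width, width)}


table = grouped_frequencies(salaries, BIN_WIDTH)
for start, count in table.items():
    print(f"{start:7d} - {start + BIN_WIDTH - 1:7d}: {count}")
```

The two 33,000 salaries fall into the same bin, and the run of empty bins before 200,000 makes the gap visible. Changing `BIN_WIDTH` is an easy way to explore how bin size changes the picture.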
49 Grouped frequency distributions
- What size bin is the right size?
- Different size bins will show different aspects of the data.
- Try different sizes.
- Explore.
50 Histograms
- The mode is the peak value
- Histograms can be unimodal, bimodal, or multimodal
- Some cases may be ambiguous, depending on the bin structure
51 Distributions
- Histograms are a kind of distribution
- Distributions from small samples tend to have large deviations from the true population distribution
- Not smooth
- Theoretical distributions are usually assumed to be from an infinite population and so are perfectly smooth
52 Distributions
- By dividing each count by the total sample size, one gets a relative frequency distribution
- The vertical axis now becomes a probability
- The sum of all probabilities is 1
- A relative frequency distribution from an infinite population is called a probability density function
- The area under the curve is 1
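Both statements, that the probabilities sum to 1 and that the area is 1, can be checked on a small sample. The sketch below uses hypothetical data and a bin size of 5. It also rescales each relative frequency by the bin width, which is the standard way to make the bars of a histogram have total area 1 and serves as a finite-sample analogue of a density.

```python
from collections import Counter

BIN_WIDTH = 5

# Hypothetical sample.
data = [1, 2, 2, 6, 7, 8, 8, 9, 12, 13]

# Count observations per bin [start, start + BIN_WIDTH).
counts = Counter((x // BIN_WIDTH) * BIN_WIDTH for x in data)
n = len(data)

# Relative frequency: each count divided by the total sample size.
rel_freq = {start: c / n for start, c in sorted(counts.items())}

# Density: bar height chosen so that height * width equals the relative frequency.
density = {start: rf / BIN_WIDTH for start, rf in rel_freq.items()}

total_probability = sum(rel_freq.values())
area = sum(height * BIN_WIDTH for height in density.values())

print(rel_freq)
print(total_probability, area)
```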
53 Exercises
- Page 43 1 a b (use bin size of 5), 2 a b