SPSS Training Manual (Z)
MAY, 2017
OUTLINE OF THE TRAINING
➢ However, it works for any science or art when statistical analysis is required.
Opening SPSS
• Depending on how the computer you are working on is set up, you can open
SPSS in one of two ways.
1) If there is an SPSS shortcut on the desktop, place the cursor on it and
double-click the left mouse button.
2) Click the Start button on your screen, point to Programs or All Programs,
and left-click. Then select SPSS 16 or 20 for Windows by clicking the left
mouse button. Either approach will launch the program.
Introduction (Cont.)
Exiting SPSS
• To close SPSS, you can either left click on the close button located on the
upper right hand corner of the window or select Exit from the File menu.
• A dialog box will appear for every open window, asking whether you want to
save it before exiting.
• Click No for each dialog box if you do not have any new files or click Yes if you
want to save files.
Introduction (Cont.)
➢SPSS software has three windows:
figure (6).
Figure 5: Variable labeling window
Columns and Align: Columns is where you set the width of the column in the
data view, and Align sets the alignment of the values in the cell.
Measure: is where you define the measurement scale (Nominal, Ordinal or
Scale) of the variable.
B. Data View Window
❖ The data view window is where you will see the variables defined in the
variable view window listed across the first row of the spreadsheet.
❖ If it is your first time, you are expected to type in the values of each variable.
❖ However, if you already have the data, you will see them here.
❖ Remember that every column in this window holds the values of one variable and each
row holds the information for one observation or individual. See image B in
figure (2).
C. Output Window
❖ This is the window where you will see all the results of the analyses you run.
❖ From here you can copy your results to Microsoft Word or another program, and you
can even edit the outputs before you copy them. See image C in figure (2).
SPSS Menus and Icons
• Edit includes the typical cut, copy, and paste commands, and allows you to
specify various options for displaying data and output.
• View allows you to select which toolbars you want to show, select font size, add or
remove the gridlines that separate each piece of data, and select whether to
display your raw data or the value labels.
• Data allows you to select several options ranging from displaying data that is sorted
by a specific variable to selecting certain cases for subsequent analyses.
SPSS Menus and Icons (Cont.)
• Transform includes several options to change current variables. For example, you
can recode a given variable into the same or a different variable, change scores into
rank scores, add a constant to variables, etc.
• Analyze includes all of the commands to carry out statistical analyses.
• Graphs includes the commands to create various types of graphs including box
plots, histograms, line graphs, and bar charts.
• Utilities allows you to list file information, which is a list of all variables, their
labels, values, locations in the data file, and type.
• Window can be used to select which window you want to view (i.e., Data Editor,
Output Viewer, or Syntax).
• Help has many useful options including a link to the SPSS homepage, a statistics
coach, and a syntax guide. Using topics, you can use the index option to type in any
key word and get a list of options, or you can view the categories and subcategories
available under contents.
Entering and Saving Data in SPSS
• To enter data, simply begin by typing information into each cell. If you did so,
SPSS would give each column a generic label such as var00001.
• Clearly this is not desirable because you would have no way of identifying what
var00001 meant later. Instead, you have to specify names for your variables.
• To do this, you can simply click on Variable View on the bottom left hand corner of
your window and specify the variable name, type, width, decimal, label and so on.
Sort Variables: you can sort variables by their name, type, label, and so on.
• Example: to sort by variable name, use Data → Sort Variables → Name
Sort Cases
• Click Sort Cases under the Data menu
• In the dialog box, select participant ID or Name and move it into the Sort
by box by clicking the arrow.
• Select Ascending or Descending for Sort Order then click Ok.
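The effect of Sort Cases can be pictured outside SPSS with a short Python sketch. The case records and the field names `id` and `name` are hypothetical, chosen only to mirror the participant ID or Name example above.

```python
# Hypothetical cases; in SPSS each dictionary would be one row of the Data View.
cases = [
    {"id": 3, "name": "Chaltu"},
    {"id": 1, "name": "Abebe"},
    {"id": 2, "name": "Bekele"},
]

# Ascending sort on the "Sort by" variable (here: id).
sorted_ascending = sorted(cases, key=lambda case: case["id"])

# Descending sort on the same variable.
sorted_descending = sorted(cases, key=lambda case: case["id"], reverse=True)

print([case["id"] for case in sorted_ascending])
print([case["id"] for case in sorted_descending])
```

As in SPSS, whole cases move together: sorting by one variable reorders the rows, not just that column.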
Data Editing (Cont.)
Merging Files
a) Adding Cases
• Sometimes related data are stored in different files and you would like to combine
or merge them into one file.
• In this case, each file contains the same variables but different cases.
• To combine these files, open one of the data files, then left click on Merge Files on
the Data menu and select Add Cases.
• Then specify the file from which the new data will come and click Open.
• A dialog box will appear showing you which variables will appear in the new file. Review it,
and if all seems in order, click OK. The two files will be merged.
b) Adding Variables
• In other cases, you might have different data on the same cases or participants in
different files.
• Be sure that variables for the same participant end up in the same row; that is, you
need to match the cases (observation numbers).
• Click on Merge Files in the Data menu and select Add Variables, indicate the file that
the new variables are coming from, then click OK.
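The difference between the two kinds of merge can be sketched in Python. All file contents and variable names (`id`, `height`, `weight`) are made up for illustration; the sketch only mirrors the idea, not how SPSS implements it.

```python
# Add Cases: both "files" hold the same variables but different cases.
file_a = [{"id": 1, "height": 160}, {"id": 2, "height": 172}]
file_b = [{"id": 3, "height": 155}]
added_cases = file_a + file_b  # rows are simply stacked

# Add Variables: a second "file" holds new variables for the same cases.
# The rows are deliberately in a different order to show why matching matters.
extra = [{"id": 2, "weight": 70}, {"id": 1, "weight": 58}]
extra_by_id = {row["id"]: row for row in extra}

# Match on the key variable so each participant's values land in the same row.
added_variables = [{**row, **extra_by_id.get(row["id"], {})} for row in file_a]

print(added_cases)
print(added_variables)
```

Matching on a key variable such as the participant ID, rather than on row position, is what keeps each participant's values in the correct row.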
Data Editing (Cont.)
Split file
• We can split a given file into two or more groups based on the categories/values of
a given variable.
• For example, we can split a given data set into male and female groups.
Select Cases
• By selecting cases, the researcher can restrict the analysis to certain cases only.
• Click Data, click Select Cases, click Random sample of cases, then select your
preferences.
• You can also use the ‘If condition is satisfied’ option.
Data Transformation
Compute
• Compute is used to create a new variable by manipulating other variable(s).
• It can use conditional expressions (If condition), logical operators and other
mathematical functions.
• Click ‘Transform’ and then click ‘Compute Variable…’
• Example: Create a new variable named ‘lnheight’, which is the natural log of height.
• Type lnheight in the ‘Target Variable’ box. Then type ‘ln(height)’ in the
‘Numeric Expression’ box. Click OK.
• A new variable ‘lnheight’ is added to the file
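For readers who want to check the result by hand, the same computation can be sketched in Python. The height values are invented for illustration.

```python
import math

# Hypothetical heights (in cm) for three cases.
heights = [150.0, 165.5, 180.2]

# Equivalent of Target Variable = lnheight, Numeric Expression = ln(height).
lnheight = [math.log(h) for h in heights]

for h, value in zip(heights, lnheight):
    print(h, round(value, 4))
```

The natural log is only defined for positive values, so a zero or negative height would produce an error here, and a missing value in SPSS.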
Data Transformation (Cont.)
Count
• It counts, for each case, how many times a given value occurs across the selected
variable(s).
• To do this, click on Transform and then click on Count Values within Cases.
• Move the variable(s) into the Variables box, then write the name and the full
description of the target variable in the Target Variable and Target Label boxes respectively.
• Click on Define Values, write the value to be counted in the Value box and then
add it to the Values to Count box by clicking Add.
• Click on Continue
• Then click OK
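What the Count procedure produces can be sketched in Python as follows. The survey items `q1` to `q3` and their values are hypothetical.

```python
# Hypothetical responses of three cases to three survey items.
cases = [
    {"q1": 1, "q2": 3, "q3": 1},
    {"q1": 2, "q2": 2, "q3": 2},
    {"q1": 1, "q2": 1, "q3": 1},
]


def count_value(case, variables, value):
    """Count how many of the listed variables equal `value` in one case."""
    return sum(1 for name in variables if case[name] == value)


# New target variable: number of items answered with the value 1.
count_of_ones = [count_value(case, ["q1", "q2", "q3"], 1) for case in cases]
print(count_of_ones)
```

Each case gets its own count, which is why the result is a new variable with one value per row.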
Data Transformation (Cont.)
Recode a Variable
• Recoding allows you to change the values of a variable, either in place or in a
new variable.
• You can use the command ‘Recode into Same Variables’ or ‘Recode into
Different Variables’.
• For example, to Recode into Different Variables:
• Click Transform, click ‘Recode into Different Variables’, move the old variable to
the right, give a Name and Label for the Output Variable, click Change, click
Old and New Values, insert the old and new values (click Range to create ranges of
old values), click Add for each value, click Continue and then click OK.
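The idea of recoding ranges of old values into new values can be sketched in Python. The age cut-offs and the codes 1 to 3 are hypothetical and serve only as an example of an Old and New Values table.

```python
def recode_age(age):
    """Map an age in years to a group code (hypothetical cut-offs)."""
    if age <= 17:      # Range: lowest through 17 -> 1
        return 1
    if age <= 59:      # Range: 18 through 59 -> 2
        return 2
    return 3           # Range: 60 through highest -> 3


ages = [12, 17, 18, 45, 60, 83]

# Recode into a different variable: the original list is left untouched.
age_group = [recode_age(age) for age in ages]
print(age_group)
```

Keeping the original variable, as Recode into Different Variables does, makes it possible to check the new codes against the old values.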
Data Transformation (Cont.)
Ranking Cases
• The Transform menu is used to make changes to selected variables in the data file
and to compute new variables based on the values of existing ones.
• Ranking or recoding of data can also be done from this menu.
• The steps are simple: click Transform, click Rank Cases, select the
variable to be ranked, choose Smallest value or Largest value under the ‘Assign Rank
1 to’ option, then click OK.
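A small Python sketch of ranking is given below. It assumes that tied values share the mean of their ranks, which is one common convention; the scores are invented.

```python
def rank_cases(values, smallest_first=True):
    """Return the rank of each value; ties receive the mean of their ranks."""
    ordered = sorted(values, reverse=not smallest_first)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


scores = [50, 70, 70, 90]

# "Assign Rank 1 to: Smallest value"
ranks_smallest = rank_cases(scores)

# "Assign Rank 1 to: Largest value"
ranks_largest = rank_cases(scores, smallest_first=False)

print(ranks_smallest)
print(ranks_largest)
```

The two options simply reverse the direction of the ranking; the treatment of ties is unchanged.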
Importing Data (Reading Data In From Other Sources)
Opening data from EXCEL
• SPSS can also recognize data from several other sources
• For example, you can open data from Microsoft EXCEL in SPSS
• The variable names in the Excel file should follow all of the variable name
guidelines specified by SPSS (e.g., character length, no numbers beginning a name, etc.).
• Save your Excel file (with the .xls file extension), then close it.
• Open SPSS and select Read Text Data from the File menu.
• A dialog box will appear. Under Files of type select Excel, under Look in select
your file name, then click Open.
• Select Read variable names from the first row of data, because that is where the
names appear in the Excel file. Then click OK.
Importing Data (Cont.)
Text Data
• A text data file can be created in any word processing program or in Notepad or any
other text editor
• Be sure to save the file with the .txt or .dat file extension.
• Suppose you have collected data from 4 people and typed it in the following format:
012345
021123
031234
042345
The first two digits are the ID number. The next digit is gender. Digits 4, 5,
and 6 are the responses to the first 3 questions on a survey. No characters
or spaces separate the variables. The data are on your disk in
simpletextdata.txt
• Open SPSS.
• In the SPSS File menu, click Read text data.
• Select simpletextdata under Files of type Text and click Open.
• In the next dialog box, click No for “Does your text file have a predefined format” and
click Next.
• In the next dialog box, select Fixed width under “How are your variables arranged,”
then select No for “Are variable names included in the top of your file.” Then click
Next.
Importing Data (Cont.)
• In the next dialog box, indicate that the data starts on line 1, 1 line represents a case, and
you want to import all cases, then click Next. In the dialog box that follows, we
need to tell SPSS where to insert breaks between variables.
• The next dialog box will show you a draft of what your new data file will look like.
Notice, the variables will have generic names like v1, v2, etc. Then click Next.
• At the next dialog box, you can click Finish and your new data file will appear. You
could then specify variable names, types, and labels as illustrated above.
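To see what the fixed-width breaks do, the four lines of simpletextdata.txt can be split by position in Python. The variable names `id`, `gender`, `q1`, `q2` and `q3` are hypothetical; SPSS itself would assign generic names such as v1, v2.

```python
# The four records from the example, exactly as typed in the text file.
raw = "012345\n021123\n031234\n042345\n"

# (name, start, end) character positions: the "breaks" between variables.
layout = [
    ("id", 0, 2),      # first two digits
    ("gender", 2, 3),  # third digit
    ("q1", 3, 4),
    ("q2", 4, 5),
    ("q3", 5, 6),
]

records = [
    {name: int(line[start:end]) for name, start, end in layout}
    for line in raw.splitlines()
]

for record in records:
    print(record)
```

Because no separator exists in the file, the positions in `layout` carry all of the structure, which is why the import wizard asks for the breaks explicitly.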
Importing Data (Cont.)
• Let’s take one more example where the text file is tab-delimited (a tab was inserted between
each variable) and has variable names at the top.
• Below is an example of the first two lines from this text file:
ID Gender Q1 Q2 Q3
01 2 3 4 5
• On the File menu, click Read text data.
• Select tabtextdata under Files of type Text, then click Open.
• In the next dialog box, you will see a draft that shows a box between each variable to
represent the tabs. Check that they are in the right place. Select No for predefined format and then click
Next.
• Select Delimited for file arrangement and Yes for variable names at the top of the file, then
click Next.
• In the next dialog box, indicate that the data starts on line 2, Each line represents a case, and
you want to import all cases, then click Next.
• In the next dialog box, check Tab as the type of delimiter and then click Next.
• You will see a draft of your data file. Review it, and then click Next.
• Click Finish at the next dialog box and your new data file will appear.
• The differences between these two examples are that the second file is tab-delimited
rather than fixed-width and includes the variable names at the top of the file.
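The tab-delimited case can be sketched in Python with the standard `csv` module. The text below reproduces the two example lines, with `\t` standing for the tab character.

```python
import csv
import io

# First two lines of the tab-delimited example: names on line 1, data from line 2.
raw = "ID\tGender\tQ1\tQ2\tQ3\n01\t2\t3\t4\t5\n"

# DictReader takes the variable names from the first row, like answering
# "Yes" to "Are variable names included at the top of your file?"
reader = csv.DictReader(io.StringIO(raw), delimiter="\t")
rows = list(reader)

print(reader.fieldnames)
print(rows[0])
```

Values are read as text here, so "01" keeps its leading zero until it is converted to a number.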
Exporting Data from SPSS
• Copying is not the only way to take SPSS output into Microsoft Word or
another program.
• It is also possible to export SPSS results directly into a Microsoft Office document.
• From the File menu click on Export.
• In the Objects to Export box, select All, All Visible or Selected.
• Under Document Type select Word/RTF (*.doc) from the drop-down menu.
• Next to the File Name box, click Browse, then specify the location and give the file
a name.
• Click OK.
2. DESCRIPTIVE STATISTICS
A. Frequency
❖ In addition, by clicking Charts you can draw suitable graphs, such as a histogram or pie
chart, for the selected variable(s). See figure (8) for each window.
❖ Remember that the first job in the above analysis is to select the variables listed in
image A and move them to image B.
❖ That means all the selected analyses will be done for the variables found in image B;
see figure (8).
B. Measure of Central Tendency and Variability
❖ By clicking Analyze → Descriptive Statistics → Descriptives you can move the
variable(s) from image A into image B of figure (9) to calculate the mean, sum, dispersion
measures and shape of the data (skewness and kurtosis), and to set the display order.
❖ Remember that by clicking the Options box you can choose the statistics you want to
calculate rather than the defaults. See figure (9).
Figure 8: Window for frequency
Figure 9: Window for Descriptive
DESCRIPTIVE STATISTICS (Cont.)
C. Cross Tabulation
❖ Cross tabulation is used when you want to construct a two-way table.
❖ As before, all the available variables will be listed in image A, and from these
you move the row variable into image B and the column variable into image C.
❖ If you want an additional categorizing variable, move it into image D. In image D
(layer option) you can add more variables. See figure (10).
❖ Based on the variable type you can construct a suitable graph such as a bar graph (chart), pie
chart, histogram, etc. using SPSS. Command: Graphs → Legacy Dialogs.
i. Bar graph(Chart) :In SPSS software, you have three Bar graph options:
❖ Image C: Select the main categorizing variable and put it in the Category Axis box.
❖ Image M: Select what the bars in the chart represent. The options include number of
cases, percent of cases, and other statistics.
❖ If you select the other statistic option, you must select one numeric variable, and
the bars will then represent a summary of that variable (click Change Statistic to select
mean, median, variance or another summary).
❖ Image T: Click this to write the title and caption of the graph.
❖ Image P: The Panel by option lets you draw more than one graph on a
single display.
❖ Enter a new (categorical) variable in the Rows option to display the main plot split
row-wise, and enter a new (categorical) variable in the Columns option to display the main
plot split column-wise.
DESCRIPTIVE STATISTICS (Cont.)
❖ Image I: This is used for a clustered graph. You have to select the clustering
variable and put it here. See figures 14 and 15.
❖ Image N: Move two or more numeric variables to image N, and then by clicking
Change Statistic below image N you can select the summary to be
calculated, such as mean, median, variance, skewness, etc.
❖ Image V: Same purpose as image N.
❖ Image C: Select the clustering (categorizing) variable here.
Figure 14: Simple Bar graph: Summaries of Separate variables
Figure 15: Clustered Bar graph: Summaries of Separate variables
DESCRIPTIVE STATISTICS (Cont.)
ii. Pie-chart
❖ The procedures are the same as the steps needed to construct different simple bar
graphs.
Graphs → Legacy dialogs → Pie.
❖ Then decide which graph you want to construct (summaries for groups of cases,
summaries of separate variables, or values of individual cases).
❖ The remaining steps are the same.
iii. Histogram: (See figure 16)
❖ Image P: This Panel by option will give you a chance to draw more than one
graph on single display.
❖ Enter a new variable (categorical) on Rows option to display the main plot
categorized row wise. And Enter a new variable (categorical) on Columns option to
display the main plot categorized column wise.
❖ Image V: Insert the variable for which the histogram is to be constructed here.
❖ Image T: Click here and write the title and caption for the graph.
Figure 16: SPSS Window for Histogram
3. INFERENTIAL STATISTICS
❖ After completing all the steps, click OK and you will see an output that looks like Table 3.
Example: Using the world95 data supplied with the software, let us test whether the population
average life expectancy of males differs from that of females. Assume the two
samples are independent.
❖ Based on the Levene’s test result in Table 4, the p-value (0.135) is greater than
the 0.05 (5 percent) level of significance.
❖ Thus, at the 5 percent level we do not reject the null hypothesis that the two
groups have equal population variances.
❖ Next, based on the variance test result, we use the first row of the output for the
mean test, which assumes equality of the population variances.
❖ The p-value for the test of equality of means is 0.00, which is less than 5 percent, so we
reject the null hypothesis. Therefore, the average life expectancy of males and
females is not equal.
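The two-step reading of the output can be written out as a short Python sketch. It uses only the p-values reported above (0.135 and 0.00); the variable names are hypothetical.

```python
alpha = 0.05      # level of significance
levene_p = 0.135  # p-value of Levene's test, as reported in Table 4
ttest_p = 0.00    # p-value of the t-test as displayed in the output

# Step 1: decide which row of the t-test output to read.
if levene_p > alpha:
    row_to_read = "Equal variances assumed"
else:
    row_to_read = "Equal variances not assumed"

# Step 2: decide about the null hypothesis of equal means.
reject_equal_means = ttest_p < alpha

print(row_to_read)
print(reject_equal_means)
```

A displayed p-value of 0.00 is a rounded figure, so it should be read as "very small" rather than exactly zero.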
Table 4: SPSS output for the Independent Samples T-test example
3.1.3. Paired Samples T-test
❖ The Paired Samples T-test is used to compare the population means of groups that are related
in some way.
❖ The null (Ho) and alternative (H1) hypotheses for this test are:
Ho: There is no statistically significant difference between the population means of the two groups.
H1: There is a statistically significant difference between the population means of the two groups.
❖ After you select the paired samples option under the mean comparison menu, you will get a
window which looks like figure 19.
$$ r = \frac{SP_{xy}}{\sqrt{SS_x \, SS_y}} $$
Simple Linear Correlation Analysis(Cont.)
❖ The correlation coefficient is always between –1 and 1, that is -1≤r ≤ 1
❖ r = -1 implies perfect negative linear relationship between the variables under consideration
❖ r = 1 implies perfect positive linear relationship between the variables under consideration
❖ r = 0 implies there is no linear relationship between the two variables,
but there could be a non-linear relationship between them.
❖ In other words, when two variables are independent, r = 0, but when r = 0, it is not necessarily
true that the variables are independent.
❖ To run the analysis, use the command
Analyze → Correlate → Bivariate; this will produce the dialog box shown in the figure.
❖ You will get the first window; move the variables from the left box to the Variables
box and then tick the Pearson box.
❖ You will then get output that looks like Table 8.
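The properties of r listed above can be checked with a small Python sketch. It assumes the usual definitions SPxy = Σ(x − x̄)(y − ȳ), SSx = Σ(x − x̄)² and SSy = Σ(y − ȳ)², and the data are invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: SPxy divided by the square root of SSx * SSy."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / math.sqrt(ss_x * ss_y)


x = [-2, -1, 0, 1, 2]
y_linear = [2 * v + 1 for v in x]  # perfect positive linear relationship
y_square = [v ** 2 for v in x]     # y depends on x, but not linearly

print(pearson_r(x, y_linear))
print(pearson_r(x, y_square))
```

In the second case y is completely determined by x, yet r is zero, which illustrates why r = 0 does not imply independence.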
$$ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - e_{ij})^2}{e_{ij}} $$
where Oij and eij are the (i, j)th observed and expected cell frequencies, respectively, in the
crosstab formed by the two categorical variables, r is the number of rows and c is the
number of columns.
❖ Note that
$$ e_{ij} = \frac{(i\text{th Row Total}) \times (j\text{th Column Total})}{\text{Grand Total}} $$
Test of Association (Cont.)
❖ Finally, we reject or fail to reject the null hypothesis by comparing χ² (chi-square
computed) with χ²(α, df) (chi-square tabulated),
where df = (no. of rows − 1) * (no. of columns − 1) and α is the level of significance, or
we can compare the p-value with the level of significance (α).
❖ To run the analysis, use the command
Analyze → Descriptive Statistics → Crosstabs.
❖ You will get the first window; enter one of the variables in the Row(s) option
and the other in the Column(s) option.
❖ After clicking the Statistics box, a new box will appear.
❖ Tick the Chi-square box, click Continue and then OK.
❖ You will then get output that looks like Table 8.
❖ See figure 21 for more information.
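The calculation behind the chi-square output can be followed step by step in Python. The 2 × 2 table of observed frequencies is invented for illustration.

```python
# Hypothetical observed frequencies: rows and columns are the two categorical variables.
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell: row total * column total / grand total.
expected = [
    [rt * ct / grand_total for ct in column_totals]
    for rt in row_totals
]

# Chi-square statistic: sum over all cells of (O - e)^2 / e.
chi_square = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(round(chi_square, 4), degrees_of_freedom)
```

With df = 1 the tabulated value at α = 0.05 is 3.841, so for this made-up table the computed statistic is smaller than the tabulated one and the null hypothesis of no association is not rejected.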
Test of Association (Cont. )
where k is the total number of independent variables (covariates) in the model,
Y is the response (dependent) variable and
x1, ..., xk are the independent variables.
❖ If k = 1 the model is called a Simple Linear Regression model, and if k > 1 the model is
called a Multiple Linear Regression model.
❖ To obtain linear regression output, go to Analyze → Regression → Linear;
❖ you will then get the window shown in figure 22, where you enter your variables and
request the possible outputs.
Figure 23: Regression Analysis Window
Regression Analysis (Cont. )
❖ Example : let us use world 95 data on the package. From this data let
❖ Then if you follow the formal procedure with the method Enter, you will get the
upcoming outputs in Table 11,12 and 13.
❖ Note that the hypothesis for the ANOVA table (output figure (33)) in our example is
H0: β1 = β2 = β3 = 0 vs. H1: at least one βi is different from zero, where i = 1, 2, 3.
❖ Based on the Enter method of variable selection, the fitted model is given as follows:
❖ Remember to check all the model assumptions before interpreting and using the estimates.
❖ In simple or multiple linear regression analysis the dependent variable must be
a continuous variable.
❖ However, there are cases where the dependent variable is categorical.
❖ If the dependent variable is categorical, then whatever the type of the independent
variable(s), we can fit a Logistic Regression model.
❖ There are also different types of categorical variables, such as Binary (two
possible values) or Multilevel (more than two possible values).
❖ Multilevel categorical variables can in turn be classified as ordinal or nominal
categorical variables.
❖ If you need more you can read further about multilevel logistic regression, but here we
intend to discuss
❖ If the dependent variable has only two categories we use Binary Logistic
Regression.
❖ Let P = Prob(Y = 1 | x = xi), where Y is the binary response variable with values 0 and
1, and x is the independent variable(s) (covariates).
Binary Logistic Regression (Cont.)
Then the Binary Logistic regression model becomes:
❖ The Omnibus Tests of Model Coefficients table is used to test the significance of the
covariates included in the model.
❖ Model comparison is done using the likelihood ratio test and chi-square test statistics.
❖ In addition, Cox and Snell R2 and Nagelkerke R2 are interpreted to assess the quality
of the final model. (Further reading is recommended.)
❖ Command: Analyze → Regression → Binary Logistic.... See figure (35) for the detailed
steps.
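As a rough illustration of how a fitted binary logistic model turns covariate values into a probability, the sketch below assumes the commonly used form ln(P / (1 − P)) = β0 + β1x1 + ... + βkxk. The coefficient values are made up and do not come from any SPSS output.

```python
import math


def predicted_probability(intercept, coefficients, x):
    """Return P(Y = 1 | x) under the assumed logit form of the model."""
    logit = intercept + sum(b * value for b, value in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical estimates: intercept and two slope coefficients.
b0 = -1.0
b = [0.8, 0.5]

p_reference = predicted_probability(b0, b, [0, 0])  # both covariates at 0
p_other = predicted_probability(b0, b, [1, 2])      # x1 = 1, x2 = 2

print(round(p_reference, 4), round(p_other, 4))
```

Whatever the covariate values, the result stays strictly between 0 and 1, which is the reason the logit form is used for a binary response.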
Binary Logistic Regression (Cont.)
Example: Let us consider Table 9 and model the impact of Gender and Educational
Level on the saving status of 20 households, where
Dependent variable:
Saving: 0 = Yes and 1 = No
Independent variables:
Gender: 0 = Male and 1 = Female
Educational Level: 0 = Elementary or less, 1 = High School and 2 = Above High School
Based on the binary logistic analysis output, the final model is:
where,