* T-TEST example pspp code

* Generate an example dataset for male and female humans
* with weight, height, beauty and iq data.
* Weight and height are drawn from normal distributions with
* different mean values for each gender. iq is drawn with the same
* mean value (100) for both genders.
* The mean beauty differs only slightly (10 versus 11).
* Every run of the program produces new data.
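
* Optional, not part of the original example: to get the same data on
* every run instead, set a fixed seed before the input program. The
* value 42 is arbitrary. Remove the leading asterisk to enable it.
* set seed = 42.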

input program.

* Females have gender 0.
* Create 8 female cases.
loop #i = 1 to 8.
compute weight = rv.normal(65, 10).
compute height = rv.normal(170.7, 6.3).
compute beauty = rv.normal(10, 4).
compute iq = rv.normal(100, 15).
compute gender = 0.
end case.
end loop.

* Males have gender 1.
* Create 8 male cases.
loop #i = 1 to 8.
compute weight = rv.normal(83, 13).
compute height = rv.normal(183.8, 7.1).
compute beauty = rv.normal(11, 4).
compute iq = rv.normal(100, 15).
compute gender = 1.
end case.
end loop.

end file.
end input program.

* Label the gender values with descriptive names.
value labels
  /gender 0 female 1 male.

* Plot the data as boxplots.
examine
  /variables=weight height beauty iq by gender
  /plot=boxplot.

* Draw a scatterplot to check whether weight and height
* might be correlated. As both the weight and the
* height of males are higher on average than those of females,
* the combined male and female data is correlated.
* Weight increases with height.
graph
  /scatterplot = height with weight.
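
* Optional numeric check, not part of the original example: the
* CORRELATIONS command reports the Pearson correlation between height
* and weight for the pooled male and female data.
correlations
  /variables = height weight.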

* Within the male and female groups there is no correlation between
* weight and height. This becomes visible by marking male and female
* data points with different colours.
graph
  /scatterplot = height with weight by gender.
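
* Optional numeric check, not part of the original example: compute the
* height-weight correlation separately for each gender. SPLIT FILE needs
* the cases sorted by the split variable, so they are sorted first.
* With only 8 cases per group the sample value can still differ
* noticeably from 0.
sort cases by gender.
split file by gender.
correlations
  /variables = height weight.
split file off.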

* The T-Test checks whether male and female humans have
* different weight, height, beauty and iq. The significance for the
* weight and height variables tends to 0, while the significance
* for iq should not go to 0.
* Significance in the T-Test is the probability of observing a difference
* at least as large as the one in the sample if the height (weight,
* beauty, iq) of the two groups (male, female) had the same mean value.
* As the iq values are generated from a normal distribution with the same
* mean value for both groups, the significance should not go down to 0.
t-test groups=gender(0,1)
  /variables=weight height beauty iq.

* Run the code several times to see that different data is generated
* each time. Every run is like a new sample from the population.

* Change the number of samples (cases) by changing the
* loop range to see the effect on significance!
* With an increasing number of cases the sample size increases and
* the estimates of the mean values and standard deviations become more precise.
* The difference in beauty becomes visible only with larger sample sizes.