* T-TEST example PSPP code.
* Generate an example dataset for male and female humans
* with weight, height, beauty and iq data.
* Weight and height data are generated as normal distributions with
* different mean values. iq is generated with the same mean value (100).
* Beauty is only slightly different.
* Every run of the program will produce new data.

input program.
* Females have gender 0.
* Create 8 female cases.
loop #i = 1 to 8.
compute weight = rv.normal(65, 10).
compute height = rv.normal(170.7, 6.3).
compute beauty = rv.normal(10, 4).
compute iq = rv.normal(100, 15).
compute gender = 0.
end case.
end loop.
* Males have gender 1.
* Create 8 male cases.
loop #i = 1 to 8.
compute weight = rv.normal(83, 13).
compute height = rv.normal(183.8, 7.1).
compute beauty = rv.normal(11, 4).
compute iq = rv.normal(100, 15).
compute gender = 1.
end case.
end loop.
end file.
end input program.

* Add a label to the gender values to have descriptive names.
value labels
 /gender 0 'female' 1 'male'.

* Plot the data as boxplot.
examine
 /variables=weight height beauty iq by gender
 /plot=boxplot.

* Do a scatterplot to check if weight and height
* might be correlated. As both the weight and the
* height for males are higher than for females,
* the combination of male and female data is correlated.
* Weight increases with height.
graph
 /scatterplot = height with weight.

* Within the male and female groups there is no correlation between
* weight and height. This becomes visible by marking male and female
* data points with different colours.
graph
 /scatterplot = height with weight by gender.
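
* A minimal sketch, not part of the original example: the within-group
* claim above can also be checked numerically by computing the Pearson
* correlation of height and weight separately for each gender.
* Using SORT CASES, SPLIT FILE and CORRELATIONS for this is an assumption
* about how one might do it, not the original author's method.
* With only 8 cases per group the sample correlation can still differ
* noticeably from 0 by chance.
sort cases by gender.
split file by gender.
correlations
 /variables = height weight.
split file off.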

* The T-Test checks if male and female humans have
* different weight, height, beauty and iq. See that the significance for the
* weight and height variables tends to 0, while the significance
* for iq should not go to 0.
* Significance in the T-Test is the p-value: the probability of observing a
* difference in means at least as large as the one in the sample if the
* height (weight, beauty, iq) of the two groups (male, female) had the same
* mean value. As the data for the iq values is generated as a normal distribution
* with the same mean value, the significance should not go down to 0.
t-test groups=gender(0,1)
 /variables=weight height beauty iq.

* Run the code several times to see the effect that different data
* is generated. Every run is like a new sample from the population.

* Change the number of samples (cases) by changing the
* loop range to see the effect on significance!
* With increasing number of cases the sample size increases and
* the estimation of mean values and standard deviation becomes better.
* The difference in beauty becomes visible only with larger sample sizes.