The Data Mining of the 2001 NEMP year 8 Mathematics Dataset
Using Cluster Analysis
 

CLUSTER ANALYSIS OF NEMP A SAMPLE ON THE 0-1 NUMBER LINE TASKS

The graphs and figures that illustrate the derivation of a hierarchical cluster profile for the A Sample on the 0-1 number line placement task follow.

The initial cluster solution is illustrated in Figure 11 below, shaded at the six-cluster solution.


The hierarchical cluster solution was based upon the flexible cluster procedure with beta set to .25. Six cases were removed from the analysis because the extent of their missing data left them as irreconcilable single-case clusters. The scree slope of the pseudo-t statistic, which led to the decision to utilise the six-cluster solution, is illustrated in Figure 12 below. The flattening out of the pseudo-t line past the 7 cluster point may be because the amalgamations between clusters past that point do not represent joinings between profoundly different clusters.
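As an illustration of the flexible procedure mentioned above, the following is a minimal sketch of agglomerative clustering with the Lance-Williams flexible update, in which the dissimilarity from any cluster k to a newly merged cluster (i, j) is alpha*d(k,i) + alpha*d(k,j) + beta*d(i,j) with alpha = (1 - beta)/2. It is not the software used in the study. The function name, the toy points, and the sign of beta are assumptions: the text reports beta as .25, whereas the value commonly recommended for this method is -0.25, so beta is left as a parameter.

```python
def flexible_beta_linkage(dist, beta=-0.25):
    """Agglomerative clustering with the Lance-Williams flexible update.

    dist: symmetric list of lists of pairwise dissimilarities.
    Returns a list of (cluster_a, cluster_b, fusion_height) merges.
    Original cases have ids 0..n-1; merged clusters get ids n, n+1, ...
    """
    n = len(dist)
    alpha = (1.0 - beta) / 2.0

    # Dissimilarities keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}

    def get(a, b):
        return d[(a, b)] if a < b else d[(b, a)]

    active = list(range(n))
    next_id = n
    merges = []

    while len(active) > 1:
        # Find the closest pair among the currently active clusters.
        pairs = [(x, y) for idx, x in enumerate(active) for y in active[idx + 1:]]
        a, b = min(pairs, key=lambda p: get(p[0], p[1]))
        height = get(a, b)
        merges.append((a, b, height))

        # Flexible update of dissimilarities to the new cluster.
        new = next_id
        next_id += 1
        for k in active:
            if k in (a, b):
                continue
            d[(k, new)] = alpha * get(k, a) + alpha * get(k, b) + beta * height

        active = [k for k in active if k not in (a, b)] + [new]

    return merges


# Hypothetical toy data: four points on a line, not NEMP data.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
for a, b, height in flexible_beta_linkage(dist, beta=-0.25):
    print(a, b, height)
```

The fusion heights returned by such a routine are the quantities that a scree-type plot inspects when choosing the number of clusters.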



Some support for the six-cluster solution is found in the visual depiction of cluster solutions in Figure 13 below. Coloured versions of the figure show that, in general, colour density tends to be higher within clusters than between clusters. This is particularly evident for the densities between cluster 4 and the other clusters. The colour map of proximities makes it clear that cluster 5 is not well defined with respect to the other clusters (particularly clusters 4 and 6).

To validate the non-randomness of the cluster solution at 6 clusters, CLUSTAN GRAPHICS allowed the series of fusion coefficients (distances) of the actual sample cluster pairs to be compared with a series of Bootstrap-derived sample cluster pairs. The results of this comparison are detailed in Figure 14 below. The largest deviations of the Bootstrap-derived samples from the other samples are at 7 clusters. Based upon these results, the decision to utilise the 6 cluster solution in further analysis has justification.
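The comparison step described above can be sketched as follows. This covers only the final comparison, assuming the fusion coefficients for the actual tree and for each randomised trial have already been computed. It does not reproduce CLUSTAN GRAPHICS, and the function name and all numbers are hypothetical.

```python
from statistics import mean


def deviation_by_cluster_count(actual, trials):
    """Difference between actual and mean randomised fusion coefficients.

    actual: dict mapping number of clusters -> fusion coefficient.
    trials: list of dicts of the same shape, one per randomised trial.
    """
    return {k: actual[k] - mean(t[k] for t in trials) for k in actual}


# Hypothetical values for illustration only, not results from the study.
actual = {5: 4.1, 6: 3.6, 7: 3.4, 8: 2.2}
trials = [
    {5: 3.0, 6: 2.6, 7: 2.1, 8: 1.9},
    {5: 3.2, 6: 2.8, 7: 2.3, 8: 2.1},
]
deviations = deviation_by_cluster_count(actual, trials)
largest = max(deviations, key=deviations.get)
print(largest, deviations[largest])
```

A large positive deviation at a given number of clusters suggests that the fusion at that level joins groups that are further apart than would be expected from randomised data.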


Qualitative Summary of cluster groups
A cross tabulation of each question in the 0-1 number line task by the cluster groups was carried out. The resulting tables are found in Appendix V. For most tasks there was a significant chi-square statistic, indicating non-chance differences in the distributions of task responses between groups. A qualitative summary of the cluster group profiles obtained by the cross tabulation of the NEMP 0-1 number line task by cluster group follows:

Cluster group 1 succeeded only on the task question of placing 100% on the 0-1 number line.

Cluster group 2 did not succeed in any of the placement questions.

Cluster group 3 succeeded on the task questions of placing .1, 50% and 100% on the 0-1 number line.

Cluster group 4 succeeded on all of the placement questions.

Cluster group 5 succeeded on all task questions except those of placing .25 and .5 on the 0-1 number line.

Cluster group 6 succeeded on all task questions except those of placing 1/10 and 4/5 on the 0-1 number line.
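The chi-square statistic used for the cross tabulations above can be computed as in the following minimal sketch. The counts are invented for illustration (the Appendix V tables are not reproduced here), and the function name is an assumption. The statistic would be compared against the chi-square distribution with the returned degrees of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    table: list of rows of observed counts (rows = cluster groups,
    columns = response categories).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and response.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: six cluster groups by (correct, incorrect).
observed = [
    [5, 40],
    [2, 60],
    [30, 15],
    [70, 3],
    [35, 10],
    [33, 12],
]
stat, dof = chi_square(observed)
print(stat, dof)
```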

Global summary
The analysis revealed that a minority of students (26%) had no problems with this task. Two of the cluster groups with profound problems comprised almost half the sample (44%). The remaining students tended to be competent with placement of quantities in one form of representation and not with others. Clearly the latter students had not achieved a relational understanding of number representation.

The analysis above exemplifies the power of cluster analysis, when applied to well-designed tasks, to uncover empirically and conceptually distinct groups from data spaces with a potentially large range of structural permutations.
