The Data Mining of the 2001 NEMP year 8 Mathematics Dataset
Using Cluster Analysis
 

THE COMPUTATIONAL METHODS


Computational Considerations

This section documents, in schematic form, the data processing steps involved in this project, so that other researchers can follow the same steps should they wish.

All cluster analysis was carried out using the CLUSTAN Graphics clustering software (Wishart, 1999). This software was chosen for its ease of use (it interfaces with XL) and for the graphical output it provides.

Crosstabulations and quantitative analyses were carried out using SPSS software.
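For readers without SPSS, a crosstabulation of cluster membership against an external variable can be approximated in a few lines of Python. This is only a minimal sketch, not the SPSS procedure used in the study: the function names and the example variable (gender) are hypothetical, and only the Pearson chi-square statistic and its degrees of freedom are computed.

```python
from collections import Counter


def crosstab(cluster_labels, external_values):
    """Count cases for each (cluster, external value) combination."""
    counts = Counter(zip(cluster_labels, external_values))
    rows = sorted(set(cluster_labels))
    cols = sorted(set(external_values))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of cluster and variable.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical example: two clusters crossed with an external variable.
clusters = [1, 1, 1, 1, 2, 2, 2, 2]
gender = ["F", "F", "F", "M", "M", "M", "M", "F"]
rows, cols, table = crosstab(clusters, gender)
stat, dof = chi_square(table)
```

A large statistic relative to its degrees of freedom suggests that cluster membership discriminates on the external variable; the significance level would still need to be looked up or computed separately.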

Microsoft Excel (XL) software was used for some data manipulation and for developing the graphs of cluster group profiles.

The process of analysis involved several transfers of the NEMP datasets provided (and transformations into different data formats). The flow of the data through these stages is illustrated in the five figures below.
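The kind of transfer and merging described here can be sketched in Python. This is an illustration only, under stated assumptions: it supposes the individual source files have already been exported to CSV, that each carries a shared identifier column (the name `student_id` is hypothetical), and that one aggregated row per student is wanted. The function names are also invented for the example.

```python
import csv
import io


def merge_task_files(files, key="student_id"):
    """Merge several per-task tables into one record per student.

    `key` is a hypothetical identifier column shared by every file.
    """
    merged = {}
    for f in files:
        for record in csv.DictReader(f):
            merged.setdefault(record[key], {key: record[key]}).update(record)
    return merged


def write_aggregate(merged, out, key="student_id"):
    """Write the merged records as one CSV table, blank where data is missing."""
    other = sorted({name for rec in merged.values() for name in rec} - {key})
    writer = csv.DictWriter(out, fieldnames=[key] + other, restval="")
    writer.writeheader()
    for student in sorted(merged):
        writer.writerow(merged[student])


# Hypothetical example with two small in-memory files.
task_a = io.StringIO("student_id,task_a\n1,3\n2,5\n")
task_b = io.StringIO("student_id,task_b\n2,1\n3,4\n")
aggregate = io.StringIO()
write_aggregate(merge_task_files([task_a, task_b]), aggregate)
```

Students missing from one file are kept with blank cells rather than dropped, so the decision about how to treat incomplete cases is left to the clustering stage.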

The Stages of the Analysis
The figures illustrate the handling of the NEMP data through the five stages of computation involved in the analysis. The stages, and their corresponding figures, are:

  1. Creation of XL files suitable for CLUSTAN
    The data translation and merging step that moves from individual SYSTAT files to an aggregated XL file (and also an SPSS data file). See Figure 1.
  2. From XL file to the Cluster Analysis Results.
    The operational decisions in utilising CLUSTAN Graphics and the results obtained.
    See Figure 2.
  3. From Cluster Analysis to Validated Cluster Solution.
    The validation of the cluster solutions in terms of their within-cluster and between-cluster distances, and whether these lie outside the values expected from cluster solutions on null datasets.
    See Figure 3.
  4. Quantitative Analysis of Cluster Membership
    The comparison of the cluster solutions in terms of their discriminating capacity on external variables (using SPSS). See Figure 4.
  5. Qualitative Analysis of Cluster Membership
    Obtaining an intuitive understanding of the clusters by labelling the cluster groups and relating their results to each other and to the National curriculum. See Figure 5.
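As a rough illustration of the stage 3 validation idea, the Python sketch below compares a cluster solution's ratio of mean within-cluster to mean between-cluster distance against the ratios obtained from null datasets. It is not the CLUSTAN Graphics procedure: the average-linkage clustering, the squared Euclidean distance, the column shuffling used to build null datasets, and all function names are assumptions made for the example.

```python
import random
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(rows, k):
    """Naive agglomerative clustering (average linkage), cut at k clusters."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        def linkage(pair):
            a, b = pair
            total = sum(sq_dist(rows[i], rows[j]) for i in a for j in b)
            return total / (len(a) * len(b))
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    labels = [0] * len(rows)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def separation_ratio(rows, labels):
    """Mean within-cluster distance divided by mean between-cluster distance.

    Assumes more than one cluster and at least one cluster with two cases.
    """
    within, between = [], []
    for i, j in combinations(range(len(rows)), 2):
        d = sq_dist(rows[i], rows[j])
        (within if labels[i] == labels[j] else between).append(d)
    return (sum(within) / len(within)) / (sum(between) / len(between))


def shuffle_columns(rows, rng):
    """Null dataset (an assumption): shuffle each variable independently."""
    cols = [list(col) for col in zip(*rows)]
    for col in cols:
        rng.shuffle(col)
    return [list(case) for case in zip(*cols)]


def null_comparison(rows, k, n_null=20, seed=0):
    """Return the observed ratio and the ratios from n_null null datasets."""
    rng = random.Random(seed)
    observed = separation_ratio(rows, average_linkage(rows, k))
    null = []
    for _ in range(n_null):
        fake = shuffle_columns(rows, rng)
        null.append(separation_ratio(fake, average_linkage(fake, k)))
    return observed, null
```

Under these assumptions, an observed ratio well below every null ratio would indicate tighter, better separated clusters than would be expected from data with no cluster structure. The naive clustering here is cubic in the number of cases, so it is suitable only for small illustrative datasets.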


Figure 1:  Creation of XL files suitable for CLUSTAN


 


Figure 2:   From XL file to the Cluster Analysis Results.



Figure 3:  From Cluster Analysis to Validated Cluster Solution [3]



[3] 30 truncation for validation purposes.

 

 

Figure 4:  Quantitative Analysis of Cluster Membership


 

Figure 5:  Qualitative Analysis of Cluster Membership

 
