Run DataXplore Neural Network Module

Much of the documentation found in this section is also available in Dataxplore.pdf, the guide to DataXplore.exe.

Guidance Notes

There is a known bug in the Fuzzy neural network that has to do with array sizes. The Fuzzy neural network will not presently work with a unique conditions grid that has more than about 2000 records. This problem is being addressed by developing a new version of the DataXplore tool.

The table below suggests file naming conventions that are useful while running the Neural Network Module. The order of the entries in the table reflects the order in which the files would be created in the neural network module.

File naming convention - Definition
sdmuc1_train.dta - Neural network training input file. Used by RBFLN. Default name generated by Generate Neural Network Inputs.
sdmuc1_class.dta - Neural network classification file. Used by RBFLN and Fuzzy Neural Network. Default name generated by Generate Neural Network Inputs.
sdmuc1_train.par - Training parameters derived by the RBFLN from training on sdmuc1_train.dta. Suggested name for the user to give to the RBFLN training output from the neural network module.
sdmuc1_class.rbn - Classification results derived by the RBFLN classification of sdmuc1_class.dta using the training parameters in sdmuc1_train.par. Suggested name for the user to give to the RBFLN classification results output from the neural network module. This file will be the input for the next step in ArcSDM, Read the Results from the Neural Network Module.
sdmuc1_class.cen - Centers of the clusters found by the Fuzzy Neural Network in the so-called "Training" step, using sdmuc1_class.dta as the input. Suggested name for the user to give to the Fuzzy Neural Network training output from the neural network module.
sdmuc1_class.fuz - Classification results derived by the Fuzzy Neural Network classification of sdmuc1_class.dta using the cluster centers stored in sdmuc1_class.cen. Suggested name for the user to give to the Fuzzy Neural Network classification results output from the neural network module. This file will be the input for the next step in ArcSDM, Read the Results from the Neural Network Module.
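The naming convention in the table can be generated from a single prefix. The following is a minimal sketch only: the function name and the dictionary keys are hypothetical labels chosen for illustration, and nothing here is part of ArcSDM3 or DataXplore.

```python
def suggested_names(prefix: str) -> dict:
    """Return the suggested file names for one unique-conditions prefix.

    The keys are illustrative labels only; the file names follow the table above.
    """
    return {
        "rbfln_training_input": f"{prefix}_train.dta",
        "classification_input": f"{prefix}_class.dta",
        "rbfln_parameters": f"{prefix}_train.par",
        "rbfln_results": f"{prefix}_class.rbn",
        "fuzzy_centers": f"{prefix}_class.cen",
        "fuzzy_results": f"{prefix}_class.fuz",
    }


# List the names in the order in which the files would be created.
for role, name in suggested_names("sdmuc1").items():
    print(f"{role}: {name}")
```

Keeping one prefix per unique conditions grid makes it easier to match a .par, .rbn, .cen or .fuz file to the .dta files it was derived from.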

To run DataXplore

Select 'Run Neural Network Module...' from the ArcSDM3 menu.

This will launch DataXplore, which is a separate MS Windows program from ArcMap.

While DataXplore is running, ArcMap is fully accessible.

The remainder of this section describes how to use DataXplore with reference to ArcSDM3 and the input files generated in the previous step.

To perform Radial Basis Functional Link Network analysis (RBFLN)

There are three steps: training, testing, and classifying.

Training

Click the 'Train' button to display the following dialog:

Select the file that contains your training data and click OK. The default name for neural network training data files generated from ArcSDM3 is sdmuc#_train.dta.

The RBFLN parameter dialog is then displayed:

The parameters in the box 'RBFLN Parameters From data File' are read from the training data file specified in the previous step.

Click 'Start Train'. A report in the following format is displayed:

Contents of the report

No. of Hidden Layers -
No. of Input Vectors - the number of training vectors (unique conditions at the location of training points) in the training data file
No. of Unique Targets - this will always be 2: the targets are 0 and 1, representing the absence or presence of a mineral occurrence
Total iterations - DataXplore parameter set in the parameter dialog. 200 is the default.
SSE - Sum Squared Error

Guidance Note: An SSE of 1 is considered good.
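For readers who want to check the reported SSE against the Target and Actual Output columns, the conventional definition can be sketched as follows. This assumes the plain sum of squared differences; this guide does not state whether DataXplore applies any scaling factor, so treat the function as an illustration rather than the program's exact formula.

```python
def sum_squared_error(targets, outputs):
    """Sum of squared differences between target and actual output values."""
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))


# Hypothetical values for four training vectors (not taken from a real report).
targets = [1, 0, 1, 0]
outputs = [0.9, 0.2, 0.6, 0.1]
print(sum_squared_error(targets, outputs))
```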

Result of Training

Column 1: Vector No. - an integer uniquely identifying each vector
Column 2: Target - contains values of either 0 or 1. 0 indicates the absence of a mineral occurrence, 1 indicates presence.
Column 3: Actual Output - The actual value that was calculated for each training vector. The range of values is 0 to 1.
Column 4 - n: The input data. These are the actual values read from the training data file. If the file was generated by ArcSDM3, these values were derived from a unique conditions attribute table. The values for each evidential theme have been normalized or scaled between 0 and 1, and an area weighted average of known values has been calculated and used to define areas where data has been identified as missing.
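The scaling and missing-data treatment described for the input columns can be illustrated with a short sketch. The details are assumptions: min-max scaling is used here, missing data is marked with None, and scaling is done before filling. The exact procedure used by ArcSDM3 is not given in this section.

```python
def min_max_scale(values):
    """Scale known values to the range 0..1; None marks missing data."""
    known = [v for v in values if v is not None]
    lo, hi = min(known), max(known)
    span = hi - lo
    return [None if v is None else ((v - lo) / span if span else 0.0) for v in values]


def fill_missing_with_area_weighted_mean(values, areas):
    """Replace None with the area-weighted mean of the known values."""
    known = [(v, a) for v, a in zip(values, areas) if v is not None]
    total_area = sum(a for _, a in known)
    mean = sum(v * a for v, a in known) / total_area
    return [mean if v is None else v for v in values]


# Hypothetical evidential theme: class values per unique condition and their areas.
classes = [1, 3, None, 5]
areas = [10.0, 30.0, 5.0, 60.0]
scaled = min_max_scale(classes)
print(fill_missing_with_area_weighted_mean(scaled, areas))
```

In this example the missing value is replaced by the mean of the known scaled values, weighted by the area of each unique condition.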

To save this report to a file

You can optionally save the contents of this report to a text file for later inspection or reporting purposes. To do this,

  1. Click 'Print ListBox Content to Temporary File'.
  2. Specify a filename and location when prompted.

Save the results of the training session

  1. Click 'Save Result as .par and Return'.
  2. Specify a filename and location when prompted.

This file must be created for input into the testing and classifying steps that follow. The file name will end with the extension .par. It is suggested to name this file sdmuc#_train.par.

You will be returned to the RBFLN parameter dialog.

Click 'Return to Main Menu'.

 

Testing

We have not provided a mechanism for generating a separate testing dataset, that is, a set like the training set that was not used for training. The only way to create a testing set is to run 'Generate Inputs to Neural Networks' with different 'deposits' and 'non-deposits' training sets, and possibly a different study area so that a different subset of the evidence is used. This creates a new unique conditions raster and new train and class files for use in testing. Testing can then be run on the new train or class tables using the parameters file from the training step. This provides an independent test of the training and, in particular, shows whether the network has been overtrained on the data.

Click the 'Test' button.

Select the file that contains the data to be classified and click OK. The default name for neural network classification data files generated from ArcSDM3 is sdmuc#_class.dta.

You will then be prompted to select an RBFLN Test Parameter (*.par) file:

  1. Select the file that you just generated in the training step.
  2. Click 'Open'.

The test is performed and the following report is displayed:

Report Contents

Input Vectors Dimension - the number of components in an input vector. If the data file was generated by ArcSDM3, this is the number of evidential themes.
No. of Hidden Layers - number of training vectors
Target Vectors Dimension - this is always 1
No. of Input Vectors - this is the number of unique conditions written to the data file
SSE - Sum Squared Error

There are two targets, 0 and 1. In the example report shown in the screen capture above, 2096 vectors (or unique conditions) were classified as 0 and 1366 were classified as 1. The assignment of vectors to class 0 or 1 is done by rounding the value calculated by the RBFLN. ArcSDM3 will read the calculated values rather than the classified (rounded) values.
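The rounding that assigns each vector to class 0 or 1 can be sketched as below. The handling of a value of exactly 0.5 is an assumption (it is sent to class 1 here), since this guide does not say how DataXplore treats that case, and the output values are hypothetical.

```python
from collections import Counter


def classify_output(value: float) -> int:
    """Assign class 1 when the RBFLN output is at least 0.5, otherwise class 0."""
    return 1 if value >= 0.5 else 0


# Hypothetical calculated values for four unique conditions.
outputs = [0.03, 0.49, 0.51, 0.97]
classes = [classify_output(v) for v in outputs]

# Count how many vectors fall in each class, as summarised in the report.
print(classes, Counter(classes))
```

Because the rounded class discards how close a value was to 0 or 1, mapping the calculated values gives a more graded picture than mapping the classes.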

Result of RBFLN Test

Column 1: Vector No. - an integer uniquely identifying each vector (or unique condition)
Column 2: Class No. - the class to which the vector has been classified

To save this report to a file

You can optionally save the contents of this report to a text file for later inspection or reporting purposes. To do this,

  1. Click 'Print ListBox Content to temporary file'.
  2. Specify a filename and location when prompted.
  3. Click 'Save'.
  4. Click 'Return' to return to the main DataXplore dialog.

Classify

The steps for classifying data are the same as those taken for testing the classification in the preceding section, with the exception that results are saved to a permanent file that can be subsequently read by ArcSDM3.

Click the 'Classify' button.

When prompted, select the file containing the data to classify and click 'Open'. If generated by ArcSDM3, the default name is sdmuc#_class.dta.

When prompted, select the parameter file that was the result of the training step. Its extension is .par. Click 'Open'.

When processing is complete, a report will be displayed with the following format:

The information reported in the report header is the same as shown and described in the preceding section about the testing process. The values reported should be identical.

Guidance Note: Typically the SSE from classification will be larger than the SSE from training. This is commonly due to having a small number of training sites that do not represent the complete range of unique conditions. One way to improve the classification SSE is to generalize the evidence layers to a small number of classes. This will typically improve the SSE for classification but will not change the SSE for training by much.
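As an illustration of what generalizing an evidence layer to a small number of classes means, the sketch below reclassifies numeric values into equal-width classes. Equal-width classes are only one possible choice and are an assumption here; this guide does not prescribe a method, and in practice the reclassification would be done on the evidence layers in the GIS before the neural network inputs are generated.

```python
def generalize(values, n_classes):
    """Reclassify numeric values into n_classes equal-width classes (1..n_classes)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    if width == 0:
        # All values are identical, so there is only one class.
        return [1 for _ in values]
    # The maximum value would fall just past the last class, so clamp it.
    return [min(int((v - lo) / width) + 1, n_classes) for v in values]


# Hypothetical evidence values, for example distances to a mapped feature.
distances = [0, 120, 250, 400, 610, 980]
print(generalize(distances, 3))
```

Fewer classes per layer means fewer unique conditions, so the training sites are more likely to cover the conditions that occur in the classification file.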

Additional information is reported in the results section, as follows:

Result of RBFLN Classify

Column 0: Vector No. - an integer uniquely identifying each input vector (unique condition). This number corresponds to the values in the unique conditions grid and attribute table. The values will be used to join the RBFLN results to the unique conditions grid theme when they are read by ArcSDM3 in order to create a response theme.
Column 1: Classified Class No. - the target class that the vector is classified as. The value is either 0 or 1 and is derived by rounding the actual output value.
Column 2: Target Output - the contents of the target output from the data file (all 0's).
Column 3: Actual Output - the value calculated by the RBFLN. This value can be read by ArcSDM3 for mapping.
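The join between the RBFLN results and the unique conditions table, keyed on the vector number, can be pictured with the sketch below. It works on in-memory records only: the field names (value, area, vector_no, actual_output, rbfln_output) are hypothetical, and the layout of the .rbn file itself is not described in this section, so no file parsing is attempted.

```python
def join_results(unique_conditions, rbfln_results):
    """Attach the RBFLN actual output to each unique-conditions record by vector number."""
    output_by_vector = {row["vector_no"]: row["actual_output"] for row in rbfln_results}
    return [
        # Records with no matching vector number get None.
        {**record, "rbfln_output": output_by_vector.get(record["value"])}
        for record in unique_conditions
    ]


# Hypothetical records standing in for the attribute table and the report rows.
unique_conditions = [{"value": 1, "area": 12.5}, {"value": 2, "area": 3.0}]
rbfln_results = [
    {"vector_no": 1, "actual_output": 0.82},
    {"vector_no": 2, "actual_output": 0.07},
]
for record in join_results(unique_conditions, rbfln_results):
    print(record)
```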

To save this report to a file

You can optionally save the contents of this report to a text file for later inspection or reporting purposes. To do this,

  1. Click 'Print ListBox Content to temporary file'.
  2. Specify a filename and location when prompted.
  3. Click 'Save'.

Save the results of the RBFLN classification session

  1. Click 'Save Results as .rbn and Return'. The suggested name is sdmuc#_class.rbn.
  2. Specify a filename and location when prompted.

This file must be saved if you want to read the results back into ArcMap via ArcSDM3.

You will be returned to the main DataXplore dialog.

Fuzzy Clustering: Fuzzy (Unsupervised)

There are two steps in doing a classification by fuzzy clustering: Training and Classify.

Although the Fuzzy neural network does not use training, the process is split into clustering to define the cluster centers, followed by classification of all the data based on these clusters. The clustering that defines the centers is done by the process labeled Training, and the classification based on these centers is applied by the process labeled Classification. This partition of the processes allows the user to apply the centers from one dataset to a different dataset with the same evidence layers, if desired.
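To show why saved centers can be reused on another dataset, the sketch below classifies a vector from a list of cluster centers alone. It uses the standard fuzzy c-means membership formula as a stand-in: the algorithm actually used by DataXplore is not described in this section, and the fuzzifier m = 2 and the example centers are assumptions.

```python
import math


def memberships(vector, centers, m=2.0):
    """Fuzzy c-means style membership of one vector in each cluster (sums to 1)."""
    dists = [math.dist(vector, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The vector sits exactly on a center: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]


def classify(vector, centers):
    """Return the index of the cluster with the highest membership."""
    u = memberships(vector, centers)
    return u.index(max(u))


# Hypothetical centers for two clusters over two evidential themes.
centers = [(0.1, 0.1), (0.9, 0.8)]
print(classify((0.2, 0.0), centers), memberships((0.2, 0.0), centers))
```

Only the centers and the vector are needed, which is why the second dataset must have the same evidence layers in the same order.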

There is a known bug in the Fuzzy neural network that has to do with array sizes. The Fuzzy neural network will not presently work with a unique conditions grid that has more than about 2000 records. This problem is being addressed by developing a new neural network module.

Training

Click the 'Train' button located in the box labelled 'Fuzzy (Unsupervised)' on the main DataXplore dialog.

This will display a dialog prompting you to select a 'Fuzzy Train Data File'. The fuzzy clustering algorithm trains using the complete data set that is to be classified. If your input files were generated by ArcSDM3, this file is the one that contains the entire unique conditions table information. The default name is sdmuc#_class.dta.

Select the file to use.

Click 'Open'.

The following dialog will be displayed:

Accept the default parameters as displayed.

Click 'Start Clustering'.

When the clustering processes are complete, a report is displayed in the following format:

Fuzzy Clustering Result

Number of Classes - This is the number of classifications that DataXplore defined from the data.

Center[i] - the values that define the center of each cluster. The number of dimensions equals the number of input evidential themes.

For each Class, the following information is reported:

Vectors Number - the number of vectors (unique conditions) belonging to the class
Weighted Fuzzy Variance (WFV) - the value for the weighted variance of the class
Vector Index in this Class - a list of integers identifying the vectors (unique conditions) that belong to the class
Clustering Validity (1/XieBeni) - a measure of the validity of the clustering results. A small value indicates better clustering.
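For background on the validity measure, the standard Xie-Beni index divides the membership-weighted scatter of the vectors about their centers by the number of vectors times the smallest squared distance between two centers. The sketch below computes that standard index. How DataXplore scales or inverts the value it reports is not confirmed here, and the membership matrix layout, the fuzzifier m = 2 and the example data are assumptions.

```python
import math


def xie_beni(vectors, centers, u, m=2.0):
    """Standard Xie-Beni index; u[i][k] is the membership of vector k in cluster i.

    Requires at least two centers, since separation is measured between centers.
    """
    compactness = sum(
        (u[i][k] ** m) * math.dist(vectors[k], centers[i]) ** 2
        for i in range(len(centers))
        for k in range(len(vectors))
    )
    separation = min(
        math.dist(centers[i], centers[j]) ** 2
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return compactness / (len(vectors) * separation)


# Hypothetical data: two tight, well separated groups with crisp memberships.
vectors = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
centers = [(0.0, 0.5), (4.0, 0.5)]
u = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
print(xie_beni(vectors, centers, u))
```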

Merge

You can merge classes by clicking the 'Merge' button. Cluster centers are often found that are quite close together, so it is good practice to merge clusters to arrive at a minimum set of clusters. When merging produces no further changes, the minimum number has been reached. Often several of the clusters are simply complements of another cluster, as seen in the pattern membership display of the results.

You may want to do this if:

When the merge process is complete, the message 'Merge finished' will be displayed at the top of an updated report. There may be a point at which the algorithm cannot merge the data into fewer classes than currently exist.
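The idea of merging close cluster centers until no further merge is possible can be sketched as follows. This is not DataXplore's merge rule, which is not documented in this section: the distance threshold, the use of the midpoint as the merged center, and the example centers are all assumptions made for illustration.

```python
import math


def merge_close_centers(centers, threshold):
    """Repeatedly merge the closest pair of centers until none is closer than threshold."""
    centers = [tuple(c) for c in centers]
    while len(centers) > 1:
        # Find the closest pair of centers and its distance.
        d, i, j = min(
            (math.dist(centers[a], centers[b]), a, b)
            for a in range(len(centers))
            for b in range(a + 1, len(centers))
        )
        if d >= threshold:
            break  # nothing left to merge
        merged = tuple((p + q) / 2 for p, q in zip(centers[i], centers[j]))
        centers = [c for k, c in enumerate(centers) if k not in (i, j)] + [merged]
    return centers


# Hypothetical centers: the first two are nearly identical.
print(merge_close_centers([(0.10, 0.10), (0.12, 0.10), (0.90, 0.90)], threshold=0.1))
```

The loop stops on its own once the remaining centers are all at least the threshold apart, which mirrors the point at which no fewer classes can be produced.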

To save this report to a file

You can optionally save the contents of this report to a text file for later inspection or reporting purposes. To do this,

  1. Click 'Save ListBox Content to temporary file'.
  2. Specify a filename and location when prompted.
  3. Click 'Save'.

When you are satisfied with the number of classes, click 'Finish'. This will prompt you to save the results of the clustering to a Fuzzy Cluster Result Center file (*.cen).

You must save the results of the clustering step for input into the Classify step that follows.

After you have saved the .cen file, click 'Return'.

Classify

Click the 'Classify' button.

Select the 'Fuzzy Classify Data file'. This is the same file that was selected as the 'Fuzzy Train Data File' in the preceding step. If generated by ArcSDM3, it contains the complete unique conditions table information and its default name is sdmuc#_class.dta.

Click 'Open'.

When prompted, select the Fuzzy Classify Center file generated in the training step:

The results of the classification are reported in a similar report to the one created during the training step. It can be saved by clicking 'Save ListBox Content to temporary file' and specifying a name and location for the file.

Click 'Finish'.

You will be prompted to save a Fuzzy Classify Result file (.fuz). Specify the file name and click 'Save'. This file is required to read the results of fuzzy clustering back into ArcMap via ArcSDM3. The suggested name for this file is sdmuc#_class.fuz.
