Generate Neural Network Input Files

The first part of this process is the same as Calculating a Response Theme for the weights of evidence and logistic regression methods.

  1. Before starting this process, you may want to define the fuzzy membership of the "Deposit" and "Non-Deposit" training sites in what is being modeled. If no fuzzy membership is desired, none is required. Fuzzy membership is a value in the range [0,1] that reflects how well a particular point represents what is being modeled. The attribute containing these values is a real (floating-point) field. Simply add this field to the attribute table of the training sites and enter the desired fuzzy membership values. For example, for mineral deposits, the size of the deposit may be used to define a fuzzy membership. This is an expert decision and offers many opportunities for experimentation.
  2. Select 'Generate Neural Network Input Files...' from the ArcSDM3 menu. This displays the 'Input to Neural Network - Themes' dialog.
  3. Select the evidential themes to include in the analysis. The evidence rasters selected will be recorded in the Description box of the General tab in the raster Properties window.
  4. Click 'Specify Fields'. This opens the 'Inputs to Neural Network - Fields' dialog.
  5. For each evidential theme, specify the GEN file containing the generalization to analyze, the integer that defines areas of missing data and the theme data type, either free or ordered.
  6. Click OK.
  7. Click 'Generate Input Files...'.
  8. Next, the Fuzzy Membership windows will appear, as shown below on the left: first the window for the Deposits training sites and then the window for the Non-Deposits training sites. The text in the box at the top of the window identifies which training site theme is being used. The drop-down list contains all real-valued attributes. Select the one containing the fuzzy membership that you previously added to the training site shapefile attributes. If no fuzzy membership is desired, select <NONE> in the drop-down list. After selecting the fuzzy membership attribute to use, as shown in the right-hand version of the window below, click Select Attribute.
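
Step 1 leaves the choice of fuzzy membership to the expert. As one illustration only, the sketch below derives a membership in [0,1] from a deposit-size attribute by dividing each size by the largest size. The function name, the sample sizes, and the choice of linear scaling are assumptions made for this example; they are not part of ArcSDM3.

```python
def size_to_membership(sizes):
    """Scale positive deposit sizes to fuzzy memberships in (0, 1].

    Hypothetical helper: the largest deposit gets membership 1.0 and the
    others are scaled in proportion to it.
    """
    if not sizes:
        return []
    if min(sizes) <= 0:
        raise ValueError("deposit sizes must be positive")
    largest = max(sizes)
    return [size / largest for size in sizes]


# Made-up deposit sizes, in arbitrary units.
deposit_sizes = [2.0, 10.0, 6.0]
memberships = size_to_membership(deposit_sizes)
print(memberships)
```

The resulting values would then be typed into the real-valued field added to the training site attribute table. Other mappings (for example logarithmic or expert-assigned classes) are equally valid choices.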

  1. When prompted for the table names, specify the file names and locations as in the table below. These tables are used by the neural network for training and classification.

File name and location of the: unique conditions theme grid
Default name: sdmuc#

File name and location of the: training file
Description:
  • only generated if the RBFLN option is selected
  • text file containing information from unique conditions in which training points are located
  • 1 row = 1 unique condition
  • in DataXplore, 1 row = 1 training vector
  • each unique condition is written once, even though it may contain more than one training point
  • if a training point indicating presence of an object and another indicating absence occur in the same unique condition, the training point indicating presence takes priority
Default name: sdmuc#_train#.dta

File name and location of the: data file
Description:
  • text file containing complete unique conditions
  • 1 row = 1 unique condition
  • in DataXplore, 1 row = 1 feature vector
Default name: sdmuc#_class#.dta

After the files have been created, the unique conditions grid theme will be added to the active view with a default name of 'SDMUC#'.

Sample output files from Generate Neural Network Input Files.

Neural network input files
A. Class.dta file from a unique conditions table written to neural network input file, delimited text format
Description of file header:

Line 1 (5): Number of evidential themes.
Also called components or features, in the context of neural networks and DataXplore.

Line 2 (151): Number of training points.
In the context of neural networks, this is the number of centres (for clustering) or the number of radial basis functions (each with a centre). The number is not fixed, however: it may be changed by the program or by the user in either the unsupervised (fuzzy) clustering or the supervised training of radial basis function neural networks.

Line 3 (1): Number of output components, or values to be mapped as response themes.
This is always set to one in the ArcSDM3 application.

Line 4 (A: 3462) (B: 151): Number of unique conditions.
In the context of neural networks, the number of target vectors.

Description of data columns:

Column 1: Unique condition number.

Column 2: Number of training points in unique condition (not currently used). In the Train.dta file, as described below, this number is the FID of the source point.

Column 3: Area of unique condition (not currently used). In the Train.dta file, as described below, this number is the FID of the unique condition.

Columns 4 - 8: Unique condition values or contents of target vectors.
Values are transformed from the ArcMap unique conditions table in two ways:

  1. the range of an evidential theme's values is normalized to between 0 and 1
  2. the missing data integer is replaced with an area-weighted mean
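
The two transformations listed above can be sketched for a single evidential theme as follows. This is a minimal illustration, not the ArcSDM3 code: the function name is hypothetical, and it assumes that the mean is weighted by the area of each unique condition with non-missing data, that missing values are filled before scaling, and that the 0 to 1 range is taken from the non-missing values only.

```python
def transform_theme(values, areas, missing):
    """Fill missing values with an area-weighted mean, then scale to [0, 1].

    values  -- one class value per unique condition for a single theme
    areas   -- area of each unique condition (same order as values)
    missing -- the integer that marks missing data for this theme
    """
    known = [(v, a) for v, a in zip(values, areas) if v != missing]
    if not known:
        raise ValueError("theme has no non-missing values")

    # Area-weighted mean over the unique conditions that have data.
    total_area = sum(a for _, a in known)
    mean = sum(v * a for v, a in known) / total_area
    filled = [mean if v == missing else v for v in values]

    # Min-max normalization using the range of the non-missing values.
    lo = min(v for v, _ in known)
    hi = max(v for v, _ in known)
    if hi == lo:
        return [0.0 for _ in filled]
    return [(v - lo) / (hi - lo) for v in filled]


# Made-up theme: four unique conditions, -99 marks missing data.
scaled = transform_theme([1, 3, -99, 5], [10, 10, 5, 20], missing=-99)
print(scaled)
```

In this made-up example the weighted mean is (1*10 + 3*10 + 5*20) / 40 = 3.5, so the missing entry lands at (3.5 - 1) / (5 - 1) = 0.625 after scaling.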

Column 9: In the Class file this column is zero. In the Train file this column is the fuzzy membership.
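
The header and column layout described above can be read with a short script. The sketch below is an assumption-laden illustration: it supposes that each of the four header lines holds a single integer, that fields are separated by commas and/or whitespace (the exact delimiter is not stated here), and that each data row has 3 + (number of evidential themes) + 1 columns. The function name and the sample text are invented for the example.

```python
import re


def parse_dta(text):
    """Parse the text of a Class.dta or Train.dta style file (assumed layout)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    def split(line):
        # Assumed delimiter: commas and/or whitespace.
        return [f for f in re.split(r"[,\s]+", line) if f]

    # Header lines 1-4: themes, training points, outputs, unique conditions.
    n_themes, n_training, n_outputs, n_conditions = (
        int(split(ln)[0]) for ln in lines[:4]
    )

    rows = []
    for ln in lines[4:]:
        fields = split(ln)
        if len(fields) != 3 + n_themes + 1:
            raise ValueError(f"unexpected column count in row: {ln!r}")
        rows.append({
            "condition": int(fields[0]),        # column 1
            "col2": float(fields[1]),           # count (Class) or site FID (Train)
            "col3": float(fields[2]),           # area (Class) or condition FID (Train)
            "values": [float(f) for f in fields[3:3 + n_themes]],
            "membership": float(fields[-1]),    # zero in the Class file
        })

    header = {
        "themes": n_themes,
        "training_points": n_training,
        "outputs": n_outputs,
        "unique_conditions": n_conditions,
    }
    return header, rows


# Invented two-theme sample in the assumed layout.
sample = "2\n1\n1\n2\n1 3 120.5 0.0 1.0 0\n2 1 40.0 0.5 0.25 0\n"
header, rows = parse_dta(sample)
print(header["unique_conditions"], rows[1]["values"])
```

If the real files use a different delimiter or header layout, only the `split` helper and the header slice would need to change.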

  1. After saving the training file (SDMUC#_train) and classification file (SDMUC#_class.dta) for use by the neural network, the Dialog Caption window below will appear. This window allows you to preview and print the training set (Train.dta) and unique conditions (Class.dta) files that will be used for training and classification by the RBFLN neural network, or for classification in the Fuzzy Neural Network. Note that the number of training records in the sdmuc#_train.dta file will typically be smaller than the total number of training sites in your shapefiles. This is because training sites with duplicate unique conditions have been eliminated. The rule for eliminating duplicates is to keep the one with the highest fuzzy membership.
  2. In the Train.dta file, the second column records the FID of the training site for "Deposits" training sites. If the "Not Deposits" training sites have a fuzzy membership, the value 1000 + (FID*1000) is reported instead. If the training site is a "Not Deposit" site and no fuzzy membership is specified, then 1000 is reported in column 2. The third column in the Train file contains the FID, from the SDMUC raster, of the unique condition that contains the training site. The image below shows an example of a Train.dta file where fuzzy membership was selected for both "Deposits" and "Not Deposits" training sites. In this example the 5th record, starting with 1 in column 1, is from a "Not Deposit" training site with an FID of 4. This point occurs in unique condition 1 in the source SDMUC raster. The next record, starting with 2 in column 1, is a "Deposit" training site with an FID of 24. This point occurs in unique condition 2 in the source SDMUC raster.
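
The duplicate-elimination rule described above (keep the training site with the highest fuzzy membership in each unique condition) can be sketched as follows. The function name, the tuple layout, and the sample values are assumptions for illustration. The sketch keeps the first site encountered when memberships tie, and it does not model the separate rule that a presence point takes priority over an absence point; how ArcSDM3 combines the two rules is not stated here.

```python
def deduplicate_sites(sites):
    """Keep one training site per unique condition.

    sites -- iterable of (site_fid, condition_id, membership) tuples
    Returns the kept sites ordered by unique condition number.
    """
    best = {}
    for site in sites:
        _, condition_id, membership = site
        current = best.get(condition_id)
        # Replace only on a strictly higher membership, so ties keep the
        # first site seen (an assumption of this sketch).
        if current is None or membership > current[2]:
            best[condition_id] = site
    return [best[cid] for cid in sorted(best)]


# Made-up training sites: two fall in unique condition 1.
sites = [(4, 1, 0.3), (7, 1, 0.9), (24, 2, 0.6)]
kept = deduplicate_sites(sites)
print(kept)
```

Here three training sites reduce to two training records, which is why the Train.dta file usually holds fewer rows than the shapefiles hold points.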
