Applet for constructing non-linear approximations of 2D data, using various algorithms
(developed by Dr. Evgeny Mirkes, University of Leicester)



Contents

Table of symbols
Neighborhood function
Association of data points with nodes
Robust association of data points with nodes
Learning history
Fraction of variance unexplained
Map extension with same curvature
Learning Dialog
Graphical history window
Best second node finding
Robust best second node finding
SOM nodes location optimization
SOM
GSOM
GalaSOM
Nodes location optimization for elastic map
REM
RGEM
GalaEM
Nodes location optimization for principal tree
RPT
GalaPT
Brief user guide
Menu panel "Data set"
Common parts of the menu panel for all methods
Menu panel "PCA"
Menu panel "SOM"
Menu panel "GSOM"
Menu panel "GalaSOM"
Menu panel "REM"
Menu panel "RGEM"
Menu panel "GalaEM"
Menu panel "RPT"
Menu panel "GalaPT"
Menu panel "Std. Data"

K-means and K-medoids: applet
KNN and Potential Energy: applet
PCA, SOM and GSOM: applet

Table of symbols

{x_i} (i = 1,…,m) – the data points to be approximated.

{y_i} (i = 1,…,n) – the set of map nodes.

C_i – the set of data points associated with the node y_i.

{C_i} (i = 1,…,n) – the family of sets of data points associated with the nodes.

E – the set of data points not associated with any node.

||x − y|| – the Euclidean distance between points x and y.

R – the robust radius, or radius of data-node interaction.

L_i – the list of nodes connected with node y_i. If j ∈ L_i, the map (tree) includes the edge (y_i, y_j).

Neighborhood function

All self-organizing map modifications use a neighborhood function to select the set of nodes that take part in learning together with the winner node. The neighborhood function also assigns weights to these nodes. In this applet a linear B-spline is used as the neighborhood function.

The applet options dialog assigns the neighborhood radius h_max (the default value is 3). A neighborhood radius of 3 means that the three nodes before and the three nodes after the winner node take part in learning (see figure).

If y_i is the winner node and h_max is the neighborhood radius, the weights of the other nodes are calculated by the following formula:

h_ij = 1 − |i − j| / (h_max + 1) if |i − j| ≤ h_max, and

h_ij = 0 if |i − j| > h_max (an example is given in the figure).
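
The weights can be computed in a few lines. Below is a minimal sketch in Python with NumPy (the function name and array layout are illustrative, not part of the applet):

    import numpy as np

    def neighborhood_weights(n_nodes, winner, h_max=3):
        # Linear B-spline: h_ij = 1 - |i-j|/(h_max+1) inside the radius, 0 outside.
        d = np.abs(np.arange(n_nodes) - winner)
        return np.where(d <= h_max, 1.0 - d / (h_max + 1.0), 0.0)

    # neighborhood_weights(10, winner=4) -> 0.25, 0.5, 0.75, 1.0, 0.75, ... around node 4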

Association data points with nodes

To use some of the learning algorithms (the batch SOM learning algorithm, the elastic map learning algorithm, the principal tree), it is necessary to associate every data point with its nearest node. The set of data points associated with node y_i is defined as C_i = {l : ||x_l − y_i||² ≤ ||x_l − y_j||², ∀j ≠ i}.

If ||x_l − y_i||² = ||x_l − y_j||² for some point x_l, the point is associated with the node that has the smallest index.

After association is completed we have the sets {C_i} (i = 1,…,n). The set of non-associated points is empty: E = ∅.
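
A minimal sketch of this association in Python with NumPy (argmin resolves ties toward the smaller index, matching the rule above; names are illustrative):

    import numpy as np

    def associate(points, nodes):
        # points: (m, 2) array, nodes: (n, 2) array
        d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)  # (m, n)
        owner = d2.argmin(axis=1)          # nearest node index for every point
        return [np.flatnonzero(owner == i) for i in range(len(nodes))]    # C_1..C_n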

Robust association data points with nodes

To use some of the learning algorithms (the batch SOM learning algorithm, the elastic map learning algorithm, the principal tree), it is necessary to associate every data point with its nearest node. To use a robust modification of these methods, a data point is associated with its nearest node only when the distance between them is less than or equal to the robust radius. Another name for the robust radius is the radius of data-node interaction. The set of data points associated with node y_i is defined as C_i = {l : ||x_l − y_i||² ≤ ||x_l − y_j||² and ||x_l − y_i||² ≤ R², ∀j ≠ i}.

If ||x_l − y_i||² = ||x_l − y_j||² for some point x_l, the point is associated with the node that has the smallest index.

After association is completed we have the sets {C_i} (i = 1,…,n) and the set of non-associated points E = {l : ||x_l − y_i||² > R², ∀i = 1,…,n}.
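
The robust variant only adds a radius test; points whose nearest node is farther than R form the set E. A sketch under the same assumptions as above:

    import numpy as np

    def associate_robust(points, nodes, R):
        d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
        owner = d2.argmin(axis=1)
        near = d2[np.arange(len(points)), owner] <= R * R   # within the robust radius
        C = [np.flatnonzero((owner == i) & near) for i in range(len(nodes))]
        E = np.flatnonzero(~near)                           # non-associated points
        return C, E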

Learning history

During map learning the applet saves the node locations. Every learning history record contains its own number (the step number) and the locations of all map nodes.

All learning history records can be viewed step by step or as a slide show.

Moreover, the applet can write the history to a file. If the applet was started as an application (without an HTML page), you can press the key combination "Ctrl+P". A dialog appears to configure the output (see figure).

In the field "Select file name for history writing" you have to select the file name using the "Browse" button.

The check box "Write picture" switches picture writing on and off.

The radio button group "Picture file format" is available only if picture writing is switched on. This group selects one of three formats for writing pictures.

The radio button group "Data point size" is available only if the EPS file format is selected.

The radio button group "Write history point" selects the frequency of output. The "Every step" item writes all history points, including the node optimization steps; this option is useful for methods without growing, such as SOM and REM. The "After change number of nodes" item writes a short history, fixing the important states only: the states after adding or cutting nodes and after optimization. The "After decreasing number of nodes" item is applicable to the GalaPT method only: if the cutting graph grammar is used, a history point is written after each complete loop of grammar applications.

The group of check boxes "Write parameters" selects the parameters to be written:

Number of nodes;

Fraction of variance unexplained (see "Fraction of variance unexplained");

Energy (the Lyapunov function), applicable to elastic maps and principal graphs; for SOM this parameter is not defined and is written as zero;

Geometrical complexity, a measure of the approximator's complexity. For SOM and REM this value is calculated as Gc = (1/4) Σ_{i=2..n−1} ||y_{i−1} − 2y_i + y_{i+1}||²; for RPT it is calculated as Gc = Σ_{i=1..n, |L_i|>1} ||y_i − (1/|L_i|) Σ_{j∈L_i} y_j||².

Structural complexity has the format "number of k-stars" | "number of (k−1)-stars" | … | "number of 3-stars" || "number of nodes". The meaning of structural complexity is described in ???.

Construction complexity is the number of grammar applications needed to obtain the current graph from the initial graph. The meaning of construction complexity is described in ???.

Fraction of variance unexplained


Figure: projection of a point onto a vector.

In this applet series the approximation error is evaluated as the fraction of variance unexplained. The base variance σ_b is calculated as σ_b = (1/m) Σ_{i=1..m} ||x_i − E(x)||², where E(x) = (1/m) Σ_{i=1..m} x_i.

For a given configuration of nodes {y_i} the unexplained variance is calculated as follows. For every data point x its projection onto the map is defined. Let node y_i be the nearest node to point x. Node y_i is connected with one node (if y_i is an outer node), two nodes, or more for the principal tree. Let y_{i1},…,y_{ik} be the nodes connected with y_i.

The projections of point x onto node y_i and onto all edges (y_i, y_ir) are calculated (see figure). The squared distance between node y_i and point x is d_0 = ||x − y_i||².

To calculate the projection of point x onto the edge (y_i, y_ir), the length of the projection is evaluated through the scalar product (see figure): q = (x − y_i, y_ir − y_i) / ||y_ir − y_i||.

This value is measured along the direction (y_ir − y_i) and may be positive (left figure) or negative (right figure). If q is negative, the projection onto the line through y_ir and y_i falls outside the edge (y_i, y_ir); in this case we define d_r = ∞. If q is positive, the length of the segment ||c − y_i|| equals q, and by the Pythagorean theorem the squared distance between point x and the edge (y_i, y_ir) is d_r = ||c − x||² = d_0 − q².

The squared distances between point x and the other edges are evaluated analogously.

The squared distance between data point x_i and the map is D_i = min_{r=0,…,k} d_r. The map's unexplained variance is σ_m = (1/m) Σ_{i=1..m} D_i.

The fraction of variance unexplained is evaluated as σ_f = σ_m / σ_b. This value is used as the approximation error.
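
A compact sketch of this computation in Python with NumPy (adjacency[i] is assumed to list the node indices connected to node i; all names are illustrative):

    import numpy as np

    def fraction_variance_unexplained(points, nodes, adjacency):
        centre = points.mean(axis=0)
        sigma_b = ((points - centre) ** 2).sum(axis=1).mean()   # base variance
        D = []
        for x in points:
            i = ((nodes - x) ** 2).sum(axis=1).argmin()         # nearest node
            d0 = float(((x - nodes[i]) ** 2).sum())
            best = d0                                           # projection onto the node itself
            for r in adjacency[i]:
                e = nodes[r] - nodes[i]
                q = np.dot(x - nodes[i], e) / np.linalg.norm(e)
                if q > 0:                                       # otherwise d_r = infinity
                    best = min(best, d0 - q * q)
            D.append(best)
        return np.mean(D) / sigma_b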

Map extension with same curvature

There are three nodes z_{−2}, z_{−1}, z_0 of a broken line. It is necessary to find the coordinates of a new node z_p such that the angle between the vectors (z_p − z_0) and (z_0 − z_{−1}) is equal to the angle between the vectors (z_0 − z_{−1}) and (z_{−1} − z_{−2}). The nodes z_{−2}, z_{−1}, z_0, z_p form a convex figure (see figure).

If the nodes z_{−2}, z_{−1}, z_0 are known, sin φ and cos φ are evaluated as

sin φ = ((z_{−1,x} − z_{−2,x})(z_{0,y} − z_{−1,y}) − (z_{−1,y} − z_{−2,y})(z_{0,x} − z_{−1,x})) / (|z_{−1} − z_{−2}| |z_0 − z_{−1}|),

cos φ = ((z_{−1,x} − z_{−2,x})(z_{0,x} − z_{−1,x}) + (z_{−1,y} − z_{−2,y})(z_{0,y} − z_{−1,y})) / (|z_{−1} − z_{−2}| |z_0 − z_{−1}|).

The unit vector along the line (z_0 − z_{−1}) is calculated as e = (z_0 − z_{−1}) / |z_0 − z_{−1}|.

To evaluate the unit vector e′ along the line (z_p − z_0), the vector e is multiplied by the rotation matrix

    ( cos φ   −sin φ )
    ( sin φ    cos φ )

To calculate the new node z_p, the vector e′ is multiplied by the new edge length l and added to z_0: z_p = z_0 + l·e′.
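
A sketch of this construction in Python with NumPy (the rotation matrix turns e by the same signed angle φ measured from (z_{−1} − z_{−2}) to (z_0 − z_{−1}); names are illustrative):

    import numpy as np

    def extend_with_same_curvature(z_m2, z_m1, z_0, l):
        a, b = z_m1 - z_m2, z_0 - z_m1
        na, nb = np.linalg.norm(a), np.linalg.norm(b)
        sin_phi = (a[0] * b[1] - a[1] * b[0]) / (na * nb)   # cross product
        cos_phi = (a[0] * b[0] + a[1] * b[1]) / (na * nb)   # scalar product
        rot = np.array([[cos_phi, -sin_phi], [sin_phi, cos_phi]])
        e = b / nb                   # unit vector along the last edge
        return z_0 + l * (rot @ e)   # step by l along the rotated direction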

Learning Dialog

When the "Learn" button is pressed, the learning process dialog is displayed (see figure). The first column shows the current number of learning steps, and the second column shows the current value of the fraction of variance unexplained. The displayed data is updated twice per second. If the number of learning steps changed between the two last updates, "Learning…" is shown in the bottom left corner. If the number of learning steps has not changed for some updates, "Learning…n" is shown, where n is the number of such updates.

The "Interrupt learning" button in the bottom right corner serves to interrupt the learning process.

Graphical history window

The graphical history window serves to visually display how some map parameters change during the learning process. This window is accessible only in the universal methods GalaSOM, GalaEM and GalaPT.

Three diagrams are displayed in the window:

1. The current number of nodes.

2. The fraction of variance unexplained.

3. The value of the energy being minimized. This value is method-specific.

During history viewing, a red vertical line indicates the current position in the window. The current numerical values of the parameters are displayed at the top of the window.

When the "Save to file" button is pressed, the following data is saved to a file with a user-specified name:

1. The step number.

2. The current number of nodes.

3. The fraction of variance unexplained.

4. The value of the energy being minimized.

The file name dialog is the standard one of the operating system in use.

Best second node finding

The search radius R is a parameter of this method. Fitness is evaluated as the number of data points that become associated with the new node in accordance with "Association of data points with nodes".

Search algorithm.

1. The search parameters are initialized: the angle φ = 0, the number of newly associated points n_ap = 0, and the best new node y_p = null.

2. The new node coordinates are evaluated as y_{0,x} = y_{1,x} + R cos φ, y_{0,y} = y_{1,y} + R sin φ.

3. The set of newly associated points is evaluated as C_0 = {l : ||x_l − y_0|| ≤ ||x_l − y_i||, i > 0}.

4. If |C_0| > n_ap, the new node is saved as the best one: n_ap = |C_0|, y_p = y_0.

5. The angle is increased: φ = φ + 20°.

6. If φ < 360°, go to step 2.
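
A sketch of this sweep in Python with NumPy, for the common case where y_1 is the only existing node (names are illustrative):

    import numpy as np

    def best_second_node(points, y1, R, step_deg=20):
        best, n_ap = None, 0
        for phi in np.deg2rad(np.arange(0, 360, step_deg)):
            y0 = y1 + R * np.array([np.cos(phi), np.sin(phi)])
            d_new = ((points - y0) ** 2).sum(axis=1)
            d_old = ((points - y1) ** 2).sum(axis=1)
            captured = int((d_new <= d_old).sum())   # fitness: points won by y0
            if captured > n_ap:
                best, n_ap = y0, captured
        return best                                   # None mirrors y_p = null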

Robust best second node finding

The search radius R is a parameter of this method. Fitness is evaluated as the number of non-associated data points that become associated with the new node in accordance with "Robust association of data points with nodes".

Search algorithm.

1. The search parameters are initialized: the angle φ = 0, the number of newly associated points n_ap = 0, and the best new node y_p = null.

2. The new node coordinates are evaluated as y_{0,x} = y_{1,x} + R cos φ, y_{0,y} = y_{1,y} + R sin φ.

3. The set of newly associated points is evaluated as C_0 = {l : ||x_l − y_0|| ≤ R, l ∈ E}.

4. If |C_0| > n_ap, the new node is saved as the best one: n_ap = |C_0|, y_p = y_0.

5. The angle is increased: φ = φ + 20°.

6. If φ < 360°, go to step 2.
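
Only the fitness changes relative to the previous sketch: candidates are scored by how many currently non-associated points fall within radius R (points_E is assumed to hold the points of the set E; names are illustrative):

    import numpy as np

    def best_second_node_robust(points_E, y1, R, step_deg=20):
        best, n_ap = None, 0
        for phi in np.deg2rad(np.arange(0, 360, step_deg)):
            y0 = y1 + R * np.array([np.cos(phi), np.sin(phi)])
            captured = int((((points_E - y0) ** 2).sum(axis=1) <= R * R).sum())
            if captured > n_ap:
                best, n_ap = y0, captured
        return best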

SOM nodes location optimization

The node location optimization algorithm is the same for all SOM learning methods used in this applet series. All applets use the batch SOM learning algorithm.

Node location optimization algorithm.

1. Data points are associated with nodes ("Association of data points with nodes"). If a robust modification is used (GalaSOM), the algorithm described in "Robust association of data points with nodes" is used instead.

2. If all sets {C_i} evaluated at step 1 are equal to the sets evaluated at the previous iteration, learning stops.

3. A new record is added to the learning history to hold the new node locations.

4. The centroid is calculated for every node: X_i = 0 if C_i = ∅ and X_i = Σ_{j∈C_i} x_j if C_i ≠ ∅.

5. A new location is evaluated for every node.

5.1. The following values are calculated for every node:

5.1.1. The sum weighted adjustment wc_i = Σ_{j=1..n} |C_j| h_ij, where |C_j| is the cardinality of the set C_j, i.e. the number of points associated with node y_j.

5.1.2. The sum centroid Y_i = Σ_{j=1..n} X_j h_ij.

5.2. If the sum weighted adjustment equals zero for node y_i, the new location of node y_i is the same as its old location. Otherwise the new location of node y_i is evaluated as y_i = Y_i / wc_i.

6. The step counter is increased.

7. If the step counter equals the maximum number of optimization steps, learning stops.

8. Calculation returns to step 1.
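
One pass of steps 1-5 can be written compactly. A sketch in Python with NumPy for a chain of nodes, reusing the linear B-spline weights defined earlier (names are illustrative):

    import numpy as np

    def batch_som_step(points, nodes, h_max=3):
        n = len(nodes)
        d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
        owner = d2.argmin(axis=1)                       # step 1: association
        counts = np.bincount(owner, minlength=n).astype(float)
        X = np.zeros_like(nodes)
        np.add.at(X, owner, points)                     # step 4: centroid sums X_i
        idx = np.arange(n)
        dist = np.abs(idx[:, None] - idx[None, :])
        H = np.where(dist <= h_max, 1.0 - dist / (h_max + 1.0), 0.0)
        wc = H @ counts                                 # step 5.1.1: weighted adjustment
        Y = H @ X                                       # step 5.1.2: sum centroid
        keep = wc <= 0                                  # nodes with zero adjustment stay put
        return np.where(keep[:, None], nodes, Y / np.maximum(wc, 1e-300)[:, None])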

SOM

There is one parameter, the "Neighbourhood radius", in the Self-Organizing Map (SOM) learning algorithm. The parameter is assigned before learning starts and is used to evaluate the "Neighborhood function".

The batch SOM learning algorithm developed by Kohonen is used for learning.

The initial node locations are assigned before learning starts ???.

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

Before learning starts, the sets of data points associated with nodes are set to empty (C_i = ∅, i = 1,…,n) and the step counter is set to zero. The maximum number of optimization steps is set to 100.

The node location optimization algorithm is described in "SOM nodes location optimization".

GSOM

The Growing Self-Organizing Map (GSOM) method is customized by the following parameters, assigned before learning starts:

The "Neighbourhood radius" parameter is used to evaluate the "Neighborhood function".

The "Maximum number of nodes" parameter restricts the size of the map.

The "Stop when fraction of variance unexplained percent is less than" parameter defines the threshold for the fraction of variance unexplained.

The batch SOM learning algorithm developed by Kohonen is used for learning.

The initial node locations are assigned before learning starts ???.

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

The maximum number of optimization steps is set to 20.

Map learning algorithm

1. The base variance σ_b is calculated ("Fraction of variance unexplained").

2. Before the loop starts, the sets of data points associated with nodes are set to empty (C_i = ∅, i = 1,…,n) and the step counter is set to zero.

3. Node locations are optimized ("SOM nodes location optimization").

4. The fraction of variance unexplained fvu is evaluated.

5. If the fraction of variance unexplained is less than the user-defined threshold, the learning loop is broken off.

6. If the number of nodes equals the user-defined maximum number of nodes, the learning loop is broken off.

7. If the user presses the "Interrupt learning" button in the Learning Dialog, the learning loop is broken off.

8. Map growing (a sketch of this step is given after the algorithm).

8.1. If the map contains only one node:

8.1.1. A new node is placed at a randomly selected data point.

8.1.2. A new learning history record is created. The old and new nodes are saved into the record.

8.2. If the map contains more than one node:

8.2.1. The sets C_1 and C_n are evaluated (see "Association of data points with nodes").

8.2.2. If those sets are empty, map growing is impossible and the growing loop is broken off.

8.2.3. The first node quality is evaluated: d_f = Σ_{x∈C_1} ||x − y_1||².

8.2.4. The last node quality is evaluated: d_l = Σ_{x∈C_n} ||x − y_n||².

8.2.5. If d_f > d_l, then:

8.2.5.1. A new node is created by attaching a copy of the first edge to the first node: y_0 = 2y_1 − y_2.

8.2.5.2. The fraction of variance unexplained fvu_new of the new node set is evaluated.

8.2.5.3. If fvu_new < fvu, a new learning history record is created, the new node set is saved into the record, and learning continues from step 2.

8.2.5.4. Otherwise the new node y_0 is removed, and a new node is created by attaching a copy of the last edge to the last node: y_{n+1} = 2y_n − y_{n−1}.

8.2.5.5. The fraction of variance unexplained fvu_new of the new node set is evaluated.

8.2.5.6. If fvu_new < fvu, a new learning history record is created, the new node set is saved into the record, and learning continues from step 2.

8.2.5.7. Otherwise the growing loop is broken off.

8.2.6. If d_f < d_l, then:

8.2.6.1. A new node is created by attaching a copy of the last edge to the last node: y_{n+1} = 2y_n − y_{n−1}.

8.2.6.2. The fraction of variance unexplained fvu_new of the new node set is evaluated.

8.2.6.3. If fvu_new < fvu, a new learning history record is created, the new node set is saved into the record, and learning continues from step 2.

8.2.6.4. Otherwise the new node y_{n+1} is removed, and a new node is created by attaching a copy of the first edge to the first node: y_0 = 2y_1 − y_2.

8.2.6.5. The fraction of variance unexplained fvu_new of the new node set is evaluated.

8.2.6.6. If fvu_new < fvu, a new learning history record is created, the new node set is saved into the record, and learning continues from step 2.

8.2.6.7. Otherwise the growing loop is broken off.
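
A condensed sketch of the growing step (8.2.5-8.2.6) in Python with NumPy. The applet tries the end with the worse node quality first; for brevity this sketch evaluates both candidate extensions and keeps the better improving one (fvu_of is assumed to evaluate the fraction of variance unexplained of a candidate chain; all names are illustrative):

    import numpy as np

    def grow_chain_ends(nodes, fvu_of):
        base = fvu_of(nodes)
        at_first = np.vstack([2 * nodes[0] - nodes[1], nodes])    # y_0 = 2y_1 - y_2
        at_last = np.vstack([nodes, 2 * nodes[-1] - nodes[-2]])   # y_{n+1} = 2y_n - y_{n-1}
        improving = [c for c in (at_first, at_last) if fvu_of(c) < base]
        if not improving:
            return None                                            # growing loop stops
        return min(improving, key=fvu_of)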

GalaSOM

GalaSOM is a universal method implementing some abilities not included in the student applet. GalaSOM has many options.

The "Neighbourhood radius" parameter is used to evaluate the "Neighborhood function".

The second parameter defines the growing procedure to use. The following values of this parameter are available:

None – map growing is disabled.

Line growing – the map grows by attaching copies of the outer edges to the border nodes.

Curve growing – the map grows by extension with the same curvature.

Expansive – the map grows by dividing the most expansive edge.

Loaded – the map grows by dividing the most loaded edge. The most loaded edge is the edge with the maximal number of data points associated with its nodes.

Border – the map grows by bisecting the outer edges.

The "Stop when fraction of variance unexplained percent is less than" parameter defines the threshold for the fraction of variance unexplained.

The "Use robust" checkbox switches the robust version of the algorithm on and off.

The "Data - Node interaction radius" parameter R is used for "Robust association of data points with nodes". This parameter is also called the robust radius.

The "Maximum number of nodes" parameter restricts the size of the map.

If the "Show error diagrams" checkbox is switched on, the graphical history window is displayed. There is no energy to minimize for SOM; consequently this parameter is not displayed in the window.

The batch SOM learning algorithm developed by Kohonen is used for learning.

The initial node locations are assigned before learning starts ???.

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

The maximum number of optimization steps is set to 20.

Map learning algorithm

1. The base variance σ_b is calculated ("Fraction of variance unexplained").

2. Before the loop starts, the sets of data points associated with nodes are set to empty (C_i = ∅, i = 1,…,n) and the step counter is set to zero.

3. Node locations are optimized ("SOM nodes location optimization").

4. If the growing procedure is None, the growing loop is broken off.

5. The fraction of variance unexplained fvu is evaluated.

6. If the fraction of variance unexplained is less than the user-defined threshold, the learning loop is broken off.

7. If the number of nodes equals the user-defined maximum number of nodes, the learning loop is broken off.

8. If the user presses the "Interrupt learning" button in the Learning Dialog, the learning loop is broken off.

9. Map growing.

9.1. If the map contains only one node:

9.1.1. Let the new node be empty: y_p = null.

9.1.2. If the robust version is used and non-associated points exist, then:

9.1.2.1. Execute "Robust best second node finding" with r = R.

9.1.2.2. If y_p = null, execute "Robust best second node finding" with r = 2R.

9.1.3. If y_p = null, then:

9.1.3.1. If the robust version is used, r = R.

9.1.3.2. Otherwise r = 50.

9.1.3.3. Execute "Best second node finding".

9.1.4. If y_p = null, growing is impossible and is broken off.

9.1.5. A new learning history record is created. The old and new nodes are saved into the record.

9.2. If the number of nodes is more than one, then:

9.2.1. If the growing procedure is "Line growing" or "Curve growing", then:

9.2.1.1. If the growing procedure is "Line growing" or the number of nodes is two (there is no data for extension with the same curvature), then:

9.2.1.1.1. The first edge extension coefficient k is evaluated:

9.2.1.1.1.1. Let d equal the distance between the first and the second nodes.

9.2.1.1.1.2. If the robust radius is less than the work desk size but greater than d, then k = R/d; otherwise k = 1.

9.2.1.1.2. The new node is calculated as y_0 = y_1 + k(y_1 − y_2).

9.2.1.1.3. The last edge extension coefficient k is evaluated:

9.2.1.1.3.1. Let d equal the distance between the last and the penultimate nodes.

9.2.1.1.3.2. If the robust radius is less than the work desk size but greater than d, then k = R/d; otherwise k = 1.

9.2.1.1.4. The new node is calculated as y_{n+1} = y_n + k(y_n − y_{n−1}).

9.2.1.2. If the growing procedure is "Curve growing" and the number of nodes is greater than two, then:

9.2.1.2.1. The first edge extension length l is evaluated:

9.2.1.2.1.1. Let l equal the first edge length.

9.2.1.2.1.2. If the robust radius is less than the work desk size but greater than l, then l = R.

9.2.1.2.2. The new node y_0 is calculated by executing "Map extension with same curvature" with z_0 = y_1, z_{−1} = y_2, z_{−2} = y_3.

9.2.1.2.3. The last edge extension length l is evaluated:

9.2.1.2.3.1. Let l equal the last edge length.

9.2.1.2.3.2. If the robust radius is less than the work desk size but greater than l, then l = R.

9.2.1.2.4. The new node y_{n+1} is calculated by executing "Map extension with same curvature" with z_0 = y_n, z_{−1} = y_{n−1}, z_{−2} = y_{n−2}.

9.2.1.3. The node to expand the map is selected. Let m_0 = 0, m_{n+1} = 0.

9.2.1.4. If non-associated data points exist, then m_0 = |{p ∈ E : ||x_p − y_0|| < R}|, m_{n+1} = |{p ∈ E : ||x_p − y_{n+1}|| < R}|.

9.2.1.5. If m_0 = m_{n+1} = 0, then m_0 = |C_0| and m_{n+1} = |C_{n+1}|, where the association is the result of executing "Association of data points with nodes".

9.2.1.6. If m_0 = m_{n+1} = 0, growing is impossible and is broken off.

9.2.1.7. A new learning history record is created. All old nodes are copied into the created record.

9.2.1.8. If m_0 > 0, node y_0 is included in the created record.

9.2.1.9. If m_{n+1} > 0, node y_{n+1} is included in the created record.

9.2.2. If the growing procedure is "Expansive", then:

9.2.2.1. A new learning history record is created.

9.2.2.2. The most expansive edge is determined.

9.2.2.3. A new node is created in the middle of the most expansive edge.

9.2.2.4. All nodes are copied into the created history record.

9.2.3. If the growing procedure is "Loaded", then:

9.2.3.1. A new learning history record is created.

9.2.3.2. The most loaded edge is determined. The load of edge (y_i, y_{i+1}) is |C_i| + |C_{i+1}|.

9.2.3.3. A new node is created in the middle of the most loaded edge.

9.2.3.4. All nodes are copied into the created history record.

9.2.4. If the growing procedure is "Border", then:

9.2.4.1. A new learning history record is created.

9.2.4.2. All old nodes are copied into the created record.

9.2.4.3. Two new nodes are created in the middles of the first and the last edges.

10. Go to step 2.

Nodes location optimization for elastic map

The node location optimization algorithm is the same for all elastic map learning methods. This method has four parameters:

"Stretching module" λ

"Bending module" μ

The "Data - Node interaction radius" parameter R is used for "Robust association of data points with nodes". This parameter is also called the robust radius.

The maximum number of optimization steps.

Node location optimization algorithm.

1. Data points are associated with nodes ("Robust association of data points with nodes"). If a non-robust modification is used (GalaEM), the algorithm described in "Association of data points with nodes" is used instead.

2. If all sets {C_i} evaluated at step 1 are equal to the sets evaluated at the previous iteration, learning stops.

3. A new record is added to the learning history to hold the new node locations.

4. The centroid is calculated for every node: X_i = 0 if C_i = ∅ and X_i = Σ_{j∈C_i} x_j if C_i ≠ ∅.

5. A new location is evaluated for every node.

5.1. If the map contains only one node, the new node location is evaluated as y_1 = X_1 / |C_1|.

5.2. If the map contains more than one node, then:

5.2.1. The matrix of the linear algebraic equation system is formed as the sum A = A_1 + λA_2 + μA_3, where the matrix A_1 corresponds to the data approximation term, A_2 to the edge stretching penalty, and A_3 to the edge bending penalty. The matrices follow from the corresponding energy terms and are formed by the following rules.

5.2.1.1. A_1 is the diagonal matrix with entries a_ii = |C_i| / m.

5.2.1.2. The filling of matrix A_2 depends on the number of nodes.

5.2.1.2.1. In the case of 2 nodes the matrix is

    (  1  −1 )
    ( −1   1 )

5.2.1.2.2. In the case of more than 2 nodes the matrix is tridiagonal, with diagonal (1, 2, 2, …, 2, 1) and −1 on both off-diagonals.

5.2.1.3. The filling of matrix A_3 depends on the number of nodes.

5.2.1.3.1. In the case of 2 nodes, A_3 is the zero matrix.

5.2.1.3.2. In the case of 3 nodes the matrix is

    (  1  −2   1 )
    ( −2   4  −2 )
    (  1  −2   1 )

5.2.1.3.3. In the case of 4 nodes the matrix is

    (  1  −2   1   0 )
    ( −2   5  −4   1 )
    (  1  −4   5  −2 )
    (  0   1  −2   1 )

5.2.1.3.4. In the case of more than 4 nodes the matrix is pentadiagonal, with diagonal (1, 5, 6, …, 6, 5, 1), first off-diagonals (−2, −4, …, −4, −2) and second off-diagonals equal to 1.

5.2.2. The right-hand side of the linear algebraic equation system Ay = b is b_i = X_i / m.

5.2.3. The linear algebraic equations are solved by the Gaussian elimination method.

5.2.4. The solution of the linear algebraic equations gives the new map node locations.

6. The step counter is increased.

7. If the step counter equals the maximum number of optimization steps, learning stops.

8. Calculation returns to step 1.
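
A sketch of one optimization pass for a chain in Python with NumPy. The stretching and bending matrices are built from first and second difference operators, which reproduces the matrices listed above; pass R = None for the non-robust version (names are illustrative):

    import numpy as np

    def elastic_map_step(points, nodes, lam, mu, R=None):
        m, n = len(points), len(nodes)
        d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
        owner = d2.argmin(axis=1)
        ok = np.ones(m, bool) if R is None else d2[np.arange(m), owner] <= R * R
        counts = np.bincount(owner[ok], minlength=n).astype(float)
        X = np.zeros_like(nodes)
        np.add.at(X, owner[ok], points[ok])              # centroid sums X_i
        A1 = np.diag(counts) / m
        D1 = np.diff(np.eye(n), axis=0)                  # first differences -> stretching
        D2 = np.diff(np.eye(n), 2, axis=0) if n > 2 else np.zeros((0, n))
        A = A1 + lam * (D1.T @ D1) + mu * (D2.T @ D2)
        return np.linalg.solve(A, X / m)                 # new node locations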

REM

The Robust Elastic Map (REM) method is customized by the following parameters, assigned before learning starts:

"Stretching module" λ

"Bending module" μ

The "Data - Node interaction radius" parameter R is used for "Robust association of data points with nodes". This parameter is also called the robust radius. Only the robust version of the method is used.

The initial node locations are assigned before learning starts. ???

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

Before learning starts, the sets of data points associated with nodes are set to empty (C_i = ∅, i = 1,…,n) and the step counter is set to zero.

For map learning, "Nodes location optimization for elastic map" is used with the maximum number of optimization steps equal to 100.

RGEM

The Robust Growing Elastic Map (RGEM) method is customized by the following parameters, assigned before learning starts:

"Stretching module" λ

"Bending module" μ

The "Data - Node interaction radius" parameter R is used for "Robust association of data points with nodes". This parameter is also called the robust radius. Only the robust version of the method is used.

The "Maximum number of nodes" parameter restricts the size of the map.

The "Stop when fraction of variance unexplained percent is less than" parameter defines the threshold for the fraction of variance unexplained.

The initial node locations are assigned before learning starts. ???

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

The maximum number of optimization steps is set to 20.

Map learning algorithm.

1. The base variance σ_b is calculated ("Fraction of variance unexplained").

2. Before the loop starts, the sets of data points associated with nodes are set to empty (C_i = ∅, i = 1,…,n) and the step counter is set to zero.

3. Node locations are optimized ("Nodes location optimization for elastic map").

4. The fraction of variance unexplained fvu is evaluated.

5. If the fraction of variance unexplained is less than the user-defined threshold, the learning loop is broken off.

6. If the number of nodes equals the user-defined maximum number of nodes, the learning loop is broken off.

7. Map growing.

7.1. If the map contains only one node:

7.1.1. Let the new node be empty: y_p = null.

7.1.2. If non-associated points exist, then:

7.1.2.1. Execute "Robust best second node finding" with r = R.

7.1.2.2. If y_p = null, execute "Robust best second node finding" with r = 2R.

7.1.3. If y_p = null, then r = R.

7.1.4. If y_p is still null, execute "Best second node finding".

7.1.5. If y_p = null, growing is impossible and is broken off.

7.1.6. A new learning history record is created. The old and new nodes are saved into the record.

7.2. If the map contains more than one node, candidate nodes are prepared for attaching new edges to the first and the last nodes of the map.

7.2.1. If the map contains only two nodes, linear attachments are prepared:

7.2.1.1. The edge stretching coefficient is evaluated:

7.2.1.1.1. If the robust radius is less than the work desk size, then k = R / ||y_1 − y_2||.

7.2.1.1.2. Otherwise k = 1.

7.2.1.1.3. If k < 1, then k = 1.

7.2.1.2. The new node locations are evaluated as y_0 = y_1 + k(y_1 − y_2), y_3 = y_2 + k(y_2 − y_1).

7.2.2. If the map contains more than two nodes, map extension with the same curvature is used.

7.2.2.1. The edge length to attach to the first node is calculated. If the robust radius is less than the work desk size and greater than the first edge length, then l_b = R; otherwise l_b = |y_2 − y_1|.

7.2.2.2. The new first map node y_0 is evaluated by "Map extension with same curvature" with nodes y_3, y_2, y_1 and length l_b.

7.2.2.3. The edge length to attach to the last node is calculated. If the robust radius is less than the work desk size and greater than the last edge length, then l_e = R; otherwise l_e = |y_n − y_{n−1}|.

7.2.2.4. The new last map node y_{n+1} is evaluated by "Map extension with same curvature" with nodes y_{n−2}, y_{n−1}, y_n and length l_e.

7.2.3. The best node to expand the map is chosen.

7.2.3.1. m_0 and m_{n+1} are the numbers of points associated with the new nodes. Initially both are zero.

7.2.3.2. If the set of non-associated points is not empty, m_0 and m_{n+1} are evaluated as m_0 = |{l : l ∈ E, ||x_l − y_0|| ≤ R}|, m_{n+1} = |{l : l ∈ E, ||x_l − y_{n+1}|| ≤ R}|.

7.2.3.3. If m_0 = m_{n+1} = 0, the values are evaluated as m_0 = |{l : l ∈ E, ||x_l − y_0|| ≤ ||x_l − y_i||, i = 1,…,n}|, m_{n+1} = |{l : l ∈ E, ||x_l − y_{n+1}|| ≤ ||x_l − y_i||, i = 1,…,n}|.

7.2.4. If m_0 = m_{n+1} = 0, growing is impossible and the growing loop is broken off.

7.2.5. A new learning history record is created.

7.2.6. If m_0 = m_{n+1} > 0, both new nodes and the old nodes are saved into the learning history record.

7.2.7. If m_0 > m_{n+1}, the new node y_0 and the old nodes are saved into the learning history record.

7.2.8. If m_0 < m_{n+1}, the new node y_{n+1} and the old nodes are saved into the learning history record.

8. Go to step 2.

GalaEM

GalaEM is a universal method implementing some elastic map abilities not included in the student applet. GalaEM has many options.

"Stretching module" λ

"Bending module" μ

The third parameter defines the growing procedure to use. The following values of this parameter are available:

None – map growing is disabled.

Line growing – the map grows by attaching copies of the outer edges to the border nodes.

Curve growing – the map grows by extension with the same curvature.

Expansive – the map grows by dividing the most expansive edge.

Loaded – the map grows by dividing the most loaded edge. The most loaded edge is the edge with the maximal number of data points associated with its nodes.

Border – the map grows by bisecting the outer edges.

The "Stop when fraction of variance unexplained percent is less than" parameter defines the threshold for the fraction of variance unexplained.

The "Use robust" checkbox switches the robust version of the algorithm on and off.

The "Data - Node interaction radius" parameter R is used for "Robust association of data points with nodes". This parameter is also called the robust radius.

The "Maximum number of nodes" parameter restricts the size of the map.

If the "Show error diagrams" checkbox is switched on, the graphical history window is displayed. The elastic map energy being minimized is calculated by the following formulas. If the map is not robust:

En = (1/m) Σ_{i=1..n} Σ_{x∈C_i} ||x − y_i||² + λ Σ_{i=2..n} ||y_i − y_{i−1}||² + μ Σ_{i=2..n−1} ||y_{i−1} − 2y_i + y_{i+1}||²

If the map is robust:

En = (1/m) (Σ_{i=1..n} Σ_{x∈C_i} ||x − y_i||² + |E|·R²) + λ Σ_{i=2..n} ||y_i − y_{i−1}||² + μ Σ_{i=2..n−1} ||y_{i−1} − 2y_i + y_{i+1}||²
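
A sketch of this energy in Python with NumPy for a chain of nodes; pass R to obtain the robust version with the |E|·R² penalty (names are illustrative):

    import numpy as np

    def elastic_energy(points, nodes, lam, mu, R=None):
        m = len(points)
        nearest = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        if R is None:
            data_term = nearest.sum()
        else:
            inside = nearest <= R * R
            data_term = nearest[inside].sum() + (~inside).sum() * R * R
        stretch = (np.diff(nodes, axis=0) ** 2).sum()                    # edge stretching
        bend = ((nodes[:-2] - 2 * nodes[1:-1] + nodes[2:]) ** 2).sum()   # edge bending
        return data_term / m + lam * stretch + mu * bend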

The initial node locations are assigned before learning starts ???.

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

The maximum number of optimization steps is set to 20.

Map learning algorithm

1. The base variance σ_b is calculated ("Fraction of variance unexplained").

2. Before the loop starts, the sets of data points associated with nodes are set to empty (C_i = ∅, i = 1,…,n) and the step counter is set to zero.

3. Node locations are optimized ("Nodes location optimization for elastic map").

4. If the growing procedure is None, the growing loop is broken off.

5. The fraction of variance unexplained fvu is evaluated.

6. If the fraction of variance unexplained is less than the user-defined threshold, the learning loop is broken off.

7. If the number of nodes equals the user-defined maximum number of nodes, the learning loop is broken off.

8. If the user presses the "Interrupt learning" button in the Learning Dialog, the learning loop is broken off.

9. Map growing.

9.1. If the map contains only one node:

9.1.1. Let the new node be empty: y_p = null.

9.1.2. If the robust version is used and non-associated points exist, then:

9.1.2.1. Execute "Robust best second node finding" with r = R.

9.1.2.2. If y_p = null, execute "Robust best second node finding" with r = 2R.

9.1.3. If y_p = null, then:

9.1.3.1. If the robust version is used, r = R.

9.1.3.2. Otherwise r = 50.

9.1.3.3. Execute "Best second node finding".

9.1.4. If y_p = null, growing is impossible and is broken off.

9.1.5. A new learning history record is created. The old and new nodes are saved into the record.

9.2. If the number of nodes is more than one, then:

9.2.1. If the growing procedure is "Line growing" or "Curve growing", then:

9.2.1.1. If the growing procedure is "Line growing" or the number of nodes is two (there is no data for extension with the same curvature), then:

9.2.1.1.1. The first edge extension coefficient k is evaluated:

9.2.1.1.1.1. Let d equal the distance between the first and the second nodes.

9.2.1.1.1.2. If the robust radius is less than the work desk size but greater than d, then k = R/d; otherwise k = 1.

9.2.1.1.2. The new node is calculated as y_0 = y_1 + k(y_1 − y_2).

9.2.1.1.3. The last edge extension coefficient k is evaluated:

9.2.1.1.3.1. Let d equal the distance between the last and the penultimate nodes.

9.2.1.1.3.2. If the robust radius is less than the work desk size but greater than d, then k = R/d; otherwise k = 1.

9.2.1.1.4. The new node is calculated as y_{n+1} = y_n + k(y_n − y_{n−1}).

9.2.1.2. If the growing procedure is "Curve growing" and the number of nodes is greater than two, then:

9.2.1.2.1. The first edge extension length l is evaluated:

9.2.1.2.1.1. Let l equal the first edge length.

9.2.1.2.1.2. If the robust radius is less than the work desk size but greater than l, then l = R.

9.2.1.2.2. The new node y_0 is calculated by executing "Map extension with same curvature" with z_0 = y_1, z_{−1} = y_2, z_{−2} = y_3.

9.2.1.2.3. The last edge extension length l is evaluated:

9.2.1.2.3.1. Let l equal the last edge length.

9.2.1.2.3.2. If the robust radius is less than the work desk size but greater than l, then l = R.

9.2.1.2.4. The new node y_{n+1} is calculated by executing "Map extension with same curvature" with z_0 = y_n, z_{−1} = y_{n−1}, z_{−2} = y_{n−2}.

9.2.1.3. The node to expand the map is selected. Let m_0 = 0, m_{n+1} = 0.

9.2.1.4. If non-associated data points exist, then m_0 = |{p ∈ E : ||x_p − y_0|| < R}|, m_{n+1} = |{p ∈ E : ||x_p − y_{n+1}|| < R}|.

9.2.1.5. If m_0 = m_{n+1} = 0, then m_0 = |C_0| and m_{n+1} = |C_{n+1}|, where the association is the result of executing "Association of data points with nodes".

9.2.1.6. If m_0 = m_{n+1} = 0, growing is impossible and is broken off.

9.2.1.7. A new learning history record is created. All old nodes are copied into the created record.

9.2.1.8. If m_0 > 0, node y_0 is included in the created record.

9.2.1.9. If m_{n+1} > 0, node y_{n+1} is included in the created record.

9.2.2. If the growing procedure is "Expansive", then:

9.2.2.1. A new learning history record is created.

9.2.2.2. The most expansive edge is determined.

9.2.2.3. A new node is created in the middle of the most expansive edge.

9.2.2.4. All nodes are copied into the created history record.

9.2.3. If the growing procedure is "Loaded", then:

9.2.3.1. A new learning history record is created.

9.2.3.2. The most loaded edge is determined. The load of edge (y_i, y_{i+1}) is |C_i| + |C_{i+1}|.

9.2.3.3. A new node is created in the middle of the most loaded edge.

9.2.3.4. All nodes are copied into the created history record.

9.2.4. If the growing procedure is "Border", then:

9.2.4.1. A new learning history record is created.

9.2.4.2. All old nodes are copied into the created record.

9.2.4.3. Two new nodes are created in the middles of the first and the last edges.

10. Go to step 2.

Nodes location optimization for principal tree

All principal tree methods use the node location optimization procedure at some point in their algorithms, so this procedure is described separately here. The parameters of the node location optimization procedure are listed below:

"Stretching module" λ

"Bending module" μ

"Scaling exponent" α

Whether the optimization history must be saved.

Optimization algorithm

1. Data points are associated with nodes.

1.1. If a non-robust modification is used, "Association of data points with nodes" is executed.

1.2. If a robust modification is used, "Robust association of data points with nodes" is executed.

2. If all sets {C_i} are empty, the algorithm stops.

3. If all sets {C_i} evaluated at step 1 are equal to the sets evaluated at the previous iteration, learning stops.

4. The step counter is increased.

5. If the step counter equals the maximum number of optimization steps, learning stops.

6. If history saving is required, a new learning history record is created to hold the new node locations; otherwise a pseudo history record is created.

7. Centroids are calculated for every node: X_i = 0 if C_i = ∅ and X_i = Σ_{j∈C_i} x_j if C_i ≠ ∅.

8. A new location is evaluated for every node.

8.1. If the map contains only one node, the new node location is evaluated as y_1 = X_1 / |C_1|.

8.2. If the map contains more than one node, then:

8.2.1. The matrix of the linear algebraic equation system is formed as the sum A = A_1 + λA_2 + μA_3, where the matrix A_1 corresponds to the data approximation term, A_2 to the edge stretching penalty, and A_3 to the bending penalty. The following rules are used to form the matrices.

8.2.1.1. A_1 is the diagonal matrix with entries a_ii = |C_i| / m.

8.2.1.2. Matrix A_2 is calculated by the following algorithm.

8.2.1.2.1. All elements of matrix A_2 are set to zero: a_ij = 0.

8.2.1.2.2. For every node y_i and every j ∈ L_i with j > i, the following changes are made in matrix A_2:

8.2.1.2.2.1. a_ii ← a_ii + 1

8.2.1.2.2.2. a_jj ← a_jj + 1

8.2.1.2.2.3. a_ij ← a_ij − 1

8.2.1.2.2.4. a_ji ← a_ji − 1

8.2.1.3. Matrix A_3 is calculated by the following algorithm.

8.2.1.3.1. All elements of matrix A_3 are set to zero: a_ij = 0.

8.2.1.3.2. For every node y_i with |L_i| > 1 the following procedure is executed:

8.2.1.3.2.1. The diagonal element is changed: a_ii ← a_ii + |L_i|^α

8.2.1.3.2.2. For every j ∈ L_i, the elements of the i-th row are changed: a_ij ← a_ij − |L_i|^(α−1)

8.2.1.3.2.3. For every j ∈ L_i, the elements of the j-th row are changed:

8.2.1.3.2.3.1. a_ji ← a_ji − |L_i|^(α−1)

8.2.1.3.2.3.2. For every k ∈ L_i: a_jk ← a_jk + |L_i|^(α−2)

8.2.2. The right-hand side of the linear algebraic equation system Ay = b is b_i = X_i / m.

8.2.3. The linear algebraic equations are solved by the Gaussian elimination method.

8.2.4. The solution of the linear algebraic equations gives the new map node locations.

9. Calculation returns to step 1.
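
A sketch of the assembly of A_2 and A_3 in Python with NumPy, following steps 8.2.1.2 and 8.2.1.3 (edges is assumed to be a list of (i, j) pairs with i < j, and stars maps every node index i with |L_i| > 1 to its neighbor list L_i; names are illustrative):

    import numpy as np

    def tree_penalty_matrices(n, edges, stars, alpha):
        A2 = np.zeros((n, n))
        for i, j in edges:                    # each edge adds a Laplacian block
            A2[i, i] += 1; A2[j, j] += 1
            A2[i, j] -= 1; A2[j, i] -= 1
        A3 = np.zeros((n, n))
        for i, L in stars.items():            # each star centered at node i
            s = len(L)
            A3[i, i] += s ** alpha
            for j in L:
                A3[i, j] -= s ** (alpha - 1)
                A3[j, i] -= s ** (alpha - 1)
                for k in L:
                    A3[j, k] += s ** (alpha - 2)
        return A2, A3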

RPT

The Robust Principal Tree (RPT, topological grammar) method uses the growing grammar only. The method has the following parameters:

"Stretching module" λ

"Bending module" μ

The "Data - Node interaction radius" parameter R is used for "Robust association of data points with nodes". This parameter is also called the robust radius.

The "Stop when fraction of variance unexplained percent is less than" parameter defines the threshold for the fraction of variance unexplained.

The "Maximum number of nodes" parameter restricts the size of the map.

The initial node locations are assigned before learning starts ???.

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

"Nodes location optimization for principal tree" is used in some algorithm steps. This method uses the robust version only. The "Scaling exponent" is set to zero. The maximum number of optimization steps is set to 20.

The principal tree energy being minimized is calculated as En = (1/m) (Σ_{i=1..n} Σ_{x∈C_i} ||x − y_i||² + |E|·R²) + λ Σ_{i=1..n} Σ_{j∈L_i, j>i} ||y_i − y_j||² + μ Σ_{i=1..n, |L_i|>1} ||y_i − (1/|L_i|) Σ_{j∈L_i} y_j||²

Tree learning algorithm.

1. The base variance σ_b is calculated ("Fraction of variance unexplained").

2. Node optimization with history saving is executed.

3. The fraction of variance unexplained fvu is evaluated.

4. If the fraction of variance unexplained is less than the user-defined threshold, the learning loop is broken off.

5. If the number of nodes equals the user-defined maximum number of nodes, the learning loop is broken off.

6. If the user presses the "Interrupt learning" button in the Learning Dialog, the learning loop is broken off.

7. Loop of tree structure changing.

7.1. If the map contains only one node:

7.1.1. Let the new node be empty: y_p = null.

7.1.2. If non-associated points exist, then:

7.1.2.1. Execute "Robust best second node finding" with r = R.

7.1.2.2. If y_p = null, execute "Robust best second node finding" with r = 2R.

7.1.3. If y_p = null, then r = R and "Best second node finding" is executed.

7.1.4. If y_p = null, growing is impossible and is broken off.

7.1.5. A new learning history record is created. The old and new nodes are saved into the record.

7.2. If the number of nodes is more than one, then:

7.2.1. The optimal energy value is set to ∞, the rule number is set to zero, and the best node number is set to zero.

7.2.2. The "Add node and edge" rule is applied. The following procedure is executed for every node y_i:

7.2.2.1. A new node is attached to node y_i so that the bending energy of the new star of y_i is zero: y_new = (|L_i| + 1) y_i − Σ_{j∈L_i} y_j.

7.2.2.2. Node optimization without history saving is executed.

7.2.2.3. The minimized energy value is evaluated for the tree resulting from step 7.2.2.2.

7.2.2.4. If the evaluated energy value is less than the optimal energy value, the optimal energy value is set to the evaluated value, the best node number is set to i, and the rule number is set to 1.

7.2.3. The "Add node by edge cutting" rule is applied. The following procedure is executed for every edge:

7.2.3.1. A new node is added to the middle of the edge.

7.2.3.2. Node optimization without history saving is executed.

7.2.3.3. The minimized energy value is evaluated for the tree resulting from step 7.2.3.2.

7.2.3.4. If the evaluated energy value is less than the optimal energy value, the optimal energy value is set to the evaluated value, the best node numbers are set to the nodes of the edge, and the rule number is set to 2.

7.2.4. A new learning history record is created.

7.2.5. If the rule number is 1, the best node is added. The old and new nodes are saved into the record.

7.2.6. If the rule number is 2, a new node is added to the middle of the best nodes' edge. The old and new nodes are saved into the record.

8. Go to step 2

GalaPT

GalaPT is the universal principal tree (topological grammar) method, including many abilities excluded from the student applet. GalaPT has many parameters.

"Stretching module" λ

"Bending module" μ

"Scaling exponent" α

The "Scaling exponent" α defines the star weighting factor in "Nodes location optimization for principal tree" and in the energy being minimized.

The "Growing grammar" group of checkboxes defines the growing grammar rules to use.

The "Add node and edge" checkbox switches on/off the rule that adds a new node to an existing node.

The "Add node by edge cutting" checkbox switches on/off the edge division rule.

The "Shrinking grammar" group of checkboxes defines the shrinking grammar rules to use.

The "Delete outside node" checkbox switches on/off the rule that removes an outer node.

The "Delete edge by nodes union" checkbox switches on/off the edge removal rule. When an edge is removed, its nodes (stars) are united.

The "Using frequency" group of fields controls how often the growing and shrinking grammars are used: in every loop, first some steps (upper field) of the growing grammar are performed, then some steps (lower field) of the shrinking grammar.

The "Stop when fraction of variance unexplained percent is less than" parameter defines the threshold for the fraction of variance unexplained.

The "Use robust" checkbox switches the robust version of the algorithm on and off.

The "Data - Node interaction radius" parameter R is used for "Robust association of data points with nodes". This parameter is also called the robust radius.

The "Maximum number of nodes" parameter restricts the size of the map.

If the "Show error diagrams" checkbox is switched on, the graphical history window is displayed. The principal tree energy being minimized is calculated by the following formulas. If the tree is not robust:

En = (1/m) Σ_{i=1..n} Σ_{x∈C_i} ||x − y_i||² + λ Σ_{i=1..n} Σ_{j∈L_i, j>i} ||y_i − y_j||² + μ Σ_{i=1..n, |L_i|>1} ||y_i − (1/|L_i|) Σ_{j∈L_i} y_j||²

If the tree is robust:

En = (1/m) (Σ_{i=1..n} Σ_{x∈C_i} ||x − y_i||² + |E|·R²) + λ Σ_{i=1..n} Σ_{j∈L_i, j>i} ||y_i − y_j||² + μ Σ_{i=1..n, |L_i|>1} ||y_i − (1/|L_i|) Σ_{j∈L_i} y_j||²

The initial node locations are assigned before learning starts ???.

All learning steps are saved into the "Learning history".

If learning starts after a previous learning run, the last record of the "Learning history" is used instead of the user-defined initial node locations. Otherwise the user-defined initial node locations are used and saved into the first learning history record.

The maximum number of optimization steps is set to 20.

Tree learning algorithm

1. The base variance σ_b is calculated ("Fraction of variance unexplained").

2. The growing indicator is set to true (the growing rules are executed first). The tree changing operation counter is set to the user-defined number of growing grammar steps.

3. Node optimization with history saving is executed.

4. The fraction of variance unexplained is evaluated.

5. If the fraction of variance unexplained is less than the user-defined threshold, the learning loop is broken off.

6. If the number of nodes equals the user-defined maximum number of nodes, the learning loop is broken off.

7. If the user presses the "Interrupt learning" button in the Learning Dialog, the learning loop is broken off.

8. Loop of tree structure changing.

8.1. If the map contains only one node:

8.1.1. Let the new node be empty: y_p = null.

8.1.2. If the robust version is used and non-associated points exist, then:

8.1.2.1. Execute "Robust best second node finding" with r = R.

8.1.2.2. If y_p = null, execute "Robust best second node finding" with r = 2R.

8.1.3. If y_p = null, then:

8.1.3.1. If the robust version is used, r = R.

8.1.3.2. Otherwise r = 50.

8.1.3.3. Execute "Best second node finding".

8.1.4. If y_p = null, growing is impossible and is broken off.

8.1.5. A new learning history record is created. The old and new nodes are saved into the record.

8.2. If the number of nodes is more than one, then:

8.2.1. The optimal energy value is set to ∞, the rule number is set to zero, and the best node number is set to zero.

8.2.2. If the growing indicator is true, then:

8.2.2.1. If the "Add node and edge" checkbox is switched on, the following procedure is executed for every node y_i:

8.2.2.1.1. A new node is attached to node y_i so that the bending energy of the new star of y_i is zero: y_new = (|L_i| + 1) y_i − Σ_{j∈L_i} y_j.

8.2.2.1.2. Node optimization without history saving is executed.

8.2.2.1.3. The minimized energy value is evaluated for the tree resulting from step 8.2.2.1.2.

8.2.2.1.4. If the evaluated energy value is less than the optimal energy value, the optimal energy value is set to the evaluated value, the best node number is set to i, and the rule number is set to 1.

8.2.2.2. If the "Add node by edge cutting" checkbox is switched on, the following procedure is executed for every edge:

8.2.2.2.1. A new node is added to the middle of the edge.

8.2.2.2.2. Node optimization without history saving is executed.

8.2.2.2.3. The minimized energy value is evaluated for the tree resulting from step 8.2.2.2.2.

8.2.2.2.4. If the evaluated energy value is less than the optimal energy value, the optimal energy value is set to the evaluated value, the best node numbers are set to the nodes of the edge, and the rule number is set to 2.

8.2.2.3. A new learning history record is created.

8.2.2.4. If the rule number is 1, the best node is added. The old and new nodes are saved into the record.

8.2.2.5. If the rule number is 2, a new node is added to the middle of the best nodes' edge. The old and new nodes are saved into the record.

8.2.3. If the growing indicator is false, then:

8.2.3.1. If the "Delete outside node" checkbox is switched on, the following procedure is executed for every node y_i with |L_i| = 1:

8.2.3.1.1. The node y_i is removed.

8.2.3.1.2. Node optimization without history saving is executed.

8.2.3.1.3. The minimized energy value is evaluated for the tree resulting from step 8.2.3.1.2.

8.2.3.1.4. If the evaluated energy value is less than the optimal energy value, the optimal energy value is set to the evaluated value, the best node number is set to i, and the rule number is set to 3.

8.2.3.2. If the "Delete edge by nodes union" checkbox is switched on, the following procedure is executed for every edge connecting two non-outer nodes y_i and y_j (i ≠ j):

8.2.3.2.1. All nodes connected with y_j are connected with y_i. The node y_j is removed.

8.2.3.2.2. Node optimization without history saving is executed.

8.2.3.2.3. The minimized energy value is evaluated for the tree resulting from step 8.2.3.2.2.

8.2.3.2.4. If the evaluated energy value is less than the optimal energy value, the optimal energy value is set to the evaluated value, the best node numbers are set to the nodes of the edge, and the rule number is set to 4.

8.2.3.3. A new learning history record is created.

8.2.3.4. If the rule number is 3, the best node is removed. The kept nodes are copied into the record.

8.2.3.5. If the rule number is 4, the best nodes' edge is removed. The kept nodes are copied into the record.

8.2.4. The tree changing operation counter is decreased by 1. If it equals zero, then:

8.2.4.1. The growing indicator is inverted.

8.2.4.2. If the growing indicator is true, the tree changing operation counter is set to the user-defined number of growing grammar steps.

8.2.4.3. If the growing indicator is false, the tree changing operation counter is set to the user-defined number of shrinking grammar steps.

8.2.4.4. If the tree changing operation counter equals zero (this is possible when the user-defined number of shrinking grammar steps is zero), the growing indicator is set to true and the counter is set to the user-defined number of growing grammar steps.

9. Go to step 3

Brief user guide

The applet contains several menu panels (top left), the University of Leicester logo (top right), the work desk (center) and the label of the Attribution 3.0 Creative Commons publication license (bottom).

To visit the website of the University of Leicester, click on the University of Leicester logo.

To read the Creative Commons publication license, click on the bottom license panel.

Data set

The first menu panel allows you to create and edit the data set. Every data point is displayed as a small circle on the work desk.
The first part of the Data set panel contains one big coloured rectangle and a palette built from six small differently coloured rectangles. To select one of the six colours, click on a small rectangle. The big coloured rectangle displays the currently selected colour. In this applet, this is just the colour of the cursor. The colours are used for data points in the applets K-means and K-medoids and KNN and Potential Energy to distinguish points from different classes and clusters.
The third part of the Data set panel contains six buttons. The first three buttons change the type of cursor brush.
  • The button One point switches the brush to add single points. Every mouse click on the work desk will add a point to the data set.
  • The button Scatter allows you to add several points with one mouse click. You can choose the number of added points in the Number of points spinner in the middle part of the menu panel. Points are scattered randomly in a circle whose radius is determined by the Caliber slider in the middle part of the menu panel.
  • The button Erase switches the mouse cursor to the eraser. When you click the eraser cursor on the work desk, all points whose centres are covered by the cursor are removed from the work desk.
  • The button Select is not used in this applet.
  • When you press the Random button, several points are added to the work desk. The number of points is defined by the Number of points spinner. The locations of the new points are generated randomly over the whole work desk.
  • The button Clear all completely clears the work desk. It removes all kinds of objects: data points, centroids, test results, maps and so on.
Common parts of the menu panel for all methods

The menu panel of every method contains two common parts.

The right part, Learning history, contains five buttons, one field for data input and two information fields.
  • The leftmost button shows the first step. The second button moves to the previous step.
  • The third button starts the slide show with the delay between steps selected in the editable field Delay (ms). It always goes from the first step to the last one.
  • The fourth button moves to the next step. The rightmost button shows the last step. The information field Step # shows the number of the current step.
  • The information field from indicates the total number of learning steps in the learning history.

The left part of all methods' menu panels contains the name of the method, two fields showing the number of nodes, and six standard buttons.

  • The Random init button serves to add a coding vector at a randomly selected data point to the current set of coding vectors.
  • The PCA init button serves to add a new coding vector on the first principal component line. All coding vectors are distributed equidistantly on the first principal component line. The variance of this equidistribution on a segment is selected equal to the total variance of the data points.
  • The Remove initialization button removes all nodes.
  • The Options button opens the options dialog for customizing method parameters. This dialog is specific to every method.
  • The Learn button starts the learning process, which is specific to every method.
  • The Show error/Hide error button shows/hides the tableau containing the names of the methods and the corresponding errors (fraction of variance unexplained). The tableau is located at the bottom right corner of the work desk.
PCA

Use the menu panel PCA to work with principal component analysis.

For PCA, all points and lines are in a blue colour. The points for PCA have the form of four-ray stars (see figure). To start PCA learning, two points are needed. You can select initial points by mouse clicks at the chosen locations, or with the Random init button. When you try to add a third point, the first point is removed.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results. You will see one four-ray star, the centroid of the data set, and one line, the current approximation of the first principal component.

SOM

Use the menu panel SOM to work with self-organizing maps.

For SOM, all points and lines are in a red colour. The points for SOM have a square form (see figure). To start SOM learning, the initial positions of all coding vectors are needed. To add a point to an initial approximation, click the mouse button at the chosen location, or use the Random init button or the PCA init button. When you add a new point, it is attached to the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

GSOM

Use the menu panel GSOM to work with the growing self-organizing map.

For GSOM, all points and lines are in a brown colour. The points for GSOM have a square form (see figure). To start learning, GSOM needs one initial point. To place the initial point, click the mouse button at the chosen location, or push the Random init button or the PCA init button. When you add a new point, the new point is joined with the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

GalaSOM

Use the menu panel GalaSOM to work with the universal growing self-organizing map.

For GalaSOM, all points and lines are in a green colour. The points for GalaSOM have a square form (see figure). To start learning, GalaSOM needs one initial point. To place the initial point, click the mouse button at the chosen location, or push the Random init button or the PCA init button. When you add a new point, the new point is joined with the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

REM

Use the menu panel REM to work with the robust elastic map.

For REM, all points and lines are in a red colour. The points for REM have the form of a triangle (see figure). To start REM learning, the initial positions of all coding vectors are needed. To add a point to an initial approximation, click the mouse button at the chosen location, or use the Random init button or the PCA init button. When you add a new point, it is attached to the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

RGEM

Use the menu panel RGEM to work with the robust growing elastic map.

For RGEM, all points and lines are in a brown colour. The points for RGEM have the form of a triangle (see figure). To start learning, RGEM needs one initial point. To place the initial point, click the mouse button at the chosen location, or push the Random init button or the PCA init button. When you add a new point, the new point is joined with the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

GalaEM

Use the menu panel GalaEM to work with the universal growing elastic map.

For GalaEM, all points and lines are in a green colour. The points for GalaEM have the form of a triangle (see figure). To start learning, GalaEM needs one initial point. To place the initial point, click the mouse button at the chosen location, or push the Random init button or the PCA init button. When you add a new point, the new point is joined with the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

RPT

Use the menu panel RPT to work with the robust principal tree.

For RPT, all points and lines are in a brown colour. The points for RPT have the form of a circle (see figure). To start learning, RPT needs one initial point. To place the initial point, click the mouse button at the chosen location, or push the Random init button or the PCA init button. When you add a new point, the new point is joined with the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

GalaPT

Use the menu panel GalaPT to work with the universal principal tree.

For GalaPT, all points and lines are in a green colour. The points for GalaPT have the form of a circle (see figure). To start learning, GalaPT needs one initial point. To place the initial point, click the mouse button at the chosen location, or push the Random init button or the PCA init button. When you add a new point, the new point is joined with the last added point.

After learning is finished, use the Learning history buttons at the right part of the menu panel to look at the learning history and results.

Std. data

The menu panels Std. data (Std. data 2, Std. data 3) are needed to work with some standard data sets.

  • The first button Clear all completely clears the work desk. It removes all kinds of objects: data points, coding vectors and so on.
  • The button Scatter serves to scatter the existing data points. The scattering procedure adds a specified number of new data points in a specified circle around every existing data point. The options of scattering are defined in a dialog (see figure).
  • The Number of points to add parameter defines the number of points to add to every data point.
  • The Scattering radius parameter defines the radius of the circle around each data point in which the new data points are equidistributed.
  • The last button Save data saves all data point locations to a text file named Data for mirkes.txt. In order to add this data set to the Std. data panel, send this file with a short description by e-mail to Evgeny Mirkes, emmirkes@gmail.com. Warning: to use the button Save data you should change the security settings of your browser, or start this applet off-line without a browser.
  • Any other button sends a stored ("standard") image to the work desk.