OT learning 5. Learning a stochastic grammar

Having shown that the algorithm can learn deep obligatory rankings, we will now see that it also performs well in replicating the variation in the language environment.

Create a place assimilation grammar as described in §2.6, and set all its rankings to 100.000:

 ranking value disharmony plasticity
 *GESTURE 100 100 1
 *REPLACE (t, p) 100 100 1
 *REPLACE (n, m) 100 100 1
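Under stochastic evaluation, each of these ranking values is perturbed by Gaussian noise at every evaluation to yield the disharmony actually used for ranking. A minimal sketch in Python (not Praat script; the noise standard deviation of 2.0 is Praat's default evaluation noise, not something stated in this section):

```python
import random

random.seed(1)
EVALUATION_NOISE = 2.0  # Praat's default evaluation noise

def disharmonies(rankings):
    """One noisy evaluation: each constraint's disharmony is its
    ranking value plus fresh Gaussian noise."""
    return {c: r + random.gauss(0.0, EVALUATION_NOISE)
            for c, r in rankings.items()}

rankings = {"*GESTURE": 100.0, "*REPLACE (t, p)": 100.0,
            "*REPLACE (n, m)": 100.0}
print(disharmonies(rankings))  # a fresh ranking order on every call
```

Because all three constraints start at the same ranking value, every noisy evaluation can produce a different constraint order, which is why the learner initially produces both variants of each input.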

Create a place assimilation distribution and generate 1000 string pairs (§3.1). Select the grammar and the two Strings objects, and learn with a plasticity of 0.1:

 ranking value disharmony plasticity
 *REPLACE (t, p) 104.54 103.14 1
 *REPLACE (n, m) 96.214 99.321 1
 *GESTURE 99.246 97.861 1
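The drift away from the initial 100-point values comes from the GLA's error-driven update: whenever the learner's own output differs from the adult datum, ranking values move by the plasticity. A sketch of the update rule (the symmetric-all variant; the violation dicts below are an illustrative example, not Praat output):

```python
def gla_update(rankings, viol_correct, viol_learner, plasticity=0.1):
    """One error-driven GLA step (symmetric-all variant): after a
    mismatch, demote every constraint that penalizes the adult form
    more than the learner's form, and promote every constraint that
    does the reverse."""
    for c in rankings:
        if viol_correct[c] > viol_learner[c]:
            rankings[c] -= plasticity   # it punished the correct form: demote
        elif viol_correct[c] < viol_learner[c]:
            rankings[c] += plasticity   # it punished the wrong form: promote

# Example: the datum was [ampa] (violating *REPLACE (n, m)) but the
# learner produced [anpa] (violating *GESTURE):
rankings = {"*GESTURE": 100.0, "*REPLACE (n, m)": 100.0}
gla_update(rankings,
           viol_correct={"*GESTURE": 0, "*REPLACE (n, m)": 1},
           viol_learner={"*GESTURE": 1, "*REPLACE (n, m)": 0})
print(rankings)  # *GESTURE up by 0.1, *REPLACE (n, m) down by 0.1
```

Repeated over 1000 string pairs, these small opposite nudges separate the constraints by just enough to reproduce the variation in the data.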

The output distributions are now (using OTGrammar: To output Distributions..., see §2.9):

 /an+pa/ → anpa 14.3%
 /an+pa/ → ampa 85.7%
 /at+ma/ → atma 96.9%
 /at+ma/ → apma 3.1%
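These percentages follow directly from the ranking values: [ampa] surfaces exactly when *GESTURE's noise-perturbed disharmony exceeds that of *REPLACE (n, m). A Monte Carlo check (a sketch assuming independent Gaussian noise of 2.0 per constraint, Praat's default) reproduces the 85.7% figure:

```python
import random

random.seed(1)
NOISE = 2.0  # Praat's default evaluation noise

def p_outranks(ranking_a, ranking_b, n=100_000):
    """Monte Carlo estimate of how often A's noisy disharmony
    exceeds B's, with independent noise drawn for each."""
    wins = sum(ranking_a + random.gauss(0, NOISE) >
               ranking_b + random.gauss(0, NOISE)
               for _ in range(n))
    return wins / n

# [ampa] wins whenever *GESTURE outranks *REPLACE (n, m) at evaluation:
p = p_outranks(99.246, 96.214)
print(round(p, 3))  # close to the 85.7% reported above
```

Analytically this is Φ((99.246 − 96.214) / (2√2)) ≈ 0.858, since the difference of two independent noises has standard deviation 2√2.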

After another 10,000 new string pairs, we have:

 ranking value disharmony plasticity
 *REPLACE (t, p) 106.764 107.154 1
 *GESTURE 97.899 97.161 1
 *REPLACE (n, m) 95.337 96.848 1

With the following output distributions (measured with a million draws):

 /an+pa/ → anpa 18.31%
 /an+pa/ → ampa 81.69%
 /at+ma/ → atma 99.91%
 /at+ma/ → apma 0.09%

The error rate is acceptably low, but the accuracy in reproducing the 80% - 20% distribution could be better. This is because the relatively high plasticity of 0.1 allows only a coarse approximation. So we lower the plasticity to 0.001 and supply 100,000 new string pairs:

 ranking value disharmony plasticity
 *REPLACE (t, p) 106.81 107.184 1
 *GESTURE 97.782 99.682 1
 *REPLACE (n, m) 95.407 98.76 1

With the following output distributions:

 /an+pa/ → anpa 20.08%
 /an+pa/ → ampa 79.92%
 /at+ma/ → atma 99.94%
 /at+ma/ → apma 0.06%

So, besides learning obligatory rankings as a child does, the algorithm can also replicate the probabilities of the environment very well. This means that a GLA learner can learn stochastic grammars.
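The whole experiment can be condensed into a toy simulation (a from-scratch sketch, not Praat's implementation: the tableaus mimic the place assimilation grammar of §2.6, the target mimics the 80% - 20% pair distribution of §3.1, and the training schedule mimics the plasticities used above):

```python
import random

random.seed(1)
NOISE = 2.0  # Praat's default evaluation noise

# Each candidate violates exactly one constraint, as in the place
# assimilation grammar of §2.6:
TABLEAUS = {
    "an+pa": {"anpa": "*GESTURE", "ampa": "*REPLACE (n, m)"},
    "at+ma": {"atma": "*GESTURE", "apma": "*REPLACE (t, p)"},
}
# The 80% - 20% pair distribution of §3.1:
TARGET = {"an+pa": (["anpa", "ampa"], [0.2, 0.8]),
          "at+ma": (["atma", "apma"], [1.0, 0.0])}

rankings = {"*GESTURE": 100.0, "*REPLACE (t, p)": 100.0,
            "*REPLACE (n, m)": 100.0}

def produce(inp):
    """Stochastic evaluation: the winner is the candidate whose violated
    constraint drew the lower noisy disharmony."""
    dis = {c: r + random.gauss(0, NOISE) for c, r in rankings.items()}
    return min(TABLEAUS[inp], key=lambda cand: dis[TABLEAUS[inp][cand]])

def learn(n_data, plasticity):
    """Error-driven GLA: on a mismatch, demote the constraint violated
    by the datum and promote the one violated by the learner's winner."""
    for _ in range(n_data):
        inp = random.choice(list(TARGET))
        forms, probs = TARGET[inp]
        datum = random.choices(forms, probs)[0]
        winner = produce(inp)
        if winner != datum:
            rankings[TABLEAUS[inp][datum]] -= plasticity
            rankings[TABLEAUS[inp][winner]] += plasticity

learn(11_000, 0.1)      # the two high-plasticity phases above
learn(100_000, 0.001)   # the low-plasticity fine-tuning phase

draws = 100_000
ampa = sum(produce("an+pa") == "ampa" for _ in range(draws)) / draws
print(round(ampa, 2))  # near the 80% target
```

As in the Praat run, probability matching emerges: the learner's output frequencies approach those of its environment, while *REPLACE (t, p) climbs safely above *GESTURE so that [apma] all but disappears.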