The Learning Curve

Xander Barel BSc

17-09-2017

Introduction

In this small experiment I have tracked my progression over time on a ball-bouncing task. The task was recently shown by Ido Portal in the build-up to the Conor McGregor fight. It is a task where one hits a ball against a wall with a fist.

For the first hundred attempts, I have recorded how many successive hits I could manage before I lost control over the ball and I needed to restart.

From this data, graphs can be created that describe the skill development over time. These graphs represent a learning curve, and may lift the veil slightly on the nature of development. This interests me, as I study both learning processes and motor development.

For people not interested in reading statistics, but purely the outcome, I have created a TL;DR version.

Method and expectations

No research holds much value if it cannot be replicated. For this we need a clear description of the task.

“Hit a tennis ball against the wall with your fist. Count how many times the ball can be hit against the wall before one loses control of the ball. Loss of control happens when the ball ends up still or rolling.”

The size of each attempt is counted and noted, which creates a data set of 100 attempts, each with a value representing the number of successive wall bounces. This data is then analysed statistically.

With our level of analysis, five hypothetical scenarios can be expected with regard to the nature of development.

The first is the null hypothesis: there is no change in either accuracy or consistency, and one would see no difference.

If there is a difference, the other four scenarios can be described as follows.

Scenario A: Accuracy up, consistency up.

This would be the “ideal” case, where the reliability of the skill improves as the skill level improves. Like a boxer who gets hit less often as the rounds go on, and whose improvement is stable from round to round.

Scenario B: Accuracy down, consistency up.

This is a case where accuracy ‘trades’ for reliability. A real-life example would be a tennis player who hits a good serve less often, and this happens every set. Someone who gets gradually worse as a game drags on. Fatigue will do this.

Scenario C: Accuracy up, consistency down.

This is a case where the skill improves, but as the skill level goes up, the likelihood of success drops. This is a boxer who hits way harder as the rounds go on, but misses more too. Sudden, quick and unstable improvement.

Scenario D: Accuracy down, consistency down.

This is a case where skill drops, but not by the same amount every round. A boxer who starts swinging for the fences: the chance of landing a hit drops, but some rounds can be very successful. Performance drops in an unpredictable manner. An excellent scenario for sports gambling.

In this article we will investigate what scenario corresponds with our data.

The data set consists of the number of successive hits in each of the first 100 attempts. For the sake of visual analysis, I have combined the attempts into sets of 5. This yields 20 sets of 5 attempts. From these, a set mean score, variance, consistency and accuracy can be calculated.

Set mean score is the sum of hits in a set divided by the number of attempts in that set.

(total hits in five attempts) : (set size) = set mean score
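A minimal Python sketch of this grouping, assuming the attempt sizes are kept in a plain list. The names `attempts`, `split_into_sets` and `set_mean_scores` are made up for illustration, and the numbers are placeholders, not the recorded data.

```python
def split_into_sets(attempts, set_size=5):
    """Cut the list of attempt sizes into consecutive sets of `set_size`."""
    return [attempts[i:i + set_size] for i in range(0, len(attempts), set_size)]


def set_mean_scores(attempts, set_size=5):
    """Set mean score = total hits in a set divided by the set size."""
    return [sum(s) / len(s) for s in split_into_sets(attempts, set_size)]


# Placeholder values for ten attempts (two sets), not the measured data.
attempts = [1, 2, 0, 3, 4, 5, 7, 6, 9, 8]
print(set_mean_scores(attempts))  # [2.0, 7.0]
```

With the full 100 attempts the same call would return 20 set mean scores, one per point in the graphs.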

From this set mean score you can see how the ability to hit the ball against the wall improves. If someone becomes better at hitting a ball against a wall, you would expect the set mean score to rise as more attempts are made. This leads to the following statement:

“If ability increases, set mean score should increase as the number of attempts increases.”

Ability can be divided into several factors. For this analysis I have decided to pick two, namely accuracy and consistency. These are not the same: accuracy describes how big the attempts are and is a measure of ‘success’, while consistency describes how stable the improvement is. One can be consistently bad, or inconsistently good, for instance.

Set variance is, for each set, the summed distance between each attempt in the set and that set's mean score. This gives an absolute measure of inconsistency. If a set mean score comes from scores ranging from very high to very low, which we can call low consistency, variance will be high. If the scores are very similar to the mean, which we call high consistency, variance will be low.

This leads to the following statement:

“If consistency increases, variance should drop as the number of attempts increases.”
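The set variance described above can be sketched in Python as follows. I am assuming here that “distance” means the absolute difference between an attempt and the set mean; note that this differs from the textbook variance, which uses squared differences. The function name `set_variance` and both example sets are invented for illustration.

```python
def set_variance(scores):
    """Sum of absolute distances between each attempt and the set mean score.

    Assumption: 'distance' is the absolute difference, not the squared one.
    """
    mean = sum(scores) / len(scores)
    return sum(abs(x - mean) for x in scores)


# Two made-up sets with the same set mean score of 5.
consistent = [4, 5, 5, 5, 6]
erratic = [0, 1, 5, 9, 10]

print(set_variance(consistent))  # 2.0
print(set_variance(erratic))     # 18.0
```

Both sets have the same mean score, yet the erratic one has a much larger set variance, which is the sense in which one can be equally good but less consistent.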

Analysis

So. Enter the visual data. To answer the question of whether there is improvement, we can have a look at how the set means develop over the number of attempts. (Figure 1.)

There seems to be improvement. However, we need to make a change to see the improvement more clearly. If you hold the premise that growth is a consequence of experience, then the growth should not be represented as a function of attempt sets. Some sets have much more experience in them than others, because the attempt sizes increase. In other words, as you improve there is more practice in each set. Sets 1-5 are not equal to sets 15-20 in terms of time and experience. We can change the graph to represent the set mean score as a function of total hits.

This is what it looks like. (Figure 2.)
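The change of horizontal axis can be sketched like this. I am assuming that each set is plotted at the running total of hits reached at the end of that set; whether the start, middle or end of the set is used is a choice, so treat this as one possible reading. The function name and the numbers are placeholders.

```python
from itertools import accumulate


def cumulative_hits_after_each_set(attempts, set_size=5):
    """Running total of hits at the end of each set (assumed x-axis value)."""
    set_totals = [
        sum(attempts[i:i + set_size])
        for i in range(0, len(attempts), set_size)
    ]
    return list(accumulate(set_totals))


# Placeholder values for fifteen attempts (three sets), not the measured data.
attempts = [1, 2, 0, 3, 4, 5, 7, 6, 9, 8, 10, 12, 9, 14, 15]
totals = cumulative_hits_after_each_set(attempts)
means = [sum(attempts[i:i + 5]) / 5 for i in range(0, len(attempts), 5)]

for total_hits, mean_score in zip(totals, means):
    # One (total hits, set mean score) point per set.
    print(total_hits, mean_score)
```

Because later sets contain more hits, the points spread out further along the horizontal axis as skill grows, instead of sitting at equal distances as they do when plotted per set.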

Even in mean numbers, this seems to be a rather dramatic increase. When expressed as a measure of accuracy however, the learning curve is considerably more modest.

Accuracy can be expressed as how hits relate to misses. Each attempt consists of a number of hits, which were counted, and a miss, which ends the attempt. We can create a measure for accuracy by taking the number of hits divided by total strikes. Say, if we hit the ball nine times, but miss the tenth, we are 90% accurate for that attempt. A set of five attempts contains five misses, one per attempt, hence the +5 below.

((total hits in set) : (total hits in set + 5)) x 100 = accuracy in %
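As a small Python sketch of the accuracy measure, assuming one miss per attempt so that the number of misses equals the number of attempts in the set. The function name `set_accuracy` and the example sets are illustrative only.

```python
def set_accuracy(scores):
    """Accuracy in %: hits divided by total strikes (hits plus one miss per attempt)."""
    hits = sum(scores)
    misses = len(scores)  # every attempt ends with exactly one miss
    return 100 * hits / (hits + misses)


print(set_accuracy([9]))              # 90.0, the single-attempt example from the text
print(set_accuracy([9, 9, 9, 9, 9]))  # 90.0, five attempts of nine hits each
print(set_accuracy([0, 0, 0, 0, 0]))  # 0.0, every strike was a miss
```

Since the misses are always counted, this measure can approach 100% but never reach it for a finished attempt.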

The development of accuracy in each set as a function of total hits is as follows. (Figure 3.)

This graph has a naturally flattening, logarithmic shape, for as soon as you hit perfect accuracy you would have to play the game forever (unless you install a time limit, which I did not include in this trial). From this visual analysis a couple of things stand out.

There is overall improvement in accuracy. The improvement happens in ‘arcs’. There are three arcs, each higher and longer than the one before, and each followed by a drop. Arc one is between 0-100 hits, the second arc between 150-300 and the third from 350-650, roughly.

Next we will focus on the matter of consistency. For this one can view the variance of each set as it develops over the number of hits done.

This is where statistical language becomes tricky.

Mean variance is the summed distance between the size of each attempt and the overall mean score of all attempts combined, divided by the number of attempts. In other words, the average distance of each attempt to the mean score. A big number indicates a large distance. You could use this to compare the development of variance between trials.

We only have one trial, and we want to see the development of consistency within that trial, so we split the attempts into sets of five, like before.

Set variance is the sum total of the distances of each attempt of that set to the set mean score of that set. Divide that by the set size, in this case five, to get the set mean variance.

Add up all the set variances and divide by the number of sets to get the mean set variance.
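Since these terms are easy to mix up, here is a hedged Python sketch that keeps them apart. It again assumes “distance” means absolute difference, and it reads the mean set variance literally as the set variances (summed distances) averaged over the sets. All function names and numbers are invented for illustration.

```python
def split_into_sets(attempts, set_size=5):
    """Cut the list of attempt sizes into consecutive sets of `set_size`."""
    return [attempts[i:i + set_size] for i in range(0, len(attempts), set_size)]


def summed_distance(scores, centre):
    """Sum of absolute distances of the scores to a given centre value."""
    return sum(abs(x - centre) for x in scores)


def mean_variance(attempts):
    """Mean variance: average distance of every attempt to the overall mean."""
    overall_mean = sum(attempts) / len(attempts)
    return summed_distance(attempts, overall_mean) / len(attempts)


def set_variances(attempts, set_size=5):
    """Set variance per set: summed distance to that set's own mean score."""
    return [summed_distance(s, sum(s) / len(s))
            for s in split_into_sets(attempts, set_size)]


def set_mean_variances(attempts, set_size=5):
    """Set mean variance per set: set variance divided by the set size."""
    return [v / set_size for v in set_variances(attempts, set_size)]


def mean_set_variance(attempts, set_size=5):
    """Mean set variance: the set variances averaged over the number of sets."""
    values = set_variances(attempts, set_size)
    return sum(values) / len(values)


# Placeholder values: two sets with means 5 and 10, not the measured data.
attempts = [4, 5, 5, 5, 6, 8, 10, 10, 10, 12]
print(mean_variance(attempts))       # 2.5
print(set_variances(attempts))       # [2.0, 4.0]
print(set_mean_variances(attempts))  # [0.4, 0.8]
print(mean_set_variance(attempts))   # 3.0
```

In this made-up example the whole-trial mean variance (2.5) is larger than either set mean variance (0.4 and 0.8), because the two sets have different mean scores. That is the reason for looking within sets when the scores are improving.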

As the attempt sizes increase, so does the room for high variance. One would naturally expect the set mean variance to increase as attempt sizes, and with them hits per set, increase. If we want an intuitive image to represent consistency, we need to correct for this. We can do so by standardising the set mean variance to a value that describes its magnitude relative to its own set mean score. Dividing set mean variance by the set mean score creates values that represent what the mean variance of any set would be if the mean score were the same for each set.

set mean variance : set mean score = relative variance

These values represent the relative variance, and we can express this as a function of the amount of practice we had by relating it to the number of total successful hits, or to the total number of sets. For this analysis I chose the number of sets, as the size of the attempts per set has already been accounted for. To intuit consistency, we need to invert the data, as high variance indicates low consistency. Divide one by relative variance to get the relative consistency (Figure 4). Relative consistency is the same as set mean score : set mean variance.
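A rough Python sketch of relative variance and relative consistency, under the same assumption that distance means absolute difference. How to treat a set with a mean score of zero, or a set without any spread, is not specified in this article; returning `nan` and `inf` for those cases is my own choice for the sketch. The function names and example sets are illustrative.

```python
import math


def set_mean_variance(scores):
    """Average absolute distance of each attempt to the set mean score."""
    mean = sum(scores) / len(scores)
    return sum(abs(x - mean) for x in scores) / len(scores)


def relative_variance(scores):
    """Set mean variance divided by the set mean score."""
    mean = sum(scores) / len(scores)
    if mean == 0:
        return math.nan  # assumption: undefined when the set mean score is zero
    return set_mean_variance(scores) / mean


def relative_consistency(scores):
    """One divided by relative variance (set mean score : set mean variance)."""
    rv = relative_variance(scores)
    if math.isnan(rv):
        return math.nan
    if rv == 0:
        return math.inf  # assumption: no spread at all counts as unbounded
    return 1 / rv


# Two made-up sets with the same shape, one at ten times the scale.
small = [2, 4, 4, 4, 6]
large = [20, 40, 40, 40, 60]

print(set_mean_variance(small), set_mean_variance(large))
print(round(relative_variance(small), 2), round(relative_variance(large), 2))
print(round(relative_consistency(small), 2), round(relative_consistency(large), 2))
```

The set mean variance of the larger set is ten times that of the smaller one, but their relative variance and relative consistency come out the same, which is the correction for attempt size described above.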

From this data we could say that there is a taper down in the beginning and relatively equal consistency from sets 5 through 20. Formal statistical tests could be used to test the significance. For now though, we could state that there is likely no increase in consistency, and if there is any change, it is a small decrease in consistency.

Conclusion

The conclusion from the analysis of this single trial of 100 attempts to strike a ball against a wall would be the following:

“The learning curve demonstrates an improvement in ability with more practice. As accuracy improves in arcs, the consistency remains relatively stable after an initial drop.

Implications for practice and training based on these results would be that one should not expect a learning process to be characterised by a stable, steady improvement, and that a decrease in performance at times is a normal part of development. Improvement is quick at first and tapers as skill grows. Maybe the place to seek improvement then is not to look for success of greater magnitude but to work on consistency, as it does not automatically grow with improved skill.”