It turns out to be possible to calculate the parameters a and b of the cumulative probability distribution P = a*(1 - exp(-t/b)) from the average correct solution time and the number of problems solved correctly. The probability density corresponding to P is:

dP/dt = a*exp(-t/b)/b

If we multiply this expression by t and integrate by parts from 0 to t, we get:

a*(b - (b + t)*exp(-t/b))

The average correct solution time is therefore:

t_{av} = a*(b - (b + t)*exp(-t/b)) / (a*(1 - exp(-t/b)))

which simplifies to:

t_{av} = b - t*exp(-t/b) / (1 - exp(-t/b))
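As a quick sanity check on the simplified expression, here is a minimal Python sketch that compares it with a direct numerical integration of s*dP/ds over [0, t], divided by the integral of dP/ds. The function names and the values a = 0.9, b = 50, t = 100 are illustrative assumptions, not taken from the data.

```python
import math

def truncated_mean_closed_form(b, t):
    """Closed form: t_av = b - t*exp(-t/b) / (1 - exp(-t/b))."""
    e = math.exp(-t / b)
    return b - t * e / (1.0 - e)

def truncated_mean_numeric(a, b, t, steps=100_000):
    """Midpoint-rule estimate of the mean solution time within the limit t."""
    ds = t / steps
    numerator = 0.0    # integral of s * dP/ds
    denominator = 0.0  # integral of dP/ds, i.e. P at the time limit
    for i in range(steps):
        s = (i + 0.5) * ds
        density = a * math.exp(-s / b) / b
        numerator += s * density * ds
        denominator += density * ds
    return numerator / denominator

# Hypothetical parameter values, chosen only for illustration.
closed = truncated_mean_closed_form(b=50.0, t=100.0)
numeric = truncated_mean_numeric(a=0.9, b=50.0, t=100.0)
print(closed, numeric)
```

The factor a cancels in the ratio, which is why the closed form takes no a argument.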

We also have:

1 - exp(-t/b) = P/a

and

exp(-t/b) = 1 - P/a

Eliminating the exponentials gives:

t_{av} = b - t*(1 - P/a) / (P/a) = b - t*(a - P) / P

which can be rewritten as:

b = t_{av} + t*(a - P)/P

This is a simple linear equation connecting a and b. (For P = 1 - exp(-t/c), this equation becomes c = t_{av} + t*(1 - P)/P. For my data, this value of c matches the value given by least squares very closely indeed.) Eliminating b gives:

exp(-t/(t_{av} + t*(a - P)/P)) = 1 - P/a

We can solve this equation numerically using Newton’s method, using P as the first approximation for a.

f(a) = exp(-t/(t_{av} + t*(a - P)/P)) - 1 + P/a

f'(a) = (t / (t_{av} + t*(a - P)/P))^2 * exp(-t / (t_{av} + t*(a - P)/P)) / P - P/a^2

h = -f(a) / f'(a)

a' = a + h

where a’ is the next approximation for a.
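The iteration can be sketched in a few lines of Python. This is a minimal sketch rather than a definitive implementation: the function name, tolerance and iteration cap are assumptions, and b is taken as t_{av} + t*(a - P)/P, which is what rearranging t_{av} = b - t*(a - P)/P gives. The check uses synthetic inputs generated from a = 1, b = 50 with t = 100, not real data.

```python
import math

def solve_a_b(t_av, P, t, tol=1e-10, max_iter=50):
    """Newton's method for a, starting from a = P; returns (a, b)."""
    a = P
    for _ in range(max_iter):
        b = t_av + t * (a - P) / P          # linear relation between a and b
        e = math.exp(-t / b)
        f = e - 1.0 + P / a                 # f(a)
        f_prime = (t / b) ** 2 * e / P - P / a ** 2
        h = -f / f_prime                    # Newton step
        a += h
        if abs(h) < tol:
            return a, t_av + t * (a - P) / P
    raise RuntimeError("Newton iteration did not converge")

# Synthetic check: inputs computed from a = 1, b = 50, time limit t = 100.
t = 100.0
P = 1.0 - math.exp(-t / 50.0)
t_av = 50.0 - t * math.exp(-t / 50.0) / P
a, b = solve_a_b(t_av, P, t)
print(a, b)
```

Starting from a = P gives f(P) = exp(-t/t_{av}) > 0, so the first step moves a upwards towards the root.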

(Note that f(a) can be rewritten as:

f(a) = exp(-1/(t_{av}/t + (a - P)/P)) - 1 + P/a

The solution for a (and hence for b) therefore depends on the ratio t_{av}/t, rather than on t_{av} and t separately.)

For a fixed time limit t, we can calculate a, b and b/a from two clearly meaningful quantities:

t_{av} = average of the correct solution times within the time limit, and

P = number of correct solutions found within the time limit (applied to each problem individually), divided by the total number of problems in the problem set.

This calculation involves all the solution times symmetrically and matches the tail of the distribution well. It is also easy to carry out with a spreadsheet:

[Image: spreadsheet carrying out the calculation]

N.B. An average successful solution time of zero gives numerical problems here, as does a success probability of zero, but these should not be realistic scenarios. The calculated value of a can be more than 1 in some cases. Here is a plot of P against t_{av} for a = 1 and a = 1.1, with t = 100 seconds:

[Image: plot of P against t_{av} for a = 1 and a = 1.1, with t = 100 seconds]

The value of a is more than 1 when P is too large in relation to t_{av}.
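Points on such curves can be generated by sweeping b at a fixed time limit. The sketch below is only an illustration: the function name and the chosen values of b are assumptions. It evaluates P = a*(1 - exp(-t/b)) and t_{av} = b - t*exp(-t/b)/(1 - exp(-t/b)) for a = 1 and a = 1.1 with t = 100 seconds.

```python
import math

def curve_point(a, b, t):
    """Return (t_av, P) for the model P = a*(1 - exp(-t/b)) at time limit t."""
    e = math.exp(-t / b)
    return b - t * e / (1.0 - e), a * (1.0 - e)

t = 100.0
for b in (20.0, 50.0, 100.0, 200.0):   # hypothetical values of b
    t_av, p_a1 = curve_point(1.0, b, t)
    _, p_a11 = curve_point(1.1, b, t)
    print(f"t_av = {t_av:6.2f}  P(a=1) = {p_a1:.4f}  P(a=1.1) = {p_a11:.4f}")
```

Because t_{av} does not depend on a, the a = 1.1 curve is the a = 1 curve scaled up by a factor of 1.1 in P, so an observed P above the a = 1 curve at a given t_{av} corresponds to a fitted a greater than 1.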
