Skip to content

Mathematical Models

To understand what models are actually doing we need to utilise mathematical models. The concept was popularised in the late 60’s by Don Knuth.

We can calculate the total running time of an operation with the below mathematical model.

sum of cost x frequency of all operations

  • Analyse program to determine set of operations.
  • Cost depends on machine, compiler.
  • Frequency depends on algorithm, input data.

Today we run experiments to understand the cost of operations. As an example we could run an add operation a billion times to average the time in nano seconds to run that operation, below are types of operations to consider:

  • int add
  • int multiply
  • int divide
  • float add
  • float multiply
  • float divide
  • sine
  • arctangent
  • var declaration
  • assignment statement
  • array element access
  • 1d array allocation
  • 2d array allocation
  • string length
  • substring extraction
  • string concat

It is important to understand the operation itself and how that could effect running time for example string concatenation is proportional to the length of the string so this could change the running time of the operation significantly

The sum of these basic operations present in our algorithm will be used as our first component in our mathematical model.

int count = 0;
for(int i = 0; i < N; i++){
if(a[i] == 0){
count++;
}
}
  • variable declaration = 2
  • assignment statement = 2
  • less than compare = N + 1
  • equal to compare = N
  • array access = N
  • increment = N to 2 N
int count = 0;
for (int i = 0; i < N; i++){
for(int j = 0; j < N; j++){
if(a[i] + a[j] == 0){
count++;
}
}
}

We have exponentially increased the frequency of our operations within the brute force method.

  • variable declaration = N + 2
  • assignment statement = N + 2
  • less than compare = 12(N+1)(N+2)\frac{1}{2} (N+ 1) * (N + 2)
  • equal to compare = 12N(N1)\frac{1}{2} N * (N - 1)
  • array access = N(N1)N * (N - 1)
  • increment = 12N(N1)toN(N1)\frac{1}{2} N * (N - 1) to N * (N - 1)

It is convenient to have a measure of the amount of work involved in a computing process, even though it be a very crude one. We may count up the number of times that various elementary operations are applied in the whole process and then given them various weights. We might, for instance, count the number of additions, subtractions, multiplications, divisions, recording of numbers, and extractions of figures from tables. In the case of computing with matrices most of the work consists of multiplications and writing down numbers, and we shall therefore only attempt to count the number of multiplications and recordings.

Alan Turing (Rounding-Off Errors in Matrix Processes)

To paraphrase based on Turing’s suggestions we weight our operations by most expensive and count the frequency of those which are most expensive or executed most often as a proxy for running time. Essentially we hypothesize our running time utilising this method.

Only a few functions really turn up when we analyse algorithms. A great number of algorithms are described by these few functions:

  • log(n)\log(n)
  • NN
  • NlogNN \log N
  • N2N^2
  • N3N^3
  • 2N2^N

table of growth When we talk about order of growth we discard the leading coefficient as an example if we hypothesize the running time of an Algorithm is NlogNN \log N what that actually means is the running time of an algorithm is CNlogNCN \log N where C is some constant.

Order of GrowthNameTypical Code FrameworkDescriptionExampleT(n)T(2n)​
11Constanta = b + cStatementAdd two numbers11
logN\log NLogarithmicwhile(N > 1) N = N/2Divide in halfBinary search1\approx 1
NNLinearfor (int i = 0; i < N; i++)LoopFind the maximum22
NlogNN \log NLinearithmic(See mergesort)Divide and conquerMergesort2\approx 2
N2N^2Quadraticfor(..){for(..)}Double loopCheck all pairs44
N3N^3Cubicfor(..){for(..){for(..)}}Triple loopCheck all triples88
2N2^NExponential(See combinatorial search)Exhaustive searchCheck all subsetsT(n)T(n)
Growth Rate1970s1980s1990s2000s
11AnyAnyAnyAny
logN\log NAnyAnyAnyAny
NNMillionsTens of MillionsHundreds of MillionsBillions
NlogNN \log NHundreds of ThousandsMillionsTens of MillionsHundreds of Millions
N2N^2HundredsThousandsThousandsTens of Thousands
N3N^3HundredsHundredsThousandsThousands
2N2^N20\approx 2020s20s20s20s20\approx 20

Conclusion We need linear algorithms to keep pace with Moore’s law.

One way to simplify our analysis of an algorithm employing mathematical methods is to use tilde notation. Tilde notation works of the two core principles.

  1. Estimate running time (or memory) as a function of input size N.
  2. Ignore lower order terms.
    • When N is large, terms are negligible.
    • when N is small, we don’t care.
Example 1:16N3+20N+16\frac{1}{6} N^3 + 20N + 16~16N3\frac{1}{6}N^3
Example 2:16N3+100N43+56\frac{1}{6} N^3 + 100 N^\frac{4}{3} + 56~16N3\frac{1}{6}N^3
Example 3:16N3+12N2+1/3N\frac{1}{6}N^3 + \frac{1}{2} N^2 + 1/3N~16N3\frac{1}{6}N^3

To surmise:

  • In principle: Accurate Mathematical Models are available.
  • In Practice:
    • Formulas can be complicated.
    • Advanced mathematics might be required.
    • Exact models best left for the experts.

To summarise our analysis of an algorithm there are a couple of commonly used notations to help describe our order of growth.

notationprovidesexampleshorthand forused to
Big Thetaasymptotic order of growthΘ(N2)Θ(N^{2})12N2\frac{1}{2} N^{2}
10N210 N^{2}
5N2+22NlogN+3N5 N^{2} + 22 N log N + 3N
classify algorithms
Big OhΘ(N2)Θ(N^{2}) and smallerO(N2)O(N^{2})10N210 N^{2}
100N100 N
22NlogN+3N22 N log N + 3 N
develop upper bounds
Big OmegaΘ(N2)Θ(N^{2}) and larger
Ω(N2)Ω(N^{2})12N2\frac{1}{2} N^{2}
N5N^{5}
N3+22NlogN+3NN^{3} + 22 N log N + 3 N
Develop Lower Bounds