Skip to content

Algorithms and Analysis

Studying algorithms dates back to 300BC Euclid, it was formalised by Church and Turing in the 1930’s (at Princeton) with its impacts going beyond computer science being applied in a multitude of fields.

  • Algorithms are methods for solving a problem
  • Data Structures are methods for storing information
  • Computational models are replacing math models in scientific inquiry.

” Bad programmers worry about the code. Good programmers worry about data structures and their relationships. ” - Linus Torvalds (creator of Linux)

  • Model the Problem
  • Main elements of the problem to be solved
  • Find an algorithm to solve the problem
  • Is the algorithm fast enough?
  • If not why?
  • Find something that might address the issue
  • Iterate until satisfied.

Analysing algorithms can be seen from different perspectives, as an example a programmer might want to develop a working solution whereas a client just wants to solve a problem efficiently. As a student it’s important to understand all the different points of view. The key measure we will focus on for analysing the effectiveness of an algorithm would be running time.

As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise—By what course of calculation can these results be arrived at by the machine in the shortest time? ” — Charles Babbage (1864)

  • Predict Performance
  • Compare Algorithms
  • Provide Guarantees (Theory of Algorithms)
  • Understand Theoretical Basis (Theory of Algorithms)

Primary Practical Reason: Avoid Performance Bugs. (Confidence in completing the job in an allotted period of time).

Use the scientific method to understand Performance.

Scientific Method

  • Observe some features of the natural world ( Running time on our computer )
  • Hypothesize a model that is consistent with the observations ( Allow us to predict the running time for larger sets )
  • Predict events using the hypothesis
  • Verify the predictions by making further observations
  • Validate by repeating until the hypothesis and observations agree

Principles

  • Experiments must be reproducible
  • Hypotheses must be falsifiable

Given N distinct integers, how many triples sum to exactly zero?

30, -40, -20, -10, 40, 0, 10, 5

Answer: 4 (see below)

a[i]a[j]a[k]sum
30-40100
30-20-100
-404000
-100100

Write a program that can solve this on scale efficiently.

//Check each triple
int N = a.length;
int count = 0;
for (int i = 0; i < N; i++ ){
for (int j = i + 1; j < N; j++){
for (int k = j+1; k < N; k++){
if(a[i] + a[j] + a[k] == 0) count ++;
}
}
}
return count;

We would first take our algorithm and plot empirical data to understand our running time and how it scales. For the above code we might get a curve as such:

Ntime(s)
2500.0
5000.0
10000.1
20000.8
40006.4
800051.1
16000?

standard plot of an algorithm Continuing from our standard plot utilising a log-log scale we can then look at the slope to understand the efficiency of our algorithm at scale:

analyse

From our hypothesis we can now predict the running time of our program

The running time is about 1.006x 10POW(-10) x NPOW(2.999) seconds

  • 51.0 seconds for N = 8,000
  • 408.1 seconds for N = 16,000
Ntime(seconds)
8K51.1
8K51.0
8K51.1
16K410.8

Hypothesis Validated!

A quick way to estimate b in a power law relationship is by doubling our hypothesis and capturing the lg ratio.

NTimeRatiolg ratio
2500
5000.04.82.3
10000.16.92.8
20000.87.72.9
40006.48.03.0
800051.18.03.0

Here we witness a convergence of our lg ratio to a constant of 3. Exponent of N and running time.

(Determines Exponent B in Power Law && Determines constant in a power law)

  • Algorithm
  • Input Data

(Determines constant in a power law)

  • Hardware
  • Software
  • System