Algorithms and Analysis
Studying algorithms dates back to 300BC Euclid, it was formalised by Church and Turing in the 1930’s (at Princeton) with its impacts going beyond computer science being applied in a multitude of fields.
- Algorithms are methods for solving a problem
- Data Structures are methods for storing information
- Computational models are replacing math models in scientific inquiry.
” Bad programmers worry about the code. Good programmers worry about data structures and their relationships. ” - Linus Torvalds (creator of Linux)
Steps to developing a usable algorithm
Section titled “Steps to developing a usable algorithm”- Model the Problem
- Main elements of the problem to be solved
- Find an algorithm to solve the problem
- Is the algorithm fast enough?
- If not why?
- Find something that might address the issue
- Iterate until satisfied.
Analysing algorithms can be seen from different perspectives, as an example a programmer might want to develop a working solution whereas a client just wants to solve a problem efficiently. As a student it’s important to understand all the different points of view. The key measure we will focus on for analysing the effectiveness of an algorithm would be running time.
As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise—By what course of calculation can these results be arrived at by the machine in the shortest time? ” — Charles Babbage (1864)
Reasons to analyse Algorithms
Section titled “Reasons to analyse Algorithms”- Predict Performance
- Compare Algorithms
- Provide Guarantees (Theory of Algorithms)
- Understand Theoretical Basis (Theory of Algorithms)
Primary Practical Reason: Avoid Performance Bugs. (Confidence in completing the job in an allotted period of time).
Will my program be able to solve a large practical Input?
Section titled “Will my program be able to solve a large practical Input?”Use the scientific method to understand Performance.
Scientific Method
- Observe some features of the natural world ( Running time on our computer )
- Hypothesize a model that is consistent with the observations ( Allow us to predict the running time for larger sets )
- Predict events using the hypothesis
- Verify the predictions by making further observations
- Validate by repeating until the hypothesis and observations agree
Principles
- Experiments must be reproducible
- Hypotheses must be falsifiable
Example of Analysing Algorithms
Section titled “Example of Analysing Algorithms”Given N distinct integers, how many triples sum to exactly zero?
30, -40, -20, -10, 40, 0, 10, 5
Answer: 4 (see below)
| a[i] | a[j] | a[k] | sum |
|---|---|---|---|
| 30 | -40 | 10 | 0 |
| 30 | -20 | -10 | 0 |
| -40 | 40 | 0 | 0 |
| -10 | 0 | 10 | 0 |
Write a program that can solve this on scale efficiently.
Method 1 Brute Force (Slow - Quadratic)
Section titled “Method 1 Brute Force (Slow - Quadratic)”//Check each tripleint N = a.length;int count = 0;for (int i = 0; i < N; i++ ){ for (int j = i + 1; j < N; j++){ for (int k = j+1; k < N; k++){ if(a[i] + a[j] + a[k] == 0) count ++; } }}return count;Power Law
Section titled “Power Law”We would first take our algorithm and plot empirical data to understand our running time and how it scales. For the above code we might get a curve as such:
| N | time(s) |
|---|---|
| 250 | 0.0 |
| 500 | 0.0 |
| 1000 | 0.1 |
| 2000 | 0.8 |
| 4000 | 6.4 |
| 8000 | 51.1 |
| 16000 | ? |
Continuing from our standard plot utilising a log-log scale we can then look at the slope to understand the efficiency of our algorithm at scale:

From our hypothesis we can now predict the running time of our program
Predictions
Section titled “Predictions”The running time is about 1.006x 10POW(-10) x NPOW(2.999) seconds
- 51.0 seconds for N = 8,000
- 408.1 seconds for N = 16,000
Observations
Section titled “Observations”| N | time(seconds) |
|---|---|
| 8K | 51.1 |
| 8K | 51.0 |
| 8K | 51.1 |
| 16K | 410.8 |
Hypothesis Validated!
Doubling Hypothesis
Section titled “Doubling Hypothesis”A quick way to estimate b in a power law relationship is by doubling our hypothesis and capturing the lg ratio.
| N | Time | Ratio | lg ratio |
|---|---|---|---|
| 250 | 0 | ||
| 500 | 0.0 | 4.8 | 2.3 |
| 1000 | 0.1 | 6.9 | 2.8 |
| 2000 | 0.8 | 7.7 | 2.9 |
| 4000 | 6.4 | 8.0 | 3.0 |
| 8000 | 51.1 | 8.0 | 3.0 |
Here we witness a convergence of our lg ratio to a constant of 3. Exponent of N and running time.
System Independent Effects
Section titled “System Independent Effects”(Determines Exponent B in Power Law && Determines constant in a power law)
- Algorithm
- Input Data
System Dependent Effects
Section titled “System Dependent Effects”(Determines constant in a power law)
- Hardware
- Software
- System