## LeetCode #1 : Two Sum

$O(n)$ solution in python3: 44 ms, faster than 97,63%

## Description

Given an array of integers, return indices of the two numbers such that they add up to a specific target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

Example:

`Given nums = [2, 7, 11, 15], target = 9, Because nums[0] + nums[1] = 2 + 7 = 9, return [0, 1].`

(From the Two-Sum problem description)

### Test Vectors

- Requirement 1
- If the
`nums`

vector does not contain a valid solution, return an empty list. - Requirement 2
- The indexed sum matches the target: \(\sum_{i \in solution}(nums_{i}) = target\)
- Requirement 3
- No index in
`solution`

can be used twice. - Requirement 4
- \(|solution| = 2\) - … return the indices of the two numbers …

nums | target | solution | Comment |
---|---|---|---|

[1, 2] | 4 | [] | Req. #1 - no possible solution |

[] | 1 | [] | Req. #1 - edge case |

[2, 7, 11 ,15] | 9 | [0, 1] | Req. #2 |

[7, 5, 2] | 14 | [] | [0, 0] violates req. #3. [0, 1, 2] violates req. 4 |

## Solutions

### Problem API

```
class Solution:
def twoSum(self, nums: List[int], target: int) -> List[int]:
```

### Test Results

This leads to the following tests:

```
import os
import pytest
class BaseTestClass:
"""
Info: the results are order-insensitive. To make the tests easier, we sort them.
"""
def test_original_vector_ok(self):
result = self.twoSum( nums=[2,7,11,15], target=9)
result.sort()
assert result == [0, 1]
def test_no_nums_give_no_result(self):
assert self.twoSum( nums=[], target=9) == []
def test_no_possible_solution_gives_no_result(self):
assert self.twoSum( nums=[1,2,3], target=999) == []
def test_same_elements_can_be_used(self):
# The requirement only cares about duplicate indices
result = self.twoSum( nums=[7,7], target=14)
result.sort()
assert result == [0, 1]
def test_same_elements_can_be_used_2(self):
# The requirement only cares about duplicate indices
result = self.twoSum( nums=[7,1,7], target=14)
result.sort()
assert result == [0, 2]
def test_the_same_index_cannot_be_used_twice(self):
# Without req. 3 [0,0] would be a solution
assert self.twoSum( nums=[7,9], target=14) == []
def test_only_two_elements_are_used(self):
# This vector has a solution with more than two numbers
assert self.twoSum( nums=[7,6,1], target=14) == []
def test_finds_one_of_multiple_solutions(self):
result = self.twoSum( nums=[1,2,3,4], target=5)
result.sort()
assert result in [[0,3], [1,2]]
def test_very_long_inputs_succeed(self):
target = 16000
nums = [ x * 2 for x in range(1, target // 2)]
res = self.twoSum(nums=nums, target=target)
res_sum = nums[res[0]] + nums[res[1]]
assert res_sum == target
```

### Solution 1: Simple, testable, \(O(n**2)\)

A straight forward solution to this problem is to iterate through all
permutations of 2 different elements of the `nums`

. Since the
indices are needed as a result, the permutations are generated over the
indices `0..len(nums)`

.

```
from typing import List
def two_element_permutations(count):
"""
Create the two element permutations of the first count numbers (0..count-1).
- duplicates, e.g. (1, 1) are not emitted
- the results are unique regardles of order.
E.g. if (a, b) has already been returned, (b, a) is not returned.
"""
# range goes from a..b-1.
for a in range(0, count - 1):
for b in range(a + 1, count - 0):
yield (a, b)
class Solution:
def twoSum(self, nums: List[int], target: int) -> List[int]:
for (a,b) in two_element_permutations(len(nums)):
if nums[a] + nums[b] == target: return [a, b]
return []
```

#### Test Results

In addition to the functional tests, the generation of the permutations can be tested.

The tests for the permutation only test the first three cases: `0..3`

elements
because the cases `4..`

are just case `3`

, only longer.

```
import os
import pytest
from solution1a import two_element_permutations
def test_list_of_none_is_empty():
assert [ perm for perm in two_element_permutations(0) ] == []
def test_list_of_one_is_empty():
assert [ perm for perm in two_element_permutations(1) ] == []
def test_list_of_two():
assert [ perm for perm in two_element_permutations(2) ] == [(0,1)]
def test_list_of_three():
assert [ perm for perm in two_element_permutations(3) ] == [(0,1), (0,2), (1,2)]
```

With the following test result:

```
============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-3.10.1, py-1.8.0, pluggy-0.12.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jens/Documents/leetcode.com/001-two_sum, inifile:
plugins: benchmark-3.2.2, profiling-1.7.0
collected 13 items
test_1a-solution.py ......... [ 69%]
test_1a-permutations.py .... [100%]
============================ slowest test durations ============================
(0.00 durations hidden. Use -vv to show these durations.)
========================== 13 passed in 0.05 seconds ===========================
Tests ran in 0.19362594500125851 seconds
```

### Solution 1b: Simple, slightly faster, \(O(n**2)\)

The first solution is very decomposed and each part is testable. Unfortunately
`yield`

is too slow for leetcode: with a large dataset a timeout is reached.

A quick fix is to inline the generator:

```
from typing import List
class Solution:
def twoSum(self, nums: List[int], target: int) -> List[int]:
count = len(nums)
for a in range(0, count - 1):
for b in range(a + 1, count - 0):
if nums[a] + nums[b] == target: return [a, b]
return []
```

#### Test Results

```
============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-3.10.1, py-1.8.0, pluggy-0.12.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jens/Documents/leetcode.com/001-two_sum, inifile:
plugins: benchmark-3.2.2, profiling-1.7.0
collected 9 items
test_1b-solution.py ......... [100%]
============================ slowest test durations ============================
(0.00 durations hidden. Use -vv to show these durations.)
=========================== 9 passed in 0.02 seconds ===========================
Tests ran in 0.16958735800290015 seconds
```

### Solution 2a: Using a dictionary: \(O(n)\)

The first two solutions are easy to understand but they are inefficient for
larger `n`

(longer `nums`

lists).

Fortunately the problem description gives some constraints that can be leveraged:

- The result is exactly two values
- Given a number
`a`

it is trivial to calculate`b`

such that`a + b = target`

If there would be a way to quickly (meaning: better than \(O(n)\)) find out if a
given value is contained in the `nums`

list then the whole algorithm can be
greatly improved.

A `set`

is a data structure that provides \(O(1)\) tests for membership. A
`dictionary`

(map) also allows to associate a value to the key.

The new algorithm creates a mapping from \(num \mapsto index\) (`index_by_value`

) with
\(O(1)\) lookup. The `nums`

list is iterated. For each number `a`

the needed `b`

is calculated. If `b`

is found in the dictionary (and is at a different position
than `a`

) a valid result has been found.

If there exist an \(a,b\) with \(a + b = target \land index_a \neq index_b\) it will
be found, because the algorithm searches the whole `nums`

list and tests each
element `a`

\(\in\) `nums`

for a matching `b`

\(\in\) `nums`

.

One edge case needs special consideration: `a == b`

and `a + b = target`

with
\(index_a \lt index_b\).

`index_by_value`

stores \(a \mapsto index_a\)- In this case the found indices would be
`[index_b, index_a]`

. `index_by_value`

stores \(a \mapsto index_b\)- In this case the found indices would be
`[index_a, index_b]`

.

In the `twoSum([7,7], 14)`

example the dictionary would look like this: `{ 7 => 1}`

. The iteration would test the first `7`

as `a`

(`a_index = 0`

), and calculate `b`

as `target - a = b = 7`

. A lookup in the dictionary would return `1`

as `b_index`

. The

```
from typing import List
class Solution:
def twoSum(self, nums: List[int], target: int) -> List[int]:
index_by_value = {k:v for (k,v) in zip(nums, range(0, len(nums)))}
a_index = 0
for a in nums:
b = target - a
if b in index_by_value:
b_index = index_by_value[b]
# Requirement 4: no duplicate indices
if a_index != b_index:
return [a_index, b_index]
a_index += 1
return []
```

#### Test Results

```
============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-3.10.1, py-1.8.0, pluggy-0.12.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jens/Documents/leetcode.com/001-two_sum, inifile:
plugins: benchmark-3.2.2, profiling-1.7.0
collected 9 items
test_2a-solution.py ......... [100%]
============================ slowest test durations ============================
(0.00 durations hidden. Use -vv to show these durations.)
=========================== 9 passed in 0.04 seconds ===========================
Tests ran in 0.1822243479982717 seconds
```

### Solution 2b: Using a dictionary: \(O(n)\)

Solution 2a can be improved by filling the dictionary step by step.

- In 2a the dictionary is pre-filled. This takes time and consumes memory.
- In 2b the dictionary only collects the numbers already visited. This restricts
the algorithm to only look
*back*, whereas in 2a it looked at the whole set. The algorithm is still correct because for any valid pair \(a, b\) with \(index_a < index_b\) it finds \(a\) when testing at \(index_b\).

```
from typing import List
class Solution:
def twoSum(self, nums: List[int], target: int) -> List[int]:
index_by_value = dict()
a_index = 0
for a in nums:
b = target - a
if b in index_by_value:
b_index = index_by_value[b]
# Requirement 4: no duplicate indices
if a_index != b_index:
return [a_index, b_index]
else:
index_by_value[a] = a_index
a_index += 1
return []
```

#### Test Results

```
============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-3.10.1, py-1.8.0, pluggy-0.12.0
benchmark: 3.2.2 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/jens/Documents/leetcode.com/001-two_sum, inifile:
plugins: benchmark-3.2.2, profiling-1.7.0
collected 9 items
test_2b-solution.py ......... [100%]
============================ slowest test durations ============================
(0.00 durations hidden. Use -vv to show these durations.)
=========================== 9 passed in 0.04 seconds ===========================
Tests ran in 0.1827685319876764 seconds
```

## Summary

All solutions work (except #1 which leads to a timeout for large data sets).

Solution | Correct | Status | Runtime | Memory | Language |
---|---|---|---|---|---|

#1, \(O(n^2)\), with generator | Yes | Submission | Python3 | ||

#2, \(O(n^2)\), tight loop | Yes | Submission | 5988 ms, faster than 7.73% | 13.7 MB, less than 79.07% | Python3 |

#3, \(O(n)\), dictionary | Yes | Submission | 56 ms, faster than 74.89% | 14.4 MB, less than 35.11% | Python3 |

#4, \(O(n)\), optimized dictionary | Yes | Submission | 44 ms, faster than 97.63% | 14.3 MB, less than 52.56% | Python3 |