# Python: Probably the Best Language for Interview

I have watched candidates coding in the whiteboard with C++, Java, JavaScript, ES6 JavaScript, Python, Ruby, — looking forward to Rust, Clojure, Erlang etc, programers! I think Python is probably the best programming language for technical interview, especially whiteboard programming.

## High expressiveness

Python use indention to express control blocks, which has *zero cost* in the
whiteboard programming. No other languages can beat doing nothing!

Joke aside, Python absorbs many traits from functional languages for the better expressiveness, more notably:

- List comprehension
- Higher-order functions
- Generator and iterator

### List Comprehension

I think the list comprehension might be the most intuitive and powerful way to manipulate the iterable data.

We can flatten the nested list with even-number predict in one-liner:

```
In [6]: nested_list = [[1, 2], (3, 4, 5)]
In [7]: [x for sublist in nested_list for x in sublist if x % 2 == 0]
Out[7]: [2, 4]
```

We can explore the neighbors of position `(i, j)`

with predicts for
breadth-first search in the 2D grid, such as 695. Max Area of Island.

```
candidates = [
(x, y) for x, y in explore(i, j, rows, cols)
if grid[x][y] and (x, y) not in visited
]
```

### Higher-order functions

I prefer the list comprehension over `map`

and `filter`

composition for its
readability. The `filter`

shines when constructing the sequence from scratch.
For example, the following `explore`

method returns adjacent positions within
bounds:

```
def explore(i, j, rows, cols):
return filter(
lambda d: 0 <= d[0] < rows and 0 <= d[1] < cols,
[(i - 1, j), (i, j + 1), (i, j - 1), (i + 1, j)]
)
```

You can also invoke `sort`

, `sorted`

, `min`

, `max`

method with extra `key`

function to override the default ordering logic. For example, we can sort the
DNA sequence based on the occurrences:

```
>>> counter = Counter('acacctatgccagggttgagggtgcgacggga')
>>> sorted(counter, key=lambda k: counter[k], reverse=True)
['g', 'a', 'c', 't']
```

The module `operator`

also provides alternative methods for the operators, such
as `operator.add(x, y)`

is equivalent to `lambda x, y: x + y`

. It is merely a
matter of personal taste.

### Generator and Iterator

It is quite common to harvest the results from the recursion, such as tree traversal. Alternatively, we can use generator to emit results, then collect results externally. It is more readable, and consumes less memory in general.

```
def pre_order_traverse(root: TreeNode):
if root is None: return
yield root.val
yield from root.left
yield from root.right
# Materialize the generator with list
list(pre_order_traverse(root))
```

On the other hand, we might want to consume the iterable data progressively. For example, in the n-way merge, the next element in the sequence is pushed to the heap only when the preceeding element is popped. Instead of tracking the sequence’s cursor, we use the iterator to simplify the book keeping.

```
from heapq import heapify, heappush, heappop
def merge(seqs: List[List[int]]):
heap = [iter(s) for s in seqs]
heap = [(next(it), it) for it in heap]
heapify(heap)
while heap:
val, it = heappop(heap)
yield val
try:
heappush(heap, (next(it), it))
except StopIteration:
pass
```

## Batteries included

Python is reputable for the batteries include philosophy. Out of the box, Python programmers are equipped with:

`set`

, aka`HashSet`

in Java`dict`

, aka`HashMap`

in Java`list`

, a generic vector or array with random access`heapq`

, a min-heap, aka`PriorityQueue`

`collections.Counter`

, similar ideas as`MultiSet`

`collections.deque`

, a double-ended queue to support efficient appends and pops from both sides.`collections.defaultdict`

, can be used for`MultiHashMap`

in java.`collections.OrderedDict`

Let’s see how they make our lives easier, :-).

### Tuple

The built-in data structure `tuple`

is an essential build block despite it is
often underestimated.

We can store arbitrary data in a `tuple`

instead of declaring a class, — quick
and dirty but suitable for the interview context. Here is a more sophisticated
example to use tuple for double-linked list.

Recall the n-way merge example: the iterator is stored with the value, then pushed back to min-heap, and the heap-sort still works! This is due to that the tuple is compared from left to right:

```
In [47]: (1, 2) < (1, 3) < (2, 1)
Out[47]: True
```

If we encapsulate the data into a class, such as:

```
class Point:
def __init__(self, x, y):
self.x = x
self.y = y
```

The comparison no longer works since how to compare `Point`

instance is
non-determined yet. Hint: implement `__lt__`

method.

```
In [49]: Point(1, 3) < Point(2, 1)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-40188351117f> in <module>
----> 1 Point(1, 3) < Point(2, 1)
TypeError: '<' not supported between instances of 'Point' and 'Point'
```

If you holds a strong opinion about type, the `collections.namedtuple`

might be
a good middle-ground:

```
In [51]: Point = namedtuple('Point', ['x', 'y'])
In [52]: Point(1, 2) < Point(1, 3) < Point(2, 1)
Out[52]: True
```

Personally, I would still prefer the `tuple`

approach just to avoid the
explanation of magic of `namedtuple`

.

The `tuple`

instance is immutable, so it can be used for the `dict`

key. For the
2D dynamic programming problem, instead of building the 2d matrix tediously (see
below), we can use `dict`

, such as `dp[(i, j)]`

instead.

### Trie

We can build the Trie with `defaultdict`

:

```
from collections import defaultdict
from pprint import pprint
def default_factory():
return defaultdict(default_factory)
trie = defaultdict(default_factory)
for word in ['hello', 'halt', 'hat']:
node = trie
for c in word:
node = node[c]
node['$'] = None
pprint(trie)
```

Each node of the `Trie`

is `defaultdict`

thanks to the `default_factory`

. The
dump shows the memory layouts as:

```
defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'h': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'a': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'l': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'t': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'$': None})}),
't': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'$': None})}),
'e': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'l': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'l': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'o': defaultdict(<function default_factory at 0x7fcacd51d6a8>,
{'$': None})})})})})})
```

### LRU Cache

We can implement the LRU cache with `OrderedDict`

, see the official
document.

### Parsing wth stack

The data structure, `stack`

is frequently used for expression parsing. The
`list`

can be modeled as stack with `append`

and `pop`

methods. As python is
dynamic-typed, it is quite handy to store heterogenous operands and operators in
the stack.

## Syntax sugar

In this section, I will share some tips which may save a minute or two in the interview.

Compare every adjacent elements in the list. The magic is `zip`

can aggregate
two sequences and stops when the shortest is exhausted.

`[x > y for x, y in zip(nums, nums[1:])]`

Build the lookup table for the element to its index.

`lut = dict((x, index) for index, x in enumerate(nums))`

Pre-allocate the list with desired capacity.

`nums = [0] * size`

It is worthy noting that python uses pass-by-reference, so the following snippet does not work as expected:

```
In [2]: rows, cols = 2, 3
...: row = [0] * cols
...: matrix = [row] * rows
...: matrix[0][0] = 3
In [3]: print(matrix)
[[3, 0, 0], [3, 0, 0]]
In [21]: row
Out[21]: [3, 0, 0]
```

Why? Because the `matrix`

is instantiated with same `row`

reference, thus
`matrix[0]`

, `matrix[1]`

points to the same `row`

!

The right approach^{1} is to proactively allocate each row:

`matrix = [row[:] for _ in range(row)]`

The `row[:]`

creates a copy of row, so each row of the matrix is independent.

With this in mind, we can modify a list during the iteration!

```
In [31]: l = [1, 2, 3, 4, 5]
In [32]: for x in l[:]:
...: if x % 2 == 0: l.remove(x)
In [33]: l
Out[33]: [1, 3, 5]
```

For demo purpose only, there are faster, more readable way to remove even elements.

## Conclusion

Python gains its reputation for its versatility. It is clearly not the fastest
programming language, — this rarely matters in the live coding interview, let
alone the whiteboard programming. Its *concise grammar*, *high expressiveness*
and *built-in data structures* will help you get the job(done).

- Please ping me you if you have better ideas to pre-allocate the matrix.↩