# Optimal disjoint-set based implementation

suggest changeWe can do two things to improve the simple and sub-optimal disjoint-set subalgorithms:

**Path compression heuristic**:`findSet`

does not need to ever handle a tree with height bigger than`2`

. If it ends up iterating such a tree, it can link the lower nodes directly to the root, optimizing future traversals;

```
subalgo findSet(v: a node):
if v.parent != v
v.parent = findSet(v.parent)
return v.parent
```

**Height-based merging heuristic**: for each node, store the height of its subtree. When merging, make the taller tree the parent of the smaller one, thus not increasing anyone’s height.

```
subalgo unionSet(u, v: nodes):
vRoot = findSet(v)
uRoot = findSet(u)
```

```
if vRoot == uRoot:
return
if vRoot.height < uRoot.height:
vRoot.parent = uRoot
else if vRoot.height > uRoot.height:
uRoot.parent = vRoot
else:
uRoot.parent = vRoot
uRoot.height = uRoot.height + 1
```

This leads to `O(alpha(n))`

time for each operation, where `alpha`

is the inverse of the fast-growing Ackermann function, thus it is very slow growing, and can be considered `O(1)`

for practical purposes.

This makes the entire Kruskal’s algorithm `O(m log m + m) = O(m log m)`

, because of the initial sorting.

**Note**

Path compression may reduce the height of the tree, hence comparing heights of the trees during union operation might not be a trivial task. Hence to avoid the complexity of storing and calculating the height of the trees the resulting parent can be picked randomly:

```
subalgo unionSet(u, v: nodes):
vRoot = findSet(v)
uRoot = findSet(u)
```

```
if vRoot == uRoot:
return
if random() % 2 == 0:
vRoot.parent = uRoot
else:
uRoot.parent = vRoot
```

In practice this randomised algorithm together with path compression for `findSet`

operation will result in comparable performance, yet much simpler to implement.