Radix Tree - nu1lspaxe

Introduction#

A Radix Tree (also called a Patricia Trie) is a space-optimized Trie. It merges chains of single-child nodes to reduce space usage.

Trie vs Radix Tree#

Feature	Trie	Radix Tree
Node content	Single characters	Strings (compressed prefixes)
Space usage	High (many short nodes)	Lower (due to compression)
Lookup speed	O(k), where k = length	Same, often faster in practice

For example, we are inserting ["romane", "romanus", "romulus", "rubens", "ruber", "rubicon", "rubicundus"] to

Trie : each letter gets its own nodes (deep trees)
Radix Tree : nodes may store strings like "rom", "ane", "ulus", making the tree shallower and faster

Implementation#

Node structure#

type Node struct {
	prefix   string
	children map[byte]*Node
	isWord   bool
}

type RadixTree struct {
	root *Node
}

func NewRadixTree() *RadixTree {
	return &RadixTree{
		root: &Node{
			children: make(map[byte]*Node),
		},
	}
}

Each node holds :

a prefix string
a map of children by the first byte
a flag isWord indicating if it ends a word

Inserting a Word#

func (t *RadixTree) Insert(word string) {
	if len(word) == 0 {
		return
	}

	curr := t.root

	for {
		// find a child that starts with the same character as the remaining word
		child, ok := curr.children[word[0]]
		if !ok {
			// no matching child, create a new one
			curr.children[word[0]] = &Node{
				prefix:   word,
				children: make(map[byte]*Node),
				isWord:   true,
			}
			return
		}

		// find the common prefix between the word and the child's prefix
		commonLen := 0
		minLen := min(len(child.prefix), len(word))

		for i := range minLen {
			if word[i] != child.prefix[i] {
				break
			}
			commonLen++
		}

		// if the child's prefix is fully matched
		if commonLen == len(child.prefix) {
			word = word[commonLen:]
			curr = child

			// if we've consumed the entire word, mark this node as a word
			if len(word) == 0 {
				child.isWord = true
				return
			}
			continue
		}

		// create a new node for the common prefix
		commonPrefix := child.prefix[:commonLen]
		childSuffix := child.prefix[commonLen:]
		wordSuffix := word[commonLen:]

		// create a new node for the child's remaining part
		newChild := &Node{
			prefix:   childSuffix,
			children: child.children,
			isWord:   child.isWord,
		}

		// update the existing child to represent the common prefix
		child.prefix = commonPrefix
		child.children = make(map[byte]*Node)
		child.isWord = false

		// add the new child under the common prefix
		if len(childSuffix) > 0 {
			child.children[childSuffix[0]] = newChild
		}

		// add the new word part
		if len(wordSuffix) > 0 {
			child.children[wordSuffix[0]] = &Node{
				prefix:   wordSuffix,
				children: make(map[byte]*Node),
				isWord:   true,
			}
		} else {
			child.isWord = true
		}

		return
	}
}

It starts from the root and looks for a child node that shares a prefix with the word to insert
If not child matches the first character, it simply adds a new node
If a match is found, we find the longest common prefix
Depending on how much of the prefix matches, we may need to :
- Traverse further down the tree
- Split the existing node (partial match)
- Create new children for the diverging suffixes

Searching for a Word#

func (t *RadixTree) Search(word string) (exists, isWord bool) {
	curr := t.root
	for {
		if len(word) == 0 {
			return true, curr.isWord
		}

		child, ok := curr.children[word[0]]
		if !ok {
			return false, false
		}

		if len(word) < len(child.prefix) {
			return false, false
		}

		if word[:len(child.prefix)] != child.prefix {
			return false, false
		}

		word = word[len(child.prefix):]
		curr = child
	}
}

Walks down the tree, matching prefixes at each step
If a child prefix matches the begining of the word, we trim that portion and continue
If the word if fully matches and the current node is a word, we return true

Prefix matching is similar to Search, but only checks if a full prefix path exists, without requiring the final node to be a full word.