IZE meets AI -- semantic search, smarter labels, and agentic orientation
This is the last post in the IZE series. In the previous installment, I looked at two ways to generalize the IZE algorithm itself: preferring consistent facets and searching for trees with better goodness scores. Here I want to ask a different question: what does the AI revolution of the last few years actually change about what could be built here?
I see three distinct opportunities, at different layers of the system.
Three layers where AI supports or leverages IZE-style hierarchical search
Weighting splits for semantic search
The IZE algorithm's core operation is: find the tag that, if used as a split, puts the most items in the "yes" branch. It's a raw count. That works well for keyword or boolean retrieval, where all results are retrieved by a hard match and any notion of "how well does this item match the query?" is addressed by ranking.
Semantic (vector) search changes that. Instead of a binary retrieved/not-retrieved, each result comes with a similarity score. A document that closely matches the query vector scores higher than one that's only vaguely related. Retrieval becomes a numeric threshold operation that happens after ranking, instead of before. The current IZE algorithm ignores the ranking signal entirely.
A natural fix would be to weight each item's contribution to the split by its similarity score. Instead of counting how many items have a given tag, you'd sum the similarity scores for items with that tag. The algorithm picks the split that maximizes this weighted sum. The effect is that splits capturing the most query-relevant items are preferred, not just the largest groups. A tag shared by five highly-relevant items would beat a tag shared by eight weakly-relevant ones.
This change is modest in implementation -- a drop-in replacement for the count function -- but its behavioral consequences are meaningful. With vector search, there's no hard threshold for what counts as a "result"; you're always choosing a cutoff somewhere. Weighted splits let you defer that decision, naturally favoring splits that correspond to concentrations of relevance in the result set.
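To make the drop-in nature of the change concrete, here is a minimal sketch of the weighted split. The data shape -- a list of (tags, similarity) pairs -- is an assumption for illustration; with every similarity fixed at 1.0, it reduces to the original count-based IZE split.

```python
from collections import defaultdict

def best_split(results):
    """Pick the tag that maximizes the total similarity of items carrying it.

    `results` is a list of (tags, similarity) pairs, e.g. the output of a
    vector search. Setting every similarity to 1.0 recovers the original
    count-based split.
    """
    weight = defaultdict(float)
    for tags, sim in results:
        for tag in tags:
            weight[tag] += sim
    return max(weight, key=weight.get)

# Five highly-relevant items sharing "hiking" (5 * 0.9 = 4.5) beat
# eight weakly-relevant items sharing "apparel" (8 * 0.4 = 3.2).
results = [({"hiking"}, 0.9)] * 5 + [({"apparel"}, 0.4)] * 8
assert best_split(results) == "hiking"
```

Note that a pure count would have picked "apparel" (8 items vs. 5), which is exactly the behavioral difference described above.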
LLM-generated tags and labels
Traditional IZE labels are just the keyword itself. The algorithm splits on "brand:Nike" and displays either that or "Nike." For well-structured catalog metadata this is fine, but it's limiting in two ways.
The first limitation is on the input side. If your documents are free text -- product descriptions, internal wiki pages, email threads -- you either need users to manually tag everything (the original IZE burden) or run some kind of extraction. LLMs are genuinely good at this, though it's worth noting that older ML approaches (named entity recognition, text classification) worked reasonably well on structured domains like product catalogs. LLMs are just much more general: they can extract tags from arbitrary free text without domain-specific training data. Given a product description, a model can extract structured attributes -- material, use case, price tier -- at index time, and the resulting tags feed directly into the IZE algorithm without changing it.
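A sketch of how extraction might slot in at index time. The prompt wording, the attribute keys, and the `call_llm` hook are all assumptions -- any prompt-to-text model function would do -- but the output shape ("facet:value" strings) is exactly what the IZE splitter already consumes.

```python
import json

EXTRACTION_PROMPT = (
    'Extract product attributes as JSON with keys "material", '
    '"use_case", and "price_tier" from this description:\n{description}'
)

def extract_tags(description, call_llm):
    """Turn free text into IZE-style "facet:value" tags at index time.

    `call_llm` is a hypothetical stand-in for whatever model API you use:
    any function mapping a prompt string to a response string.
    """
    raw = call_llm(EXTRACTION_PROMPT.format(description=description))
    attrs = json.loads(raw)
    return {f"{facet}:{value}" for facet, value in attrs.items() if value}

# Stubbed model response, for illustration only:
fake_llm = lambda prompt: (
    '{"material": "nylon", "use_case": "hiking", "price_tier": "budget"}'
)
tags = extract_tags(
    "Lightweight nylon daypack, great for trails, under $50", fake_llm
)
assert tags == {"material:nylon", "use_case:hiking", "price_tier:budget"}
```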
The second limitation is on the output side, and it's more interesting. A node split on "outdoor AND lightweight AND under $50" gets labeled in the current system as those three facet labels, on separate rows. That's accurate but not particularly readable, especially as the hierarchy deepens. An LLM could generate a short natural-language label for each node -- "budget hiking gear" or "entry-level camping essentials" -- using the facet values and a sample of the items in that node as input. This label could be more useful to a browsing user than the raw conjunction of filters.
The challenges are latency and coverage: computing these labels in real time could slow down user interactions, and you can't pre-compute labels for every possible query result. But you can pre-compute the common cases, cache new labels as they arise, and fall back to facet-value labels for the long tail.
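The cache-plus-fallback policy can be sketched as follows. The `generate_label` hook is a hypothetical LLM call (facets plus a sample of items in, short label out); when it's absent or skipped for latency, the node keeps its raw facet-value label, as in the current system.

```python
def node_label(facets, items, cache, generate_label=None):
    """Label a hierarchy node, preferring cached LLM-generated labels.

    `generate_label(facets, sample_items)` is a hypothetical hook into a
    model; when unavailable, fall back to the raw conjunction of facet
    values.
    """
    key = tuple(sorted(facets))
    if key in cache:
        return cache[key]                 # pre-computed common case
    if generate_label is not None:
        label = generate_label(facets, items[:5])  # small sample keeps the prompt cheap
        cache[key] = label                # new labels are cached as they arise
        return label
    return " AND ".join(key)              # long-tail fallback

facets = ["outdoor", "lightweight", "under $50"]
fake_labeler = lambda f, sample: "budget hiking gear"

# Without a model, the raw conjunction; with one, the friendly label, cached.
assert node_label(facets, [], {}) == "lightweight AND outdoor AND under $50"
cache = {}
assert node_label(facets, [], cache, fake_labeler) == "budget hiking gear"
assert node_label(facets, [], cache) == "budget hiking gear"  # cache hit
```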
IZE hierarchies for agentic orientation
This one is more speculative. AI agents doing retrieval -- browsing documents, fetching context, answering questions over large corpora -- have an orientation problem that's structurally similar to the one IZE was designed to solve for humans. An agent dropped into an unfamiliar knowledge base doesn't know the vocabulary, doesn't know the distribution of content, and has to issue multiple queries to figure out what's there. Each query costs tokens and latency. The agent might retrieve irrelevant documents, miss relevant ones, or loop through the same territory repeatedly.
An IZE hierarchy over the corpus is a compact representation of its structure and contents. A one-page hierarchy for a moderately complex corpus might fit in 50-100 tokens and give the agent a useful map before it issues a single retrieval query. The agent could look at the top two levels, decide which branch is relevant to its current task, then drill into that subtree -- issuing targeted queries rather than broad exploratory ones. This is token-efficient in a way that sending a large retrieved set to the model isn't.
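What "a compact map" might look like in practice: a sketch that renders the top levels of a hierarchy as indented text suitable for dropping into an agent's context. The node shape -- (label, item_count, children) -- and the example corpus are assumptions for illustration.

```python
def render_hierarchy(node, depth=0, max_depth=2):
    """Render the top `max_depth` levels of a hierarchy as a text map.

    `node` is assumed to be (label, item_count, children). The counts let
    the agent judge where the bulk of the corpus lives before issuing any
    retrieval query.
    """
    label, count, children = node
    lines = [f"{'  ' * depth}- {label} ({count})"]
    if depth < max_depth - 1:
        for child in children:
            lines.extend(render_hierarchy(child, depth + 1, max_depth))
    return lines

tree = ("all docs", 1200, [
    ("deployment runbooks", 300, [("kubernetes", 180, []), ("bare metal", 120, [])]),
    ("API reference", 500, []),
    ("postmortems", 400, []),
])
print("\n".join(render_hierarchy(tree)))
```

With `max_depth=2` this emits one line per node in the top two levels -- four short lines for the example tree -- which the agent reads once and then drills into the relevant subtree with targeted queries.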
The hierarchy might also make agent behavior more predictable. Rather than issuing open-ended queries and hoping for good retrieval, the agent follows a structured path. That structure can be logged, audited, and optimized. Whether this would actually improve agent performance in practice is an open empirical question, but the mechanism seems sound. The same intuition behind IZE's user interface design -- "a search UI should educate as well as retrieve" -- applies to AI systems navigating unfamiliar document spaces, not just to humans.
That's the end of the series. To recap: IZE was a 1988 DOS application that built dynamically-generated hierarchies from tagged documents. It was commercially unsuccessful but technically interesting, and its core idea -- splitting search results recursively on the most informative single keyword -- was re-invented and studied independently in several different domains. Today, with rich catalog metadata, vastly faster computers and algorithms, and LLMs, the algorithm is more tractable than ever. E-commerce and enterprise search use-cases seem relevant, and modern AI technology could be both a user and a driver of effective systems.
Whether any of this gets built in a product is a different question. If you have a compelling use case, please reach out.
Notes:
- This post was primarily human-authored, with AI assistance for research, editing, diagram drafting, and organization. The AI filled a Secondary author role. The core ideas and final voice are mine.