Could We Build IZE Again?

Mar 9, 2026 · 6 min read · search history ize

In the previous installment of this series, I looked at what came after IZE -- faceted search, clustering algorithms, and the various ways web search, personal information management, and e-commerce tried to solve problems similar to the ones IZE was attacking. None of them ended up doing what IZE did. The question I want to take up here is: could we build something like IZE today? I think the answer is yes, for at least one domain. IZE's hierarchical search algorithm is worth revisiting for e-commerce operators with medium-to-large catalogs who want to help new customers orient themselves, and modern catalog metadata makes it easier to implement than ever.

Why e-commerce is the right fit

What would a modern product leveraging the IZE algorithm look like? The original product required users to manually and consistently tag their documents with keywords, which was a significant burden. What if we thought of IZE as a search UI pattern, instead of a product? Modern e-commerce catalogs already have rich metadata for filtering and faceted display -- brand, category, color, size, material, price range, and so on -- that could serve as a replacement for those user-defined keywords. You don't need to ask anybody to tag anything new, and you don't need to worry about stopwords or morphology issues that arise in free-text search. The facet values are already there.
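To make the substitution concrete, here is a minimal sketch of turning one catalog record into IZE-style keywords. The field names, the `facet:value` string format, and the price bounds are all assumptions chosen for illustration, not anything taken from a particular catalog or from the original IZE.

```python
# Hypothetical helper: derive IZE-style keywords from existing catalog metadata.
FACETS = ("brand", "category", "color", "size", "material")


def price_range(price, bounds=(25, 50, 100, 250)):
    """Bucket a numeric price into a coarse label; the bounds are arbitrary."""
    lower = 0
    for upper in bounds:
        if price < upper:
            return f"{lower}-{upper}"
        lower = upper
    return f"{lower}+"


def to_keywords(product):
    """Return the set of 'facet:value' strings that stand in for user tags."""
    keywords = set()
    for facet in FACETS:
        value = product.get(facet)
        if value is not None:
            keywords.add(f"{facet}:{value}")
    if "price" in product:
        keywords.add(f"price:{price_range(product['price'])}")
    return keywords


sneaker = {"brand": "Acme", "category": "sneakers", "color": "blue",
           "size": 10, "price": 89.0}
keywords = to_keywords(sneaker)
```

Every keyword comes from a field the catalog already has, so nothing new needs to be tagged by hand.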

Catalogs range from hundreds to millions of items, and shoppers often don't know exactly what they're looking for. Medium-to-large catalogs are exactly where new-user disorientation is most costly.

As I noted in the first post in this series, a search UI has two jobs: helping people find relevant information, and educating them about the structure of the information space. IZE was unusually good at the second job. Most modern search UIs focus on the first job, and do a weaker job at the second.

Faceted search is fine if you know you want "blue sneakers, size 10, under $100." It's less great if you're browsing a new-to-you outdoor gear store and you're not sure what category your need falls into. A dynamically-generated hierarchy could orient you to the available content in a way that faceted search doesn't. This is the specific gap IZE was designed to fill: not just finding, but educating users about the shape of the result set.

How the algorithm works

What I find appealing about IZE's approach is that it's synthetic and dynamic. In the original implementation, as described in a previous post, the hierarchy isn't pre-defined by a taxonomy team; it emerges from the documents themselves and their user-specified keywords, in response to the user's current query. That's different from faceted search, where the facets are orthogonal and fixed in advance.

Applied to an e-commerce catalog, the algorithm works just like the original, but uses the catalog's facets instead of user-defined keywords. After the initial search, which provides the result grid, the algorithm issues follow-up queries against the search engine's faceted-search feature, which returns the number of matching items for each facet value. For example, if the user searches for "blue phone", the search engine might indicate that "brand:Samsung" has 100 matches, "brand:Apple" has 50, and so on. The algorithm uses these counts to greedily select the splits, then recurses on each split to build the IZE hierarchy. The result is a dynamically-generated hierarchy of facet values, which the user can navigate to understand and explore the catalog. Unlike the original IZE, we'd implement a two-column UI: the hierarchy on the left, the filtered items on the right, updating as the user navigates.
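The greedy-split-then-recurse idea can be sketched in a few lines. This is one plausible reading of the description above, not the original IZE algorithm or the demo's code. The toy `CATALOG`, the tie-breaking rule, the choice to skip values shared by every item in a node, and the choice to place each item under exactly one sibling are all assumptions. `facet_counts` stands in for the search engine's facet-count response.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical result set for a query like "blue phone".
CATALOG = {
    "p1": {"brand:Samsung", "color:blue", "type:phone"},
    "p2": {"brand:Samsung", "color:blue", "type:phone"},
    "p3": {"brand:Apple", "color:blue", "type:phone"},
    "p4": {"brand:Apple", "color:blue", "type:case"},
}


@dataclass
class Node:
    label: str
    items: list
    children: list = field(default_factory=list)


def facet_counts(item_ids, used):
    """Stand-in for the engine's facet counts over a set of items."""
    counts = Counter()
    for item_id in item_ids:
        counts.update(v for v in CATALOG[item_id] if v not in used)
    return counts


def build_hierarchy(item_ids, used=frozenset(), depth=2):
    """Greedily split item_ids by most common facet value, then recurse."""
    if depth == 0:
        return []
    # Values shared by every item in this node cannot split it, so skip them.
    node_counts = facet_counts(item_ids, used)
    used = used | {v for v, n in node_counts.items() if n == len(item_ids)}
    remaining, children = set(item_ids), []
    while remaining:
        counts = facet_counts(remaining, used)
        if not counts:
            break
        # Greedy choice: highest count, ties broken alphabetically.
        value, _ = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        matched = sorted(i for i in remaining if value in CATALOG[i])
        children.append(
            Node(value, matched, build_hierarchy(matched, used | {value}, depth - 1))
        )
        remaining -= set(matched)
    return children


def render(nodes, indent=0):
    """Flatten the hierarchy into indented outline lines with item counts."""
    lines = []
    for node in nodes:
        lines.append("  " * indent + f"{node.label} ({len(node.items)})")
        lines.extend(render(node.children, indent + 1))
    return lines


tree = build_hierarchy(list(CATALOG))
outline = render(tree)
```

On this toy data, `color:blue` never appears as a split because every item has it. That matches the intuition that a value covering the whole result set tells the user nothing about its shape.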

There's also a relationship here to post-query refinement suggestions (PQRSs), which I wrote about earlier. PQRSs are a lighter-weight version of the same idea -- after you submit a query, the system suggests filters that would help you narrow the results. IZE's approach is more structured and recursive, but the intuition is similar: show the user the shape and scope of the result set, not just the results.

A demo

To explore these ideas, I built a naive implementation on top of an existing search engine with demo data.

The implementation is not difficult, but it has a significant performance problem: the recursion requires re-querying the search engine for each node. So, rather than searching for "blue phone" once, behind the scenes you might be searching for "blue phone" and "blue phone AND color:blue" and "blue phone AND color:blue AND brand:Apple" and so on. This is not a production-ready approach, but it does show what the two-column UI might look and feel like.
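A rough sketch of why this adds up quickly: every node in the hierarchy needs its own conjunctive query. The helpers below are hypothetical. The `AND` syntax simply mirrors the examples above, and the uniform branching factor is an assumption made to keep the arithmetic simple.

```python
def node_query(base_query, path):
    """Conjunctive query for one node: the base query plus its facet path."""
    return " AND ".join([base_query, *path])


def queries_for_path(base_query, path):
    """All queries issued while descending to one leaf of the hierarchy."""
    return [node_query(base_query, path[:i]) for i in range(len(path) + 1)]


def total_queries(branching, depth):
    """Queries needed if every node down to `depth` is searched separately."""
    return sum(branching ** level for level in range(depth + 1))


path_queries = queries_for_path("blue phone", ["color:blue", "brand:Apple"])
full_tree = total_queries(branching=5, depth=3)
```

With five children per node and three levels, that is 1 + 5 + 25 + 125 = 156 searches behind a single user query, each one paying for ranking work the hierarchy never uses.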

Want to try it out? The demo is here. It's a bit buggy, but it's fun to play with! You can compare the standard faceted search UI with what the IZE algorithm produces, and note how much more you learn about the structure of your search results from the IZE algorithm.

Comparison of IZE hierarchy with faceted search

The IZE hierarchy is more intuitive and easier to navigate: it does not require scrolling, and it provides a better sense of the structure of the result set. However, experienced users who want to filter on a specific, known facet value may find it less efficient than faceted search.

Could it be optimized?

The performance of the algorithm could be improved with some engineering effort. The recursion is expensive because we're starting from scratch with a new query at each step, and because each query also performs ranking and other work that the hierarchical clustering doesn't need. An implementation that re-uses intermediate results, such as the roaring bitmaps generated for specific sub-searches, could be much faster -- fast enough, I think, to be viable for production use on medium-to-large catalogs. The bitmaps could be cached and re-used for subsequent sub-queries, and the recursive algorithm could be optimized to avoid redundant or unnecessary work.
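Here is a minimal sketch of the bitmap idea, using plain Python integers as bitsets in place of roaring bitmaps (a real implementation would presumably use a compressed bitmap library). The toy catalog and function names are assumptions. The point is only that, once there is one bitmap per facet value, a node's item set and its facet counts come from bitwise ANDs rather than new searches.

```python
from collections import defaultdict

# Hypothetical result set; list position doubles as the bit index.
CATALOG = [
    {"brand:Samsung", "color:blue", "type:phone"},
    {"brand:Samsung", "color:blue", "type:phone"},
    {"brand:Apple", "color:blue", "type:phone"},
    {"brand:Apple", "color:blue", "type:case"},
]


def build_postings(catalog):
    """One bitmap per facet value; bit i is set when item i has that value."""
    postings = defaultdict(int)
    for position, values in enumerate(catalog):
        for value in values:
            postings[value] |= 1 << position
    return dict(postings)


def popcount(bitmap):
    """Number of set bits, i.e. number of items in the bitmap."""
    return bin(bitmap).count("1")


def facet_counts(node_bitmap, postings, used):
    """Counts for each unused facet value inside a node, with no new search."""
    counts = {}
    for value, bitmap in postings.items():
        if value in used:
            continue
        matches = popcount(node_bitmap & bitmap)
        if matches:
            counts[value] = matches
    return counts


postings = build_postings(CATALOG)        # built once, re-used by every node
results = (1 << len(CATALOG)) - 1         # bitmap of the initial result set
phones = results & postings["type:phone"]  # child node: a single AND
phone_counts = facet_counts(phones, postings, used={"type:phone", "color:blue"})
```

The initial search still runs once to produce `results`. After that, descending the hierarchy only intersects cached bitmaps, which is the re-use described above.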

The product question

The real barrier isn't engineering -- it's operator adoption risk. Users are trained on faceted search; a new pattern requires a reason to switch. The case for trying: new-user conversion on large catalogs is a real, measurable problem, and the IZE approach is a testable hypothesis. An A/B test comparing standard faceted search against a dynamically-generated hierarchy for new users on a large catalog would be a reasonable way to find out.

In the last two posts of this series, I'll cover extensions and variations on IZE that might yield even more compelling and intuitive hierarchies, and explore the implications of modern AI for IZE-based search UIs.

And if you're interested in implementing some variation of the IZE algorithm efficiently, for a real business problem, I'd love to hear from you.

Note: This post was primarily human-authored, with AI assistance for research, editing, and organization. The AI filled a Secondary author role. The core ideas and final voice are mine.