How IZE Really Worked - Algorithm, Patent, Limits, and Esther Dyson
In the first post in this series, I introduced IZE -- a DOS-era personal information manager with a novel approach to search and navigation. Here I want to go deeper into how it actually worked, what its limits were, and how it was received at the time.
The algorithm
The core of IZE was patented by Paul Kleinberger (US5062074A, "Information retrieval system and method"). The basic algorithm is easy enough to describe:
- First, filter the documents to include only those that contain the search term.
- Then, across the result set, count how many documents each keyword appears in.
- Find the most frequently occurring keyword across the result set, and split the results into two groups -- those that include that top keyword, and those that don't. This defines the root node of a hierarchy tree.
- Then, recursively apply the same process (starting again from the counting step) to each group, splitting the largest nodes first, using the top keyword in each group as the next split.
- Stop when there aren't any documents left to split, or when you've filled up the screen (about 15 rows).
Consider this example:

After the first step, finding documents with "new" tagged in the Letters textbase, there are 14 documents remaining. The most frequently occurring keyword in those documents is "announcements", which appears 5 times. So the first split is into "announcements" (5 documents) and "not announcements" (9 documents). The next recursive step would be to break down "not announcements", which would split into "welcome" (4) and "not welcome" (5). Note that "not announcements" and "not welcome" are not explicitly shown in the UI, but they're present in the underlying tree structure.
The algorithm is simple, but it's also surprisingly effective. It's a form of divisive hierarchical clustering, where the splits are determined by the most frequent keyword in the current set of documents.
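Here is a minimal Python sketch of the steps above. It is an illustration, not the patented implementation. The `Node` class, the `top_keyword` and `build_tree` helpers, the alphabetical tie-break, and the sample data are all my own assumptions. Documents are modeled as plain sets of keywords, and the "screen is full" rule is approximated by a cap on the number of splits.

```python
import heapq
from collections import Counter
from dataclasses import dataclass
from itertools import count
from typing import FrozenSet, List, Optional

Doc = FrozenSet[str]  # assumption: a document is modeled as its set of keywords


@dataclass
class Node:
    docs: List[Doc]
    keyword: Optional[str] = None        # keyword this node was split on, if any
    with_kw: Optional["Node"] = None     # documents that carry the keyword
    without_kw: Optional["Node"] = None  # the implicit "not keyword" group


def top_keyword(docs: List[Doc]) -> Optional[str]:
    """Most frequent keyword that actually divides the group, or None."""
    counts = Counter(kw for doc in docs for kw in doc)
    # A keyword found in every document (such as the search term) splits nothing.
    useful = [(n, kw) for kw, n in counts.items() if n < len(docs)]
    if not useful:
        return None
    # Highest count wins; ties are broken alphabetically (an assumption).
    return min(useful, key=lambda pair: (-pair[0], pair[1]))[1]


def build_tree(docs: List[Doc], term: str, max_splits: int = 15) -> Node:
    root = Node([d for d in docs if term in d])  # the filter step
    tie = count()                                # keeps heap entries comparable
    heap = [(-len(root.docs), next(tie), root)]  # largest group is popped first
    splits = 0
    while heap and splits < max_splits:
        _, _, node = heapq.heappop(heap)
        kw = top_keyword(node.docs)
        if kw is None:
            continue  # nothing left to split in this group
        node.keyword = kw
        node.with_kw = Node([d for d in node.docs if kw in d])
        node.without_kw = Node([d for d in node.docs if kw not in d])
        splits += 1
        for child in (node.with_kw, node.without_kw):
            heapq.heappush(heap, (-len(child.docs), next(tie), child))
    return root


# Hypothetical data shaped like the example: 14 documents tagged "new".
sample = (
    [frozenset({"new", "announcements"})] * 5
    + [frozenset({"new", "welcome"})] * 4
    + [frozenset({"new"})] * 5
    + [frozenset({"old", "announcements"})] * 2  # removed by the filter
)
tree = build_tree(sample, "new")
```

With this data the root splits on "announcements" (5 versus 9), and the "not announcements" group then splits on "welcome" (4 versus 5), mirroring the walkthrough above.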
The binary tree that results gets converted into an indented hierarchy, which makes it easy to display and navigate. Selecting a subtree drills down, re-building the hierarchy from the subset of documents. Or you could go to a specific document by selecting the ▶️ symbol. There's some additional complexity related to fully redundant keywords, and some details related to how the hierarchy is displayed on a 24-row screen, but that's the core of it.
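One way to picture the tree-to-outline conversion is sketched below. It reflects my reading of the behavior rather than IZE's actual code. The tuple layout `(keyword, with_branch, without_branch)`, the `size` and `outline` helpers, and the "products" keyword are made up for illustration. Leaves are plain document counts, and documents left over in a final "not" group are simply omitted.

```python
def size(node):
    """Number of documents under a node (a leaf is just a count)."""
    if isinstance(node, tuple):
        _, with_branch, without_branch = node
        return size(with_branch) + size(without_branch)
    return node


def outline(node, depth=0):
    """Turn a binary split tree into indented rows.

    The "with keyword" branch becomes an indented child of the keyword row.
    The "without keyword" branch gets no row of its own: its splits continue
    at the same indentation level, as siblings.
    """
    rows = []
    while isinstance(node, tuple):
        keyword, with_branch, without_branch = node
        rows.append("  " * depth + f"{keyword} ({size(with_branch)})")
        rows.extend(outline(with_branch, depth + 1))
        node = without_branch  # keep walking the "not keyword" chain
    return rows


# Hypothetical tree: "announcements" is further split on "products".
tree = ("announcements", ("products", 3, 2), ("welcome", 4, 5))
rows = outline(tree)
# rows == ["announcements (5)", "  products (3)", "welcome (4)"]
```

The point of the sketch is that the chain of "not" branches flattens into sibling rows, which is why "not announcements" never appears on screen.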
It's worth comparing IZE's approach with contemporary search engines. They also filter to include items that match the search term -- that's the "retrieval" step. But they also rank results to show the most relevant documents first. IZE didn't rank results at all. Instead, it gave you structure -- a way to see what kinds of documents matched your query, and to navigate to the ones you cared about. The tradeoff was that you couldn't just scan the top few results; you had to engage with the hierarchy. But for some use cases that was arguably the right approach.
| | 1980s search engine | IZE (1988) | Modern search engine |
|---|---|---|---|
| Search | Boolean (AND/OR/NOT) | Boolean | Keyword + semantic |
| Ranking | Yes, by relevance score | None | Yes, by relevance + personalization |
| Orientation | Flat result list; you're on your own | Keyword-split hierarchy; browse the structure | Faceted filters, query suggestions, related searches |
Topical search
One feature worth calling out was what IZE called "topical search." When you drilled down through the hierarchy, you were following a strict top-down path -- each level narrowed the set further. But topical search let you reset the search from your current context, which could include items that would otherwise have been excluded by the path you'd taken.
In the running example above, you could do a topical search for "welcome", and it would show all documents tagged with both "new" and "welcome", even if they were also tagged "announcements". This was a mechanism to let you browse the structure more freely, and supported a more exploratory workflow.
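The difference between drilling down and a topical search can be sketched in a few lines of Python. This is my interpretation of the behavior described above, not IZE's code. The `drill_down` and `topical_search` functions and the sample documents are hypothetical, and documents are again modeled as sets of keywords.

```python
def drill_down(docs, path):
    """Follow the hierarchy: each step is (keyword, wanted), so a step like
    ("announcements", False) stands for an implicit "not announcements" branch."""
    return [d for d in docs if all((kw in d) == wanted for kw, wanted in path)]


def topical_search(docs, search_term, topic):
    """Restart from the original search term plus the topic keyword,
    ignoring any exclusions picked up on the way down."""
    return [d for d in docs if search_term in d and topic in d]


# Hypothetical documents, each represented by its keywords.
docs = [
    frozenset({"new", "announcements"}),
    frozenset({"new", "announcements", "welcome"}),
    frozenset({"new", "welcome"}),
    frozenset({"new", "welcome"}),
    frozenset({"new"}),
    frozenset({"welcome"}),
]

# Reaching "welcome" through the hierarchy passes through "not announcements".
by_path = drill_down(
    docs, [("new", True), ("announcements", False), ("welcome", True)]
)
by_topic = topical_search(docs, "new", "welcome")
# len(by_path) == 2, len(by_topic) == 3
```

The extra document in `by_topic` is the one tagged with both "announcements" and "welcome", which the top-down path would have hidden.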
Limits
IZE had some real constraints. The initial search was optionally full-text, but refinement was only based on the pre-defined keywords used to define the hierarchy. Support for natural language queries was limited -- there was only basic handling of affixes and synonyms. The system had a hard limit of 32,000 documents, whether internal or external (linked but not imported). Keywords could be selected manually in the text editor, or you could define rules to auto-tag documents -- something like "add this keyword if it's present" -- but there was no sophisticated natural language processing going on. This was software for inexpensive late-1980s PCs, after all.
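To make the "add this keyword if it's present" idea concrete, here is a tiny Python sketch of that kind of rule. IZE's real rule syntax isn't described here, so the `auto_tag` function and its word-matching behavior are assumptions of mine.

```python
import re


def auto_tag(text, rule_keywords):
    """Return the rule keywords that occur as whole words in the text.

    Matching is exact and case-insensitive: no stemming and no synonyms,
    which is roughly the level of sophistication described above.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return {kw for kw in rule_keywords if kw.lower() in words}


rules = ["welcome", "announcements", "new"]
tags = auto_tag("Welcome to the new office. One announcement follows.", rules)
# tags == {"welcome", "new"}: "announcement" does not match "announcements"
```

The missed match on "announcement" shows why even basic affix handling mattered for a keyword-driven system.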
Esther Dyson's take
Esther Dyson covered the technology that would become IZE (then known as TNET) in her influential newsletter Release 1.0 in May 1987. For those unfamiliar, Dyson was one of the most respected technology analysts and investors of the era -- her newsletter was required reading for anyone in the PC software industry. Getting a writeup in Release 1.0 was a significant endorsement.

She then opines:
The beauty of TNET is that it does the work. Automatic indexing is easy; automatic structuring based on such an index, to our knowledge, isn't commercially available. It can be scaled up; indeed, its value increases on large-scale text bases (although performance may be an issue)... Unlike typical text search programs, where you can get exactly what you ask for whether or not you ask for the right thing, TNET helps you explore what you could want -- not in the usual way of providing a list of key words, a ranking of items by number of matches with a list or cluster of key words, but in a meaningful way, so you can easily see what you have to choose from.
Product history
IZE was acquired and then published by Persoft, a Madison, Wisconsin-based software company. After Persoft was acquired a few years later, IZE was spun off to a new company called Retrieval Dynamics, Inc. IZE's sales weren't strong enough to sustain the company, and it was discontinued.
Today, you can download IZE from the abandonware site Vetusware and run it in a DOS emulator. I wouldn't necessarily recommend you do so, but as I'll discuss in future posts, it may provide some inspiration. But first, I'll look at technologies that came after IZE -- faceted search, outliners, topic models -- and how the problem IZE was trying to solve has been addressed (or not) by later systems.
Note: This post was primarily human-authored, with AI assistance for research, editing, and organization. The AI filled a Secondary author role. The core ideas and final voice are mine.