2010-01-07

Search in the versioned world

Whatever you are using, be it a blog, a portal, a forum or a microblogging service, search works on the current set of documents.


If I remove my last blog post, I expect search to omit it from the results. Search ought to cover only the most recent state of the documents.

But what if you can navigate through the history of a document? What should appear in the search results? We are used to seeing only the most recent version of a document. For example, Wikipedia's search doesn't show documents that contained the searched phrase only in the past.
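
To make the two options concrete, here is a minimal Python sketch. It assumes each document simply keeps a list of its past texts; `VersionedDoc` and `search` are hypothetical names made up for illustration, not the API of any real wiki or blog engine.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VersionedDoc:
    """Hypothetical document that remembers every saved version."""
    title: str
    versions: List[str] = field(default_factory=list)  # oldest first

    def save(self, text: str) -> None:
        self.versions.append(text)


def search(docs: List[VersionedDoc], phrase: str,
           include_history: bool = False) -> List[Tuple[str, int]]:
    """Return (title, version_index) pairs whose text contains phrase.

    With include_history=False only the latest version of each document
    is checked, which is how most search boxes behave.
    """
    hits = []
    for doc in docs:
        if not doc.versions:
            continue
        if include_history:
            candidates = list(enumerate(doc.versions))
        else:
            last = len(doc.versions) - 1
            candidates = [(last, doc.versions[last])]
        for index, text in candidates:
            if phrase.lower() in text.lower():
                hits.append((doc.title, index))
    return hits


page = VersionedDoc("Search")
page.save("Full-text search with an inverted index.")
page.save("Full-text search.")  # the remark about the index was removed

print(search([page], "inverted index"))                        # []
print(search([page], "inverted index", include_history=True))  # [('Search', 0)]
```

The second call is the interesting one: it finds a version that no longer exists in the current state, which is exactly the result a conventional search hides.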

This problem also appears in other contexts. Consider using a VCS. I created some content, or code, and later removed it because it looked unneeded. Some time later I realized that I need this content back. What can I do to recover it? Currently I have to manually review diffs in Git, Mercurial, or whatever VCS I use. Shouldn't there be a button somewhere: "search in history"?
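
What such a "search in history" would have to do can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the history is a plain list of (revision id, full text) pairs, oldest first, and `find_changes` is a made-up name, not a command of any VCS.

```python
def find_changes(revisions, needle):
    """Return (revision_id, 'added' | 'removed') for every revision in
    which the number of occurrences of needle differs from the
    previous revision."""
    changes = []
    previous_count = 0
    for rev_id, text in revisions:
        count = text.count(needle)
        if count > previous_count:
            changes.append((rev_id, "added"))
        elif count < previous_count:
            changes.append((rev_id, "removed"))
        previous_count = count
    return changes


# Toy history: a helper function is introduced in r2 and dropped in r3.
history = [
    ("r1", "def main():\n    pass\n"),
    ("r2", "def helper():\n    return 42\n\ndef main():\n    pass\n"),
    ("r3", "def main():\n    pass\n"),
]

print(find_changes(history, "def helper"))
# [('r2', 'added'), ('r3', 'removed')]
```

With the revision that removed the content identified, recovering it is a matter of looking at the revision just before it.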

So, how do we expect a search to work on versioned content? Is that feature useful at all?


2 comments:

cezio said...

Actually, Mercurial has a grep command, which greps history. See hg grep --help for details.

Git's grep works on one revision, but git log has the -S switch, which looks for differences that introduce or remove an instance of a string.

Nick Rosencrantz said...

In many cases, all you have to go on when retrieving the version you are looking for is when it was committed. So from a user's perspective, one easy-to-handle approach could be a time-machine-like function that shows the archive relative to a date, as it was at that time. That is, you switch "now" to another, historical "now" and browse the archive exactly the way it was in, for instance, February last year.