2007-09-19

Die Google! - future search engines


While talking with a friend, I realized that Google isn't the last word in web search engines.

From a technical point of view, Google's search engine isn't magic; indexing data is nothing revolutionary.

The only interesting thing about Google's success is their method of promoting good results and punishing bad ones in search rankings. In Google's nomenclature it's called the (mystified) PageRank. They keep the exact algorithm secret, but in the beginning it was essentially a way of counting the links pointing to a given website.
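To make the link-counting idea concrete, here is a minimal sketch of the published PageRank intuition: a page's score is spread across the pages it links to and the process is iterated until it settles. This is not Google's actual algorithm; the damping factor, iteration count and the toy link graph are all illustrative assumptions.

```python
# Sketch of the basic PageRank iteration (illustrative, not Google's real code).
def pagerank(links, damping=0.85, iterations=20):
    pages = list(links)
    # Start with an even score for every page.
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline, the rest comes from incoming links.
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            if not outgoing:
                continue
            # A page shares its current score equally among the pages it links to.
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] = new_rank.get(target, 0.0) + share
        rank = new_rank
    return rank

# Toy web of three pages linking to each other (made-up example data).
toy_web = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}
print(pagerank(toy_web))
```

The only inputs here are pages and the links between them, which is exactly the point made below: that was all the data the original algorithm had to work with.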


The development of Sergey and Larry's algorithm was constrained by their input data. The only thing they had was the source of web pages and the links between sites. Nothing more. (Maybe the domain name, the link URL and the hosting provider, but those don't say anything about the content.)

Currently, in the era of user-generated content, we have a bit more information.

We have a direct link between the content and its author. In the Web 2.0 era we know exactly who wrote a given text, where and when.

If you search for some information and find someone's answer useful, maybe you would like to look for more posts by that person. Maybe you would like to see which threads this person reads. Or if you read some forum, it could be interesting to know that most of the people on that forum also like a different one. (This is already implemented in online shops: you can see that people who bought this item also bought something else.) Or when you search for information, maybe you would like to see content created by your friends first?
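The "people who bought this also bought" mechanism mentioned above is just co-occurrence counting. Here is a tiny sketch of the idea applied to forums; the forum names and "baskets" are made-up example data, not any real site's implementation.

```python
# Sketch of "people interested in X are also interested in Y" via co-occurrence counts.
from collections import Counter
from itertools import combinations

# Each set is one user's collection of followed forums (invented data).
baskets = [
    {"forum-linux", "forum-python", "forum-networking"},
    {"forum-linux", "forum-networking"},
    {"forum-python", "forum-webdev"},
]

# Count how often two forums appear together for the same user.
co_occurs = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_occurs[(a, b)] += 1
        co_occurs[(b, a)] += 1

def also_liked(item, top=3):
    """Forums most often followed together with `item`."""
    related = Counter({b: n for (a, b), n in co_occurs.items() if a == item})
    return related.most_common(top)

print(also_liked("forum-linux"))
```

The same counting works for authors instead of items, which is exactly the kind of signal a classic crawler never sees.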

I want to emphasize that exact information about the author of the content is something Google will never be able to get.

Maybe the idea is to change the meaning of search. Maybe the point of 'search' nowadays is to link the user not with the content, but with the author?

I'm already searching this way.
When I find something interesting, I want to see other things by the author. I want to know what he is interested in: I search for his profile on LinkedIn, I look for his links on del.icio.us.

How about you?



3 comments:

tomalak said...

That's all well and good if you have an internet whose infrastructure relies inherently on the existence of pre-defined social networking services. Do you really want to give individual sites the power to control the entire infrastructure of how the internet is glued together?

The net is about random, decentralised and unmoderated publication. The only thing that everything on the net should be forced to have in common is the HTTP protocol: text. And modern search engines handle that very nicely already.

Well, except Cuil. :)

majek said...

> Do you really want to give individual sites
> the power to control the entire infrastructure
> of how the internet is glued together?

It's not about the infrastructure, it's about the content. I don't want to give them the power; they already have it. Look at Facebook, Twitter, etc. They can remove any content they like.

I feel that the www/HTTP stack has a major feature missing: "global" authentication.

Google doesn't know who the author of the content is. Facebook does.

Give me www with authentication, and then even Google could have this knowledge.

But for now, Facebook has a strong advantage, though I think they still don't know how to use it.

> Modern search engines handle http/text very
> nicely already.

Why did Google become so popular? They found a way to connect and rank documents. But it's time to realize that the <a> tag is not the only connection on the net. The connection to the real author of the content is also relevant. And (only?) social networking sites have this data.
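One way to picture this is to treat "written by the same author" as an extra edge type next to ordinary <a> links, so a document's score can borrow from its author's other work. A hedged sketch, with all document names, scores and weights invented for illustration:

```python
# Sketch: mix a document's link-based score with its author's reputation.
link_score = {"post1": 0.6, "post2": 0.2, "post3": 0.9}   # e.g. from a PageRank-like score
author_of = {"post1": "alice", "post2": "alice", "post3": "bob"}

# An author's reputation: the average link score of their documents.
scores_by_author = {}
for doc, author in author_of.items():
    scores_by_author.setdefault(author, []).append(link_score[doc])
author_rep = {a: sum(s) / len(s) for a, s in scores_by_author.items()}

# Final score blends the document's own links with its author's reputation.
WEIGHT = 0.3  # arbitrary mixing weight, purely illustrative
final = {
    doc: (1 - WEIGHT) * link_score[doc] + WEIGHT * author_rep[author_of[doc]]
    for doc in link_score
}
print(sorted(final.items(), key=lambda kv: -kv[1]))
```

The data the sketch needs, a reliable mapping from document to author, is exactly what the open web lacks and social networks have.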

Sajan said...

www.udacity.com has a "build your own search engine" course which explains the PageRank method.