Thursday, February 12, 2004

Linkage in Web documents

12/02/04, Thursday: 6.09pm

Why should we view the Web as a bunch of pages and a bunch of links among them? Is that all the Web is about? If you think about it, when you create a Web site, what you have in mind is the number of topics to cover or the important concepts, not the number of HTML pages you will have or how you are going to link them. So when did this notion of Web pages and the linkage structure among them come into the picture?

Most search engines, Web services, and a lot of Web applications view the World Wide Web as a bunch of interlinked Web pages, instead of the important concepts that represent these documents. A more proper way to think about the Web is in terms of Web directories, which organize Web pages by important concepts and the relationships among those concepts.

One possible reason why we do not have this perspective of concepts and relationships among them is that it is too difficult to achieve. When you create a Web site, you often won't have time to put it in one of the pre-defined categories (or add a new category and add your page to it), define all the concepts that you talk about, and point the corresponding content to these concepts. So the whole Web consists of uncurated and undocumented Web pages, and there is no immediate way to organize these pages in terms of concepts and relationships among them. The only information we have is the bunch of interlinked Web pages, and we just have to deal with them. If we could analyze this bunch of interlinked Web pages and organize them in terms of the important concepts they talk about, along with the relationships among these concepts, that would be great! But how can we do it?