Monday, April 09, 2007

Notes on the Andrew Tomkins Presentation

Topic: Web search and online communities

content quality
we are now getting better content, e.g. Flickr

content growth
professional content
personal content
total web growth: 1-3 M/day
published: 3-4 GB/day
professional web: 2 GB
social web: ~5-10 GB
private text: 2 TB
upper bound: 140 TB

content ownership is fragmenting
Yahoo's share of web content: >10% of the web

Metcalfe's law
The community value of a network grows as the square of the number of its users.
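
A minimal sketch of the law as stated above, to make the "square of the users" point concrete. The function name `metcalfe_value` and the constant `k` are my own placeholders, not something from the talk; the law only says value is proportional to n squared.

```python
def metcalfe_value(users: int, k: float = 1.0) -> float:
    """Modeled network value under Metcalfe's law: k * users^2.

    k is an arbitrary proportionality constant (an assumption for illustration).
    """
    return k * users ** 2


# Doubling the number of users quadruples the modeled value.
ratio = metcalfe_value(2_000) / metcalfe_value(1_000)
print(ratio)
```

So a community that doubles in size is, by this model, worth four times as much, which is why ownership of user communities matters so much here.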

social media challenges
find it: find the right data, i.e. data that is original and worth indexing
combine it: combine various types of data and media

coping with scale
upper bound: 140 TB/day
me: this is a simplistic way of calculating the data volume. data growth is not a linear function.

storage: 52pb/yr
cost: 25 million dollars/yr
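
A quick check of the arithmetic behind these numbers, as a rough sketch. Assumptions that are mine and not from the talk: decimal units (1 PB = 1000 TB), a 365-day year, and the `daily_growth` parameter, which is a made-up rate only there to illustrate the non-linear growth remark above.

```python
TB_PER_PB = 1000  # decimal units; an assumption, the notes do not specify


def yearly_storage_pb(tb_per_day: float, days: int = 365,
                      daily_growth: float = 0.0) -> float:
    """Total storage over `days`, in PB.

    daily_growth is a hypothetical compounding rate (0.0 means the flat,
    linear estimate used in the talk).
    """
    total_tb = 0.0
    rate = tb_per_day
    for _ in range(days):
        total_tb += rate
        rate *= 1.0 + daily_growth
    return total_tb / TB_PER_PB


# Flat estimate: 140 TB/day for a year, roughly the 52 PB/yr quoted.
linear_pb = yearly_storage_pb(140)

# Same starting rate with a hypothetical 0.1% daily growth, for comparison.
growing_pb = yearly_storage_pb(140, daily_growth=0.001)

# Implied cost per TB from the quoted 25 million dollars for 52 PB.
cost_per_tb = 25_000_000 / (52 * TB_PER_PB)

print(round(linear_pb, 1), round(growing_pb, 1), round(cost_per_tb, 2))
```

The flat estimate comes out a little under the quoted 52 PB/yr, so that figure looks like 140 TB/day times a year, rounded up. With any positive growth rate the yearly total is larger than the flat estimate.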

key things in generating data and owning a part of the internet
gathering contents
making deals
working with users

understanding the content
search/indexing toolkits: Lucene, Zettair, Lemur

new in web search
class-specific QP
vertical alignment
inlined content from 3rd parties
new interfaces, as in MSN web search
user-contributed structures
integrated UGC (user-generated content)
simple structured queries
query correction
local search
structured content

where research has mostly been done:
core search
local search

where it has not been done:
effectively using desktop real estate
providing integrated content to users
integrating social networking into search

how to monetize AJAX-based websites
how search engines crawl graphic and AJAX sites

ooooooooooooo End of the post oooooooooooooooo
