content quality
we are now getting better content, such as Flickr
content growth
professional content
personal content
total web growth: 1-3 M/day
published: 3-4 GB/day
professional web: 2 GB
social web: ~5-10 GB
private text: 2 TB
upper bound: 140 TB
fragmentation
content ownership is fragmenting
Yahoo's share of web content: >10% of the web
Metcalfe's law
The community value of a network grows as the square of the number of its users.
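A minimal sketch of the square-law relationship stated above. The function name `metcalfe_value` and the scaling constant `k` are assumptions made for illustration; the talk gave only the proportionality, not a formula with units.

```python
def metcalfe_value(users: int, k: float = 1.0) -> float:
    """Relative network value under Metcalfe's law: k * n^2.

    k is a hypothetical scaling constant; only ratios between
    values are meaningful here.
    """
    return k * users ** 2


# Doubling the number of users quadruples the relative value.
for n in (10, 20, 40):
    print(n, metcalfe_value(n))
```

Because the value is relative, the useful comparison is the ratio between two network sizes rather than any single number.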
social media challenges
find it: find the right data, which is original and worth indexing
combine it: combine various types of data and media
coping with scale
upper bound: 140 TB/day
me: it's a simplistic way of calculating the data; data growth is not a linear function.
storage: 52 PB/yr
cost: 25 million dollars/yr
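As a rough check of the figures above, the sketch below converts the daily upper bound into a yearly total and derives an implied cost per TB. It assumes decimal units (1 PB = 1000 TB), which the notes do not state. The helper `compounded_yearly_tb` and its growth rate are hypothetical; they only illustrate the point that the yearly total grows if the daily volume is not constant.

```python
TB_PER_PB = 1000  # assumption: decimal units, not stated in the talk

daily_upper_bound_tb = 140
# Flat (linear) extrapolation: 140 TB/day * 365 days = 51.1 PB/yr,
# close to the quoted 52 PB/yr.
yearly_pb = daily_upper_bound_tb * 365 / TB_PER_PB

quoted_storage_pb = 52
quoted_cost_usd = 25_000_000
# Implied cost per TB stored per year, from the two quoted figures.
cost_per_tb = quoted_cost_usd / (quoted_storage_pb * TB_PER_PB)


def compounded_yearly_tb(start_tb_per_day: float,
                         daily_growth_rate: float,
                         days: int = 365) -> float:
    """Yearly total when the daily volume grows by a fixed rate each day.

    daily_growth_rate is a hypothetical parameter for illustration.
    """
    total = 0.0
    rate = start_tb_per_day
    for _ in range(days):
        total += rate
        rate *= 1 + daily_growth_rate
    return total


print(f"flat estimate: {yearly_pb:.1f} PB/yr")
print(f"implied cost: ${cost_per_tb:.2f} per TB per year")
print(f"with 0.1%/day growth: {compounded_yearly_tb(140, 0.001) / TB_PER_PB:.1f} PB/yr")
```

With zero growth the helper reproduces the flat estimate; any positive growth rate gives a larger yearly total.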
key things in generating data and owning a part of the internet
gathering content
making deals
working with users
understanding the content
indexing/search tools: Lucene, Zettair, Lemur
new in web search
class-specific QP
vertical alignment
inlined content from 3rd parties
new interfaces, as in MSN web search
user contributed structures
integrated UGC (user-generated content)
simple structured queries
query correction
local search
structured content
where research has mostly been done:
core search
local search
where it has not been done:
effectively using desktop real estate
providing integrated content to users
integrating social networking into search
Questions:
how to monetize AJAX-based websites
how search engines crawl graphics-heavy and AJAX sites