Question
Estimate the average number of visitors per page (per website) from a page-view stream of tuples when only 20% of the stream is accessible
Let us assume that we have access to page view data from several websites in the form of a stream of tuples with the following schema: (Domain_ID, Page_ID, Person_ID)
Let's make the following assumptions about this stream:
Domain_IDs are unique, but a Page_ID is unique only within a Domain (i.e., different Domains may assign the same ID to different pages).
Person_IDs are unique only within a Domain (different Domains may assign the same ID to different visitors).
In the full input stream, each Domain has S pages visited by N visitors each and D pages visited by 2N visitors each; no page has any other number of visitors.
There are no duplicate visits by any user for any page, i.e. a given (Domain_ID, Page_ID, Person_ID) tuple occurs only once in the entire stream.
We are interested in answering the following query: for each domain, estimate the average number of visitors per page.
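As a reference point for that query, the true per-domain average follows directly from the page counts assumed above. The expression below is a worked restatement using the question's own symbols S, D and N; the name \bar{v}_{\text{true}} is introduced here only for illustration.

```latex
\[
\bar{v}_{\text{true}}
  = \frac{S \cdot N + D \cdot 2N}{S + D}
  = N \cdot \frac{S + 2D}{S + D}
\]
```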
Suppose we only have access to a random sample that captures 20% of the entire input stream. Estimate how much the per-domain averages computed from this sample underestimate or overestimate the true averages in the stream.
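One way to explore this is a minimal sketch for a single domain, under an assumption the question does not spell out: the sample keeps each tuple independently with probability p = 0.2. The function names (`true_average`, `approx_sample_average`, `simulate`) are hypothetical, and the closed form is only a ratio-of-expectations approximation, not an exact expected value. Pages whose visitors are all dropped disappear from the sample, which is why the denominator shrinks.

```python
import random


def true_average(S, D, N):
    """True average visitors per page: S pages with N visitors, D pages with 2N."""
    return N * (S + 2 * D) / (S + D)


def approx_sample_average(S, D, N, p=0.2):
    """Approximate average visitors per page as seen in the sample.

    Assumes each tuple is kept independently with probability p.
    Uses (expected kept tuples) / (expected pages with at least one kept tuple),
    which is an approximation, not the exact expectation of the ratio.
    """
    q = 1 - p
    expected_tuples = p * N * (S + 2 * D)
    expected_pages_seen = S * (1 - q ** N) + D * (1 - q ** (2 * N))
    return expected_tuples / expected_pages_seen


def simulate(S, D, N, p=0.2, seed=0):
    """Monte Carlo check: sample tuples page by page and average over pages seen."""
    rng = random.Random(seed)
    kept_counts = []
    for visitors in [N] * S + [2 * N] * D:
        kept = sum(1 for _ in range(visitors) if rng.random() < p)
        if kept > 0:  # a page with no sampled tuple is invisible in the sample
            kept_counts.append(kept)
    return sum(kept_counts) / len(kept_counts) if kept_counts else 0.0


if __name__ == "__main__":
    S, D, N, p = 100, 100, 50, 0.2
    raw = approx_sample_average(S, D, N, p)
    print("true average:        ", true_average(S, D, N))
    print("raw sample average:  ", raw)
    print("rescaled by 1/p:     ", raw / p)
    print("simulated raw sample:", simulate(S, D, N, p))
```

Under this assumption, the raw sample average is roughly p times the true average when N is large (about a fivefold underestimate at p = 0.2), and dividing by p largely corrects it. When N is small, many pages vanish from the sample entirely, so the rescaled figure overestimates the true average instead.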