|
|
|
|
LEADER |
01914 am a22002653u 4500 |
001 |
87005 |
042 |
|
|
|a dc
|
100 |
1 |
0 |
|a Indyk, Piotr
|e author
|
100 |
1 |
0 |
|a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
|e contributor
|
100 |
1 |
0 |
|a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
|e contributor
|
100 |
1 |
0 |
|a Indyk, Piotr
|e contributor
|
100 |
1 |
0 |
|a Rubinfeld, Ronitt
|e contributor
|
700 |
1 |
0 |
|a Levi, Reut
|e author
|
700 |
1 |
0 |
|a Rubinfeld, Ronitt
|e author
|
245 |
0 |
0 |
|a Approximating and testing k-histogram distributions in sub-linear time
|
260 |
|
|
|b Association for Computing Machinery (ACM),
|c 2014-05-15T18:13:10Z.
|
856 |
|
|
|z Get fulltext
|u http://hdl.handle.net/1721.1/87005
|
520 |
|
|
|a A discrete distribution p over [n] is a k-histogram if its probability distribution function can be represented as a piecewise constant function with k pieces. Such a function is represented by a list of k intervals and k corresponding values. We consider the following problem: given a collection of samples from a distribution p, find a k-histogram that (approximately) minimizes the ℓ₂ distance to the distribution p. We give time- and sample-efficient algorithms for this problem. We further provide algorithms that distinguish distributions that are k-histograms from distributions that are ε-far from any k-histogram in the ℓ₁ distance and ℓ₂ distance, respectively.
|
520 |
|
|
|a David & Lucile Packard Foundation (Fellowship)
|
520 |
|
|
|a National Science Foundation (U.S.) (Grant CCF-0728645)
|
520 |
|
|
|a National Science Foundation (U.S.) (Grant 0732334)
|
520 |
|
|
|a National Science Foundation (U.S.) (Grant 0728645)
|
546 |
|
|
|a en_US
|
655 |
7 |
|
|a Article
|
773 |
|
|
|t Proceedings of the 31st Symposium on Principles of Database Systems (PODS '12)
|