Approximately Uniform Random Sampling In Sensor Networks
Approximately Uniform Random Sampling
in Sensor Networks
Boulat A. Bash
John W. Byers
Jeffrey Considine
Computer Science Department
Boston University
{boulat, byers, jconsidi}@cs.bu.edu
Abstract
the sensor network. One typical sensor database sce-
nario involves sensor elements that are prone to failure,
Recent work in sensor databases has focused ex-
are highly resource-constrained, and must communi-
tensively on distributed query problems, notably
cate across a lossy network.
Sensor networks com-
distributed computation of aggregates.
Exist-
prised of small battery-powered motes are a represen-
ing methods for computing aggregates broadcast
tative instantiation of this scenario [7]. In such an
queries to all sensors and use in-network aggre-
gation of responses to minimize messaging costs.
environment, aggregation queries are particularly ef-
In this work, we focus on uniform random sam-
fective, as they are robust to node and link failures,
pling across nodes, which can serve both as an
can be resilient to incorrect or outlying responses, and
alternative building block for aggregation and as
are amenable to the use of in-network processing to
an integral component of many other useful ran-
minimize messaging cost. For these queries, approxi-
domized algorithms. Prior to our work, the best
mate answers typically suffice, especially in light of the
existing proposals for uniform random sampling
very high cost of ensuring 100% reliability in commu-
of sensors involve contacting all nodes in the net-
nications in sensor networks. Recent work has focused
work. We propose a practical method which is
on computation of aggregates using a request/reply
only approximately uniform, but contacts a num-
model in which a query is broadcast to a region of in-
ber of sensors proportional to the diameter of
terest, individual sensors make best-effort replies, and
the network instead of its size.
The approxi-
mation achieved is tunably close to exact uni-
responses are aggregated in-network en route to the
form sampling, and only relies on well-known
origin of the query [3, 11, 20].
existing primitives, namely geographic routing,
In this paper, we argue that there is a rich and rela-
distributed computation of Voronoi regions and
tively under-explored set of classic statistical methods
von Neumann’s rejection method.
Ultimately,
that have not yet been extensively studied in the do-
our sampling algorithm has the same worst-case
main of sensor databases. In particular, we propose
asymptotic cost as routing a point-to-point mes-
a more careful study of random sampling methods,
sage, and thus it is asymptotically optimal among
which have long been used in other domains to approx-
request/reply-based sampling methods. We pro-
imately compute aggregates such as MEDIAN, AVG,
vide experimental results demonstrating the effec-
tiveness of our algorithm on both synthetic and
and MODE [2, 12, 13]. Random sampling is a par-
real sensor topologies.
ticularly good fit for approximate aggregation queries
in the sensor network domain in light of the poten-
tially modest messaging cost. While we view random
1
Introduction
sampling as especially useful in the context of data
In the emerging research area of sensor databases, a
management and data aggregation problems, we also
central challenge is to develop cost-effective methods
note that it is an integral component of other useful
to extract answers to queries about conditions inside
randomized algorithms that are potentially applicable
to sensor networks, including randomized routing [18].
The authors were supported in part by NSF grants ANI-
In the context of sensor networks, a natural abstrac-
9986397, ANI-0093296, ANI-0205294, and EIA-0202067.
tion is spatial sampling, i.e. sampling from geographi-
Copyright 2004, held by the author(s)
cal locations within the network uniformly at random.
Proceedings of the First Workshop on Data Mana-
On a 2-D network with bounded spatial extent, such
gement for Sensor Networks (DMSN 2004),
an objective can easily be realized by picking an (x, y)
Toronto, Canada, August 30th, 2004.
coordinate from within the space at random and us-
http://db.cs.pitt.edu/dmsn04/
ing geographical routing to route to the node closest
to that point. While this is desirable for many appli-
2
Sampling: Problems and Methods
cations, such as computing spatial averages [6], many
We now formally define our sampling problems.
other applications and database queries prefer to ig-
nore geometry and instead wish to sample uniformly
Definition 1 (Uniform random sampling) An
from the set of nodes. Examples include querying av-
algorithm samples uniformly at random from a set of
erage sensor battery life, counting the number of nodes
reachable sensors S if and only if it outputs a sensor
that are currently capable of executing a given sens-
ID s ∈ S with probability 1
|S| .
ing task, determining the 95th quantile of sensor CPU
utilization, or estimating the number of sensors that
Uniform random sampling is simple if the set of sen-
will fail within the next day. Our focus is to develop
sor IDs is known in advance and sensors neither fail nor
practical algorithms for uniformly sampling from a set
move. However, it is much more challenging in the re-
of sensor nodes with low messaging cost.
alistic case where the set of IDs may not be known and
Since it is well-known that nodes in a sensor net-
the set of reachable sensors dynamically changes over
work often have highly irregular placements, spatial
time. For these reasons, we will be content with the
sampling will produce non-uniform samples of the
following close approximation to uniform sampling.
nodes [5]. Our work relies on spatial sampling as a
Definition 2 (( , δ)-sampling) An algorithm per-
starting point, but uses practical methods for smooth-
forms ( , δ)-sampling of a reachable set S if and only
ing, or regularizing, the non-uniform samples to pro-
if it returns a sample s ∈ S such that no element of
duce approximately uniform node samples. The key
S is returned with probability greater than 1+
idea is to have each sensor node compute and maintain
|S| and at
least (1 − δ)|S| elements are output with probability at
the area of its Voronoi cell. A uniform node sample is
least 1
then realized by sending a sequence of spatial samples
|S| .
until one is “accepted”. A targeted node in the net-
By this definition, our goal is to sample from almost
work “accepts” by responding to a given spatial sam-
all sensors nearly uniformly with tunable parameters
ple with an appropriate probability normalized by its
and δ. Our definition allows us to under-sample a
Voronoi cell size, otherwise it “rejects”. The specifics
small fraction δ of the nodes.
of this normalization depend on global statistics on the
In a sensor network scenario, we typically wish to
number of nodes in the network and on an appropri-
sample from a set of pairs k, v where k identifies a
ate k-quantile of Voronoi cell sizes across the network.
particular sensor and is unique within the set, and v
We argue that these statistics can be updated infre-
is some value associated with the sensor. This value
quently and consistently. Ultimately, this application
might be a measurement by the sensor, such as the
of von Neumann’s rejection method [19] results in ap-
local temperature, or an internal statistic such as the
proximately uniform node samples.
remaining battery life. As motivated earlier, sampling
As sketched above, our algorithm for generating a
in sensor networks is more challenging since neither a
random sample has a messaging cost that is typically
full list of sensors nor direct communication with them
bounded by the messaging cost of a small constant
is available.
number of spatial samples in the expectation. This
Prior to this work, the following two methods for
cost is low since the messaging cost of computing a spa-
near-uniform sampling were proposed in the context
tial sample is akin to routing a point-to-point message
of sensor and other overlay networks.
using a geographic routing method such as GPSR [8].
In the worst case, such a message traverses the di-
Min-wise sampling: In [13], the use of min-wise
ameter of the network. In contrast, the best existing
samples [1] was proposed for sampling a sensor net-
methods for node sampling, which can compute an ex-
work uniformly at random. Given a hash function h
actly uniform sample, necessitate contacting all nodes
on sensor IDs, they returned the value associated with
in the network [13]. We note that the additional infre-
the ID s such that h(s) is minimal (i.e. ∀s ∈S(h(s) ≤
h
quent global update costs incurred by our algorithm
(s ))). Each sensor would then propagate the value
can be amortized by the potentially vast number of
associated with the smallest observed h(s ).
With
samples that can be taken between updates.
careful control of the transmissions, this scheme can
be implemented with each node in the sensor network
The remainder of the paper is organized as follows:
sending a constant number of messages, for a total of
Section 2 formalizes the uniform sampling problem and
Θ(|S|) transmissions. However, since the entire net-
the limitations of existing methods. We summarize the
work is involved, this is an expensive operation.
building blocks of our proposed method in Section 3.
Our rejection-based sensor sampling algorithm is pre-
Random walks: Another natural method for sam-
sented in Section 4. Then in Section 5 we describe
pling is the use of random walks. In the sensor network
the practical implementation issues, and Section 6 con-
domain, one could generate a random sample by prop-
cludes with the broader implications and applications
agating a request message along a randomly chosen k-
of our work.
hop path starting from the query sink, and sampling
the kth sensor reached. Unfortunately, this procedure
to weight sensor readings for spatial aggregates [6] and
would both need to use a large value of k, and would
they are easily computable, but it is well known that
need to compensate for the fact that the method is
these areas vary widely when the sensors are placed
biased toward drawing samples from near the center
randomly [16]. This variation leads to a bias in spatial
of the spatial region where sensors are located, as we
sampling – each sensor is chosen with probability in
demonstrate in Section 5.1.
proportion to A(s), the area of its Voronoi cell. For
convenience, we assume the areas are normalized so
Our methods follow a rather different line. Like the
that they sum to one, and thus A(s) can also be in-
random walk method, we ultimately seek out a single
terpreted as the probability a randomly chosen point
sensor, but our choice of the route to the sensor avoids
is closest to s.
many of the dependencies and complications of the
random walk approach.
von Neumann’s rejection method: Much of the
early work on random sampling focused on sampling
3
Prerequisites
complex distributions, assuming the ability to sample
simpler distributions. A well known example of this
Instead of choosing a path at random, we choose a lo-
is von Neumann’s rejection method [10, 19]. Suppose
cation in the sensor coordinate space at random and
we wish to sample from a distribution with probabil-
route a probe to its closest sensor using geographic
ity density function f (i.e. an event t has probability
routing techniques.
When we partition the coordi-
f(t)). If we can sample from a distribution with prob-
nate space into regions of ownership by mapping the
ability density function g, then we can sample from
nearest neighbor regions (Voronoi cells) to sensors, we
f as follows. First generate a sample t using g, but
note that these regions are irregularly sized in most
f(t)
instances. Thus, this naive spatial sampling method is
only accept and return sample t with probability cg(t),
very likely to generate a biased sample. Therefore, our
where c is a positive constant. If t is not accepted, it
last key step is to use von Neumann’s rejection method
is rejected and the process repeats for a new sample
f
to normalize the samples. We now briefly summarize
t. Assuming that c is chosen so that (t)
cg(t) ≤ 1, then
these three prerequisite ideas.
the probability of picking a particular event t on the
first attempt is g(t) · f(t)
f(t). It then follows that
Geographic routing: If every node in a network is
cg(t) = 1c
aware of its own coordinates (e.g. via GPS), then it is
after c expected samples from g, we have one sample
possible to route to a particular position using entirely
from f .
local decisions. Most of these local routing decisions
can be made in a greedy fashion, simply choosing the
4
Rejection-based Sensor Sampling
neighboring node which has the closest coordinates to
the destination. This greedy routing fails when there
We now describe our method to combine ideas of spa-
is an obstruction, or “void”, which must be circumnav-
tial sampling with von Neumann’s rejection method to
igated to reach the destination. GPSR [8] provides an
flatten out an irregular probability distribution into a
elegant solution to this problem with just two states.
nearly uniform one. For our application, the desired
The default state of GPSR is greedy routing, while the
density function is uniform, i.e. f (t) = 1
|S| , and the
other state follows the perimeters of voids until greedy
distribution which we can sample from, g(t), is the
routing can resume. When a packet reaches its target
distribution of Voronoi cell areas. One weakness in
point, another round of perimeter routing is run to
von Neumann’s method for exactly reproducing a dis-
visit each of the immediately surrounding sensors so
tribution f is that the constant c must be chosen so
that it can find the sensor nearest to the target point.
f
that for all events t,
(t)
g(t) ≤ c. In our application, if
For typical topologies in 2-D, geographic routing takes
there exists a very small Voronoi cell, then c, and hence
Θ(
|S|) steps.
the expected messaging cost, can be very large. Since
Voronoi diagrams: Once routing to an arbitrary
we cannot rule out this possibility, we content our-
point is possible, we must also quantify the size of the
selves for now with generating approximately uniform
region of points that are closest to a particular sensor
samples. Later, in Section 5.2, we consider strategies
s. Formally, the set of points closer to sensor s than
to boost sampling probabilities for the smallest cells to
any other sensor is called the Voronoi cell of s [4]. In
significantly reduce residual sampling bias. We employ
the planar case which we consider, the Voronoi cell of
the following basic algorithm.
s is a convex polygon containing s, where each edge of
this polygon lies on a perpendicular bisector between
Algorithm 1 (Rejection-based Sampling)
s and another sensor. The exact boundaries of this
1 The random sampler picks a random location in
Voronoi cell are easily determined exactly by locating
the sensor field and routes a message to the sensor
all of the sensors in the immediate vicinity of s. The
s closest to this point, using geographic routing
areas of these Voronoi cells have been used previously
and pre-computation of Voronoi cells.
(a) MIT sensor testbed. Reproduced with permission
(b) James Reserve sensor network. Reproduced with
from [17]
permission from [5]
Figure 1: Maps of real sensor deployments used in our experiments.
2 With probability min(A(s),τ)
A(s)
, s accepts and re-
(1 − δ)|S| elements are sampled with probability at
ports its value, where τ is a threshold to be defined
least 1
|S| . First, we show that all large cells, i.e. cells
shortly.
with area at least τ , are sampled in a given iteration
of the sampling algorithm with probability at least
3 Otherwise, s rejects and the random sampler re-
1
|S| . The probability that a given sensor s is sampled
peats Steps 1-3. The random sampler also returns
in a particular probe is ps = min(A(s), τ ), and thus
to Step 1 if it times out waiting for a response.
the probability that a particular probe is successful is
ps ≤ |S|τ. Now let E denote the event that the
Intuitively, τ can be thought of as a threshold on
s
algorithm ultimately samples from a large cell .
Voronoi cell areas, in which we think of any Voronoi
cell of area at least τ as large and any area less than
p
τ
≥ 1 .
τ
Pr[E ] =
=
as small.
By our procedure, all large cells will
p
p
s s
s s
|S|
be selected equiprobably, but small cells will be se-
Now since large cells are at least a (1 − δ) fraction of
lected with smaller probability, in proportion to their
all cells by the setting of k ≤ δ, we have that at least
area. To ensure that Algorithm 1 results in ( , δ) sam-
pling, we must guarantee that the fraction of small
(1 − δ)|S| elements are sampled with probability at
least 1
cells (sampled non-uniformly) is less than δ, and that
|S| .
the bias introduced by under-sampling small cells re-
Next we show that no element is sampled with prob-
sults in at most (1 + )-oversampling of large cells. In
ability greater than 1+
|S| . By construction, large cells
practice, we set τ to be the area of the cell that is
are sampled with highest probability, so we restrict
the k-quantile, where k = min δ,
attention to those cells. Starting from the same prob-
1+
, and prove the
ability bound as before:
following main result.
p
τ
Pr[E ] =
=
Theorem 1 Running
Algorithm
1
with
k
=
p
p
p
s s
s|A(s)<τ s +
s|A(s)≥τ s
τ
min δ, 1+
and setting τ
to be the cell area
=
p
τ
that is the k-quantile results in ( , δ)-sensor sampling.
s|A(s)<τ s +
s|A(s)≥τ
τ
≤
Proof: By our problem definition, it suffices to show
τ
s|A(s)≥τ
that the method ensures that no element of S is sam-
τ
≤
pled with probability greater than 1+
|S| and at least
(1 − k)|S|τ
τ
≤ (1 − 1+ )|S|τ
≤
1
( 1+
0.4
1+ − 1+ )|S|
0.8 0.6
1.1
1
1.3
0.8
=
1.3
0.8
( 1
0.6
1.5
1+ )|S|
1.1
1.5
0.8
1 +
=
.
0.8
0.6
|S|
0.8
1.3
1.7
1.5
0.9
0.4
0.4
Thus the theorem follows.
1.3
1.3
Relating this result back to von Neumann’s method,
1.3
this corresponds to a situation in which c =
1
1.3
τ |S| . As
with the rejection method, the probability that a par-
1.1
1.3
0.6
ticular sensor s is picked and accepted on the first at-
0.9
1.5
1.1
1.1
1.7
tempt is A(s) min(A(s),τ)
A(s)
= min(A(s), τ ). It remains to
0.9
select an appropriate threshold τ for our algorithm.
0.6
0.9
0.9
0.6
0.6
4.1
Threshold Management
Given user-specified values of
and δ, the threshold τ
should be set to the k-quantile of the Voronoi cell ar-
eas, where k = min δ, 1+
as discussed earlier. The
k-quantile can be computed during an initial prepro-
cessing step using recent techniques developed in the
Figure 2: Sample distribution using long random walks
sensor database community. In particular, work such
along adjacent Voronoi cells. Each sensor’s cell is la-
as [3, 11] shows how to efficiently count the number
beled with its probability relative to the mean. For
of sensors matching some criteria (e.g. with a cell area
example, a sensor labeled 1.3 is picked with probabil-
below a specified threshold) and deriving other simple
ity 1.3/|S|.
statistics such as the average cell area. We note that
while these values need to be updated to account for
sensors placed uniformly at random on a unit square.
dynamic changes within the sensor network, they need
The first real network, illustrated in Figure 1(a), is
not be exact, as bounds on the values suffice for our
a testbed deployed at MIT [17]. These sensors were
methods. Therefore, only infrequent updating of these
heuristically placed according to expected quality as a
global statistics is needed to maintain consistent and
vantage point, and proximity to available power out-
approximately correct values. Updating these statis-
lets. The second real deployment, illustrated in Fig-
tics can easily be performed either by piggybacking
ure 1(b), is a sensor network for micro-climate mon-
them on the random probes or on various control and
itoring at the James Reserve [5]. These sensors are
maintenance messages. Either way, once these statis-
more concentrated in the lower left, where there is
tics are available, the sampler recomputes τ , and sends
thick foliage.
it with each probe. Since the sampler’s value of τ is
The objective of these experiments was to demon-
included in the query, each sensor deciding to accept
strate that we can cheaply obtain a close approxima-
or reject a probe acts consistently.
tion to uniform sampling. Thus, besides examining
and δ at for various choices of τ , we also examine
5
Practical Implementation Issues
the expected value of the random variable Y, which is
the number of probes sent before a sample is returned.
We now discuss the details of a practical implementa-
The actual energy costs of our method depend heav-
tion of Algorithm 1. We begin in Section 5.1 present-
ily upon the geographic routing protocol in use. Since
ing experimental results using the basic implementa-
testing the performance of various geographical rout-
tion outlined in Section 4, and then discuss various
ing protocols is beyond the scope of this work, we do
refinements to improve the uniformity of sampling in
not implement geographic routing in our simulation.
Section 5.2.
First, we confirm our intuition that random walks
are unsuitable for near-uniform random sampling. We
5.1
Experiments
consider the following random process. Starting at any
We experimentally validated our proposed sampling
sensor in the network, a query repeatedly considers the
algorithm using three topologies: two from real sen-
sensors with adjacent Voronoi cells and moves to one
sor deployments and one synthetic topology with 215
chosen uniformly at random. After a sufficient num-
5.0/|S|
5.0/|S|
naive
naive
c = 1
c = 1
c = 2
c = 2
4.0/|S|
4.0/|S|
c = 3
c = 3
c = 4
c = 4
3.0/|S|
3.0/|S|
2.0/|S|
2.0/|S|
Pr[node is selected]
Pr[node is selected]
1.0/|S|
1.0/|S|
0.0/|S|
0.0/|S|
0
5
10
15
20
25
30
35
40
0
10
20
30
40
50
Node Rank
Node Rank
(a) MIT sensor testbed
(b) James Reserve sensor network
Figure 3: Resulting distributions for real testbeds. Nodes are in increasing order of Voronoi cell area.
c
δ
E [Y]
c
δ
E [Y]
c
δ
E [Y]
naive
1.9
0.6
1.00
naive
4.3
0.69
1.00
naive
3.8
0.57
1.00
1
0.34
0.45
1.34
1
0.48
0.46
1.48
1
0.27
0.41
1.27
2
0.047
0.25
2.09
2
0.12
0.23
2.24
2
0.051
0.15
2.10
3
0
0
3.00
3
0.041
0.15
3.12
3
0.017
0.06
3.05
4
0
0
4.00
4
0.012
0.038
4.05
4
0.0079
0.026
4.03
5
0
0
5.00
5
0.0072
0.019
5.04
5
0.0042
0.017
5.02
(a) MIT sensor testbed
(b) James Reserve sensor network
(c) 215 randomly placed points
Table 1: Summary of experimental results
ber of steps to converge on the stationary distribution,
no sensors with less than a third of the average cell
the query outputs its current location. Figure 2 shows
area in their Voronoi cell. With the James Reserve
the Voronoi diagram of the MIT sensor testbed and
network, one sensor has a cell area of slightly more
the relative sampling probabilities of each sensor. As
than one tenth of the average, so c ≥ 10 is necessary
expected, the sensors most likely to be chosen are in
for uniform sampling. However, this is the only sensor
the middle of the network, and the sensors least likely
which is under-sampled for c ≥ 5.
to be chosen are on the edges of the network. Suffi-
For comparison, Table 1(c) summarizes the corre-
ciently long random walks on this topology can achieve
sponding results for a synthetically generated topol-
(0.71, 0.52)-sampling. This is better than naive spatial
ogy of 215 randomly placed points on a unit square.
sampling, which would achieve (1.90, 0.60)-sampling
The smallest Voronoi cell in this topology was slightly
on the same topology, but our rejection-based methods
smaller than
1
98|S| , so if exact sampling is desired,
will give much better results.
an average of c ≥ 99 probes per sample are needed.
Figure 3 shows the results of Algorithm 1 on the
However, just setting c = 5 achieves (0.0042, 0.017)-
real topologies assuming that there are no faults and
sampling.
each sensor knows the area of its own Voronoi cell.
Figure 4 shows the cell size distributions of our test
The areas of both networks are the areas of their min-
topologies where the impact of human choices on sen-
imum bounding boxes. The threshold τ was set to
1
sor placement is present. First, humans are prone to
c|S| for c = 1, 2, 3, 4, 5, and the naive spatial sampling
favor interesting or easily accessible points, resulting
method is included as a baseline. As c increases and τ
in sensors being clustered together, each with below-
decreases, the distribution becomes more uniform and
average area. This is evident in Figure 4: the two
improvements in both
and δ are clearly visible.
real sensor networks have a larger fraction of sensors
Tables 1(a) and 1(b) summarize the parameters of
with below-average Voronoi cell areas than a randomly
the resulting sampling distributions, along with the
generated topology. At the same time, humans are un-
expected number of probes for each sample. With the
likely to choose very poor placements where many sen-
MIT sensor testbed, setting c = 3 (equiv. τ =
1
3|S| )
sors are extremely close together. Figure 4 also hints
results in uniform sampling – this is because there are
at this point, as the smallest Voronoi cells in syntheti-
6
Future Work and Conclusions
1
0.9
James Reserve
Uniform random sampling is a standard and useful
MIT
0.8
random
primitive underlying many algorithmic and statisti-
0.7
cal methods. Our work focused on the unique con-
0.6
straints imposed by sensor networks, and the problem
0.5
of cheaply selecting one sensor node uniformly at ran-
0.4
dom. In future work, there are numerous generaliza-
tions to consider. Our methods immediately general-
0.3
Cumulative Probability
ize to queries that wish to sample nodes satisfying a
0.2
geometric predicate, such as those within a region of
0.1
interest, but we have not yet studied how to efficiently
0
0.0/|S|
1.0/|S|
2.0/|S|
3.0/|S|
4.0/|S|
5.0/|S|
6.0/|S|
sample from nodes satisfying a non-geometric predi-
Voronoi Cell Size
cate. Another interesting question is how best to take
advantage of parallelism when the number of samples
Figure 4: Cell size distributions for random and real
needed or the expected number of attempts is high.
testbeds
Here, distinct probes may traverse common network
links, so clever strategies may be able to reduce total
cally generated networks are significantly smaller than
transmission costs. We also plan to consider how to op-
the ones in real topologies.
timize sampling for queries which do not fall into a re-
quest/reply paradigm. For example, if query patterns
5.2
Algorithmic Modifications
are known in advance, such as periodic fixed queries,
We now consider a variety of heuristics for improving
a more streamlined method for sampling that avoids
our baseline algorithm by reducing the impact of small
explicit requests could be implemented in a decentral-
Voronoi cells on the ( , δ)-approximation.
ized fashion. However, our methods may still find use
in answering such queries since their “on-demand” na-
Sleeping: Perhaps the simplest method for handling
ture allows quick responses to unexpected events or
sensors with very small Voronoi cells is for some of
failures.
these sensors to sleep.
Sleeping sensors are deacti-
Finally, we note that variants of our sampling meth-
vated, and sampling from them is thus rendered impos-
ods can be applied much more broadly, outside the
sible. Putting one small cell to sleep will increase the
context of sensor networks. For example, uniform node
size of adjacent cells (which are also likely to be small),
sampling is also an important problem in structured
so it is not necessary to put all small cells to sleep to
P2P networks based on coordinate systems [9]. Vari-
remove their impact. We note that this approach is
ants of our methods apply to these P2P scenarios and
similar in spirit to some routing schemes which use
provide a simpler and more topology-agnostic alterna-
sleep for power management, particularly in crowded
tive to existing methods.
areas [21]. Because the sensed values from the sleep-
ing nodes are unavailable, this approach may not be
Acknowledgments
appropriate for some applications.
We are grateful to Deepak Ganesan and Stanislav Rost
Pointers: Another method for increasing the sam-
for the use of their sensor deployment maps and al-
pling probability of small cells is for larger cells to
lowing them to be reproduced here. We thank Phil
keep pointers to nearby small cells and forward some
Gibbons, Kanishka Gupta, and Niky Riga for helpful
rejected probes to those small cells. That is, when-
conversations and feedback on earlier versions of this
ever a large cell would reject a probe, it may instead
manuscript.
redirect the probe to a nearby small cell. The proba-
bility of forwarding a probe can be negotiated between
the cells based on their respective sizes. Essentially, a
large cell would donate part of its “unused” area to its
small neighbor.
Virtual coordinates: Instead of using real-world ge-
ographic coordinates to map points to sensors, we can
use virtual coordinates [14, 15], modified to include ei-
ther a repulsive force between close sensors, or a hard
lower bound on the inter-sensor distances. Virtual co-
ordinate spaces also allow the boundaries of the sensor
network to be pre-defined, instead of explored via pe-
riodic probing [5].
References
[18] L. G. Valiant and G. J. Brebner. Universal schemes
for parallel communication. In Proceedings of the 13th
[1] A. Broder, M. Charikar, A. Frieze, and M. Mitzen-
Annual ACM Symposium on Theory of Computing,
macher. Min-wise independent permutations. Journal
pages 263–277. ACM Press, 1981.
of Computer and System Sciences, 60:630–659, 2000.
[19] J. von Neumann.
Various techniques used in con-
[2] S. Chaudhuri, R. Motwani, and V. Narasayya. Ran-
nection with random digits.
U.S. National Bureau
dom sampling for histogram construction: how much
of Standards Applied Mathematics Series, 12:36–38,
is enough?
In Proc. of ACM SIGMOD ’98, pages
1951.
436–447, 1998.
[3] J. Considine, F. Li, G. Kollios, and J. Byers. Approx-
[20] Y. Yao and J. Gehrke. The Cougar approach to in-
imate aggregation techniques for sensor databases. In
network query processing in sensor networks. ACM
Proc. of the IEEE Int’l Conf. on Data Engineering,
SIGMOD Record, 31(3):9–18, 2002.
March 2004.
[21] W. Ye, J. Heidemann, and D. Estrin.
An energy-
[4] M. de Berg, O. Schwarzkopf, M. van Kreveld, and
efficient MAC protocol for wireless sensor networks. In
M. Overmars. Computational Geometry: Algorithms
Proc. of IEEE Infocom, pages 1567–1576, June 2002.
and Applications. Springer-Verlag, 2nd edition, 2000.
[5] D. Ganesan, S. Ratnasamy, H. Wang, and D. Es-
trin. Coping with irregular spatio-temporal sampling
in sensor networks. In Proc. of HotNets-II, November
2003.
[6] C.-C. Han, S. Ganeriwal, A. Boulis, and M. Srivas-
tava. Going beyond nodal aggregates: Spatial aver-
age of a physical process in sensor networks. Poster
in ACM SenSys, Nov. 2003.
[7] J. Hill and D. Culler. Mica: A Wireless Platform for
Deeply Embedded Networks. IEEE Micro, 22(6):12–
24, Nov/Dec 2002.
[8] B. Karp and H. Kung. GPSR: Greedy perimeter state-
less routing for wireless networks. In ACM MobiCom,
Aug. 2000.
[9] V. King and J. Saia. Choosing a random peer. In
Proc. of ACM PODC’04, July 2004.
[10] D. E. Knuth. The Art of Computer Programming, Vol-
ume 2: Seminumerical Algorithms. Addison-Wesley,
Reading, MA, 2nd. edition, 1981.
[11] S. Madden, M. Franklin, J. Hellerstein, and W. Hong.
TAG: a Tiny AGgregation Service for Ad-Hoc Sensor
Networks. In USENIX OSDI, 2002.
[12] G. S. Manku, S. Rajagopalan, and B. G. Lindsay. Ap-
proximate medians and other quantiles in one pass
and with limited memory. In Proc. of ACM SIGMOD
’98, pages 426–435, 1998.
[13] S. Nath and P. Gibbons. Synopsis diffusion for ro-
bust aggregation in sensor networks. Technical Report
ITR-03-08, Intel Research, Aug. 2003.
[14] J. Newsome and D. Song. GEM: Graph EMbedding
for routing and data-centric storage in sensor networks
without geographic information. In ACM SenSys ’03,
pages 76–88, 2003.
[15] A. Rao, S. Ratnasamy, C. Papadimitriou, S. Shenker,
and I. Stoica. Geographic routing without location
information. In ACM MobiCom, Sept. 2003.
[16] S. Ratnasamy, B. Karp, S. Shenker, D. Estrin,
R. Govindan, L. Yin, and F. Yu. Data-centric storage
in sensornets with GHT, a geographic hash table. Mo-
bile Networks and Applications, 8(4):427–442, 2003.
[17] S. Rost and H. Balakrishnan. Lobcast: Reliable Data
Dissemination in Wireless Sensor Networks. Under
submission.