*Corresponding author address:* Dr. Charles A. Doswell III,
NOAA/ERL/National Severe Storms Laboratory, 1313 Halley Circle,
Norman, OK 73069. Internet e-mail: doswell@nssl.noaa.gov

Meteorological observing networks are nearly always irregularly distributed in space. This irregularity generally has an adverse impact on objective analysis and must be accounted for when designing an analysis scheme. Unfortunately, there has been no completely satisfactory measure of the degree of irregularity, which is of particular significance when designing artificial sampling networks for empirical studies of the impact of this spatial distribution irregularity. The authors propose a measure of the irregularity of sampling point distributions based on the gradient of the sums of the weights used in an objective analysis. Two alternatives that have been proposed, the fractal dimension and a "nonuniformity ratio," are examined as candidate measures, but the new method presented here is considered superior to these because it can be used to create a spatial "map" that illustrates the spatial structure of the irregularities in a sampling network, as well as to assign a single number to the network as a whole. Testing the new measure with uniform and artificial networks shows that this parameter seems to exhibit the desired properties. When tested with the United States surface and upper-air networks, the parameter provides quantitative information showing that the surface network is much more irregular than the rawinsonde network. It is shown that artificial networks can be created that duplicate the characteristics of the surface and rawinsonde networks; in the case of the surface network, however, a declustered version of the observation site distribution is required.

As noted in Koch et al. (1983), Smith et al. (1986), and Barnes
(1994), the *degree* of irregularity in an observational
array's distribution can have a large impact on the way an objective
analysis (OA) is done and how successful it is likely to be. In fact,
Buzzi et al. (1991) have developed a method to minimize the negative
impact of irregularity in the spatial sampling. Empirical tests of OA
schemes often are conducted to support the choices of the OA method
and any of its associated parameters. These empirical tests usually
make use of an analytic input function with which to compare the
analyzed values. On one hand, it is logical to sample the analytic
function with the *actual* station distribution (e.g., Smith et
al. 1986). When doing this, however, there is some degree of
uncertainty regarding the generality of the results; the given
results might depend to some unknown extent on the specific station
distribution under consideration and its position in relation to the
analytic function. If, instead, an *artificial* network is used
in empirical tests (e.g., Barnes 1994, hereinafter B94), it becomes
possible to remove the effects of a particular realization by
performing the tests on a number of different, but statistically
similar station distributions. The problem with this latter approach
is that there has not been a simple way to compare the artificial and
real networks. In other words, there has been no common measure of
irregularity between the two, such that it can be said with
confidence that the artificial network is "similar" in some sense to
the real network. The objective of the present study is to find a
measure of irregularity that allows the comparison of such artificial
distributions with real data arrays.

Two measures of the degree of irregularity proposed by other authors are investigated and found inadequate, for reasons described in section 2. A new measure of irregularity is proposed in section 3, based on an idea presented in the Appendix of Doswell and Caracena (1988, hereinafter DC88), and various tests of the proposed measure are presented. Section 4 contains two practical examples of using the measure, using the U.S. surface and rawinsonde networks pictured in Fig. 1, and section 5 concludes with a summary of the results of this work and additional topics for future research.

## a. The fractal dimension

Lovejoy et al. (1986) have proposed using the fractal dimension to
characterize the distribution of a geophysical data array. When
considering a station distribution in a two-dimensional embedding
space (as on the surface of the earth), the fractal dimension
(denoted *D*_{m}) should be two for a uniform
distribution of data points. Real, irregular data distributions (Fig.
1) should have a fractal dimension between zero and two, with the
degree of inhomogeneity being measured by 2 − *D*_{m}.
The correlation dimension, *D*_{c}, is often used as an
approximation for the fractal dimension because it is easier to
calculate (Grassberger and Procaccia 1983; Lovejoy et al. 1986;
Korvin et al. 1990). Determining the correlation dimension consists
of counting the number of stations *n* within a series of
circles of increasing radii *r* around each point in the
observation lattice, so that *n* = *n*(*r*).[1] We followed the
recommendation of Korvin et al. (1990), who noted that *r*
should not exceed one-third of the *largest* interstation
distance to produce a reliable estimate of *D*_{c}. By
finding the average of *n*(*r*) over all the stations
(avoiding counting the same distance twice), denoted <*n*(*r*)>,
a plot of ln <*n*(*r*)> vs. ln *r* can be created.
The correlation dimension, *D*_{c}, is the slope of a
line fitted to the data on such a
plot. Lovejoy et al. (1986) used the correlation dimension method to
find a fractal dimension of approximately 1.75 for the World
Meteorological Organization (WMO) network and presented this measure
as a guide in determining detectability limits. Because the fractal
dimension addresses the inhomogeneity of the network, we investigated
it as a candidate measure of irregularity to compare simulated and
real networks.
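The counting-and-fitting procedure can be sketched in Python. This is our illustration, not the authors' code; the function name and the choice of 20 radii are ours, and the radii are capped at one-third of the largest interstation distance, per Korvin et al. (1990):

```python
import numpy as np

def correlation_dimension(points, n_radii=20):
    """Estimate the correlation dimension D_c of a 2-D point set by
    counting neighbours n(r) within growing radii, averaging over all
    stations, and fitting a line to ln<n(r)> vs ln r."""
    pts = np.asarray(points, dtype=float)
    # All pairwise Euclidean distances.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    r_max = dist.max() / 3.0            # Korvin et al. (1990) cap
    radii = np.linspace(r_max / n_radii, r_max, n_radii)
    # <n(r)>: mean number of neighbours within r (self excluded).
    mean_n = np.array([(dist < r).sum(axis=1).mean() - 1.0 for r in radii])
    good = mean_n > 0                   # drop radii with no neighbours
    slope, _ = np.polyfit(np.log(radii[good]), np.log(mean_n[good]), 1)
    return slope
```

A uniform 2-D grid yields a slope near two, while collinear points yield a slope near one; edge effects of the kind discussed below pull both estimates somewhat low.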

We applied the correlation dimension technique to fictitious and
real station distributions, including the surface and upper-air
stations of the contiguous United States. When considering stations
near the boundaries of the finite area data lattice, we obtain values
of *n* (*r* ) that differ significantly from those within
the interior of the data domain. This property of finite data domains
is well known: Barnes (1964), Achtemeier (1986), DC88, and Pauley
(1990) all recognized that data lattice boundaries create
difficulties. The standard approach (although by no means the only
one) is to erect what Cressie (1991, p. 607) called "guard areas"
inside the perimeter of the data lattice. In other words, one only
considers information from stations within the *interior* of the
data lattice; one chooses a guard *barrier* (a term we have used
in preference to Cressie's term, *area*) such that the results
near the edge of the guard barrier are indistinguishable from those
deeper within the data lattice.

Having erected a guard barrier near the edges of the data lattice,
however, we encounter another problem. Fitting a straight line to the
points in the plot using least squares is a straightforward
procedure, but problems arise when deciding *which* points to
use in the fitting process if the entire profile is not linear.
Results of this process are shown in Fig. 2 for both the
surface and
upper-air networks. For the upper-air
network (Fig. 2a), two different parts of the curve appear linear,
yielding very different fractal dimensions of 1.97 and 4.46. The
concept of a network having different fractal dimensions over
different scales is not new (e.g., Tessier et al. 1994), but
complicates the use of the fractal dimension for comparing the
irregularity of station distributions. Furthermore, the fractal
dimension in either section of the plot can be changed by making
small changes in which points to consider in the line-fitting. Using
the last 12 data points at the top of the plot instead of the last 30
yields a fractal dimension of 1.72 instead of 1.97. The fractal
dimension of the surface network (Fig. 2b) also shows some indication
of multiple fractal dimensions (1.50 or 1.86, again depending on
which points are chosen for the fitted line). The uncertainty in the
fractal dimension associated with the choice of points for the
line-fitting is as large as or larger than the difference between the
upper-air and surface networks. Considering this point and the
possible uncertainty about which fractal dimension to use, we
conclude that the subjectivity associated with this measure is
unacceptable for comparing irregularity among different spatial
distributions.

## b. The nonuniformity ratio

In B94, considerable use was made of a nonuniformity ratio *r* proposed by Smith et al. (1986), which they defined as

*r* = (*E* − *M*)/*E*,

where *E* is what Smith et al. call the "equivalent uniform
station spacing" (defined as the spacing derived by distributing the
original number of stations uniformly over the data domain), and
*M* is the mean distance to each station's nearest neighbor in
the real array.[2] A uniform
sampling array would have *r* = 0, and the
greater the irregularity, the larger *r*
would become. It certainly can be argued that our proposed measure is
not substantially different from *r*.
However, *r* is a single number intended to
represent the nonuniformity of the data array as a whole. By using
the proposed measure described in the following section, the
irregularity can also be displayed over the domain, to provide a
picture of how the data density varies in space. In our opinion, this
conveys more information about the nonuniformity of data than does
any single number.
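The ratio is straightforward to compute from station coordinates. In the sketch below (ours, not Smith et al.'s code), we assume the form *r* = (*E* − *M*)/*E*, which is consistent with the stated properties (zero for a uniform array, growing with clustering); the exact published formula is not reproduced in this excerpt:

```python
import numpy as np

def nonuniformity_ratio(points, domain_area):
    """Nonuniformity ratio in the spirit of Smith et al. (1986).
    E is the equivalent uniform spacing for the same number of
    stations; M is the mean nearest-neighbour distance."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    E = np.sqrt(domain_area / n)                 # equivalent uniform spacing
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)                  # exclude self-distance
    M = d.min(axis=1).mean()                     # mean nearest-neighbour distance
    return (E - M) / E
```

For a unit-spacing square grid over a matching area, M = E and the ratio vanishes; squeezing the same stations into a small cluster drives it toward one.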

An equally important point is the fact that *r* is independent of the OA scheme that will be used with the data, while the measure proposed in the next section can be "matched" to the OA scheme by making use of the same parameters. Thus, this measure truly assesses the impact that the irregularity has on the OA scheme's results, and can provide feedback on the appropriate choice of parameters to minimize the effect of the irregularity on the OA.

Two ideas have contributed to our proposed measure. First, Barnes (1964) showed a figure (his Fig. 4) displaying the number of stations influencing the analysis as a function of space. Barnes used a "radius of influence" in his OA scheme: stations outside this radius were not considered in the analysis. Thus, the spatial constancy of the number of stations within this radius reflects, in a crude way, the uniformity of stations. Where that number is relatively constant (as in the center of the United States), the stations are relatively uniform. Barnes's figure shows that the major contribution to nonuniformity of rawinsonde observations is that associated with data domain boundaries. Outside of the land area of the United States, the data density drops precipitously. There are clusters and voids within the interior of the country also, and a finer contour interval would make this more obvious.

The other aspect of the idea was explored tentatively in DC88 in
the appendix. Specifically, they showed that the gradient of the
weight function used in distance-dependent weighted averaging
contains a term involving the gradient of the *normalizing
factor* , which is simply the sum of the weights affecting any
given grid point. Figure 3 shows that in
regions of quasi-uniform data, the
gradient of the sum of the weights
affecting the analysis should be quite small; in regions of
substantial irregularity, the gradient would be large, and could
affect the calculation of data gradients.

Therefore, we consider the magnitude of the gradient of the sum of the weights, that is,

µ = |∇ Σ_{k=1}^{n} *w*_{k}|,   (1)

(where *n* is the number of stations considered and *w*_{k}
is the weight assigned to the *k*^{th}
station at the analysis point in question) to be a candidate
parameter for estimating the degree of irregularity in a station
distribution. Although the selection of a weighting function is a
potentially troublesome issue, it should be clear that unless the
selection is done poorly, many different functions all should give
roughly comparable results. We have chosen to use the Gaussian
weighting function proposed by Barnes (1964),

*w*_{k} = exp(−*R*_{k}^{2}/λ^{2}),   (2)

(where *R*_{k} is the Euclidean distance from the
analysis point to the *k*^{th} data point and λ is the *shaping parameter* of the scheme)
largely because of its convenience and familiarity. Determination of
the shaping parameter λ is considered
below.

The examples shown in this paper are all for a single-pass OA scheme, but the proposed method can be adapted for multipass schemes. For purposes of this paper (testing the proposed measure), we consider it sufficient to employ any particular OA scheme; the single-pass Gaussian-weighting scheme has been chosen for convenience. We believe that if some other scheme is being used for OA, that scheme is the one to use for measuring the irregularity of the sampling network. Multipass OA techniques require calculating an inverse Fourier transform of the known final response function of the multipass scheme to find the single-pass weighting function equivalent to the multiple-pass scheme. Once that equivalent single-pass weighting function is known, (1) can be used to calculate the µ values as described in the following sections.
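As a concrete sketch of (1) with a single-pass Gaussian weight, the µ field can be computed as below. This is our illustration, not the authors' code; the weight form exp(−R²/λ²) is taken to be consistent with the stated *e*-folding distance λ, all stations are summed (no cutoff radius), and the gradient uses second-order centered differences:

```python
import numpy as np

def mu_field(stations, xg, yg, lam):
    """Irregularity measure mu = |grad(sum of weights)| evaluated on
    the computational grid defined by coordinate vectors xg, yg."""
    X, Y = np.meshgrid(xg, yg)                      # computational grid
    sx, sy = stations[:, 0], stations[:, 1]
    # Squared distance from every grid point to every station.
    R2 = (X[..., None] - sx) ** 2 + (Y[..., None] - sy) ** 2
    W = np.exp(-R2 / lam ** 2).sum(axis=-1)         # sum of weights
    dWdy, dWdx = np.gradient(W, yg, xg)             # centred differences
    return np.hypot(dWdx, dWdy)
```

Evaluated well inside the guard barrier of a uniform unit-spacing station grid with λ = 1.3, the field is effectively zero; punching a void in the station array produces µ values of order one, as in the experiments below.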

## a. Some preliminary issues

Certain parameters must be set before calculating the values of µ described by (1), and unwise choices of these parameters may render the measure useless. Thus, we now describe the experimentation that has led to the choices we advocate.

## 1) GUARD BARRIER

The notion of a guard barrier has been introduced already, in the
context of noting the effects of the data lattice boundary upon the
results of the fractal dimension method. In the context of our
method, we observe that the shaping parameter λ in (2) determines a length scale of
importance: the *e*-folding distance for the weights. The
parameter λ determines the "reach" of the
weighting scheme; for example, the weighting scheme gives a weight
less than 0.0183 for all points beyond a Euclidean distance of 2λ. This means that a sum of the weights will not
"feel" the boundaries very much until it is within about 2λ–3λ. If the guard
barrier is chosen to be somewhere in this range, the average value of
the sum of the weights will not be affected adversely by the data
lattice boundaries. After some experimentation
(Fig. 4) with a uniform square grid,
which should yield µ = 0, we have chosen a guard barrier of
4Δ*d*, equivalent to about 3λ (where Δ*d* is
the median *data* spacing). The choice of 4Δ*d* is a compromise between guard barriers
of 2Δ*d* and 6Δ*d*: 4Δ*d* (with λ =
1.3; see the next section for a discussion of how λ is chosen) gives a more accurate depiction of
µ than the 2Δ*d* case without
sacrificing so much of the data domain as in the 6Δ*d* case.

## 2) SHAPING PARAMETER

Given the foregoing experiments, it appears that by making λ small enough in the interiors of our
theoretically "uniform" data grids, it is indeed possible to drive
µ to quite low values. When λ is too
large, the interior of the data domain still "feels" the data
boundaries; however, it is not obvious that we would necessarily want
to make λ extremely small, since that
implies excessive weighting on values quite close to the analysis
point. Ordinary OA considerations suggest that making λ too small gives an excessively "noisy"
analysis. Our results (Fig. 4) show that when λ is too small, the µ values increase owing
to spurious waves that appear in the field of the sums of the weights
because of a Moiré-like effect. These results show that the
smallest value of the average µ for the uniform grid occurs at
λ = 1.3, which is 1.3 times the median
data spacing (Δ*d*). This value was
endorsed by Pauley and Wu (1990) and is within the range of values
advocated by Caracena et al. (1984). It is important to note that one
should use the same value of λ in the irregularity measure as
that used in one's OA scheme. For examining the theoretically uniform
square data lattices with unit spacing, we use λ = 1.3. As we have shown, the guard barrier
that is suited best to a value of λ = 1.3
is 4Δ*d*; these choices make the
interior values of µ sufficiently small for any practical
purposes.

We have chosen not to use a "radius of influence" or "cutoff radius" in the analysis. Therefore, all data points are included in the sum of the weights at any point in the computational domain. Clearly, points far away contribute virtually nothing to the sum, which therefore will be dominated by the data distribution near any specific point in question. If a cutoff radius is used in one's OA scheme, however, the same cutoff radius should be used when testing the irregularity, to keep it "matched" to the OA scheme.

## 3) COMPUTATIONAL GRID SPACING

In our calculation of µ using (1), the gradients are computed
with second-order finite differences on a square computational grid,
and the computational grid spacing has an effect on the size of the
µ values. The maximum and average values of µ increase as
the grid spacing decreases (Fig. 5); a
smaller grid spacing is able to detect more of the real value of the
magnitude of the gradients. Similar results were found for the
surface network (not shown). Apparently, the true value of µ can
be found only in the limit as the computational grid spacing
approaches zero. To maximize the accuracy of µ values while
keeping computational costs associated with a large number of grid
points within bounds, we have chosen a computational grid spacing of
Δ*d*/6; most of the value of the
gradient (99% in this example) is captured at this point (see Fig.
5). The Δ*d*/6 criterion can be
applied to most data distributions encountered in meteorology, unless
it is obvious a priori that the distribution is pathologically
irregular (large voids combined with intense clustering of sample
points). This choice obviously is related to issues of resolution
discussed in DC88.
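The effect of grid spacing on finite-difference gradient estimates can be demonstrated on a simple analytic field (our toy example, not part of the authors' experiments): a Gaussian bump whose true peak gradient magnitude is √(2/e) ≈ 0.858. A coarse grid recovers only part of that peak value, just as coarse computational grids under-resolve µ:

```python
import numpy as np

def max_grad(dx):
    """Max centred-difference gradient magnitude of exp(-x**2)
    sampled with grid spacing dx on [-5, 5]."""
    x = np.arange(-5.0, 5.0 + dx / 2, dx)
    f = np.exp(-x ** 2)
    return np.abs(np.gradient(f, dx)).max()
```

With dx = 1 the estimate is roughly 0.49; with dx = 1/6 it rises to about 0.84, most of the true value, mirroring the Δ*d*/6 compromise adopted in the text.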

## b. Tests with uniform data distributions

To conduct a "control" experiment with our method, we evaluate the
maximum, minimum, and average µ values for a 27 × 15 uniform
square grid (as an example of a fictitious, uniform data
distribution). The computed average value of µ is slightly
greater than zero (~0.00000517); this corresponds to the minimum
plotted in Fig. 4 for λ = 1.3. For a
uniform data distribution, µ should be zero, so this control
experiment confirms the supposition, at least within the finite
computational limits of real experiments. It will be shown in later
experiments (see, e.g., Table 2) that values of µ on the order of
one are possible for different networks, so this µ value of
0.00000517 is indeed very small in comparison to the range of
possible values, and thus can be considered effectively zero. A test
performed with twice the number of grid points in the
uniform grid with the same unit grid spacing produces an average
µ of about 0.00000487, which demonstrates only a modest
dependence of µ on *n*.

We also have tested the effects of the computational grid having some specific spatial relationship to the data sampling array. These "displacement" experiments consist of shifting the sampling sites in relation to the computational grid and evaluating the effect on µ. Five different displacements of the sampling sites are shown in Table 1, along with the corresponding average µ values. Because the sampling sites are uniformly distributed, the average values of µ are expected to be zero as in the control case, and indeed they are very small, albeit with slight variation. The results of these experiments reveal that the average µ and, hence, our choice of λ and the guard barrier are not affected significantly by a displacement of the data points relative to the computational grid.

## c. Tests with artificial irregular distributions

We can create increasingly irregular distributions in a manner comparable to B94 to test the applicability of our measure. The distributions start with a uniform square array of sampling sites with unit spacing, which are displaced according to

*x* = *x*_{o} ± *n*_{r}*D*,  *y* = *y*_{o} ± *n*_{r*}*D*,

where *x* and *y* are the new locations of the data
point originally located at (*x*_{o}, *y*_{o}), *n*_{r} and *n*_{r*} are
pseudorandom numbers uniformly distributed between 0 and 1, and
*D* is the *scatter distance*,[3] the maximum amount the
point can be moved in either of the *x* or *y*
directions. For each grid point, four random numbers between zero and
one are generated: the first is the *amount* the grid point is
moved in the *x* direction and the second is the *sign* of
that movement (a random number less than 0.5 means movement in the
negative *x* direction); the third and fourth are the same
except they apply to the *y* direction. The algorithm to
generate random numbers is an adaptation of the method described by
Press et al. (1986), which they assert to be free of sequential
correlation. As *D* increases, so should the irregularity of the
distribution, at least up to a "saturation" point (see below). Some
examples of the artificial distributions are shown in
Fig. 6. We have created 20 realizations
for each size increment of *D* by starting the pseudorandom
number generator with a different seed for each realization and have
averaged the results over the set of twenty realizations to find
typical results for *D* values of that magnitude. The scatter
distance *D* is allowed to vary from 0.1 to 100 by increments
of 1.0, except between 0.1 and 2.0, where the increment is 0.1.

In the process of our experimentation, it became clear that we need to decide how to deal with points that are scattered outside of the original data boundaries. Therefore, all of our experiments are done with three different "boundary conditions": 1) "dispersive," in which the data points are allowed to be scattered outside of the original data boundaries; 2) "reflective," in which points that would have been scattered outside a boundary are reflected that same distance back inside the boundary; and 3) "periodic," in which the data points are allowed to exit a boundary but re-enter the domain at the opposite boundary, such that the point is as far inside the one boundary as it would have been outside the other boundary.
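The scattering procedure and the three boundary conditions can be sketched as follows. This is our illustration, not the authors' code: we use a modern pseudorandom generator rather than the Press et al. (1986) routine, and the single-reflection treatment assumes moderate *D*:

```python
import numpy as np

def perturb_grid(nx, ny, D, bc="reflective", seed=0):
    """Displace a uniform unit-spacing nx-by-ny grid by up to scatter
    distance D in each of x and y, with a random sign per component.
    bc: 'dispersive' (points may leave the domain), 'reflective', or
    'periodic'."""
    rng = np.random.default_rng(seed)
    x0, y0 = np.meshgrid(np.arange(nx, dtype=float),
                         np.arange(ny, dtype=float))
    x0, y0 = x0.ravel(), y0.ravel()
    sign = lambda u: np.where(u < 0.5, -1.0, 1.0)
    # Independent random magnitude in [0, D] and random sign per axis.
    x = x0 + sign(rng.random(x0.size)) * rng.random(x0.size) * D
    y = y0 + sign(rng.random(y0.size)) * rng.random(y0.size) * D
    if bc == "reflective":
        # Reflect across [0, nx-1] x [0, ny-1]; one reflection suffices
        # for moderate D.
        x = np.where(x < 0, -x, np.where(x > nx - 1, 2 * (nx - 1) - x, x))
        y = np.where(y < 0, -y, np.where(y > ny - 1, 2 * (ny - 1) - y, y))
    elif bc == "periodic":
        x = x % (nx - 1)
        y = y % (ny - 1)
    return np.column_stack([x, y])
```

With *D* = 0 the grid is returned unperturbed; averaging a statistic over many seeds reproduces the multiple-realization strategy described above.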

The results for the reflective and periodic boundaries tend to be
very similar in most cases, but the dispersive case behaves
differently, owing to a decrease in the number of points within the
original boundaries. That the dispersive case would behave
differently could have been anticipated just by looking at the three
kinds of distributions at *D* = 10
(Fig. 7). In the dispersive case, the
overall density of data within the original data domain boundaries
decreases as *D* increases.

The maximum, minimum and average µ values using each of the
different boundary conditions (Fig. 8)
show that by *D* ~ 1.5, the irregularity has attained a
maximum. This can be considered a sort of "saturation" of the
irregularity; increasing *D* further simply moves points around
without materially affecting the irregularity of the distribution.
Figure 10 in B94 shows basically the same
result.[4] Making *D* > 1.5 reveals no discernible trend in the reflective and periodic
boundary cases; however, for the dispersive boundary case, the
average µ starts to decrease again. This effect obviously is due
to the decreasing number of points within the computational grid.

Thus, we have verified that µ increases when the irregularity
increases. Given this initially satisfactory result, it is useful to
evaluate how our artificial station distributions compare with those
characterized by complete spatial randomness (i.e., exhibiting
nearest-neighbor distributions described by a Poisson distribution,
as detailed in the appendix). Using the nearest neighbor
distributions for *D* = 0.1, 0.5,
1.0 and 5.0, it is clear that as
*D* increases, the distribution approaches that of a Poisson
random variable. Using the Pearson test (also described in the
appendix), the distributions for *D* = 0.1 and 0.5 are rejected
as being Poisson at the 0.01 significance level, but the
distributions for *D* = 1.0 and 5.0 are accepted as good fits to
the Poisson distribution at the 0.01 level. Thus, our method of
creating "random" distributions proves to be quite comparable to true
spatially random sampling for *D* ≥ 1.0.

A final test addresses the dependence of µ on *n*, the number
of data sampling sites in the distribution, but this time using an
irregular sampling distribution. Using *D* = 0.5, 20 different
realizations of irregular distributions are created using twice the
number of data points as used in the previous experiments. The
average µ = 0.939, which is only slightly different from the
value found above with half the sampling sites (0.954). We conclude
that µ does not depend strongly on *n* .

This section describes how one can use the proposed measure to create artificial distributions with the same amount of irregularity as real data networks. Once the artificial distributions have been verified to be as irregular as the data network in question, those artificial distributions can be used to test how well an OA scheme responds to irregularities in the data sampling.

## a. U.S. upper-air network

The appropriate λ for the upper-air
network is computed as 1.3 times the median of the nearest-neighbor
distribution and is about 470 km. For the upper-air network, it is
necessary to change the guard barrier to 2Δ*d* (~723 km), because Δ*d* (the median of the station spacing) is
so large for this distribution that the 4Δ*d* value suggested earlier does not
leave much of an interior part of the dataset to evaluate. The
computational grid spacing is Δ*d*/6,
or about 60 km. The average µ is 0.24 for the upper-air
network; by comparing this value to the µ values in Fig. 8 for
the artificial distributions, we see that artificial distributions
having the same amount of irregularity can be created using *D*
~ 0.15. Thus, we are able to create artificial distributions that are
comparable in terms of irregularity to the upper-air network with
which to test an OA scheme.

It is of some interest to note, within this context, that the upper-air network has been undergoing some perturbations as a result of the modernization efforts within the National Weather Service. Using our method for characterizing the degree of irregularity of the distribution, Table 2 reveals that the changes in station siting have not yet significantly changed the regularity of the network. For those in the meteorological community who feel that the greater the degree of sampling irregularity, the lower the confidence one can have in data analysis, any shuffling of the station sites is a major concern. With our proposed measure, the irregularity of the resulting network can be monitored.

## b. U.S. surface network

After using our proposed measure of irregularity on the surface
network,[5] we find the average
value of µ to be 2.69, using λ ~ 56
km, a guard barrier of 4Δ*d*, or
~173 km, and a computational grid spacing of Δ*d*/6, or ~7 km. Comparing this µ
value to those in Fig. 8, we find a curious result: we are unable to
duplicate the amount of irregularity in the surface network by the
random scattering process we have used to create the artificial
distributions.

To understand the reason for this result, we again fit Poisson curves to the surface network's nearest-neighbor distribution and judge the goodness of fit with the Pearson test. The surface network is rejected as being Poisson at the 0.01 level of significance (Fig. 10). Considering Fig. 10, the surface network appears to be too clustered (many stations have very close nearest neighbors) to be considered spatially random. It is this clustering that makes the surface network so irregular, so much so that we are unable to duplicate it with the artificial distributions.

Based on the preceding results, we decided to *decluster* the
surface network to decrease the irregularity. Our simple declustering
algorithm is as follows: A cluster is defined by counting the number
of stations within a certain distance, the *declustering radius*,
of any given station. More than one station within the declustering
radius constitutes a cluster. When a cluster is detected, a station
in the cluster is removed as determined by the original ordering in
the station listing. After the first of the stations in a cluster is
removed, the cluster is tested again and stations are removed
repeatedly until only one station in the original cluster remains.

Declustering the surface network does decrease the irregularity
(see Fig. 11). Using our simple
algorithm with different declustering radii, it was found that when
it is declustered to remove stations less than 60 km apart, the
irregularity is low enough to be duplicated by the artificial
distributions. This is revealed in Fig.
12, with µ values for the declustered surface network
overlaid onto the artificial network results originally shown in Fig.
8c. Thus, artificial distributions can be created with the same
amount of irregularity as that of the declustered surface network
using *D* ~ 0.85. The values of µ for the declustered
surface network and the total surface network are also presented in
Table 2 with the upper-air network results. It
should be noted that even the declustered surface network fails the
objective Pearson goodness-of-fit test for a Poisson distribution,
although the visual appearance of the fit (not shown) is considerably
better than that of Fig. 10.

We have shown that it is possible to create artificial networks
that closely match the characteristics of the U.S. upper-air and
*declustered* U.S. surface datasets by starting with a uniform
grid of points and performing the appropriate perturbations.
Therefore, we believe that our method of characterizing the degree of
irregularity in a sampling array enables meteorologists to do
empirical experiments with artificial networks with some assurance
that their artificial networks have similar sampling characteristics
to the real networks. Our approach to measuring the degree of
irregularity of station distributions is simple both in principle and
in practice so that it should be possible to execute an analysis of
the irregularity in a dataset routinely before doing an objective
analysis, and we recommend that those doing OA make it a practice to
do so.

Future efforts in this area might well include a systematic exploration of analyses done with triangular computational grids. In B94, it was noted that in a triangular array of sites, each site has six equidistant nearest neighbors, whereas in a square array, each site has only four equidistant nearest neighbors. Hence, in this restricted sense, a triangular array is "more uniform" than a square grid.

It also would be useful to explore alternative methods for declustering data networks for the purpose of achieving a roughly uniform distribution of points for objective analysis. For example, the "superob" method (DiMego 1988) of replacing station clusters with a single station having the average location coordinates of the stations within a cluster might well give somewhat better results than the simple scheme we have used. Also, it remains to be seen how one might create an artificial network with the distribution characteristics of the actual surface network before declustering. We believe that a method for artificially clustering the results of a "perturbations" experiment can be developed.

Finally, we have indicated that station distributions might have important impacts on objective analysis, owing to the gradient of the sum of the weights term as described in the appendix of DC88. It would be useful to know precisely at what degree of irregularity the OA is affected significantly from this term. As noted in DC88, when this term becomes important, the ordering of objective analysis and differentiation becomes important in gradient computations. Most schemes computing derivatives diagnostically do the objective analysis first, which DC88 contended is the improper order for irregular station distributions. Thus, some empirical testing with quantitative knowledge of the degree of irregularity would be valuable in deciding the validity of doing the objective analysis first.

*Acknowledgments*. We appreciate the helpful critiques of an
earlier version of the manuscript by Dr. S. Barnes (NOAA - Forecast
Systems Laboratory) and Prof. J. J. Stephens (The Florida State
University). We benefited from discussions with Prof. M. Richman
(University of Oklahoma) and from the critical comments contributed
by Dr. H. Brooks and Mr. P. Spencer (NSSL). Finally, we wish to thank
the anonymous reviewers for their suggestions clarifying the
presentation. This work is based in part on the junior author's
Master's thesis research, which was partially supported by a Patricia
Roberts Harris Fellowship through the Department of Education. We
also obtained partial support from the Center for Analysis and
Prediction of Storms, CAPS (University of Oklahoma).

APPENDIX

According to Cressie (1991, pp. 602 ff. and 633 ff.), "the distribution theory for nearest-neighbor distances ... under complete spatial randomness is well-known." In a two-dimensional Cartesian space, the station-to-station nearest-neighbor distance has a density given by

g(x) = 2πλx exp(−λπx²),

where *x* is the distance from a station to its nearest neighbor and λ is the intensity parameter, which can be approximated by the average data density over the domain. This distribution is derived by assuming that the station distribution is described by a homogeneous Poisson process, whereby the probability of having a station in a given small area *dx*^{2} is λ *dx*^{2}, and that probability is essentially constant over the domain.
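A known consequence of complete spatial randomness is that the mean nearest-neighbor distance equals 1/(2√λ). The sketch below checks this by simulation; the intensity λ and domain size are illustrative values of ours, not values from the paper, and boundary effects are ignored.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a homogeneous Poisson process on an L x L square.
lam = 100.0   # intensity: expected stations per unit area (illustrative)
L = 4.0
n = rng.poisson(lam * L * L)
pts = rng.uniform(0.0, L, size=(n, 2))

# Brute-force nearest-neighbor distance for every station.
d = np.hypot(pts[:, None, 0] - pts[None, :, 0],
             pts[:, None, 1] - pts[None, :, 1])
np.fill_diagonal(d, np.inf)   # a station is not its own neighbor
nn = d.min(axis=1)

# Under CSR the theoretical mean nearest-neighbor distance is 1/(2 sqrt(lam)).
print(nn.mean(), 1.0 / (2.0 * np.sqrt(lam)))
```

The empirical mean lands close to the theoretical value; the small residual discrepancy comes from edge effects, which is one reason the paper uses guard barriers around its computational domains.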

Poisson curves are fit to the nearest-neighbor distributions computed from the distribution to be tested using the method of least squares, which involves solving

iteratively for λ (the parameter of the
Poisson distribution); *g* is the density function
described above. Despite its formidable appearance, the iterative
solution converges rapidly. The fit of the Poisson curve to the
nearest-neighbor distribution is judged by the Pearson test
statistic, *C*_{1}, defined by

C_1 = Σ_{i=1}^{k} (X_i − np_i)^2 / (np_i),

where *k* is the number of classes in the nearest-neighbor
distribution, *X_{i}* is the observed number in each
nearest-neighbor category, and *np_{i}* is the expected number in
that category under the fitted Poisson distribution. The fit is
accepted if *C*_{1} does not exceed the critical value of the
chi-square distribution at the α level of significance, with
*k* − 2 degrees of freedom.
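The fit-and-test procedure can be sketched as follows. This is a hedged illustration: the paper's iterative least-squares equation is not reproduced, so λ is found by a generic bounded numerical minimization, and the bin count `k`, the search bounds, and the significance level are our assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

def g(x, lam):
    """CSR nearest-neighbor density: g(x) = 2*pi*lam*x*exp(-lam*pi*x^2)."""
    return 2.0 * np.pi * lam * x * np.exp(-lam * np.pi * x * x)

def fit_and_test(nn, k=10, alpha=0.05):
    """Fit lambda to the binned nearest-neighbor distances by least
    squares, then apply the Pearson test with k - 2 degrees of freedom."""
    counts, edges = np.histogram(nn, bins=k)
    n = nn.size
    width = np.diff(edges)
    mids = 0.5 * (edges[:-1] + edges[1:])

    # Least-squares fit of lambda (numerical minimization stands in for
    # the paper's iterative solution).
    sse = lambda lam: np.sum((counts - n * width * g(mids, lam)) ** 2)
    lam_hat = minimize_scalar(sse, bounds=(1.0, 1000.0),
                              method="bounded").x

    # Pearson statistic C1 against the fitted expected counts.
    expected = n * width * g(mids, lam_hat)
    C1 = np.sum((counts - expected) ** 2 / expected)
    accept = C1 < chi2.ppf(1.0 - alpha, df=k - 2)
    return lam_hat, C1, accept
```

For CSR-like data the fitted λ should be close to the true intensity and the fit should usually be accepted; for strongly clustered data (such as the raw surface network), C_1 grows and the CSR hypothesis is rejected.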

REFERENCES

Achtemeier, G. L., 1986: The impact of data boundaries upon a
successive corrections objective analysis of limited-area datasets.
*Mon. Wea. Rev.*, **114**, 40-49.

Barnes, S.L., 1964: A technique for maximizing details in
numerical weather map analysis. *J. Appl. Meteor.*, **3**,
396-409.

______, 1994: Applications of the Barnes objective analysis
scheme. Part I: Effects of undersampling, wave position, and station
randomness. *J. Atmos. Oceanic Technol.*, **11**, 1433-1448.

Buzzi, A., D. Gomis, M.A. Pedder, and S. Alonso, 1991: A method to
reduce the adverse impact that inhomogeneous station distributions
have on spatial interpolation. *Mon. Wea. Rev.*, **119**, 2465-2491.

Caracena, F., S. L. Barnes, and C. A. Doswell III, 1984: Weighting
function parameters for objective interpolation of meteorological
data. *Preprints, 10th Conf. Weather Forecasting and Analysis*,
Clearwater Beach, Amer. Meteor. Soc., 109-116.

Cressie, N., 1991: *Statistics for Spatial Data*. John Wiley
and Sons, New York, 900 pp.

DiMego, G.J., 1988: The National Meteorological Center regional
analysis system. *Mon. Wea. Rev*., **116**, 977-1000.

Doswell, C.A. III, and F. Caracena, 1988: Derivative estimation
from marginally sampled vector point functions. *J. Atmos.
Sci.*, **45**, 242-253.

Grassberger, P., and I. Procaccia, 1983: Measuring the strangeness
of strange attractors. *Physica*, **9D**, 189-208.

Koch, S.E., M. des Jardin, and P.J. Kocin, 1983: An interactive
Barnes objective map analysis scheme for use with satellite and
conventional data. *J. Climate and Appl. Meteor.*, **22**,
1487-1503.

Korvin, G.D., M. Boyd, and R. O'Dowd, 1990: Fractal
characterization of the South Australian gravity station network.
*Geophys. J. Int.*, **100**, 535-539.

Larsen, R.J., and M.L. Marx, 1986: *An Introduction to
Mathematical Statistics and Its Applications*. Prentice-Hall, 630
pp.

Lovejoy, S., D. Schertzer, and P. Ladoy, 1986: Fractal
characterization of inhomogeneous geophysical measuring networks.
*Nature*, **319**, 43-44.

Pauley, P. M., 1990: On the evaluation of boundary errors in the
Barnes objective analysis scheme. *Mon. Wea. Rev.*, **118**,
1203-1210.

______, and X. Wu, 1990: The theoretical, discrete, and actual
response of the Barnes objective analysis scheme for one- and
two-dimensional fields. *Mon. Wea. Rev.*, **118**, 1145-1210.

Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P.
Flannery, 1986: *Numerical Recipes*. Cambridge University Press,
818 pp.

Smith, D.R., M.E. Pumphry, and J.T. Snow, 1986: A comparison of
errors in objectively analyzed fields for uniform and nonuniform
station distributions. *J. Atmos. Oceanic Technol*., **3**,
84-97.

Tessier, Y., S. Lovejoy, and D. Schertzer, 1994: Multifractal
analysis and simulation of the global meteorological network. *J.
Appl. Meteor*., **33**, 1572-1586.

FIGURE CAPTIONS

Fig. 1. (a) The U.S. surface observation network sites and (b) the U.S. upper-air network sites, as of fall 1993.

Fig. 2. Results of the fractal dimension method for (a) the U.S. upper-air network, and (b) the U.S. surface observation network pictured in Fig. 1. The slopes of the fitted lines are indicated in the legend boxes.

Fig. 3. Distributions of (a) the sum of the weights (nondimensional), and (b) the proposed measure µ (values shown are 10 times the value of a centered difference over two dimensionless grid intervals) based on Eq. (2) in the text, for the fall 1993 upper-air network. The dashed rectangle in (b) depicts the area within which the values shown in Table 2 were computed.

Fig. 4. The average µ versus λ for a uniform square grid (with Δ*d* = 1.0) for three different values of the guard barrier distance: 2Δ*d*, 4Δ*d*, and 6Δ*d*.

Fig. 5. Maximum (squares) and average (circles) values of µ for the upper-air network as a function of computational grid spacing. Plotted points correspond to particular values of Δ*d*/*N*, where *N* ranges from 1 to 20 in unit steps, plus a last value (far left) at *N* = 100. Arrows indicate values at a grid spacing of Δ*d*/6.

Fig. 6. Examples of increasingly irregular distributions created by perturbing uniform data points from their original locations (represented by the grid). Distributions are shown for (a) *D* = 0.1, (b) *D* = 0.5, and (c) *D* = 1.0.

Fig. 7. Examples of artificial distributions for *D* = 10 using (a) dispersive boundaries, (b) periodic boundaries, and (c) reflective boundaries. The heavy solid line in (a) represents the original 27 × 15 data domain.

Fig. 8. Averaged maximum (open
circles with dots), minimum (diamonds), and average (filled circles)
µ values for the 20 simulations of artificial distributions
versus *D* using (a) dispersive boundaries, (b) periodic
boundaries and (c) reflective boundaries.

Fig. 9. Poisson curves fit to the averaged artificial distributions for (a) *D* = 0.1, (b) *D* = 0.5, (c) *D* = 1.0, and (d) *D* = 5.0. Solid lines denote the theoretical Poisson values while filled circles represent the observed values.

Fig. 10. Poisson curve fit to the nearest-neighbor distribution for the U.S. surface network pictured in Fig. 1.

Fig. 11. Average µ as a function of the declustering radius used to decluster the U.S. surface network.

Fig. 12. As in Fig. 8c, except the declustered surface network values of µ (denoted by squares) are overlaid. The declustering radius is 60 km.

TABLES

Table 1. Displacement experiments and the corresponding average µ.

Experiment   x-displacement   y-displacement     Avg µ
------------------------------------------------------
Control            0.0              0.0        0.000005
1                  0.5              0.5        0.000010
2                 -0.3              0.2        0.000006
3                 -0.05            -0.4        0.000007
4                  0.2              0.0        0.000010
5                  0.2             -0.5        0.000005
------------------------------------------------------

Table 2. Maximum, minimum and average values of µ for theoretically uniform and real sampling networks.

Sampling network                       Max µ   Min µ   Avg µ
------------------------------------------------------------
Uniform square grid                     0.00    0.00    0.00
U.S. upper-air network, fall 1993       0.51    0.01    0.24
U.S. upper-air network, Nov. 1994       0.55    0.01    0.23
U.S. upper-air network, Feb. 1995       0.51    0.01    0.23
Total U.S. surface network             34.04    0.00    2.69
Declustered U.S. surface network        5.67    0.01    1.38
------------------------------------------------------------