June 21, 2007

A GIS-Based Approach to Watershed Classification for Nebraska Reservoirs

By Bulley, Henry N N; Merchant, James W; Marx, David B; Holz, John C; Holz, Aris A

ABSTRACT: The U.S. Environmental Protection Agency is charged with establishing standards and criteria for assessing lake water quality. It is, however, increasingly evident that a single set of national water quality standards that does not take into account regional hydrogeologic and ecological differences will not be viable, as lakes clearly have different inherent capacities to meet such standards. We demonstrate a GIS-based watershed classification strategy for identifying groups of Nebraska reservoirs that have similar potential capacity to attain a certain level of water quality standard. A preliminary cluster analysis of 78 reservoirs was performed to determine the potential number of Nebraska reservoir groups. Subsequently, a Classification Trees method was used to refine the number of classes, describe the structure of reservoir watershed classes, and develop a predictive model that relates watershed conditions to reservoir classes. Results suggest that Nebraska reservoirs can be represented by nine classes and that soil organic matter content in the watershed is the most important single variable for segregating the reservoirs. The cross-validation prediction error rate of the Classification Tree model was 26.3%. Because all geospatial data used in this work are available nationally, the method could be adopted throughout the U.S. Hence, this GIS-based watershed classification approach could provide water resources managers an effective decision-support tool in managing reservoir water quality.

(KEY TERMS: water quality; lake classification; geographic information systems; classification tree modeling; reservoirs; watershed management.)

(ProQuest-CSA LLC: ... denotes formulae omitted.)


In recent decades, substantial progress has been made in improving the quality of surface waters in the United States (Hawkins et al., 2000; USEPA, 2000a,b); nevertheless, much work remains to be performed in assessing the state of impairment of lake waters. Lakes are characterized as impaired when the existing water quality of the lake, as measured by selected criteria (e.g., nitrogen, phosphorus, and chlorophyll-a), does not sustain one or more designated uses, including aquatic life, recreation, and drinking water (USEPA, 2000a,b). The U.S. Environmental Protection Agency (USEPA) is charged with establishing national standards and criteria for assessing lake water quality. However, it is increasingly evident that a single set of national water quality standards that does not take into account the hydrogeologic and ecological differences among lakes will not be viable, as lakes have different inherent capacities to meet such standards.

For example, the USEPA-suggested criterion for lake phosphorus (i.e., 30 ppb) is inconsistent with the best attainable threshold for most Nebraska lakes, about 60 ppb. There is no evidence that phosphorus concentrations in Nebraska lakes were ever at 30 ppb. Hence, this is an unrealistically low target. An alternative is a method for defining "classes" or "groups" of lakes determined to be similar to one another in terms of their potential capacity to attain a certain water quality level. These classes could then be used as the primary framework for setting different water quality criteria or standards ("reference conditions") for lake management. This approach assumes that reference conditions are those that existed prior to extensive permanent settlement of Nebraska in the 19th century. The primary objective of this paper is to demonstrate a geographic information system (GIS)-based approach to reservoir watershed classification that could be useful for establishing lake water quality standards. The corresponding research questions are: what is the optimal number of Nebraska reservoir classes, and which watershed characteristics most influenced the reservoir classification process?

To be effective, a classification system designed to assess potential lake conditions must be based on environmental variables that underlie, determine, and explain the patterns of change in chemical or biophysical water quality performances over seasonal or annual cycles (Warren, 1979). Previous efforts to classify lakes have been based either on actual (contemporary), measurable biophysical conditions of lakes, or on biogeographic characteristics of ecological regions or zones (Vollenweider, 1968; Carlson, 1977; Schindler, 1971; Jensen and Van der Maarel, 1980; Omernik, 1987; Omernik et al., 1991; Lomnicky, 1995; Nues et al., 1996; Heiskary, 2000; Winter, 2001; USEPA, 2002; Jenerette et al., 2002; Robertson and Saad, 2003). For example, Schindler (1971), Carlson (1977), and Heiskary (2000) classified lakes based on indices of lake performance that required extensive and/or repeated sampling of lake water quality parameters such as nitrogen, phosphorus, chlorophyll-a, and turbidity. Other researchers have developed landscape classification systems based on the characteristics of biogeographic or hydrogeologic regions (e.g., ecoregions and hydrologic landscapes) that may relate to potential conditions of lakes and other water bodies (Omernik, 1987; Omernik et al., 1991; Maxwell et al., 1995; Hargrove and Luxmoore, 1998; Winter, 2001; McMahon et al., 2001; USEPA, 2002; Robertson and Saad, 2003).

Classification frameworks such as those mentioned above, while quite effective for a number of applications, do exhibit some shortcomings with regard to setting lake water quality standards. For example, most lake classifications are based on observed, extant water quality data or on environmental variables that are often impacted by human activities and, thus, cannot be used directly for determining lake classes. The USEPA suggested that, to the extent possible, classification or regionalization schemes used to assess potential capacity of lakes to attain a certain water quality level must be limited to environmental characteristics of lakes that are intrinsic, or natural, and are not the result of anthropogenic activities (USEPA, 2000b). Moreover, in-situ water quality data collection is expensive and time consuming. Regionalization schemes, on the other hand, generally use subjective criteria for delineating boundaries (e.g., ecoregions), which do not coincide with watersheds. In both cases, there is an apparently arbitrary and often subjective choice of the number of classes.

The classification of Nebraska lakes focuses on factors that were expected to have changed little since presettlement (soils, climate, and watershed configuration) and that are likely to be associated with lake water quality. At the scale of this analysis, it was assumed that Nebraska's land use was low intensity grazing and the land cover was grassland. This view is consistent with representations of pre-settlement vegetation of the Midwest and Great Plains used in other regional-scale modeling (e.g., Burke et al., 1991; Foley et al., 2004; Robertson and Saad, 2003). It is true that some Native American groups practiced some form of agriculture and that the fine-scale land cover was more complex, with woody vegetation occurring along some rivers and streams and in a few areas of northwest Nebraska. Nevertheless, this work is an attempt to model "pre-settlement" conditions in order to establish classes of lakes that were likely to have similar inherent (potential) capacities to meet a set of water quality targets.

The watershed classification approach proposed in this study is based on the premise that in the absence of human interference, lake ecosystems evolve in response to biophysical and chemical processes in their watersheds. It reflects an emerging emphasis on the watershed framework for water resource management (e.g., Warren, 1979; Satterlund and Adams, 1992; USEPA, 1993, 1997; National Research Council, 1999; Mehan, 2002; Bohn and Kershner, 2002). A watershed is a topographically defined area that collects all surface runoff and ground water and discharges them into the lake up to the furthest downstream point (Ponce, 1989; Satterlund and Adams, 1992). The lake watershed provides an important spatial framework for developing a classification system because it is the source of runoff water, sediments, and nutrients for lakes.

A vital step in developing a classification is to determine the number of classes to be used. This requires partitioning a dataset such that the entities in one group are more similar to each other than to those in other groups. Similarity refers to the distance between two data entities, where the distance decreases for entities that are most alike (Gordon, 1999). Cluster analysis has been commonly employed to group data without prior knowledge of the class structure (Tou and Gonzalez, 1974; Hartigan and Wong, 1975; Jain and Dubes, 1988; Eldershaw and Hegland, 1997; Legendre and Legendre, 1998; Gordon, 1999; Estivill-Castro and Houle, 2001). The most commonly used clustering techniques are the k-means and single linkage algorithms. The single linkage clustering algorithm is a non-iterative approach based on a local connectivity criterion (Jain and Dubes, 1988; Legendre and Legendre, 1998; Gordon, 1999). On the other hand, the k-means algorithm is an iterative and non-hierarchical clustering method that produces compact and nonoverlapping clusters of a dataset (Tou and Gonzalez, 1974; Legendre and Legendre, 1998; Gordon, 1999). The fundamental issue in any clustering approach is to determine which number of clusters best describes the optimal number of classes of the dataset (i.e., cluster validation). Several approaches have been used to determine the number of classes for a dataset (Milligan and Cooper, 1985; Legendre and Legendre, 1998; Gordon, 1999; Theodoris and Koutroumbas, 1999; Halkidi et al., 2001; Halkidi et al., 2002; Tibshirani et al., 2001; Ujjwal and Bandyopadhyay, 2002). These can be grouped into three main categories: use of internal criteria, external criteria, and relative criteria.

The internal criteria approach to cluster validation involves analyzing the clustering results based on indices derived from the data such as a proximity matrix, while the external criteria technique involves evaluating the clustering results based on a pre-defined structure that requires input from the analyst. These two approaches to cluster validation are based on statistical testing; hence they may be limited by high computational requirements (Legendre and Legendre, 1998; Gordon, 1999; Theodoris and Koutroumbas, 1999; Halkidi et al., 2001; Halkidi et al., 2002).

On the other hand, the relative criteria approach to cluster validation evaluates the clustering structure of a given clustering scheme by comparing it to other schemes that are based on the same algorithm, but with different parameter values (Gordon, 1999; Theodoris and Koutroumbas, 1999; Halkidi et al., 2002). For a given set of parameters (p) associated with a particular clustering algorithm, the possible clustering scheme C_i (i = 2, 3, ..., p) is defined by that clustering algorithm. The clustering algorithm (e.g., k-means clustering) is then run for all the clustering schemes, using numbers of clusters between 2 and n. A plot of a clustering index (e.g., Calinski-Harabasz statistic, Dunn index, Cluster Distance, R-Squared, Hubert Γ statistic, and Davies-Bouldin index) against the number of clusters usually highlights a point at which there is a significant local change in the clustering index. This change in value, which occurs as a "knee" in the plot, then represents the number of classes in the dataset (Milligan and Cooper, 1985; Halkidi et al., 2001; Halkidi et al., 2002).

For example, a comparison of different k-means cluster analyses based on different numbers of clusters fits the relative criteria scheme. Also, the Calinski-Harabasz statistic has been identified as one of the best performing indices, based on examinations of different cluster validation indices (Milligan and Cooper, 1985; Tibshirani et al., 2001; Ujjwal and Bandyopadhyay, 2002). Therefore, the Calinski-Harabasz statistic (represented by "Pseudo F" in SAS(R) output) was used in this study and it is defined as follows (Equation 1):

... (1)

where R² is the observed overall R², c is the number of clusters, and n is the number of observations (Calinski and Harabasz, 1974; SAS Institute, 2000).
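As a minimal sketch, assuming the standard SAS definition of the pseudo F statistic for c clusters and n observations, F = [R² / (c - 1)] / [(1 - R²) / (n - c)], the value for a given clustering can be computed as shown here. The function name and the sample numbers are illustrative and are not taken from the study.

```python
def pseudo_f(r_squared: float, n_clusters: int, n_obs: int) -> float:
    """Calinski-Harabasz pseudo F from overall R^2, cluster count c, and sample size n."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    if n_clusters < 2 or n_obs <= n_clusters:
        raise ValueError("need 2 <= n_clusters < n_obs")
    between = r_squared / (n_clusters - 1)          # explained variation per cluster d.f.
    within = (1.0 - r_squared) / (n_obs - n_clusters)  # residual variation per residual d.f.
    return between / within


# Hypothetical values: 78 watersheds, 13 clusters, overall R^2 of 0.80
print(round(pseudo_f(0.80, 13, 78), 2))
```

Larger values indicate clusters that are compact relative to their separation, which is why the statistic is compared across candidate numbers of clusters rather than interpreted in isolation.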

Following an unsupervised cluster analysis, a decision tree classification model can be used in a supervised classification of reservoirs based on their watershed characteristics. Because traditional supervised classification approaches such as discriminant function analysis (DFA) are parametric and may not be well suited for ecological analyses, interest in decision trees has increased in recent years. Research has shown that decision tree algorithms outperform traditional statistical approaches in accounting for variations in complex datasets for classification tasks (e.g., Quinlan, 1986; Michaelson et al., 1987; Hansen et al., 1996; Emmons et al., 1999; German et al., 1999; De'ath and Fabricius, 2000). For example, classification tree methods are not limited by prior knowledge of dataset distributions, as modeling of these distributions is not required. Thus, classification tree algorithms can easily handle multimodal distributions and they have no restrictions on sample size, in contrast to Bayesian approximators (such as DFA) and maximum likelihood classifiers.

Decision trees are recursive partitioning non-parametric statistical methods, which can account for non-linear relationships, higher order interactions, and missing values in a dataset (Breiman et al., 1984; Verbyla, 1987; Quinlan, 1986, 1993; De'ath and Fabricius, 2000). There are two types of decision tree models, i.e., classification trees and regression trees. Regression trees are appropriate when the dependent variable is numeric, whereas classification trees are more germane to instances with a categorical dependent variable, e.g., lake class (Breiman et al., 1984; Verbyla, 1987; Quinlan, 1986, 1993; De'ath and Fabricius, 2000). Previous authors (Breiman et al., 1984; Verbyla, 1987; Quinlan, 1986, 1993; Mitchell, 1997; De'ath and Fabricius, 2000) have provided detailed descriptions of decision tree procedures. Only the classification tree method is reviewed here because it was used to implement the watershed-based classification in this study.

According to De'ath and Fabricius (2000), a classification tree model can be used for data description (i.e., represent the systematic structure of the data) and for prediction (i.e., accurately predict the class membership of new observations). Classification tree methods discriminate the attribute space of a dataset into K disjoint groups, K_r (r = 1, 2, ..., K), based on decision rules that are parallel or orthogonal to the attribute axes. The classification tree identifies the best possible path (and attributes) to partition the feature space and traces a path down the tree from the root node (dataset) to leaves (classes). Each node of the tree represents a set of rules that progressively refines the classification in a top-down hierarchical approach. Classification trees can represent higher levels of complexity or deep trees (where the class separation is difficult) and more simplistic rule sets (short trees) when appropriate.

The classification tree process involves a binary recursive partitioning of the data into successive nodes. The process is binary because the parent nodes are always split into exactly two subsequent nodes, and recursive because the process can be repeated by treating each subsequent node as a parent until there are no more splits (i.e., terminal nodes or reservoir classes) (Breiman et al., 1984; Quinlan, 1993). The basic components of the classification tree building process include a set of questions, splitting criteria, and rules for assigning a class to a terminal node. Attributes that do not seem to contribute to defining the terminal nodes are usually excluded from the final tree structure, leaving only those attributes that influence the overall classification process (Breiman et al., 1984; Quinlan, 1993).
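The binary splitting step can be illustrated with a minimal sketch. It assumes an entropy-based (information gain) splitting criterion applied to a single numeric attribute; the criterion used by the software in this study may differ in detail, and the function names and sample values below are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, gain) for the best split of the form 'value <= threshold'.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, 0.0) if no split improves on the parent node.
    """
    parent = entropy(labels)
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best_threshold, best_gain = None, 0.0
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = parent - weighted
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain


# Hypothetical soil organic matter values (%) and cluster labels for six watersheds
organic_matter = [0.8, 1.1, 1.3, 2.9, 3.2, 3.6]
classes = ["A", "A", "A", "B", "B", "B"]
threshold, gain = best_binary_split(organic_matter, classes)
print(threshold, gain)
```

A full tree would apply the same search over every attribute at each node and then recurse on the two resulting subsets until a stopping rule is met.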

The initial step in this study was to identify watershed characteristics that underlie and could explain patterns of change in chemical or biophysical water quality of lakes (e.g., watershed area, watershed slope, soil organic matter, soil pH, and soil erodibility). A preliminary cluster analysis of 78 reservoirs was performed to determine the potential number of Nebraska reservoir groups. Subsequently, a classification tree model was applied to describe the structure of watershed classes, identify watershed variables that influenced the classification, and provide the framework for a predictive model that relates watershed conditions to reservoir classes. A final determination of the number of reservoir classes was based on both statistical modeling and water resource management considerations.


This research focuses on Nebraska, a state representative of many mid-latitude areas having agriculturally dominated economies (Figure 1). Nebraska encompasses a broad range of climatic, physiographic, land use, and water quality conditions. Elevations range from 256 meters in the east to 1654 meters in the west. About 30% of the state is dominated by the Sand Hills, grass-covered sand dunes mostly devoted to grazing. The climate is characterized by a gradient of rainfall and temperature regimes along an east-west axis. Average annual precipitation varies from 36 cm in the northwest to 86 cm in the southeast; temperatures vary between -20°C and 30°C (Johnsgard, 2001; Kuzelka et al., 1993).

In semiarid agriculturally dominated environments such as Nebraska, water quality impairment stems primarily from the transport of soil sediments, agrochemicals, and animal wastes via runoff from croplands and livestock operations into streams and lakes. There are about 13,500 lakes in Nebraska including natural lakes, reservoirs, and sand pits. The condition of Nebraska's lake waters is largely unknown, although it is suspected that many are impaired to some degree. Over the past two decades, the Nebraska Department of Environmental Quality (NDEQ) and the School of Natural Resources (University of Nebraska-Lincoln) have sampled about 225 Nebraska lakes and also developed a database that describes lake chemical and biophysical water quality characteristics (Holz, 2002). These data provide a valuable resource for studies of lake water quality.

The primary water source for the natural lakes and sand pits is ground water, while reservoirs are mostly fed by surface water via streams. As stream-fed lakes tend to reflect their hydrogeologic settings, i.e., watershed characteristics, the environmental conditions in the reservoir watershed can be good indicators of the reservoir water quality. Hence, the primary research focus is the watersheds of Nebraska reservoirs.


Mapping Nebraska's Lakes

As this research focuses on classification of lake watersheds, accurate identification of Nebraska reservoirs is necessary to delineate their watershed boundaries and determine groups of reservoirs based on their water quality potential. Because none of the existing maps provided a complete and accurate depiction of the number and locations of lakes in Nebraska, a database of lake locations was developed using several data sources and ArcGIS(R) software. Initially, all water features from the latest version of the Natural Resources Conservation Service (NRCS) Soil Survey Geographic Database (SSURGO) were extracted. This dataset was then updated using other data sources, including the U.S. Geological Survey (USGS) National Hydrography Dataset (NHD), USGS National Land Cover Data (NLCD), and the U.S. Census Bureau TIGER (Topologically Integrated Geographic Encoding and Referencing) dataset. Lake polygons in the GIS coverage were filtered to remove polygons that were less than 0.8 hectares in area. A threshold of 0.8 hectares was used as it generally reflects the maximum size of polygons included in the data as a result of digitizing error or slivers from data transformations. The filtered coverage was then edited and populated with lake type information to generate a map of Nebraska lakes. All reservoirs larger than 4 hectares (10 acres) were extracted from the Nebraska lakes database. The 4-hectare size restriction corresponds to the minimum threshold required for USEPA lake nutrient criteria development (USEPA, 2000b). The reservoir coverage was then used in a watershed boundary delineation process described below.

Delineating Reservoir Watersheds

It is necessary to use a simple, automated means to delineate reservoir watershed boundaries that could, potentially, be applied nationally. Automated GIS-based procedures for watershed delineation often use digital elevation models (DEM). This work employed the USGS Elevation Derivatives for National Applications (EDNA) datasets derived from a seamless 30-meter resolution DEM available for the conterminous United States (Verdin and Verdin, 1999; Gesch et al., 2002; USGS, 2001a). The EDNA datasets, obtained from EROS Data Center, include Pfafstetter sub-catchments, modified hydrologic unit boundaries, DEM-generated (synthetic) streamlines, flow direction, and shaded relief data for all areas that drain into Nebraska waters (Pfafstetter, 1989; Verdin and Verdin, 1999; Gesch et al., 2002).

The EDNA datasets were used in ArcView(R) GIS to delineate the watersheds of 88 Nebraska reservoirs. These reservoirs were selected because their locations and characteristics are well documented in an existing lake water quality database developed by NDEQ and the School of Natural Resources, University of Nebraska-Lincoln (Holz, 2002). Watershed boundaries of these reservoirs were delineated using the EDNA stage-2 ArcView(R) GIS extension together with the ArcView "Hydro" extension (Olivera et al., 2000; USGS, 2001). This process identifies a reservoir's watershed based on stream network, streamflow direction, and sub-catchment information available in the EDNA dataset (Verdin and Verdin, 1999; USGS, 2001b). The watershed boundaries and reservoir attribute data were then used as main components of a GIS database.

Watershed boundaries of 18 randomly selected reservoirs were overlaid on digital raster graphics (DRG) and manually digitized watershed boundaries, obtained from the Nebraska Department of Natural Resources (DNR), to evaluate the effectiveness of the DEM-based automated watershed delineation process. The DRG was used as background data, while DNR-digitized watershed boundaries served as the validation dataset. A visual examination showed no major differences between the DEM-derived and DNR-digitized watershed boundaries. Also, the percentage deviation of DEM-derived watershed boundaries from DNR-digitized watershed boundaries was determined using watershed topologic, geometric, and hydrographic parameters such as total drainage area, catchment slope, mean drainage density, and total and mean drainage length (Garbrecht and Martz, 2003). For example, the percent deviation of DEM-derived watershed boundaries from the digitized watershed boundaries, based on total drainage area, was computed as follows (Equation 2):

... (2)

where ABS is a function used to transform the difference into absolute values.
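A minimal sketch of this comparison follows, assuming the usual form of a percent deviation, ABS(DEM-derived value - DNR-digitized value) / DNR-digitized value x 100, with the DNR-digitized boundary as the reference. The function name and the drainage areas are hypothetical.

```python
def percent_deviation(dem_value: float, reference_value: float) -> float:
    """Absolute percent deviation of a DEM-derived value from a reference value."""
    if reference_value == 0:
        raise ValueError("reference_value must be non-zero")
    return abs(dem_value - reference_value) / abs(reference_value) * 100.0


# Hypothetical total drainage areas (km^2) for one watershed:
# DEM-derived area first, DNR-digitized (reference) area second
print(round(percent_deviation(101.8, 100.0), 2))
```

The same function applies unchanged to the other parameters listed above, such as drainage density or mean watershed slope.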

Derivation of Watershed Characteristics

Environmental factors that affect lake water quality are usually interrelated and complex. A subset of 78 reservoir watershed boundaries, comprising reservoirs in the GIS database whose watersheds fall within the Nebraska state boundary, was identified and used for subsequent analyses. Environmental characteristics were extracted for each watershed (Table 1). These characteristics include watershed area, watershed slope and relief, soil erodibility, soil infiltration rate, soil organic matter, soil reaction (pH), soil cation exchange capacity, soil carbonate, soil clay content, soil water holding capacity, soil permeability, and climate (precipitation, temperature, and humidity) (Soil Survey Staff, 1993; Thornton et al., 1997). The selection of watershed characteristics was guided, in part, by the potential to obtain data nationally and by what was available in the GIS database. As noted earlier, land cover was considered uniform (mostly grassland) as our interest was in modeling potential lake water quality groups using presumed pre-settlement watershed conditions.

All data were rasterized and resampled to match the 30-meter resolution of the DEM data. The watershed boundary coverage was then used to clip the raster layers for each watershed characteristic, e.g., soil erodibility. Next, summary or "zonal" statistics for each watershed variable (e.g., maximum, minimum, and mean erodibility values) were generated for each reservoir watershed using ArcMap(R) GIS. The above process was repeated to derive summary statistics for climate variables such as total precipitation, precipitation intensity, and maximum temperature. All the summary statistics were then appended to the GIS database and the resultant data were exported to a spreadsheet for further statistical analyses.
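The zonal summary step can be sketched in plain Python, assuming two aligned rasters of equal shape: one holding a watershed identifier per cell and one holding the variable of interest. This only illustrates the idea and is not the ArcMap procedure; the grids and names below are hypothetical.

```python
from collections import defaultdict


def zonal_statistics(zone_grid, value_grid):
    """Return {zone_id: {"min", "max", "mean"}} for value cells grouped by zone id.

    Cells whose zone id is None lie outside every watershed and are skipped.
    """
    cells = defaultdict(list)
    for zone_row, value_row in zip(zone_grid, value_grid):
        for zone_id, value in zip(zone_row, value_row):
            if zone_id is None:
                continue
            cells[zone_id].append(value)
    return {
        zone_id: {"min": min(vals), "max": max(vals), "mean": sum(vals) / len(vals)}
        for zone_id, vals in cells.items()
    }


# Hypothetical 3 x 4 grids: watershed ids (None = outside) and soil erodibility values
watersheds = [
    [1, 1, 2, 2],
    [1, 1, 2, None],
    [1, None, 2, 2],
]
erodibility = [
    [0.20, 0.24, 0.32, 0.36],
    [0.22, 0.26, 0.30, 0.99],
    [0.28, 0.99, 0.34, 0.38],
]
stats = zonal_statistics(watersheds, erodibility)
print(stats[1]["mean"], stats[2]["max"])
```

Each resulting dictionary corresponds to one row of summary attributes that would be appended to a watershed record.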

Assessing the Number of Nebraska Reservoir Clusters

For lake management, one would like to have the fewest number of classes that can be used to effectively distinguish lakes that have similar intrinsic capacities to meet water quality standards. k-means cluster analysis was used to determine the number of Nebraska reservoir watershed classes. Specifically, the relative criteria approach to cluster validation was used to identify the potential number of clusters. A series of cluster analyses was performed for 78 selected reservoir watersheds in Nebraska. The variables (i.e., watershed characteristics) used in the cluster analysis included watershed size, mean watershed slope and relief, soil erodibility, soil infiltration rate, soil organic matter, soil reaction (pH), soil cation exchange capacity, soil carbonate, soil clay content, soil water holding capacity, soil permeability, and climate variables, e.g., precipitation, temperature, and humidity (Table 1). This dataset is not exhaustive but represents data that were available at the time of the study.

The "FASTCLUS" procedure in the SAS(R) software was used to cluster the watershed dataset into different numbers of classes ranging from 2 to 25. FASTCLUS finds disjoint and non-overlapping clusters of observations using the k-means clustering method such that observations that are very close to each other (i.e., have the least "sum of squared distances") are usually assigned to the same cluster, while observations that are far apart are assigned to different clusters (SAS Institute, 2000). The maximum of 25 classes corresponds to a reasonable uppermost limit of watershed management classes based on several clustering attempts. The Pseudo F statistic was used to assess the different clustering outputs by plotting the Pseudo F values against the number of classes (hereinafter referred to as NCL). This Pseudo F plot highlighted the potential NCLs of Nebraska reservoirs as 3, 5, 13, 17, and 19.
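One simple way to screen a Pseudo F curve for candidate numbers of classes is to flag local peaks, i.e., values of NCL whose Pseudo F exceeds that of both neighbors. The sketch below assumes that reading of a "local change"; in the study the candidates were read from a plot, and the Pseudo F values here are invented for illustration.

```python
def local_peaks(pseudo_f_by_ncl):
    """Return the NCL values whose Pseudo F is higher than at both adjacent NCLs."""
    ncls = sorted(pseudo_f_by_ncl)
    peaks = []
    for prev, cur, nxt in zip(ncls, ncls[1:], ncls[2:]):
        value = pseudo_f_by_ncl[cur]
        if value > pseudo_f_by_ncl[prev] and value > pseudo_f_by_ncl[nxt]:
            peaks.append(cur)
    return peaks


# Invented Pseudo F values for NCL = 2..8 (not results from the study)
curve = {2: 30.1, 3: 34.5, 4: 31.0, 5: 33.2, 6: 32.0, 7: 31.5, 8: 31.9}
print(local_peaks(curve))
```

The endpoints of the range are never flagged because they have only one neighbor, so the scanned range should extend beyond the numbers of classes of practical interest.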

As there was more than one NCL that corresponded to the local changes in Pseudo F values, further testing was needed to identify a single NCL that best represents the number of classes. Consequently, the potential NCLs were evaluated using a predictive model (classification tree) to refine the selection of the NCLs with respect to their predictive effectiveness (Tibshirani et al., 2001). Although the predictive accuracy can be based on a single training model, the reliability of such an accuracy estimate is usually increased by the use of an averaged weighted prediction error of several models as provided by cross-validation error (Breiman et al., 1984; Goute, 1997; De'ath and Fabricius, 2000). According to Goute (1997), cross-validation error is a better indicator of model accuracy than that derived from the split-sample approach, especially when the sample size is relatively small (e.g., less than 100).

Consequently, a k-fold cross-validated error was employed to evaluate the predictive effectiveness of the classification tree model using the potential NCLs (3, 5, 13, 17, and 19) as the dependent variables. A 10-fold cross-validation was used in the classification tree model because it is the typical number of subsets (or partitions) often applied in k-fold cross-validations (Breiman et al., 1984; Quinlan, 1993; De'ath and Fabricius, 2000; RuleQuest, 2003). The watershed characteristics that were employed in the cluster analyses were used in this study as independent variables. For a given potential number of classes (NCLs), See5(R) classification tree software was used to compute the error rates for each of 12 arbitrarily selected and separate 10-fold cross-validation trials. The mean cross-validation error rate was then estimated for each of the 12 trials.
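The repeated k-fold procedure can be sketched as follows. This is a generic illustration and not the See5 implementation: the `fit` and `predict` callables, the majority-class stand-in classifier, the unweighted averaging of fold errors, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter


def kfold_error(features, labels, fit, predict, k=10, seed=0):
    """Mean misclassification rate over k folds for one random partition."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    fold_errors = []
    for held_out in folds:
        if not held_out:
            continue  # can happen only when there are fewer samples than folds
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        wrong = sum(predict(model, features[i]) != labels[i] for i in held_out)
        fold_errors.append(wrong / len(held_out))
    return sum(fold_errors) / len(fold_errors)


def repeated_kfold_error(features, labels, fit, predict, k=10, trials=12):
    """Average the k-fold error over several trials, each with a different partition."""
    errors = [kfold_error(features, labels, fit, predict, k, seed) for seed in range(trials)]
    return sum(errors) / len(errors)


# Stand-in classifier: always predicts the most common class in the training data
def fit_majority(train_features, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, feature):
    return model


toy_features = list(range(40))
toy_labels = ["A"] * 30 + ["B"] * 10
print(repeated_kfold_error(toy_features, toy_labels, fit_majority, predict_majority))
```

Replacing the stand-in with a real classification tree learner would leave the partitioning and averaging logic unchanged.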

As 3 NCL was the minimum number of classes, the mean cross-validation error rates for the remaining potential NCLs (5, 13, 17, and 19) were normalized with respect to 3 NCL (i.e., the reference NCL). This was performed to determine which increase in NCL resulted in the least corresponding increase in mean cross-validation error. The normalized mean cross-validation error rate (NME) was computed as follows (Equation 3):

... (3)

where ME is the mean cross-validation error rate, NCL is the number of classes, n is the potential NCL, and L is the reference NCL. Outputs of the above computation were plotted against the potential NCL in order to identify the preliminary number of Nebraska reservoir classes.

Also, a map of the preliminary reservoir classes (i.e., 13 NCL) was compared with maps showing 3 and 5 NCLs to understand the relative importance of the different numbers of classes. Class membership information from the FASTCLUS output for the potential NCLs was exported to a spreadsheet and appended to the GIS database. The updated database was then used in ArcMap(R) GIS software to generate maps showing reservoir watershed classes for each of the potential NCLs. However, maps of 17 and 19 NCLs were excluded from further consideration because they did not show any visible additional detail beyond the 13 NCL map.

Having identified the preliminary number of Nebraska reservoir classes, a classification tree model was generated to describe the structure of the different classes, as well as to identify the variables that contributed to the segregation of these classes. A review of the terminal nodes in the classification tree revealed that only 9 classes (10 nodes) were represented by the classification tree instead of 13 classes. Reservoir classes 3, 5, 6, and 11 were missing from the classification tree and had only one watershed in each class, an indication of non-compact classes or nodes. Usually classification tree nodes (classes) that are not sufficiently compact are subsequently split or recombined into other classes (Breiman et al., 1984; Quinlan, 1993; De'ath and Fabricius, 2000; RuleQuest, 2003). Based on mean class values for the respective watershed characteristics (Table 2) in the classification tree, the 13 NCL map was revised to show 9 optimal classes. ArcMap(R) GIS was used to update the attribute information of the 13 NCL map by reassigning reservoir watersheds from classes 3 to 4, 5 to 12, 6 to 13, and 11 to 8, respectively.
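The reassignment described above amounts to a simple lookup. The sketch below uses the four merges stated in the text; the list of class labels is a hypothetical stand-in for the attribute table.

```python
# Merges stated in the text: class 3 -> 4, 5 -> 12, 6 -> 13, 11 -> 8
MERGES = {3: 4, 5: 12, 6: 13, 11: 8}


def reassign(class_ids):
    """Map each 13-class label to its revised class, leaving other labels unchanged."""
    return [MERGES.get(c, c) for c in class_ids]


# Hypothetical attribute column containing each of the 13 original classes once
original = list(range(1, 14))
revised = reassign(original)
print(sorted(set(revised)))
```

Applying the four merges to the 13 original labels leaves nine distinct class labels, matching the nine classes reported.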


A final digital map of Nebraska lakes shows a total of 13,520 lakes, including 6,796 reservoirs, 3,644 natural lakes, 3,068 sand pits, and 12 oxbow lakes. This map is believed to be the most comprehensive and accurate representation of Nebraska lakes at this time. Figure 1 shows all Nebraska reservoirs larger than 4 hectares. DEM-derived watershed boundaries of 78 selected reservoirs are shown in Figure 2. Although these watershed boundaries have not been field-checked, preliminary accuracy evaluations indicate that the automated DEM-derived watershed boundaries are comparable with the manually digitized DNR watershed boundaries.

Furthermore, a comparison of topographic, topologic, and hydrologic parameters showed less than 10% deviation of DEM-derived from DNR-digitized watershed boundaries (Table 3). For example, the percent deviations based on total drainage area, drainage density, and mean watershed slope were 1.79, 4.12, and 1.84, respectively. It is important that the deviations for these three parameters were all less than 5%, because total drainage area, drainage density, and mean watershed slope are critical to the transport of sediment and agricultural pollutants via streams to reservoirs (Satterlund and Adams, 1992). These results underscore the potential of the automated watershed boundary delineation techniques described in this paper to eliminate the need for laborious digitization.
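A percent deviation of this kind can be computed as sketched below. The formula (absolute difference relative to the DNR-digitized value) is an assumption, since this section does not state the exact definition used for Table 3, and the drainage areas are hypothetical numbers rather than values from the study.

```python
def percent_deviation(dem_value, dnr_value):
    """Absolute deviation of a DEM-derived value from the DNR reference, in percent."""
    if dnr_value == 0:
        raise ValueError("reference value must be non-zero")
    return 100.0 * abs(dem_value - dnr_value) / abs(dnr_value)


# Hypothetical drainage areas in square kilometers (not taken from Table 3).
dem_area = 101.0
dnr_area = 100.0
deviation = percent_deviation(dem_area, dnr_area)
print(round(deviation, 2))
```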

Results of the clustering process are shown in Figures 3 and 4. Figure 3 shows the potential number of classes (NCL) as 3, 5, 13, 17, and 19. A plot of normalized mean cross-validation error rate (NME) was used to identify the preliminary NCL (13) that might best represent Nebraska reservoir classes. Maps of the reservoir classes 3 NCL, 5 NCL, and 13 NCL are shown in Figure 5. Changes in spatial patterns in the maps of reservoir watershed classes in Figures 5a, 5b, and 5c appear to reflect a hierarchy of major environmental conditions that affect lake processes, as the NCL changed from 3 to 13 (see Maxwell et al., 1995). The map of 3 NCL shows how the watershed classes mimic climate (such as maximum temperature) and related vegetation patterns in Nebraska (Figure 5a). Reservoir watersheds in class 1 occupy the tall grass prairie in eastern Nebraska, while reservoir watersheds in classes 2 and 3 are dominated by the Sand Hills prairie and the Niobrara shrub land, respectively. The spatial pattern of reservoir watershed classes in the map of 5 NCL reflects the patterns of both climate and terrain characteristics (such as temperature and relief) on the watersheds (Figure 5b). In addition, reservoir classes in the 5 NCL map show further segregation of the classes in the 3 NCL map.

The map of 13 NCL shows spatial patterns in the reservoir watershed classes that are dominated by climate, terrain, and soil characteristics (Figure 5c). Reservoir watersheds in the northeastern part of Nebraska belonged to class 2. The average size of reservoirs in this group is in the lower 25% of the sampled reservoirs. The watersheds of these reservoirs are generally small and characterized by low relief, high soil erodibility, and high soil organic matter content (Table 2). Reservoir watersheds in classes 1 and 13 dominate southeastern Nebraska. Reservoirs in class 1 are, on average, smaller than those in class 13. Also, the average watershed size in class 13 appears to be larger than that of class 1. Both watershed classes have high soil organic matter content and relatively low soil erodibility; however, the watersheds in class 1 have steeper slope and higher relief compared with watersheds in class 13.

Reservoir watersheds in northwestern Nebraska belong to classes 3, 9, 10, and 11. Classes 3 and 11 have only one watershed each, while classes 9 and 10 have seven and two watersheds, respectively. Although classes 10 and 11 are adjacent, they are not similar. For example, the class 11 watershed has a significantly larger area and traverses higher relief terrain than class 10 watersheds. In the north-central part of the state, there are two reservoirs in class 4. These reservoirs are characterized by large watersheds, with relatively low soil organic matter content and high relief. Reservoir watersheds in class 7 are aligned diagonally between the central and southwestern part of the state. These watersheds are fairly similar to class 4 watersheds, except that they exhibit relatively lower relief. The central and southwestern portions of Nebraska are dominated by classes 12 and 8, respectively. Class 8 reservoirs are larger and have larger watersheds than reservoirs in class 12. Also, class 12 watersheds are found in low relief areas compared with those in class 8. The aforementioned descriptions of the spatial variability of watershed classes in the map of 13 NCL provide a synoptic overview of the general characteristics of these classes.

Additional discussions with respect to how the watershed characteristics influenced the segregation of these classes are provided below. A classification tree model (Figure 6) describes the hierarchical structure of Nebraska reservoir watershed classes, as well as variables that influenced the partitioning of these classes. The rectangular boxes in Figure 6 represent terminal nodes (i.e., no further division of the group) and are assigned a class number. At times, the same class number may be assigned to more than one terminal node. The oval boxes represent non-terminal nodes and require further splitting. The cross-validation prediction error of the classification tree model for reservoir watersheds was 26.33%. Because some reservoir classes (3, 5, 6, and 11) were missing from the classification tree, the map of 13 NCL was revised to show only nine optimal classes (Figure 7). Characteristics of the revised reservoir classes (9 NCL) are summarized in Table 4.

As can be seen from Figure 6, soil organic matter content was responsible for the initial split of watershed classes. Watersheds in classes 4, 7, 8, 9, 10, and 12 were relatively poor in organic matter, while watersheds in classes 1, 2, and 13 were rich in organic matter. The ability of soils to absorb agricultural effluents like pesticides decreases with a decrease in organic matter content (Kumada, 1987; Sparling et al., 2003). It is therefore important to note that most of the reservoir classes (i.e., classes 4, 7, 8, 9, 10, and 12) are inherently vulnerable to pollution from agricultural chemical effluents. Among these watersheds, soil cation exchange capacity (CEC) and drainage density were responsible for final splits into classes 9 and 12. Also, watershed relief, soil CEC, and pH influenced the final splitting into classes 4, 7, 8, and 10. Classes 4 and 7 differed primarily in their respective watershed relief. Despite their low drainage density, both groups have relatively acidic soils with correspondingly low buffering capacity (CEC of less than 12.3). Specifically, the low relief reservoir watersheds in class 7 (i.e., relief less than 247 meters) are even more vulnerable to pesticides or herbicide effluents from agricultural activities in their watersheds.
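To illustrate how a single threshold split, such as the initial split on soil organic matter, might be scored, the sketch below computes entropy-based information gain, a standard criterion in C4.5-style trees. The exact criterion used to build Figure 6 is not stated in this section, so treating it as comparable is an assumption; the organic matter values, the "poor"/"rich" labels, and the thresholds are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Reduction in entropy from splitting the cases at value < threshold."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


# Hypothetical watershed values (percent organic matter) and group labels.
organic_matter = [0.8, 1.0, 1.2, 2.9, 3.1, 3.4]
group = ["poor", "poor", "poor", "rich", "rich", "rich"]

gain_clean = information_gain(organic_matter, group, 2.0)  # separates the groups
gain_weak = information_gain(organic_matter, group, 0.9)   # isolates one case only
print(gain_clean, gain_weak)
```

A tree builder would evaluate many candidate variables and thresholds in this way and split on the one with the best score, which is how a single variable can emerge as the first split.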

The separation of organic-rich reservoir watersheds into classes 1, 2, and 13 was influenced by soil erodibility, watershed slope, and organic matter content, respectively. By comparison, the reservoirs in these watershed classes are less susceptible to potential pollution from agricultural effluents like herbicides. The factors that influenced the final partitioning of classes 1, 2, and 13 emphasize the significance of land management practices aimed at controlling soil erosion in these watersheds. This is particularly true of the relatively high mean watershed slope and soil erodibility values for reservoirs in classes 1 and 2. Finally, it can be inferred from the revised map that the water quality of Nebraska reservoirs could be managed sufficiently based on 9 optimal classes (Figure 7).

SUMMARY AND CONCLUSIONS

A GIS-based approach to watershed classification of Nebraska reservoirs was described in this paper. The GIS was first used to develop an updated digital map of Nebraska lakes and to differentiate reservoirs from natural lakes and sand pits. A DEM-based automated technique was also used to delineate reservoir watershed boundaries from the nationally available EDNA (Elevation Derivatives for National Applications) dataset. The watershed boundaries of 78 selected reservoirs were then used to extract summary statistics for watershed characteristics datasets that were believed to influence biophysical and chemical water quality of lakes. As the primary interest was in modeling pre-settlement (potential) watershed conditions for the purpose of establishing lake water quality standards, human influence was considered to be minimal and hence Nebraska land use/land cover was treated as predominantly native grassland. This assumption was essential to the ultimate lake management goal of developing baseline information (water quality standards or reference conditions) against which to assess the extent of human impacts on different classes or groups of reservoirs.

A series of cluster analyses was performed on the watershed characteristics datasets in order to statistically determine the possible structure of Nebraska reservoir classes. A plot of the Pseudo-F values (obtained from the cluster analysis output) against the respective number of classes (NCL) suggested that Nebraska reservoirs could potentially be grouped into 3, 5, 13, 17, and 19 classes. Further analysis of the NCL was performed based on the predictive effectiveness of the potential NCLs using a classification tree model. The outcome of the classification tree modeling indicated that the preliminary number of Nebraska watershed classes was 13 NCL. Furthermore, the classification tree was used to assess the hierarchical structure of the Nebraska reservoir classes, and soil organic matter content was found to be the most important single variable for partitioning the watersheds. The cross-validation prediction error of the classification tree model for Nebraska reservoir watersheds was 26.3%. Finally, the preliminary 13 NCL map was revised to 9 NCL due to 4 missing classes in the classification tree.
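For readers unfamiliar with the Pseudo-F statistic, the sketch below computes it in the Calinski and Harabasz (1974) form, the ratio of between-cluster to within-cluster variation, each divided by its degrees of freedom. It is assumed here that this matches the statistic reported in the cluster analysis output; the four two-dimensional points and both labelings are hypothetical.

```python
def pseudo_f(points, labels):
    """Pseudo-F: (BSS / (k - 1)) / (WSS / (n - k)) for a given partition."""
    n = len(points)
    dims = len(points[0])
    clusters = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")
    grand = [sum(p[d] for p in points) / n for d in range(dims)]
    bss = 0.0  # between-cluster sum of squares
    wss = 0.0  # within-cluster sum of squares
    for members in clusters.values():
        center = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        bss += len(members) * sum((center[d] - grand[d]) ** 2 for d in range(dims))
        wss += sum((p[d] - center[d]) ** 2 for p in members for d in range(dims))
    if wss == 0:
        raise ValueError("within-cluster variation is zero; pseudo-F is undefined")
    return (bss / (k - 1)) / (wss / (n - k))


# Hypothetical data: two tight, well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
f_compact = pseudo_f(points, [0, 0, 1, 1])  # groups follow the real structure
f_mixed = pseudo_f(points, [0, 1, 0, 1])    # groups cut across the structure
print(f_compact, f_mixed)
```

Larger values indicate more compact, better separated groups, which is why peaks in a plot of Pseudo-F against NCL are read as candidate numbers of classes.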

Conclusions and Recommendations

The foregoing results suggest that Nebraska reservoirs can be characterized by nine optimal classes based on watershed characteristics. Moreover, soil organic matter content is the most important variable for segregating the reservoirs. Also, the watershed-based reservoir classification procedure described in this paper has potential national applications. This is because geospatial data used in this study are available for the entire United States. In particular, the automated DEM-based watershed boundary delineation method is nationally available.

Through model refinement, outputs of the classification tree procedure for the watershed-based reservoir classification could provide water resources managers an effective decision-support tool in the management of reservoir water quality. For example, the classification results could inform resource managers in the reservoir nutrient criteria development process. Furthermore, an interpretive classification interface can be generated in the See5(R) software based on a classification tree model to predict the classes to which new cases belong. This is particularly useful to water resource managers interested in identifying the class membership of a particular reservoir.

Although successful, this research clearly suggests the need for additional investigation. Future work should include expanding the STATSGO datasets to incorporate watersheds that extend into neighboring states (Colorado, Kansas, South Dakota, and Wyoming). This may highlight the impact of large reservoirs on the classification results, as watersheds of most of the large reservoirs in the GIS database fall outside the Nebraska state boundary. Equally important are the potential advantages of higher resolution data (watershed characteristics derived from the SSURGO database) on the lake classification process. Further work is also needed to evaluate the performance of existing modifications or alternatives to k-means clustering. This need stems from limitations of k-means clustering such as sensitivity to outliers or extreme values, susceptibility to the choice of starting points (cluster centroids), and tendency to produce classes with most data points concentrated in a few classes (Eldershaw and Hegland, 1997; Legendre and Legendre, 1998; Gordon, 1999; Estivill-Castro and Houle, 2001).
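The susceptibility of k-means to the choice of starting centroids can be shown with a small sketch. It uses a plain Lloyd-style k-means on one-dimensional data, which is an assumption made for brevity and is not the FASTCLUS(R) procedure; the data values and both sets of starting centroids are hypothetical.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Plain k-means on 1-D data; returns final centroids and within-cluster SS."""
    centroids = list(centroids)
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for v in values:
            # Assign each value to its nearest centroid (ties go to the first).
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    wss = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return centroids, wss


# Hypothetical data with three obvious groups.
data = [0, 1, 2, 10, 11, 12, 20, 21, 22]

_, wss_spread = kmeans_1d(data, [1, 11, 21])   # one start inside each group
_, wss_crowded = kmeans_1d(data, [0, 1, 2])    # all starts inside the first group
print(wss_spread, wss_crowded)
```

With the crowded start, two centroids remain in the first group and the other two groups are merged, so the within-cluster sum of squares is far larger even though the algorithm has converged.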

Preliminary work by Bulley (2004) demonstrated that the reservoir classes derived from the watershed-based classification tree procedure compare favorably with those from other classification approaches, namely discriminant analyses and Omernik's Level IV ecoregions. In particular, the results confirmed the previous assertion that classification trees are most useful in dealing with complex datasets such as ecological data (Breiman et al., 1984; Quinlan, 1993; German et al., 1999; De'ath and Fabricius, 2000; Robertson and Saad, 2003). Nonetheless, there is a need for additional research to refine the classification tree splitting process to enhance the predictive effectiveness of the classification tree output models (Breiman et al., 1984; Quinlan, 1993; RuleQuest Research, 2003; Bulley, 2004).


This research was partially supported by the U.S. EPA Science to Achieve Results (STAR) Program (Grant R828635). Kris Verdin and Norman Bliss, USGS EROS Data Center, provided critical assistance in using EDNA and STATSGO datasets, respectively, and their efforts are gratefully acknowledged. Thanks are also extended to Josh Lear, Nebraska Department of Natural Resources, for providing an initial GIS water coverage from the Nebraska SSURGO database. This paper is a contribution of the University of Nebraska Agricultural Research Division, Lincoln, Nebraska. Journal Series Number 14748.

Bulley, Henry N.N., James W. Merchant, David B. Marx, John C. Holz, and Aris A. Holz, 2007. A GIS-Based Approach to Watershed Classification for Nebraska Reservoirs. Journal of the American Water Resources Association (JAWRA) 43(3):605-621. DOI: 10.1111/j.1752-1688.2007.00048.x

1Paper No. J04141 of the Journal of the American Water Resources Association (JAWRA). Received August 10, 2004; accepted August 5, 2006. (c) 2007 American Water Resources Association. No claim to original U.S. government works.


Bohn, B.A., and J.L. Kershner, 2002. Establishing Aquatic Restoration Priorities Using a Watershed Approach. Journal of Environmental Management 64:355-363.

Breiman, L., J.H. Friedman, R.A. Olshen, and C.J. Stone, 1984. Classification and Regression Trees. Wadsworth, Inc. Belmont, California, 358 pp.

Bulley, H.N.N., 2004. A Watershed-Based Classification System for Lakes in Agriculturally Dominated Ecosystems: A Case Study of Nebraska Reservoirs. Ph.D. Dissertation. University of Nebraska-Lincoln, 213 pp.

Burke, I., T. Kittel, W. Lauenroth, P. Snook, C. Yonker, and W. Parton, 1991. Regional Analysis of the Central Great Plains. BioScience 41(10):685-692.

Calinski, R.B., and J. Harabasz, 1974. A Dendrite Method for Cluster Analysis. Communications in Statistics 3:1-27.

Carlson, R.E., 1977. Trophic State Index for Lakes. Limnology and Oceanography 22:361-369.

De'ath, G., and K.E. Fabricius, 2000. Classification and Regression Trees: A Simple yet Powerful Technique for Ecological Data Analysis. Ecology 81(11):3178-3192.

Eldershaw, C., and M. Hegland, 1997. Cluster Analysis Using Triangulation. In: Computational Techniques and Applications: CTAC97, B.J. Noye, M.D. Teubner, and A.W. Gill (Editors). World Scientific, Singapore, pp. 201-208.

Emmons, E.E., M.J. Jennings, and C. Edwards, 1999. An Alternative Classification Method for Northern Wisconsin Lakes. Canadian Journal of Fisheries and Aquatic Sciences 56(4):661-669.

Estivill-Castro, V., and M.E. Houle, 2001. Robust Distance-Based Clustering with Applications to Spatial Data Mining. Algorithmica 30(2):216-242.

Foley, J.A., C.J. Kucharik, T.E. Twine, M. Coe, and S. Donner, 2004. Land Use, Land Cover and Climate Change across the Mississippi Basin: Impacts on Selected Land and Water Resources. In: Ecosystems and Land Use Change, R. DeFries, G. Asner, and R. Houghton (Editors). American Geophysical Union, Washington, D.C., pp. 249-261.

Garbrecht, J., and L.W. Martz, 2003. Assessing the Performance of Automated Watershed Delineation Process from Digital Elevation Models. In: GIS for Water Resource and Watershed Management, J.G. Lyon (Editor). CRC Press, Boca Raton, FL, pp. 17-24.

German, G.W.H., G.A.W. West, and M.G. Gahegan, 1999. Statistical and AI Techniques in GIS Classification: A Comparison. Proc. of the 11th Annual Colloquium of the Spatial Information Research Centre, University of Otago, Dunedin, New Zealand, December 1999. CD-ROM.

Gesch, D., M. Oimoen, S. Greenlee, C. Nelson, M. Steuck, and D. Tyler, 2002. The National Elevation Dataset. Photogrammetric Engineering and Remote Sensing 68(1):5-11.

Gordon, A., 1999. Classification, 2nd Edition. Chapman and Hall/CRC, London, 256 pp.

Goute, C., 1997. Note on Free Lunches and Cross-validation. Neural Computation 9:1211-1215.

Halkidi, M., Y. Batistakis, and M. Vazirgiannis, 2001. On Clustering Validation Techniques. Journal of Intelligent Information Systems 17(2):107-145.

Halkidi, M., Y. Batistakis, and M. Vazirgiannis, 2002. Cluster Validity Methods: Part I. SIGMOD Record 31(2):40-45.

Hansen, M., R. Dubayah, and R.S. DeFries, 1996. Classification Trees: An Alternative to Traditional Land Cover Classifiers. International Journal of Remote Sensing 17(5):1075-1081.

Hargrove, W.W., and R.J. Luxmoore, 1998. A New High-Resolution National Map of Vegetation Ecoregions Produced Empirically Using Multivariate Spatial Clustering. ESRI ARC/INFO User Conference, http://gis.esri.com/library/userconf/proc98/proceed/to350/pap333/p333.htm. Accessed on July 27, 2003.

Hartigan, J.A., and M.A. Wong, 1975. A K-means Clustering Algorithm: Algorithm AS 136. Applied Statistics 28:126-130.

Hawkins, C.P., R.H. Norris, J. Gerritsen, R.M. Hughes, S.K. Jackson, R.K. Johnson, and R.J. Stevenson, 2000. Evaluation of Landscape Classifications for the Prediction of Freshwater Biota: Synthesis and Recommendations. Journal of the North American Benthological Society 19(3):541-556.

Heiskary, S.A., 2000. Ecoregional Classification of Minnesota Lakes. In: USEPA (U.S. Environmental Protection Agency). 2000. Nutrient Criteria Technical Guidance Manual for Lakes and Reservoirs. Report No. EPA-822-B00-001. Washington, D.C., pp. B4-B5.

Holz, J.C., 2002. Lake and Reservoir Classification in Agriculturally Dominated Ecosystems. EPA 2002 Aquatic Ecosystem Classification Workshop, Denver, CO, September, 2002. Invited Oral Presentation.

Jain, A.K., and R.C. Dubes, 1988. Algorithms for Clustering Data. Prentice Hall, Upper Saddle River, NJ.

Jenerette, G.D., J. Lee, D. Waller, and R.E. Carlson, 2002. Multivariate Analysis of the Ecoregion Delineation for Aquatic Ecosystems. Environmental Management 29(1):67-75.

Jensen, S., and E. Van Der Maarel, 1980. Numerical Approaches to Lake Classification with Special Reference to Macrophyte Communities. Vegetatio 42:117-128.

Johnsgard, P.E., 2001. The Nature of Nebraska: Ecology and Biodiversity. University of Nebraska Press, Lincoln, Nebraska, 402 pp.

Kumada, K., 1987. Chemistry of Soil Organic Matter. Elsevier, Amsterdam, 242 pp.

Kuzelka, R.D., C.A. Flowerdale, R.N. Manley, and B.C. Rundquist, 1993. Flat Water: A History of Nebraska and its Water. Resource Report No. 12. Conservation and Survey Division, IANR, University of Nebraska-Lincoln, 291 pp.

Legendre, P., and L. Legendre, 1998. Numerical Ecology, 2nd English Edition. Elsevier Science, BV, Amsterdam, 853 pp.

Lomnicky, G.A., 1995. Lake Classification in the Glacially Influenced Landscape of the North Cascade Mountains, Washington, USA. PhD. Dissertation. Oregon State University, Oregon.

Maxwell, J.R., C.J. Edwards, M.E. Jensen, S.J. Paustian, H. Parrott, and D.M. Hill, 1995. A Hierarchical Framework of Aquatic Ecological Units of North America (Nearctic Zone). Technical Report NC-176:1-76. United States Department of Agriculture, Forest Service, Washington D.C., USA.

McMahon, G., S.M. Gregonis, S.W. Walton, J.M. Omernik, T.D. Thorson, J.A. Freeouf, A.H. Rorick, and J.E. Keys, 2001. Developing Spatial Frameworks of Common Ecological Regions of the Conterminous United States. Environmental Management 28(3):293-316.

Mehan, G.T., 2002. Committing EPA's Water Program to Advancing the Watershed Approach. EPA memo to Regional Water Division Directors. December 3, 2002. http://www.epa.gov/owow/watershed/memo.html. Accessed on February 18, 2004.

Michaelson, J., F. Davis, and M. Borchert, 1987. Non-parametric Methods for Analyzing Hierarchical Relationships in Ecological Data. Coenoses 1:97-106.

Milligan, G.W., and M.C. Cooper, 1985. An Examination of Procedures for Determining the Number of Clusters in a Data Set. Psychometrika 50:159-179.

Mitchell, Tom. M., 1997. Machine Learning. McGraw-Hill, New York, 414 pp.

National Research Council, 1999. New Strategies for America's Watersheds. Committee on Watershed Management, Water Science and Technology Board, Commission on Geosciences, Environment, and Resources, National Research Council. National Academy Press, Washington, D.C.

Niles, R.K., D.L. King, and R. Ring, 1996. Lake Classification Systems - Part I. The Michigan Riparian, http://www.mslwa.org/lkclassifl.html. Accessed on May 27, 2002.

Olivera, F., S. Reed, and D. Maidment, 2000. HEC-PrePro Version 2.0: An ArcView Pre-Processor for HEC's Hydrologic Modeling System. University of Texas at Austin - Center for Research in Water Resources, Austin, Texas, http://www.ce.utexas.edu/prof/olivera/esri98/p400.htm. Accessed on April 12, 2002.

Omernik, J.M., 1987. Ecoregions of the Conterminous United States. Annals of the Association of American Geographers 77:118-125.

Omernik, J.M., C.M. Rohm, R.A. Lillie, and N. Mesner, 1991. Usefulness of Natural Regions in Lake Management: Analysis of Variation among Lakes in Northwestern Wisconsin, USA. Environmental Management 15:281-293.

Pfafstetter, O., 1989. Classification of Hydrographic Basins: Coding Methodology. Unpublished Manuscript. DNOS, August 18, 1989, Rio de Janeiro. Translated by J.P. Verdin, U.S. Bureau of Reclamation, Brasilia, Brazil, September 5, 1991.

Ponce, V.M., 1989. Engineering Hydrology: Principles and Practices. Prentice-Hall, Inc., New Jersey, 627 pp.

Quinlan, J.R., 1986. Induction of Decision Trees. Machine Learning 1(1):81-106.

Quinlan, J.R., 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., CA, 302 pp.

Robertson, D.M., and D.A. Saad, 2003. Environmental Water-Quality Zones for Streams: A Regional Classification Scheme. Environmental Management 31(5):581-602.

RuleQuest Research, 2003. See5(R): An Informal Tutorial, http://rulequest.com/see5-win.html. Accessed on January 24, 2003.

SAS Institute Inc., 2000. SAS(c) Software Version 8: Users Manual. SAS Institute Inc., Cary, NC.

Satterlund, D.R., and P.W. Adams, 1992. Wildland Watershed Management, 2nd Edition. John Wiley and Sons, New York, NY, 436 pp.

Schindler, D.W., 1971. A Hypothesis to Explain the Differences and Similarities Among Lakes in Experimental Lakes Area, Northwestern Ontario. Journal of Fisheries Research, Canada 28:295-301.

Sparling, G., R.L. Parfitt, A.E. Hewitt, and L.A. Schipper, 2003. Three Approaches to Define Desired Soil Organic Matter Contents. Journal of Environmental Quality 32:760-766.

Soil Survey Division Staff, 1993. Soil Survey Manual. Soil Conservation Service, U.S. Department of Agriculture, Handbook 18, Washington, DC.

Tibshirani, R., G. Walther, D. Botstein, and P. Brown, 2001. Cluster Validation by Prediction Strength. Technical Report, Department of Statistics, Stanford University, 21 pp. http://www-stat.stanford.edu/~tibs/ftp/predstr.pdf. Accessed on May 23, 2003.

Theodoris, S., and K. Koutroumbas, 1999. Pattern Recognition. Academic Press, San Diego, CA, 625 pp.

Thornton, P.E., S.W. Running, and M.A. White, 1997. Generating Surfaces of Daily Meteorological Variables Over Large Regions of Complex Terrain. Journal of Hydrology 190:214-251.

Tou, J.T. and R.C. Gonzalez, 1974. Pattern Recognition Principles. Addison-Wesley, Reading.

Ujjwal, M., and S. Bandyopadhyay, 2002. Performance of Some Clustering Algorithms and Validity Indices. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(12):1650-1654.

USEPA (U.S. Environmental Protection Agency), 1993. The Watershed Protection Approach. EPA 840-S-93-001, Washington, D.C.

USEPA (U.S. Environmental Protection Agency), 1997. The Index of Watershed Indicators. EPA841-R7-010, Washington, D.C.

USEPA (U.S. Environmental Protection Agency), 2000a. A Summary of the National Water Quality Inventory: 2000 Report to Congress. EPA841-S-00-001, Washington, D.C.

USEPA (U.S. Environmental Protection Agency), 2000b. Nutrient Criteria Technical Guidance Manual for Lakes and Reservoirs. Report No. EPA-822-B00-001. USEPA, Washington, D.C.

USEPA (U.S. Environmental Protection Agency), 2002. Levels III and IV Ecoregions of the Continental United States (revision of Omernik, 1987). EPA National Health and Environmental Effects Laboratory, Western Ecology Division, Corvallis, Oregon.

USGS (U.S. Geological Survey), 2001a. Elevation Derivatives for National Applications (EDNA), http://edna.usgs.gov. Accessed on July 25, 2003.

USGS (U.S. Geological Survey), 2001b. EDNA Stage 2 Tool Overview, http://edcntsl2.cr.usgs.gov/ned-h/stage2/stage2.htm. Accessed on December 22, 2003.

Verbyla, D.L., 1987. Classification Trees: A New Discrimination Tool. Canadian Journal of Forestry Research 17:1150-1152.

Verdin, K.L., and J.P. Verdin, 1999. A Topological System for Delineation and Codification of the Earth's River Basins. Journal of Hydrology 218:1-12.

Vollenweider, R.A., 1968. Scientific Fundamentals of the Eutrophication of Lakes and Flowing Waters, with Particular Reference to Nitrogen and Phosphorus as Factors in Eutrophication. Tech. Report. DAS/SCI/68.27. Organization for Economic Cooperation and Development (OECD), Directorate for Scientific Affairs, Paris, France, 192 pp.

Warren, C.E., 1979. Toward Classification and Rationale for Watershed Management and Stream Protection. EPA-600/3-79-059. U.S. Environmental Protection Agency, Corvallis, Oregon, 143 pp.

Winter, T.C., 2001. The Concept of Hydrologic Landscapes. Journal of the American Water Resources Association 37(2):335-349.

Henry N.N. Bulley, James W. Merchant, David B. Marx, John C. Holz, and Aris A. Holz2

2 Respectively, (Bulley) Department of Geography and Geology, DSC 260, University of Nebraska-Omaha, Omaha, Nebraska 68182; (Merchant) Center for Advanced Land Management Information Technologies, School of Natural Resources, 306 Hardin Hall, University of Nebraska-Lincoln, Lincoln, Nebraska 68583; (Marx) Department of Statistics, 342 Hardin Hall, University of Nebraska-Lincoln, Lincoln, Nebraska 68583; and (Holz and Holz) School of Natural Resources, 507 Hardin Hall, University of Nebraska-Lincoln, Lincoln, Nebraska 68583 (E-mail/Bulley: [email protected]).

Copyright American Water Resources Association Jun 2007
