Remote Sensing of Soils in the Santa Monica Mountains:
II.  Hierarchical Foreground and Background Analysis

Alicia Palacios-Orueta, Jorge E. Pinzon, Susan L. Ustin, and Dar A. Roberts
alicia.palacios@uv.es    jepinzon@ucdavis.edu    slustin@ucdavis.edu
Center for Spatial Technologies and Remote Sensing (CSTARS)
Department of Land Air and Water Resources
University of California, Davis
Davis, CA 95616
Address for Contact:
Alicia Palacios-Orueta
Departament de termodinàmica. Facultat de Física.
Universitat de Valencia
C/ Doctor Moliner 50
46100 Burjassot, Valencia, Spain
Phone: (34) (6) 386 43 50/ 398 3110
Fax: (34) (6) 364 23 45
e-mail: alicia.palacios@uv.es

Abstract

Hierarchical foreground and background analysis (HFBA) was used to discriminate soil properties from two valleys in the Santa Monica Mountains Recreation Area, California.  The analysis was organized in two levels.  First, spectral data from laboratory-measured soil samples were used to train a vector that was applied to AVIRIS data to classify the soils between valleys.  Organic matter and iron contents were then predicted at a second level of resolution.  Results showed that, in the laboratory, soils could be classified with a high level of accuracy.  When applied to the image, the spatial predictions of organic matter and iron content were consistent for the first level of classification.  The ranges of predicted organic matter and iron contents developed at the second level of classification were also consistent with the magnitude and distribution of field samples.  The presence of vegetation and the steep terrain adversely affect the ability to resolve these soil properties.

Introduction

Soil variability is a critical issue when modeling at landscape or regional scales.  Since soils are the buffer between surface processes and the underlying rock, they are continuously adjusting to changing environmental conditions.  The extreme spatial and temporal variability of these processes makes soil properties extremely variable and therefore difficult to measure.  Because surface processes occur at different scales, it is necessary to work at sufficiently large spatial resolution and coverage for generalizations to be made.  Certain soil characteristics can be used as indicators of long- or short-term landscape stability at different time or spatial scales.  Although there is a large set of soil properties that could be used to study soil changes, the key to finding a practical way to measure them at large scales is to select the properties most related to soil changes for a specific environment.  Imaging spectrometry, a new development in satellite and airborne remote sensing, offers a potential way to map certain soil properties that are relevant to surficial processes at the landscape scale.
 
Organic matter is a soil property closely related to soil quality, not only as an indicator of soil erosion and degradation, but also as a regulating factor of processes such as nutrient availability, water holding capacity, and permeability.  Because values of organic content are highly variable and react very quickly to external changes (Gerrard, 1992), decomposition rates show high spatial variability.  The spatial distribution of organic matter content can be an indicator of the rate of decomposition and other processes happening on the soil surface.  Another property related to soil surface characteristics is the iron content.  Since iron content varies with erosion class (Latz et al., 1984), and weathering level (Coleman et al., 1991), this property can be used to monitor changes in soil quality.
 
Remote sensing techniques are a relevant tool for dealing with these issues.  In the last few years these techniques have improved in such a way that they offer the potential of direct analysis of soil properties.  Several multispectral sensors have already been used for discrimination between soils (Lewis et al., 1975; Agbu et al., 1990; Coleman et al., 1993).  Specifically, there are several studies where organic matter and iron content have been analyzed in terms of their reflectance properties (Al-Abbas et al., 1972; Stoner and Baumgardner, 1981; Latz et al., 1984; Coleman and Montgomery, 1987; Henderson et al., 1989; Coleman et al., 1991).
 
Imaging spectrometer data are both a set of spatially contiguous spectra and a set of spectrally contiguous images (Kruse et al., 1996).  Hyperspectral data have been shown to be useful for improved discrimination of minerals (Clark et al., 1990; Kruse et al., 1990; Kruse et al., 1993; Cloutis, 1996); the possibilities for using these sensors are more extensive than applications for broad-band sensors such as Landsat TM or SPOT.  Palacios-Orueta and Ustin (1996) found that data from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS), a 224-band airborne imaging spectrometer, were suitable for discriminating between similar bare soils.  They compared the distribution of AVIRIS data with laboratory spectrometry data, concluding that the distributions of both data sets followed the same statistical patterns.  Other studies have dealt with soil identification and discrimination directly or indirectly using AVIRIS data (Smith et al., 1990; Mustard, 1993; Roberts et al., 1993).
 
Another issue when dealing with remote sensing is the problem of scale.  An important concern is not to lose soil process information when spatially averaging at pixel sizes.  If the variability within pixels is small, or if high within-pixel variability is due to random processes, then decreasing the resolution of the observations can improve the chance of detecting significant sources of variation.  Conversely, if between-pixel variability is high and detectable, increasing spatial resolution will improve the analysis.  While tradeoffs can be made between spatial and spectral resolution, the basis for selecting the optimal dataset characteristics is not clear at present.
 
Because imaging spectrometry entails the processing and analysis of potentially hundreds of spectral bands, traditional image processing methods are not practical for analyzing these types of data.  New models need to be developed to optimize the information that can be extracted from these sensors.  Ideally, imaging spectrometry analysis would treat the spatial and spectral patterns in the data simultaneously (Kruse et al., 1996).  A significant problem for soil analysis is the presence of vegetation in pixels.  Because the signatures of soils and vegetation are so different, the lesser variability contained within the soil spectral fraction is not significant enough for soil discrimination when vegetation is also present in the pixel.
 
New methods that are able to diminish the vegetation signature and extract the variability of the soil signatures need to be investigated and developed.  Foreground/Background Analysis (FBA) is a new technique developed by Smith et al. (1994) as an improvement of Spectral Mixture Analysis.  In this technique, spectral measurements are divided into groups of foreground and background spectra that comprise a selected subset of spectra that emphasizes the presence of a signature of interest, i.e., the desired characteristics (Smith et al., 1994).  FBA is a modified sequential spectral mixture analysis that uses Singular Value Decomposition (SVD) to derive a series of vectors by extracting user-defined sources of “foreground” spectral variation while simultaneously minimizing “background” spectral variation.  Pinzón et al. (1995) found that the method yields good predictions and good r² statistics, but that the predictions were not robust in this form.  To address this concern, FBA was modified to project the spectra onto a property-specific axis of continuous variation (Pinzón et al., 1998).  In HFBA, the FBA equation is applied at several levels in a hierarchical way; the variability is thus confined at each step, making it possible to extract subtle absorption features.  Further explanation of this method is found in Pinzón et al. (1998).
 
To examine the application of this method to improve detection of soil properties using an imaging spectrometer, we chose to map the spatial distribution of organic matter content and total iron from soil samples in two watersheds in the Santa Monica Mountains Recreation Area.  Palacios-Orueta and Ustin (1998) tested the potential of AVIRIS wavelengths for discriminating soils in these watersheds using laboratory spectrometer data.  They found that soils from each location could be discriminated based on organic matter and iron contents.  The purpose of this work is to test the performance of HFBA (Hierarchical Foreground and Background Analysis) applied to AVIRIS data for the discrimination of these soils.

Study Site

The soils analyzed belong to two watersheds, La Jolla Valley and Serrano Valley, within Point Mugu State Park in the western (coastal) region of the Santa Monica Mountains National Recreation Area, California (Figure 1).  This mountain range lies in Ventura and Los Angeles Counties along the coast of the Pacific Ocean.  The climate is typically Mediterranean with dry summers and warm winters.  In late 1993, a wildfire removed most of the vegetation in both valleys.  Further information about this area is in Palacios-Orueta and Ustin (1998).

Soils and Parent Material

The soils were formed from weathered sandstone, shale and basic igneous rocks, and from alluvium derived from these mixed sources (Dibblee and Ehrenspeck, 1990).  These materials are distributed over three geologic units: surficial sediments of Pleistocene age, the Lower Topanga formation, and the Conejo Volcanics.  The Conejo Volcanics formation occurs mainly in the vicinity of Serrano Valley while the Topanga formation and the surficial sediments are found in La Jolla Valley.  Serrano Valley is composed mainly of basic igneous rocks.  The soil moisture regime is xeric and the soil temperature regime is thermic.  The steep terrain and the distance to the ocean create different environments, which in turn cause high variability in the soils.  Most soils are classified as Mollisols, in the Xerolls suborder.  Some soils from the valley bottoms are richer in clay and are classified as Vertisols.  There is also a large area covered by Inceptisols in La Jolla Valley (Edwards et al., 1970).

Data and Methodology

Data

Four data sets are used in this analysis:  (1) soil physicochemical properties, (2) soil laboratory spectral properties, (3) AVIRIS image data, and (4) spatial data organized in a GIS database.

Field Data

Field Soil Data Collection.

Eighty-three soil samples of approximately 0.5 L volume were collected from the top three cm of the soil surface in La Jolla and Serrano Valleys.  Only seventy-four samples were used in the analyses because only these could be accurately located by differentially corrected Global Positioning System (GPS) measurements.  The sample sites were selected to represent the range of aspect, slope, elevation and parent materials within the area, although this goal could not be completely achieved due to the roughness of the terrain.

The locations of the soil samples were identified using a Global Positioning System unit (Trimble Navigation PROXL) with +/- 1 meter accuracy after differential correction.

Laboratory analyses

Physico-chemical analyses.

The soil samples were analyzed by the DANR (Division of Agriculture and Natural Resources), Analytical Laboratory at U.C. Davis, to determine the organic matter content, particle size distribution, and total iron content.

Statistical analyses showed that the mean iron content in Serrano Valley soils was significantly higher than in La Jolla Valley.  In contrast, organic matter content was significantly higher in La Jolla Valley.  Variances were similar in both valleys.  There was no significant difference in particle size fractions between the two locations.  Other relevant information about soil characteristics is in Palacios-Orueta and Ustin (1998).

Spectroscopic Analysis.

The spectral data set includes laboratory reflectance spectra measured in a Varian Cary 5E spectrophotometer, and two merged AVIRIS scenes that include the valleys.

Sample Preparation and Spectroscopic Technique

The soil sample preparation followed the standardized procedure of Henderson et al. (1992).  Further information about this procedure and the spectroscopic technique can be found in Palacios-Orueta and Ustin (1998).

Geographic Information Systems Database

The geographic information was organized in a GIS (Arc/Info) database.  The AVIRIS scenes were georeferenced using control points and combined in the database with ancillary information composed of a U.S.G.S. Digital Elevation Model and a digitized geologic map (Dibblee and Ehrenspeck, 1990).  The chemistry data were also included in the database, and the output maps were added as layers of the geographic database as well.

Image Data

The AVIRIS imagery for this study was acquired on April 11, 1994.  The AVIRIS sensor acquires 224 contiguous spectral bands with spectral resolution of 10 nm, between 400 and 2500 nm.  Its nominal spatial resolution is 20 m (Vane et al., 1993).  Two adjacent image scenes were used for this analysis.  The two scenes covered an area of 9 km east-west by 3 km north-south.

Apparent surface reflectance retrieval was accomplished using a radiative-transfer based atmospheric model (MODTRAN 2) that accounts for spatial variation in atmospheric conditions (Green et al., 1993).

Methodology

HFBA was developed by Pinzón et al. (1995) as an improvement of FBA (Smith et al., 1994).  HFBA sequentially derives a series of SVD vectors by extracting spectral information at different levels of chemical variation.  Therefore, the procedure highlights subtle absorption features that can be directly related to a particular soil property.  These vectors were calculated using a training set derived from laboratory data.  The performance of the analysis was tested with the whole laboratory spectral data set.  The vectors for identifying organic matter and iron content were applied to the AVIRIS image in a second step.  The vectors were derived by clustering samples with similar strong spectral features at different levels of the analysis.  Singular Value Decomposition (SVD) is used to solve the HFBA equation at each level in the analysis.  The general form of the equation is:
 
R*V = P
 
where R is the matrix of reflectance spectra, V is the classification vector, and P is the chemical property.  Each spectrum is normalized in order to reduce dependencies on the conditions under which the measurements are made (Pinzón et al., 1998).  The approach taken in this work (Figure 2) is to narrow the variance by stratification of the soil population into small but reliable ranges of variability that can be consistently detected.  This is done by first discriminating the soils between the valleys and then investigating the variability related to iron and organic matter content within each group.  Because the process is hierarchical, the range of spectral variability at the second level is smaller.  In the first step, a vector is derived to minimize variability within the groups and maximize it between groups.  A subset of 11 soils was chosen to train the vector to separate soils of Serrano and La Jolla Valleys.  To avoid confusion with names of unique soil phases and series or the older definition of soil type, we hereafter refer to soil samples expressing the physicochemical properties of Serrano Valley or La Jolla Valley as “Serrano soils” or “La Jolla soils.”  The complete set of 74 soil samples was used to test the results of the classification scheme.  Finally, the vector chosen was the one that best stabilized the system, in the sense of summarizing the greatest variability using the smallest number of samples in the training set.  At level two, the physicochemical data were quantized based on their distribution and variance in each soil property domain in order to increase reliability and robustness.  Again, the complete data set was used to identify the vector that obtained the most robust results.
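As an illustration only, the equation R*V = P at a single level can be solved in the least-squares sense through the SVD pseudo-inverse.  The sketch below is written in Python with NumPy rather than the Matlab used in this work.  The function names, the unit-length normalization, the singular-value cutoff, the class-center targets (3.5 and 10.5, the midpoints of the 0-7 and 7-14 ranges), and the random spectra are all assumptions made for the example; it is not the HFBA implementation of Pinzón et al. (1998).

```python
import numpy as np

def normalize_spectra(R):
    """Scale each spectrum (row of R) to unit Euclidean length (assumed normalization)."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    return R / norms

def train_vector(R, P, rcond=1e-10):
    """Least-squares solution V of R @ V = P using the SVD pseudo-inverse."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # Invert only singular values above the cutoff so that noise is not amplified.
    keep = s > rcond * s.max()
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv * (U.T @ P))

def project(R, V):
    """Project spectra onto the trained vector to obtain predicted property values."""
    return R @ V

# Synthetic stand-in for 11 training spectra with 224 bands (not real soil data).
rng = np.random.default_rng(0)
R_train = normalize_spectra(rng.random((11, 224)))
# Assumed targets: 5 Serrano samples at 3.5 and 6 La Jolla samples at 10.5.
P_train = np.array([3.5] * 5 + [10.5] * 6)

V = train_vector(R_train, P_train)
P_fit = project(R_train, V)
```

With fewer training spectra than bands the system is underdetermined, so the training targets are reproduced exactly; the informative test is therefore the projection of spectra that were not in the training set, as done in this work with the complete set of 74 samples.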
 
Once the vectors were derived using the laboratory data, they were applied to classify each of the pixels into one of the two valleys at the first level, then into a class level of total iron content or organic matter content at the second level.  The analyses were done in Matlab (1994).

Results and Discussion

Laboratory Spectrometry Data

First Level of Soil Classification: Vector Training

Spectra of six soil samples from La Jolla Valley and five from Serrano Valley were chosen to train the HFBA system at the first level.  The soils from Serrano Valley have significantly higher iron content and soils from La Jolla have significantly higher organic matter content (Palacios-Orueta and Ustin, 1998).  Although other sources of variability between the soils at these two locations are likely, the spectral variability due to the combination of these two properties is summarized in this step.  HFBA uses a supervised classification scheme in which each valley was represented by a range of SVD values: Serrano soils ranged from 0 to 7 and La Jolla soils from 7 to 14.  The spectra in the training set were then projected by the HFBA vector to the center of each class.  Only four spectral samples were misclassified (Table 1); other samples that were not clearly assigned to either class received values intermediate between the classes.  Because of this, an intermediate class was defined, and seven samples were placed in it.  Of the four misclassified spectra, three were collected in La Jolla Valley but were allocated to the Serrano class.  These samples show low or intermediate organic matter contents, characteristically within the range found in Serrano Valley.  The only sample from Serrano Valley that was assigned to the La Jolla class had low iron and high organic matter content, a pattern characteristic of soils from La Jolla Valley.  The soil samples classified in the intermediate class showed spectral features intermediate between both valleys, as well as transitional values of iron and organic matter content.
 
Figure 3 shows the mean and standard deviation spectra for each location and the HFBA vector that yielded the best discrimination between valleys.  The two spectral areas most important for discrimination between the valleys were near 1000 nm and 2200 nm.  Although the highest weights resulting from the HFBA SVD were assigned to the band at 2200 nm, a wide area between 700 and 1400 nm, centered at 1000 nm, was consistently negatively weighted.  This means that a wide area around 1000 nm is important in the discrimination, while only a few bands around 2200 nm are significant.  Palacios-Orueta and Ustin (1998) found that in these soils the area around 1000 nm was related not only to iron but also to organic matter content; thus, low reflectance in this band by itself is not sufficient to determine the iron or organic matter contents.  The mean spectra show that the reflectance at 1000 nm is significantly different between valleys.  The absorption bands centered at 2200 and 2300 nm are most likely due to the presence of Al-OH and Mg-OH in dioctahedral and trioctahedral clays, respectively (Hunt and Salisbury, 1970).  The differences in parent material between valleys could produce this effect.  These results, combined with the analysis of classification errors, support the idea that although there must be other sources of variability, organic matter and iron contents play a critical role in the spectral discrimination between valleys.

Second Level: Organic Matter and Iron Contents

At the second HFBA level, the analysis focused on extracting information related to smaller, subtler sources of variability within each valley.  To do this, two analysis tools were used: the quantization of the range of chemical data and the selection of the soil samples in the training set.  Since the distributions of the variables are different between valleys, the way that the classes are grouped can be a significant factor when searching for the variability of that specific property.  The selection of the soil samples for the training set is a determining factor as well; hence, the samples chosen should be representative of the distribution of the range of the chemical variables.  The training set selected for each group and each property is the one that produces the vector that best stabilizes the system.  Since the distributions of iron and organic matter were different between valleys, the assigned training sets were different as well.  In each of the valley groups defined at the first level, two new vectors were trained to classify spectral samples for organic matter and iron contents.  The results from this analysis are the HFBA vectors, the regression between the measured and the predicted soil property values, and the comparison between the original and the predicted distributions.  Vectors a, b, c, and d (Figure 4) were trained with the soils classified as either Serrano (a, c) or La Jolla (b, d) for both organic matter (a, b) and iron (c, d).
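The quantization step can be sketched as follows: a measured value is assigned to one of several ranges and represented by the mid-class value of that range.  The bin edges below are hypothetical placeholders chosen only to give four equal ranges; they are not the ranges used in this study, and the function name is likewise an assumption.

```python
from bisect import bisect_right

def quantize(value, edges):
    """Return (class_index, mid_class_value) for a value given ascending bin edges.

    Values outside the outer edges are clamped to the first or last class.
    """
    n_classes = len(edges) - 1
    idx = bisect_right(edges, value) - 1
    idx = max(0, min(idx, n_classes - 1))
    mid = (edges[idx] + edges[idx + 1]) / 2.0
    return idx, mid

# Hypothetical edges for four ranges of organic matter content, in percent.
OM_EDGES = [0.0, 1.5, 3.0, 4.5, 6.0]

samples = [0.8, 2.9, 3.0, 5.7]  # illustrative values, not measured data
quantized = [quantize(v, OM_EDGES) for v in samples]
# quantized == [(0, 0.75), (1, 2.25), (2, 3.75), (3, 5.25)]
```

In practice the edges would be chosen separately for each valley and each property, since the distributions differ between groups.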

Organic Matter determination: Vector Training

Table 2 shows the four quantization ranges (R1-R4) for organic matter content, the mid-class percentage value, and the number of samples in the training set and in the whole data set.  Figure 5 shows the distributions of the measured and the predicted values for organic matter in both valleys and the results from the regression analysis.  The continuous line represents the predicted values and the dashed line represents the measured data.  The r² value is 0.72 (n=74; p<0.001), and only five samples are outside one standard deviation.  The distributions of the predicted and the measured data follow similar patterns, with more samples assigned to the centrally placed values.  Both HFBA organic matter vectors (Figure 4, a, b) show a concave shape around the 700 nm region, although in La Jolla the minimum value is slightly shifted towards 800 nm.  In these soils the SVD weights increase until reaching the highest value at 1400 nm.  The band at 2200 nm is highly weighted in Serrano, while in La Jolla the band centered at 2300 nm receives the highest positive weights.  In La Jolla the vector is smoother over a wider range of wavelengths.  This may be due to the higher organic matter and lower iron contents of soils in this valley, where organic matter features are stronger and more clearly observed.
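For reference, the r² of a simple linear regression between measured and predicted values equals the squared Pearson correlation, which can be computed as in the sketch below.  The second function counts samples lying more than one standard deviation from the mean residual; this is only one possible reading of "outside one standard deviation" and is an assumption, as are the function names.  No study data are used.

```python
import math

def r_squared(measured, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("r^2 is undefined for constant input")
    return cov * cov / (var_m * var_p)

def count_outside_one_sd(measured, predicted):
    """Count samples whose residual deviates from the mean residual by more than one SD."""
    residuals = [p - m for m, p in zip(measured, predicted)]
    n = len(residuals)
    mean_r = sum(residuals) / n
    sd = math.sqrt(sum((r - mean_r) ** 2 for r in residuals) / (n - 1))
    return sum(1 for r in residuals if abs(r - mean_r) > sd)
```

A perfectly linear relation gives r² = 1 regardless of slope, so r² alone does not show whether predictions are biased; comparing the predicted and measured distributions, as in Figure 5, complements it.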

Iron Determination: Vector Training.

Table 3 shows the centers of the quantized levels for iron content in Serrano and in La Jolla Valleys, the mid-class percentile, and the distribution of the training set and the complete data set over this range.  Figure 6 shows the distributions of the predicted and the measured data and the regression analysis for iron content for the whole data set.  Both distributions follow the same pattern, although the classification system again over-predicts intermediate values (Figure 6).  The r² value is 0.46 (p<0.01), and most of the predicted samples fall within one standard deviation (shown as a dashed line on the figure).  The HFBA iron vector in Serrano (Figure 4, c) shows sharp wavelength features, mainly in the short-wave infrared and in the 2000 to 2400 nm regions.  The highest vector weights are given to the bands near 500 and 800 nm, but with opposite signs.  The vector weightings markedly show the shape of the ferrous absorption band at 1000 nm.  Hunt and Salisbury (1970) reported the existence of three absorption bands at 450, 510 and 550 nm due to electronic transitions in the ferrous ion.  They also reported an absorption due to ferric iron in the region between 2000 and 2400 nm; the highest vector values are for bands at 2200 and 2300 nm, with opposite signs.  In dioctahedral clays the hydroxyl groups are coordinated around aluminum, and in trioctahedral clays they are coordinated around magnesium or iron.  When magnesium is present the most intense feature appears at 2300 nm, while if aluminum is present another feature appears at 2200 nm (Hunt and Salisbury, 1970).  Most of the soils from Serrano Valley were formed on mafic parent material, which is rich in iron and magnesium; this is likely to be the reason for the high weights given to these bands.  The negative weights given to the three bands at 500, 1000 and 2300 nm could be due to the positive correlation between the ferrous iron content and magnesium from the presence of ferromagnesian materials.
 
The highest weights in the HFBA vector from La Jolla are given to the band at 2200 nm and to bands at 1000 and at 500 nm.  This vector does not show the characteristic sigmoidal shape with a minimum at 1000 nm, like the one found for Serrano.  Although the area still shows negative weights at 500 nm, the region near 1000 nm shows lower weightings spread over more bands.  Since this region of the spectrum is highly affected by organic matter and La Jolla soils are higher in this component, the lower weights in the La Jolla HFBA vector are probably due to the higher level of organic matter content, which masks iron absorption features.  The band at 2200 nm has higher weights and is opposite in sign to the band at 2300 nm.  This is possibly due to the lower amounts of magnesium and iron and the greater presence of aluminum.  Since the effect of organic matter is not as strong in this area as in the shortwave region, these features are possibly better for discrimination of ferrous iron.

Image Analysis

The vectors trained using laboratory data were applied to the AVIRIS scenes.  The vector obtained in the first step was applied to the complete scene, while the vectors obtained at the second level were applied only to pixels classified in the corresponding class (e.g., organic matter of soils classified as Serrano was predicted using the vector obtained from the Serrano training set).  This identification is based on the spectral classification and is applied to pixels independent of their actual spatial location in the image.
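A minimal sketch of this routing is given below, assuming the first-level class label of each pixel is already available.  The function names, the label strings, and the three-band toy vectors are hypothetical and serve only to show that each pixel is projected with the vector trained for its own class, regardless of where it lies in the image.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("spectrum and vector must have the same number of bands")
    return sum(x * y for x, y in zip(a, b))

def apply_second_level(spectra, soil_classes, vectors):
    """Predict one soil property per pixel using the vector of its first-level class.

    spectra      : list of reflectance spectra, one list of band values per pixel
    soil_classes : first-level class label per pixel
    vectors      : dict mapping class label -> second-level vector for the property
    Pixels whose class has no vector (e.g., vegetation or ocean) yield None.
    """
    predictions = []
    for spectrum, label in zip(spectra, soil_classes):
        vector = vectors.get(label)
        predictions.append(None if vector is None else dot(spectrum, vector))
    return predictions

# Toy three-band example; real vectors would span the AVIRIS bands.
om_vectors = {"serrano": [1.0, 0.0, 2.0], "la_jolla": [0.0, 3.0, 1.0]}
pixels = [[0.5, 0.25, 0.25], [0.5, 0.5, 1.0], [0.1, 0.2, 0.3]]
labels = ["serrano", "la_jolla", "vegetation"]
om_pred = apply_second_level(pixels, labels, om_vectors)
# om_pred == [1.0, 2.5, None]
```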
 

First Level:  Classification between valleys

The first vector was trained to assign each pixel a classification value that locates it as belonging to the soils of one of the two valleys.  Since many pixels within the valleys are composed of mixed materials (i.e., have litter, green vegetation, or other materials present), these are classified outside the range of the original classes (0-14) defined in the vector training process.  Although the area under study was burned in a major wildfire only a few months before the image data were acquired, there was already a considerable amount of vegetation in some areas, mainly in the moister valley bottoms.  Since the image was acquired in the spring following the wildfire, we expected the amount of dry vegetation to be low, thus decreasing the possibility of confusing plant litter with soil.
 
There are also some terrestrial areas in the image that were not affected by the wildfire and were largely vegetation covered.  One way to deal with vegetation is to mask pixels where vegetation is present.  Adopting this solution greatly decreases the surface to be analyzed, but pixels containing small amounts of vegetation would still be present.  Masking the vegetation using a Normalized Difference Vegetation Index (NDVI) threshold is an arbitrary decision, and pixels with different but undetermined amounts of vegetation would remain.  This residual contamination is a common concern when analyzing soils from remote sensors.  Our interest lies in discriminating soil properties in pixels over a range of partial vegetation cover.  Since the vectors are trained with pure soils, we expect that in pixels with some vegetation, soil characteristics will be emphasized, and pixels with higher levels of vegetation will be classified as out of the range of the predicted soil property values.  This allows an a posteriori decision about vegetation cover that is derived from the soil information rather than an a priori vegetation-based decision.  The NDVI (Figure 7.1) was calculated as a reference to compare to the spatial distribution of the vegetation pixels derived from the HFBA but was not used directly to identify vegetation.  Our results showed that the negative (out-of-range) values projected by the soil classification vector were pixels with high NDVI (>0.5).  All pixels with values less than 0 were classified as vegetation.
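For reference, NDVI is conventionally computed from red and near-infrared reflectance as (NIR - Red) / (NIR + Red).  The sketch below uses that standard definition; which AVIRIS bands were taken as red and near-infrared is not stated here, so the inputs are simply assumed to be reflectance values from suitable bands, and the example reflectances are illustrative.  The 0.5 threshold is the one quoted in the text.

```python
import math

def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and near-infrared reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for completely dark pixels
    return (nir - red) / total

def is_high_vegetation(red, nir, threshold=0.5):
    """True when NDVI exceeds the reference threshold used for comparison in the text."""
    return ndvi(red, nir) > threshold

# Illustrative reflectances: a densely vegetated pixel and a bare-soil pixel.
vegetated = ndvi(red=0.05, nir=0.45)   # about 0.8
bare_soil = ndvi(red=0.20, nir=0.25)   # about 0.11
```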
 
Another factor affecting the range of the HFBA at the image scale is the presence of the Pacific Ocean, which has low reflectance in all bands.  We expected that many of the extreme classification values in our results were from ocean pixels, due to measurement and calibration errors associated with the near-zero reflectance.  A histogram of the SVD results (Figure 8) shows that the distribution forms a long tail with only a few pixels having values higher than 21.  Nearly all the pixels with values higher than 14 were located in the ocean; we therefore used this criterion to remove them from further consideration in the analysis.
 
The remaining “potential soil” pixels in the image were classified at several levels after examining the distribution of HFBA values in the training set.  Pixels with values between 0 and 7 were assigned to the Serrano soil class, i.e., clearly expressing the physicochemical properties of Serrano Valley soils, and pixels with values between 7 and 11 were assigned to the La Jolla soil class, i.e., clearly expressing the physicochemical characteristics of soils from La Jolla Valley.  Pixels with values between 11 and 14 were located in the beach areas; although they express soil properties, the high albedo of the sand projects them into the high extreme of the soil range, and they can be classified as beach pixels (see distribution in Figure 8).  In the map (Figure 7.2) the white color corresponds to the ocean and to areas classified as vegetation dominated.  Comparing these results with the NDVI, it is observed that areas with NDVI > 0.5 (green color, Figure 7.1) follow the same spatial pattern as the pixels that were not classified (i.e., white) in Figure 7.2.  The image shows that the La Jolla soil pixels are clustered in patches, while the pixels classified as Serrano soils are more continuously distributed over the image.
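The first-level labelling of image pixels described above amounts to thresholding the projected value.  The sketch below uses the ranges given in the text (below 0 vegetation, 0 to 7 Serrano, 7 to 11 La Jolla, 11 to 14 beach, above 14 ocean); how values falling exactly on a boundary are assigned is not specified, so the half-open intervals, the function name, and the label strings are assumptions.

```python
from collections import Counter

def classify_score(score):
    """Assign a first-level class label to an HFBA projection value."""
    if score < 0:
        return "vegetation"   # negative values coincided with high NDVI
    if score < 7:
        return "serrano"
    if score < 11:
        return "la_jolla"
    if score <= 14:
        return "beach"        # high-albedo sand at the top of the soil range
    return "ocean"            # nearly all values above 14 were ocean pixels

# Illustrative projection values, not taken from the image.
scores = [-3.2, 2.0, 7.0, 12.5, 14.0, 20.0]
labels = [classify_score(s) for s in scores]
class_counts = Counter(labels)
```

Only pixels labelled as Serrano or La Jolla soils would then be passed to the second-level vectors.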
 
The results from the laboratory analyses showed that separation of the classes was based primarily on differences in organic matter and iron content among the samples (Palacios-Orueta and Ustin, 1998).  Our results support this distinction, because AVIRIS is not simply mapping different soils in separate locations but instead, the HFBA system is mapping soil properties as continuous variables.   The actual distribution of organic matter and iron in these valleys is possibly related to the steep terrain and local microclimates.  The properties are not unique to their respective valley soils but instead the variability within the valleys is high and representative of the larger region.
 
As we have seen, this first level of classification allows us to select pixels classified as soils having the properties associated with Serrano Valley or La Jolla Valley.  Only the pixels with sufficient soil component expressed to fall within the laboratory data range are retained in the analysis.  Thus the portion of image variability due to soils is highlighted.  Also, by first dividing variability into La Jolla and Serrano classes, the application of the organic matter and iron vectors is optimized.

Organic Matter Content Determination

Organic matter was estimated by applying the vectors trained with laboratory data.  Vector a was used to predict organic matter content in pixels classified as Serrano soils in the first-level classification (Figure 2), and vector b was used for pixels classified as La Jolla soils.  Figure 9 shows the distribution of the AVIRIS results obtained from this analysis.  Although the predicted values range from –15 to 10, nearly all pixels fall within 1% to 6% organic matter (the same range as the laboratory data).  Pixels in which soil is not the main spectral component (e.g., ocean or vegetation pixels) were projected to organic matter values outside this range.  Ocean pixels were projected to extremely high organic matter values, while pixels with high NDVI (higher than 0.5) were projected to low or negative values.  For a given pixel, as NDVI increases, the probability of being projected onto negative values of organic matter also increases.  Figure 10.1 shows the results of the analysis only for those pixels that had a high soil component (i.e., organic matter between 0 and 6).  Blue colors indicate low organic matter and orange/brown colors indicate high organic matter; white pixels were out of range at the first classification level or had negative organic matter values at the second level.  Although soils with high organic matter content are not uniquely associated with La Jolla Valley, pixels mapped as La Jolla soils in Figure 7.2 also show high organic matter content in Figure 10.1 (e.g., in the northeast area), which coincides with the locations of soil samples having high organic matter.  Palacios-Orueta (1997) performed the same analysis for pixels having higher vegetation cover and found that most pixels with high vegetation were classified as having low or negative values of organic matter.
While it seems inconsistent that areas of higher vegetation cover would have soils of lower organic matter content, our assumption is that as vegetation increases and soils are covered by plant canopies, the remote sensing estimates of organic matter become unreliable.
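
The class-specific projection described above can be illustrated with a minimal sketch.  It assumes that each trained vector acts on a pixel spectrum as a dot product plus a scalar offset; the function name, the class labels "serrano" and "la_jolla", and the offsets are hypothetical and are not taken from the original analysis.

```python
import numpy as np

def project_by_class(spectra, class_map, vectors, offsets):
    """Project each pixel onto the vector trained for its first-level class.

    spectra   : (n_pixels, n_bands) reflectance values
    class_map : (n_pixels,) first-level labels (hypothetical coding)
    vectors   : dict mapping label -> (n_bands,) weight vector
    offsets   : dict mapping label -> scalar offset (assumed form)
    Pixels whose label has no vector are returned as NaN.
    """
    spectra = np.asarray(spectra, dtype=float)
    class_map = np.asarray(class_map)
    predicted = np.full(spectra.shape[0], np.nan)
    for label, weights in vectors.items():
        selected = class_map == label
        w = np.asarray(weights, dtype=float)
        predicted[selected] = spectra[selected] @ w + offsets[label]
    return predicted

# Toy two-band example with made-up weights (not the trained vectors a and b).
toy_spectra = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_classes = ["serrano", "la_jolla", "masked"]
toy_vectors = {"serrano": [2.0, 0.0], "la_jolla": [0.0, 3.0]}
toy_offsets = {"serrano": 1.0, "la_jolla": 0.5}
toy_prediction = project_by_class(toy_spectra, toy_classes, toy_vectors, toy_offsets)
```

In this sketch, pixels that were masked at the first level receive no prediction, which mirrors the white pixels in Figure 10.1.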
 
The spatial distribution of organic matter is related to aspect (Figure 11).  Aspect affects soil properties because slopes facing north and east are generally cooler and more humid and frequently accumulate higher levels of organic matter.  Figure 11 shows the histogram of organic matter content for different aspects.  On north-facing aspects high values tend to predominate, while on south-facing slopes lower values of organic matter predominate.
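
A comparison of this kind can be sketched as follows.  The sketch assumes that aspect is given in degrees clockwise from north and is grouped into four 90-degree sectors; the sector boundaries, bin count, and function names are illustrative assumptions, not the settings used for Figure 11.

```python
import numpy as np

def aspect_class(aspect_deg):
    """Return 0=north, 1=east, 2=south, 3=west for aspect in degrees from north."""
    a = np.asarray(aspect_deg, dtype=float) % 360.0
    # Shift by 45 degrees so that the north sector spans 315..45 degrees.
    return (((a + 45.0) % 360.0) // 90.0).astype(int)

def om_histograms_by_aspect(om, aspect_deg, bins=6, value_range=(0.0, 6.0)):
    """Histogram predicted organic matter separately for each aspect sector."""
    om = np.asarray(om, dtype=float)
    sectors = aspect_class(aspect_deg)
    valid = np.isfinite(om)  # skip masked pixels stored as NaN
    names = ["north", "east", "south", "west"]
    result = {}
    for k, name in enumerate(names):
        values = om[valid & (sectors == k)]
        counts, _ = np.histogram(values, bins=bins, range=value_range)
        result[name] = counts
    return result
```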
 
Because many of the soil sample locations were covered by some vegetation in the spring, when the image was acquired, the regressions between the AVIRIS-predicted organic matter and the organic matter values of the field samples were not significant.  However, the distribution of the AVIRIS soil organic matter for the two soil groups and their intermediates (Figure 12) follows the same trend as the laboratory data.  Although the intermediate class defined in the training process is not used in the image classification, we show its organic matter distribution here to illustrate the trend in organic matter content.  Although the range for the three classes is the same (0 - 6), the shapes of the curves differ: the frequency of higher values increases from Serrano to La Jolla soils.

Iron Content Determination

The same procedures were followed for iron content, with most of the values ranging from 2% to 6% (Figure 13).  In this case, the ocean pixels were projected onto negative iron contents, while pixels with high NDVI were classified as having high iron content.  The analysis was done for both valleys, using only pixels where soil was the main component or that were classed as vegetation covered.  The colors are the same as those for organic matter (Figure 10.1).  Only a few pixels had iron contents less than 2%.  Figure 14 shows the distribution of iron content in the three classes defined in the training process.  The frequency of pixels with high iron content decreases from Serrano soils to La Jolla soils; these results agree with the laboratory data.  We were also interested in the limit of vegetation cover below which soils can be clearly discriminated.  For pixels with NDVI < 0.15, organic matter and iron content were correlated (r² = 0.35, P < 0.05).  These areas correspond to the beaches and some scattered pixels.  When NDVI is > 0.15, soil organic matter and iron are not correlated.  Most pixels with HFBA iron values > 5 also have NDVI > 0.15.  Although the regressions between the estimated iron and organic matter contents were not significant for NDVI > 0.15, the image shows that areas where organic matter is high also tend to be low in iron content.
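
The NDVI-stratified comparison can be sketched as below.  NDVI is computed with its standard definition, (NIR - red) / (NIR + red); the choice of bands, the function names, and the use of a simple squared Pearson correlation are assumptions made for illustration.  Significance testing is omitted.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized difference vegetation index from red and near-infrared reflectance."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (nir - red) / (nir + red)

def r_squared(x, y):
    """Squared Pearson correlation coefficient between two 1-D arrays."""
    r = np.corrcoef(x, y)[0, 1]
    return r * r

def r2_below_and_above(om, iron, ndvi_values, threshold=0.15):
    """Return r^2 between organic matter and iron for NDVI < threshold and NDVI >= threshold."""
    om = np.asarray(om, dtype=float)
    iron = np.asarray(iron, dtype=float)
    low = np.asarray(ndvi_values, dtype=float) < threshold
    return r_squared(om[low], iron[low]), r_squared(om[~low], iron[~low])
```

Each group needs at least two pixels with non-constant values for the correlation to be defined.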
 
The HFBA vectors for iron and organic matter respond differently to the amount of vegetation present.  In the 700 nm region (which in vegetation corresponds to the red edge) the iron vectors show positive or near-zero weights, while the organic matter vectors have highly negative weights.  For this reason, pixels with high vegetation cover are projected to low or negative HFBA values of organic matter, while the iron vectors project pixels with some vegetation present to high HFBA values of total iron.  This can explain the opposite patterns that we observed in the estimated iron and organic matter contents for areas with high vegetation cover.
 
An alternate explanation for this pattern could be a confounding effect of iron and organic matter content on the spectra.  Perhaps in soils that have a high concentration of one property, the characteristics of the other property are masked or not expressed in the spectrum, so the masked property is underestimated.

Summary of the classification process

The classification process showed that at both levels, discrimination of the soils from the valleys and discrimination of the biochemistry contents (see Figure 2), some pixels were classified outside the ranges of the laboratory data.  After the first step for classifying the valleys, pixels with values higher than 14 or less than 0 were removed from the analysis (Figure 7.2) before applying the vectors for organic matter and iron content.  Pixels with values between 11 and 14 were not masked but were identified as beach soils.  The results obtained at the second step (biochemistry discrimination) showed that a few pixels classified within the expected range in the first step fell outside the laboratory range at the second step.  These values ranged from -4 to 10 for iron and from –15 to 15 for organic matter content.  These pixels were masked at the second level of the classification.  The results show that combining the features extracted at the two levels helps to provide information about other properties of the study area, in this case, the levels of vegetation cover.
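
The two masking steps summarized above can be sketched as follows.  The thresholds 0, 11, and 14 come from the text; treating the 11 and 14 boundaries as inclusive, the label codes, and the second-level range passed by the caller are assumptions made for illustration.

```python
import numpy as np

REMOVED, SOIL, BEACH = 0, 1, 2  # hypothetical label codes

def first_level_labels(scores, low=0.0, beach_start=11.0, high=14.0):
    """Label pixels from first-level scores: removed, soil, or beach soil."""
    scores = np.asarray(scores, dtype=float)
    labels = np.full(scores.shape, SOIL)
    labels[(scores >= beach_start) & (scores <= high)] = BEACH
    labels[(scores < low) | (scores > high)] = REMOVED
    return labels

def second_level_keep(labels, predicted, lab_range):
    """Keep pixels that survived level one and fall inside the laboratory range."""
    lo, hi = lab_range
    predicted = np.asarray(predicted, dtype=float)
    in_range = (predicted >= lo) & (predicted <= hi)
    return (np.asarray(labels) != REMOVED) & in_range
```

For example, calling second_level_keep with lab_range=(0.0, 6.0) would retain the organic matter pixels displayed in Figure 10.1, under the assumptions stated above.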

Conclusions

HFBA was found to be a suitable method for soil analysis to determine relative changes in physicochemical properties because it is structured so that at each level, soil properties can be grouped into different quantized ranges according to the variance.  There is a combination of features that makes this model work more efficiently than other standard classification methods:
 
  1. It is a mixture model; therefore, it can use continuous data over the whole spectrum.
  2. Since it is a statistical model as well, it can focus on maximizing variability between classes while minimizing variability within classes, optimizing the amount of information extracted.
  3. As a supervised classification algorithm, it can be focused on soil variability by training the vectors for specific soil properties.
  4. The Singular Value Decomposition equation efficiently discriminates between foreground soil properties and background conditions.
  5. Since it is organized in a hierarchical way, the variability is reduced at each level, allowing subtle absorption features to be extracted.
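
To make the role of the Singular Value Decomposition concrete, the following is a generic sketch of training one foreground/background vector.  It assumes that the vector and an offset are chosen by least squares so that foreground spectra project near 1 and background spectra near 0, solved with a truncated SVD pseudo-inverse.  This is a simplified illustration with hypothetical names and is not necessarily the exact formulation used in this study.

```python
import numpy as np

def train_fba_vector(foreground, background, rcond=1e-10):
    """Least-squares weights w and offset c with foreground -> 1 and background -> 0.

    foreground, background : (n_samples, n_bands) arrays of spectra
    rcond : singular values below rcond * largest are discarded (assumed setting)
    """
    foreground = np.asarray(foreground, dtype=float)
    background = np.asarray(background, dtype=float)
    spectra = np.vstack([foreground, background])
    # Append a column of ones so that the offset is estimated with the weights.
    design = np.hstack([spectra, np.ones((spectra.shape[0], 1))])
    target = np.concatenate([np.ones(len(foreground)), np.zeros(len(background))])
    u, s, vt = np.linalg.svd(design, full_matrices=False)
    keep = s > rcond * s[0]
    # Pseudo-inverse solution: V * diag(1/s) * U^T * target, on retained components.
    coef = vt[keep].T @ ((u[:, keep].T @ target) / s[keep])
    return coef[:-1], coef[-1]

def project(spectra, weights, offset):
    """Apply a trained vector to spectra."""
    return np.asarray(spectra, dtype=float) @ weights + offset
```

In a hierarchical scheme, a vector of this kind would be trained at each level on the subset of samples retained from the previous level.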
 
The results obtained when training the vectors with the laboratory data showed that the organization of the system and the Singular Value Decomposition transformation work effectively in predicting organic matter from spectral data.  Results from the image analysis showed that the HFBA vectors, when applied to the image, discriminate effectively between soils and vegetation and between different soil properties, although the presence of vegetation is still a confounding factor.  Although the classified soils were not uniquely associated with either valley, the predictions of organic matter and iron oxide contents from the image agreed with the soil characteristics at the locations where we collected the soil samples.
 
To understand soil variability on the landscape more completely, it would be informative to analyze the data in a geographic context, e.g., using a relational GIS database in which other types of information, such as terrain properties, are explicitly included.
 
This methodology is based on a hierarchical analysis, which implies that variability is reduced at several steps, each time becoming more specific.  This makes it difficult to extend the analysis to different areas without some a priori knowledge of the soils.  One way to address this is to build or expand upon a library of spectral properties of soils, such as the one developed at Purdue University (Stoner and Baumgardner, 1981).  As better soil spectral libraries develop, the training data limitation may be minimized.  Nevertheless, HFBA would provide a mechanism to efficiently reduce the number of field measurements by calculating the vector from a small sample set or by using a vector developed from an area with similar soil variability.  HFBA would also be useful for analyzing changes in soil properties in a temporal framework.
 

Acknowledgments

This research was supported by a fellowship to APO from the Instituto Nacional de Tecnología Agraria y Alimentaria and by a NASA EOS grant NAS-31359 to SLU and NASA Terrestrial Ecosystems and Biogeochemical Dynamics Branch, NAGW-4626-I to SLU and DAR.  We wish to thank the Digital Equipment Corporation for providing the DEC Alpha computers under the Sequoia 2000 grant Cooperative Research Agreement #1243.

References

Agbu, P. A., Fehrenbacher, D. J. and Jansen, I. J. (1990), Statistical comparison of SPOT spectral maps with field soil maps. Soil Sci. Soc. Am. J. 54:  812-818.
 
Al-Abbas, A. H., Swain, P. H. and Baumgardner, M. F. (1972), Relating organic matter and clay content to the multispectral radiance of soils. Soil Science. 114:  477-485.
 
Clark, R. N., Gallagher, A. J. and Swayze, G. A. (1990). Material absorption band depth mapping of imaging spectrometer data using a complete band shape least-squares fit with library reference spectra. Second Airborne Visible/Infrared Image Spectrometer (AVIRIS) Workshop., Pasadena, California.
 
Cloutis, E. A. (1996), Hyperspectral geological remote sensing: evaluation of analytical techniques. International Journal of Remote Sensing. 17:  2215-2242.
 
Coleman, T. L., Agbu, P. A. and Montgomery, O. L. (1993), Spectral differentiation of surface soils and soil properties - is it possible from space platforms? Soil Science. 155:  283-293.
 
Coleman, T. L., Agbu, P. A., Montgomery, O. L., Gao, T. and others (1991), Spectral band selection for quantifying selected properties in highly weathered soils. Soil Science. 151:  355-361.
 
Coleman, T. L. and Montgomery, O. L. (1987), Soil moisture, organic matter, and iron content effect on the spectral characteristics of selected Vertisols and Alfisols in Alabama. Photogrammetric Engineering and Remote Sensing. 53(12):  1659-1663.
 
Dibblee, T. W. and Ehrenspeck, H. E. (1990). Geologic Map of the Point Mugu and Triunfo Pass Quadrangles. Ventura and Los Angeles Counties, California. Santa Barbara, Dibblee Geological Foundation.
 
Edwards, R. D., Rabey, D. F. and Kover, R. W. (1970). Soil Survey Ventura Area, California, United States Department of Agriculture. Soil Conservation Service.
 
Gerrard, J. (1992), Soil Geomorphology. An integration of pedology and geomorphology, Chapman and Hall, London.
 
Green, R. O., Conel, J. E. and Roberts, D. A. (1993). Estimation of aerosol optical depth and calculation of apparent surface reflectance from radiance measured by the Airborne Visible-Infrared Imaging Spectrometer (AVIRIS) using MODTRAN 2. SPIE Conf. 1937, Imaging Spectrometry of the terrestrial environment, 2-5.
 
Henderson, T. L., Baumgardner, M. F., Franzmeier, D. P., Stott, D. E. and others. (1992), High dimensional reflectance analysis of soil organic matter. Soil Sci. Soc. Am. J. 56:  865-872.
 
Henderson, T. L., Szilagyi, A., Baumgardner, M. F., Chen, C. T. and Landgrebe, D. A. (1989), Spectral band selection for classification of soil organic matter content. Soil Sci. Soc. Am. J. 53:  1778-1784.
 
Hunt, G. R. and Salisbury, J. W. (1970), Visible and near-infrared spectra of minerals and rocks: I. Silicate minerals. Modern Geology. 1:  283-300.
 
Kruse, F. A., Boardman, J. W. and Farrand, W. H. (1996). Advanced hyperspectral remote sensing analysis workshop. ERIM second International airborne remote sensing conference and exhibition, San Francisco, California.
 
Kruse, F. A., Kierein-Young, K. S. and Boardman, J. W. (1990), Mineral mapping at Cuprite, Nevada with a 63-channel imaging spectrometer. Photogrammetric Engineering and Remote Sensing. 56:  83-92.
 
Kruse, F. A., Lefkoff, A. B. and Dietz, J. B. (1993), Expert system-based mineral mapping in northern Death Valley, California/Nevada, using the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). Remote Sens. Environ. 44:  309-336.
 
Latz, K., Weismiller, R. A., Van Scoyoc, G. E. and Baumgardner, M. F. (1984), Characteristic variations in spectral reflectance of selected eroded alfisols. Soil Science Society Am. Journal. 48:  1130-1134.
 
Lewis, D. T., Seevers, P. M. and Drew, J. V. (1975), Use of satellite imagery to delineate soil associations in the sands hills region of Nebraska. Soil Science Society America. Proc. 39:
 
Matlab (1994), The Mathworks, Inc.
 
Mustard, J. F. (1993), Relationships of soil, grass, and bedrock over the Kaweah serpentinite melange through spectral mixture analysis of AVIRIS data. Remote Sens. Environ. 44(2-3):  293-308.
 
Palacios-Orueta, A. (1997), Soil Discrimination with Laboratory Spectra and Airborne Imaging Spectrometer Data (AVIRIS). Land, Air and Water Resources. Davis, University of California: 120p.
 
Palacios-Orueta, A. and Ustin, S. L. (1996), Multivariate classification of soil spectra. Remote Sens. Environ. 57(2):  108-118.

Palacios-Orueta, A. and Ustin, S. L. (1998), Remote sensing of soil properties in the Santa Monica Mountains: I. Spectral Analysis. Remote Sens. Environ. 65:  170-183.
 
Pinzón, J. E., Ustin, S. L., Castaneda, C. M. and Smith, M. O. (1998), Investigation of leaf biochemistry by hierarchical foreground/background analysis. IEEE Trans. Geosci. and Remote Sens. 36:  1-15.
 
Pinzón, J. E., Ustin, S. L., Hart, Q. L., Jacquemoud, S. and Smith, M. O. (1995). Using foreground/background analysis to determine leaf and canopy chemistry. 5th annual JPL Airborne Earth Science Workshop: AVIRIS workshop, p.129-132.
 
Roberts, D. A., Smith, M. O. and Adams, J. B. (1993), Green vegetation, nonphotosynthetic vegetation, and soils in AVIRIS data. Remote Sens. Environ. 44:  255-269.
 
Smith, M. O., Roberts, D. A., Hill, J., Mehl, W., Hosgood, B., Venderbout, J., Schmuck, G., Koechler, C. and Adams, J. (1994). A new approach to quantifying abundances of materials in multispectral images. IGARSS 94: Proceeding International Geosciences Remote Sensing Symposium, Pasadena, (CA).
 
Smith, M. O., Ustin, S. L., Adams, J. B. and Gillespie, A. R. (1990), Vegetation in deserts I. A regional measure of abundances from multispectral images. Remote Sens. Environ. 29:  1-26.
 
Stoner, E. R. and Baumgardner, M. F. (1981), Characteristic variations in reflectances of surface soils. Soil Science Society of America Journal. 45:  1161-1165.
 
Vane, G., Green, R. O., Chrien, T. G., Enmark, H. T. and others. (1993), The Airborne Visible Infrared Imaging Spectrometer (AVIRIS). Remote Sens. Environ. 44:  127-14.

1998, Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis