| Title | Sensitivity of surface meteorological analyses to observation networks |
| Publication Type | dissertation |
| School or College | College of Mines & Earth Sciences |
| Department | Atmospheric Sciences |
| Author | Tyndall, Daniel Paul |
| Date | 2011-12 |
| Description | A computationally efficient variational analysis system for two-dimensional meteorological fields is developed and described. This analysis approach is most efficient when the number of analysis grid points is much larger than the number of available observations, such as for large domain mesoscale analyses. The analysis system is developed using MATLAB software and can take advantage of multiple processors or processor cores. A version of the analysis system has been exported as a platform independent application (i.e., can be run on Windows, Linux, or Macintosh OS X desktop computers without a MATLAB license) with input/output operations handled by commonly available internet software combined with data archives at the University of Utah. The impact of observation networks on the meteorological analyses is assessed by utilizing a percentile ranking of individual observation sensitivity and impact, which is computed by using the adjoint of the variational surface assimilation system. This methodology is demonstrated using a case study of the analysis from 1400 UTC 27 October 2010 over the entire contiguous United States domain. The sensitivity of this approach to the dependence of the background error covariance on observation density is examined. Observation sensitivity and impact provide insight on the influence of observations from heterogeneous observing networks as well as serve as objective metrics for quality control procedures that may help to identify stations with significant siting, reporting, or representativeness issues. |
| Type | Text |
| Publisher | University of Utah |
| Subject | Data assimilation; Mesonet; Mesoscale analysis; Meteorology; Surface analysis; Variational assimilation methods |
| Dissertation Institution | University of Utah |
| Dissertation Name | Doctor of Philosophy |
| Language | eng |
| Rights Management | Copyright © Daniel Paul Tyndall 2011 |
| Format | application/pdf |
| Format Medium | application/pdf |
| Format Extent | 61,565,504 bytes |
| Identifier | us-etd3,60959 |
| Source | Original in Marriott Library Special Collections, QC3.5 2011 .T96 |
| ARK | ark:/87278/s6q534cg |
| DOI | https://doi.org/doi:10.26053/0H-AZRB-KZ00 |
| Setname | ir_etd |
| ID | 194518 |
| OCR Text | SENSITIVITY OF SURFACE METEOROLOGICAL ANALYSES TO OBSERVATION NETWORKS

by Daniel Paul Tyndall

A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Department of Atmospheric Sciences
The University of Utah
December 2011

Copyright © Daniel Paul Tyndall 2011. All Rights Reserved.

The University of Utah Graduate School

STATEMENT OF DISSERTATION APPROVAL

The dissertation of Daniel Paul Tyndall has been approved by the following supervisory committee members:

John D. Horel, Chair (date approved: 28 July 2011)
Thomas Haiden, Member (date approved: 28 July 2011)
Jan Paegle, Member (date approved: 28 July 2011)
Zhaoxia Pu, Member (date approved: 28 July 2011)
W. James Steenburgh, Member (date approved: 28 July 2011)

and by Kevin Perry, Chair of the Department of Atmospheric Sciences, and by Charles A. Wight, Dean of The Graduate School.

ABSTRACT

A computationally efficient variational analysis system for two-dimensional meteorological fields is developed and described. This analysis approach is most efficient when the number of analysis grid points is much larger than the number of available observations, such as for large domain mesoscale analyses. The analysis system is developed using MATLAB software and can take advantage of multiple processors or processor cores. A version of the analysis system has been exported as a platform independent application (i.e., can be run on Windows, Linux, or Macintosh OS X desktop computers without a MATLAB license) with input/output operations handled by commonly available internet software combined with data archives at the University of Utah. The impact of observation networks on the meteorological analyses is assessed by utilizing a percentile ranking of individual observation sensitivity and impact, which is computed by using the adjoint of the variational surface assimilation system.
This methodology is demonstrated using a case study of the analysis from 1400 UTC 27 October 2010 over the entire contiguous United States domain. The sensitivity of this approach to the dependence of the background error covariance on observation density is examined. Observation sensitivity and impact provide insight on the influence of observations from heterogeneous observing networks as well as serve as objective metrics for quality control procedures that may help to identify stations with significant siting, reporting, or representativeness issues.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
1 INTRODUCTION
    Background and Motivation
    Objectives and Outline
2 VARIATIONAL ASSIMILATION THEORY
    Introduction
    Analysis Space
    Observation Space
    Implementation of Constraints
3 IMPLEMENTATION OF VARIATIONAL ASSIMILATION THEORY WITHIN THE UU2DVAR
    Introduction
    Background Error Covariance
    Computation and Storage of the Background Error Covariance Matrix
    Background, Terrain, and Land/Water Mask
    Usage of Observations and Quality Control
    UU2DVar General Characteristics and Analysis Cycle
4 OBSERVATION IMPACTS AND SENSITIVITY
    Introduction
    Computation
    Observation Networks
5 RESULTS AND DISCUSSION
    Description of Case Study
    Evaluation of Data Density Constraints
    Sensitivity and Impact to Observation Networks
6 CONCLUSION
    Summary
    Recommendations and Future Work
REFERENCES

ACKNOWLEDGEMENTS

I would like to thank my advisor, Dr. John Horel, for his support and guidance throughout this entire project, as well as for providing the original NetCDF input/output routines and the asymmetric quality control for wind observations used by the University of Utah Variational Surface Analysis (UU2DVar). I would also like to thank the other four members of my committee, Drs. Jim Steenburgh, Jan Paegle, Zhaoxia Pu, and Thomas Haiden, for their valuable contributions to this project. Dr. Haiden was especially helpful in providing background material and data from his own analysis system, as well as many helpful discussions on surface analyses in regions of complex terrain. I would also like to thank Dr. Manuel de Pondeca for his help in understanding the covariance computations within the Real-Time Mesoscale Analysis, as well as for providing me with the background fields used for the analyses computed as part of this study. The MesoWest support staff, specifically Mr. Chris Galli and Dr. Judy Pechmann, were tremendously helpful in the development of a routine to download observations from MesoWest for use by UU2DVar. WeatherFlow, Inc., and the Oklahoma Mesonet also deserve special recognition for allowing their observing networks to be used free of charge for the case study presented here. I would also like to thank Dr. Fred Carr for guiding me to adjoint analysis as a technique for computing observation sensitivity and impact. The Center for High Performance Computing at the University of Utah greatly aided this project by providing server and desktop support for this work. Finally, I would like to thank my friends and family, as I wouldn't have gotten this far without them.

This research is supported by the National Oceanic and Atmospheric Administration under grants NA07NWS468003 and NA10NWS4680005 as part of the CSTAR program.
CHAPTER 1

INTRODUCTION

Background and Motivation

Objective surface analyses with high spatial and temporal resolution have become increasingly vital during the past decade. Such mesoscale analyses are needed in nowcasting and short-range forecasting for wind power management, transportation safety, wildfire management, dispersion modeling, and defense applications (Horel and Colman 2005). Some of these high resolution real-time objective analyses are generated by tools that are not part of a fully integrated analysis/forecast data assimilation cycle, as most numerical models fail to capture adequately many surface weather features due to insufficient spatial resolution as well as incomplete parameterization of boundary layer processes (Uboldi et al. 2008; Glowacki et al. 2011). Instead, surface grids from short-range forecasts are often used as a starting point in the objective analysis process and then adjusted on the basis of high density mesonet observations.

Lazarus et al. (2002) reviewed many of the operational and research mesoscale analysis systems available during the late 20th century. Some of these systems are no longer undergoing further development or have been officially retired. Examples of current operational high resolution objective analyses developed internationally include the Vienna Enhanced Resolution Analysis (VERA; Steinacker et al. 2006) and the Integrated Nowcasting through Comprehensive Analysis (INCA; Haiden et al. 2010) systems for Austria and the Mesoscale Surface Analysis System (MSAS; Glowacki et al. 2011) run over Australia. All three of these analysis systems incorporate high density mesonet observations and generate surface analyses of temperature, moisture, and wind at resolutions of 1-4 km.

Mesoscale objective analysis systems available in the United States of particular note include MatchObsAll (Foisy 2003), the Space and Time Multiscale Analysis System (STMAS; Xie et al. 2011), and the Real-Time Mesoscale Analysis System (RTMA; de Pondeca et al. 2011). STMAS and the RTMA are run at regular intervals (15 minutes for STMAS; 1 h for the RTMA) over the contiguous United States (CONUS) domain. MatchObsAll is run at the discretion of National Weather Service (NWS) forecasters over local domains, which typically extend slightly beyond their areas of forecast responsibility. MatchObsAll and the RTMA are used operationally by NWS forecasters to help create and verify high resolution gridded forecasts of near-surface conditions across the United States (Glahn and Ruth 2003).

The methodologies used by these analysis systems can be categorized into two general classes. The first type consists of interpolation techniques (VERA, spline; INCA, inverse distance; and MatchObsAll, serpentine curve) that strive to have the analysis agree very closely with the available observations (Daley 1991). These approaches tend to be computationally efficient and work very well when the observations are spread relatively uniformly across the analysis domain and erroneous observations are identified and rejected as part of preprocessing quality control procedures. These approaches tend to suffer if the density of observations varies widely within the analysis domain, as the interpolation techniques may tightly constrain the analysis where the observations are plentiful, leading to overfitting in nearby data void regions (Myrick et al. 2005; Barker et al. 2007).

Approaches that fall within the second general class of analysis system (MSAS, optimum interpolation; RTMA, two-dimensional variational, 2DVar; and STMAS, three-dimensional variational, 3DVar) assume that observations may contain errors arising from the representativeness of the observations within their surrounding environment as well as instrumentation errors (Daley 1991; Kalnay 2003).
These approaches are particularly appropriate for analysis systems that rely on observations from heterogeneous networks with differing quality control standards that are often distributed unevenly within the analysis domain.

The RTMA serves as a reference analysis system for the research undertaken in this study. The National Centers for Environmental Prediction (NCEP) developed the RTMA to support the needs of NWS forecasters (de Pondeca et al. 2011). The RTMA is an objective surface analysis system with the ability to assimilate tens of thousands of surface observations collected from many different data providers to yield analysis grids of 2-m temperature, 2-m dewpoint, surface pressure, and 10-m winds over a CONUS domain as well as Alaska, Hawaii, Puerto Rico, and Guam domains. The analysis grids of the RTMA at resolutions of 2.5 and 5 km conform to the National Digital Forecast Database (NDFD) grid described by Glahn and Ruth (2003).

The computational resources required to compute mesoscale surface analyses such as the RTMA (with ~$10^{7}$ gridpoints) are considerable. In addition, techniques to manage appropriately the diversity of observational assets that lead to variations in data density and quality have heretofore remained largely unexplored. Tyndall et al. (2010) examined the sensitivity of the RTMA and a comparable 2DVar system to assumptions about the observational and background error covariances, in part as a function of observational type.

Mesoscale data assimilation depends to a great extent on the number of observations available to modify the specified background field. For example, the 5-km resolution RTMA has approximately 15,000 surface observations available to adjust the background fields at over 700,000 gridpoints, while nearly 2.4 million gridpoints are required for the 2.5-km resolution CONUS RTMA.
In addition, since these ~15,000 surface observations are not evenly spread throughout the entire analysis grid and are often clustered near urban areas, the number of observations providing independent information is often substantially less. The Integrated Data Influence (IDI), as described by Uboldi et al. (2008), can be used as a nondimensional measure of data density. Figure 1.1 depicts the IDI of all surface observations used by the RTMA to compute the 1400 UTC 27 October 2010 temperature analysis, using the RTMA's assumptions regarding the observational and background error covariances, i.e., the factors that affect the influence of observations on the analysis. Subject to the aforementioned assumptions related to the error covariances, regions of the domain with IDI values approaching one have more complete data coverage, while regions with low IDI values have few observations available.

The inequitable distribution of observations is of concern everywhere, but the complex underlying terrain of the western United States results in localized microclimates that remain difficult to resolve on the basis of the present observational network. Proposed improvements to the current approach through the development of a Nationwide Network of Networks (NNoN; National Research Council 2009) are unlikely to provide the number of observations necessary to resolve all of these local weather features around the country.

Variational data assimilation systems suffer from the necessity to specify the spatial scales of the background error covariance. Specifying large spatial scales for those errors appropriate for regions where few observations are available may lead to the inability to capture small-scale structures evident in data-rich areas.
For example, Figure 1.2 presents an artificial 2-m temperature analysis and corresponding analysis increments (adjustments to the background field) where a relatively dense observation network is embedded within a data sparse region. All of the observations (outlined circles) in Figure 1.2a generally have good agreement with each other, except at the very center of the observing network, where there are three observations that are warmer than those surrounding them. The temperature analysis in this case fails to capture the higher temperature feature here, as the assimilation scheme is tuned to extend the influences of the observations to the data sparse areas at the edge of the domain. Further, the "washed out" nature of the analysis increments (Figure 1.2b) near the center of the domain is due to the large number of cooler observations surrounding the three warmer observations, which limit the influence of the warmer observations to properly adjust the background field to the observed temperatures in the center of the domain. Tuning the assumptions about the background error covariance to resolve the small scale features would help to define the higher temperatures near the center, but would degrade the analysis in the surrounding data-sparse areas. Hence, adjusting the background error covariance as a function of data density may help to extend the influence of observations in otherwise data sparse regions while maintaining the ability to resolve smaller-scale features where the observing network is capable of resolving them.

The application of data density dependent observation weights or background error covariances has been studied previously for several different assimilation methods with mixed results. Lorenc et al. (1991) implemented decreased weights for observations located in data dense areas in the United Kingdom Meteorological Office's Analysis Correction data assimilation scheme, to improve the influence of observations in nearby data sparse regions.
Later research using the European Centre for Medium-Range Weather Forecasts (ECMWF) 3DVar assimilation system showed that shorter spatial scales used to specify the background error covariance improved forecasts in data dense areas, while longer spatial scales improved the forecast in data sparse areas (Andersson et al. 1998). Unfortunately, the assimilation system used in that research could only utilize a single structure function at a time (which is used to specify the spatial scales and construct the background error covariance), and Andersson et al. were unable to evaluate the impact on the forecast of using an observation density dependent structure function. However, their research notes that implementation of this feature into the 3DVar system would likely be beneficial.

The ~15,000 observations that were used to generate the IDI analysis in Figure 1.1 come from over 100 different mesonets across the United States. The impact on analyses of the quality of observations resulting from networks with differing reporting practices, instrumentation, maintenance, siting, and representativeness is of great interest, especially for the development of the NNoN (National Research Council 2009). The National Research Council report, motivating the necessity of the NNoN, emphasizes the need for improved and ongoing documentation of metadata regarding existing mesonets for such applications. For example, observations from the Remote Automated Weather Station (RAWS) mesonet are typically sited on southern slopes with anemometer heights of 6 m, instead of the 10 m height standard utilized by observations from the NWS (Horel and Dong 2010; Tyndall et al. 2010). Observations from the Citizen Weather Observing Program (CWOP) typically come from consumer grade instrumentation and may be sited on the roof of or next to a building, unlike the mandatory field of clearance and professional grade equipment required for NWS observations.
Observations from different mesonet providers with differing instrumentation, standards, and siting can be used by the analysis, provided that the assumptions about the observation errors for each network are appropriately evaluated. Defining those assumptions is facilitated by determining the impact of each network on the analysis.

As described by Tyndall et al. (2010) and Horel and Dong (2010), the Local Surface Analysis (LSA), a 2DVar analysis tool written in MATLAB that utilizes an assimilation scheme similar to the RTMA, has been used on local computer nodes maintained by the Center for High Performance Computing (CHPC) at the University of Utah. Although examination of appropriate error covariances for the LSA (and correspondingly the RTMA) as well as analysis sensitivity to selected observation networks was shown to be possible with the LSA, that approach is practical only for limited regional domains (approximately 6° latitude by 6° longitude) due to the computational requirements of the assimilation algorithm. In order to compute analyses efficiently over continental scale domains, the development of a new variational surface analysis tool was initiated as part of this research; it is herein referred to as the University of Utah Variational Surface Analysis (UU2DVar). This development included parallelizing the assimilation computation, implementing highly efficient programming practices using modest computer resources, and shifting the computation of the analysis from analysis space to observation space. The adjoint of the UU2DVar has also been developed as part of this research so that it may be used in future research to efficiently assess the impact of observation networks as part of efforts related to the NNoN. The UU2DVar, as well as differences between it and the RTMA, are described in Chapter 3.
Objectives and Outline

The objectives of this study are:

· To document the algorithms used by the UU2DVar to efficiently produce continental-scale surface analyses.
· To show that specifications of the background error covariance based on observation density allow relatively small-scale features to be resolved in areas of high data density while allowing the limited observations in data-sparse regions to influence analyses on broader scales.
· To apply the adjoint of the UU2DVar to assess analysis sensitivity to and impacts of individual mesonets on analyses.

Chapter 2 of this document discusses and describes variational assimilation theory, which is used by the UU2DVar to generate surface analyses. Variational theory is discussed in both the analysis space framework (utilized by the LSA) and the observation space framework (utilized by UU2DVar). Chapter 3 discusses the computational implementation of the variational framework used by the UU2DVar, namely the mathematical technique used to simplify the background error covariance matrix, the parallelization technique and usage of sparse matrices to decrease both wall clock time (the time needed to compute a quantity) and memory usage, formulation of the background error correlations, as well as adjustments to and quality control of observations utilized by the data assimilation tool. The adjoint of the UU2DVar, its derivation, and its application to specify background error covariance as a function of data density as well as the analysis sensitivity to differing observation networks is described in Chapter 4. A particular case study is used in Chapter 5 to demonstrate the use of this methodology. Finally, a summary and conclusions follow in Chapter 6. Future work is also presented in that chapter.

Figure 1.1. IDI analysis computed using 2-m temperature observations available to the RTMA at 1400 UTC 27 October 2010 along with the RTMA's spatial scales used to specify the background error covariance. The 25 largest metropolitan areas within CONUS (by population) are also noted on the map. High data densities (blue) are clustered around urban areas, while rural areas are often data sparse (light red) or data void (dark red).

Figure 1.2. Artificial 2-m temperature analysis and observations, along with corresponding analysis increments and observation innovations. (a) Analysis temperature, shaded in °C, along with temperature observations (circle markers, shaded in °C). (b) Analysis temperature increments in °C, along with temperature observation innovations (circle markers, shaded in °C).

CHAPTER 2

VARIATIONAL ASSIMILATION THEORY

Introduction

Gridded objective analyses are generated from typically irregularly distributed observations combined with a background field on a continuous grid subject to statistical assumptions and constraints (McPherson 1975; Talagrand 1997; Kalnay 2003). Such data assimilation algorithms have been necessary since the advent of meteorological modeling in the 1950s and many of those early approaches (e.g., Cressman method [Gilchrist and Cressman 1954], and successive corrections [Bergthórsson and Döös 1955; Cressman 1959]) continue to be used. As computational resources have improved, techniques such as optimal interpolation (Gandin 1963) and time-independent 2DVar and 3DVar assimilation (Sasaki 1958) have been introduced. Most recently, time-dependent (four-dimensional) variational assimilation (4DVar; Sasaki 1970) and ensemble-based data assimilation (e.g., Evensen 1994) are used by some operational centers and by many research groups. While some research groups are beginning to study 4DVar and ensemble based techniques for high resolution surface assimilation (N. Baker, personal communication; Ancell et al. 2011), these approaches are too computationally expensive to be used for real-time, high resolution, large scale mesoscale analyses.
Instead, time-independent variational approaches remain the most computationally affordable solutions for mesoscale analyses and will be studied here.

Following Kalnay (2003), all 2DVar and 3DVar approaches seek to minimize the cost function, $J$,

$$2J = J_b + J_o \tag{2.1}$$

where the terms $J_b$ and $J_o$ penalize the analysis for differences from the background field and observations, respectively. As will be discussed later in this chapter, additional weak constraints ($J_c$) based on the underlying terrain or flow dependencies can be introduced:

$$2J = J_b + J_o + J_c \tag{2.2}$$

To minimize the cost function, Equation 2.1 is expanded:

$$2J(\mathbf{x}_a) = (\mathbf{x}_a - \mathbf{x}_b)^{\mathrm{T}} \mathbf{P}_b^{-1} (\mathbf{x}_a - \mathbf{x}_b) + [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o]^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o] \tag{2.3}$$

where $\mathbf{x}_a$ is the analysis, $\mathbf{x}_b$ and $\mathbf{y}_o$ correspond, respectively, to the background field and observation vectors, $\mathbf{P}_b$ and $\mathbf{P}_o$ define, respectively, the background and observation error covariance matrices, and $\mathbf{H}$ is an operator that maps the analysis onto the observations. There are two widely used approaches to minimize Equation 2.3 to yield an analysis: (1) solve iteratively for a solution on the analysis grid (analysis space, Parrish and Derber 1992; Courtier et al. 1998), or (2) solve iteratively for a solution at the observation locations (observation space, Lorenc 1986; Daley and Barker 2001). These two methods are discussed in the next sections.
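As a concrete illustration of the two penalty terms in Equation 2.3, the following minimal Python sketch evaluates the background and observation terms on a five-point, one-dimensional grid. It is not taken from the UU2DVar code (which is written in MATLAB). The Gaussian form of the background error covariance, the forward operator that simply samples the grid point under each observation, the error variances, and all function and variable names are assumptions chosen only to keep the example small.

```python
import math


def gaussian_covariance(coords, sigma_b, length_scale):
    """Toy background error covariance: Gaussian decay with distance (assumed form)."""
    return [[sigma_b ** 2 * math.exp(-((xi - xj) / length_scale) ** 2) for xj in coords]
            for xi in coords]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def cost_terms(x_a, x_b, y_o, obs_index, p_b, obs_var):
    """Return the background and observation terms of Equation 2.3 (their sum is 2J)."""
    dx = [a - b for a, b in zip(x_a, x_b)]
    # Background term dx^T P_b^{-1} dx, found by solving P_b z = dx instead of inverting P_b.
    z = solve(p_b, dx)
    j_b = sum(d * zi for d, zi in zip(dx, z))
    # Observation term with a grid-point-sampling H and a diagonal P_o = obs_var * I.
    j_o = sum((x_a[i] - y) ** 2 / obs_var for i, y in zip(obs_index, y_o))
    return j_b, j_o


coords = [0.0, 1.0, 2.0, 3.0, 4.0]      # grid point positions (arbitrary units)
x_b = [10.0] * 5                        # background temperature (deg C)
obs_index, y_o = [1, 3], [12.0, 9.0]    # two observations located on grid points
p_b = gaussian_covariance(coords, sigma_b=1.0, length_scale=1.5)

# An analysis equal to the background pays only the observation penalty.
print(cost_terms(x_b, x_b, y_o, obs_index, p_b, obs_var=1.0))
# An analysis that matches both observations exactly pays only the background penalty.
x_fit = [10.0, 12.0, 10.0, 9.0, 10.0]
print(cost_terms(x_fit, x_b, y_o, obs_index, p_b, obs_var=1.0))
```

The minimizing analysis lies between these two extremes, trading departures from the background against departures from the observations according to the assumed error covariances.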
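Analysis Space

In the analysis space framework, the relationship

$$\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o = \mathbf{H}(\mathbf{x}_a - \mathbf{x}_b + \mathbf{x}_b) - \mathbf{y}_o = \mathbf{H}(\mathbf{x}_a - \mathbf{x}_b) + \mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o \tag{2.4}$$

which holds for a linear forward operator, is substituted into Equation 2.3 yielding:

$$2J = (\mathbf{x}_a - \mathbf{x}_b)^{\mathrm{T}} \mathbf{P}_b^{-1} (\mathbf{x}_a - \mathbf{x}_b) + [\mathbf{H}(\mathbf{x}_a - \mathbf{x}_b) + \mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o]^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_a - \mathbf{x}_b) + \mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o] \tag{2.5}$$

The right side of Equation 2.5 is algebraically expanded yielding Equation 2.6:

$$\begin{aligned} 2J = {} & (\mathbf{x}_a - \mathbf{x}_b)^{\mathrm{T}} \mathbf{P}_b^{-1} (\mathbf{x}_a - \mathbf{x}_b) + [\mathbf{H}(\mathbf{x}_a - \mathbf{x}_b)]^{\mathrm{T}} \mathbf{P}_o^{-1} \mathbf{H}(\mathbf{x}_a - \mathbf{x}_b) \\ & + [\mathbf{H}(\mathbf{x}_a - \mathbf{x}_b)]^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o] + [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o]^{\mathrm{T}} \mathbf{P}_o^{-1} \mathbf{H}(\mathbf{x}_a - \mathbf{x}_b) \\ & + [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o]^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o] \end{aligned} \tag{2.6}$$

To reduce the expense of computing the inverse of $\mathbf{P}_b$ (due to its large size), the cost function is transformed into a function of $\mathbf{v}$:

$$\mathbf{v} = \mathbf{P}_b^{-1} (\mathbf{x}_a - \mathbf{x}_b) \tag{2.7}$$

yielding:

$$\begin{aligned} 2J = {} & \mathbf{v}^{\mathrm{T}} \mathbf{P}_b^{\mathrm{T}} \mathbf{v} + \mathbf{v}^{\mathrm{T}} \mathbf{P}_b^{\mathrm{T}} \mathbf{H}^{\mathrm{T}} \mathbf{P}_o^{-1} \mathbf{H} \mathbf{P}_b \mathbf{v} + \mathbf{v}^{\mathrm{T}} \mathbf{P}_b^{\mathrm{T}} \mathbf{H}^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o] \\ & + [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o]^{\mathrm{T}} \mathbf{P}_o^{-1} \mathbf{H} \mathbf{P}_b \mathbf{v} + [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o]^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o] \end{aligned} \tag{2.8}$$

The minimum of the cost function from Equation 2.8 is computed by finding where the gradient of the function is 0:

$$0 = \nabla J = \mathbf{P}_b^{\mathrm{T}} \mathbf{v} + \mathbf{P}_b^{\mathrm{T}} \mathbf{H}^{\mathrm{T}} \mathbf{P}_o^{-1} \mathbf{H} \mathbf{P}_b \mathbf{v} + \mathbf{P}_b^{\mathrm{T}} \mathbf{H}^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o]$$

yielding:

$$-\mathbf{P}_b^{\mathrm{T}} \mathbf{H}^{\mathrm{T}} \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_b) - \mathbf{y}_o] = (\mathbf{P}_b^{\mathrm{T}} + \mathbf{P}_b^{\mathrm{T}} \mathbf{H}^{\mathrm{T}} \mathbf{P}_o^{-1} \mathbf{H} \mathbf{P}_b) \mathbf{v} \tag{2.9}$$

Equation 2.9 is solved iteratively for $\mathbf{v}$ by the conjugate gradient solution method (CGS; Hestenes and Stiefel 1952) or the generalized minimum residual method (GMRES; Saad and Schultz 1986). The analysis is computed from Equation 2.10, which reflects that the background field is modified by the analysis increment ($\mathbf{P}_b \mathbf{v}$):

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{P}_b \mathbf{v} \tag{2.10}$$

Although Equations 2.9 and 2.10 simplify the analysis by eliminating the computation of the inverse of $\mathbf{P}_b$, the computation and storage of $\mathbf{P}_b$ itself is no trivial task. $\mathbf{P}_b$ is a matrix of size $n \times n$, where $n$ is the number of gridpoints in the analysis. For a continental scale two-dimensional analysis, $n$ can be on the order of $10^5$ to $10^6$ depending upon the horizontal spacing of the grid. Storage of a double precision matrix of these sizes ranges from 74 GB to 7.4 TB, which can be difficult to store in memory even on supercomputers.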
Furthermore, computation of the full background error covariance matrix is expensive; e.g., computing $\mathbf{P}_b$ generally takes about 5.5 h on 8 processor cores (unless otherwise specified, all wall clock times were measured using a compute node with two hex-core Xeon processors clocked at 2.80 GHz) for the types of cases studied here, which is not suitable for an analysis that might be needed for real-time applications. Although wall clock time can be reduced by using more processors, operational centers often have serious constraints on computing resources due to the large number of numerical products that need to run on the same supercomputer (for example, the 5-km RTMA is only run on 16 processors of NCEP's 4,992 processor computer cluster [M. de Pondeca, personal communication]).

The background error covariance matrix is often approximated to circumvent these storage and computation problems (Fisher 2003). There are many different ways to approximate $\mathbf{P}_b$, such as modeling the matrix in both spectral and spatial coordinates using the wavelet formulation (Buehner and Charron 2007), using a diffusion operator (Weaver and Courtier 2001), or using a recursive filter (Lorenc 1986; Purser et al. 2003a, 2003b; de Pondeca et al. 2011). Since the background error covariance matrix defines the spatial scales over which observations influence the analysis, it is important to approximate the matrix as accurately as possible (Daley 1991; Fisher 2003). Since much of the covariance modeling research (Weaver and Courtier 2001; Purser et al. 2003a, 2003b; Buehner and Charron 2007) has focused on global analysis products, simplifications of the covariance matrices over such large domains take advantage of synoptic-scale balances, such as geostrophic and hydrostatic balance. Unfortunately, for mesoscale conditions within the planetary boundary layer, such balances are not appropriate (Bannister 2008a, 2008b).
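The storage figures quoted in this section for the full background error covariance matrix follow from simple arithmetic, sketched below in Python. The only assumptions are 8 bytes per double precision value and a gigabyte of 2^30 bytes, a convention that appears consistent with the quoted figures but is not stated in the text. The function name is illustrative.

```python
BYTES_PER_DOUBLE = 8  # IEEE 754 double precision


def dense_matrix_gib(rows, cols):
    """Memory in GiB (2**30 bytes) for a dense double precision rows x cols matrix."""
    return rows * cols * BYTES_PER_DOUBLE / 2 ** 30


# Full n x n background error covariance for the two grid sizes discussed in the text.
for n in (10 ** 5, 10 ** 6):
    print(f"n = {n:>9,d}: full P_b needs {dense_matrix_gib(n, n):,.1f} GiB")
```

Under these assumptions the two cases come to about 74.5 GiB and about 7,450 GiB, in line with the range given above. Because the requirement grows with the square of the number of gridpoints, halving the grid spacing of a two-dimensional analysis multiplies it by roughly sixteen.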
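Observation Space

In the observation space framework, the gradient of the cost function, presented in Equation 2.3, is computed immediately:

$$0 = 2\nabla J = \mathbf{P}_b^{-1} (\mathbf{x}_a - \mathbf{x}_b) + \mathbf{H}^T \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o] \tag{2.11}$$

Equation 2.11 is multiplied by the background error covariance to avoid computing its inverse yielding:

$$\mathbf{x}_a - \mathbf{x}_b = \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} [\mathbf{y}_o - \mathbf{H}(\mathbf{x}_a)] \tag{2.12}$$

Similar to the derivation in physical space, the substitution

$$\boldsymbol{\eta} = \mathbf{P}_o^{-1} [\mathbf{y}_o - \mathbf{H}(\mathbf{x}_a)] \tag{2.13}$$

is introduced into Equation 2.12 to yield Equation 2.14:

$$\mathbf{x}_a - \mathbf{x}_b = \mathbf{P}_b \mathbf{H}^T \boldsymbol{\eta} \tag{2.14}$$

Equation 2.14 is transformed into observation space by multiplying the equation by the forward transform operator, $\mathbf{H}$:

$$\mathbf{H}(\mathbf{x}_a) - \mathbf{H}(\mathbf{x}_b) = \mathbf{H} \mathbf{P}_b \mathbf{H}^T \boldsymbol{\eta} \tag{2.15}$$

Additional simple manipulation leads to:

$$\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o + \mathbf{y}_o - \mathbf{H}(\mathbf{x}_b) = \mathbf{H} \mathbf{P}_b \mathbf{H}^T \boldsymbol{\eta} \tag{2.16}$$

$$-\mathbf{P}_o \boldsymbol{\eta} + \mathbf{y}_o - \mathbf{H}(\mathbf{x}_b) = \mathbf{H} \mathbf{P}_b \mathbf{H}^T \boldsymbol{\eta} \tag{2.17}$$

Finally, the terms multiplied by $\boldsymbol{\eta}$ in Equation 2.17 are separated to one side of the equation, which allows for $\boldsymbol{\eta}$ to be solved iteratively:

$$\mathbf{y}_o - \mathbf{H}(\mathbf{x}_b) = (\mathbf{H} \mathbf{P}_b \mathbf{H}^T + \mathbf{P}_o) \boldsymbol{\eta} \tag{2.18}$$

The analysis is then computed by rearranging Equation 2.14:

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{P}_b \mathbf{H}^T \boldsymbol{\eta} \tag{2.19}$$

The observation space approach eliminates the storage problem of the background error covariance since the transpose of $\mathbf{H}$ filters unneeded information from the background error covariance. Rows of $\mathbf{P}_b$ can be computed individually, multiplied by their respective columns of $\mathbf{H}^T$, and stored. The product, $\mathbf{P}_b \mathbf{H}^T$, has dimensions $n \times p$, where $p$ is the number of observations assimilated by the analysis and $n$ is the number of gridpoints (see Figure 2.1). Memory efficiency is greatly improved by using this approach, provided that the analysis is under-sampled (i.e., the number of observations is much less than the number of gridpoints [Daley and Barker 2001]). Memory requirements for the computation in observation space for double precision data can range from 7.4 GB to 74.5 GB for analyses of $10^5$ to $10^6$ gridpoints, which is within the memory capacity of many computer clusters. While these memory requirements are still significant, additional approximations can be made and additional computational methods can be implemented to allow analyses to be generated using modest computing resources.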
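To make Equations 2.18 and 2.19 concrete, the following is a minimal sketch on a one-dimensional grid, not the UU2DVar code. It assumes a nearest-neighbour forward operator, a Gaussian background error covariance with an 80 km length scale, a diagonal observation error covariance, and a hand-written conjugate gradient solver; the function names, grid spacing, and observation values are all illustrative.

```python
import math


def gaussian_pb(i, j, dx_km=5.0, length_km=80.0, var_b=1.0):
    """Background error covariance between gridpoints i and j on a 1-D grid."""
    r = abs(i - j) * dx_km
    return var_b * math.exp(-(r / length_km) ** 2)


def conjugate_gradient(a, b, tol=1e-12, max_iter=200):
    """Solve a x = b for a symmetric positive definite matrix (list of lists)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual b - a x for the initial guess x = 0
    p = r[:]
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs < tol:
            break
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x


def analyze(x_b, obs_index, y_o, var_o=1.0):
    """Observation-space analysis: Eq. 2.18 for eta, then Eq. 2.19 for x_a."""
    n, p = len(x_b), len(y_o)
    # PbHT (n x p): with a nearest-neighbour H, column k of PbHT is the column
    # of Pb that belongs to the gridpoint of observation k.
    pbht = [[gaussian_pb(i, obs_index[k]) for k in range(p)] for i in range(n)]
    # HPbHT + Po (p x p): select the observed rows of PbHT, add the diagonal Po.
    lhs = [[pbht[obs_index[r]][c] + (var_o if r == c else 0.0) for c in range(p)]
           for r in range(p)]
    innovation = [y_o[k] - x_b[obs_index[k]] for k in range(p)]
    eta = conjugate_gradient(lhs, innovation)
    return [x_b[i] + sum(pbht[i][k] * eta[k] for k in range(p)) for i in range(n)]


x_b = [10.0] * 41                                   # flat 10 degree background
x_a = analyze(x_b, obs_index=[10, 30], y_o=[12.0, 9.0])
print(round(x_a[10], 3), round(x_a[30], 3))
```

Only the $n \times p$ product and the $p \times p$ system are ever formed, which is the memory advantage described above; the full $n \times n$ covariance never appears.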
These methods and approximations used by the UU2DVar are covered in Chapter 3.

Implementation of Constraints

Although the undersampling of observations is exploited by the analysis technique presented by Equations 2.18 and 2.19, undersampling remains a significant problem for high resolution analyses. If the observations are sparse in a particular area of the analysis domain, then the data assimilation system depends upon the background field to produce the analysis (Equation 2.1). Many high resolution surface analysis frameworks and systems (Myrick et al. 2005; de Pondeca et al. 2011; Haiden et al. 2011) downscale coarse resolution background fields to the analysis grid using a variety of methods. Unfortunately, the downscaled background field may not resolve many small-scale weather features (Myrick et al. 2005), and in some cases, may produce erroneous features in these areas through the downscaling process. The usage of constraints can help improve the analysis by supplying information to the data assimilation system not provided by the background field or observations (Lorenc 1986; Xie et al. 2002). The constraint can either be formulated as a weak constraint or a strong constraint (Zhu and Yan 2006). Strong constraints modify either the background error covariance or the background itself. The strong constraint may add balanced coupling between two different assimilated fields, add background error correlation to a meteorological parameter or topography field, or may impose some other fundamental limit or law to the analysis (Lorenc 1986; Protat and Zawadzki 1999; Xie et al. 2002). As will be discussed in Chapter 3, the UU2DVar (as well as the RTMA) uses differences in elevation as an anisotropic constraint. The addition of a strong constraint defined by the density of observations is introduced in Chapter 3 and tested in Chapter 4 as part of this research.
Because of the direct modification to the background error covariance or background field, the strong constraints are assumed to be perfect and force the subsequent analyses to meet the balance requirements of the specific constraint (Lorenc 1986; Xie et al. 2002). In contrast, a weak constraint does not force the analysis to meet the constraint exactly, which can be advantageous if the constraint is only an approximation (Xie et al. 2002). One formulation of the weak constraint, $J_c$, presented in Equation 2.2 is often expanded in the form:

$$J_c = (\mathbf{x}_a - \mathbf{x}_c)^T \mathbf{P}_c^{-1} (\mathbf{x}_a - \mathbf{x}_c) \tag{2.20}$$

where $\mathbf{x}_c$ is the constrained field and $\mathbf{P}_c$ is a term that describes the error covariance of the constrained field (Lorenc 1986). The term $\mathbf{P}_c$ describes the weighting of the minimization of the difference between the analysis and the constraint relative to the difference between the analysis and the observations and the analysis and the background field. The weak constraint may also be implemented as additional artificial observations in $\mathbf{y}_o$ in the cost function. The solution to the variational analysis equation with a weak constraint as described by Equation 2.2 becomes more complicated than the basic observation space equations presented by Equations 2.18 and 2.19 and also doubles the memory cost of the analysis.
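Equation 2.21 presents the variational cost function with the explicit weak constraint implementation presented by Equation 2.20:

$$2J = (\mathbf{x}_a - \mathbf{x}_b)^T \mathbf{P}_b^{-1} (\mathbf{x}_a - \mathbf{x}_b) + [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o]^T \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o] + (\mathbf{x}_a - \mathbf{x}_c)^T \mathbf{P}_c^{-1} (\mathbf{x}_a - \mathbf{x}_c) \tag{2.21}$$

The gradient of Equation 2.21 is computed to find the minimum of the cost function:

$$0 = 2\nabla J = \mathbf{P}_b^{-1} (\mathbf{x}_a - \mathbf{x}_b) + \mathbf{H}^T \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o] + \mathbf{P}_c^{-1} (\mathbf{x}_a - \mathbf{x}_c) \tag{2.22}$$

As with the observation space framework presented in the last section, Equation 2.22 is multiplied by the background error covariance to simplify the computation of the analysis:

$$0 = (\mathbf{x}_a - \mathbf{x}_b) + \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o] + \mathbf{P}_b \mathbf{P}_c^{-1} (\mathbf{x}_a - \mathbf{x}_c) \tag{2.23}$$

Equation 2.23 is multiplied by the observation operator $\mathbf{H}$ to simplify the terms involving $\mathbf{P}_b$:

$$0 = \mathbf{H}(\mathbf{x}_a - \mathbf{x}_b) + \mathbf{H} \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} [\mathbf{H}(\mathbf{x}_a) - \mathbf{y}_o] + \mathbf{H} \mathbf{P}_b \mathbf{P}_c^{-1} (\mathbf{x}_a - \mathbf{x}_c) \tag{2.24}$$

Expansion and rearrangement yields:

$$\mathbf{H}(\mathbf{x}_b) + \mathbf{H} \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} \mathbf{y}_o + \mathbf{H} \mathbf{P}_b \mathbf{P}_c^{-1} \mathbf{x}_c = \mathbf{H}(\mathbf{x}_a) + \mathbf{H} \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} \mathbf{H}(\mathbf{x}_a) + \mathbf{H} \mathbf{P}_b \mathbf{P}_c^{-1} \mathbf{x}_a \tag{2.25}$$

Finally, the analysis vector is factored out of the right hand side of the equation, yielding:

$$\mathbf{H}(\mathbf{x}_b) + \mathbf{H} \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} \mathbf{y}_o + \mathbf{H} \mathbf{P}_b \mathbf{P}_c^{-1} \mathbf{x}_c = (\mathbf{H} + \mathbf{H} \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} \mathbf{H} + \mathbf{H} \mathbf{P}_b \mathbf{P}_c^{-1}) \mathbf{x}_a \tag{2.26}$$

In this form, the analysis is computed by directly solving for the analysis vector through an iterative solution method (as in Equations 2.8 and 2.18). Equation 2.26 can be generalized for $N$ multiple constraints:

$$\mathbf{H}(\mathbf{x}_b) + \mathbf{H} \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} \mathbf{y}_o + \mathbf{H} \mathbf{P}_b \sum_{i=1}^{N} \mathbf{P}_{c_i}^{-1} \mathbf{x}_{c_i} = \left[ \mathbf{H} + \mathbf{H} \mathbf{P}_b \mathbf{H}^T \mathbf{P}_o^{-1} \mathbf{H} + \mathbf{H} \mathbf{P}_b \sum_{i=1}^{N} \mathbf{P}_{c_i}^{-1} \right] \mathbf{x}_a \tag{2.27}$$

The preceding derivation assumes that $\mathbf{P}_c$ is a diagonal matrix, as it is not computationally feasible to calculate the inverse of an $n \times n$ matrix (with $n$ the number of gridpoints). As in the basic observation space framework, the combined matrix $\mathbf{P}_b \mathbf{H}^T$ can be stored efficiently. However, the computational memory required doubles using Equation 2.27 because another product must be stored: $\mathbf{H} \mathbf{P}_b$. This product also does not require explicitly storing the entire background error covariance matrix, as individual rows of the $\mathbf{H}$ matrix can be multiplied by individual columns of the $\mathbf{P}_b$ matrix to yield the product $\mathbf{H} \mathbf{P}_b$. Although the memory cost for an analysis using Equation 2.27 doubles compared to Equations 2.18 and 2.19, wall clock time increases only slightly.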
Because $\mathbf{P}_b$ is symmetric, individual rows of $\mathbf{P}_b$ computed during the computation of $\mathbf{P}_b \mathbf{H}^T$ can be transposed to yield the individual columns of $\mathbf{P}_b$ needed for the computation of $\mathbf{H} \mathbf{P}_b$ (computation of these matrix products is further discussed in Chapter 3). The UU2DVar supports the usage of both strong and weak constraints in the computation of the analysis; however, weak constraints are not investigated as part of this research. The UU2DVar could be utilized to study weak constraints as described by Equation 2.21, such as the additional utilization of a statistical model of orographic flow appropriate to the underlying terrain, as the product $\mathbf{H} \mathbf{P}_b$ is computed as part of the assimilation cycle to compute the adjoint (described in Chapter 4). The implementation of the strong constraints used as part of this research, which includes basic terrain anisotropy and a data density term, is discussed in Chapter 3.

[Figure 2.1: schematic of the product of a 6 × 6 symmetric $\mathbf{P}_b$ (background error variance on the diagonal, dummy variables off the diagonal) with a 6 × 2 $\mathbf{H}^T$ of dummy variables, giving the 6 × 2 matrix $\mathbf{P}_b \mathbf{H}^T$. Each element of $\mathbf{P}_b \mathbf{H}^T$ is written out as the sum of products of one row of $\mathbf{P}_b$ with one column of $\mathbf{H}^T$.]

Figure 2.1. Methodology to compute each row of the background error covariance individually (shaded) to yield $\mathbf{P}_b \mathbf{H}^T$. The off-diagonal entries of $\mathbf{P}_b$ and the entries of $\mathbf{H}^T$ are dummy variables.
CHAPTER 3

IMPLEMENTATION OF VARIATIONAL ASSIMILATION THEORY WITHIN THE UU2DVAR

Introduction

As mentioned previously, the UU2DVar solves the variational cost function in observation space. Chapter 2 presented a derivation of the observation space solution to the cost function, and the UU2DVar's implementation of those equations (e.g., the specification of the error covariances, quality control of observations, analysis computation cycle) is covered here.

Background Error Covariance

Individual elements of the background error covariance used by the UU2DVar are computed by:

$$P_{b,ij} = \sigma_b^2 \exp\left[ -\frac{r_{ij}^2}{\mathcal{R}_k(\mathrm{IDI}_i)^2} \right] \exp\left[ -\frac{z_{ij}^2}{\mathcal{Z}_k(\mathrm{IDI}_i)^2} \right] \tag{3.1}$$

where $r_{ij}$ and $z_{ij}$ are the horizontal great circle and vertical distances between gridpoints $i$ and $j$. The horizontal and vertical decorrelation length scale terms ($\mathcal{R}_k$ and $\mathcal{Z}_k$, respectively) in the denominators of the exponential terms in Equation 3.1 are not constants as used in prior studies (Myrick et al. 2005; Tyndall 2008; Horel and Dong 2010; Tyndall et al. 2010), but are instead assumed to be functions of the data density at gridpoint $i$ as measured by the dimensionless IDI. $\mathcal{R}_k$ and $\mathcal{Z}_k$ are defined by the $k$th sixth order polynomials of the form:

$$\mathcal{R}_k(\mathrm{IDI}_i) = R \cdot \left( c_{k,1} \mathrm{IDI}_i^6 + c_{k,2} \mathrm{IDI}_i^5 + c_{k,3} \mathrm{IDI}_i^4 + c_{k,4} \mathrm{IDI}_i^3 + c_{k,5} \mathrm{IDI}_i^2 + c_{k,6} \mathrm{IDI}_i + c_{k,7} \right) \tag{3.2}$$

$$\mathcal{Z}_k(\mathrm{IDI}_i) = Z \cdot \left( c_{k,1} \mathrm{IDI}_i^6 + c_{k,2} \mathrm{IDI}_i^5 + c_{k,3} \mathrm{IDI}_i^4 + c_{k,4} \mathrm{IDI}_i^3 + c_{k,5} \mathrm{IDI}_i^2 + c_{k,6} \mathrm{IDI}_i + c_{k,7} \right) \tag{3.3}$$

where $c_{k,1}$ through $c_{k,7}$ are coefficients of the $k$th polynomial that determine its shape, and $R$ and $Z$, respectively, are horizontal and vertical decorrelation length scales of the type used in the previously cited studies. The polynomial functions presented in Equations 3.2 and 3.3 were selected because their shapes are easily modified by simply changing the polynomial coefficients. Figure 3.1 depicts the various forms of $\mathcal{R}_k$ and $\mathcal{Z}_k$ as a function of the IDI that were studied as part of this research. In this study, $R$ and $Z$ are set to 80 km and 200 m, respectively; these values were determined by Tyndall et al.
(2010) for the CONUS domain and were tested in a case study over the area surrounding the Shenandoah Valley, VA. Similarly, the background error variance, $\sigma_b^2$, is set to 1°C for 2-m temperature and 2-m dewpoint, and 1 m/s for $u$ and $v$ winds. The IDI, as defined by Uboldi et al. (2008), is computed by generating an analysis where all of the background values are assumed to be zero and all observations are assumed to be one. The IDI is completely dependent upon the assumptions made regarding the observation and background error covariances. In this study, the background error covariance used by the IDI is always specified by Equation 3.1 using the set of polynomial coefficients corresponding to $k = 0$ (see Figure 3.1). The ratio of the observation error variance to background error variance ($\sigma_o^2 / \sigma_b^2$) is also always 1 for all IDI computations. The IDI is a measure of the influence on the analysis by the observations; however, it is also a measure of observation density (Horel and Dong 2010). Regions of the analysis where the IDI is near 1 indicate data rich areas with multiple stations in close proximity, while values near 0 indicate data void regions. With $\sigma_o^2 / \sigma_b^2$ set to 1, the value of the IDI for an analysis with a single observation near that particular observation's gridpoint is 0.5 (i.e., the observation and background contribute equally to the final analysis). As a test of the use of the IDI, 0.5 is used as a point of inflection for the $k = 1$ and $k = 2$ polynomials. These polynomials force the decorrelation length scales to decrease significantly as the IDI approaches 1 and thereby allow finer-scale structures in the analysis than when $k = 0$. In the case of the $k = 2$ polynomial, the decorrelation length scales are substantially increased as the IDI approaches 0 for completely data void areas and thereby allow deviations between isolated observations and the background to influence a broader region.
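The covariance model of Equations 3.1 to 3.3 and the single-observation IDI value of 0.5 can be illustrated with a short sketch. The polynomial coefficients used in this research are not listed in this section, so the coefficient set below (a polynomial that is identically 1, i.e., constant length scales) is purely hypothetical, as are the function names. The IDI function is the closed form obtained from Equations 2.18 and 2.19 for the idealized case of several observations sharing one gridpoint, a zero background, and unit observations.

```python
import math


def polynomial_factor(coeffs, idi):
    """Sixth-order polynomial of Eqs. 3.2/3.3; coeffs = (c1, ..., c7), with c1
    multiplying IDI**6 and c7 the constant term (Horner's rule)."""
    factor = 0.0
    for c in coeffs:
        factor = factor * idi + c
    return factor


def pb_element(r_km, z_m, idi_i, coeffs, var_b=1.0, base_r_km=80.0, base_z_m=200.0):
    """Eq. 3.1: covariance between two gridpoints separated by r_km and z_m."""
    factor = polynomial_factor(coeffs, idi_i)
    horizontal = math.exp(-(r_km / (base_r_km * factor)) ** 2)
    vertical = math.exp(-(z_m / (base_z_m * factor)) ** 2)
    return var_b * horizontal * vertical


def idi_colocated(n_obs, error_ratio=1.0):
    """IDI at a gridpoint holding n_obs observations, where error_ratio is the
    observation-to-background error variance ratio."""
    return n_obs / (n_obs + error_ratio)


FLAT = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)   # hypothetical: polynomial == 1

print(pb_element(80.0, 0.0, 0.5, FLAT))      # one horizontal length scale away
print(idi_colocated(1), idi_colocated(9))    # single station vs. dense cluster
```

With a variance ratio of 1, one observation gives an IDI of 0.5 and nine co-located observations give 0.9, which matches the interpretation of the IDI as a data density measure.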
Specifying the background error covariance as a function of data density will be examined in Chapter 5.

Computation and Storage of the Background Error Covariance Matrix

As mentioned in Chapter 2, the computation and storage of the background error covariance matrix is one of the most significant challenges in variational data assimilation. Even with the usage of the observation space framework, additional approximations must be made and advanced computation methods must be implemented to compute and store the background error covariance over a continental scale domain within real-time analysis constraints. The UU2DVar utilizes the following four methods to improve the computation and storage of the background error covariance matrix (in the form of $\mathbf{P}_b \mathbf{H}^T$):

1. Usage of sparse matrix mathematics
2. Variational localization
3. Computation of only needed elements of $\mathbf{P}_b$
4. Parallel computing

Although the largest matrix stored by the UU2DVar is of size $n \times p$ (gridpoints by observations) instead of $n \times n$, a significant amount of memory is required to store this matrix for continental scale variational data assimilation problems. For the CONUS 5-km resolution domain used in this research and the 15,000 observations assimilated each hour, storage of the full $\mathbf{P}_b \mathbf{H}^T$ matrix requires approximately 75 GB of memory. Although this is feasible for large supercomputers, it is not necessary to store the full $\mathbf{P}_b \mathbf{H}^T$ matrix, as sparse matrices can be utilized to reduce memory requirements as well as wall clock time. Unlike the full matrix, which explicitly stores every element of a matrix, the sparse matrix only stores nonzero elements of the matrix, along with the index locations of those nonzero elements. Using sparse matrices only saves significant memory if the matrix to be stored has enough zero elements. For example, $\mathbf{P}_b \mathbf{H}^T$ is a two-dimensional matrix; therefore the sparse form of $\mathbf{P}_b \mathbf{H}^T$ must store both row and column indices as well as the values of the nonzero elements within the matrix.
For memory savings to be realized, $\mathbf{P}_b \mathbf{H}^T$ must be at least 66.7% element sparse (i.e., at least 66.7% of its elements must be 0). Unfortunately, $\mathbf{P}_b \mathbf{H}^T$ does not meet this requirement, even though $\mathbf{H}^T$ is generally an extremely sparse matrix (as discussed below, only one value is nonzero in each row for this study). To force element sparseness, variational localization can be used to add additional zero elements to the matrix product. Depending on how the background error covariance is specified, an observation assimilated using variational methods can influence analysis gridpoint values thousands of kilometers away. These extremely large scale correlations may not be accurate (Hamill et al. 2001), especially in the case of undersampled assimilation problems, which is typical with surface observations. Variational localization refers to the elimination of extremely small error correlations. In the UU2DVar, small error correlations are not even computed as part of the specification of the covariance matrix, which not only reduces memory requirements (from sparse matrix implementations), but also improves computational time. The UU2DVar implements variational localization through a maximum radius of influence, which in this study is set to 3.75 times the maximum horizontal decorrelation scale, i.e., 300 km for polynomial coefficients $k = 0$ and $k = 1$, and 600 km for $k = 2$. This corresponds to removing all correlations that are smaller than $7.8 \times 10^{-7}$. This maximum radius of influence was chosen somewhat arbitrarily, and reducing it further will decrease both wall clock time and memory requirements and may have little impact on the resulting analyses. Figure 3.2 depicts the difference between an IDI temperature analysis computed without variational localization and one with variational localization, using a western United States domain centered over Utah.
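Both numbers in this passage follow from short calculations, sketched below. The sparsity threshold assumes that a sparse element costs three stored values (row index, column index, value) of equal size, and the correlation cutoff evaluates the horizontal factor of Equation 3.1 at 3.75 length scales with no elevation difference. The function names are illustrative.

```python
import math


def sparse_saves_memory(n_rows, n_cols, n_nonzero):
    """True when (row, column, value) triplets need less room than the dense array."""
    return 3 * n_nonzero < n_rows * n_cols


def smallest_retained_correlation(radius_in_length_scales=3.75):
    """Horizontal factor of Eq. 3.1 at the localization radius (z = 0)."""
    return math.exp(-radius_in_length_scales ** 2)


print(sparse_saves_memory(10, 10, 33))     # 67% zeros: sparse storage is smaller
print(sparse_saves_memory(10, 10, 34))     # 66% zeros: dense storage is smaller
print(f"{smallest_retained_correlation():.2e}")
```

The break-even point is one nonzero element in three, i.e., 66.7% zeros, and the correlation at the localization radius is $e^{-3.75^2}$, roughly $7.8 \times 10^{-7}$.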
As shown by Figure 3.2, differences between the two analyses are negligible, with a maximum difference on the order of $10^{-6}$. The majority of the largest differences are located in data sparse and data void regions; however, these differences are extremely small. Furthermore, the usage of variational localization is supported by other analysis systems that also utilize the technique, e.g., the RTMA (M. de Pondeca, personal communication). Usage of variational localization is supported by the functional form of Equation 3.1, which assumes background error correlations asymptote to 0 at large distances. In addition to reducing the memory requirements to store $\mathbf{P}_b \mathbf{H}^T$, there is also a need to significantly reduce its computation time. Computation of $\mathbf{P}_b \mathbf{H}^T$ on a single processor for a continental scale problem can take days; however, the wall clock time can be significantly reduced by only computing required elements of the background error covariance matrix $\mathbf{P}_b$ that correspond to the nonzero rows of $\mathbf{H}^T$. As illustrated in Figure 3.3, only the first and fifth columns of $\mathbf{P}_b$ actually need to be computed to yield the full $\mathbf{P}_b \mathbf{H}^T$ matrix for the simple example depicted in Figure 2.1. Wall clock time using the true covariance and forward operator is significantly reduced from days to less than 30 minutes, as the number of surface observations is typically 2-3 orders of magnitude smaller than the number of analysis gridpoints. The wall clock time needed to compute the background error covariance matrix can further be reduced through parallel computing. The computation of $\mathbf{P}_b$ is typically classified as an embarrassingly parallel computing problem, as the only interprocessor communication is at the start of the routine, to distribute pieces of information used to compute $\mathbf{P}_b$ to individual workers, and at the end of the routine, to gather up the final result from the individual workers to assemble a full matrix.
Embarrassingly parallel computing problems typically have near perfect speedup; i.e., wall clock time is reduced by half when the number of available processors doubles. Because the UU2DVar utilizes sparse matrices in its computation of the background error covariance matrix, the problem is more complicated than a simple parallel for loop: large blocks of full matrices must be computed and then converted to sparse matrices, because making all of these computations directly with sparse matrices would require repeated reallocation of memory. For completeness, the algorithm used to compute $\mathbf{P}_b \mathbf{H}^T$ (and $\mathbf{H} \mathbf{P}_b$) is depicted in Figure 3.4 and described below (variable names used by the code are italicized):

1. Nonzero rows of $\mathbf{H}^T$ are identified to determine which columns of $\mathbf{P}_b$ must be computed.
2. Resulting indices from (1) are divided into equal parts by the number of processors (nprocs) used by the analysis.
3. Each individual section of indices (owned by a particular processor) is divided into further subsections, based on the value of the user tunable variable numpbrowscompatonce. This particular variable controls how many columns of $\mathbf{P}_b$ the computer as a whole (not an individual processor) is allowed to operate on at once. Therefore, the length of each subsection each individual processor may operate on at one time is numpbrowscompatonce/nprocs.
4. Each processor computes each subsection of its assigned indices of $\mathbf{P}_b$ column by column, through Equation 3.1, using a full matrix to store the resulting computations. When a processor reaches the end of the subsection, the entire $\mathbf{P}_b$ subsection is converted from a full matrix to a sparse matrix, and the processor moves on to the next subsection.
5. When all subsections have been completed, each individual section of $\mathbf{P}_b$ owned by each processor is gathered into a single sparse matrix.
Note that this matrix is not the full covariance matrix, as only elements that would not be reduced to 0 by multiplication of the transpose of the forward operator are computed.
6. The matrix product $\mathbf{P}_b \mathbf{H}^T$ is computed. The matrix storing the needed elements of $\mathbf{P}_b$ is transposed, which is required for the computation of $\mathbf{H} \mathbf{P}_b$.

Figure 3.5 shows the speedup (black thin line) of the algorithm used to compute $\mathbf{P}_b \mathbf{H}^T$ and $\mathbf{H} \mathbf{P}_b$ as a function of the number of processors. The speedup is a ratio of the computer time required for a code to run on a single processor versus the time required to run on multiple processors. Parallel algorithms with perfect speedup (depicted in Figure 3.5 as a thick grey line) have wall clock times that are halved when the number of processors used to compute the algorithm is doubled. Perfect speedup can be difficult to achieve due to communication overhead between processors. The speedup depicted in Figure 3.5 measures the average of 10 trials computing $\mathbf{P}_b \mathbf{H}^T$ and $\mathbf{H} \mathbf{P}_b$ for all CONUS temperature observations for the 1400 UTC 27 October 2010 analysis (approximately 14,000 observations over 740,000 gridpoints). Speedup of the UU2DVar's $\mathbf{P}_b \mathbf{H}^T$ and $\mathbf{H} \mathbf{P}_b$ computation is significantly less than ideal for larger numbers of processors because all processors used as part of this test share the same memory. This forces each processor to operate on a smaller piece of the background error covariance matrix at one time. Speedup can be improved by scaling numpbrowscompatonce with the number of processors, but this requires increasing the memory of the system with the number of processors as well. Although the computer system used in this research has significantly more memory than other compute nodes, it was decided that the UU2DVar would be tested using numpbrowscompatonce corresponding to a moderately powered compute node.
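The six steps of the algorithm can be sketched in serial form as follows. This is a simplified illustration rather than the UU2DVar implementation: it uses a one-dimensional grid with horizontal distances only, stores each localized column directly as a dictionary instead of converting full blocks to sparse matrices, and loops over the subsections in sequence where the real code would hand them to separate processors. All names are hypothetical.

```python
import math


def needed_columns(ht_rows):
    """Step 1: indices of the nonzero rows of HT (the observed gridpoints)."""
    return [i for i, row in enumerate(ht_rows) if any(v != 0 for v in row)]


def chunks(indices, size):
    """Steps 2-3: split the column indices into subsections of at most `size`."""
    return [indices[k:k + size] for k in range(0, len(indices), size)]


def pb_column_sparse(j, coords_km, length_km, radius_km, var_b=1.0):
    """Step 4: one localized column of Pb, kept as {row index: value}."""
    column = {}
    for i, x in enumerate(coords_km):
        r = abs(x - coords_km[j])
        if r <= radius_km:                 # variational localization
            column[i] = var_b * math.exp(-(r / length_km) ** 2)
    return column


def compute_pbht(ht_rows, coords_km, length_km=80.0, radius_km=300.0, chunk=1):
    """Steps 1-6 in serial; returns sparse rows of PbHT and the columns computed."""
    n_obs = len(ht_rows[0])
    pb_cols = {}
    for section in chunks(needed_columns(ht_rows), chunk):
        for j in section:                  # step 5: gather every subsection
            pb_cols[j] = pb_column_sparse(j, coords_km, length_km, radius_km)
    # Step 6: PbHT[i][k] = sum over j of Pb[i][j] * HT[j][k], stored columns only.
    pbht = [dict() for _ in coords_km]
    for j, column in pb_cols.items():
        for k in range(n_obs):
            weight = ht_rows[j][k]
            if weight != 0:
                for i, value in column.items():
                    pbht[i][k] = pbht[i].get(k, 0.0) + value * weight
    return pbht, sorted(pb_cols)


coords = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0]       # six gridpoints, in km
ht = [[1, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 0]]   # observations at points 1 and 5
pbht, cols = compute_pbht(ht, coords)
print(cols)        # only two of the six columns of Pb were ever computed
```

In this toy case only the first and fifth columns of the covariance are computed, mirroring the example the text attributes to Figure 3.3, and gridpoints beyond the 300 km radius of an observation receive no entry at all.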
Although the speedup of the UU2DVar's computation of $\mathbf{P}_b \mathbf{H}^T$ and $\mathbf{H} \mathbf{P}_b$ is not perfect, the parallel implementation of this code still scales reasonably well and allows the entire tool to be run within real-time constraints. Computation of a single set of $\mathbf{P}_b \mathbf{H}^T$ and $\mathbf{H} \mathbf{P}_b$ arrays takes approximately 4 minutes on 8 processor cores for approximately 15,000 observations.

Background, Terrain, and Land/Water Mask

The UU2DVar is designed to use the background fields, topography, and land/water mask of the RTMA. This background field consists of the 12-km resolution Rapid Update Cycle (RUC; Benjamin et al. 2004) 1 h forecast downscaled to either 2.5-km or 5-km resolution, depending on the resolution of the analysis. The downscaling process of the RUC background field attempts to modify the meteorological fields based on differences between the 12-km and the 2.5-km or 5-km terrain; this process is fully described by Benjamin et al. (2007) and Jascourt (2007). The terrain field used in this research for computing the background error covariance is modified from its original format; the elevation of gridpoints that are classified as water points as specified by the land/water mask is lowered 500 m for the 2-m temperature and 2-m dewpoint background error covariance computation. This technique is similar to that used by the RTMA. The terrain field and land/water mask of the entire domain used in this study are shown in Figure 3.6. Input of the background fields is accomplished through a Network Common Data Form (NetCDF; Rew and Davis 1990) interface within MATLAB. The background fields can also be retrieved from the University of Utah server running a Thematic Realtime Environmental Distributed Data Services (THREDDS; Caron et al. 2006) server. Although CONUS domain background fields have been used in this research, the UU2DVar has also been configured to use background fields over an Austrian domain for comparison to the INCA system (Haiden et al. 2011).
Usage of Observations and Quality Control

Observations of 2-m temperature and 10-m $u$ and $v$ winds are used by the UU2DVar without any additional pre-processing and are assimilated in terms of their metric units (°C for temperature, m/s for winds). Pressure observations are assimilated in mb; however, the UU2DVar can either assimilate the raw observation or apply an elevation correction term, as is sometimes necessary when there are large differences between the observation elevation and the analysis gridpoint elevation. This pressure correction modifies an individual raw surface pressure observation ($y_p$) to a corrected surface pressure ($y_p^c$) by using the hypsometric equation:

$$y_p^c = y_p \exp\left[ \frac{g}{R \cdot y_T} \left( y_z - x_z \right) \right] \tag{3.4}$$

where $R$ is the ideal gas constant for dry air, $g$ is the constant of gravitational acceleration, $y_T$ and $y_z$ correspond to the 2-m temperature and elevation of the pressure observation, respectively, and $x_z$ is the elevation of the nearest analysis gridpoint to the observation. Since moisture is analyzed in terms of dewpoint temperature, mixing ratio values provided by some sources are converted to relative humidity to be consistent with the majority of mesonet observations available in terms of relative humidity. Because surface pressure is not available for all reports and to provide a consistent conversion from relative humidity to dewpoint temperature, relative humidity observations ($y_{RH}$) are converted to dewpoint temperature observations ($y_{T_d}$) using an empirical formula:

$$y_{T_d} = \left( y_{RH} \right)^{1/8} \left( 112\,^{\circ}\mathrm{C} + 0.9\, y_T \right) + 0.1\, y_T - 112\,^{\circ}\mathrm{C} \tag{3.5}$$

The UU2DVar can be configured to use one of three different observation sources: (1) the MesoWest database (Horel et al. 2002), (2) the observation data file used by the RTMA, or (3) a flat file generated by the user. Observations acquired using the UU2DVar's MesoWest interface must fall within a ±30 min time window centered about the analysis hour.
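The two conversions in Equations 3.4 and 3.5 can be sketched as below. Several details here are assumptions rather than statements about the UU2DVar: the constant values are standard textbook ones, the 2-m temperature is converted to kelvin before it enters the hypsometric equation, and relative humidity is expressed as a fraction between 0 and 1. The function names are illustrative.

```python
import math

G = 9.80665       # gravitational acceleration, m s^-2 (assumed standard value)
R_DRY = 287.05    # gas constant for dry air, J kg^-1 K^-1 (assumed standard value)


def corrected_pressure(p_obs_mb, t_obs_c, z_obs_m, z_grid_m):
    """Eq. 3.4: move a surface pressure report from the station elevation to the
    elevation of the nearest analysis gridpoint."""
    t_kelvin = t_obs_c + 273.15          # assumption: absolute temperature is used
    return p_obs_mb * math.exp(G * (z_obs_m - z_grid_m) / (R_DRY * t_kelvin))


def dewpoint_from_rh(rh_fraction, t_c):
    """Eq. 3.5: empirical dewpoint (deg C) from relative humidity and temperature;
    rh_fraction is assumed to lie between 0 and 1."""
    return rh_fraction ** 0.125 * (112.0 + 0.9 * t_c) + 0.1 * t_c - 112.0


# A station 100 m above its gridpoint: the corrected pressure is higher.
print(round(corrected_pressure(850.0, 10.0, 1600.0, 1500.0), 1))
# Saturated air has a dewpoint equal to the temperature; drier air is lower.
print(round(dewpoint_from_rh(1.0, 20.0), 2), round(dewpoint_from_rh(0.5, 20.0), 2))
```

The dewpoint formula needs only temperature and relative humidity, which is why it can be applied to reports that lack a pressure observation.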
The UU2DVar uses the time window used by the RTMA for observations acquired from the RTMA's observations data files. The UU2DVar does not use a time window for the usage of an observation flat file; therefore, the time window used by the analysis is configurable by the user in this instance. For all three of these configuration options, only one observation is used per station. In the case of stations that record several observations within the time window, only the observation closest to the analysis hour is used. If two observations from a station are separated by an equal amount of time about the analysis hour, the later observation is used. All observations also undergo quality control during the assimilation cycle within the UU2DVar. Temperature, dewpoint, and pressure observations undergo a simple quality control check which rejects any observations that fail to meet the criterion

$$\left| y_o - x_b \right| \le \varepsilon \, \mathrm{stdev}(\mathbf{x}_b) \tag{3.6}$$

where $\varepsilon$ refers to a tunable error multiplier factor and the function stdev is the standard deviation of the background field over the entire domain (note that, as in Equation 3.4, $x_b$ is the single background value nearest to the observation). Although this quality control may be rudimentary, it is effective in removing gross errors from the observation dataset. The error multiplier $\varepsilon$ is set to 3 for temperature, dewpoint, and wind speed in this research. An additional quality control step for wind observations is also available for the UU2DVar. Wind observations still must meet the requirements as specified by Equation 3.6 (the $u$ and $v$ wind components and wind speed must each satisfy the criterion of Equation 3.6, or the entire observation is rejected), but additional light wind observations can be rejected if

$$y_{spd} < \tau_o \cup x_{spd} > \tau_b \tag{3.7}$$

where $y_{spd}$ is the observation wind speed, $x_{spd}$ is the value of the nearest background gridpoint to the observation, and $\tau_o$ and $\tau_b$ are wind speed observation and background quality thresholds, respectively.
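The station report selection rule and the gross error check of Equation 3.6 can be expressed compactly. This sketch makes two assumptions that the text does not settle: the standard deviation is the population form, and reports are given as offsets in minutes from the analysis hour. The function names and sample values are illustrative.

```python
import statistics


def pick_report(reports):
    """Keep one report per station: the one closest to the analysis hour, and
    the later one when two are equally close.

    reports is a list of (minutes_from_analysis_hour, value) tuples.
    """
    return min(reports, key=lambda rep: (abs(rep[0]), -rep[0]))


def passes_gross_check(obs, nearest_background, background_field, multiplier=3.0):
    """Eq. 3.6: accept when |y_o - x_b| <= multiplier * stdev(background field)."""
    limit = multiplier * statistics.pstdev(background_field)
    return abs(obs - nearest_background) <= limit


station = [(-10, 11.2), (10, 11.9), (25, 12.4)]
print(pick_report(station))                            # tie at 10 min: later wins

background = [8.0, 10.0, 12.0]
print(passes_gross_check(14.0, 10.0, background))      # within 3 standard deviations
print(passes_gross_check(15.0, 10.0, background))      # rejected as a gross error
```

Because the threshold scales with the variability of the whole background field, the check is loose on days with strong gradients across the domain, which is consistent with its description as a rudimentary gross error filter.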
This additional quality control check helps to identify erroneously calm winds where the background field is specifying stronger winds. Although the asymmetric wind quality control was not used in the research presented here, it is mentioned here to present a complete description of the UU2DVar. A minimum and maximum threshold quality control can also be applied to pressure observations within the UU2DVar. When this quality control is used, pressure observations that are used by the analysis must meet the criterion

$$p_{\min} \le y_p \le p_{\max} \tag{3.8}$$

where $p_{\min}$ and $p_{\max}$ are minimum and maximum surface pressure thresholds (as specified by the user), respectively, and $y_p$ is an observation's surface pressure. This quality control was added after evaluation of several pressure analyses, as the quality control specified by Equation 3.6 fails to remove many unphysical surface pressure observations. Surface pressure analyses are not studied within this research, but this particular quality control is mentioned here as well to present a complete description of the UU2DVar. A simple forward operator is used to interpret analysis values to observation locations, as well as a simple observation error covariance to assimilate the observations. The forward operator, $\mathbf{H}$, interprets analysis values using a nearest neighbor approach. As mentioned earlier in this chapter, the simplicity of this forward operator allows it to be exploited as a filter to reduce the storage required for the background error covariance matrix. The observation error covariance matrix, $\mathbf{P}_o$, is simply a diagonal matrix, and in this research, the observation error variance is set to 1°C for temperature and dewpoint and to 1 m/s for $u$ and $v$ wind components and wind speed. Setting all off diagonal elements of $\mathbf{P}_o$ to 0 assumes that all observation errors are uncorrelated with each other; however, this may not be true for all observations.
Errors of observations within individual mesonets may actually be correlated with each other through siting or instrumentation biases of a particular mesonet. As will be discussed in Chapter 6, it is possible to use the methodology presented in Chapter 4 to help determine mesonet biases on the basis of large samples of analyses. Using the same value for all diagonal elements of the observation error covariance matrix also implies that all observations have equal observation errors, which may also not be accurate. A particular mesonet may have significantly higher observation errors than others; the same may also apply for an individual observation when compared to other observations within the mesonet. While it is straightforward to implement varying observation errors dependent upon the mesonet or the individual observation in the UU2DVar, the tuning of $\mathbf{P}_o$ requires extensive research and additional testing that was beyond the scope of this study.

UU2DVar General Characteristics and Analysis Cycle

As a result of the simple specification of the background error covariance by Equation 3.1, the UU2DVar produces univariate surface analyses. Because wind is a vector measurement, it was believed that assimilating $u$ and $v$ wind components separately would yield analyses that would be less accurate than analyses generated using a multivariate method. The approach adopted here involves generating $u$ and $v$ unit wind component analysis fields, along with analyses of wind speed. The vector wind field is produced by multiplying the unit $u$ and $v$ wind component analyses by the wind speed analysis. In the case of the wind speed analysis, the rare negative values within the analysis are set to 0 after the analysis has been computed. The approximations and parallelization techniques listed in this chapter allow the UU2DVar to generate 2-m temperature, 2-m dewpoint, surface pressure, and 10-m $u$ and $v$ wind component analyses in approximately 25 minutes when run on a compute node of 8 processors.
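The recombination of the three univariate wind analyses into a vector wind can be sketched per gridpoint as follows. The function name and the sample values are illustrative; the only behavior taken from the text is the multiplication of the unit components by the speed and the resetting of negative analyzed speeds to 0.

```python
def vector_wind(u_unit, v_unit, speed):
    """Combine analyzed unit wind components with the analyzed wind speed at one
    gridpoint, after setting a negative analyzed speed to 0."""
    speed = max(speed, 0.0)
    return u_unit * speed, v_unit * speed


# Three gridpoints: (unit u, unit v, analyzed speed in m/s).
gridpoints = [(0.6, 0.8, 5.0), (-1.0, 0.0, 2.5), (0.6, 0.8, -0.2)]
winds = [vector_wind(u, v, s) for u, v, s in gridpoints]
print(winds)
```

The last gridpoint shows the effect of the clipping step: a slightly negative analyzed speed yields a calm wind rather than a reversed one.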
This is comparable to the computation cycle of the RTMA, which needs approximately 15 minutes when run on 16 processors on the NCEP development supercomputer (M. de Pondeca, personal communication). Computing the observation sensitivity and observation impact (which are defined in the next chapter) for all fields adds 8 minutes to the UU2DVar's computation cycle. The complete computation cycle, along with the parallelization scheme across the entire cycle, is depicted in Figure 3.7.

While many data assimilation tools and systems, such as the RTMA, are written in FORTRAN, the UU2DVar is written in MATLAB. The MATLAB programming language offers several advantages over FORTRAN compilers commonly supported at research universities. MATLAB has built-in support for sparse matrix mathematics and optimizations through the Linear Algebra Package (LAPACK; Anderson et al. 1999) and the Basic Linear Algebra Subprograms (BLAS; Dongarra et al. 1988). Relative to the commonly used Message Passing Interface (MPI), MATLAB's parallel computing implementation is also easier to use since MATLAB does not require a developer to explicitly code communication and work distribution between processors. The MATLAB programming language also offers intrinsic subroutines essential for solving the variational assimilation equation, including an efficient GMRES function used to solve Equation 2.18. The cross-platform compiling abilities of MATLAB allowed the UU2DVar to be developed and tested on two different operating systems (Windows and Linux) with very little additional development work. The ability to run the compiled MATLAB code using the freely available MATLAB Compiler Runtime also makes the UU2DVar available on systems without MATLAB licenses. MATLAB also offers built-in support of the standard NetCDF file format, which allows efficient input/output of the background and analysis.
Although there were initially some reservations regarding the computational overhead requirements of MATLAB, this research has demonstrated those reservations to be unfounded, as the UU2DVar's wall clock time is comparable to that of the FORTRAN-based RTMA.

Figure 3.8 contrasts the analysis increments (analysis minus background) for 2-m air temperature at 1400 UTC 27 October 2010 for the UU2DVar (Figure 3.8a) and the RTMA (Figure 3.8b). For this example, the decorrelation length scales used by the UU2DVar have been set to match the RTMA's equivalent decorrelation length scales, and the same observation dataset as the RTMA is used. Figure 3.8c shows the difference field between the RTMA and the UU2DVar analysis increments. Many of the differences between the two analyses are in regions of orography, i.e., along the Sierra Nevada, Rocky, and Appalachian mountain ranges. While differences in the quality control procedures may cause some differences in the increments, the majority of the differences are due to the coarse computation grid used by the recursive filters within the RTMA as opposed to the approach used by the UU2DVar. The coarse computation grid requires specifying a smoothed terrain for the background error covariance (M. de Pondeca, personal communication), which causes many of the minor terrain features to have limited impact on the computation of analysis increments within the RTMA.

Figure 3.1. Horizontal and vertical decorrelation length scales specified by Equations 3.2 and 3.3.

Figure 3.2. Difference (shaded) of IDI fields of temperature observations without and with localization. Grey contours denote 500 m terrain contours and black squares denote locations of temperature observations.

Figure 3.3. Pictorial demonstration of exploitation of the forward operator using the example presented in Figure 2.1. Only a fraction of the columns of $\mathbf{P}_b$ (shaded) actually need to be calculated to compute $\mathbf{P}_b\mathbf{H}^T$; these columns correspond to nonzero rows of $\mathbf{H}^T$.

Figure 3.4. Slicing of background error covariance matrix for computations and storage, using a domain of 20 gridpoints on 2 processors with numpbrowscompatonce set to 4. Distribution of the work to the processors is based on an equal division of the number of columns that must be computed (light red and light blue), and not on a division of the entire covariance matrix itself. Dark red and dark blue columns of the background error covariance are not computed. Each processor works on computing two columns at a time (outlined in green) using full matrices before converting to a sparse matrix type.

Figure 3.5. Speedup of background error covariance computation (thin black line), plotted against perfect speedup (thick grey line).

Figure 3.6. CONUS 5-km terrain and land/water mask used by UU2DVar in this study. Gridpoints in blue denote water grid points.

Figure 3.7. Computation cycle and parallelization within UU2DVar, computing temperature, dewpoint, wind, and surface pressure 5-km resolution analyses for the CONUS domain on 8 processors using the MesoWest observation dataset valid for 1400 UTC 27 October 2010. Times for each subtask were computed from averages of 10 trials, rounded to the nearest 5 s.

Figure 3.8.
UU2DVar and RTMA 2-m air temperature analysis increments, and the difference between the two analysis increments for 1400 UTC 27 October 2010. a. UU2DVar analysis increments, using decorrelation length scales approximately equal to those used by the RTMA. Analysis increments shaded in °C. b. RTMA analysis increments in °C. c. Difference between RTMA and UU2DVar analysis increments in °C.

CHAPTER 4

OBSERVATION IMPACTS AND SENSITIVITY

Introduction

The National Research Council (2009) discussed the need for assessing the extent to which observations collected from disparate sources can be used to meet a variety of needs. Building on prior work (Myrick and Horel 2008; Horel and Dong 2010; Tyndall et al. 2010), this study is aimed at providing a better foundation for addressing the sensitivity and impacts of observations as a function of observation source. The objective of this chapter is to describe an appropriate approach and the data sources that will be evaluated in the next chapter.

Computation

Withholding a subset of observations from analyses and comparing the differences between the withheld observations and the resulting analyses is commonly used to assess analysis accuracy and uncertainty as well as the impacts of different types of observations on the analysis (Seaman and Hutchinson 1985; Zapotocny et al. 2000; Hiemstra et al. 2006; Myrick and Horel 2008; Tyndall 2008; Horel and Dong 2010; Tyndall et al. 2010). The choice of which observations to withhold depends on the application. For example, Zapotocny et al. (2000) differentiated by instrumentation type, Hiemstra et al. (2006) used observations from networks generally not used operationally in their data assimilation system, while Myrick and Horel (2008) randomly withheld 30% of all surface observations.
Horel and Dong (2010) applied the most extensive approach by sequentially withholding each of ~3000 observations from each of ~9000 analyses resulting in over 500,000 cross-validation experiments. This leave-one-out cross validation approach (Wilks 2006) is far too computationally expensive to be used for real-time applications.

This research utilizes the analysis adjoint to efficiently compute analysis sensitivity to observations without the need to perform cross validation experiments. Adjoints of forecast models are now used routinely to assess where "targeted" observations might reduce model forecast errors (Palmer et al. 1998; Buizza and Montani 1999; Langland et al. 1999). Following the derivation by Baker and Daley (2000), Equations 2.18 and 2.19, which are used to compute the analyses within the UU2DVar, can be combined into a single equation:

$$\mathbf{x}_a = \mathbf{x}_b + \mathbf{P}_b\mathbf{H}^T\left(\mathbf{H}\mathbf{P}_b\mathbf{H}^T + \mathbf{P}_o\right)^{-1}\left(\mathbf{y}_o - \mathbf{H}\mathbf{x}_b\right) \tag{4.1}$$

The adjoint of an analysis system can be viewed as the adjustment to the background field by the observations necessary to yield the resulting analysis, i.e., solving for the right-most term in Equation 4.1 (Kalnay 2003). The sensitivity of the analysis to the observations is calculated from the derivative of Equation 4.1 with respect to the observation vector and through use of the chain rule:

$$\frac{\partial \mathbf{x}_a}{\partial \mathbf{y}_o} = \left[\mathbf{P}_b\mathbf{H}^T\left(\mathbf{H}\mathbf{P}_b\mathbf{H}^T + \mathbf{P}_o\right)^{-1}\right]^T = \mathbf{K}^T \tag{4.2}$$

$\mathbf{K}^T$ is the transpose of the weight matrix $\mathbf{K}$, which can be simplified using the distributive property of the transpose operator and the symmetry of the background and observation error covariance matrices to yield:

$$\mathbf{K}^T = \left(\mathbf{H}\mathbf{P}_b\mathbf{H}^T + \mathbf{P}_o\right)^{-1}\mathbf{H}\mathbf{P}_b \tag{4.3}$$

A cost function $J$ (different from the cost function in Equation 2.1 used as the foundation for variational analysis) is specified that is a measure of a quantity of interest within the analysis domain.
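Equations 4.1 to 4.3 can be checked numerically on a toy problem. The sketch below is a plain Python illustration under stated assumptions, not the UU2DVar implementation: the three-gridpoint domain, the covariance values, the helper functions, and the direct 2 × 2 inverse are all hypothetical choices made to keep the example self-contained (the UU2DVar itself solves the system iteratively with GMRES). Here `Pb` and `Po` denote the background and observation error covariances, `H` the forward operator, `xb` the background, and `yo` the observations.

```python
def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse_2x2(M):
    """Closed-form inverse; adequate only for this two-observation toy case."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical setup: 3 grid points, observations at grid points 0 and 2.
Pb = [[1.00, 0.50, 0.25],
      [0.50, 1.00, 0.50],
      [0.25, 0.50, 1.00]]
Po = [[1.0, 0.0], [0.0, 1.0]]
H = [[1, 0, 0], [0, 0, 1]]
xb = [[10.0], [12.0], [14.0]]
yo = [[11.0], [13.0]]

Ht = transpose(H)
S_inv = inverse_2x2(add(matmul(matmul(H, Pb), Ht), Po))   # (H Pb H^T + Po)^-1
K = matmul(matmul(Pb, Ht), S_inv)                         # weight matrix K
innovation = [[y[0] - hx[0]] for y, hx in zip(yo, matmul(H, xb))]
xa = add(xb, matmul(K, innovation))                       # Equation 4.1

Kt_from_transpose = transpose(K)                          # Equation 4.2
Kt_simplified = matmul(S_inv, matmul(H, Pb))              # Equation 4.3
max_diff = max(abs(p - q)
               for rp, rq in zip(Kt_from_transpose, Kt_simplified)
               for p, q in zip(rp, rq))

print([round(row[0], 3) for row in xa])
print(max_diff < 1e-12)
```

With these values the two innovations are +1 and -1, so the analysis is drawn toward each observation at the end points while the increment at the middle grid point cancels. The agreement between the two forms of the transposed weight matrix relies on the symmetry of both covariance matrices, as noted in the derivation.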
Forecast model adjoint sensitivity studies may choose a parameter such as air temperature or sea level pressure over a limited domain of interest in order to highlight what additional targeted observations might help reduce the forecast error of that parameter in that region (Langland et al. 1999; Baker and Daley 2000; Zhu and Gelaro 2008). Since the objective of this research is to assess analysis sensitivity to differences between observation networks, the cost function $J$ is defined with respect to the entire analysis domain as the squared differences between the analysis and the background field:

$$J = \frac{1}{2}\left(\mathbf{x}_a - \mathbf{x}_b\right)^2 \tag{4.4}$$

The observation sensitivity vector, $\partial J / \partial \mathbf{y}_o$, is computed using Equation 4.2 and the chain rule:

$$\frac{\partial J}{\partial \mathbf{y}_o} = \frac{\partial \mathbf{x}_a}{\partial \mathbf{y}_o}\frac{\partial J}{\partial \mathbf{x}_a} = \mathbf{K}^T\left(\mathbf{x}_a - \mathbf{x}_b\right) \tag{4.5}$$

The observation sensitivity is defined at each observation location and does not depend on the specific values of observations at those locations (Baker and Daley 2000), i.e., it is a measure of the sensitivity of the analysis to having observations at that location, not the sensitivity to the specific observation reported at that location. The observation impact, as defined by Zhu and Gelaro (2008), considers the value of observations as well as their locations:

$$\text{observation impact} = \frac{1}{2}\left\langle \frac{\partial J}{\partial \mathbf{y}_o},\; \mathbf{y}_o - \mathbf{H}\mathbf{x}_b \right\rangle \tag{4.6}$$

Since the observation impact is the scalar product of the observation sensitivity and the observation innovations, the contribution of specific observations to the analysis can be assessed.

Observation Networks

As a test of the methodology described in the previous subsection, observations are examined for 1400 UTC 27 October 2010. The analysis is restricted to observations accessible via MesoWest that are publicly available via the Meteorological Assimilation and Data Ingest System (MADIS; Miller et al. 2005). With permission of the data providers, observations available to MADIS that are subject to usage restrictions from the Oklahoma Mesonet and WeatherFlow Inc. were added for this analysis as they reflect networks with very good maintenance, equipment, and siting standards.
A total of 13,763 temperature, 11,201 humidity, and 11,728 wind observations were available for this analysis.

As discussed by the National Research Council (2009), metadata on the equipment, siting, and reporting standards used by many data providers are incomplete. The MesoWest developers identify mesonets by their source, which in the cases of large networks is often in turn an aggregate of many different sources. For the purposes of this study, the source networks are grouped into 10 general categories based subjectively on the characteristics of the networks known to the MesoWest developers. Figure 4.1 depicts the locations of each observation available at this specific time, grouped by mesonet and network category. Providing such information routinely is one of the recommendations of the National Research Council (2009). Horel et al. (2002) provide a description of observing networks available via MesoWest, but because those descriptions may no longer be up to date and because many new networks have been added to MesoWest since the publication of that article, the networks are described here for completeness.

Figure 4.1a shows station locations that are classified as primarily agricultural (AG). The AG networks monitor standard meteorological parameters and often report soil temperature and moisture as well. Wind sensors are typically mounted at 3 m to facilitate surface evaporation estimates. The majority of this network category is made up of observations from the Soil Climate Analysis Network (Schaefer and Paetzold 2001), the California Irrigation Management Information System (Snyder 1984), and the U.S. Bureau of Reclamation AgriMet (U.S. Bureau of Reclamation 2011). Most of the observations within this category are located within well irrigated areas and are collected in real time.

Air quality monitoring networks are aggregated into the AQ network category, as shown in Figure 4.1b.
The AIRNow network combines air quality stations from many state and local agencies. MesoWest and MADIS continue to access selected air quality networks directly that do not provide their complete suite of weather data to the AIRNow program.

Figure 4.1c depicts stations that are primarily external (EXT network category) to the contiguous United States. This category includes observations from Environment Canada, which is the Canadian equivalent of the United States' NWS and Federal Aviation Administration (FAA) observing network. As with many networks, there are increasing concerns about the quality of these observations due to siting and quality control issues (Environment Canada 2008). Observations from Mexico are provided by the Servicio Meteorológico Nacional (SMN) de México. The SMN network is a synoptic-scale observation network. The limited available documentation suggests there are quality control and reliability issues also associated with this network (Servicio Meteorológico Nacional 2011). Observations from network providers that are primarily located along the coast or offshore (with a few interior exceptions from the commercial WeatherFlow network) are also included in this category, with the majority provided by the National Oceanic and Atmospheric Administration.

The FED network category consists of land-based observations from federal agencies, excluding observations maintained by the NWS/FAA and those used primarily for agricultural, hydrological, fire weather, or air quality purposes. As shown in Figure 4.1d, most of these observations come from local and regional networks, except for the nationwide MADIS Non-commissioned Automated Weather Observing System and the Climate Reference Network. Generally all of these networks have well-defined siting and maintenance procedures, but it should be recognized that even the highest quality networks can have standards different from what many users might expect.
For example, wind direction is not available from the Climate Reference Network since the wind speed sensor at 3 m is intended to estimate undercatch of precipitation (National Environmental Satellite, Data and Information Service 2003).

Hydrometeorological networks are grouped into the HYDRO network category (Figure 4.1e). The majority of observations are supplied by the Hydrometeorological Automated Data System (HADS), which is itself an aggregate of stations owned and maintained by many different agencies (Office of Hydrologic Development 2011). Many more stations in the HADS network report precipitation only and do not appear in this figure; the ones shown here report at least air temperature. The Snowpack Telemetry network observations (SNOTEL; Schaefer and Paetzold 2001) of the Natural Resources Conservation Service are a very important resource due to their locations generally at high elevation within the western United States. Due to the meteor burst data communication required for these remote locations, observations often are not available until a few hours after the valid time, and may not be available for the RTMA. In addition to precipitation measurements, all SNOTEL stations collect 2-m air temperature.

Stations supplied to MesoWest and MADIS by a number of local, state, and regional sources are aggregated into the LOCAL network category (Figure 4.1f). The largest network within this category is the Oklahoma Mesonet (Brock et al. 1995), which has been described as the "most prominent state mesonet" in the country (National Research Council 2009) for its high quality instrumentation, siting practices, and documentation of observation metadata (Brock et al. 1995). The West Texas Mesonet (Schroeder et al. 2005) was modeled after the Oklahoma Mesonet and follows similar instrumentation and siting protocols.
Some of the other LOCAL networks are aggregates of stations, often including a mix of stations directly maintained by the network provider as well as stations managed by other local data providers. For example, there is considerable advantage to having WFOs work with local data providers to locally access their observations and then disseminate those observations to MesoWest and MADIS for other users.

Figure 4.1g depicts the NWS network category, which is composed of Automated Surface Observing System (ASOS; National Oceanic and Atmospheric Administration 1998) and Automated Weather Observing System (AWOS; Federal Aviation Administration 2011) stations. The majority of these observations are located in the urban regions of the eastern United States. All stations within this network category are commissioned by the NWS and FAA (although AWOS observations are maintained by state or local agencies), and must meet ASOS or AWOS equipment and siting standards.

Observations from the Automatic Position Reporting System Weather Network/Citizen Weather Observing Program (APRSWXNET/CWOP; Gladstone 2000) make up the majority of the PUBLIC network category as shown in Figure 4.1h. APRSWXNET/CWOP stations are typically owned, installed, and maintained by private citizens with varied siting and reporting practices (Chadwick 2005). Concerns over the quality of observations from the APRSWXNET/CWOP network have been raised in the past, especially due to representativeness errors associated with siting issues (Tyndall et al. 2010). While Tyndall et al. (2010) showed that temperature observations appeared to be of similar quality to observations from the NWS network, concerns remain regarding wind measurements due to the frequent occurrence of nearby obstructions. The asymmetric wind quality control described in Chapter 3 was developed to mitigate some of these issues.
The Remote Automated Weather Stations (RAWS) network (Figure 4.1i) is designed for wildfire management applications and supported by the U.S. Forest Service and many other federal, state, and local land and wildfire management agencies (Zachariassen et al. 2003; Horel and Dong 2010). RAWS are often located in remote locations, preferably on slight south-facing slopes with limited nearby vegetation. Wind sensors are located at 6 m instead of the 10-m anemometer height used by their NWS counterparts. The lower anemometer heights and 10 minute averaging interval have contributed to the perception that their wind speeds are less than what might be expected, leading to many RAWS being excluded from the RTMA.

Finally, Figure 4.1j depicts the collective availability of stations located adjacent to roads and railways (TRANS network category). Many Union Pacific Railroad stations report 2-m air temperature only, as their primary interest for deploying the equipment is related to monitoring the expansion and contraction of the rails. Road Weather Information Systems (RWIS) generally report all standard meteorological variables as well as other measurements from additional road sensors.

Figure 4.1. Observations available for 1400 UTC 27 October 2010 analyses for each of the 10 network categories used in this research. Quantities in parentheses denote the number of observations each individual network contributed. a. AG network category. b. AQ network category. c. EXT network category. d. FED network category. e. HYDRO network category. f. LOCAL network category. g. NWS network category. h. PUBLIC network category. i. RAWS network category. j. TRANS network category.
CHAPTER 5

RESULTS AND DISCUSSION

Description of Case Study

This research relies on the 1400 UTC 27 October 2010 CONUS analysis to demonstrate the methodology presented in Chapter 4. This particular analysis occurred during a period of extremely active weather for the eastern United States caused by an extratropical cyclone that progressed across the Great Lakes region prior to the case study time period (Figure 5.1). This cyclone was one of the strongest noncoastal low pressure systems observed within the United States, with the lowest sea level pressure (955.2 mb) recorded at Big Fork, MN. The storm was accompanied by sustained winds in excess of 20 m/s over the northwestern Great Lakes and Dakotas regions and nearly 5 inches of rain in Minnesota. The cold front, which extended south from the low center, brought severe thunderstorms and tornadoes from the southern Great Lakes region down through the southeast United States.

Figure 5.2 shows the downscaled RUC background fields used in the case study for 2-m air temperature, 2-m dewpoint temperature, and 10-m wind speeds. The stationary and cold fronts are evident in the background fields in terms of gradients in temperature, moisture, and wind speed from central Mississippi to Virginia and from Pennsylvania into Canada. The high winds associated with the cyclonic circulation across the Dakotas and Great Lakes region are quite evident as well.

As mentioned in Chapter 4, there are more than 10,000 observations available to adjust the background fields. Figure 5.3 depicts those corrections to each of the background fields, as well as the resulting analyses using the control case (M = 0 for both the horizontal and vertical decorrelation length scales from Figure 3.1). The adjustments to the background field appear relatively modest on the scale of the entire continental United States when comparing the final analyses (Figure 5.3b, 5.3d, and 5.3f) to the comparable background grids in Figure 5.2.
However, the analysis increments shown in Figure 5.3a, 5.3c, and 5.3e are substantive throughout much of the analysis domain, reflecting the impact of including the observations. The background error decorrelation length scales used in this control case are designed to allow observation innovations to influence the analyses over relatively broad areas (~100 km) in areas without significant topographic relief. Positive (orange) analysis increments denote where temperatures or wind speeds in the background are too low, while negative (purple) increments indicate where the background fields are too high. The largest analysis increments are concentrated near the areas of high impact weather (the strong extratropical cyclone over the Great Lakes and the stationary front in the southeast United States) as well as over the complex terrain of the western United States.

Figure 5.3a shows that the background field underestimates the intensity of the stationary front across the southern states while overestimating the intensity of the cold front across New York. The background tended to be too cold throughout much of the Midwest and southern Plains, with complex adjustments in temperature across the western United States. The RUC background tended to be too moist in the upper Midwest and too dry over the Dakotas and most of Texas (Figure 5.3c). The wind speed analysis increments (Figure 5.3e) exhibit a general tendency to analyze lower wind speeds nearly everywhere, with the most significant adjustments to the background field in the areas of high winds associated with the cyclone as well as on the warm side of the stationary front in the southeast.

Considerable uncertainty exists as to whether the general tendency for negative wind increments when surface wind observations are used here or in the RUC or RTMA results from the complex mix of issues related to wind sensor siting as well as the representativeness of those observations in forested areas and areas of complex terrain.
The solution adopted for the RUC and RTMA has been to restrict severely the mesonet wind observations used in those analysis systems (de Pondeca et al. 2011). All possible wind observations were used in this study specifically to help investigate this issue. Evaluation of this case, as well as many others not shown here, suggests that wind observations assigned here to the PUBLIC category have a negative speed bias likely due to siting. However, the large wind speed innovations leading to the large analysis increments in Figure 5.3e in the Great Lakes region are found not only in the PUBLIC category, but in nearly all network categories. Hence, this tendency for the analysis wind speeds to be less than the background wind speeds may reflect insufficient downscaling of the RUC winds to conditions appropriate for the 10-m level.

Evaluation of Data Density Constraints

Figure 5.4 shows the IDI fields for 1400 UTC 27 October 2010 for 2-m air temperature, 2-m dewpoint, and 10-m wind speed using the horizontal and vertical background error decorrelation length scales of 80 km and 200 m, respectively, that are used for the control analyses shown in Figure 5.3. (The IDI field for temperature in Figure 1.1 used horizontal and vertical background error decorrelation length scales of 40 km and 100 m that are used by the RTMA.) The IDI serves as a metric of observational data density but depends on the specifications of the observational and background error covariances (Equation 3.1). Not surprisingly, observational coverage as defined by the IDI in the eastern United States is greater than that in the western United States due to the smaller number of stations in the west as well as the assumption that background errors at two gridpoints become less related to one another when the two gridpoints are separated in elevation.
Although not particularly evident in Figure 5.4, IDI values in the west for wind tend to be a bit smaller than for temperature or dewpoint temperature since many hydrologically oriented networks (e.g., SNOTEL, Figure 4.1e) in the western United States are not equipped with anemometers. Many of the very small apparent data voids in the eastern United States in Figure 5.4a and 5.4b result from the assumption that background errors on- and off-shore of water bodies (whether oceans or small lakes) are unrelated to one another for temperature and moisture (Chapter 3). This land/water contrast assumption is not used for the wind background error covariance matrix due to the presumption that the background errors remain well correlated across coastlines (M. de Pondeca, personal communication).

As discussed in Chapter 3, the IDI field can be used as a constraint on the analysis such that the background errors in data rich regions are assumed to be less correlated with one another, hence allowing smaller scale deviations between the observations and the background to be reflected in the final analysis. The background errors in data voids can be assumed to be more correlated with one another, allowing deviations between the sparse observations in such regions and the background to be felt over broader distances. In the context of Figure 5.4, the decorrelation length scales are narrowed for gridpoints that are dark blue for the M = 1 and M = 2 polynomials described in Chapter 3, and broadened for gridpoints that are red for the M = 2 polynomial.

Figure 5.5 shows the increments and analyses for temperature, dewpoint, and wind speed using the M = 2 polynomial IDI constraint. Differences between Figures 5.3 and 5.5 can be seen in the increments of all fields from Texas to the northern Midwest, as well as from southern Appalachia to the northeast United States.
Smaller scale increments, collocated with areas of high observation density, are evident in Figure 5.5, such as in Wisconsin, Iowa, Texas, and western California, which lead to small-scale but large-magnitude differences between the analyses. These differences are not necessarily associated with a better analysis since narrowing the background error decorrelation length scales tends to lead to overfitting. As described by Daley (1991) for polynomials and Tyndall et al. (2010) for variational analyses, overfitting creates artificial maxima and minima that are not representative of the data, or in the case of variational analyses, the background field and the observations. In variational methods, the likelihood of overfitting errors appearing in data sparse areas increases as the analysis is constrained more tightly to the observations (Tyndall et al. 2010). Overfitting errors are especially evident in Figure 5.5e as an offshore band of much larger negative increments from Virginia to Maine relative to that seen in Figure 5.3e. This band and the smaller secondary negative increment band farther offshore are a result of the extreme gradient of observation density from the coast to the offshore zone. While not shown here, these overfitting errors also appear, not surprisingly, in the same region of the wind analysis if the M = 1 polynomial is used to compute the horizontal and vertical decorrelation length scales since that applies the same constraint in data rich areas.

Besides the overfitting errors presented above, bull's-eye features throughout the Great Lakes region in the wind analysis of Figure 5.5f result from weak wind speed observations where the background wind speeds are quite large. This apparent noise is lessened in the control wind analysis in Figure 5.3f, which suggests that the data density constraint, if applied for wind, must be accompanied by effective removal of unrepresentative observations as part of a quality control procedure.
As applied here, the broadening of the background error decorrelation length scales in data sparse regions does not appear to have a large impact on the analysis increment fields (compare Figure 5.3 and 5.5). Most of the analysis increments that appear to be affected are located along the Mexico border and along the United States coastline, where they arise from buoy observations. Although data density is low in many mountainous areas of the western United States, the apparent limited impact of broader decorrelation length scales in many of those areas is likely due to the continued dominance of the elevation constraint on the decorrelation length scale.

However, broadening the decorrelation length scales (using the M = 2 polynomial instead of the M = 1 polynomial) comes at significant computational expense as well. Increasing the variation localization threshold distance (Chapter 3) increases the computational time to compute the arrays $\mathbf{P}_b\mathbf{H}^T$ and $\mathbf{H}\mathbf{P}_b$ and significantly increases the memory requirements of the analysis. For example, computing $\mathbf{P}_b\mathbf{H}^T$ and $\mathbf{H}\mathbf{P}_b$ with the broadened decorrelation length scales for 2-m air temperature for this particular case on 8 processors requires an additional 4 minutes and triples the memory requirements. Hence, the potential benefits of increasing decorrelation length scales in data sparse areas must be weighed against these increased computational requirements.

Sensitivity and Impact to Observation Networks

As described in Chapter 4, observation sensitivity ($\partial J / \partial \mathbf{y}_o$) and observation impact were computed for every observation in the case study. Because of the large number of figures that would be required to describe the sensitivity and observation impacts for all variables and all networks, the approach will be demonstrated here in terms of observation impact for 2-m air temperature for the control case (Figure 5.3) and the M = 2 polynomial case (Figure 5.5).
The impact of dewpoint and wind speed observations is briefly summarized near the end of this section, followed by an even more cursory summary of observation sensitivity results. As specified by Equation 4.6, the observation impact is the product of an observation's innovation and its sensitivity. As a result of the cost function used in this study (Equation 4.4), the sensitivity depends on the analysis increment at the observation location. Hence, the impact will tend to be positive since the innovation and sensitivity will usually have the same sign, i.e., in the case of an isolated observation, a positive (negative) innovation will tend to lead to a locally positive (negative) increment. Negative impacts are likely to occur only where the deviation of an observation from the background differs in sign from, and has a large magnitude relative to, those of its neighbors, which may reflect either an observation in error or a realistic weather phenomenon on a scale smaller than that assumed a priori for the background errors. Since negative observation impacts are found to comprise only ~20% of the total and their magnitudes tend to be smaller than the corresponding positive ones, the impacts are ranked in terms of their absolute value from smallest to largest based on the entire sample of all observations. The methodology to evaluate observation impact is illustrated in Figure 5.6, which plots the observation impact percentile by observing network category for the control case (Figure 5.3). Figure 5.6 focuses on observations in the upper and lower quartiles (i.e., observations that had the most and least impact on the analysis, respectively).
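The ranking step described above can be sketched in a few lines of Python. This is only an illustration and not the UU2DVar code (which is written in MATLAB): the function names and the sample innovation and sensitivity values are hypothetical, and the choice to give tied magnitudes their average rank is an assumption.

```python
from typing import List


def observation_impact(innovations: List[float], sensitivities: List[float]) -> List[float]:
    """Impact of each observation: its innovation times its sensitivity."""
    if len(innovations) != len(sensitivities):
        raise ValueError("innovations and sensitivities must have the same length")
    return [d * s for d, s in zip(innovations, sensitivities)]


def percentile_rank_by_magnitude(values: List[float]) -> List[float]:
    """Percentile (0-100] of each |value| within the whole sample.

    Values are ranked from smallest to largest absolute value; tied
    magnitudes share their average rank (an assumption of this sketch).
    """
    n = len(values)
    mags = [abs(v) for v in values]
    order = sorted(range(n), key=lambda i: mags[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of observations tied with position i.
        while j + 1 < n and mags[order[j + 1]] == mags[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [100.0 * r / n for r in ranks]


# Hypothetical innovations (degrees C) and sensitivities for four stations.
impacts = observation_impact([2.0, -1.0, 0.5, -3.0], [0.4, -0.2, -0.1, -0.5])
percentiles = percentile_rank_by_magnitude(impacts)  # [75.0, 50.0, 25.0, 100.0]
```

In this example the third station has a negative impact, but it is ranked by magnitude alongside the positive ones, in the same way that all observations are pooled into a single sample here.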
Because the percentiles of observation impact depicted in Figure 5.6 may overlap each other in data rich areas, the impacts are plotted in order from least impact to most impact in order to identify those regions where particular observations tend to have the greatest effect in adjusting the background field. Many of the first panels in Figure 5.6 demonstrate the strong dependence of observation impact on station density, as can be seen by comparing Figure 5.4 and Figure 5.6. The preponderance of high percentile (red) vs. low percentile (blue) impacts of agricultural (AG) temperature observations relative to the entire sample of observations is evident in Figure 5.6a. In contrast, air monitoring networks (AQ; Figure 5.6b) have more stations in the bottom quartile, i.e., AG stations tend to be in more remote locations than AQ stations, which are located in urban areas where many other data assets are generally available. In addition, it is possible to infer from Figure 5.3a that the analysis increments in this case tend to be small in the vicinity of the AQ observations, e.g., along the northeast United States coast, in the Central Valley of California, or along the coast of southeast Texas. Similarly, Figure 5.6c shows the frequently high impact of temperature observations from outside the continental United States (EXT category networks), where data density tends to be low. Offshore stations in the Gulf of Mexico tend to have relatively low impact, even though their sensitivity is high (not shown), since the observations and background do not differ substantively. Networks grouped into the FED category (Figure 5.6d) also have a higher percentage of high impact temperature observations than low impact observations. Many of these high impact observations are found in data sparse areas, specifically in Nevada, Utah, and Idaho.
The importance of observations in generally data sparse regions is particularly evident in Figure 5.6e for the HYDRO network category. SNOTEL observations at high elevations in the Sierra, Cascade, and Rocky Mountains tend to exhibit very high impact. The broad range of networks aggregated into the LOCAL category exhibits regional dependencies due to the type of weather event underway at this time as well as station density (Figure 5.6f). The West Texas Mesonet contains the majority of the high impact temperature observations in this category because its large positive observation innovations lead to large temperature analysis increments over much of this region (Figure 5.3a). In contrast, observations from the northern half of the Oklahoma Mesonet and the Florida Automated Weather Network tend to have low impact because of the small departures in temperature from the background in those regions. The contribution of NWS network observations to the analysis is seen in the high impact observations across the Midwest and southward into Texas (Figure 5.6g). However, the NWS category also has a large number of low impact observations concentrated along the coast of the northeast United States, which are the result of small observation innovations and high data densities in the area. Although there is a prevailing tendency to dismiss observations provided by the general public through the CWOP program (PUBLIC network category), Tyndall et al. (2010) showed that the error characteristics for temperature observations from those stations are similar to those from other network sources. Similarly, Figure 5.6h shows that the impact of temperature observations from PUBLIC stations can be high and consistent in terms of their locales with those provided by other networks (e.g., compare to the NWS observations in Figure 5.6g).
In other words, if the background field differs significantly from the actual weather, then PUBLIC observations can be quite useful, especially if there are relatively few other observations nearby. However, as will be shown later, the vast majority of PUBLIC temperature observations have low impact and those stations are simply obscured in Figure 5.6h. The impact of temperature observations from the RAWS network is depicted in Figure 5.6i. Since RAWS temperature observations contribute significantly to the negative temperature analysis increments in the mountainous regions of the western United States (Figure 5.3a), many of those stations exhibit high impact. In addition, RAWS stations extending northeastward from eastern Texas have a large impact in this case, consistent with those of NWS and PUBLIC stations along this swath. Finally, Figure 5.6j shows the impact of temperature observations from the transportation (TRANS) network category. Due to their coverage in otherwise data void regions, TRANS temperature observations contribute frequently to the temperature analysis in the western United States as well as to the broad region of positive temperature analysis increments found across the Midwest. To evaluate the influence of the background error decorrelation length scales on observation impact, results from the M = 2 polynomial case (Figure 5.5) are now presented in Figure 5.7. Not surprisingly, this modification of the decorrelation length scales increases the observation impacts for networks that are primarily located in data rich regions. For example, the percentage of high impact observations within the PUBLIC network (Figure 5.7h) increases compared to the control case (Figure 5.6h), with observations near many urban areas (Dallas, TX; Detroit, MI; Chicago, IL; San Diego, CA; San Francisco, CA; Denver, CO; Portland, OR) increasing their impact relative to all the other observations in this case.
The increase in the number of high impact observations in the PUBLIC network comes at the expense of the observation impacts of the RAWS, EXT, and HYDRO network categories. For example, the impact of RAWS temperature observations is significantly reduced along the Appalachian and Sierra Nevada Mountains (compare Figure 5.7i to Figure 5.6i), while the impact of HYDRO observations is reduced in the mountainous regions of the Intermountain West. Figure 5.8 summarizes the percentile rank of observation impact for all available stations aggregated into the 10 network categories, computed from analyses of the 3 variables (temperature, dewpoint temperature, and wind speed) using 5 distinct specifications of the background error decorrelation length scales. The upper left panel of Figure 5.8a summarizes the results previously shown in Figure 5.6, while the lower right panel of that figure summarizes the results shown in Figure 5.7. The count of stations (y axis) with observation impacts that fall into each decile category (x axis) is color coded for each of the 10 network categories. First, the preponderance of observations available from the PUBLIC category tends to dominate all panels in Figure 5.8. For temperature (Figure 5.8a), there are larger numbers of PUBLIC observations that have low impact than high impact regardless of the assumptions related to the background error decorrelation length scale. Narrowing the horizontal decorrelation length scale in data rich areas (middle left and bottom left panels) slightly increases the number of high impact PUBLIC observations at the expense of the number of high impact observations from other network categories, such as RAWS and NWS. Narrowing both horizontal and vertical decorrelation length scales (middle right and bottom right panels) further increases the number of high impact PUBLIC temperature observations.
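The aggregation behind a panel of Figure 5.8 (counts of stations per decile, split by network category) can be illustrated with the following Python sketch. It is not taken from the UU2DVar; the function name, the sample percentiles, and the convention that a percentile exactly on a decile boundary falls into the lower decile are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List


def decile_counts(percentiles: List[float], networks: List[str]) -> Dict[str, List[int]]:
    """Count stations per decile of impact percentile for each network category.

    Index 0 of each list holds percentiles in [0, 10], index 1 holds (10, 20],
    and so on up to index 9 for (90, 100].
    """
    if len(percentiles) != len(networks):
        raise ValueError("percentiles and networks must have the same length")
    counts: Dict[str, List[int]] = defaultdict(lambda: [0] * 10)
    for p, network in zip(percentiles, networks):
        if not 0.0 <= p <= 100.0:
            raise ValueError("percentile outside the range 0-100")
        decile = max(math.ceil(p / 10.0) - 1, 0)
        counts[network][decile] += 1
    return dict(counts)


# Hypothetical percentiles for five stations in three network categories.
counts = decile_counts(
    [5.0, 10.0, 95.0, 100.0, 55.0],
    ["PUBLIC", "PUBLIC", "RAWS", "RAWS", "NWS"],
)
```

Stacking the ten counts for each category as colored bars would give one panel of the kind summarized in Figure 5.8, with one such panel per specification of the decorrelation length scales.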
In contrast to PUBLIC observations, the count of observations in each decile category is relatively flat for NWS observations (yellow bars) and to a large extent independent of the assumptions related to the background error decorrelation length scale (Figure 5.8a). This is consistent with the results shown in Figure 5.6g, where there was considerable regional dependency in observation impact for NWS observations. Many of the other networks exhibit similar tendencies. However, the RAWS category (magenta bars) tends to have more stations with high impact than low impact. Figure 5.8b summarizes the impact of dewpoint observations, and the results are generally similar to those shown for temperature (Figure 5.8a). (The total number of humidity sensors is lower for the AG, HYDRO, and TRANS network categories.) Application of the data density constraints only slightly increases the impact of observations in the PUBLIC network category. Overall, the influence of the data density constraints on the impact of dewpoint observations appears to be much less than their influence on the impact of temperature observations. The statistics obtained from wind speed observations shown in Figure 5.8c exhibit very different characteristics compared to the statistics from temperature or dewpoint observations (Figures 5.8a and 5.8b). For the control analysis (upper left panel), there are as many stations in the PUBLIC network category with high impact as low impact, while the number of high impact observations increases as the background error decorrelation length scales shrink. The increased impact of PUBLIC wind observations from the application of the constraints comes at the expense of the impact of the NWS and RAWS observations. The high observation impacts from the stations in the PUBLIC category are related to the aforementioned siting and representativeness issues of PUBLIC observations.
In addition, occasional calm winds reported by PUBLIC stations, possibly incorrect or misreported, produce strong negative observation innovations as well as strong negative analysis increments, and contribute to the high impacts of this category. Application of the asymmetric wind observation quality control discussed in Chapter 3 is one step towards mitigating these issues, rather than the common operational practice of simply omitting all wind observations from the PUBLIC networks. As illustrated in this section, observation impact appears to be a useful metric for assessing the relative role of observations in the development of analyses. An alternative metric is simply the sensitivity, which depends only on the locations of the observations and the analysis increments for this case study. The concept of analysis sensitivity is of particular relevance for targeting observations in a complete data assimilation system, where additional observations may be of particular importance for a future forecast yet the value that a targeted observation would report is unknown at that time (Langland et al. 1999; Baker and Daley 2000). The interpretation of observation sensitivity in this study has been judged to be of less relevance for evaluating the relative importance of observations from specific network categories. A large sensitivity could result at a station where an observation matches the background but is surrounded by observations with large deviations from the background, e.g., a station with a strong wind report surrounded by erroneously calm wind reports would be evaluated as having a large negative sensitivity. Summary statistics of the magnitude of sensitivity are presented in Figure 5.9 in a manner similar to that presented in Figure 5.8 for impact. Hence, large positive and negative sensitivities both appear in the highest percentile categories since there is no sign preference for the sensitivity metric.
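The distinction between sensitivity and impact can be made concrete with a small one-dimensional sketch in Python. It is a toy example under stated assumptions, not the UU2DVar adjoint: the analysis is assumed to be solved in observation space, the background error covariance is assumed to be Gaussian in distance with a single length scale, the observation errors are assumed uncorrelated, and the cost function is taken to be the mean squared analysis increment over the grid. All function names and numerical values are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def solve_linear(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def sensitivity_and_impact(
    grid_x: Vector, obs_x: Vector, innovations: Vector, length_scale: float,
    sigma_b2: float = 1.0, sigma_o2: float = 1.0,
) -> Tuple[Vector, Vector, Vector, float]:
    """Return (increments, sensitivities, impacts, cost) for the toy analysis."""
    def cov(a: float, b: float) -> float:
        # Assumed Gaussian background error covariance.
        return sigma_b2 * math.exp(-((a - b) / length_scale) ** 2)

    n_obs, n_grid = len(obs_x), len(grid_x)
    # Observation-space matrix: background covariance between observation
    # locations plus a diagonal observation error variance.
    s = [[cov(obs_x[i], obs_x[j]) + (sigma_o2 if i == j else 0.0)
          for j in range(n_obs)] for i in range(n_obs)]
    eta = solve_linear(s, innovations)
    # Forward step: spread the weights onto the grid to get the increments.
    increments = [sum(cov(g, obs_x[j]) * eta[j] for j in range(n_obs)) for g in grid_x]
    cost = sum(v * v for v in increments) / n_grid
    # Adjoint step: the cost gradient (2/n) * increment is mapped back to
    # observation space and passed through the same symmetric solve.
    forcing = [sum(cov(obs_x[i], g) * (2.0 / n_grid) * inc
                   for g, inc in zip(grid_x, increments)) for i in range(n_obs)]
    sensitivities = solve_linear(s, forcing)
    impacts = [d * sv for d, sv in zip(innovations, sensitivities)]
    return increments, sensitivities, impacts, cost


# A station that matches the background (innovation 0) between two stations
# with large negative innovations; positions in km, values hypothetical.
grid = [10.0 * i for i in range(21)]
inc, sens, imp, cost = sensitivity_and_impact(
    grid, [80.0, 100.0, 120.0], [-3.0, 0.0, -3.0], length_scale=40.0)
```

In this toy configuration the middle station has zero impact, because its innovation is zero, yet it has a nonzero sensitivity, which is the situation described above. Because the assumed cost function is quadratic in the innovations, the impacts in this sketch sum to twice the cost, so the total impact is positive even when individual impacts are negative.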
The percentile distributions of temperature observation sensitivity in the control case (upper left panel of Figure 5.9a) have many features similar to those of observation impact (Figure 5.8a). A notable difference is the relative number of stations in the upper 20th percentile in the HYDRO category, such that the HYDRO observations tend to have more stations with high impact (presumably due to the locally larger innovations in many remote mountainous areas) than with high sensitivity. Application of the constraints specified by the M = 1 and M = 2 polynomials increases the influence of the PUBLIC category more when measured by observation sensitivity than by observation impact. This effect is especially enhanced when the horizontal and vertical decorrelation length scales are both narrowed. The percentile distribution of dewpoint observation sensitivity is depicted in Figure 5.9b and is similar to the distribution for temperature observations. As the decorrelation length scales narrow in high observation density regions, the counts of PUBLIC stations with high sensitivities tend to increase. The sensitivity metric tends to accentuate the importance of the RAWS networks compared to the impact metric. The count of wind speed observation sensitivities in the upper 20th percentile from the control analysis (upper left panel of Figure 5.9c) is substantively less than that of the wind speed observation impacts (upper left panel of Figure 5.8c). Application of the data density constraints tends to homogenize the sensitivities from the PUBLIC network stations, in contrast to the increasing impact of the PUBLIC stations as the decorrelation length scales are narrowed. While a large percentage of NWS and RAWS stations have high wind observation sensitivities in the control analysis, the percentages again tend to remain relatively constant when the decorrelation length scales are decreased.
Another notable feature in the summary statistics for wind speed sensitivity is the very high percentage of stations in the top decile from the EXT networks (solid blue bars in Figure 5.9c). Nearly all of these highly sensitive stations are located offshore in relatively data void regions adjacent to highly data rich onshore areas. This high sensitivity may be due to the super-sensitivity artifact first described by Baker and Daley (2000). Super-sensitivity typically occurs where sharp changes in observation density are found. The overfitting errors in the wind analysis increments seen in Figure 5.3e may result from the combination of super-sensitivity, the strong observation innovations of the coastal observations, and constraining the analysis too tightly with narrow background decorrelation length scales. Clearly, there is a tradeoff between analysis quality and observation impact and sensitivity. It would be possible to force the observation impact and sensitivity to be high by drastically reducing the background error decorrelation length scales, which would in turn force the analysis to have many bull's-eyes in data rich areas and overfitting issues in data voids.

Figure 5.1. Hydrometeorological Prediction Center surface analysis valid for 1500 UTC 27 October 2010 (one hour after the case study presented in this research). Surface pressure isobars are plotted every 4 mb, along with surface fronts and maxima and minima in the surface pressure field.

Figure 5.2. RUC 1-hr forecast background fields for 1400 UTC 27 October 2010 over the CONUS domain. a. 2-m air temperature in °C. b. 2-m dewpoint temperature in °C. c. 10-m wind speed in m/s.

Figure 5.3. UU2DVar increments and analyses for 1400 UTC 27 October 2010 using constant horizontal and vertical decorrelation length scales. a. 2-m air temperature analysis increments in °C. b. 2-m air temperature analysis in °C. c. 2-m dewpoint temperature analysis increments in °C. d. 2-m dewpoint temperature analysis in °C. e. 10-m wind speed analysis increments in m/s. f. 10-m wind speed analysis in m/s.

Figure 5.4. IDI analysis for all temperature, dewpoint, and wind speed observations used by the 1400 UTC 27 October 2010 analysis using horizontal and vertical decorrelation length scales of 80 km and 200 m, respectively. a. 2-m air temperature IDI analysis. b. 2-m dewpoint temperature IDI analysis. c. 10-m wind speed IDI analysis.

Figure 5.5. UU2DVar increments and analyses for 1400 UTC 27 October 2010 using the M = 2 polynomial to compute the vertical and horizontal decorrelation length scales. a. 2-m air temperature analysis increments in °C. b. 2-m air temperature analysis in °C. c. 2-m dewpoint temperature analysis increments in °C. d. 2-m dewpoint temperature analysis in °C. e. 10-m wind speed analysis increments in m/s. f. 10-m wind speed analysis in m/s.

Figure 5.6. Observation impact percentile for 2-m air temperature for 1400 UTC 27 October 2010 using constant decorrelation length scales, grouped by network category. a. AG network category. b. AQ network category. c. EXT network category. d. FED network category. e. HYDRO network category. f. LOCAL network category. g. NWS network category. h. PUBLIC network category. i. RAWS network category. j. TRANS network category.

Figure 5.7. Observation impact percentile for 2-m air temperature for 1400 UTC 27 October 2010 using the M = 2 polynomial to compute the vertical and horizontal decorrelation length scales, grouped by network category. a. AG network category. b. AQ network category. c. EXT network category. d. FED network category. e. HYDRO network category. f. LOCAL network category. g. NWS network category. h. PUBLIC network category. i. RAWS network category. j. TRANS network category.

Figure 5.8. Observation impact percentile distribution by network category for 1400 UTC 27 October 2010 for temperature, dewpoint, and wind speed for all 5 specifications of the background error covariance studied in this research. a. 2-m air temperature observation impact percentile distribution. b. 2-m dewpoint temperature observation impact percentile distribution. c. 10-m wind speed observation impact percentile distribution.

Figure 5.9. Observation sensitivity percentile distribution by network category for 1400 UTC 27 October 2010 for temperature, dewpoint, and wind speed for all 5 specifications of the background error covariance studied in this research. a. 2-m air temperature observation sensitivity percentile distribution. b. 2-m dewpoint temperature observation sensitivity percentile distribution. c. 10-m wind speed observation sensitivity percentile distribution.
CHAPTER 6

CONCLUSION

Summary

High resolution spatial and temporal objective surface analyses are needed for many mesoscale nowcasting and short-term forecasting applications. Unfortunately, output from many operational numerical models is unable to fill this need due to its coarser resolution as well as the models' inability to appropriately model or parameterize many boundary layer processes. Accurate surface analyses can be created by using this model output as a first guess and using surface mesonet observations to correct this first guess through data assimilation. This study presented the UU2DVar, a 2DVar analysis tool that can assimilate thousands of surface observations to produce surface analyses of 2-m air temperature, 2-m dewpoint temperature, 10-m u- and v-wind components, 10-m wind speed, and surface pressure. Unlike its predecessor (the LSA), the UU2DVar can be run over continental scale domains because it solves the variational cost function in observation space instead of analysis space, greatly reducing the memory needed to store the background error covariances. The majority of the UU2DVar's routines are written to take advantage of parallel processing, greatly decreasing the time required to compute its background error covariances and allowing it to be run over a continental scale domain within real-time constraints. The parallel speedup of the UU2DVar's functions that compute the background error covariances depends on the amount of memory available on the computer system used to run it; systems with more memory allow each processor to compute larger blocks of the covariance array at once, bringing the actual speedup closer to the ideal speedup. The UU2DVar is written using the MATLAB programming language, allowing it to be run with any operating system that supports the MATLAB software (Windows, Mac OS X, and Linux).
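The memory advantage of working in observation space can be illustrated with simple arithmetic. The Python sketch below compares the storage needed for a dense grid-by-grid background error covariance with the storage needed for the grid-by-observation and observation-by-observation arrays. The grid and observation counts are hypothetical round numbers, not the dimensions used in this study, and the estimate assumes dense double-precision arrays with no localization, so it overstates what a localized, block-wise computation would actually hold in memory.

```python
from typing import Dict

BYTES_PER_DOUBLE = 8


def dense_gib(n_rows: int, n_cols: int) -> float:
    """Storage in GiB of a dense double-precision array."""
    return n_rows * n_cols * BYTES_PER_DOUBLE / 2 ** 30


def covariance_storage_gib(n_grid: int, n_obs: int) -> Dict[str, float]:
    """Dense storage for analysis-space and observation-space covariance arrays."""
    return {
        "full_grid_covariance": dense_gib(n_grid, n_grid),  # analysis space
        "grid_by_obs": dense_gib(n_grid, n_obs),            # observation space
        "obs_by_obs": dense_gib(n_obs, n_obs),              # observation space
    }


# Hypothetical sizes: one million grid points and ten thousand observations.
sizes = covariance_storage_gib(n_grid=1_000_000, n_obs=10_000)
savings_factor = sizes["full_grid_covariance"] / sizes["grid_by_obs"]
```

Under these assumptions the largest observation-space array is smaller than the full grid covariance by the ratio of grid points to observations, which is why the approach is most efficient when grid points greatly outnumber observations.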
Earlier versions of the UU2DVar have also been compiled using the MATLAB Compiler, which has allowed the code to be run as an executable binary on systems without MATLAB licenses using the freely available MATLAB Compiler Runtime. Users of the UU2DVar do not have to supply their own observations and background fields, as the tool is written to interface with the University of Utah THREDDS server and MesoWest database; however, users have the option to incorporate their own observation datasets. The UU2DVar provides a flexible platform from which observations from heterogeneous surface mesonets can be examined objectively. The National Research Council (2009) recommended improved metadata, data quality control procedures, and understanding of the relative merits of differing data sources as ways to increase the utilization of such observations throughout the weather enterprise. The development of the UU2DVar and its adjoint was instigated with those goals in mind. While the UU2DVar shares its background field as well as a similar 2DVar assimilation technique with the RTMA, its analyses differ from those of the NCEP system due to the different assumptions used to generate the analyses. Because the UU2DVar computes its analyses in observation space, it utilizes a terrain field closer to that observed to compute its background error covariance, which helps to explain why the largest differences between the UU2DVar and the RTMA analyses tend to be located in areas of complex terrain. To illustrate the applicability of this system and expand on prior research (Horel and Dong 2010; Tyndall et al. 2010), a single case was examined in depth, with particular attention placed on the dependence of the analysis system on variations in the background error covariance specified as a function of data density. Data density is computed in terms of the IDI field over the entire CONUS.
Usage of the UU2DVar adjoint instead of leave-one-out data withholding experiments (as done by Horel and Dong [2010]) provides an efficient methodology to determine the sensitivity and impact of all observations. This study demonstrates both the feasibility of using varying decorrelation length scales in specifying the background error covariance and the efficiency of the adjoint methodology for determining the impact of varying data assets. Further study is required to assess whether using a data density criterion to constrain the analysis is beneficial. However, it is clear from this single case that the extreme variations in data density over the continental United States are a challenge, since overfitting can result if the analysis is too tightly constrained. Additional research may show that a "flatter" polynomial, in which decorrelation length scales do not decrease as significantly in data dense regions, is more beneficial to the analysis. Furthermore, forecasters utilizing such analyses must help assess whether it is more beneficial to have a smoother analysis or one that is able to resolve small scale features where the observing network is dense. Observation impact appears to be a more robust metric for contrasting the influence of observations than observation sensitivity. Observations with high impact draw attention to locations with both high sensitivity and large innovations. However, it is difficult to distinguish through simple objective criteria between high impact observations resulting from meaningful deviations in local weather from the background, erroneous or unrepresentative observations, or erroneous features in the background itself. Alternative cost functions to that examined here (mean squared difference between the analysis and background over the entire grid) could be specified by Equation 4.4 in order to focus on other questions of interest, e.g., particular flow characteristics within limited domains.
For the analysis hour examined here, stations in data sparse regions where deviations from the background were large tended to have high impact, while stations in data rich urban areas tended to have lower impact. For example, the HYDRO and RAWS network categories, with many stations in remote locations, had larger numbers of high impact temperature observations than low impact observations. Many RAWS stations also had high dewpoint temperature impact, due to their strong observation innovations in many regions of the western United States. Stations in the PUBLIC category tended to have very high observation impacts on the wind speed analysis, which may be the result of sensor siting issues as well as unrepresentative and erroneously calm wind observations collocated with high background field wind speeds. Applying the four data density constraints to the analyses increased the number of high impact PUBLIC observations in all fields. This research, which includes the UU2DVar analysis tool as well as an investigation of observation impacts for an individual case study, helps to lay the foundation for additional research addressing issues associated with the development of the NNoN. A parallel study applying the methodology presented in Chapter 4 to 100 cases of significant weather events is already underway, and some preliminary results have already been collected. Those results confirm the higher observation impacts of networks located in data sparse regions (such as the RAWS network) seen in the single case study presented in Chapter 5. The implementation of observation sensitivity and impact as a measure of quality control for observations that are part of the NNoN is also being discussed with MesoWest researchers.

Recommendations and Future Work

The development of the UU2DVar and usage of the variational adjoint to determine observation impact have led to a number of additional questions, as well as goals for future work.
These recommendations and goals for future work are expanded upon here:

1. Collection and regular updating of observation metadata is necessary for the production of high quality surface analyses. As shown by the wind analyses computed as part of this research, the assimilation of poor quality observations can greatly reduce the quality of the analyses. Unfortunately, without complete siting information, as well as observation maintenance and station instrument information, it is difficult to differentiate good quality observations from poor quality observations, even with the use of more advanced quality control procedures. Providers of observation data should also make every effort to make network description publications available with the data, and large observing networks (especially federally funded networks) should be required to maintain documents describing the standards used within the network.

2. Advanced quality control procedures should be implemented within the UU2DVar. Implementation of more rigorous quality control procedures on the observations retrieved from the MesoWest database used by the UU2DVar will substantively improve the utility of this system. The quality control steps implemented within this version of the UU2DVar are limited to removing gross errors based on error characteristics assumed for the entire domain as a whole. A number of additional quality control procedures are under development by the MesoWest team and will help to remove many commonly occurring problems. For example, incorrect sta |
| Reference URL | https://collections.lib.utah.edu/ark:/87278/s6q534cg |



