Efforts Toward Establishing ML-MC as a Robust Depth Discriminant of Seismic Sources at Local (<150 km) Distances

Publication Type	honors thesis
School or College	College of Mines & Earth Sciences
Department	Geology & Geophysics
Faculty Mentor	Keith D. Koper
Creator	Voyles, Jonathan Ross
Title	Efforts Toward Establishing ML-MC as a Robust Depth Discriminant of Seismic Sources at Local (<150 km) Distances
Date	2020
Description	Few scientific fields have been as dramatically accelerated by war as Earth science. Prior to the twentieth century, Earth science was largely dominated by natural philosophers debating whether the Earth had formed from a ball of magma that solidified or a ball of liquid that precipitated. This great debate stymied innovation and helped to set Earth science behind other fields like mathematics and physics in a time of great intellectual advancement. Things started to change in 1912 when Alfred Wegener suggested that continents could move. This notion challenged Earth scientists to change the scope of the problems they were trying to solve and to reconcile their ideas with the framework of geologic time that Charles Darwin proposed in 1859. Earth scientists initially reacted negatively to Wegener's ideas, but in the following years geoscientists began to apply precise quantitative techniques to evaluate his claims, and the field began to mature. Sadly, Wegener died an outcast of the scientific community because he failed to give a mechanism that could cause continents to move. Interestingly, much of the evidence for his idea was later discovered as a result of warfare. Wegener's thinking led to the paradigm shift of Earth scientists accepting plate tectonic theory. Evidence for plate tectonics began to accumulate during World War II as naval control became a priority for countries. Money poured in as navies sought to detect other ships, particularly submarines, using techniques like radar and magnetism. High-quality bathymetric data first allowed Henry Hess to hypothesize in 1962 that the seafloor was spreading, driven by mantle convection: the missing link in the plate tectonics problem. His ideas were later supported by electromagnetic surveys that showed banding of magnetic anomalies on the seafloor, a smoking gun for Hess's ideas.
The Union of Soviet Socialist Republics (USSR) began developing nuclear weapons just as the United States (US) was leading the Manhattan Project at the beginning of the Cold War. Initial efforts to better understand the technological progress and capabilities of the USSR's weapons program drew on earthquake science. Seismology first sought to understand earthquakes from an academic perspective, as earthquakes were important to study because large ones could kill hundreds of thousands and change the Earth like few other forces of nature. But it was quickly realized that earthquake location and magnitude estimation techniques could be translated to explosions. The installation of high-fidelity seismographs was useful for detecting and estimating the yield of nuclear explosions in the USSR, but was also leveraged for earthquake analysis. Large earthquakes were found to occur mostly in discrete belts worldwide, which defined the geometry of the plates and made plate tectonics even more robust. Seismology was forever changed when, in 1963, the Limited Test Ban Treaty (LTBT) was ratified by the US, the USSR, and the United Kingdom. This agreement made testing nuclear weapons illegal in space, underwater, and in the atmosphere. The LTBT pushed nuclear weapons testing underground and provided a perfect opportunity for seismologists to contribute to international relations and geopolitics. An important part of the LTBT involved the US funding of Project Vela Uniform, which gave seismology a 3000% increase in federal funding in the course of one year. The funding was to enable scientists to effectively detect, locate, and estimate the yield of nuclear explosions. The sudden influx of resources allowed seismology to accelerate at an unprecedented rate. In 1974, testing was further constrained when the Threshold Test Ban Treaty (TTBT) was passed bilaterally between the US and the USSR.
This treaty limited testing to 150 kilotons, which gave seismologists even more reason to refine their techniques to detect smaller seismic sources. The fall of the USSR led to the Comprehensive Test Ban Treaty (CTBT) being first proposed in 1996. The CTBT is a zero-tolerance treaty, meaning weapon tests of any size are forbidden. This development gave seismologists the ultimate challenge: detect sources as small as conceivable or valuable. It was soon realized that decades of perfecting explosion analysis would need to be rethought, as discriminating small sources would prove to be systematically different from discriminating the larger explosions of the past. Once a seismic source is detected using a network of seismographs, it needs to be analyzed to determine the type of source, for example, an explosion or an earthquake. It is of the utmost importance to separate illegal nuclear explosions from perfectly natural tectonic earthquakes. Techniques that accomplish this goal are known as discriminants, as they differentiate between explosions and earthquakes. Historically, discriminants were developed for large explosions that were easily recorded at teleseismic source-to-receiver distances (>2000 km). Techniques like mb:MS and P/S amplitude ratios proved to be effective, but these worked at teleseismic distances, where the wavefield is low frequency and more homogeneous. At local distances (<200 km), wavefields are enriched in high frequencies and are heterogeneous, so the robust methods of the past are less successful. This has created a need for new discriminants that work at local distances. Koper et al. (2016) discovered that they could discriminate between deep tectonic earthquakes and shallow mining-induced seismic events in Utah by comparing different magnitudes of the events at local distances. They proposed that ML-MC, or the difference between the local magnitude and the coda duration magnitude, was sensitive to depth.
Therefore, by assuming that most earthquakes are relatively deep and all explosions occur near the surface, a jump in logic could be made that the discriminant could also differentiate between those sources. In order to evaluate this idea for application to real-world nuclear monitoring problems, it had to be better understood. First, this discriminant needed to be shown to work in a variety of geographic and geologic locations; it would not be useful if it were a phenomenon only associated with Utah. This was accomplished by Holt et al. (2019), who showed that ML-MC worked in every place tested, including Italy, Oklahoma, and Yellowstone. The next step was to see if it worked for nuclear weapons analogs that were plentiful: mining explosions. The first chapter of this thesis addresses the application of ML-MC to mining explosions (Voyles et al., 2020). We rigorously analyzed thousands of explosions in Utah and showed that, in general, ML-MC identified them as shallow sources. Another outcome from this project was an open-source database of high-quality explosions at local distances, which is already being used by the scientific community. This database provides an excellent opportunity to leverage machine learning techniques on the local distance discrimination problem. The second chapter of this thesis applies machine learning techniques to separate explosions from earthquakes and was published in the University of Utah's undergraduate research journal. The paper seeks to reproduce results from Linville et al. (2019), who used complicated, uninformed machine learning models to discriminate sources at local distances. We use simple, interpretable models that are comparatively inexpensive computationally and obtain similar performance. One of the most important steps in establishing ML-MC is to try to reproduce, using numerical modeling, the empirical observations on which all the prior research discussed here is based.
This allows for complete control over the simulation. In particular, there is a need to understand, from a physical standpoint, the mechanism by which ML-MC is sensitive to depth. Various phenomena could account for this sensitivity, and there is a need to identify which are most influential. The third chapter of this thesis presents preliminary results concerning modeling and understanding why ML-MC works using high-performance computing, and it will be submitted for publication. By generating numerous models, each with a different combination and weight of the mechanisms that can explain our observations, we can tease out the model that best explains our observations. Final steps to understand ML-MC include continuing modeling, continuing to prepare catalogs of earthquakes and explosions, and testing ML-MC on other nuclear explosion analogs like large chemical explosions and even nuclear explosions themselves.
Type	Text
Publisher	University of Utah
Language	eng
Rights Management	© Jonathan Ross Voyles
Format Medium	application/pdf
Permissions Reference URL	https://collections.lib.utah.edu/ark:/87278/s64r3dmh
ARK	ark:/87278/s6gf6d5t
Setname	ir_htoa
ID	1579666
OCR Text	TABLE OF CONTENTS
THESIS ABSTRACT
CHAPTER I: ABSTRACT; INTRODUCTION; SEISMIC MONITORING IN THE UTAH REGION; REANALYSIS OF LIKELY EXPLOSIONS; RESULTS; REGION A; REGION B; REGION C; REGION D; REGION E; REGION F; REGION G; REGION H; REGION I; REGION J; REGION K; REGION L; REGION M; REGION N; REGION O; REGION P; REGION Q; REGION R; REGION S; REGION T; REGION U; REGION V; REGION W; REGION X; REGION Y; REGION Z; DISCUSSION OF ML-MC OBSERVATIONS FOR UTAH EXPLOSIONS; CONCLUSIONS; DATA AND RESOURCES; ACKNOWLEDGEMENTS; FIGURES AND TABLES
CHAPTER II: ABSTRACT; INTRODUCTION; DATA AND FEATURE ENGINEERING; MODELS AND RESULTS; DISCUSSION; CONCLUSIONS; FIGURES AND TABLES
CHAPTER III: ABSTRACT; INTRODUCTION; SOURCES AND MODELS; RESULTS; DISCUSSION; CONCLUSIONS; ACKNOWLEDGEMENTS; FIGURES
REFERENCES

CHAPTER I
Adapted from: A New Catalog of Explosion Source Parameters in the Utah Region with Application to ML-MC Based Depth Discrimination at Local
Distances
Jonathan R. Voyles¹, Monique M. Holt¹, J. Mark Hale¹, Keith D. Koper¹, Relu Burlacu¹, and Derrick J. A. Chambers²
¹ Dept. of Geology and Geophysics, University of Utah, Salt Lake City, UT, 84112, USA
² Spokane Mining Research Division, National Institute for Occupational Safety and Health, Spokane, WA, 88207, USA
Published in Seismological Research Letters, 1 January 2020, volume 91, number 1, pages 222-236, DOI: https://doi.org/10.1785/0220190185

ABSTRACT
A catalog of explosion source parameters is valuable for testing methods of source classification in seismically active regions. We develop a manually reviewed catalog of explosions in the Utah region for 1 October 2012 – 30 June 2018 and use it to assess a newly proposed, magnitude-based depth discriminant. Within the Utah region we define 26 event clusters that are primarily associated with mine blasts but also include explosions from weapons testing and disposal. The catalog refinement process consists of confirming the explosion source labels, revising the local (ML) and coda duration (MC) magnitudes, and relocating the hypocenters. The primary features used to determine source labels are waveform characteristics such as frequency content, the proximity of the preliminary epicenter to a permitted blast region, the time of day, and prior notification from mine operators. We reviewed 2,199 seismic events, of which 1,545 are explosions, 459 are local earthquakes, and 195 are other event types. Of the reviewed events, 127 (5.8%) were reclassified with new labels. Over 74% of the reviewed explosions have both ML and MC, a sizable improvement over the unreviewed catalog (65%). The mean ML-MC value for the new explosion catalog is -0.196 ± 0.017 (95% confidence interval), compared to a previously determined value of 0.048 ± 0.008 for naturally occurring earthquakes in the Utah region. The shallow depths of the explosions lead to enhanced coda production, which in turn leads to anomalously large MC values.
This finding confirms that ML-MC is a useful metric for discriminating explosions from deeper tectonic earthquakes in Utah. However, there is significant variation in ML-MC among the 26 explosion source regions, suggesting that ML-MC observations should be combined with other classification metrics to achieve the best performance in distinguishing explosions from earthquakes.

INTRODUCTION
An important task in regional seismic monitoring is distinguishing explosions from earthquakes. Misidentified explosions are problematic because they contaminate earthquake catalogs that are used to assess seismic hazard. Identifying small explosions at local-to-regional distances is also important for monitoring the zero-tolerance Comprehensive Nuclear-Test-Ban Treaty (Bowers and Selby, 2009). In this context, it is especially important to identify common industrial explosions, such as mine blasts, which might otherwise lead to false alarms (Richards et al., 1992; National Research Council, 1998; Stump et al., 2002). Several methods have been evaluated for seismic identification of mine blasts at local-to-regional distances, including P/Lg amplitude ratios (Kim et al., 1993), Sg/Rg amplitude ratios (Tibi et al., 2018), Rg excitation (Goforth and Bonner, 1995), spectral modulations (Baumgardt and Ziegler, 1988; Arrowsmith et al., 2006), magnitude differences (Zeiler and Velasco, 2009), and spectral deviation from an earthquake source model (Allman et al., 1998). More recently, approaches based on machine learning have shown promise (Linville et al., 2019). Simpler considerations such as a daytime occurrence (Wiemer and Baer, 2000), waveform similarity with previous known explosions (Gibbons and Ringdal, 2006), and location near a permitted blasting region (Astiz et al., 2014) are also useful for identifying mine blasts.
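The ML-MC populations in this chapter are summarized as a mean with a 95% confidence interval. The following is a minimal Python sketch of one way such a summary could be computed, assuming a normal approximation (half-width of 1.96 standard errors); the chapter does not state how its intervals were derived, and the function name and sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_with_ci95(values):
    """Return (mean, half_width) for a 95% confidence interval.

    Assumption: normal approximation, half_width = 1.96 * s / sqrt(n).
    This is an illustrative choice, not necessarily the method used
    in the thesis.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a spread")
    half_width = 1.96 * stdev(values) / math.sqrt(n)
    return mean(values), half_width


# Hypothetical ML-MC differences for a handful of events (not real data).
ml_minus_mc = [-0.30, -0.10, -0.20, -0.25, -0.15]
m, hw = mean_with_ci95(ml_minus_mc)
print(f"mean ML-MC = {m:.3f} +/- {hw:.3f}")
```

With thousands of events per population, as in the catalogs discussed here, the interval narrows roughly as one over the square root of the event count, which is why differences of a few tenths of a magnitude unit between populations can be resolved.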
In this study, we develop a new catalog of explosion source parameters in the Utah region, which can be combined with the existing regional earthquake catalog to evaluate methods of source discrimination. We manually review 2,199 seismic events that occurred within or adjacent to the Utah region between 1 October 2012 and 30 June 2018. We repick the arrival times, relocate the hypocenters, and recalculate the magnitudes, resulting in revised solutions for 1,545 explosions, each of which is assigned to one of 26 distinct source regions. The explosions are mostly mining-related quarry blasts, but include some single-fired, above-ground explosions carried out by government agencies. We use the new catalog to examine the ability of a recently proposed depth discriminant (Koper et al., 2016; Holt et al., 2019), the difference between local magnitude (ML) and coda duration magnitude (MC), to separate explosions from earthquakes in the Utah region. Previous work found a mean ML-MC value of 0.048 ± 0.008 for 3,957 tectonic earthquakes in Utah and a mean ML-MC value of -0.137 ± 0.008 for 3,723 likely explosions in Utah (Koper et al., 2016). Their interpretation was that near-surface explosions are efficient at generating extended coda waves, perhaps because of Rg excitation and scattering, leading to anomalously high MC values. The waveforms of the likely explosions studied in Koper et al. (2016) were analyzed less rigorously than those from events initially classified as earthquakes because the primary mission of the University of Utah Seismograph Stations (UUSS) is mitigating earthquake risk in the Utah region. The epicenters are generally of high quality because they are a factor in the initial source classification process. However, for a given mine region the arrival times were picked by different analysts over a period of many years.
The relative locations can be improved by using a single analyst to repick arrival times from events in a given source region at the same time, as we do here. The magnitudes of the likely explosions studied in Koper et al. (2016) are generally of lesser quality than those of earthquakes characterized by UUSS because magnitudes are not used in the initial source classification process. The careful waveform reanalysis we do here leads to more accurate magnitudes and helps us identify any misclassified events, in turn leading to a more rigorous evaluation of ML-MC as a source classifier.

SEISMIC MONITORING IN THE UTAH REGION
The University of Utah Seismograph Stations (UUSS) operates the University of Utah Regional Seismic Network (UU; University of Utah, 1962), which, as of 30 June 2018, consisted of 39 broadband, 67 short-period, and 78 strong-motion seismometers (Figure 1). Continuous seismic data from this network are telemetered to the UUSS earthquake information center and combined with data recorded from neighboring seismic networks to detect, locate, and characterize seismic events. This process uses the ANSS Quake Management System (AQMS) software package, where ANSS is the Advanced National Seismic System. All automatically generated AQMS solutions are reviewed and refined by UUSS analysts. For AQMS solutions with magnitudes larger than M2.5–3.5, depending on the region, a duty seismologist reviews the event within 30–60 minutes and submits refined earthquake solutions to the ANSS Comprehensive Earthquake Catalog. UUSS also publishes online reports with finalized earthquake solutions on a quarterly basis (quake.utah.edu). The UUSS uses ML and MC to describe the magnitude of small earthquakes.
ML is calculated from the horizontal channels of broadband stations using the equation

$$M_L = \log_{10} A - \log_{10} A_0 + S \tag{1}$$

where A is the mean of two values, the maximum single-cycle peak-to-peak amplitude (mm) on the north-south channel divided by two and the same quantity on the east-west channel, both measured on an emulated Wood-Anderson seismograph; log10[A0] is an empirical distance correction; and S is an empirical station correction (Pechmann et al., 2007). MC is calculated from the vertical channel of short-period or filtered broadband stations using the equation

$$M_C = -2.25 + 2.32 \log_{10} \tau + 0.0023\,\Delta \tag{2}$$

where τ is the time difference in seconds from the P-wave arrival to the time that the average absolute value of the ground velocity drops below 0.01724 μm/s, and Δ is the epicentral distance in km (Pechmann et al., 2006).

An important task for UUSS analysts is determining whether an automatically detected seismic event in the AQMS database is an earthquake (either local, regional, or teleseismic), a likely explosion, or a noise trigger. Because the UUSS mission focuses on reducing earthquake risk in Utah, AQMS solutions determined to be explosions are reviewed with less stringent quality criteria and excluded from the UUSS earthquake catalog. An analyst classifies a reviewed AQMS solution as a likely explosion if its epicenter is within or very near a permitted blasting region, if it occurred during daylight hours (typically 13:00 – 03:00 GMT) when surface blasting is permitted in Utah, if its waveforms look similar to those of previous explosions in the area, or if UUSS is given prior notification from the mine operator. In the AQMS database these events are labelled either as “qb” for quarry blast or “ex” for explosion. For convenience, in this paper we refer to all such events as “LE” for likely explosion.
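As an illustration of equations (1) and (2) above, the following Python sketch evaluates ML, MC, and their difference for a single station. It is only a sketch: the function names are hypothetical, the input amplitudes, duration, and distance are made-up values, and the distance correction log10[A0] and station correction S are passed in as plain numbers because their empirical forms (Pechmann et al., 2007) are not reproduced in this chapter.

```python
import math


def local_magnitude(pp_ns_mm, pp_ew_mm, log10_a0, station_corr=0.0):
    """Equation (1): ML = log10(A) - log10(A0) + S.

    pp_ns_mm, pp_ew_mm: maximum single-cycle peak-to-peak amplitudes (mm)
    on the emulated Wood-Anderson north-south and east-west channels.
    log10_a0 and station_corr are empirical terms; the values used in the
    example below are placeholders, not calibrated corrections.
    """
    # A is half the sum of the two halved peak-to-peak amplitudes.
    a = 0.5 * (pp_ns_mm / 2.0 + pp_ew_mm / 2.0)
    return math.log10(a) - log10_a0 + station_corr


def coda_magnitude(tau_s, delta_km):
    """Equation (2): MC = -2.25 + 2.32*log10(tau) + 0.0023*delta."""
    return -2.25 + 2.32 * math.log10(tau_s) + 0.0023 * delta_km


# Hypothetical single-station measurements (illustrative only).
ml = local_magnitude(pp_ns_mm=20.0, pp_ew_mm=20.0, log10_a0=-2.0)
mc = coda_magnitude(tau_s=100.0, delta_km=50.0)
print(f"ML = {ml:.3f}, MC = {mc:.3f}, ML-MC = {ml - mc:.3f}")
```

In practice an event magnitude would combine estimates from several stations; how UUSS aggregates them is not described here, so the sketch stops at the per-station values.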
There are two types of explosions that are most prominent in Utah: disposal and testing of military weapons, and quarry blasts. The most common military explosions involve the above-ground destruction of rocket motors at the Utah Test and Training Range (Stump et al., 2007). A smaller number of military-related above-ground explosions are carried out at Dugway Proving Ground (www.dugway.army.mil). Nonmilitary surface blasts, shallow ripple-fired blasts, and shallow single-fired blasts are generally used to develop resources such as precious metals, minerals, and gravel, and are classified as quarry blasts. In 2010, a list of Utah mines with active permits for surface blasting was obtained from the Utah Division of Oil, Gas, and Mining (www.ogm.utah.gov). The list was incorporated into the AQMS system to aid UUSS analysts. If an analyst-refined event epicenter is located within either 5 km or 20 km of a permitted mine, depending on the station coverage in that area, AQMS alerts the analyst that the event might be an explosion. The AQMS-generated nearby-mine warning can bias the analyst towards an explosion label, because a nearby-mine warning frequently coincides with an actual explosion. As a result, LEs are usually confined to the areas around known, active mines. An LE that does not locate near one of these mines could have an inaccurate location from analyst error or could be associated with a mine that has infrequent or small explosions.

REANALYSIS OF LIKELY EXPLOSIONS
Since 1 October 2012, when UUSS started using AQMS software, thousands of likely explosions have been recorded. In this study, we reanalyze these events with the goal of confirming their source type and refining their magnitudes and, when appropriate, their locations. Generally, first arrival picks on LE waveforms are made with the same standards as those for earthquake waveforms.
This is because part of the initial classification process uses the proximity of the epicenter to active mines as a criterion. However, the majority of explosion magnitudes are automated AQMS solutions, which are less robust. Therefore, our reanalysis was more heavily focused on improving magnitude estimates. Because analysts work on events chronologically, as they occur, it would be natural to reanalyze the LEs in the same manner. However, reanalyzing LEs chronologically recreates the same problem the original analysts faced: the wide geographic distribution of the various sources makes event classification challenging. To mitigate this issue, and increase the accuracy of our findings, we reanalyze LEs geographically. The first step is an examination of all seismicity within a defined latitude and longitude range encompassing a known blasting region, in order to establish a qualitative understanding of the waveform differences between earthquakes and explosions in that area. This is the most valuable tool in differentiating and labelling source types correctly because earthquakes and explosions that occur in the same geologic setting generally have distinct waveforms (Figure 2). Other factors considered for source classification include preliminary epicentral location, P-wave first motions, and time of day. Unlike near-surface blasting, underground blasting is not limited to daylight hours, but these blasts are typically much smaller than surface blasts and unlikely to trigger the regional network. Other classification tools include checking for acoustic arrivals on seismic stations and communicating with mine operators for ground truth information. Reclassification results in one of five outcomes. An event that was originally classified as an earthquake, and remains classified as an earthquake upon reanalysis, is described as a true positive (TQ); if the event is reclassified as an explosion, then it is described as a false positive (FQ).
An event that was originally classified as an explosion, and kept as an explosion, is described as a true negative (TX), whereas the same event reclassified as an earthquake is described as a false negative (FX). If the reclassification of an event cannot be described with this scheme, then the event is labeled as “other” (O). Teleseismic events, noise triggers, and poorly recorded events are examples of “other” classifications.

Once the event type is determined, we reevaluate peak-to-peak amplitude measurements for ML determination, signal duration measurements for MC determination, and arrival time picks for hypocenter determination. After all picks and measurements have been reviewed, we relocate the event, recalculate the associated magnitudes, and save the event to the AQMS database. Locations are calculated using HYPOINVERSE (Klein, 2002) with a set of region-specific one-dimensional velocity models. Depth control is limited for blasts because of the rarity of discernible S waves and the low density of seismometers near blasting sites. Therefore, most explosions in our database are located with fixed depths of 2 km above sea level (h = -2 km), which is the approximate average elevation of permitted mines listed by the Utah Division of Oil, Gas, and Mining.

RESULTS

In Table 1, we summarize the outcome of the reanalysis for each of the 26 geographical clusters shown in Figure 1. We report the total number of reviewed events and subtotals in each of the five classes: TQ, FQ, TX, FX, and O. We assess the quality of the original source labels using binary classification theory. Sensitivity measures the rate of true positives and is given by TQ/(TQ+FX), which varies between 0 and 1. High sensitivity means that UUSS analysts missed very few actual earthquakes in their original analysis. Specificity measures the rate of true negatives and in our nomenclature is given by TX/(TX+FQ), which also varies between 0 and 1.
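The two ratios defined above can be written out as a short Python sketch. The function names are illustrative and do not come from AQMS or any UUSS tool, and returning NaN when a region has no events of the relevant type is an assumption made here for the undefined 0/0 case. The example counts are the Region A values quoted in this study (21 TX and 20 FQ).

```python
import math


def sensitivity(tq: int, fx: int) -> float:
    """Fraction of actual earthquakes originally labelled as earthquakes: TQ/(TQ+FX)."""
    total = tq + fx
    # Assumption: with no earthquakes at all the ratio is undefined, so return NaN.
    return tq / total if total else math.nan


def specificity(tx: int, fq: int) -> float:
    """Fraction of actual explosions originally labelled as explosions: TX/(TX+FQ)."""
    total = tx + fq
    # Assumption: with no explosions at all the ratio is undefined, so return NaN.
    return tx / total if total else math.nan


# Region A counts from the text: 21 explosions kept as explosions (TX)
# and 20 explosions originally labelled as earthquakes (FQ).
region_a_specificity = specificity(tx=21, fq=20)
print(f"Region A specificity: {region_a_specificity:.2f}")
```

Keeping the two functions separate mirrors the way the text treats earthquakes and explosions as the positive and negative classes, respectively.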
High specificity means that very few explosions were originally misclassified as earthquakes by UUSS analysts. For the 2,004 reviewed events that were not labelled O, the original analyst classifications had a sensitivity of 0.92 and a specificity of 0.96. Figure 3 illustrates the variation in sensitivity and specificity from region to region. Note that for regions with no TQs, the sensitivity is reported as 1.00.

In Table 1 we also report the number of explosions with valid magnitude estimates for both ML and MC in each region. Overall, we were able to estimate both magnitudes for 1,148 of the 1,545 explosions. For explosions occurring in sparsely instrumented portions of the Utah region it was often difficult to estimate ML because of the lack of nearby broadband stations with calibrated ML station corrections. Very small explosions were also problematic because of the relative sparsity of broadband stations. Below we give details about each geographical cluster as defined in Figure 1.

REGION A

Region A is located in the southwest corner of Utah on the border with Arizona. Station coverage is heavily biased toward the north-northeast, leading to an average azimuthal gap (AAG) of 196° in the relocations. The nearest broadband station is UU.LCMT (where UU is the network code and LCMT is the station code), which is located about 10 km east of the event cluster. Large-amplitude, low-frequency surface wave content (Rg) is characteristic of the explosion waveforms. We reviewed 81 seismic events in this cluster and identified 41 as explosions, 21 of which were originally identified as explosions (TX) and 20 of which were originally misclassified as earthquakes (FQ). This was the highest fraction of false positives for any region. While the sensitivity of the original analyst classifications was higher than average at 0.98, the specificity was the lowest in any of the 26 regions at 0.51.

REGION B

Region B is located in southern Utah about 150 km northeast of Region A.
Station coverage is again biased, to the north and west, but the coverage is more uniform than in Region A and the AAG is correspondingly lower at 125°. The nearest broadband station is UU.PKCU, which is located only 5–10 km to the east of the event cluster. Emergent, spindle-shaped waveforms are again observed for the explosion waveforms. The original classifications had a perfect sensitivity of 1.00, but again a relatively low specificity of 0.71.

REGION C

Region C is located in southwestern Utah with station coverage concentrated to the south and east, yielding an AAG of 147°. The nearest broadband station is UU.SZCU, which is located about 20 km east of the event cluster. The original analyst classifications had lower than average sensitivity (0.74). After thorough reanalysis, including conversations with mine operators, 18 events that were originally classified as explosions were changed to earthquakes. In general, the misclassified earthquakes had emergent P waves, which made it difficult to determine first motions; however, these waveforms also tended to have impulsive S waves that were absent on the explosion waveforms. This was the only region in which ML and MC estimates could be made for all the verified explosions.

REGION D

Region D is southwest of the Paradox Valley area in western Colorado, a region that experiences induced earthquakes from fluid injection (Figure 4; Block et al., 2015; Yeck et al., 2015; King et al., 2016). The Bureau of Reclamation runs a small-aperture network (RE) in the region to monitor the induced seismicity. The network began recording in 1985 and has recently been updated with 20 broadband three-component seismometers. Station RE.PV05 is within 1–2 km of the cluster of explosion sources, which are located about 20 km south-southwest of the injection well, where the induced earthquakes are concentrated. The nearest UUSS station is UU.CRLU, which is located about 10 km to the north-northwest of the explosion cluster.
It consists of three strong-motion channels and a short-period, vertical-component channel.

In comparison to Regions A–C, Region D has a much lower percentage, 18% (Table 1), of total events that are earthquakes (natural and induced). A typical explosion waveform here has large-amplitude surface waves and clear changes in frequency content over time: a higher-frequency P-wave packet is followed by a lower-frequency surface wave (Rg) train. At greater distances, these waveforms become very emergent, which contributes to less accurate locations. The emergent nature of the first arrivals resulted in 4 events initially classified as EQ being reclassified as EX. Overall, the specificity was still quite high at 0.96, and the sensitivity was 1.00, indicating that no earthquakes were missed during the original analysis.

REGION E

Similar to Region D, Region E is located in western Colorado near Paradox Valley, about 30 km east of the injection well. The nearest station, RE.PV15, is a few kilometers to the northeast of the explosions. In contrast to Region D, the waveforms from Region E have strongly impulsive first arrivals. The waveform quality, distinct geographical location, and different closest stations are what distinguished this cluster from Region D. This region exhibits a notable lack of earthquakes (natural and induced). There were no sources in this region that were originally misclassified.

REGION F

Region F is located in southwestern Utah near a region of high heat flow that is the site of a Department of Energy funded project called Frontier Observatory for Research in Geothermal Energy (FORGE; Pankow et al., 2017). UUSS station coverage is very good in this area, with four new broadband stations installed within 20 km of the explosion cluster during the last several years. Several other short-period stations have long been sited within ~40 km of the cluster. Out of 17 earthquakes and 61 explosions that were reanalyzed, none were misclassified.
The explosion waveforms in this region were quite distinct from earthquake waveforms.

REGION G

Region G in west-central Utah is an area with infrequent natural seismicity. The closest broadband station, UU.SWUT, is located about 50 km to the north-northwest of the explosion cluster. These explosion waveforms exhibit particularly impulsive P waves, high-amplitude S waves, and less distinctive Rg waves than the explosion waveforms in most other regions. The waveforms at the closest station, IMU, are very consistent, leading to few misclassified events. However, the atypical Rg waves resulted in 3 FQ reclassifications, as surface wave propagation here is more similar to that of an earthquake than an explosion. Nevertheless, specificity was high at 0.96.

REGION H

Region H is located in central Utah and exhibits three distinct sub-clusters of explosions. The nearest broadband station, UU.NLU, is located 20–40 km north of the seismicity. Station coverage is good, with the only gap to the southwest. The southernmost sub-cluster exhibits typical explosion waveforms with consistent frequency content. Events in the two northern clusters have waveforms with surface waves of longer duration and lower frequency. This discrepancy contributed to 6 FX reclassifications (missed earthquakes) and a sensitivity of only 0.84. The specificity was excellent at 0.98.

REGION I

Region I is located in a sparsely instrumented part of west-central Utah. The nearest broadband station, US.DUG, is located about 75 km to the northeast. A short-period vertical-component station, UU.FSU, is located about 5 km to the west. The explosion waveforms are very distinct from earthquake waveforms in this region. We used the P and S arrival-time moveout from the closest station, UU.FSU, to a farther station, UU.FLU, to distinguish earthquakes from explosions in Region I.
The separation of the body waves at distance allows the analyst to observe the energy of S relative to P, and make a more confident choice of event type. Sensitivity and specificity were both high.

REGION J

Region J is located in a densely instrumented part of central Utah. The nearest broadband station, UU.MPU, is located 5–10 km to the east, and a second broadband, UU.NLU, is located 10–15 km to the west. Perhaps because of the good station coverage, none of the 33 events were reclassified.

REGION K

Region K is one of two areas in our catalog that are subject to periodic weapons testing and disposal. It is located in west-central Utah and contains the U.S. Army Dugway Proving Ground (DPG). UUSS has a working relationship with DPG and has ground truth information (test confirmation and payload description) for 8 of the 9 explosions recorded in this region. UUSS is familiar with DPG explosive practices and the event depths in this region are fixed to -1.3 km, the elevation of the test site. Dugway explosions typically have an emergent first arrival and a highly energetic, long-duration surface wave train. These explosions are best recorded on station UU.DUG, deployed on DPG but about 20–30 km east of the seismic events. Though UUSS has ground truth for DPG events, high-accuracy seismic locations can be difficult to determine because of station gaps to the north and west. No events were misclassified in this region.

REGION L

Region L is in an area of central-northern Utah with dense station coverage but infrequent natural seismicity. The nearest broadband station, UU.MPU, is located about 15 km southeast of the cluster, and 5 other stations are located at shorter distances. Only 1 of the 31 events reviewed here was an earthquake. None of the events were misclassified.

REGION M

Similar to Region L, Region M is located in an area of central Utah with infrequent natural seismicity (3 EQ versus 70 EX).
The relocated epicenters cluster tightly, but with some northwest-southeast scatter. Two broadband stations, UU.NLU and UU.MPU, are located within 40 km of the event cluster. There are 7 type-O events here, which are poorly recorded, small-magnitude LEs. The EX waveforms are consistent from event to event and with other mines (emergent arrivals, frequency content, and decay rate). The low rate of natural seismicity and the consistency of the waveforms contributed to there being no misclassifications.

REGION N

Region N is located in north-central Utah in a densely instrumented part of the network. The nearest broadband, UU.NOQ, is about 15 km northwest of the seismicity, but nearly 10 other seismograph stations are located at similar or closer distances. This region encompasses a mine as well as an earthquake swarm that occurred in June 2017, leading to a perfect balance of source types (41 EQ and 41 EX). There were no misclassifications in this region.

REGION O

Region O is near Regions M and N and is an area of dense station coverage. This coverage outweighs the low magnitude of the explosions, so that ML and MC values can be calculated for 90% of the events. The first arrivals at station UU.WTU are emergent, leading to relatively dispersed epicenters for this region. UU.WTU is used in tandem with UU.GMU for discrimination based on phase moveout. Even with a low rate of seismicity, there are 3 FX reclassifications, giving a sensitivity of 0.25, the lowest in any region. The P-wave and S-wave frequency content for some of the blasts in this region looked similar to that of an earthquake, but these events were correctly classified after finding subtle similarities in waveform characteristics between the events when all were simultaneously compared.

REGION P

Region P is located in the Oquirrh mountain range in north-central Utah (Figure 5). The nearest broadband station, UU.NOQ, is 5–10 km to the north of the clustered seismicity.
Region P contains one of the largest open pit mines in the world, which is a source of frequent explosions. A variety of blasting practices are used to mine primarily porphyry copper and other precious metals such as gold and molybdenum. Dense station coverage and the relatively large magnitude of these events result in well-constrained locations. There is only 1 FQ (specificity ~1.0) because UUSS analysts are very familiar with this mine. In contrast, there are a significant number of FX reclassifications (6), leading to a relatively low sensitivity of 0.79. This is likely due to the 70:1 ratio of LEs to earthquakes in this region.

The 2,817 LEs in Region P required a different analysis strategy than the other regions. We chose to evaluate every tenth explosion in chronological order for efficiency. By selecting a subset chronologically, we expected to obtain a representative sample of epicenters, magnitudes, blasting practices, and times of day for the LEs. In this region, three stations are used to verify that an event is from this mine. UU.MID is normally the closest station to an event, and UU.CWU and UU.NOQ, while farther away, also have clear waveforms. Although only a tenth of all Region P LEs were carefully analyzed, every remaining event was briefly checked for duplication and time of day. Duplication of events occurs in other mining regions, but was most commonly observed in Region P. Occasionally, an event is large enough to trigger the AQMS automatic procedure and this automatic solution is not reviewed by an analyst. The analyst may then process a separate trigger for the same event, and two solutions will then exist in the database. This occurred for 25 events in Region P.

REGION Q

Region Q is in north-central Utah about 50 km east of Region P, but still within the densest portion of the UUSS regional network.
The broadband station UU.JLU is within the LE cluster, which is relatively dispersed compared to the tight epicentral clustering observed in other regions. Region Q events are processed as having three indistinct sub-clusters of seismic activity: two east-west lineations (sub-clusters I and II) and a southern group of events (sub-cluster III). Sub-cluster I is the northernmost subset of events within this region and its seismicity is best recorded on UU.RCJ. The explosion waveforms typically have high-amplitude, low-frequency, impulsively arriving, and quickly decaying surface waves. Events within sub-cluster II can be described as more emergent and smaller in amplitude than those in sub-cluster I. These events are often well recorded on UU.KLJ and UU.JLU. Sub-cluster III events demonstrate clear frequency changes upon the arrival of S waves. Station UU.HEB records the highest quality waveforms for these LEs. Both sensitivity (0.98) and specificity (0.93) are high in this region.

REGION R

Region R is in the northeastern corner of Utah where station density is low, leading to an AAG of 200° (Figure 6). Broadband station UU.RDMU, however, is located adjacent to the southern edge of the LE cluster. Region R contains an active phosphate mine. Natural earthquakes are infrequent in this region and all the events were correctly classified. Additionally, 99% of events had both an ML and MC measurement, making Region R one of the top two mines in the fraction of events with both ML and MC values. This is mainly owing to the large size of the events here.

This mine sets off explosives in double shots, an uncommon practice in Utah. The majority of shots have an eight-second delay, plus or minus a few seconds. Only station UU.RDMU was close enough to consistently record the double shots clearly, though they are occasionally visible at UU.VNL.
We limit our reanalysis to 67 of the 185 LEs in Region R: single blasts (very infrequent), double blasts with enough delay (~10 s) to distinguish the two shots, and double blasts that essentially overlap but have a larger second shot. The goal is to avoid double shots that are indistinguishable, since they would yield an artificially high MC. Events with an unclear number of shots, or unrecorded on UU.RDMU, or both, are not analyzed. Double shots from this mine can look similar to earthquakes; the P wave of the second shot can look like the S wave of the first shot. The analyst looks for the number of phases as distance increases. An earthquake recorded at local distances will have two distinct body wave phases (P and S), whereas a double blast would yield four phases (P and S for each blast).

REGION S

Region S is located in north-central Utah where station density is highest. The nearest broadband station, UU.CTU, is within a few kilometers of the LE cluster. Good station coverage (65% of events have ML and MC estimates) and extremely clear, consistent, and blast-like waveforms at UU.CTU contribute towards making confident classifications (specificity of 1.00). There were no natural earthquakes in Region S during the time period covered in this study.

REGION T

Region T is near Region S and includes one of the quarries that UUSS analysts are very familiar with (Figure 7). This familiarity results in no event reclassifications in this region. UU.HRU, the closest station to most of the event epicenters, and UU.RBU are used in tandem for classification. Along with Region N, Region T also encompasses an earthquake swarm, which included 7 events located southwest of the LE cluster in March 2012. Over 90% of the 200 events have both ML and MC. In general, the strong overall station coverage outweighs the small size of the events and results in well-constrained epicenters and magnitudes.
REGION U

Region U is also in the high-density portion of the UUSS regional network about 20 km northeast of Region T. The nearest broadband station, UU.CTU, is about 10 km to the east of the LE cluster, and the strong-motion station UU.MOR is essentially collocated with the LEs. Region U has slightly more earthquakes than the neighboring regions to the south (4 EQ versus 22 EX) and the LEs are quite small. These factors lead to 3 FQ classifications and a relatively low specificity of 0.86. None of the four earthquakes were previously misclassified, giving a sensitivity of 1.00. Both ML and MC estimates were available for only 9 of the 22 LEs because of their small size.

REGION V

Region V borders Region U to the east, and shares the same good station coverage. The waveforms from the 8 earthquakes in this region are distinct from the waveforms of the 14 explosions, and none of the events were reclassified. Typical explosion waveforms with large-amplitude surface waves are best observed at broadband station UU.TCU, which is a few kilometers to the east. The earthquakes in the region have impulsive S waves at UU.TCU.

REGION W

Region W is just west of the Great Salt Lake in northern Utah. It is on the western edge of the high-density portion of the UUSS seismic network and the nearest broadband station, UU.SPU, is about 40 km to the northeast. Region W is home to the Hill Air Force Base’s Utah Test and Training Range (UTTR). The UTTR is a Major Range and Test Facility run by the Department of Defense which is remote enough to evaluate weapons that could be too impactful to test elsewhere (Hedlin et al., 2012). In addition to the 178 analyzed explosions at UTTR, Region W has a distinct cluster of events about 8 km north of the main UTTR blasting platform (Figure 8). UTTR explosions from the main platform have waveforms with impulsive P waves, slightly less impulsive S waves at larger distances, and well-defined frequency changes between the two.
The cluster to the north has emergent first arrivals. The above-ground nature of these explosions means that surface waves (Rg) are not as effectively produced and discrimination is more difficult. We used stations UU.BGU, UU.SNUT, and UU.SPU for discrimination. UU.SPU records the fewest events, but is the most useful in discrimination because the S wave is most prominent. Station coverage is poor in Region W, with an AAG of 170°. Most of the UTTR explosions occur at certain times of the day during certain months of the year. Though there are 8 FQs in this region, the consistency in blasting origin time, high frequency of testing, and low rate of natural seismicity make discrimination straightforward. The availability of ground truth information regarding UTTR explosions allowed us to fix the depths at -1.5 km.

REGION X

Region X is located in a densely instrumented part of northern Utah. The nearest broadband station, UU.SPU, is about 20 km to the west. Three other stations, UU.GZU, UU.PCL, and UU.BES, are located within a few kilometers of the explosion cluster. Like Region N, Region X encompasses an earthquake swarm as well as the explosion cluster. The earthquake swarm was clearly recognized by analysts, so there are no reclassifications in this region. Seventy percent of the events in this region have an ML and MC measurement, which is likely a result of the relatively large size of the explosions.

REGION Y

Region Y is about 20 km north of Region X and is likewise densely instrumented. The closest station to the events in Region Y (UU.BCU) is close to the average epicentral centroid; however, this station is usually too noisy to contribute to discrimination. Instead, we primarily relied on UU.WVUT, UU.LTU, and UU.HONU. The LEs in this area have dominant surface waves that make explosion sources obvious. The sensitivity was 1.00 and the specificity was 0.91.

REGION Z

Region Z is located about 20 km west of Region Y in northern Utah.
The explosion cluster is about 40 km northeast of broadband station UU.SPU. Short-period station UU.LTU is essentially collocated with the LEs, but event locations are hard to constrain because first arrivals are emergent and only propagate to a few nearby stations. The three FQ reclassifications (specificity of 0.77) are likely a consequence of clear S waves in the LE waveforms.

DISCUSSION OF ML-MC OBSERVATIONS FOR UTAH EXPLOSIONS

We present the original and revised ML-MC distributions for the same group of 1,007 reanalyzed explosions in Figure 9. The revised distribution is more asymmetrical and less biased than the original distribution. The shift in ML-MC that occurred during reanalysis reflects larger revisions to MC than to ML (Figures 4–8), likely because peak-to-peak amplitude picks are simpler to make than choosing coda duration windows. The changes in MC values are mostly due to the redetermination of the signal duration, as the human-reviewed duration is often very different from the automatic solution because the automatic solution was calibrated for earthquakes, not explosions. MC values are also changed through relocation, as relocation changes the epicentral distance, but these changes are minimal relative to the change in MC from reevaluated signal duration measurements.

During reanalysis the number of explosions with both ML and MC grew to 1,148, with a mean ML-MC of -0.196 ± 0.017 (95% confidence interval). Previous work in Utah found a mean ML-MC value of -0.137 ± 0.008 for likely explosions and a mean value of 0.048 ± 0.008 for naturally occurring tectonic earthquakes, which dominantly occur at depths of 5–15 km (Koper et al., 2016). Hence, the work presented here corroborates the usefulness of ML-MC as a crustal depth discriminant in Utah, helping to distinguish extremely shallow seismic events from deeper, naturally occurring earthquakes.
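As a rough illustration of how a mean ML-MC value and its 95% confidence interval of the kind quoted above might be obtained, the Python sketch below uses a normal approximation (mean ± 1.96 standard errors). The text does not state how the intervals in this study were computed, so the formula is an assumption, the function names are illustrative, and the five ML-MC values in the example are hypothetical. The overlap check uses the two published means and half-widths quoted above.

```python
import math
import statistics


def mean_ci95(values):
    """Return (mean, half-width) of an approximate 95% confidence interval.

    Assumption: normal approximation, half-width = 1.96 * s / sqrt(n).
    For small samples a Student-t multiplier would be more appropriate.
    """
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, 1.96 * standard_error


def intervals_overlap(mean_a, half_a, mean_b, half_b):
    """True if the intervals mean_a ± half_a and mean_b ± half_b overlap."""
    return abs(mean_a - mean_b) <= half_a + half_b


# Hypothetical per-event ML-MC values for a handful of explosions.
ml_minus_mc = [-0.30, -0.10, -0.20, -0.25, -0.15]
mean, half_width = mean_ci95(ml_minus_mc)
print(f"mean ML-MC = {mean:.3f} ± {half_width:.3f}")

# Values quoted in the text: explosions (this study) versus earthquakes (Koper et al., 2016).
print(intervals_overlap(-0.196, 0.017, 0.048, 0.008))
```

The second print shows that the explosion and earthquake intervals quoted in the text are well separated, which is the sense in which ML-MC acts as a discriminant.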
It is interesting to consider how ML-MC varies regionally since there is a range of source types, propagation paths, and site effects across the Utah region. In Figure 10, we present ML-MC mean values and 95% confidence intervals for each region with at least 5 ML-MC observations. Of the 22 regions that meet this requirement, 19 have ML-MC means that are more negative than the overall earthquake mean at confidence levels above 95%. Two regions (F and U) have ML-MC 95% confidence intervals that include the earthquake range. Only Region G has an ML-MC 95% confidence interval that is more positive than the earthquake range.

ML-MC has a stronger dependence on ML than on MC (Figure 10). Events with larger ML tend to have larger, more earthquake-like values of ML-MC. We attribute this to UUSS/AQMS network processing techniques. All of the individual station MC values are averaged to calculate the event MC. Any station MC value is discarded from the calculation of the event MC if it differs by more than 0.8 magnitude units from the event MC. This leads to a distance dependence of MC. Small events are only recorded by nearby stations, and those seismograms tend to have anomalously long durations relative to typical peak-to-peak amplitudes because Rg energy, which is strong at close distances, extends duration more than it increases amplitude (MC is increased relative to ML). Large events are mostly recorded at farther stations where Rg is less observable, so waveforms tend to have more typical durations and MC and ML are in agreement. In other words, the distance dependence of the UUSS MC formula is too weak to account for the true distance-dependent change in coda energy from explosions. If these anomalously large station MC values were not eliminated from the event averages, many regions (such as P) would have more negative ML-MC values. The relatively high values of ML-MC in Regions F, U, and G noted earlier corroborate this network processing explanation.
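The station-averaging rule described above can be sketched in Python as follows. This is only an illustration of the stated 0.8 magnitude unit cutoff: the text does not say whether AQMS applies the rejection once or iteratively, so the repeat-until-stable loop is an assumption, and the function name and station values are hypothetical.

```python
def event_mc(station_mcs, max_deviation=0.8):
    """Average station MC values, discarding outliers relative to the event MC.

    Assumption: the mean is recomputed after each round of rejection until
    no remaining station differs from it by more than max_deviation.
    Returns the event MC and the list of station values that were kept.
    """
    kept = list(station_mcs)
    while kept:
        mc = sum(kept) / len(kept)
        trimmed = [m for m in kept if abs(m - mc) <= max_deviation]
        if len(trimmed) == len(kept):
            return mc, kept
        kept = trimmed
    raise ValueError("no station MC values remain after outlier rejection")


# Hypothetical small explosion: three distant stations and one nearby station
# whose long Rg-dominated duration gives an anomalously large station MC.
mc, used = event_mc([1.5, 1.6, 1.4, 2.7])
print(f"event MC = {mc:.2f} from {len(used)} stations")
```

In this example the nearby station value is rejected, which lowers the event MC and therefore makes ML-MC more positive, the effect the text attributes to network processing.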
Other source-specific factors also play a role in the regional variation of ML-MC. For instance, Regions K and W have 95% confidence intervals for ML-MC that are only slightly below the earthquake range. These are the two regions that contain above-ground explosions. The weak coupling to the solid Earth leads to less efficient Rg excitation, which in turn leads to shorter codas and more positive ML-MC values.

The clearest example of source-related ML-MC variation comes from a comparison between Regions S and T (Figure 11). These two regions are separated by only ~15 km and produce similarly sized quarry blasts, with 26 events in Region S having a mean ML of 1.24 ± 0.052, and 180 events in Region T having a mean ML of 1.21 ± 0.036. Yet, Region S has a mean ML-MC of -0.098 ± 0.046, while Region T has a mean ML-MC of -0.512 ± 0.333. Waveform differences from the two regions occur over a range of distances and azimuths (Figure 11), implying that the difference is not caused by geologic variations along source-to-receiver paths, but rather by variations in the source region.

There are two likely source-related explanations for the waveform variations observed in Regions S and T. The first explanation is that the mine in Region T may use more shot holes on average than the mine in Region S. Increasing the number of shot holes lengthens the source-time function without affecting the maximum amplitude of the explosion. As the duration of the source-time function approaches the dominant period of Rg (0.5–2 s), Rg is increasingly excited, leading to longer duration waveforms (Kim et al., 1994). The second explanation involves near-source geology. Region T is geologically heterogeneous, with deformed Mississippian limestones overlain by extensive conglomerates. It lies within the mature Wasatch Fault damage zone, along the boundary between the Salt Lake City sedimentary basin to the southwest and the Wasatch Mountains to the northeast.
Such near-source sediment is thought to enhance P-to-S conversion (Kim et al., 1994), leading to longer coda duration. Region S, in contrast, is on the axis of the ductilely deformed Parley’s Canyon Syncline, a structure composed of competent, laterally homogeneous Jurassic limestones, more distant from the basin. Determining the relative balance of these two effects requires obtaining detailed blasting patterns from the mines and performing high-frequency waveform simulation in realistic Earth models, which we leave for future work.

CONCLUSIONS

We manually refined the locations and magnitudes of 1,545 explosions that occurred in the Utah region between 1 October 2012 and 30 June 2018. ML values vary from 0.65 to 2.90 with a median of 1.58, and MC values vary from 0.35 to 3.15 with a median of 1.68. Most of the explosions are delay-fired, near-surface mine blasts, although some single-fired, above-ground blasts related to military activities are also included. The new catalog is intended to be used by researchers in verification seismology working on methods of low-yield nuclear monitoring. While methods of classifying moderate-sized (M3–5) seismic events recorded at regional distances (~200–2000 km) are well established (National Research Council, 2012), it is unclear whether the same methods are effective for small seismic events (M0–2) recorded at local distances (<200 km), or whether new discriminants need to be developed.

We used the new catalog to confirm the effectiveness of ML-MC as a depth discriminant in the Utah region. A total of 1,148 explosions in the new catalog were recorded well enough to have both MC and ML calculated. The mean ML-MC value for the explosions is -0.196 ± 0.017 (95% confidence interval), significantly more negative than the ML-MC value of 0.048 ± 0.008 previously observed for naturally occurring earthquakes in Utah (Koper et al., 2016).
We attribute the negative ML-MC values of explosions to their shallow depth, which preferentially excites local surface waves (Rg) that scatter and disperse within the low-velocity, strongly heterogeneous, shallow crust, thus generating abnormally long-duration coda waves.

We observed significant variation in ML-MC among 26 distinct explosion source regions in and around Utah. The most negative value of -0.634 ± 0.204 was observed for 9 explosions from a quarry in north-central Utah, and the most positive value of 0.218 ± 0.314 was observed for 13 explosions from a quarry in southwestern Utah. However, even nearby source regions could have significantly different values because of different blasting practices or geologic setting. Two explosion source regions near Salt Lake City that are separated by just 15 km had mean ML-MC values that differed by 0.414 ± 0.336. To achieve the best discrimination results, ML-MC observations should be combined with other, preferably non-correlated, local discriminants in a mathematically rigorous manner, perhaps similar to the methodology used to combine regional-to-teleseismic discriminants (Anderson et al., 2007).

DATA AND RESOURCES

The seismic data used in this study are publicly available from the Incorporated Research Institutions for Seismology (IRIS) Data Management Center at www.iris.edu. The program HYPOINVERSE is available from the U.S. Geological Survey (USGS) at https://earthquake.usgs.gov/research/software/. The catalog of explosion source parameters is available by email request to Keith D. Koper (koper@seis.utah.edu).

ACKNOWLEDGMENTS

This study was funded by the Air Force Research Laboratory under contract FA9453-17-C-0022. We used Generic Mapping Tools (GMT; Wessel and Smith, 1998) to make many of the figures. We thank Jim Pechmann, Kim McCarter, and two anonymous reviewers for comments and suggestions.

FIGURES AND TABLES

Figure 1.
(a) Seismometers used by the University of Utah Seismograph Stations (UUSS) between 2012 and 2018. The UUSS maintains and operates the University of Utah Regional Network (UU; solid triangles) and records data from seismographs in allied networks (open triangles) for seismic processing. (b) The 26 explosion source regions, organized from south to north. The 1,545 explosions in these regions occurred between 1 October 2012 and 30 June 2018.

Figure 2. (a) Locations of a quarry blast (red) and naturally occurring earthquake (black) that have nearly identical epicenters. The quarry blast occurred on 2014/10/14 at 20:13:22 (UTC) with magnitudes of 1.41 ML and 1.81 MC and a depth of -2.0 km. The earthquake occurred on 2017/06/25 at 21:20:10 (UTC) with magnitudes of 1.95 ML and 2.02 MC and a depth of 9.0 km. Vertical component waveforms are shown for stations (b) UU.WTU, (c) UU.GMU, and (d) UU.NOQ. All waveforms are proportional to ground velocity and have been bandpassed at 1–10 Hz. Each trace is individually normalized in the given time window. The average epicentral distance, d, from the co-located events is reported in the bottom of each panel.

Figure 3. Variation in sensitivity and specificity of the original UUSS analyst classification of source type. For sensitivity, a ratio of 1.0 means that no earthquakes were originally misclassified as explosions. For specificity, a ratio of 1.0 means that no explosions were originally misclassified as earthquakes. The total number of earthquakes and explosions reviewed in each region are listed across the top x axis.

Figure 4. (a) Map of Region D with explosions plotted to reflect their epicenters before (gray) and after (red) reanalysis. The location of Region D within the Utah region is indicated by the red star on the inset image of the state. Changes in (b) ML and (c) MC during reanalysis.

Figure 5. (a) Map of Region P with explosions plotted to reflect their epicenters before (gray) and after (red) reanalysis.
The location of Region P within the Utah region is indicated by the red star on the inset image of the state. Changes in (b) ML and (c) MC during reanalysis.

Figure 6. (a) Map of Region R with explosions plotted to reflect their epicenters before (gray) and after (red) reanalysis. The location of Region R within the Utah region is indicated by the red star on the inset image of the state. Changes in (b) ML and (c) MC during reanalysis.

Figure 7. (a) Map of Region T with explosions plotted to reflect their epicenters before (gray) and after (red) reanalysis. The location of Region T within the Utah region is indicated by the red star on the inset image of the state. Changes in (b) ML and (c) MC during reanalysis.

Figure 8. (a) Map of Region W with explosions plotted to reflect their epicenters before (gray) and after (red) reanalysis. The location of Region W within the Utah region is indicated by the red star on the inset image of the state. Changes in (b) ML and (c) MC during reanalysis.

Figure 9. Histograms of ML-MC values for the same 1,007 explosions in the Utah region (a) before reanalysis and (b) after reanalysis.

Figure 11. (a) Comparison of typical explosion waveforms from two nearby quarries in north central Utah. The (b) Region S explosion occurred on 2012/11/03 13:49:23 (UTC) and the (c) Region T explosion occurred on 2012/11/27 14:55:04 (UTC). Events from Region T generally have anomalously large MC values, while those from Region S have more earthquake-like ML-MC values. All traces are proportional to vertical-component ground velocity in a 1–10 Hz passband.

CHAPTER II

Adapted from:

Local Distance Source Discrimination Using Interpretable Machine Learning Models

Jonathan R. Voyles, Ben I. Baker, and Keith D. Koper

Dept.
of Geology and Geophysics, University of Utah, Salt Lake City, UT, 84112, USA

Accepted by University of Utah 2020 Undergraduate Research Journal

ABSTRACT

The recent maturity and accessibility of machine learning (ML) tools have encouraged scientists from various backgrounds to begin leveraging ML techniques to process and interpret their datasets. In particular, ML has recently been used in seismology for the problem of discriminating explosions from earthquakes at local distances using a state-of-the-art convolutional neural network (CNN) and an adaptation of a recurrent neural network called Long Short-Term Memory (LSTM). Here, we show that feature-engineered, simple models trained on data that are easy to obtain, process, and store perform similarly to cutting-edge, complicated models trained on data streams that require orders of magnitude more storage and extensive processing. Using only three non-waveform features for discriminating explosions from earthquakes in Utah, we train a 94.1% accurate random forest and a 93.6% accurate logistic regression. These accuracies compare favorably with CNN and LSTM testing accuracies from recent work on a similar dataset using waveform data as the features. The CNN and LSTM were 95.8% and 95.9% accurate on average for a single station, respectively, and 99.2% and 99.3% accurate on average when predictions were averaged across the network, respectively (Linville et al., 2019). The similar performance between straightforward-to-implement, interpretable models and computationally expensive models that are oftentimes more difficult to interpret suggests that simple models should be seriously considered prior to introducing more complicated black-box models. The models developed here use ML-MC, a newly proposed depth-discriminant that works at local distance, as one of the model features to evaluate how well it differentiates between earthquakes and explosions as a source-discriminant.
INTRODUCTION

Discrimination of seismic event types has historically been approached using empirical relationships like mb:MS and P/S amplitude ratios (Anderson et al., 2007). Whereas in the past, discrimination techniques were designed to work at regional to teleseismic source-receiver distances, current discrimination efforts are framed in the scope of local distances (<150-200 km) (Koper, 2019). Local distance discriminants are necessary in part because the Comprehensive Test Ban Treaty (CTBT) is a zero-tolerance treaty (Bowers and Selby, 2009). Zero-tolerance would not allow nuclear weapons tests of any size, which has resulted in the need to detect extremely small yield tests. Small yield tests generate wavefields with amplitudes too small to be detected at regional to teleseismic distances, where existing discriminants have been calibrated and are effective. Therefore, these small yield, evasive tests need to be detected at local distances, where previous discriminants do not work well due in part to more complex and higher frequency waveforms, and lack of data for testing and calibration (Kim et al., 1993).

Recent work has suggested that ML-MC might be a viable local-distance depth-discriminant because it has been shown to have depth-sensitivity for natural and induced earthquakes in Utah (Koper et al., 2016), earthquakes of various source mechanisms in different regions globally (Holt et al., 2019), and explosions in Utah (Voyles et al., 2020). While ML-MC as a depth-discriminant has been tested, the performance of ML-MC as a source discriminant still needs to be evaluated. Here, we show for the first time the effectiveness of ML-MC as a source discriminant. Moreover, we leverage the recommendations made in Voyles et al. (2020) to create the most effective source-classification model, with ML-MC as a feature.
The primary purpose of this classification model is to help regional seismic networks (RSNs) like the University of Utah Seismograph Stations (UUSS) discriminate explosions from earthquakes in routine processing, but it also has implications for forensic seismology and nuclear discrimination. Earthquake catalogs curated by RSNs are used for seismic hazard evaluation (Pankow et al., 2019) and earthquake sequence characterization, and should not be contaminated with misclassified explosions (Wiemer and Baer, 2000).

In addition to using ML-MC, we also use the event epicentral distance to the nearest active mine and the origin time as features. The proximity of an event to a known mine is a sensible discriminant because explosion epicenters typically cluster around active mines. Additionally, the UUSS sometimes receives notifications of blasting and therefore has ground truth information. The origin time is also a logical discriminant because surficial blasting is illegal in Utah during nighttime hours. Before classifying an event as an explosion, UUSS analysts check that the origin time is during daylight hours and that the epicenter is near a known mine. They also use S-wave waveform content and the polarity of first arrivals, but these features require waveform information, which is not used in this study.

ML-MC, distance to nearest mine, and origin time are all features that are readily available from routine processing. Complicated models like the CNN and LSTM developed in Linville et al. (2019) use computationally expensive time-frequency analysis inputs. Furthermore, applying these models to datasets is computationally expensive. By using simpler features and avoiding waveform data, the proposed workflow is easier and faster.
DATA AND FEATURE ENGINEERING

Between October 1, 2012, and September 20, 2019, 10,596 seismic events were recorded in the Utah authoritative region and assigned ML and MC values, of which 4,914 are earthquakes (EQ) and 5,682 are explosions (EX) (Figure 1). The earthquakes are well-labeled and 1,545 (27 percent) of the explosions are confidently labeled, meaning they are included in the Voyles et al. (2020) catalog. Linville et al. (2019) use a similar catalog of explosions and earthquakes recorded at local distances in Utah. The two different event type classes are relatively well balanced (46 percent:54 percent, EQ:EX).

The first feature developed is ML-MC, which is the difference between the network averaged local magnitude value and the network averaged coda-duration magnitude for a particular event (Figure 2). For earthquakes, the mean ML-MC value is close to zero due to MC being calibrated from ML. Conversely, for explosions, the mean ML-MC value is negative due to the extremely shallow source depths.

The second feature is proximity to a mine boundary, which requires a list of the 26 epicentral centroids of the mining regions identified in Voyles et al. (2020), combined with seven other centroids of mining regions that were not included in the Voyles et al. (2020) catalog. The epicenter of each event is compared to this list of centroids and the haversine formula is used to compute the distance from each event to the nearest of the 33 mine centroids. The mean distance to the nearest mine for explosions is much smaller than that of earthquakes and the explosion distribution has a much lower variance (Figure 3).

The third feature, time-of-day, is explored with three different methods to determine which formulation of time-of-day is most effective. First, the hour of the origin time of the event is used as the time-of-day feature. Hour-of-day varies between 0 and 23 (Figure 4). There is no hour-of-day trend in the earthquake population.
The distribution of explosion hour-of-day data is mostly constrained to daylight hours, with few outliers. Voyles et al. (2020) note that some explosions in Utah have occurred during night hours. The next method for engineering the time-of-day feature is to use binary classification to label the time-of-day as day (1) or night (2). Voyles et al. (2020) suggest that explosions occur between the hours of 13:00 and 3:00 UTC (6:00-20:00 MST) in Utah, taking into account daylight-saving time changes. Therefore, if the origin time of the event is between 13:00 UTC and 3:00 UTC, it would be classified as day, otherwise it would be classified as night (Figure 5). The earthquake day or night distribution is not split evenly between classes because the length of time each class encompasses is not the same.

The final time-of-day feature makes use of the periodicity of daylight hours. It is calculated as:

$t_p = \sin\left(\frac{\pi t}{T}\right)$ (3)

where t_p is the periodic time, t ∈ [0, 23] is the hour of the event origin time in MST, and T = 23 is the period (Figure 6). This is done to benefit linear classifier models like logistic regression, because these models struggle to define decision boundaries along nonlinear distributions, such as the hour-of-day distribution (Figure 7). A nonlinear classifier like a random forest model does not experience these types of problems. Nighttime hours like zero and 23 yield a periodic time value of 0 whereas daylight hours like 11–13 yield a time value centered around one. The periodic time distribution for earthquakes is maximum at one and extends across all periods, whereas the explosion distribution is more clustered around one.

MODELS AND RESULTS

The dataset of explosions and earthquakes is divided such that 70 percent of the events are used to fit the model coefficients while the remaining 30 percent are for evaluating model performance. This split dataset is used for all models.
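The distance and time-of-day features described in the preceding section can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the Earth radius of 6371 km, the handling of the window endpoints (13:00 inclusive, 3:00 exclusive), and the example centroid coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest_mine(lat, lon, centroids):
    """Smallest haversine distance from an epicenter to any mine centroid."""
    return min(haversine_km(lat, lon, clat, clon) for clat, clon in centroids)


def day_or_night(hour_utc):
    """Return 1 (day) for hours in the 13:00-03:00 UTC window, else 2 (night).

    The window wraps past midnight UTC; endpoint handling is an assumption.
    """
    return 1 if (hour_utc >= 13 or hour_utc < 3) else 2


def periodic_time(hour_mst, period=23):
    """Periodic time t_p = sin(pi * t / T), with t the origin hour in MST."""
    return math.sin(math.pi * hour_mst / period)


# Hypothetical mine centroids (latitude, longitude), for illustration only.
mine_centroids = [(40.52, -112.15), (39.60, -110.40)]
print(distance_to_nearest_mine(40.50, -112.00, mine_centroids))
print(day_or_night(20), day_or_night(5))
print(periodic_time(0), periodic_time(11.5))
```

Because the sine is taken over half a cycle, hours at either end of the day map to values near zero and midday hours map to values near one, so a single threshold on the periodic time separates "middle of the day" from "night" in a way that a single threshold on the raw hour cannot.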
The first model developed is logistic regression, a linear classifier (Pedregosa et al., 2011). Both logistic regression without interaction terms and logistic regression with interaction terms are trained and tested. Testing and training accuracies are computed as the ratio of correct classifications to the number of observations (Figure 8). Most misclassifications occur at the overlap between the two different source type populations. A total of nine logistic regression models (LR1 to LR9), each with a different combination and number of features, are trained and tested to see which features are controlling model performance (Figure 9, Table 1). Interaction terms are added to the logistic regression model as a way to introduce nonlinearity into the linear decision boundaries.

The best performing logistic regression model is LR1 without interaction terms, which achieves a testing accuracy of 0.936 (Table 1). Periodic time is the best performing time-of-day feature, as LR9 using periodic time yields a testing accuracy of 0.731, whereas LR7 using day or night and LR8 using hour-of-day return 0.677 and 0.491, respectively. Therefore, periodic time is used in all logistic regression models that have a time-of-day feature. The key feature driving best accuracies is distance to nearest mine, as it performed the best of all logistic regression models with only one feature. Since the distance to nearest mine dominates model performance, misclassifications are likely a function of space. LR5 using distance to mine gives a testing accuracy of 0.901 while LR6 using ML-MC returns 0.674. All of the models that use two features perform better than the models with one feature, as long as distance to nearest mine is a feature. Testing accuracies are very close to training accuracies. The addition of interaction terms in the feature space did little to improve model performance.
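To make the interaction terms and the accuracy measure concrete, here is a minimal sketch that assumes interaction terms are the pairwise products of the input features; the source does not spell out their exact form. The function names and the example feature values are hypothetical.

```python
from itertools import combinations


def add_interaction_terms(features):
    """Append every pairwise product to a feature vector.

    Assumption: "interaction terms" means products of pairs of features.
    """
    return list(features) + [a * b for a, b in combinations(features, 2)]


def accuracy(y_true, y_pred):
    """Ratio of correct classifications to the number of observations."""
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


# Hypothetical event: [ML-MC, distance to nearest mine (km), periodic time]
event = [-0.4, 2.5, 0.9]
print(add_interaction_terms(event))

# A one-feature model gains no new columns, so its fit is unchanged.
print(add_interaction_terms([-0.4]))

print(accuracy(["EQ", "EX", "EX", "EQ"], ["EQ", "EX", "EQ", "EQ"]))
```

With a single input there are no pairs to multiply, which is consistent with the one-feature models reporting identical accuracies with and without interaction terms. In scikit-learn, the library cited in the text, `PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)` generates comparable columns, although whether that transformer was used here is not stated.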
The other model created is a random forest with 100 trees, a nonlinear classifier (Pedregosa et al., 2011). A grid search is conducted using the area under the receiver operating characteristic (ROC) curve criterion for finding the optimized hyperparameters. The best performing random forest model is RF2, with a testing accuracy of 0.941 (Table 2). Hour-of-day is the most effective time-of-day manipulation in the random forest model. Feature importance is calculated for RF2. The most important feature is distance to closest mine with a relative importance of 0.627. Hour-of-day has an importance of 0.208 and ML-MC has an importance of 0.165.

Confusion matrices and ROC curves for the best logistic regression model and the best random forest model are shown in Figure 10 and Figure 11, respectively. A confusion matrix for binary classification of earthquakes (EQ) and explosions (EX) shows the number of true positives (TP: true EQ predicted as EQ), false positives (FP: true EX predicted as EQ), true negatives (TN: true EX predicted as EX), and false negatives (FN: true EQ predicted as EX) a model produces. A ROC curve is another performance measurement that shows how effectively a model is discriminating between classes. It is a plot of the true positive rate (TP/(TP+FN)) versus the false positive rate (FP/(TN+FP)). The true positive rate is the same as sensitivity and the false positive rate is just one minus the specificity. The line y=x on a ROC curve indicates that the discriminant is not more effective than chance and the more the curve bends to the upper left, the better the classifier is.

DISCUSSION

The random forest models developed here are interpretable, but logistic regression models are even easier to interpret because slopes and y intercepts of the decision boundaries are readily accessible. For example, the ML-MC decision boundary slope is negative, which indicates that as ML-MC decreases, the likelihood of that event being an explosion increases.
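One way to read the negative slope, and the confusion-matrix rates defined in the preceding section, is sketched below with a one-feature logistic model. The intercept and slope are hypothetical values chosen for illustration, not fitted coefficients from this study, and the six events are made up. Earthquakes are treated as the positive class, following the TP/FP/TN/FN convention given in the text.

```python
import math


def p_explosion(ml_minus_mc, intercept, slope):
    """Logistic probability that an event is an explosion."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * ml_minus_mc)))


def confusion_counts(y_true, y_pred, positive="EQ"):
    """Return (TP, FP, TN, FN) with earthquakes as the positive class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def roc_point(tp, fp, tn, fn):
    """True positive rate TP/(TP+FN) and false positive rate FP/(TN+FP)."""
    return tp / (tp + fn), fp / (tn + fp)


# Hypothetical coefficients: a negative slope means lower ML-MC -> more explosion-like.
INTERCEPT, SLOPE = -0.5, -4.0

# Hypothetical (ML-MC, true label) pairs.
events = [(-0.40, "EX"), (-0.25, "EX"), (-0.05, "EX"),
          (0.02, "EQ"), (0.10, "EQ"), (-0.20, "EQ")]

y_true = [label for _, label in events]
y_pred = ["EX" if p_explosion(d, INTERCEPT, SLOPE) >= 0.5 else "EQ" for d, _ in events]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
tpr, fpr = roc_point(tp, fp, tn, fn)
print(tp, fp, tn, fn, tpr, fpr)
```

Sweeping the 0.5 probability threshold from 0 to 1 and recording the (false positive rate, true positive rate) pair at each step traces out the ROC curve for this toy model.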
Since ML-MC is a proxy for depth, more negative ML-MC values coincide with shallow source depths. Explosions have source depths at or near the surface whereas earthquakes are typically much deeper, so more negative ML-MC values are attributed to explosions. Therefore, our physical understanding of the depth-discriminant matches the decision boundary output.

With respect to feature engineering time-of-day, periodic time performed best for the logistic regression models while hour-of-day was the best for the random forest models. The notion that different models perform better with different feature engineering draws attention to the need to engineer features specifically for the particular model and problem being approached. This should be related to the types of decision boundaries each model will generate, namely whether they are linear or nonlinear. The hour-of-day feature formulation has a pseudo-normal distribution for the explosions that a linear model like logistic regression has difficulty partitioning into two groups because it has to choose one time to separate earthquakes from explosions. Models like random forests can partition data into multiple groups, for example before 6:00 MST and after 20:00 MST, and consequently perform better. This is the basis for engineering the periodic time feature, as the linear logistic regression model can more easily choose a single time to separate the two different populations.

The feature importance analysis from the random forest models ranks distance to nearest mine, time-of-day, and ML-MC, in order from most to least important. This order agrees with the ranking of the best performing logistic regression models that only use one of the three features at a time. This indicates that feature importance can be correlated with model performance using the specific feature.
The accuracies reported in this paper are not given with error bounds, for the sake of clarity, but cross validation can yield constraints on accuracy error to determine if the differences in model performance are significant. Model performances for RSN applications could be even better if only explosions from Voyles et al. (2020) were used, as these explosions have been analyzed by human analysts and have trusted labels and magnitudes. Many of the explosions in the dataset were recorded with poor station coverage, which leads to more frequent event misclassification, and have questionable magnitudes that were automatically calculated.

CONCLUSIONS

By using interpretable models without waveform data, we achieve testing accuracies similar to those of complicated neural network models applied to single stations. The CNN and LSTM do perform better than our models when station averaging is used. Our best overall model performance is attributed to the random forest model with a testing accuracy of 0.941. The random forest model performs slightly better than the best logistic regression model, which has a testing accuracy of 0.936. Distance to the nearest mine is by far the most important and best performing feature, followed by time-of-day and then ML-MC, in order of decreasing importance. All three features were needed in all three models to yield the highest accuracy. The most effective way to engineer the time-of-day feature is dependent upon the model. Occam’s razor would suggest using the best logistic regression model in the future, since it performs closely to that of the random forest models but is simpler.

These results have implications for RSNs as the models developed here could significantly aid UUSS analysts to correctly identify events routinely. The models created rely on the UUSS having a list of active mine locations and Utah law ruling surface blasting at night as illegal.
Other RSNs could compile, or already have compiled, lists of active mines in their authoritative region, and legislation outlawing surface blasts at night might apply to their region as well. With this information, models developed here could be easily implemented in the operational workflow of a network. The mission of RSNs is primarily to characterize seismicity in their authoritative bounds, so the ability to easily separate explosions from earthquake catalogs is of value.

Figure 9. Results of logistic regression models with and without interaction terms. Both training and testing accuracies are reported for each of nine models with different combinations and numbers of features. Blue, orange, green, and red represent training accuracy without interaction terms, testing accuracy without interaction terms, training accuracy with interaction terms, and testing accuracy with interaction terms, respectively.

Figure 11. Receiver operating characteristic curves for the best logistic regression model and the best random forest models plotted in comparison to a reference line for a classification due to chance.

Table 1. Logistic regression model results

Model Name	Features	Train wo/ Int.	Test wo/ Int.	Train w/ Int.	Test w/ Int.
LR1	ML-MC, Dist., & Periodic Time	0.933	0.936	0.933	0.935
LR2	ML-MC & Dist.	0.911	0.915	0.908	0.913
LR3	ML-MC & Periodic Time	0.781	0.791	0.788	0.791
LR4	Periodic Time & Dist.	0.927	0.931	0.927	0.931
LR5	Dist.	0.900	0.901	0.900	0.901
LR6	ML-MC	0.675	0.674	0.675	0.674
LR7	Day or Night	0.661	0.677	0.661	0.677
LR8	Hour-of-Day	0.502	0.491	0.502	0.491
LR9	Periodic Time	0.720	0.731	0.720	0.731

* The nine logistic regression models using different combinations and numbers of features are shown in the left most column. Each model’s testing and training accuracies are shown in the other columns for the model with (w/) and without (wo/) interaction terms (int.). The models with one feature yield the same accuracies with or without interaction terms.
Table 2. Random forest model results

Model Name	Features	Testing Accuracy
RF1	ML-MC, Dist., & Day or Night	0.920
RF2	ML-MC, Dist., & Hour-of-Day	0.941
RF3	ML-MC, Dist., & Periodic Time	0.930

CHAPTER III

In preparation for submission to the Bulletin of the Seismological Society of America:

Simulated Effects of Shallow Crustal Heterogeneity, Surface Topography, and Seismic Source Depth on Coda Wave Generation for Magnitude-Based Depth Discrimination

Jonathan R. Voyles¹, Arben Pitarka², and Keith D. Koper¹

¹ Dept. of Geology and Geophysics, University of Utah, Salt Lake City, UT, 84112, USA
² Atmospheric, Earth and Energy Division and Geophysical Monitoring Program, Lawrence Livermore National Laboratory, Livermore, CA, USA

ABSTRACT

The nuclear explosion monitoring community has historically utilized seismic discrimination methods at regional-to-teleseismic distances to distinguish high-yield explosions from earthquakes. It is unclear how well such discriminants perform at local distances (<100 km). Recent studies show that a new discrimination technique based on comparing ML-MC values between small earthquakes and low-yield explosions performs well at local distances. However, more analysis of its performance is required to establish its sensitivity to source depth and source type. Previous observations suggest that coda wave excitation from local, shallow sources increases MC relative to ML and is independent of geologic setting. The main contributors to coda wave energy, which directly affect ML-MC, are low velocity surface layers, shallow crustal heterogeneity, and surface topography. In order to evaluate their separate contributions, we perform numerical simulations using one-dimensional and three-dimensional velocity models with and without small-scale variability and surface topography. These simulations are performed in the frequency range of 0.0–4.2 Hz using Seismic Waves, fourth order, or SW4.
We simulate ground motion from a tectonic earthquake and a ripple-fire explosion recorded by the University of Utah Seismograph Stations (UUSS). The events are epicentrally co-located and have ML of 1.95 and 1.41, respectively. We use a one-dimensional velocity model for the Wasatch Front in Utah and include a surficial low velocity layer to better reproduce observed shallow wave trapping. Additionally, we use the three-dimensional Wasatch Front Community Velocity Model which accounts for basin and range structures. Scattering from crustal heterogeneity is generated by random correlated velocity perturbations in the background velocity models. High resolution surface topography is also included. Comparisons between synthetic and recorded waveforms are used to determine the parameters that contribute the most to wave scattering and wave phases used in estimating ML-MC. Furthermore, the depth of the sources is varied to investigate source depth effects on wave scattering. We find that wave scattering due to shallow crustal heterogeneity depends on source depth and could be responsible for the observed difference in coda wave amplitude and duration between earthquakes and mining explosions.

INTRODUCTION

Recent work has shown that the difference between local magnitude (ML) and coda duration magnitude (MC) is a sensitive function of depth for mining-induced seismicity and tectonic earthquakes in Utah (Koper et al., 2016), earthquakes of various source types in multiple different geologic settings (Holt et al., 2019), and small mining or weapons testing and disposal explosions in Utah (Voyles et al., 2020). All three of these observational studies note that ML-MC is sensitive to depth and offer some explanation as to what is driving the correlation. Koper et al.
(2016) produced simple, one-dimensional simulations showing that, as the source depth approached the surface, the peak amplitude of the Sg wave packet was reduced more than the duration and amplitude of the coda wave packet. Since ML is sensitive to the peak amplitude of the Sg wave packet and MC is calculated from the duration of the coda wave relative to the P wave onset, they showed numerically that shallow source depths preferentially increase MC relative to ML. These simulations used a realistic one-dimensional layered earth model with low velocity layers at the surface to capture wave-guide effects. Waveforms for a single epicentral source-to-receiver distance were computed for a double-couple source at a variety of depths using Green’s functions calculated with an f-k technique (Zhu and Rivera, 2002). These waveforms were compared to equivalent synthetics calculated with the IASP91 reference model (Kennett and Engdahl, 1990) which exhibited much shorter durations because this model consists of a simple, two-layer crust that does not reflect near surface low velocity structure.

Holt et al. (2019) summarized several possible explanations for events at shallow depths exhibiting preferentially longer coda durations relative to deeper events. Near-surface, low-velocity layers act as wave guides that trap energy and extend coda (Koper et al., 2016). Volumetric scatterers from near surface heterogeneity contribute to scattering (Frankel and Clayton, 1986). Topography at the surface acts as a source of scattering (Takemura et al., 2015). These proposed explanations are all directly controllable in numerical models. The final explanation given is that shallower events could have lower stress drops and consequently a longer source duration, but this mechanism is not explicitly controllable in the simulations shown here (e.g., Trugman et al., 2017; Long, 2019; Goebel et al., 2016). Voyles et al.
(2020) use Rg, a short period Rayleigh wave generated from sources at shallow depths, to explain observations of variability in average ML-MC for mines in Utah that are all located at the surface. Although ML-MC is strongly affected by Rg, this phase is a fundamental result of wave propagation and cannot be directly controlled.

In order to expand upon the preliminary simulations of Koper et al. (2016), isolate and control the three controllable contributors to coda outlined in Holt et al. (2019), and include the effects of Rg on the wavefield noted in Voyles et al. (2020), elastic wave propagation with fourth order accuracy is used. By comparing the results of these simulations with observed earthquakes and explosions, we can constrain which coda-producing mechanisms are most important at local distances (<100 km) and high frequencies (up to 4.2 Hz). Seismic Waves, fourth order (SW4) is chosen as it allows for three-dimensional modeling with heterogeneity and topography in a high-performance computing architecture with mesh refinement to save on computational cost (Sjogreen and Petersson, 2012).

SOURCES AND MODELS

To minimize the independent variables between simulations, an epicentrally co-located earthquake and explosion pair is selected (Figure 1). The explosion occurred on 14 October 2014 at 20:13:22 (UTC) near the surface, with ML = 1.41 and MC = 1.81. The explosion occurred at an active mine in Utah that uses ripple-fire blasting patterns to break up rock for production. As an explosion, the source used is assumed to be a purely isotropic expansion. The earthquake occurred on 25 June 2017 at 21:20:10 (UTC) at a depth of 9.0 km, with ML = 1.95 and MC = 2.02. It occurred in the same region as the 2019 Bluffdale earthquake sequence. Since the earthquake occurred in close spatial proximity to the Bluffdale sequence, the focal mechanism that was calculated for the mainshock is assumed to be representative of the simulated earthquake.
The double-couple source is normal in motion, strikes nearly north-south, and is assumed to dip west due to local tectonics. Both sources are assumed to be point sources and are assigned the same Gaussian source time function.

In order to determine the individual contribution from each of the coda-producing mechanisms, it is necessary to generate a suite of models and compare them to one another. The starting point for our models is the simplest: a one-dimensional model. The University of Utah Seismograph Stations (UUSS) uses a location dependent one-dimensional velocity model to routinely calculate the hypocenters of earthquakes. We use the model from the Wasatch Front which consists of a depth profile of layer thicknesses, P-wave velocities (Vp) and S-wave velocities (Vs). This velocity profile was compared to other profiles in Utah that also estimate density (ρ), to integrate the profiles into one with Vp, Vs, and ρ. S-wave quality factor (Qs) is a function of Vs and is calculated using the cubic method of Graves and Pitarka (2004), which was determined by modeling strong ground motion data in basins from the 1994 Northridge EQ in Southern California. P-wave quality factor (Qp) is double Qs. A thin, low-velocity layer was added at the surface of the model to act as a wave guide. A more complicated model, the three-dimensional Wasatch Front Community Velocity Model (CVM), is also used to further capture the three-dimensionality of basin structures, in particular the resultant amplification from constructive interference and superposition. The Wasatch Front CVM specifies Vp, Vs, and ρ for any point within its domain. Qp and Qs were then calculated using the aforementioned methods.

To simulate near-surface heterogeneity, velocity perturbations are generated using the von Karman self-similar correlation function which perturbs the Vp and Vs, without changing Vp/Vs or ρ.
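For reference, a commonly used textbook form of the von Karman autocorrelation function is given below, together with one way to perturb Vp and Vs that leaves their ratio unchanged. The symbols are assumptions introduced here for illustration: σ is the standard deviation of the fractional perturbation ξ, a is the correlation length, κ is the Hurst exponent, Γ is the gamma function, and K_κ is the modified Bessel function of the second kind. The source does not report the parameter values used, and the exact formulation in the simulations may differ.

```latex
% von Karman autocorrelation of the fractional perturbation field \xi(\mathbf{x})
C(r) = \sigma^{2} \, \frac{2^{1-\kappa}}{\Gamma(\kappa)}
       \left(\frac{r}{a}\right)^{\kappa} K_{\kappa}\!\left(\frac{r}{a}\right)

% applying the same fractional perturbation to both velocities preserves Vp/Vs
V_p'(\mathbf{x}) = V_p(\mathbf{x})\,\bigl[1 + \xi(\mathbf{x})\bigr], \qquad
V_s'(\mathbf{x}) = V_s(\mathbf{x})\,\bigl[1 + \xi(\mathbf{x})\bigr], \qquad
\frac{V_p'}{V_s'} = \frac{V_p}{V_s}
```

Because the factor 1 + ξ multiplies both velocities, it cancels in the ratio, and density is simply left untouched.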
The stochastic perturbations are added in layers of constant perturbation magnitude, with the magnitude decreasing with depth until it reaches zero. True topography data is used to account for this contributor to scattering. A highly variable curvilinear mesh is fit from the topographic free-surface boundary and becomes flatter until it reaches a specified point where the mesh becomes Cartesian for computational ease. Topography results in large computations because the wave equation is more complicated to solve on a curvilinear mesh than on a Cartesian one. Mesh refinement is used throughout to allow for large grid-point spacings at depth and short grid-point spacings in the near surface to resolve the high frequencies required. Absorbing boundary conditions are used to avoid artificial reflections.

There are six different models generated to test the variety of possible contributors to scattering: (1) the one-dimensional Wasatch Front model with a near-surface low-velocity layer, (2) the same model as in (1) but with velocity perturbations, (3) the same model as in (1) but with topography, (4) the same model as in (1) but with both velocity perturbations and topography, (5) the three-dimensional Wasatch Front CVM, and (6) the same model as in (5) but with velocity perturbations (Figure 2). The velocity perturbations, topography, and three-dimensional Wasatch Front CVM are shown in Figure 3.

RESULTS

The resulting waveforms for the earthquake and explosion simulations using the six models previously described are compared with the recorded waveforms in Figure 4. Since this study is focused on coda duration, which is related to MC more than ML, amplitudes for each simulation were scaled by the given factors so that the waveform characteristics could be easily compared. The model that performs the best overall at emulating the recorded waveforms is model 6, the three-dimensional Wasatch Front CVM with heterogeneity.
Given their simplicity, the one-dimensional models do a good job of matching the observations. Of the one-dimensional models, model 4 with heterogeneity and topography performs the best. Model 2 with heterogeneity performs similarly to model 4, but is much faster to compute because of a strictly Cartesian grid. Model 5, the three-dimensional Wasatch Front CVM, surprisingly performs the worst, as it is the least effective at generating coda.

Side-by-side comparisons of the displacement wavefield at various times for model 6 are shown in Figure 5. It is clear that the basin structure is effective at extending the duration of the event due to more energy being trapped in the near-surface low-velocity zone. Amplitudes for this comparison are not scaled, but are quite different for each event due to geometric spreading from the contrasting source depths. Another side-by-side comparison for model 6 is given, but for divergence and curl of the wavefield (Figure 6). S-wave energy is the dominant phase in the guided waves in the near-surface low-velocity layer.

Furthermore, source depth and source-to-receiver distance are varied to observe their effects on wave propagation. Model 2 was chosen for this set of simulations because it performs well and is computationally inexpensive relative to the models with topography. Synthetic stations, shown in Figure 1, are added radially from source to receiver to observe how coda develops as a function of path. Five synthetic stations are added from the source to the station NOQ. For the explosion source type, seven different depths are simulated, from the true event depth of 50 m to the 9 km event depth of the earthquake (Figure 7). A matching set of simulations is run for the earthquake source type, but with five event depths, ranging from the true event depth of the earthquake to the depth of the explosion (Figure 8). Coda is prolific when the source depth of the explosion is within the low-velocity layer at the surface.
At depth, the explosion loses much of its coda energy and P-wave energy dominates the waveform. As source-to-receiver distance increases, the explosion waveforms become less monochromatic in frequency, more heterogeneous, and develop significant coda. The earthquake at a shallow source depth also generates significant coda.

DISCUSSION

The best one-dimensional velocity model included both perturbations and topography, but the model with topography alone (3) did not perform nearly as well as the model with perturbations alone (2). Similarly, the three-dimensional model performed poorly without the inclusion of heterogeneity, even in comparison to simpler one-dimensional models; once heterogeneity was included, it outperformed all other models. This suggests that heterogeneity is a more significant contributor to scattering than topography, at least in the study region. The topography in the study region is not weak, but it is certainly not as complicated as in other places. Additionally, the simulations with varied source depth and source-to-receiver distance demonstrate that coda is a function of the distance that the wavefield has travelled. Since topography is most influential at near-receiver locations, the notion that coda is dependent on path again suggests that scattering is the dominant mechanism. It is also possible that ML-MC is sensitive to ML, so care should be taken to preserve information concerning both magnitudes. For this reason, ML-MC should be calculated for all of the synthetic waveforms to quantitatively describe the effects of coda-generating mechanisms, source depth, and source-to-receiver distance on ML-MC.

Further modeling is needed. Sensitivity analyses on the structure and size of the velocity perturbations and on the attenuation model should be done. Coda is heavily dependent upon both of these properties, and we have more uncertainty with respect to attenuation and heterogeneity structure than we do with respect to velocity structure and topography.
Initial results show that the resulting waveforms are not extremely sensitive to the length scales and magnitude of the velocity perturbations. Simulations with only one attenuation model have been computed, but others are readily available. Additionally, the three-dimensional Wasatch Front CVM with only topography, and with both perturbations and topography, needs to be evaluated for completeness, though these runs are computationally expensive. Models without surficial low-velocity layers should also be generated, but this will radically increase the number of possible combinations of scattering mechanisms.

CONCLUSIONS

We successfully use SW4 to simulate wave propagation at local distances up to 4.2 Hz using a variety of realistic Earth models. Synthetic data roughly match observed waveforms from an earthquake and an explosion, and match especially well using the 3-D model with perturbations. Coda energy may be extended by wave guide trapping, scattering from topography, scattering from heterogeneity, enhanced Rg-wave production, and low stress drops. This study experimented with coda sensitivity to the first three mechanisms by directly adding or removing them, with the fourth mechanism tested via varying source depth. We find that wave scattering due to shallow crustal heterogeneity may be primarily responsible for the observed difference in coda wave amplitude and duration between an earthquake and an explosion.

ACKNOWLEDGEMENTS

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. All simulations were run using Livermore Computing’s resources.

Figure 2.
Shear wave velocity sections (north-south at -112.2° for (a)-(d) and east-west at 40.4° for (e) and (f); see Figure 1) for (a) a simple one-dimensional model that the UUSS uses to locate earthquakes along the Wasatch Front, (b) the model from (a) with stochastic velocity perturbations, (c) the model from (a) with true topography, (d) the model from (a) with both velocity perturbations and topography, (e) the three-dimensional Wasatch Front Community Velocity Model, and (f) the model from (e) with stochastic velocity perturbations. All models have the geologically accurate low-velocity layer near the surface.

Figure 3. Map views of (a) the velocity perturbations in Fig. 2 (b), (b) the surface topography fit by a curvilinear mesh used in Fig. 2 (c) and (d), and (c) the velocity distribution of the three-dimensional Wasatch Front CVM used in Fig. 2 (e) and (f).

Figure 8. Vertical component synthetic waveforms for the EQ source and the one-dimensional model with perturbations up to 5 Hz (Fig. 2 (b)). Source depth decreases to the right and source-to-receiver distance increases downwards. The first five stations are synthetic receivers that extend radially from the source to the receiver, NOQ. Source initiates at 0.2 seconds.

REFERENCES

Allmann, B. P., P. M. Shearer, and E. Hauksson (2008). Spectral discrimination between quarry blasts and earthquakes in southern California, Bull. Seismol. Soc. Am. 98, 2073–2079, doi: 10.1785/0120070215.
Anderson, D. N., D. K. Fagan, M. A. Tinker, G. D. Kraft, and K. D. Hutchenson (2007). Mathematical statistics formulation of the teleseismic explosion identification problem with multiple discriminants, Bull. Seismol. Soc. Am. 97, 1730–1741, doi: 10.1785/0120060052.
Arrowsmith, S. J., M. D. Arrowsmith, M. A. H. Hedlin, and B. Stump (2006). Discrimination of delay-fired mine blasts in Wyoming using an automatic time-frequency discriminant, Bull. Seismol. Soc. Am. 96, 2368–2382.
Astiz, L., J. A. Eakins, V. G. Martynov, T. A. Cox, J.
Tytell, J. C. Reyes, R. L. Newman, G. H. Karasu, T. Mulder, M. White, G. A. Davis, R. W. Busby, K. Hafner, J. C. Meyer, and F. L. Vernon (2014). The array network facility seismic bulletin: products and an unbiased view of United States seismicity, Seism. Res. Lett. 85, 576–593, doi: 10.1785/0220130141.
Baumgardt, D. R., and K. A. Ziegler (1988). Spectral evidence for source multiplicity in explosions: application to regional discrimination of earthquakes and explosions, Bull. Seismol. Soc. Am. 78, 1773–1795.
Block, L. V., C. K. Wood, W. L. Yeck, and V. M. King (2015). Induced seismicity constraints on subsurface geological structure, Paradox Valley, Colorado, Geophys. J. Int. 200, 1172–1195.
Bowers, D., and N. D. Selby (2009). Forensic seismology and the Comprehensive Nuclear-Test-Ban Treaty, Annu. Rev. Earth Planet. Sci. 37, 209–236.
Frankel, A., and R. W. Clayton (1986). Finite difference simulations of seismic scattering: Implications for the propagation of short-period seismic waves in the crust and models of crustal heterogeneity, J. Geophys. Res. 91, 6465–6489.
Gibbons, S. J., and F. Ringdal (2006). The detection of low magnitude seismic events using array-based waveform correlation, Geophys. J. Int. 165, 149–166.
Goebel, T. H. W., E. Hauksson, A. Plesch, and J. H. Shaw (2016). Detecting significant stress drop variations in large micro-earthquake datasets: A comparison between a convergent step-over in the San Andreas Fault and the Ventura Thrust Fault System, Southern California, Pure Appl. Geophys. 174, 2311–2330, doi: 10.1007/s00024-016-1326-8.
Goforth, T. T., and J. L. Bonner (1995). Characteristics of Rg waves recorded from quarry blasts in Central Texas, Bull. Seismol. Soc. Am. 85, 1232–1235.
Hedlin, M. A. H., C. de Groot-Hedlin, and D. Drob (2012). A study of infrasound propagation using dense seismic network recordings of surface explosions, Bull. Seismol. Soc. Am. 102, 1927–1937.
Holt, M. M., K. D. Koper, W. Yeck, S. d'Amico, Z.
Li, J. M. Hale, and R. Burlacu (2019). On the portability of ML-MC as a depth discriminant for small seismic events recorded at local distances, Bull. Seismol. Soc. Am. 109(5), 1661–1673.
Kennett, B. L. N., and E. R. Engdahl (1990). Traveltimes for global earthquake location and phase identification, Geophys. J. Int. 105, 429–465.
Kim, W.-Y., D. W. Simpson, and P. G. Richards (1993). Discrimination of earthquakes and explosions in the eastern United States using regional high-frequency data, Geophys. Res. Lett. 20, 1507–1510.
King, V. M., L. V. Block, and C. K. Wood (2016). Pressure/flow modeling and induced seismicity resulting from two decades of high-pressure deep-well brine injection, Paradox Valley, Colorado, Geophysics 81, 119–134.
Klein, F. W. (2002). User’s guide to HYPOINVERSE-2000, a Fortran program to solve for earthquake locations and magnitudes, open file report 2002–171, U.S. Geological Survey, 123 pp.
Koper, K. D. (2019). The importance of regional seismic networks in monitoring nuclear test-ban treaties, Seism. Res. Lett. early edition online.
Koper, K. D., J. C. Pechmann, R. Burlacu, K. L. Pankow, J. Stein, J. M. Hale, P. Roberson, and M. K. McCarter (2016). Magnitude-based discrimination of manmade seismic events from naturally occurring earthquakes in Utah, USA, Geophys. Res. Lett. 43, 10,638–10,645.
Linville, L., K. Pankow, and T. Draelos (2019). Deep learning models augment analyst decisions for event discrimination, Geophys. Res. Lett. 46, 3643–3651.
Long, L. T. (2019). The mechanics of natural and induced shallow seismicity: A review and speculation based on studies of Eastern U.S. earthquakes, Bull. Seismol. Soc. Am. 109, 336–347, doi: 10.1785/0120180134.
National Research Council (1998). Seismic Signals from Mining Operations and the Comprehensive Test Ban Treaty: Comments on a Draft Report by a Department of Energy Working Group, Washington, DC: The National Academies Press, doi: 10.17226/6226.
National Research Council (2012). The Comprehensive Nuclear Test Ban Treaty: Technical Issues for the United States, Washington, DC: The National Academies Press, doi: 10.17226/12849.
Pankow, K. L., M. Stickney, J. Y. Ben-Horin, M. Litherland, S. Payne, K. D. Koper, S. L. Bilek, and K. Bogolub (2019). Regional seismic network monitoring in the eastern intermountain west, Seism. Res. Lett. early edition online.
Pankow, K. L., S. Potter, H. Zhang, and J. Moore (2017). Local seismic monitoring at the Milford, Utah FORGE site, Geothermal Resources Council Transactions 41, 304–312.
Pechmann, J. C., J. C. Bernier, S. J. Nava, and F. M. Terra (2006). Correction of systematic time-dependent coda magnitude errors in the Utah and Yellowstone National Park region earthquake catalogs, 1981–2001, Appendix C in Arabasz, W. J., R. B. Smith, J. C. Pechmann, K. L. Pankow, and R. Burlacu, Integrated Regional and Urban Seismic Monitoring – Wasatch Front Area, Utah and Adjacent Intermountain Seismic Belt, Final Tech. Rept., U.S. Geological Survey Cooperative Agreement 04HQAG0014, 137 pp., available at http://earthquake.usgs.gov/research/external/reports/04HQAG0014.pdf
Pechmann, J. C., S. J. Nava, F. M. Terra, and J. C. Bernier (2007). Local magnitude determinations for Intermountain Seismic Belt earthquakes from broadband digital data, Bull. Seismol. Soc. Am. 97, 557–574.
Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay (2011). Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research 12, 2825–2830.
Richards, P. G., D. A. Anderson, and D. W. Simpson (1992). A survey of blasting activity in the United States, Bull. Seismol. Soc. Am. 82, 1416–1433.
Sjogreen, B., and N. A. Petersson (2012).
A fourth order accurate finite difference method for the elastic wave equation in second order formulation, Journal of Scientific Computing 52(1), 17–48, doi: 10.1007/s10915-011-9531-1.
Stump, B., R. Burlacu, C. Hayward, J. Bonner, K. Pankow, A. Fisher, and S. Nava (2007). Seismic and infrasound energy generation and propagation at local and regional distances: Phase I-Divine Strake experiment, in Proceedings of the 29th Monitoring Research Review: Ground-Based Nuclear Explosion Monitoring Technologies, LA-UR-07-5613, vol. 1, pp. 674–683, U.S. Dept. of Energy, Denver.
Stump, B. W., M. A. H. Hedlin, D. C. Pearson, and V. Hsu (2002). Characterization of mining explosions at regional distances: implications with the International Monitoring System, Rev. Geophysics 40, doi: 10.1029/1998RG000048.
Takemura, S., T. Furumura, and T. Maeda (2015). Scattering of high-frequency seismic waves caused by irregular surface topography and small-scale velocity heterogeneity, Geophys. J. Int. 201, 459–474, doi: 10.1093/gji/ggv038.
Tibi, R., K. D. Koper, K. L. Pankow, and C. J. Young (2018). Depth discrimination using Rg-to-Sg spectral amplitude ratios for seismic events in Utah recorded at local distances, Bull. Seismol. Soc. Am. 108, 1355–1368.
Trugman, D. T., S. L. Dougherty, E. S. Cochran, and P. M. Shearer (2017). Source spectral properties of small to moderate earthquakes in southern Kansas, J. Geophys. Res. 122, 1–14, doi: 10.1002/2017JB014649.
University of Utah (1962). University of Utah Regional Seismic Network. International Federation of Digital Seismograph Networks, Other/Seismic Network, doi: 10.7914/SN/UU.
Voyles, J. V., M. M. Holt, J. M. Hale, K. D. Koper, R. Burlacu, and D. J. A. Chambers (2020). A new catalog of explosion source parameters in the Utah region with application to ML-MC-based depth discrimination at local distances, Seism. Res. Lett. 91, 222–236.
Wessel, P., and W. H. F. Smith (1998).
New, improved version of generic mapping tools released, Eos Trans. AGU 79, p. 579, doi: 10.1029/98EO00426.
Wiemer, S., and M. Baer (2000). Mapping and removing quarry blast events from seismicity catalogs, Bull. Seismol. Soc. Am. 90, 525–530.
Yeck, W. L., L. V. Block, C. K. Wood, and V. M. King (2015). Maximum magnitude estimations of induced earthquakes at Paradox Valley, Colorado, from cumulative injection volume and geometry of seismicity clusters, Geophys. J. Int. 200, 322–336.
Zeiler, C., and A. A. Velasco (2009). Developing local to near-regional explosion and earthquake discriminants, Bull. Seismol. Soc. Am. 99, 24–35.
Zhu, L., and L. A. Rivera (2002). A note on the dynamic and static displacements from a point source in multilayered media, Geophys. J. Int. 148, 619–627.
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6gf6d5t