ark:/87278/s62g0ws1 · doi:10.26053/0H-Q0H9-ZF00 · University of Utah institutional repository (ir_etd)
Keywords: Analysis of Variance; Functional Data; Time Series; Weak convergence

INVARIANCE PRINCIPLES IN FUNCTIONAL TIME SERIES ANALYSIS WITH APPLICATIONS

by

Gregory Rice

A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Department of Mathematics
The University of Utah
May 2015

Copyright © Gregory Rice 2015
All Rights Reserved

THE UNIVERSITY OF UTAH GRADUATE SCHOOL
STATEMENT OF DISSERTATION APPROVAL

The dissertation of Gregory Rice has been approved by the following supervisory committee members: Lajos Horváth, Chair (date approved 3/9/2015); Davar Khoshnevisan, Member (3/10/2015); Stewart Ethier, Member (3/9/2015); Tom Alberts, Member (3/9/2015); David Kiefer, Member (3/9/2015); and by Peter Trapa, Chair of the Department of Mathematics, and by David B. Kieda, Dean of The Graduate School.
ABSTRACT

This dissertation aims to develop the theory and applications of functional time series analysis. Functional data analysis came into prominence in the 1990s when more sophisticated data collection and storage systems became prevalent, and many of the early developments focused on simple random samples of curves. However, a common source of functional data is when long, continuous records are broken into segments of smaller curves. An example of this is geologic and economic data that are presented as hourly or daily curves. In these instances, successive curves may exhibit dependencies which invalidate statistical procedures that assume a simple random sample. The theory of functional time series analysis has grown tremendously in the last decade to provide methodology for such data, and researchers have focused primarily on adapting methods available in finite dimensional time series analysis to the function space setting. As a first problem, we consider an invariance principle for the partial sum process of stationary random functions. This theory is then applied to the problems of testing for stationarity of a functional time series and the one-way functional analysis of variance problem under dependence.

For my family.

CONTENTS

ABSTRACT ........ iii
LIST OF TABLES ........ vii
LIST OF FIGURES ........ viii
ACKNOWLEDGEMENTS ........ ix

CHAPTERS

1. INTRODUCTION ........ 1
   1.1 Functional data, functional time series, and examples ........ 1
   1.2 Organization of the dissertation ........ 2
2. WEAK INVARIANCE PRINCIPLES FOR SUMS OF DEPENDENT RANDOM FUNCTIONS ........ 4
   2.1 Introduction ........ 4
   2.2 Proof of Theorem 2.1.1 ........ 6
   2.3 Some moment and maximal inequalities ........ 16
   2.4 Bibliography ........ 23
3. TESTING STATIONARITY OF FUNCTIONAL TIME SERIES ........ 25
   3.1 Introduction ........ 25
   3.2 Assumptions and test statistics ........ 28
       3.2.1 Fully functional tests ........ 30
       3.2.2 Tests based on projections ........ 32
   3.3 Asymptotic behavior under alternatives ........ 34
       3.3.1 Change in the mean alternative ........ 34
       3.3.2 The integrated alternative ........ 37
       3.3.3 Deterministic trend alternative ........ 39
   3.4 Implementation and finite sample performance ........ 39
       3.4.1 Details of the implementation ........ 39
       3.4.2 Empirical size and power ........ 41
   3.5 Application to intraday price curves ........ 43
   3.6 Proofs of the results of Section 3.2 ........ 45
   3.7 Proofs of the results of Section 3.3 ........ 47
       3.7.1 Variances of the limits in Theorem 3.3.1 ........ 48
   3.8 Bibliography ........ 58
4. TESTING EQUALITY OF MEANS WHEN THE OBSERVATIONS ARE FROM FUNCTIONAL TIME SERIES ........ 67
   4.1 Introduction ........ 67
   4.2 Main results ........ 69
   4.3 Consistency of the test statistics ........ 74
   4.4 Implementation of the test and a simulation study ........ 75
       4.4.1 Finite sample size ........ 77
       4.4.2 Power study ........ 78
   4.5 Applications: electricity demand in Adelaide, Australia ........ 79
   4.6 Proof of Theorem 4.2.1 ........ 82
   4.7 Proofs of Theorems 4.3.1 and 4.3.2 ........ 84
   4.8 Distribution of a quadratic form of normal vectors ........ 86
   4.9 Three technical lemmas ........ 87
   4.10 Bibliography ........ 91

LIST OF TABLES

3.1 Empirical size results ........ 65
3.2 Empirical power results ........ 66
4.1 Empirical size results ........ 94
4.2 Empirical size results for heterogeneous populations ........ 95
4.3 Empirical power results ........ 95
4.4 Results of weekday comparisons ........ 95

LIST OF FIGURES

1.1 Graphs of horizontal intensity measurements and daily Disney stock prices ........ 3
3.1 Cumulative intraday returns ........ 62
3.2 Disney stock intraday price curves ........ 62
3.3 P-values based on price curves ........ 63
3.4 P-values based on CIDR curves ........ 64
4.1 Electricity demand curves ........ 96
4.2 Log differenced demand curves ........ 97
4.3 Seasonal mean curves ........ 98
4.4 Weekday mean curves from summer months ........ 99

ACKNOWLEDGEMENTS

I would like to begin by thanking my family, in particular my fiancée Brittney, my mother and father Paula and Kent, my brothers Brad and Doug, and my grandparents Merle, Wanda, Robert, and Jo. Words cannot express how much you all mean to me.
I would not have accomplished a thing if not for what I learned from each of you, like the value of hard work and friendly competition, and I thank you all for that. Thank you also for your unconditional support and love.

I was just a high school football player going through the motions of school when I entered Mr. Ward's trigonometry class as a senior. It would be the last math class I would take in high school, but Mr. Ward left a lasting impression on me. He pushed me during class, realized I had some talent for the subject, and asked me to enter a district-wide mathematics contest, in which I got second place. I doubt I ever would have even considered studying mathematics if not for his encouragement. Thank you, Mr. Ward.

Among the many faculty members at Oregon State University who influenced me during my undergraduate studies, I would like to give a special thanks to Professor Edward Waymire. Professor Waymire met me weekly during my senior year to discuss and work on problems in real analysis. I feel so fortunate to have had that time to hone the mathematical skills that I find the most useful in my present work, and I always look back fondly on our meetings and conversations. Thank you, Professor Waymire, for your time, patience, and advice.

I would like to especially thank several members of the Mathematics Department at the University of Utah. Thank you, Professor Davar Khoshnevisan, for so many wonderful courses. I learned a lot about both mathematics, and how to present it, by watching you work. Thank you, Professor Yekaterina Epshteyn, for your advice and support. Thank you, Dr. Nelson Beebe, for a number of conversations on typesetting that led to an improved presentation of this thesis. I would also like to thank the graduate program coordinators during my time as a graduate student, Sandy Hiskey and Paula Tooman. Both of you were far too nice to us, and you each saved me too many times to count; thank you for that.
I have been fortunate enough to have been able to collaborate with a number of great statisticians and mathematicians during the last four years, from whom I learned a great deal. Thank you, Professors István Berkes, Marie Hušková, and Piotr Kokoszka, for many fruitful conversations about mathematics, your advice, and your support.

My final acknowledgement goes to my thesis advisor, Professor Lajos Horváth. Lajos treats his students more like his family than anything else. I was exceptionally fortunate to have had a supervisor who shared his thoughts, emphasized learning by doing, and pushed his students to succeed as Lajos did. I grew tremendously as a statistician under his tutelage. It is hard to count the hours that we spent together conversing, drinking espressos, and engaging in mathematical pursuits. I learned so much more from you than what is written in the pages below, and for that I am very grateful.

CHAPTER 1

INTRODUCTION

In this chapter, we introduce functional data analysis and functional time series analysis. Some examples of functional time series data are discussed. We then provide an outline of the problems considered and their organization in this dissertation.

1.1 Functional data, functional time series, and examples

In classical statistics, it is typically assumed that the observations are elements of R^d, and that the sample size N is larger than, or at least comparable in size to, d. Modern data, though, often exhibit such high dimensionality (d >> N) that classical statistical procedures are invalid, and this necessitates the development of new theory. One type of such data is continuous time phenomena that are observed at a high frequency. For example, the left panel of Figure 1.1 shows linearly interpolated measurements over time of the horizontal intensity of the Earth's magnetic field, a measure of the Earth's magnetic field strength, taken in 2001.
At the weather station that provided this data, the horizontal intensity was measured every minute, giving a total of 1440 measurements per day. These measurements approximate, up to a 1-minute resolution, how the horizontal intensity is changing over time. It is thus natural, in this case, to view the 1440 daily measurements as a discrete sample of an underlying daily curve that represents the horizontal intensity throughout the day. Similarly, even though the stock price of a company is only recorded each time it is traded, which constitutes millions of irregularly spaced observations per day, if one were to check the price of Disney stock at google.com/finance, it would be displayed as a continuous curve such as in the right-hand panel of Figure 1.1. It is quite common that high-dimensional and high-frequency data can be interpreted as curves or functions. Viewing data in this way is the basis of functional data analysis (FDA). What is gained by trading finite dimensional observations, like the raw data in the above examples, for in principle infinite dimensional curves is twofold. Firstly, when the data are generated by an underlying continuous phenomenon, a curve-based analysis is more appropriate since it takes advantage of this structure. Secondly, passing the data to a curve provides a more flexible framework for such data than multivariate techniques, as it does not require that there be the same number of observations per curve, nor that the observations be obtained at equally spaced time intervals. A common way in which functional data are obtained is by breaking long, continuous records into segments of shorter, for example hourly or daily, curves. The points at which to segment raw data into individual curves are often clear in the context of the data. For example, with magnetogram records, segmenting the raw data into daily curves is natural due to the effect of the Earth's rotation on the magnetic field.
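The segmentation just described is mechanical once the raw record is in memory. As an illustrative sketch (the array sizes follow the magnetometer example, but the simulated record and all variable names are hypothetical), a long minute-resolution record can be reshaped into a sample of daily curves and centered, mimicking the mean-zero functional observations assumed in the theory developed later:

```python
import numpy as np

# Hypothetical raw record: one year of minute-resolution readings stored
# as one long, continuous series (365 days x 1440 minutes per day).
rng = np.random.default_rng(0)
minutes_per_day = 1440
n_days = 365
raw_record = rng.standard_normal(n_days * minutes_per_day).cumsum()

# Segment the record into daily functional observations: row i holds the
# 1440 discrete samples of the underlying curve X_i(t) for day i.
curves = raw_record.reshape(n_days, minutes_per_day)

# Center across days, the empirical analogue of EX_0(t) = 0 for all t.
centered = curves - curves.mean(axis=0)
print(centered.shape)
```

Each row can then be smoothed or projected onto a basis; the point here is only that a single reshape turns one long record into N curves observed on a common grid.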
In these cases, successive curves may exhibit dependencies, and functional time series analysis, which combines concepts from FDA and classical time series analysis, provides a theory to model and utilize these dependencies. This dissertation aims to develop the theory of functional time series analysis and corresponding methodology.

1.2 Organization of the dissertation

This dissertation is organized into three remaining chapters. In Chapter 2, we consider an invariance principle for the partial sum process of stationary random functions that exhibit a Bernoulli shift structure. Chapter 3 develops an application of this result to testing for the stationarity of a functional time series that may be considered as an analog of the popular KPSS test. The dissertation concludes with Chapter 4, in which the one-way functional analysis of variance problem under serial dependence within each population is considered.

Figure 1.1. The graph on the left displays over 10,000 horizontal intensity measurements taken from March 1 to March 7, 2001. The green lines separate the data into daily functional observations. The graph on the right displays 25 functional observations derived from the intraday stock price of Disney; each curve represents one day of data.

CHAPTER 2

WEAK INVARIANCE PRINCIPLES FOR SUMS OF DEPENDENT RANDOM FUNCTIONS¹

Motivated by problems in functional data analysis, in this chapter we prove the weak convergence of normalized partial sums of dependent random functions exhibiting a Bernoulli shift structure.

2.1 Introduction

Functional data analysis in many cases requires central limit theorems and invariance principles for partial sums of random functions. The case of independent summands is much studied and well understood, but the theory for the dependent case is less complete.
In this chapter, we study the important class of Bernoulli shift processes, which are often used to model econometric and financial data. Let X = {X_i(t)}_{i=1}^∞ be a sequence of random functions, square integrable on [0, 1], and let ‖·‖ denote the L²[0, 1] norm. To lighten the notation, we use f for f(t) when it does not cause confusion. Throughout this chapter, we assume that X forms a sequence of Bernoulli shifts, i.e.

    X_j(t) = g(ε_j(t), ε_{j−1}(t), …) for some nonrandom measurable function g : S^∞ → L² and iid random functions ε_j(t), −∞ < j < ∞, with values in a measurable space S;    (2.1)

    ε_j(t) = ε_j(t, ω) is jointly measurable in (t, ω) (−∞ < j < ∞);    (2.2)

    EX_0(t) = 0 for all t, and E‖X_0‖^{2+δ} < ∞ for some 0 < δ < 1;    (2.3)

and

    the sequence {X_n}_{n=1}^∞ can be approximated by ℓ-dependent sequences {X_{n,ℓ}}_{n=1}^∞ in the sense that Σ_{ℓ=1}^∞ (E‖X_n − X_{n,ℓ}‖^{2+δ})^{1/κ} < ∞ for some κ > 2 + δ,    (2.4)

where X_{n,ℓ} is defined by X_{n,ℓ} = g(ε_n, ε_{n−1}, …, ε_{n−ℓ+1}, ε*_{n,ℓ}), ε*_{n,ℓ} = (ε*_{n,ℓ,n−ℓ}, ε*_{n,ℓ,n−ℓ−1}, …), and where the ε*_{n,ℓ,k}'s are independent copies of ε_0, independent of {ε_i, −∞ < i < ∞}.

[¹The content of this chapter is based on joint research with István Berkes and Lajos Horváth.]

We note that assumption (2.1) implies that X_n is a stationary and ergodic sequence. Hörmann and Kokoszka (2010) call the processes satisfying (2.1)–(2.4) L²-m-decomposable processes. The idea of approximating a stationary sequence with random variables which exhibit finite dependence first appeared in Ibragimov (1962) and is used frequently in the literature (cf. Billingsley (1968)). Aue et al. (2012) provide several examples when assumptions (2.1)–(2.4) hold, which include autoregressive, moving average, and linear processes in Hilbert spaces. Also, the nonlinear functional ARCH(1) model (cf. Hörmann et al. (2012)) and bilinear models (cf. Hörmann and Kokoszka (2010)) satisfy (2.4). We show in Section 2.2 (cf. Lemma 2.2.2) that the series in

    C(t, s) = E[X_0(t)X_0(s)] + Σ_{ℓ=1}^∞ E[X_0(t)X_ℓ(s)] + Σ_{ℓ=1}^∞ E[X_0(s)X_ℓ(t)]    (2.5)

are convergent in L².
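Although the chapter treats C(t, s) in (2.5) purely theoretically, the kernel is easy to approximate on a discrete grid. The sketch below is illustrative only (simulated data; the flat truncation of the series at a bandwidth h and all names are assumptions, not the dissertation's procedure): it estimates the long-run covariance kernel from a sample of dependent curves and then approximates the eigenvalues and eigenfunctions of the associated integral operator by an eigendecomposition of the grid matrix.

```python
import numpy as np

def long_run_covariance(curves, h):
    """Grid approximation of C(t, s) in (2.5): the lag-0 covariance plus
    the autocovariances of the first h lags in both directions, with the
    two infinite series truncated at the bandwidth h."""
    N, T = curves.shape
    X = curves - curves.mean(axis=0)      # enforce mean zero empirically
    C = X.T @ X / N                       # estimates E[X_0(t) X_0(s)]
    for lag in range(1, h + 1):
        gamma = X[:-lag].T @ X[lag:] / N  # estimates E[X_0(t) X_lag(s)]
        C += gamma + gamma.T              # the two series in (2.5)
    return C

# Illustrative dependent sample: a functional AR(1)-type recursion on a grid.
rng = np.random.default_rng(1)
N, T = 400, 50
curves = np.empty((N, T))
curves[0] = rng.standard_normal(T)
for i in range(1, N):
    curves[i] = 0.5 * curves[i - 1] + rng.standard_normal(T)

C = long_run_covariance(curves, h=5)
# Discrete analogue of the eigenproblem for the kernel C, with 1/T as the
# quadrature weight of the t-grid; eigenvalues sorted into decreasing order.
evals, evecs = np.linalg.eigh(C / T)
evals = evals[::-1]
```

With dependent curves the lag terms contribute materially; setting h = 0 recovers the ordinary covariance operator appropriate for a simple random sample of curves.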
The function C(t, s) is positive definite, and therefore there exist λ_1 ≥ λ_2 ≥ ⋯ ≥ 0 and orthonormal functions φ_i(t), 0 ≤ t ≤ 1, satisfying

    λ_i φ_i(t) = ∫ C(t, s) φ_i(s) ds, 1 ≤ i < ∞,    (2.6)

where ∫ means ∫_0^1. We define

    Γ(x, t) = Σ_{i=1}^∞ λ_i^{1/2} W_i(x) φ_i(t),

where the W_i are independent and identically distributed Wiener processes (standard Brownian motions). Clearly, Γ(x, t) is Gaussian. We show in Lemma 2.2.2 that Σ_{ℓ=1}^∞ λ_ℓ < ∞, and therefore

    sup_{0≤x≤1} ∫ Γ²(x, t) dt < ∞ a.s.

Theorem 2.1.1. If assumptions (2.1)–(2.4) hold, then for every N we can define a Gaussian process Γ_N(x, t) such that

    {Γ_N(x, t); 0 ≤ x, t ≤ 1} =_D {Γ(x, t); 0 ≤ x, t ≤ 1}

and

    sup_{0≤x≤1} ∫ (S_N(x, t) − Γ_N(x, t))² dt = o_P(1),

where

    S_N(x, t) = (1/N^{1/2}) Σ_{i=1}^{⌊Nx⌋} X_i(t).

The proof of Theorem 2.1.1 is given in Section 2.2. The proof is based on a maximal inequality which is given in Section 2.3 and is of interest in its own right.

There is a wide literature on the central limit theorem for sums of random processes in abstract spaces. For limit theorems for sums of independent Banach space valued random variables, we refer to Ledoux and Talagrand (1991). For central limit theory in the context of functional data analysis, we refer to the books of Bosq (2000) and Horváth and Kokoszka (2012). In the real valued case, the martingale approach to weak dependence was developed by Gordin (1969) and Philipp and Stout (1975), and by using such techniques, Merlevède (1996) and Dedecker and Merlevède (2003) obtained central limit theorems for a large class of dependent variables in Hilbert spaces. For some early influential results on invariance for sums of mixing variables in Banach spaces, we refer to Kuelbs and Philipp (1980), Dehling and Philipp (1982), and Dehling (1983). These papers provide very sharp results, but verifying mixing conditions is generally not easy, and without additional continuity conditions even autoregressive AR(1) processes may fail to be strong mixing (cf. Bradley (2007)).
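The partial sum process S_N(x, t) of Theorem 2.1.1 is simple to evaluate on a grid, which is how functionals of it are computed in practice. A minimal sketch follows (simulated iid mean-zero curves; the function and variable names are illustrative assumptions):

```python
import numpy as np

def partial_sum_process(curves, x_grid):
    """Evaluates S_N(x, t) = N^{-1/2} * sum_{i <= floor(N x)} X_i(t) at each
    x in x_grid; returns an array of shape (len(x_grid), T)."""
    N, T = curves.shape
    # Row k of csum is the sum of the first k curves (row 0 is zero).
    csum = np.vstack([np.zeros(T), np.cumsum(curves, axis=0)])
    idx = np.floor(N * np.asarray(x_grid)).astype(int)
    return csum[idx] / np.sqrt(N)

# Illustrative sample: N iid mean-zero curves on a T-point grid of [0, 1].
rng = np.random.default_rng(2)
N, T = 200, 100
curves = rng.standard_normal((N, T))

S = partial_sum_process(curves, np.linspace(0.0, 1.0, 101))

# Riemann-sum approximation of the L2 functional int S_N^2(x, t) dt, whose
# weak limit under the theorem is the corresponding functional of Gamma.
norms = (S ** 2).sum(axis=1) / T
```

Continuous functionals of S_N, such as the trajectory of its squared L² norm computed here, inherit the limit distribution of the same functionals of Γ, which is what drives the stationarity tests of Chapter 3.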
The weak dependence concept of Doukhan and Louhichi (1999) (cf. also Dedecker et al. (2007)) solves this difficulty, but so far this concept has not been extended to variables in Hilbert spaces. Wu (2005, 2007) proved several limit theorems for one-dimensional stationary processes having a Bernoulli shift representation. Compared to classical mixing conditions, Wu's physical dependence conditions are easier to verify in concrete cases. Condition (2.3) cannot be directly compared to the approximating martingale conditions of Wu (2005, 2007). For extensions to the Hilbert space case, we refer to Hörmann and Kokoszka (2010).

2.2 Proof of Theorem 2.1.1

The proof is based on three steps. We recall the definition of X_{i,m} from (2.4). For every fixed m, the sequence {X_{i,m}} is m-dependent. According to our first lemma, the sums of the X_i's can be approximated with the sums of m-dependent variables. The second step is the approximation of the infinite dimensional X_{i,m}'s with finite dimensional variables (Lemma 2.2.4). Then the result in Theorem 2.1.1 is established for finite dimensional m-dependent random functions (Lemma 2.2.6).

Lemma 2.2.1. If (2.1)–(2.4) hold, then for all x > 0, we have

    lim_{m→∞} limsup_{N→∞} P{ max_{1≤k≤N} (1/√N) ‖Σ_{i=1}^k (X_i − X_{i,m})‖ > x } = 0.    (2.7)

Proof. The proof of this lemma requires the maximal inequality of Theorem 2.3.2. Section 2.3 is devoted to the proof of this result. Using Theorem 2.3.2, (2.7) is an immediate consequence of Markov's inequality.

Define

    C_m(t, s) = E[X_{0,m}(t)X_{0,m}(s)] + Σ_{i=1}^m E[X_{0,m}(t)X_{i,m}(s)] + Σ_{i=1}^m E[X_{0,m}(s)X_{i,m}(t)].    (2.8)

We show in the following lemma that for every m, the function C_m is square integrable. Hence there are λ_{1,m} ≥ λ_{2,m} ≥ ⋯ ≥ 0 and corresponding orthonormal functions φ_{i,m}, i = 1, 2, …, satisfying

    λ_{i,m} φ_{i,m}(t) = ∫ C_m(t, s) φ_{i,m}(s) ds, i = 1, 2, ….

Lemma 2.2.2.
If (2.1)–(2.4) hold, then we have

    ∫∫ C²(t, s) dt ds < ∞;    (2.9)

    ∫∫ C_m²(t, s) dt ds < ∞ for all m ≥ 1;    (2.10)

    lim_{m→∞} ∫∫ (C(t, s) − C_m(t, s))² dt ds = 0;    (2.11)

    ∫ C(t, t) dt = Σ_{k=1}^∞ λ_k < ∞;    (2.12)

    ∫ C_m(t, t) dt = Σ_{k=1}^∞ λ_{k,m} < ∞;    (2.13)

and

    lim_{m→∞} ∫ C_m(t, t) dt = ∫ C(t, t) dt.    (2.14)

Proof. Using the Cauchy–Schwarz inequality for expected values, we get

    ∫∫ (E[X_0(t)X_0(s)])² dt ds ≤ ∫∫ ((EX_0²(t))^{1/2} (EX_0²(s))^{1/2})² dt ds = (E‖X_0‖²)² < ∞.

Recalling that X_0 and X_{i,i} are independent and both have mean 0, we conclude, first using the triangle inequality and then the Cauchy–Schwarz inequality for expected values, that

    { ∫∫ ( Σ_{i=1}^∞ E[X_0(t)X_i(s)] )² dt ds }^{1/2}    (2.15)
        = { ∫∫ ( Σ_{i=1}^∞ E[X_0(t)(X_i(s) − X_{i,i}(s))] )² dt ds }^{1/2}
        ≤ { ∫∫ ( Σ_{i=1}^∞ E|X_0(t)(X_i(s) − X_{i,i}(s))| )² dt ds }^{1/2}
        ≤ Σ_{i=1}^∞ { ∫∫ ( E|X_0(t)(X_i(s) − X_{i,i}(s))| )² dt ds }^{1/2}
        ≤ Σ_{i=1}^∞ { ∫∫ ( (EX_0²(t))^{1/2} (E(X_i(s) − X_{i,i}(s))²)^{1/2} )² dt ds }^{1/2}
        = { ∫ EX_0²(t) dt }^{1/2} Σ_{i=1}^∞ { ∫ E(X_i(s) − X_{i,i}(s))² ds }^{1/2}
        = (E‖X_0‖²)^{1/2} Σ_{i=1}^∞ (E‖X_0 − X_{0,i}‖²)^{1/2} < ∞

on account of (2.4). This completes the proof of (2.9).

Since EX_{0,m}(t)X_{0,m}(s) = EX_0(t)X_0(s), in order to establish (2.10), it is enough to show that

    ∫∫ ( Σ_{i=1}^m E[X_{0,m}(t)X_{i,m}(s)] )² dt ds < ∞.

It follows from the definition of X_{i,m} that the vectors (X_{0,m}, X_{i,m}) and (X_0, X_{i,m}) have the same distribution for all 1 ≤ i ≤ m. Also, (X_{i,m}, X_{i,i}) has the same distribution as (X_0, X_{0,i}), 1 ≤ i ≤ m. Hence, following the arguments in (2.15), we get

    { ∫∫ ( Σ_{i=1}^m |EX_{0,m}(t)X_{i,m}(s)| )² dt ds }^{1/2}
        = { ∫∫ ( Σ_{i=1}^m |EX_0(t)X_{i,m}(s)| )² dt ds }^{1/2}
        ≤ (E‖X_0‖²)^{1/2} Σ_{i=1}^m { ∫ E(X_{i,m}(s) − X_{i,i}(s))² ds }^{1/2}
        ≤ (E‖X_0‖²)^{1/2} Σ_{i=1}^∞ (E‖X_0 − X_{0,i}‖²)^{1/2} < ∞.

The proof of (2.10) is now complete. The arguments used above also prove (2.11).
Repeating the previous arguments, we have

    ∫ C(t, t) dt ≤ ∫ EX_0²(t) dt + 2 Σ_{i=1}^∞ ∫ |E[X_0(t)X_i(t)]| dt
        = ∫ EX_0²(t) dt + 2 Σ_{i=1}^∞ ∫ |E[X_0(t)(X_i(t) − X_{i,i}(t))]| dt
        ≤ ∫ EX_0²(t) dt + 2 Σ_{i=1}^∞ ∫ (EX_0²(t))^{1/2} (E[X_i(t) − X_{i,i}(t)]²)^{1/2} dt
        ≤ E‖X_0‖² + 2 Σ_{i=1}^∞ { ∫ EX_0²(t) dt }^{1/2} { ∫ E[X_i(t) − X_{i,i}(t)]² dt }^{1/2}
        = E‖X_0‖² + 2 (E‖X_0‖²)^{1/2} Σ_{i=1}^∞ (E‖X_0 − X_{0,i}‖²)^{1/2} < ∞.

Observing that

    ∫ C(t, t) dt = Σ_{i=1}^∞ λ_i ∫ φ_i²(t) dt = Σ_{i=1}^∞ λ_i,

the proof of (2.12) is complete. The same arguments can be used to establish (2.13). The relation in (2.14) can be established along the lines of the proof of (2.11).

By the Karhunen–Loève expansion, we have that

    X_{i,m}(t) = Σ_{ℓ=1}^∞ ⟨X_{i,m}, φ_{ℓ,m}⟩ φ_{ℓ,m}(t).    (2.16)

Define

    X_{i,m,K}(t) = Σ_{ℓ=1}^K ⟨X_{i,m}, φ_{ℓ,m}⟩ φ_{ℓ,m}(t)    (2.17)

to be the partial sums of the series in (2.16), and

    X̃_{i,m,K}(t) = X_{i,m}(t) − X_{i,m,K}(t) = Σ_{ℓ=K+1}^∞ ⟨X_{i,m}, φ_{ℓ,m}⟩ φ_{ℓ,m}(t).    (2.18)

Lemma 2.2.3. If {Z_i}_{i=1}^N are independent L²-valued random variables such that

    EZ_1(t) = 0 and E‖Z_1‖² < ∞,    (2.19)

then for all x > 0, we have that

    P{ max_{1≤k≤N} ‖Σ_{i=1}^k Z_i‖² > x } ≤ (1/x) E‖Σ_{i=1}^N Z_i‖².    (2.20)

Proof. Let F_k be the sigma algebra generated by the random variables {Z_j}_{j=1}^k. By assumption (2.19) and the independence of the Z_i's, we have that

    E( ‖Σ_{i=1}^{k+1} Z_i‖² | F_k ) = ‖Σ_{i=1}^k Z_i‖² + E‖Z_{k+1}‖² ≥ ‖Σ_{i=1}^k Z_i‖².

Therefore, {‖Σ_{i=1}^k Z_i‖²}_{k=1}^∞ is a non-negative submartingale with respect to the filtration {F_k}_{k=1}^∞. If we define

    A = { ω : max_{1≤k≤N} ‖Σ_{i=1}^k Z_i‖² > x },

then it follows from Doob's maximal inequality (Chow and Teicher, 1988, p. 247) that

    x P{ max_{1≤k≤N} ‖Σ_{i=1}^k Z_i‖² > x } ≤ E( ‖Σ_{i=1}^N Z_i‖² I_A ) ≤ E‖Σ_{i=1}^N Z_i‖²,

which completes the proof.

Lemma 2.2.4. If (2.1)–(2.4) hold, then for all x > 0,

    lim_{K→∞} limsup_{N→∞} P{ max_{1≤k≤N} (1/√N) ‖Σ_{i=1}^k X̃_{i,m,K}‖ > x } = 0.    (2.21)

Proof. Define Q_k(j) = {i : 1 ≤ i ≤ k, i = j (mod m)} for j = 0, 1, …, m−1, and all positive integers k.
It is then clear that

    Σ_{i=1}^k X̃_{i,m,K} = Σ_{j=0}^{m−1} Σ_{i∈Q_k(j)} X̃_{i,m,K}.

We thus obtain by the triangle inequality that

    P{ max_{1≤k≤N} (1/√N) ‖Σ_{i=1}^k X̃_{i,m,K}‖ > x } ≤ P{ Σ_{j=0}^{m−1} max_{1≤k≤N} (1/√N) ‖Σ_{i∈Q_k(j)} X̃_{i,m,K}‖ > x }.

It is therefore sufficient to show that for each fixed j,

    lim_{K→∞} limsup_{N→∞} P{ max_{1≤k≤N} (1/√N) ‖Σ_{i∈Q_k(j)} X̃_{i,m,K}‖ > x } = 0.

By the definition of Q_k(j), {X̃_{i,m,K}}_{i∈Q_k(j)} is an iid sequence of random variables. So, by applications of Lemma 2.2.3 and assumption (2.3), we have that

    P{ max_{1≤k≤N} ‖(1/√N) Σ_{i∈Q_k(j)} X̃_{i,m,K}‖² > x } ≤ (1/x) E‖(1/√N) Σ_{i∈Q_N(j)} X̃_{i,m,K}‖²    (2.22)
        ≤ (1/x) E‖X̃_{0,m,K}‖² = (1/x) Σ_{ℓ=K+1}^∞ λ_{ℓ,m}.

Since the right-hand side of (2.22) tends to zero as K tends to infinity, independently of N, (2.21) follows.

Clearly, with k = ⌊Nx⌋ we have

    (1/√N) Σ_{i=1}^k X_{i,m,K}(t) = Σ_{j=1}^K ( (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{j,m}⟩ ) φ_{j,m}(t).    (2.23)

Lemma 2.2.5. If (2.1)–(2.4) hold, then the K-dimensional random process

    ( (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{1,m}⟩, …, (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{K,m}⟩ )

converges, as N → ∞, in D[0, 1] to

    ( λ_{1,m}^{1/2} W_1(x), …, λ_{K,m}^{1/2} W_K(x) ),    (2.24)

where {W_i}_{i=1}^K are independent, identically distributed Wiener processes.

Proof. A similar procedure as in Lemma 2.2.4 shows that for each j, (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{j,m}⟩ can be written as a sum of sums of independent and identically distributed random variables, and thus, by Billingsley (1968), it is tight. This implies that the K-dimensional process above is tight, since it is tight in each coordinate. Furthermore, the Cramér–Wold device and the central limit theorem for m-dependent random variables (cf. DasGupta (2008), p. 119) show that the finite dimensional distributions of the vector process converge to the finite dimensional distributions of the process in (2.24). The lemma follows.

In light of the Skorohod–Dudley–Wichura theorem (cf. Shorack and Wellner (1986), p. 47), we may reformulate Lemma 2.2.5 as follows.

Corollary 2.2.1.
If (2.1)–(2.4) hold, then for each positive integer N, there exist K independent, identically distributed Wiener processes {W_{i,N}}_{i=1}^K such that for each j,

    sup_{0≤x≤1} | (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{j,m}⟩ − λ_{j,m}^{1/2} W_{j,N}(x) | →_P 0,

as N → ∞.

Lemma 2.2.6. If (2.1)–(2.4) hold, then for {W_{i,N}}_{i=1}^K defined in Corollary 2.2.1, we have that

    sup_{0≤x≤1} ∫ ( (1/√N) Σ_{i=1}^{⌊Nx⌋} X_{i,m,K}(t) − Σ_{ℓ=1}^K λ_{ℓ,m}^{1/2} W_{ℓ,N}(x) φ_{ℓ,m}(t) )² dt →_P 0,    (2.25)

as N → ∞.

Proof. By using (2.23), we get that

    (1/√N) Σ_{i=1}^{⌊Nx⌋} X_{i,m,K}(t) − Σ_{ℓ=1}^K λ_{ℓ,m}^{1/2} W_{ℓ,N}(x) φ_{ℓ,m}(t) = Σ_{ℓ=1}^K ( (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{ℓ,m}⟩ − λ_{ℓ,m}^{1/2} W_{ℓ,N}(x) ) φ_{ℓ,m}(t).

The substitution of this into the expression in (2.25), along with a simple calculation, shows that

    sup_{0≤x≤1} ∫ ( (1/√N) Σ_{i=1}^{⌊Nx⌋} X_{i,m,K}(t) − Σ_{ℓ=1}^K λ_{ℓ,m}^{1/2} W_{ℓ,N}(x) φ_{ℓ,m}(t) )² dt
        = sup_{0≤x≤1} Σ_{ℓ=1}^K ( (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{ℓ,m}⟩ − λ_{ℓ,m}^{1/2} W_{ℓ,N}(x) )²
        ≤ Σ_{ℓ=1}^K sup_{0≤x≤1} ( (1/√N) Σ_{i=1}^{⌊Nx⌋} ⟨X_{i,m}, φ_{ℓ,m}⟩ − λ_{ℓ,m}^{1/2} W_{ℓ,N}(x) )² →_P 0,

as N → ∞, by Corollary 2.2.1.

Lemma 2.2.7. If (2.1)–(2.4) hold, then

    sup_{0≤x≤1} ∫ ( Σ_{ℓ=K+1}^∞ λ_{ℓ,m}^{1/2} W_ℓ(x) φ_{ℓ,m}(t) )² dt →_P 0,    (2.26)

as K → ∞, where W_1, W_2, … are independent and identically distributed Wiener processes.

Proof. Since the functions {φ_{ℓ,m}}_{ℓ=1}^∞ are orthonormal, we have that

    E sup_{0≤x≤1} ∫ ( Σ_{ℓ=K+1}^∞ λ_{ℓ,m}^{1/2} W_ℓ(x) φ_{ℓ,m}(t) )² dt = E sup_{0≤x≤1} Σ_{ℓ=K+1}^∞ λ_{ℓ,m} W_ℓ²(x) ≤ Σ_{ℓ=K+1}^∞ λ_{ℓ,m} E sup_{0≤x≤1} W_ℓ²(x) → 0,

as K → ∞. Therefore, (2.26) follows from the Markov inequality.

Lemma 2.2.8. If (2.1)–(2.4) hold, then for each N, we can define independent, identically distributed Wiener processes {W_{ℓ,N}}_{ℓ=1}^∞ such that

    sup_{0≤x≤1} ∫ ( (1/√N) Σ_{i=1}^{⌊Nx⌋} X_{i,m}(t) − Σ_{ℓ=1}^∞ λ_{ℓ,m}^{1/2} W_{ℓ,N}(x) φ_{ℓ,m}(t) )² dt →_P 0,

as N → ∞.

Proof. It follows from Lemmas 2.2.4–2.2.7.

Since the distribution of W_{ℓ,N}, 1 ≤ ℓ < ∞, does not depend on N, it is enough to consider the asymptotics for Σ_{ℓ=1}^∞ λ_{ℓ,m}^{1/2} W_ℓ(x) φ_{ℓ,m}(t), where the W_ℓ are independent Wiener processes.

Lemma 2.2.9.
If (2.1)-(2.4) hold, then for each $m$ we can define independent and identically distributed Wiener processes $W^{*}_{\ell,m}(x)$, $1\le\ell<\infty$, such that
$$\sup_{0\le x\le1}\int\left(\sum_{\ell=1}^{\infty}\lambda_{\ell,m}^{1/2}W_{\ell}(x)\varphi_{\ell,m}(t)-\sum_{\ell=1}^{\infty}\lambda_{\ell}^{1/2}W^{*}_{\ell,m}(x)\varphi_{\ell}(t)\right)^{2}dt\xrightarrow{P}0,\qquad(2.27)$$
as $m\to\infty$.

Proof. Let
$$\Gamma_m(x,t)=\sum_{\ell=1}^{\infty}\lambda_{\ell,m}^{1/2}W_{\ell}(x)\varphi_{\ell,m}(t).$$
Let $M$ be a positive integer and define $x_i=i/M$, $0\le i\le M$. It is easy to see that $E\max_{0\le i\le M}\ldots$ For all $z>0$ we have
$$\limsup_{K\to\infty}\limsup_{m\to\infty}P\left\{\int\left(\sum_{\ell=K+1}^{\infty}\langle\Gamma_m(x,\cdot),\varphi_{\ell}\rangle\varphi_{\ell}(t)\right)^{2}dt>z\right\}=0.\qquad(2.29)$$
The joint distribution of $\langle\Gamma_m(x_i,\cdot),\varphi_{\ell}\rangle$, $1\le i\le M$, $1\le\ell\le K$, is multivariate normal with zero mean, so to establish their joint convergence in distribution it suffices to show the convergence of the covariance matrix. Using again Lemma 2.2.2, we get that
$$E\langle\Gamma_m(x_i,\cdot),\varphi_{\ell}\rangle\langle\Gamma_m(x_j,\cdot),\varphi_{k}\rangle=\min(x_i,x_j)\iint C_m(t,s)\varphi_{\ell}(t)\varphi_{k}(s)\,dt\,ds\to\min(x_i,x_j)\iint C(t,s)\varphi_{\ell}(t)\varphi_{k}(s)\,dt\,ds=\min(x_i,x_j)\,\lambda_{\ell}\,I\{k=\ell\}.$$
Due to this covariance structure and the Skorohod-Dudley-Wichura theorem (cf. Shorack and Wellner (1986), p. 47), we can find independent Wiener processes $W^{*}_{\ell,m}(x)$, $1\le\ell<\infty$, such that
$$\max_{1\le i\le M}\max_{1\le\ell\le K}\left|\langle\Gamma_m(x_i,\cdot),\varphi_{\ell}\rangle-\lambda_{\ell}^{1/2}W^{*}_{\ell,m}(x_i)\right|=o_P(1),\quad\text{as }m\to\infty.$$
Clearly, for all $0\le x\le1$,
$$E\int\left(\sum_{\ell=K+1}^{\infty}\lambda_{\ell}^{1/2}W^{*}_{\ell,m}(x)\varphi_{\ell}(t)\right)^{2}dt=x\sum_{\ell=K+1}^{\infty}\lambda_{\ell}\to0,\quad\text{as }K\to\infty,$$
and therefore, similarly to (2.29),
$$\limsup_{K\to\infty}\limsup_{m\to\infty}P\left\{\int\left(\sum_{\ell=K+1}^{\infty}\lambda_{\ell}^{1/2}W^{*}_{\ell,m}(x)\varphi_{\ell}(t)\right)^{2}dt>z\right\}=0$$
for all $z>0$. Similarly to (2.28), one can show that $E\max_{0\le i\le M}\ldots$

We define $Z_i=X_{i,m}$ if $i>m$, and $Z_i=g(\epsilon_i,\ldots,\epsilon_1,\delta_i)$ if $1\le i\le m$, where $\delta_i=(\epsilon^{*}_{i,0},\epsilon^{*}_{i,-1},\ldots)$ and the $\epsilon^{*}_{i,j}$ are iid copies of $\epsilon_0$, independent of the $\epsilon_{\ell}$'s and the $\epsilon^{*}_{k,\ell}$'s.
Clearly, $Z_i$ and $Y_0$ are independent, and thus with $Y_{i,i}=X_{i,i}-Z_i$ we have
$$E[Y_0(t)Y_i(t)]=E\big[Y_0(t)\big(Y_i(t)-Y_{i,i}(t)\big)\big].$$
Furthermore, by first applying the Cauchy-Schwarz inequality for expected values and then the Cauchy-Schwarz inequality for functions in $L^2$, we get that
$$\int\big|E[Y_0(t)(Y_i(t)-Y_{i,i}(t))]\big|\,dt\le\int\big(EY_0^2(t)\big)^{1/2}\big(E[Y_i(t)-Y_{i,i}(t)]^2\big)^{1/2}dt\le\left(\int EY_0^2(t)\,dt\right)^{1/2}\left(\int E[Y_i(t)-Y_{i,i}(t)]^2\,dt\right)^{1/2}.$$
Also,
$$\int E[Y_i(t)-Y_{i,i}(t)]^2\,dt\le2\left(\int E[X_i(t)-X_{i,i}(t)]^2\,dt+\int E[X_{i,m}(t)-Z_i(t)]^2\,dt\right).$$
The substitution of this expression into (2.32) gives that
$$E\Big\|\sum_{i=1}^{n}Y_i\Big\|^{2}\le n\int EY_0^2(t)\,dt+2^{3/2}n\sum_{i=1}^{\infty}\left(\int EY_0^2(t)\,dt\right)^{1/2}\left[\left(\int E[X_i(t)-X_{i,i}(t)]^2\,dt\right)^{1/2}+\left(\int E[X_{i,m}(t)-Z_i(t)]^2\,dt\right)^{1/2}\right]$$
$$\le n\left[\int EY_0^2(t)\,dt+2^{5/2}\left(\int EY_0^2(t)\,dt\right)^{1/2}I(2)\right],$$
which completes the proof.

Theorem 2.3.1. If (2.1)-(2.4) hold, then for all $N\ge1$,
$$E\Big\|\sum_{i=1}^{N}(X_i-X_{i,m})\Big\|^{2+\delta}\le N^{1+\delta/2}B,$$
where
$$B=E\|X_0-X_{0,m}\|^{2+\delta}+c_{\delta}^{2+\delta}\big[A^{1+\delta/2}+J_m^{2+\delta}+J_mA^{(1+\delta)/2}+A^{(1+\delta/2)\delta}J_m^{2}\big]+\big(c_{\delta}J_m^{2}\big)^{1/(1-\delta)},\qquad(2.33)$$
with $A$ defined in (2.31),
$$c_{\delta}=36\big(1-2^{-\delta/2}\big)^{-1}\qquad(2.34)$$
and
$$J_m=2\big(E\|X_0-X_{0,m}\|^{2+\delta}\big)^{(\kappa-2-\delta)/(\kappa(2+\delta))}\sum_{\ell=1}^{\infty}\big(E\|X_0-X_{0,\ell}\|^{2+\delta}\big)^{1/\kappa}.$$

Proof. We prove Theorem 2.3.1 by mathematical induction. By the definition of $B$, the inequality is obvious when $N=1$. Assume that it holds for all $k$ less than or equal to $N-1$, and that $N$ is even, i.e. $N=2n$; the case when $N$ is odd can be handled in the same way with minor modifications. Let $Y_i=X_i-X_{i,m}$. For all $i$ satisfying $n+1\le i\le2n$, we define
$$X^{*}_{i,n}=g(\epsilon_i,\epsilon_{i-1},\ldots,\epsilon_{n+1},\epsilon^{*}_{n},\epsilon^{*}_{n-1},\ldots),$$
where the $\epsilon^{*}_{j}$'s denote iid copies of $\epsilon_0$, independent of $\{\epsilon_i,\ -\infty<i<\infty\}$ and $\{\epsilon_{k,\ell},\ -\infty<k,\ell<\infty\}$. We define $Z_{i,n}=X_{i,m}$ if $m+n+1\le i\le2n$, and
$$Z_{i,n}=g(\epsilon_i,\ldots,\epsilon_{n+1},\epsilon^{*}_{n},\ldots,\epsilon^{*}_{i-m+1},\gamma_i)\quad\text{with}\quad\gamma_i=(\gamma_{i,n},\gamma_{i,n-1},\ldots),\quad\text{if }n+1\le i\le n+m,$$
where the $\gamma_{k,\ell}$'s are iid copies of $\epsilon_0$, independent of the $\epsilon_k$'s and the $\epsilon^{*}_{k,\ell}$'s. Let $Y^{*}_{i,n}=X^{*}_{i,n}-Z_{i,n}$ for $n+1\le i\le2n$. Under this definition, the sequences $\{Y_i,\ 1\le i\le n\}$ and $\{Y^{*}_{i,n},\ n+1\le i\le2n\}$ are independent and have the same distribution.
Let
$$\Delta=\sum_{i=1}^{n}Y_i+\sum_{i=n+1}^{2n}Y^{*}_{i,n}\quad\text{and}\quad\Lambda=\sum_{i=n+1}^{2n}\big(Y_i-Y^{*}_{i,n}\big).$$
By applying the triangle inequality for the $L^2$ norm and for expected values, we get
$$E\Big\|\sum_{i=1}^{2n}Y_i\Big\|^{2+\delta}=E\Big\|\sum_{i=1}^{n}Y_i+\sum_{i=n+1}^{2n}Y^{*}_{i,n}+\sum_{i=n+1}^{2n}\big(Y_i-Y^{*}_{i,n}\big)\Big\|^{2+\delta}\qquad(2.35)$$
$$\le E\big(\|\Delta\|+\|\Lambda\|\big)^{2+\delta}\le\Big[\big(E\|\Delta\|^{2+\delta}\big)^{1/(2+\delta)}+\big(E\|\Lambda\|^{2+\delta}\big)^{1/(2+\delta)}\Big]^{2+\delta}.$$
A two-term Taylor expansion gives, for all $a,b\ge0$ and $r>2$,
$$(a+b)^{r}\le a^{r}+ra^{r-1}b+\frac{r(r-1)}{2}(a+b)^{r-2}b^{2}.\qquad(2.36)$$
Since both of the expected values in the last line of (2.35) are positive, we obtain by (2.36) that
$$E\Big\|\sum_{i=1}^{2n}Y_i\Big\|^{2+\delta}\le E\|\Delta\|^{2+\delta}+(2+\delta)\big(E\|\Delta\|^{2+\delta}\big)^{(1+\delta)/(2+\delta)}\big(E\|\Lambda\|^{2+\delta}\big)^{1/(2+\delta)}\qquad(2.37)$$
$$+(2+\delta)(1+\delta)\Big[\big(E\|\Delta\|^{2+\delta}\big)^{1/(2+\delta)}+\big(E\|\Lambda\|^{2+\delta}\big)^{1/(2+\delta)}\Big]^{\delta}\big(E\|\Lambda\|^{2+\delta}\big)^{2/(2+\delta)}.$$
We proceed by bounding the terms $(E\|\Lambda\|^{2+\delta})^{1/(2+\delta)}$ and $E\|\Delta\|^{2+\delta}$ individually. Applications of the triangle inequality for the $L^2$ norm and for expected values yield that
$$\big(E\|\Lambda\|^{2+\delta}\big)^{1/(2+\delta)}=\left(E\Big\|\sum_{i=n+1}^{2n}\big(Y_i-Y^{*}_{i,n}\big)\Big\|^{2+\delta}\right)^{1/(2+\delta)}\le\left(E\Big(\sum_{i=n+1}^{2n}\|Y_i-Y^{*}_{i,n}\|\Big)^{2+\delta}\right)^{1/(2+\delta)}\le\sum_{i=n+1}^{2n}\big(E\|Y_i-Y^{*}_{i,n}\|^{2+\delta}\big)^{1/(2+\delta)}.$$
By Hölder's inequality we have, with $\kappa$ as in (2.4),
$$\big(E\|Y_i-Y^{*}_{i,n}\|^{2+\delta}\big)^{1/(2+\delta)}=\Big(E\big[\|Y_i-Y^{*}_{i,n}\|^{(2+\delta)^{2}/\kappa}\,\|Y_i-Y^{*}_{i,n}\|^{(2+\delta)-(2+\delta)^{2}/\kappa}\big]\Big)^{1/(2+\delta)}\le\big(E\|Y_i-Y^{*}_{i,n}\|^{2+\delta}\big)^{1/\kappa}\big(E\|Y_i-Y^{*}_{i,n}\|^{2+\delta}\big)^{(\kappa-2-\delta)/(\kappa(2+\delta))}.$$
It follows from the definitions of $Y_i$ and $Y^{*}_{i,n}$ and the convexity of $x^{2+\delta}$ that
$$E\|Y_i-Y^{*}_{i,n}\|^{2+\delta}\le2^{1+\delta}\big(E\|X_i-X^{*}_{i,n}\|^{2+\delta}+E\|X_{i,m}-Z_{i,n}\|^{2+\delta}\big)\le2^{2+\delta}E\|X_0-X_{0,i-n}\|^{2+\delta}$$
and
$$E\|Y_i-Y^{*}_{i,n}\|^{2+\delta}\le2^{1+\delta}\big(E\|X_i-X_{i,m}\|^{2+\delta}+E\|X^{*}_{i,n}-Z_{i,n}\|^{2+\delta}\big)\le2^{2+\delta}E\|X_0-X_{0,m}\|^{2+\delta}.$$
Thus we get
$$\big(E\|\Lambda\|^{2+\delta}\big)^{1/(2+\delta)}\le2\big(E\|X_0-X_{0,m}\|^{2+\delta}\big)^{(\kappa-2-\delta)/(\kappa(2+\delta))}\sum_{\ell=1}^{\infty}\big(E\|X_0-X_{0,\ell}\|^{2+\delta}\big)^{1/\kappa}=J_m.$$
To bound $E\|\Delta\|^{2+\delta}$, note that $\sum_{i=1}^{n}Y_i$ and $\sum_{i=n+1}^{2n}Y^{*}_{i,n}$ are independent and have the same distribution, so by Lemma 2.3.2, Remark 2.3.1, and the inductive assumption,
$$E\|\Delta\|^{2+\delta}=E\Big\|\sum_{i=1}^{n}Y_i+\sum_{i=n+1}^{2n}Y^{*}_{i,n}\Big\|^{2+\delta}\le2E\Big\|\sum_{i=1}^{n}Y_i\Big\|^{2+\delta}+2\left(E\Big\|\sum_{i=1}^{n}Y_i\Big\|^{2}\right)^{1+\delta/2}\le2n^{1+\delta/2}B+2(nA)^{1+\delta/2}.$$
The substitution of these two bounds into (2.37) gives that
$$E\Big\|\sum_{i=1}^{2n}Y_i\Big\|^{2+\delta}\le2n^{1+\delta/2}B+2(nA)^{1+\delta/2}\qquad(2.38)$$
$$+(2+\delta)\big[2n^{1+\delta/2}B+2(nA)^{1+\delta/2}\big]^{(1+\delta)/(2+\delta)}J_m+(2+\delta)(1+\delta)\big[2n^{1+\delta/2}B+2(nA)^{1+\delta/2}+J_m\big]^{\delta}J_m^{2}.$$
Furthermore, by the definition of $B$, we may further bound each summand on the right-hand side of (2.38). We obtain for the first two terms that
$$2n^{1+\delta/2}B+2(nA)^{1+\delta/2}\le(2n)^{1+\delta/2}B\left[2^{-\delta/2}+\frac{A^{1+\delta/2}}{B}\right]\le(2n)^{1+\delta/2}B\big[2^{-\delta/2}+6c_{\delta}^{-1}\big].$$
A similar factoring procedure applied to the expression in the second line of (2.38) yields that
$$(2+\delta)\big[2n^{1+\delta/2}B+2(nA)^{1+\delta/2}\big]^{(1+\delta)/(2+\delta)}J_m\le6\Big[\big(n^{1+\delta/2}B\big)^{(1+\delta)/(2+\delta)}+(nA)^{(1+\delta/2)(1+\delta)/(2+\delta)}\Big]J_m$$
$$\le(2n)^{1+\delta/2}B\left[\frac{6J_m}{B^{1/(2+\delta)}}+\frac{6J_mA^{(1+\delta)/2}}{B}\right]\le(2n)^{1+\delta/2}B\big[12c_{\delta}^{-1}\big].$$
Since $0<\delta<1$, the expression in the third line of (2.38) may be broken into three separate terms:
$$(2+\delta)(1+\delta)\big[2n^{1+\delta/2}B+2(nA)^{1+\delta/2}+J_m\big]^{\delta}J_m^{2}\le6\big(2n^{1+\delta/2}B\big)^{\delta}J_m^{2}+6\big(2(nA)^{1+\delta/2}\big)^{\delta}J_m^{2}+6J_m^{2+\delta}.$$
Furthermore, by again applying the definition of $B$, we have that
$$6\big(2n^{1+\delta/2}B\big)^{\delta}J_m^{2}=(2n)^{1+\delta/2}B\left[\frac{6\big(2n^{1+\delta/2}B\big)^{\delta}J_m^{2}}{(2n)^{1+\delta/2}B}\right]\le(2n)^{1+\delta/2}B\,\frac{6J_m^{2}}{B^{1-\delta}}\le(2n)^{1+\delta/2}B\big[6c_{\delta}^{-1}\big],$$
$$6\big(2(nA)^{1+\delta/2}\big)^{\delta}J_m^{2}=(2n)^{1+\delta/2}B\left[\frac{6\big(2(nA)^{1+\delta/2}\big)^{\delta}J_m^{2}}{(2n)^{1+\delta/2}B}\right]\le(2n)^{1+\delta/2}B\left[\frac{6A^{(1+\delta/2)\delta}J_m^{2}}{B}\right]\le(2n)^{1+\delta/2}B\big[6c_{\delta}^{-1}\big],$$
and
$$6J_m^{2+\delta}=(2n)^{1+\delta/2}B\,\frac{6J_m^{2+\delta}}{(2n)^{1+\delta/2}B}\le(2n)^{1+\delta/2}B\,\frac{6J_m^{2+\delta}}{B}\le(2n)^{1+\delta/2}B\big[6c_{\delta}^{-1}\big].$$
The application of these bounds to the right-hand side of (2.38) gives that
$$E\Big\|\sum_{i=1}^{2n}Y_i\Big\|^{2+\delta}\le(2n)^{1+\delta/2}B\big[2^{-\delta/2}+36c_{\delta}^{-1}\big]=(2n)^{1+\delta/2}B,$$
which concludes the induction step and thus the proof.

Theorem 2.3.2. If (2.1)-(2.4) hold, then we have
$$E\max_{1\le k\le N}\Big\|\sum_{i=1}^{k}(X_i-X_{i,m})\Big\|^{2+\delta}\le a_mN^{1+\delta/2}\qquad(2.39)$$
with some sequence $a_m$ satisfying $a_m\to0$ as $m\to\infty$.

Proof. By examining the proofs, it is evident that Theorem 3.1 of Móricz et al. (1982) holds for $L^2$-valued random variables. Furthermore, by the stationarity of the sequence $\{X_i-X_{i,m}\}_{i=1}^{\infty}$ and Theorem 2.3.1, the conditions of that theorem are satisfied, and therefore
$$E\max_{1\le k\le N}\Big\|\sum_{i=1}^{k}(X_i-X_{i,m})\Big\|^{2+\delta}\le c^{*}_{\delta}N^{1+\delta/2}B,$$
with some constant $c^{*}_{\delta}$ depending only on $\delta$, where $B$ is defined in (2.33). Observing that $B=B_m\to0$ as $m\to\infty$, the result is proven.

Theorem 2.3.1 provides an inequality for the moments of the norm of partial sums of the differences $X_i-X_{i,m}$, which are not themselves Bernoulli shifts.
However, checking the proof of Theorem 2.3.1, we get the following result for Bernoulli shifts.

Theorem 2.3.3. If (2.1) and (2.3) are satisfied and $X$ is a Bernoulli shift satisfying
$$I(2+\delta)=\sum_{\ell=1}^{\infty}\big(E\|X_0-X_{0,\ell}\|^{2+\delta}\big)^{1/(2+\delta)}<\infty\quad\text{with some }0<\delta<1,$$
where $X_{0,\ell}$ is defined by (2.4), then for all $N\ge1$,
$$E\Big\|\sum_{i=1}^{N}X_i\Big\|^{2+\delta}\le N^{1+\delta/2}B^{*},$$
where
$$B^{*}=E\|X_0\|^{2+\delta}+c_{\delta}^{2+\delta}\big[A^{1+\delta/2}+I^{2+\delta}(2+\delta)+I(2+\delta)A^{(1+\delta)/2}+A^{(1+\delta/2)\delta}I^{2}(2+\delta)\big]+\big(c_{\delta}I^{2}(2+\delta)\big)^{1/(1-\delta)},$$
$$A=\int EX_0^2(t)\,dt+2\left(\int EX_0^2(t)\,dt\right)^{1/2}I(2),$$
$c_{\delta}$ is defined in (2.34), and $I(2)$ in (2.30).

Remark 2.3.2. The inequality in Theorem 2.3.1 is an extension of Proposition 4 of Berkes et al. (2011) to random variables in Hilbert spaces; we have computed explicitly how $B$ depends on the distribution of $X$.

Acknowledgement. We are grateful to Professor Herold Dehling for pointing out several important references.

2.4 Bibliography

[1] Aue, A., Hörmann, S., Horváth, L., Hušková, M. and Steinebach, J.: Sequential testing for the stability of portfolio betas. Econometric Theory, 28 (2012), 804-837.
[2] Berkes, I., Hörmann, S. and Schauer, J.: Split invariance principles for stationary processes. Annals of Probability, 39 (2011), 2441-2473.
[3] Billingsley, P.: Convergence of Probability Measures. Wiley, New York, 1968.
[4] Bosq, D.: Linear Processes in Function Spaces: Theory and Applications. Lecture Notes in Statistics, 149. Springer-Verlag, New York, 2000.
[5] Bradley, R.C.: Introduction to Strong Mixing Conditions I-III. Kendrick Press, Heber City, UT, 2007.
[6] Chow, Y. and Teicher, H.: Probability Theory: Independence, Interchangeability, Martingales. Springer-Verlag, Heidelberg, 1978.
[7] DasGupta, A.: Asymptotic Theory of Statistics and Probability. Springer, New York, 2008.
[8] Dedecker, J., Doukhan, P., Lang, G., León, J.R., Louhichi, S. and Prieur, C.: Weak Dependence: With Examples and Applications. Lecture Notes in Statistics, 190. Springer, New York, 2007.
[9] Dedecker, J. and Merlevède, F.: The conditional central limit theorem in Hilbert spaces. Stochastic Processes and their Applications, 108 (2003), 229-262.
[10] Dehling, H.: Limit theorems for sums of weakly dependent Banach space valued random variables. Z. Wahrsch. Verw. Gebiete, 63 (1983), 393-432.
[11] Dehling, H. and Philipp, W.: Almost sure invariance principles for weakly dependent vector-valued random variables. Annals of Probability, 10 (1982), 689-701.
[12] Doob, J.: Stochastic Processes. Wiley, New York, 1953.
[13] Doukhan, P. and Louhichi, S.: A new weak dependence condition and applications to moment inequalities. Stochastic Processes and their Applications, 84 (1999), 313-342.
[14] Eberlein, E.: An invariance principle for lattices of dependent random variables. Z. Wahrsch. Verw. Gebiete, 50 (1979), 119-133.
[15] Garsia, A.M.: Continuity properties of Gaussian processes with multidimensional time parameter. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, University of California, Berkeley, Vol. 2 (1970), pp. 369-374.
[16] Gordin, M.I.: The central limit theorem for stationary processes. (Russian) Dokl. Akad. Nauk SSSR, 188 (1969), 739-741.
[17] Hardy, G.H., Littlewood, J.E. and Pólya, G.: Inequalities. Second Edition. Cambridge University Press, 1959.
[18] Hörmann, S., Horváth, L. and Reeder, R.: A functional version of the ARCH model. Econometric Theory, 2012, in press.
[19] Hörmann, S. and Kokoszka, P.: Weakly dependent functional data. Annals of Statistics, 38 (2010), 1845-1884.
[20] Horváth, L. and Kokoszka, P.: Inference for Functional Data with Applications. Springer, New York, 2012.
[21] Ibragimov, I.A.: Some limit theorems for stationary processes. Theory of Probability and its Applications, 7 (1962), 349-382.
[22] Kuelbs, J. and Philipp, W.: Almost sure invariance principles for partial sums of mixing B-valued random variables. Annals of Probability, 8 (1980), 1003-1036.
[23] Merlevède, F.: Central limit theorem for linear processes with values in a Hilbert space. Stochastic Processes and their Applications, 65 (1996), 103-114.
[24] Móricz, F., Serfling, R. and Stout, W.: Moment and probability bounds with quasi-superadditive structure for the maximal partial sum. Annals of Probability, 10 (1982), 1032-1040.
[25] Philipp, W. and Stout, W.: Almost sure invariance principles for partial sums of weakly dependent random variables. Memoirs of the American Mathematical Society, 161 (1975).
[26] Shorack, G. and Wellner, J.: Empirical Processes with Applications to Statistics. Wiley, New York, 1986.
[27] Wu, W.: Nonlinear system theory: another look at dependence. Proc. Natl. Acad. Sci. USA, 102 (2005), 14150-14154.
[28] Wu, W.: Strong invariance principles for dependent random variables. Annals of Probability, 35 (2007), 2294-2320.

CHAPTER 3

TESTING STATIONARITY OF FUNCTIONAL TIME SERIES

Economic and financial data often take the form of a collection of curves observed consecutively over time. Examples include intraday price curves, yield and term structure curves, and intraday volatility curves. Such curves can be viewed as a time series of functions. A fundamental issue that must be addressed before an attempt is made to statistically model such data is whether these curves, perhaps suitably transformed, form a stationary functional time series. This chapter formalizes the assumption of stationarity in the context of functional time series and proposes several procedures to test the null hypothesis of stationarity. The tests are nontrivial extensions of the broadly used tests in the KPSS family. The properties of the tests under several alternatives, including change point and I(1), are studied, and new insights, present only in the functional setting, are uncovered. The theory is illustrated by a small simulation study and an application to intraday price curves.
3.1 Introduction

Over the last two decades, functional data analysis has become an important and steadily growing area of statistics. Very early on, major applications and theoretical developments pertained to functions observed consecutively over time, for example one function per day, or one function per year, with many of these data sets arising in econometric research. The main model employed for such series has been the functional autoregressive model of order one, which has received a great deal of attention; see Bosq (9), Antoniadis and Sapatinas (3), Antoniadis et al. (4), and Kargin and Onatski (30), among many others. More recent research has considered functional time series with nonlinear dependence structure; see Hörmann and Kokoszka (21), Gabrys et al. (15), Horváth et al. (27), and Hörmann et al. (23), as well as the review of Hörmann and Kokoszka (22) and Chapter 16 of Horváth and Kokoszka (25). As in traditional (scalar and vector) time series analysis, the underlying assumption for inference in such models is stationarity. (The content of this chapter is based on joint research with Piotr Kokoszka and Lajos Horváth.) Stationarity is also required for functional dynamic regression models like those studied by Hays et al. (20) and Kokoszka et al. (33); for bootstrap and resampling methods for functional time series, see McMurry and Politis (37); and for the functional analysis of volatility, see Müller et al. (38).

Testing stationarity received due attention as soon as fundamental time series modeling principles emerged. Early work includes Grenander and Rosenblatt (19), Granger and Hatanaka (18), and Priestley and Subba Rao (45). The methods considered by these authors rest on the spectral analysis which dominated the field of time series analysis at that time.
While such approaches remain useful, see Dwivedi and Subba Rao (14), the spectral analysis of nonstationary functional time series has not been developed to a point where usable extensions could be readily derived. We note, however, the recent work of Panaretos and Tavakoli (39), Panaretos and Tavakoli (40), and Hörmann et al. (24), who advance the spectral analysis of stationary functional time series. We follow a time domain approach introduced in the seminal paper of Kwiatkowski et al. (34), which is now firmly established in econometric theory and practice and has been extended in many directions. The work of Kwiatkowski et al. (34) was motivated by the fact that the unit root tests developed by Dickey and Fuller (11), Dickey and Fuller (12), and Said and Dickey (47) indicated that most aggregate economic series had a unit root. In these tests, the null hypothesis is that the series has a unit root. Since such tests have low power in samples of the sizes occurring in many applications, Kwiatkowski et al. (34) proposed that stationarity should be considered as the null hypothesis (they used a broader definition which allowed for deterministic trends), and the unit root should be the alternative. Rejection of the null of stationarity could then be viewed as convincing evidence in favor of a unit root. It was soon realized that the KPSS test of Kwiatkowski et al. (34) has much broader utility. For example, Lee and Schmidt (35) and Giraitis et al. (17) used it to detect long memory, with short memory as the null hypothesis. At present, both the augmented Dickey-Fuller test and the KPSS test, as well as the robust version of de Jong et al. (10), are typically applied to the same series to get a fuller picture. They are available in many packages, including R and Matlab implementations. The work of Lo (36) is also very relevant to our approach.
His contribution is crucial because he showed that, to obtain parameter-free limit null distributions, statistics similar to the KPSS statistic must be normalized by the long-run variance rather than by the sample variance, which leads to these distributions only if the observations are independent.

This chapter seeks to develop a general methodology for testing the assumption that a functional time series to be modeled is indeed stationary and weakly dependent. Such a test should be applied before fitting one of the known stationary models (all of them are weakly dependent). In many cases, it will be applied to functions transformed to remove seasonality or obvious trends, or to model residuals. At present, only CUSUM change point tests are available for functional time series; see Berkes et al. (7), Horváth et al. (26), and Zhang et al. (53). These tests have high power to detect abrupt changes in the stochastic structure of a functional time series, in either the mean or the covariance structure. Our objective is to develop more general tests of stationarity which also have high power against integrated and other alternatives.

It is difficult to explain the main contribution of this chapter without introducing the required notation, but we wish to highlight in this paragraph the main difficulties encountered in the transition from the scalar or vector to the functional case. A stationary functional time series can be represented as
$$X_n(t)=\mu(t)+\sum_{j=1}^{\infty}\sqrt{\lambda_j}\,\xi_{jn}v_j(t),$$
where $n$ is the time index that counts the functions (referring, e.g., to a day), and $t$ is the (theoretically continuous) argument of each function. The mean function $\mu$ and the functional principal components $v_j$ are unknown deterministic functions which depend on the stochastic structure of the series $\{X_n\}$, and which are estimated by random functions $\hat\mu$ and $\hat v_j$.
If $\{X_n\}$ is not stationary, one can still compute the estimators $\hat\mu$ and $\hat v_j$, but they will not converge to $\mu$ or $v_j$, because these population quantities will then not exist. Thus the use of a data-driven basis system $\hat v_j$ represents an aspect not encountered in the theory of scalar- or vector-valued tests. Therefore, after defining meaningful extensions to the functional setting, we must develop a careful analysis of the behavior of the tests under alternatives.

The chapter is organized as follows. Section 3.2 formalizes the null hypothesis of stationarity and weak dependence of functional time series, introduces the tests, and explores their asymptotic properties under the null hypothesis. In Section 3.3, we turn to the behavior of the tests under several alternatives. Section 3.4 explains the details of the implementation and contains the results of a simulation study, while Section 3.5 illustrates the properties of the tests by an application to intraday price curves. Appendices 3.6 and 3.7 contain, respectively, the proofs of the results stated in Sections 3.2 and 3.3.

3.2 Assumptions and test statistics

Linear functional time series, in particular functional AR(1) processes, have the form $X_n=\sum_j\Psi_j(\varepsilon_{n-j})$, where the $\varepsilon_i$ are iid error functions and the $\Psi_j$ are bounded linear operators acting on the space of square integrable functions. In this chapter, we assume merely that $X_n=f(\varepsilon_n,\varepsilon_{n-1},\ldots)$ for some, possibly nonlinear, function $f$. The operators $\Psi_j$ or the function $f$ arise as solutions to structural equations, very much as in univariate econometric modeling; see, e.g., Teräsvirta et al. (50). For the functional autoregressive process, the norms of the operators $\Psi_j$ decay exponentially fast. For the more general nonlinear moving averages, the rate at which the dependence of $X_n$ on past errors $\varepsilon_{n-j}$ decays with $j$ can be quantified by a condition known as $L^p$-$m$-approximability, stated in assumptions (3.1)-(3.4) below.
In both cases, these functional models belong to a class customarily referred to as weakly dependent, or short memory, time series. It is convenient to state the conditions for the error process, which we denote by $\eta=\{\eta_j\}_{j=-\infty}^{\infty}$, and which will be used to formulate the null and alternative hypotheses. Throughout the chapter, $L^2$ denotes the Hilbert space of square integrable functions on the unit interval $[0,1]$ with the usual inner product $\langle\cdot,\cdot\rangle$ and the norm $\|\cdot\|$ it generates, and $\int$ means $\int_0^1$.

$\eta$ forms a sequence of Bernoulli shifts, i.e.
$$\eta_j=g(\varepsilon_j,\varepsilon_{j-1},\ldots)\qquad(3.1)$$
for some measurable function $g:S^{\infty}\to L^2$ and iid functions $\varepsilon_j$, $-\infty<j<\infty$, with values in a measurable space $S$;
$$\varepsilon_j(t)=\varepsilon_j(t,\omega)\ \text{is jointly measurable in}\ (t,\omega),\ -\infty<j<\infty;\qquad(3.2)$$
$$E\eta_0(t)=0\ \text{for all}\ t,\ \text{and}\ E\|\eta_0\|^{2+\delta}<\infty\ \text{for some}\ 0<\delta<1;\qquad(3.3)$$
and the sequence $\{\eta_n\}_{n=-\infty}^{\infty}$ can be approximated by $\ell$-dependent sequences $\{\eta_{n,\ell}\}_{n=-\infty}^{\infty}$ in the sense that
$$\sum_{\ell=1}^{\infty}\big(E\|\eta_n-\eta_{n,\ell}\|^{2+\delta}\big)^{1/\kappa}<\infty\quad\text{for some }\kappa>2+\delta,\qquad(3.4)$$
where $\eta_{n,\ell}$ is defined by $\eta_{n,\ell}=g(\varepsilon_n,\varepsilon_{n-1},\ldots,\varepsilon_{n-\ell+1},\varepsilon^{*}_{n,\ell})$, with $\varepsilon^{*}_{n,\ell}=(\varepsilon^{*}_{n,\ell,n-\ell},\varepsilon^{*}_{n,\ell,n-\ell-1},\ldots)$, where the $\varepsilon^{*}_{n,\ell,k}$'s are independent copies of $\varepsilon_0$, independent of $\{\varepsilon_i,\ -\infty<i<\infty\}$.

Assumptions similar to those stated above have been used extensively in recent theoretical work, as all stationary time series models in practical use can be represented as Bernoulli shifts; see Wu (52), Shao and Wu (48), Aue et al. (5), and Hörmann and Kokoszka (21), among many other contributions. They were used in econometric research even earlier, and the work of Pötscher and Prucha (44) contributed to their popularity. Bernoulli shifts are stationary by construction; weak dependence is quantified by the summability condition in (3.4), which intuitively states that the function $g$ decays so fast that the impact of shocks far back in the past is so small that they can be replaced by independent copies with only a small change in the distribution of the process.
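The $\ell$-dependent coupling in (3.4) can be made concrete in simulation. The sketch below is illustrative only (the function name, the kernel $\psi(t,s)=\min(t,s)$, and the grid are assumptions, not taken from the dissertation): it generates a functional AR(1) process as a Bernoulli shift truncated at lag $L$, which is precisely an $L$-dependent approximation of the kind used in (3.4).

```python
import numpy as np

def far1_bernoulli_shift(N, L=50, grid=64, psi_norm=0.5, seed=0):
    """Simulate a functional AR(1) process on a grid by truncating the
    Bernoulli-shift expansion X_n = sum_{j>=0} Psi^j(eps_{n-j}) at lag L,
    i.e. an L-dependent approximation in the spirit of (3.4).
    All choices here (kernel, grid, scaling) are hypothetical."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, grid)
    # hypothetical integral kernel psi(t,s) = min(t,s), rescaled so the
    # operator is a contraction (Hilbert-Schmidt norm psi_norm < 1)
    psi = np.minimum.outer(t, t)
    psi *= psi_norm / np.sqrt(np.mean(psi ** 2))

    def Psi(f):
        # integral operator (Riemann-sum approximation of int psi(t,s) f(s) ds)
        return psi @ f / grid

    eps = rng.standard_normal((N + L, grid))   # iid error curves
    X = np.zeros((N, grid))
    for n in range(N):
        acc = np.zeros(grid)
        for j in range(L, -1, -1):             # Horner-style: farthest lag first
            acc = Psi(acc) + eps[n + L - j]
        X[n] = acc
    return t, X
```

Because the expansion is cut at lag $L$, curves more than $L$ apart are exactly independent, which is the mechanism assumption (3.4) quantifies.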
We wish to test
$$H_0:\ X_i(t)=\mu(t)+\eta_i(t),\quad1\le i\le N,\quad\text{where }\mu\in L^2.$$
The mean function $\mu$ is unknown. The null hypothesis is that the functional time series is stationary and weakly dependent, with the structure of dependence quantified by conditions (3.1)-(3.4). The most general alternative is that $H_0$ does not hold, but some profound insights into the behavior of the tests can be obtained by considering specific alternatives. We focus on the following.

Change point alternative:
$$H_{A,1}:\ X_i(t)=\mu(t)+\delta(t)\,I\{i>k^{*}\}+\eta_i(t),\quad1\le i\le N,\ \text{with some integer }1\le k^{*}<N.$$
The mean function $\mu(t)$, the size of the change $\delta(t)$, and the time of the change $k^{*}$ are all unknown parameters. We assume that the change occurs away from the endpoints, i.e.
$$k^{*}=\lfloor N\theta\rfloor\quad\text{with some }0<\theta<1.\qquad(3.5)$$

Integrated alternative:
$$H_{A,2}:\ X_i(t)=\mu(t)+\sum_{\ell=1}^{i}\eta_{\ell}(t),\quad1\le i\le N.$$

Deterministic trend alternative:
$$H_{A,3}:\ X_i(t)=\mu(t)+g(i/N)\,\zeta(t)+\eta_i(t),\quad1\le i\le N,\qquad(3.6)$$
where
$$g\ \text{is a piecewise Lipschitz continuous function on}\ [0,1].\qquad(3.7)$$
The trend alternative includes various change point alternatives, including $H_{A,1}$, but also those in which the change can be gradual. It also includes the polynomial trend alternative, with $g(u)=u^{\beta}$. We emphasize that both under the null hypothesis and under all alternatives the mean function $\mu(t)$ is unknown. The tests we propose can be shown to be consistent against any other sufficiently large departures from stationarity and weak dependence. In particular, functional long memory alternatives could be considered as well, as studied in the scalar case by Giraitis et al. (17). Since long memory functional processes have not been considered in any applications yet, we do not pursue this direction at this point. In the remainder of this section, we consider two classes of tests: those based on the curves themselves, and those based on finite dimensional projections of the curves on the functional principal components. As will become clear, the tests of the two types are related.
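For simulation studies, the null model and the three alternatives above can all be built from one array of discretized error curves. The helper below is a hypothetical illustration (the function and argument names are not from the text):

```python
import numpy as np

def make_sample(eta, alt="H0", delta=None, theta=0.5, g=None, zeta=None, mu=None):
    """Build discretized curves X_i(t_j) from error curves eta (N x grid)
    under H0, H_{A,1} (change point), H_{A,2} (integrated), or
    H_{A,3} (deterministic trend).  Names are illustrative assumptions."""
    N, grid = eta.shape
    mu = np.zeros(grid) if mu is None else mu
    X = mu + eta                               # H0: stationary errors around mu
    if alt == "HA1":                           # mean jumps by delta(t) after k* = floor(N*theta)
        k = int(N * theta)
        X[k:] += delta
    elif alt == "HA2":                         # integrated: partial sums of the errors
        X = mu + np.cumsum(eta, axis=0)
    elif alt == "HA3":                         # smooth trend g(i/N) * zeta(t)
        i = np.arange(1, N + 1) / N
        X = X + np.outer(g(i), zeta)
    return X
```

Taking `g` piecewise constant reproduces a change point, illustrating the remark that $H_{A,3}$ contains $H_{A,1}$ as a special case.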
3.2.1 Fully functional tests

Our approach is based on two test statistics. The first is
$$T_N=\iint Z_N^2(x,t)\,dt\,dx,\quad\text{where}\quad Z_N(x,t)=S_N(x,t)-xS_N(1,t),\ 0\le x,t\le1,$$
with
$$S_N(x,t)=N^{-1/2}\sum_{i=1}^{\lfloor Nx\rfloor}X_i(t),\quad0\le x,t\le1.$$
The second test statistic is
$$M_N=T_N-\int\left(\int Z_N(x,t)\,dx\right)^{2}dt=\iint\left(Z_N(x,t)-\int Z_N(y,t)\,dy\right)^{2}dx\,dt.$$
If $X_i(t)=X_i$, i.e. if the data are scalars (or constant functions on $[0,1]$), the statistic $T_N$ is the numerator of the KPSS statistic of Kwiatkowski et al. (34), and $M_N$ is the numerator of the V/S statistic of Giraitis et al. (17), who introduced the centering to reduce the variability of the KPSS statistic and to increase power against "changes in variance," which are a characteristic of long memory in volatility. As pointed out by Lo (36), to obtain parameter-free limits under the null, statistics of this type must be divided by the long-run variance.

We now proceed with the suitable definitions in the functional case. The null limit distributions of $T_N$ and $M_N$ depend on the eigenvalues of the long-run covariance function of the errors:
$$C(t,s)=E\eta_0(t)\eta_0(s)+\sum_{\ell=1}^{\infty}E\eta_0(t)\eta_{\ell}(s)+\sum_{\ell=1}^{\infty}E\eta_0(s)\eta_{\ell}(t).\qquad(3.8)$$
It is proven in Horváth et al. (27) that the series in (3.8) converges in $L^2$. The function $C(t,s)$ is positive definite, and therefore there exist $\lambda_1\ge\lambda_2\ge\cdots\ge0$ and orthonormal functions $\varphi_i(t)$, $0\le t\le1$, satisfying
$$\lambda_i\varphi_i(t)=\int C(t,s)\varphi_i(s)\,ds,\quad1\le i<\infty.\qquad(3.9)$$
The following theorem specifies the limit distributions of $T_N$ and $M_N$ under the stationarity null hypothesis. Throughout the chapter, $B_1,B_2,\ldots$ are independent Brownian bridges.

Theorem 3.2.1. If assumptions (3.1)-(3.4) and $H_0$ hold, then
$$T_N\xrightarrow{D}T_0=\sum_{i=1}^{\infty}\lambda_i\int B_i^2(x)\,dx\qquad(3.10)$$
and
$$M_N\xrightarrow{D}M_0=\sum_{i=1}^{\infty}\lambda_i\int\left(B_i(x)-\int B_i(y)\,dy\right)^{2}dx.\qquad(3.11)$$
According to Theorem 3.6.1, under assumptions (3.1)-(3.4) the sum $\sum_{i=1}^{\infty}\lambda_i$ is finite, and therefore the variables $T_0$ and $M_0$ are finite with probability one.
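With curves observed on a grid, $T_N$ and $M_N$ reduce to array operations: the integrals over $x$ and $t$ become Riemann sums. A minimal sketch (the function name is an assumption for illustration):

```python
import numpy as np

def kpss_functional(X):
    """Compute the fully functional statistics T_N and M_N of Section 3.2.1
    from an N x grid array of discretized curves X_i(t_j), replacing the
    integrals over x and t by Riemann sums on the observation points."""
    N, grid = X.shape
    S = np.cumsum(X, axis=0) / np.sqrt(N)      # S_N(k/N, t), k = 1..N
    x = np.arange(1, N + 1) / N
    Z = S - x[:, None] * S[-1]                 # Z_N(x,t) = S_N(x,t) - x S_N(1,t)
    T = Z.ravel().dot(Z.ravel()) / (N * grid)  # double integral of Z_N^2
    Zc = Z - Z.mean(axis=0)                    # center in x for the V/S form M_N
    M = Zc.ravel().dot(Zc.ravel()) / (N * grid)
    return T, M
```

By construction $M_N\le T_N$, and both vanish for curves that do not vary with $i$, since the bridge $Z_N$ is then identically zero.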
Theorem 3.2.1 shows, in particular, that for functional time series a simple normalization with a long-run variance is not possible, and approaches involving the estimation of all large eigenvalues must be employed. The eigenvalues $\lambda_1\ge\lambda_2\ge\cdots$ can be easily estimated under the null hypothesis because then
$$C(t,s)=\operatorname{cov}(X_0(t),X_0(s))+\sum_{i=1}^{\infty}\big[\operatorname{cov}(X_0(t),X_i(s))+\operatorname{cov}(X_0(s),X_i(t))\big],$$
so we can use the kernel estimator $\hat C_N$ of Horváth et al. (27), defined as
$$\hat C_N(t,s)=\hat\gamma_0(t,s)+\sum_{i=1}^{N-1}K\Big(\frac{i}{h}\Big)\big(\hat\gamma_i(t,s)+\hat\gamma_i(s,t)\big),\qquad(3.12)$$
where
$$\hat\gamma_i(t,s)=\frac{1}{N}\sum_{j=i+1}^{N}\big(X_j(t)-\bar X_N(t)\big)\big(X_{j-i}(s)-\bar X_N(s)\big)\quad\text{with}\quad\bar X_N(t)=\frac{1}{N}\sum_{i=1}^{N}X_i(t).$$
The kernel $K$ in the definition of $\hat C_N$ satisfies the following conditions:
$$K(0)=1;\qquad(3.13)$$
$$K(u)=0\ \text{if}\ u>c,\ \text{with some}\ c>0;\qquad(3.14)$$
$$K\ \text{is continuous on}\ [0,c],\ \text{where}\ c\ \text{is given in (3.14)}.\qquad(3.15)$$
The window (or smoothing bandwidth) $h$ must satisfy only
$$h=h(N)\to\infty\quad\text{and}\quad\frac{h(N)}{N}\to0,\quad\text{as }N\to\infty.\qquad(3.16)$$
Now the estimators of the eigenvalues and eigenfunctions are defined by
$$\hat\lambda_i\hat\varphi_i(t)=\int\hat C_N(t,s)\hat\varphi_i(s)\,ds,\quad1\le i\le N,$$
where $\hat\lambda_1\ge\hat\lambda_2\ge\cdots$ are the empirical eigenvalues and $\hat\varphi_1,\hat\varphi_2,\ldots$ are the corresponding orthonormal eigenfunctions. We can thus approximate the limits in Theorem 3.2.1 with
$$\sum_{i=1}^{d}\hat\lambda_i\int B_i^2(x)\,dx\quad\text{and}\quad\sum_{i=1}^{d}\hat\lambda_i\int\left(B_i(x)-\int B_i(y)\,dy\right)^{2}dx,$$
where $d$ is suitably large. The details are presented in Section 3.4. We note that the $\hat\lambda_i$ and the $\hat\varphi_i$ are consistent estimators only under $H_0$. Their behavior under the alternatives is complex; it is studied in Section 3.3.

3.2.2 Tests based on projections

Theorem 3.2.1 leads to asymptotic distributions depending on the eigenvalues $\lambda_i$, which can collectively be viewed as an analog of the long-run variance. In this section, we will see that by projecting on the eigenfunctions $\hat\varphi_i$ it is possible to construct statistics whose limit null distributions are parameter free. This procedure is a functional analog of dividing by an estimator of a long-run variance.
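A sketch of the estimator (3.12), using the Bartlett kernel $K(u)=\max(1-u,0)$, one admissible choice under (3.13)-(3.15) with $c=1$; the function name and the default bandwidth $h=N^{1/3}$ are assumptions for illustration, not the dissertation's prescriptions:

```python
import numpy as np

def long_run_cov_eigs(X, h=None, d=3):
    """Kernel estimate of the long-run covariance (3.12) on a grid, with
    Bartlett weights, plus its top d eigenvalues scaled so that they
    approximate the lambda_i of the integral equation (3.9) on [0,1]."""
    N, grid = X.shape
    h = h or int(np.floor(N ** (1.0 / 3.0))) or 1   # hypothetical default bandwidth
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / N                               # gamma_0(t,s)
    for i in range(1, min(N - 1, int(h)) + 1):
        w = max(0.0, 1.0 - i / h)                   # Bartlett weight K(i/h)
        if w == 0.0:
            break
        g = Xc[i:].T @ Xc[:-i] / N                  # gamma_i(t,s)
        C += w * (g + g.T)
    # eigenvalues of C/grid approximate the eigenvalues of the integral operator
    evals = np.linalg.eigvalsh(C / grid)[::-1]
    return C, evals[:d]
```

The `1/grid` factor turns the matrix eigenproblem into a Riemann-sum approximation of (3.9); the Bartlett weights make the estimate positive semidefinite, so the leading $\hat\lambda_i$ are nonnegative.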
To have uniquely defined (up to sign) eigenfunctions, we assume
$$\lambda_1>\lambda_2>\cdots>\lambda_d>\lambda_{d+1}>0.\qquad(3.17)$$
Define
$$T_N^{0}(d)=\sum_{i=1}^{d}\frac{1}{\hat\lambda_i}\int\langle Z_N(x,\cdot),\hat\varphi_i\rangle^{2}\,dx,\qquad T_N(d)=\sum_{i=1}^{d}\int\langle Z_N(x,\cdot),\hat\varphi_i\rangle^{2}\,dx,$$
$$M_N^{0}(d)=\sum_{i=1}^{d}\frac{1}{\hat\lambda_i}\int\left(\langle Z_N(x,\cdot),\hat\varphi_i\rangle-\int\langle Z_N(u,\cdot),\hat\varphi_i\rangle\,du\right)^{2}dx$$
and
$$M_N(d)=\sum_{i=1}^{d}\int\left(\langle Z_N(x,\cdot),\hat\varphi_i\rangle-\int\langle Z_N(u,\cdot),\hat\varphi_i\rangle\,du\right)^{2}dx.$$

Theorem 3.2.2. If assumptions (3.1)-(3.4), (3.13)-(3.16), (3.17) and $H_0$ hold, then
$$T_N^{0}(d)\xrightarrow{D}\sum_{i=1}^{d}\int B_i^2(x)\,dx,\qquad(3.18)$$
$$T_N(d)\xrightarrow{D}\sum_{i=1}^{d}\lambda_i\int B_i^2(x)\,dx,\qquad(3.19)$$
$$M_N^{0}(d)\xrightarrow{D}\sum_{i=1}^{d}\int\left(B_i(x)-\int B_i(u)\,du\right)^{2}dx\qquad(3.20)$$
and
$$M_N(d)\xrightarrow{D}\sum_{i=1}^{d}\lambda_i\int\left(B_i(x)-\int B_i(u)\,du\right)^{2}dx.\qquad(3.21)$$

It is clear that $T_N(d)$ and $M_N(d)$ are just $d$-dimensional projections of $T_N$ and $M_N$. The distribution of the limit in (3.18) can be found in Kiefer (31). Critical values based on Monte Carlo simulations are given in Table 6.1 of Horváth and Kokoszka (25). The distributions of the limits in both (3.18) and (3.20) can also be expressed in terms of sums of squared normals; see Shorack and Wellner (49) and Section 3.4. It is also easy to derive normal approximations. By the central limit theorem, we have, as $d\to\infty$,
$$\left(\frac{45}{d}\right)^{1/2}\left[\sum_{i=1}^{d}\int B_i^2(x)\,dx-\frac{d}{6}\right]\xrightarrow{D}N(0,1),$$
where $N(0,1)$ stands for a standard normal random variable. Aue et al. (5) demonstrated that the limit in (3.18) can be approximated well with normal random variables even for moderate $d$. The limit in (3.20) can be approximated in a similar manner, namely, as $d\to\infty$,
$$\left(\frac{360}{d}\right)^{1/2}\left[\sum_{i=1}^{d}\left\{\int B_i^2(x)\,dx-\left(\int B_i(x)\,dx\right)^{2}\right\}-\frac{d}{12}\right]\xrightarrow{D}N(0,1).$$

3.3 Asymptotic behavior under alternatives

The asymptotic behavior of the KPSS and related tests under alternatives is not completely understood, even for scalar data. This may be due to the fact that an asymptotic analysis of power is generally much more difficult than the theory under a null hypothesis. Giraitis et al. (17) studied the behavior of the KPSS test, the R/S test of Lo (36), and their V/S test under the alternative of long memory.
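Critical values for the limits in Theorem 3.2.1 and in (3.19)/(3.21) can be obtained by Monte Carlo, simulating Brownian bridges on a grid. The sketch below is illustrative (names, grid sizes, and replication counts are assumptions); the quantiles of the simulated draws play the role of the critical values discussed in the text:

```python
import numpy as np

def simulate_limit_T(lams, reps=2000, steps=500, seed=0):
    """Draw from sum_i lam_i * int B_i^2(x) dx, the limit law in (3.10)/(3.19),
    by approximating each Brownian bridge B_i on a uniform grid of `steps`
    points.  `lams` would be the estimated eigenvalues lambda_hat_i."""
    rng = np.random.default_rng(seed)
    lams = np.asarray(lams, dtype=float)
    x = np.arange(1, steps + 1) / steps
    out = np.empty(reps)
    for r in range(reps):
        z = rng.standard_normal((lams.size, steps)) / np.sqrt(steps)
        W = np.cumsum(z, axis=1)               # Brownian motions at the grid points
        B = W - x * W[:, -1:]                  # bridges: B(x) = W(x) - x W(1)
        out[r] = lams @ (B * B).mean(axis=1)   # Riemann sum of sum_i lam_i B_i^2
    return out
```

For a single unit eigenvalue the draws follow the Cramér-von Mises-type law $\int B^2$, whose mean is $1/6$, so the simulated mean provides a quick sanity check; the 95% quantile of the draws is the critical value for a 5% level test.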
Pelagatti and Sen (41) established the consistency of their nonparametric version of the KPSS test under the integrated alternative. In this section, we present an asymptotic analysis, under alternatives, of the tests introduced in Section 3.2. In the functional setting there is a fundamentally new aspect: the convergence of a scalar estimator of the long-run variance must be replaced by the convergence of the eigenvalues and eigenfunctions of the long-run covariance function. We derive precise rates of convergence and limits for this function, and use them to study the asymptotic power of the tests introduced in Section 3.2. In Section 3.4, we will see how these asymptotic insights manifest themselves in finite samples. We expect that the tests introduced in Section 3.2 are also consistent against suitably defined long memory alternatives. While scalar long memory models have received a lot of attention in recent decades, long memory functional models have not yet been considered in the econometric literature. To keep this contribution within reasonable limits, we do not pursue this direction here.

3.3.1 Change in the mean alternative

To state consistency results, we assume that the jump function $\delta$ is in $L^2$, i.e.
$$\int\delta^2(t)\,dt<\infty.\qquad(3.22)$$
We introduce the function
$$\Delta(x,t)=\delta(t)\big[(x-\theta)I\{x>\theta\}-x(1-\theta)\big]\qquad(3.23)$$
and the Gaussian process $\Gamma^{0}(x,t)$ with $E\Gamma^{0}(x,t)=0$ and
$$E\Gamma^{0}(x,t)\Gamma^{0}(y,s)=\big(\min(x,y)-xy\big)C(t,s).$$
The existence of the process $\Gamma^{0}(x,t)$ will be established in Appendix 3.6.

Theorem 3.3.1. If assumptions (3.1)-(3.4), (3.5), (3.22), and $H_{A,1}$ hold, then
$$N^{-1/2}\left(T_N-\frac{N}{3}\,\theta^2(1-\theta)^2\|\delta\|^2\right)\xrightarrow{D}2\iint\Gamma^{0}(x,t)\,\Delta(x,t)\,dt\,dx\qquad(3.24)$$
and
$$N^{-1/2}\left(M_N-\frac{N}{12}\,\theta^2(1-\theta)^2\|\delta\|^2\right)\qquad(3.25)$$
$$\xrightarrow{D}2\iint\left(\Gamma^{0}(x,t)-\int\Gamma^{0}(y,t)\,dy\right)\left(\Delta(x,t)-\int\Delta(y,t)\,dy\right)dt\,dx.$$
It is easy to see that the limits in Theorem 3.3.1 are zero mean normal random variables. Their variances, computed in Appendix 3.7, are positive if $C(t,s)$ is strictly positive definite. In that case, $T_N$ and $M_N$ increase like $N$.
However, as we prove in Lemma 3.7.2, $\hat C_N(t,s)$ does not converge to $C(t,s)$ under $H_{A,1}$, so it is not clear what the asymptotic behavior of the critical values under $H_{A,1}$ is. To show that the asymptotic power is 1, a more delicate argument is needed, which we now outline. Applying Lemma 3.7.2 with the result of Dunford and Schwartz (13), p. 1091, we conclude that

$\frac{\hat\lambda_1}{h} \xrightarrow{P} \lambda_{A,1} = 2\theta(1-\theta)\|\delta\|^2 \int_0^c K(u)\,du,$  (3.26)

and

$\Big\| \hat\varphi_1(t) - \hat c_1 \frac{\delta(t)}{\|\delta\|} \Big\| = o_P(1).$  (3.27)

According to (3.26), when we compute $c^* = c^*(h,N)$, the critical value from simulated copies of $\sum_{i=1}^{d} \hat\lambda_i \int B_i^2(t)\,dt$, then $c^*$ increases at most linearly with $h$. Therefore, using (3.16) with Theorem 3.3.1, we conclude that

$\lim_{N\to\infty} P\{T_N \ge c^*\} = 1 \quad \text{under } H_{A,1}.$  (3.28)

This shows that the test based on $T_N$ is consistent. The same argument applies to $M_N$. We now turn to the tests based on projections, with the test statistics defined in Section 3.2.2. As we have seen, under $H_{A,1}$, the largest empirical eigenvalue $\hat\lambda_1$ increases to $\infty$ as $N \to \infty$, and the corresponding empirical eigenfunction $\hat\varphi_1$ is asymptotically in the direction of the change. This means that both $T_N^\lambda(d)$ and $M_N^\lambda(d)$ are dominated by the first term under $H_{A,1}$. The precise asymptotic behavior of all statistics introduced in Section 3.2.2 is described in the following theorem.

Theorem 3.3.2. If assumptions (3.1)–(3.4), (3.13)–(3.16), (3.22), and $H_{A,1}$ hold, then

$N^{-1/2}\Big( T_N^\lambda(1) - \frac{N}{3}\,\theta^2(1-\theta)^2 \langle \delta, \hat\varphi_1 \rangle^2 \Big) \xrightarrow{\mathcal D} 2\iint \Gamma^0(x,t)\,\Delta(x,t)\,dx\,dt,$  (3.29)

$N^{-1/2}\Big( M_N^\lambda(1) - \frac{N}{12}\,\theta^2(1-\theta)^2 \langle \delta, \hat\varphi_1 \rangle^2 \Big)$  (3.30)

$\xrightarrow{\mathcal D} 2\iint \Big( \Gamma^0(x,t) - \int \Gamma^0(y,t)\,dy \Big)\Big( \Delta(x,t) - \int \Delta(y,t)\,dy \Big)\,dt\,dx,$

$\frac{h}{N^{1/2}}\,2\theta(1-\theta)\|\delta\|^2 \int_0^c K(u)\,du \Big( T_N^0(1) - \frac{N}{3\hat\lambda_1}\,\theta^2(1-\theta)^2 \langle \delta, \hat\varphi_1 \rangle^2 \Big)$  (3.31)

$\xrightarrow{\mathcal D} 2\iint \Gamma^0(x,t)\,\Delta(x,t)\,dx\,dt,$

and

$\frac{h}{N^{1/2}}\,2\theta(1-\theta)\|\delta\|^2 \int_0^c K(u)\,du \Big( M_N^0(1) - \frac{N}{12\hat\lambda_1}\,\theta^2(1-\theta)^2 \langle \delta, \hat\varphi_1 \rangle^2 \Big)$  (3.32)

$\xrightarrow{\mathcal D} 2\iint \Big( \Gamma^0(x,t) - \int \Gamma^0(y,t)\,dy \Big)\Big( \Delta(x,t) - \int \Delta(y,t)\,dy \Big)\,dt\,dx.$

If in addition we assume that $h/N^{1/2} \to 0$ as $N \to$
$\infty$, then

$T_N^\lambda(d) = \frac{N}{3}\,\theta^2(1-\theta)^2 \|\delta\|^2 (1 + o_P(1)),$  (3.33)

$M_N^\lambda(d) = \frac{N}{12}\,\theta^2(1-\theta)^2 \|\delta\|^2 (1 + o_P(1)),$  (3.34)

$T_N^0(d) = \frac{N}{h} \cdot \frac{\theta(1-\theta)}{6 \int_0^c K(u)\,du}\,(1 + o_P(1)),$  (3.35)

and

$M_N^0(d) = \frac{N}{h} \cdot \frac{\theta(1-\theta)}{24 \int_0^c K(u)\,du}\,(1 + o_P(1)).$  (3.36)

Observe that according to Theorems 3.3.1 and 3.3.2, the statistics $T_N$ and $T_N^\lambda(1)$ ($M_N$ and $M_N^\lambda(1)$, respectively) exhibit the same asymptotic behavior under the change point alternative. This is due to the fact that the projection in the direction of $\hat\varphi_1$ picks up all the information on the change available in the data, as, by (3.27), $\hat\varphi_1$ is asymptotically aligned with the direction of the change.

Remark 3.1. Consider the local alternative model

$X_i(t) = \mu(t) + \delta_N(t) I\{i > k^*\} + \varepsilon_i(t), \quad 1 \le i \le N,$

with some integer $1 \le k^* < N$, where $\|\delta_N\| \to 0$ as $N \to \infty$. We briefly discuss how the statistic $T_N$ behaves under this model. If $N^{1/2}\|\delta_N\| \to 0$, then $T_N$ converges in distribution to $\iint (\Gamma^0(x,t))^2\,dt\,dx$, as is the case under $H_0$. On the other hand, if $N^{1/2}\|\delta_N\| \to \infty$, then $T_N \xrightarrow{P} \infty$ and therefore consistency is retained. Moreover, under the additional assumption $N \iint C(t,s)\,\delta_N(t)\delta_N(s)\,dt\,ds \to \infty$, we show that

$\frac{1}{A_N}\Big( T_N - \frac{1}{N}\|\bar\Delta_N\|^2 \Big) \xrightarrow{\mathcal D} N(0,1),$  (3.37)

where $\bar\Delta_N$ is defined as $\Delta_N$ in the proof of Theorem 3.3.1, but with $\delta$ replaced by $\delta_N$, and

$A_N^2 = 4N \iint C(t,s)\,\delta_N(t)\delta_N(s)\,dt\,ds \iint (\min(x,y) - xy)\,\psi(x)\psi(y)\,dx\,dy.$

In the critical case when $N^{1/2}\delta_N \to \delta$ in $L^2$, where $\delta$ is some nonzero function, we have

$T_N \xrightarrow{\mathcal D} \tau + \sum_{\ell=1}^{\infty} \big\{ \lambda_\ell \|B_\ell\|^2 + 2\lambda_\ell^{1/2} \langle B_\ell, \psi \rangle \langle \varphi_\ell, \delta \rangle \big\},$  (3.38)

where $\tau = \|\delta\|^2\|\psi\|^2$, $B_1, B_2, \ldots$ are independent Brownian bridges, the $\lambda_i$'s and $\varphi_i$'s are defined in (3.9), and

$\psi(x) = (x-\theta)I\{x \ge \theta\} - x(1-\theta).$  (3.39)

The asymptotic behavior of $M_N$ can be studied analogously in the local alternative change point model. The derivation of the asymptotic properties of $T_N^0(d)$, $T_N^\lambda(d)$, $M_N^0(d)$, and $M_N^\lambda(d)$ is much more involved since it requires the study of $\hat C_N(t,s)$ under this model. We will not pursue this line of inquiry in the present chapter.
3.3.2 The integrated alternative

Let

$\bar\Gamma(x,t) = \int_0^x \Gamma(u,t)\,du - x \int \Gamma(u,t)\,du,$  (3.40)

where $\Gamma(x,t)$ is a Gaussian process with $E\Gamma(x,t) = 0$ and $E\Gamma(x,t)\Gamma(y,s) = \min(x,y)\,C(t,s)$. The existence of $\Gamma(x,t)$ is established in Theorem 3.6.1. For the fully functional tests of Section 3.2.1, we have the following result.

Theorem 3.3.3. If assumptions (3.1)–(3.4) and $H_{A,2}$ hold, then

$\frac{1}{N^2}\,T_N \xrightarrow{\mathcal D} \iint \bar\Gamma^2(x,t)\,dt\,dx$  (3.41)

and

$\frac{1}{N^2}\,M_N \xrightarrow{\mathcal D} \iint \Big( \bar\Gamma(x,t) - \int \bar\Gamma(u,t)\,du \Big)^2 dt\,dx.$  (3.42)

To find the limit distributions of the statistics based on projections, we need the following theorem.

Theorem 3.3.4. If assumptions (3.1)–(3.4), (3.13)–(3.16), and $H_{A,2}$ hold, then

$\Big( \frac{1}{N} Z_N(x,t),\ \frac{1}{Nh}\hat C_N(t,s);\ 0 \le x,t,s \le 1 \Big) \xrightarrow{\mathcal D} \big( \bar\Gamma(x,t),\ Q(t,s);\ 0 \le x,t,s \le 1 \big)$

in $D([0,1], L^2)$, where

$Q(t,s) = 2\int_0^c K(w)\,dw \int R(z,t)R(z,s)\,dz,$

with

$R(z,t) = \int_0^z \Gamma(u,t)\,du - \int \Big( \int_0^v \Gamma(u,t)\,du \Big) dv.$

We show in Lemma 3.7.5 that $Q(t,s)$ is nonnegative definite with probability one, so there are random variables $\gamma_1^* \ge \gamma_2^* \ge \cdots$ and random functions $\varphi_1^*(t), \varphi_2^*(t), \ldots$ satisfying

$\gamma_i^*\,\varphi_i^*(t) = \int Q(t,s)\,\varphi_i^*(s)\,ds, \quad 1 \le i < \infty.$  (3.43)

Combining Theorem 3.3.4 with Dunford and Schwartz (13), we get that

$\Big( \frac{\hat\lambda_1}{Nh}, \frac{\hat\lambda_2}{Nh}, \ldots, \frac{\hat\lambda_d}{Nh},\ \hat\varphi_1(t), \hat\varphi_2(t), \ldots, \hat\varphi_d(t) \Big) \xrightarrow{\mathcal D} \big( \gamma_1^*, \gamma_2^*, \ldots, \gamma_d^*,\ \varphi_1^*(t), \varphi_2^*(t), \ldots, \varphi_d^*(t) \big).$

Thus the behavior of $T_N^0(d)$, $T_N^\lambda(d)$, $M_N^0(d)$ and $M_N^\lambda(d)$ is an immediate consequence of Theorem 3.3.4. An argument similar to that developed in Section 3.3.1 shows that the tests are consistent.

Theorem 3.3.5. If assumptions (3.1)–(3.4), (3.13)–(3.16), and $H_{A,2}$ hold, then

$\frac{h}{N}\,T_N^0(d) \xrightarrow{\mathcal D} \sum_{i=1}^{d} \frac{1}{\gamma_i^*} \int \langle \bar\Gamma(x,\cdot), \varphi_i^*(\cdot) \rangle^2\,dx,$  (3.44)

$\frac{1}{N^2}\,T_N^\lambda(d) \xrightarrow{\mathcal D} \sum_{i=1}^{d} \int \langle \bar\Gamma(x,\cdot), \varphi_i^*(\cdot) \rangle^2\,dx,$  (3.45)

$\frac{h}{N}\,M_N^0(d) \xrightarrow{\mathcal D} \sum_{i=1}^{d} \frac{1}{\gamma_i^*} \int \Big( \langle \bar\Gamma(x,\cdot), \varphi_i^*(\cdot) \rangle - \int \langle \bar\Gamma(u,\cdot), \varphi_i^*(\cdot) \rangle\,du \Big)^2 dx,$  (3.46)

and

$\frac{1}{N^2}\,M_N^\lambda(d) \xrightarrow{\mathcal D} \sum_{i=1}^{d} \int \Big( \langle \bar\Gamma(x,\cdot), \varphi_i^*(\cdot) \rangle - \int \langle \bar\Gamma(u,\cdot), \varphi_i^*(\cdot) \rangle\,du \Big)^2 dx.$  (3.47)

3.3.3 Deterministic trend alternative

Let

$\bar g(x) = \int_0^x g(u)\,du - x \int g(u)\,du, \quad 0 \le x \le 1.$

Theorem 3.3.6. If assumptions (3.1)–(3.4), (3.6), (3.22) and $H_{A,3}$ hold, then

$N^{-1/2}\Big( T_N - N \|\delta\|^2 \int \bar g^2(x)\,dx \Big) \xrightarrow{\mathcal D}$
$2\iint \Gamma^0(x,t)\,\delta(t)\,\bar g(x)\,dt\,dx$  (3.48)

and

$N^{-1/2}\Big\{ M_N - N \|\delta\|^2 \int \Big( \bar g(x) - \int \bar g(y)\,dy \Big)^2 dx \Big\}$  (3.49)

$\xrightarrow{\mathcal D} 2\iint \Big( \Gamma^0(x,t) - \int \Gamma^0(y,t)\,dy \Big)\Big( \bar g(x) - \int \bar g(y)\,dy \Big)\delta(t)\,dt\,dx.$

The limits in (3.48) and (3.49) are normal random variables with zero mean and variances that can be expressed in terms of the long-run covariance kernel $C(\cdot,\cdot)$ and the functions $\delta$ and $g$. We do not display these complex formulas to conserve space. They extend the formulas for the variances of the limits in Theorem 3.3.1, which are given in Appendix 3.7. The consistency of the procedures based on projections can be established by extending the arguments used to prove Theorem 3.3.2, though with more abstract notation. Again, to keep this work within reasonable limits of space, we do not present the details.

3.4 Implementation and finite sample performance

In this section we discuss the implementation of the testing procedure developed in the sections above. A simulation study is then presented in order to investigate the finite sample properties of the test.

3.4.1 Details of the implementation

To implement the tests introduced in Section 3.2, several issues must be considered. The choice of the kernel $K(\cdot)$ and the smoothing bandwidth $h$ are the most obvious. Beyond that, to implement Monte Carlo tests based on statistics whose limits depend on the estimated eigenvalues, a fast method of calculating replications of these limits must be employed. The issues of bandwidth and kernel selection have been studied extensively in the econometric literature for over three decades, in dozens, if not hundreds, of papers that we cannot survey here. Perhaps the best known contributions are those of Andrews (1) and Andrews and Monahan (2), who introduced data-driven bandwidth selection and prewhitening. While these approaches possess optimality properties in general regression models with heteroskedastic and correlated errors, they are not optimal in all specific applications.
In particular, Jönsson (28) found that the finite-sample distribution of the (scalar) KPSS test statistic can be very unstable when the Quadratic Spectral kernel (recommended by Andrews (1)) is used and/or a prewhitening filter is applied. He recommends the Bartlett kernel. An elaboration on the finite sample properties of the KPSS test with many relevant references can be found in Jönsson (29). This chapter focuses on the derivation and large sample theory of the stationarity tests for functional time series; we cannot present here a comprehensive and conclusive study of the finite sample properties, which are still being investigated even for scalar time series. We do, however, wish to offer some practical guidance and report approaches which worked well for the data-generating processes we considered. Politis (2003, 2011) argues that the flat top kernel

$K(t) = \begin{cases} 1, & 0 \le t < 0.1, \\ 1.1 - |t|, & 0.1 \le t < 1.1, \\ 0, & |t| \ge 1.1, \end{cases}$  (3.50)

has better properties than the Bartlett or the Parzen kernels. In our empirical work, we used kernel (3.50). Our simulations showed that $h = N^{1/2}$ is satisfactory for our hypothesis testing problem when the observations are independent or weakly dependent (functional autoregressive processes). The empirical sizes and power functions change little if $h$ is taken 5 lags smaller or larger. We note that the optimal rates derived in Andrews (1) do not apply to kernel (3.50) because this piecewise function does not satisfy the regularity conditions assumed by Andrews (1). It can be shown that the optimal rates for the Bartlett and Parzen kernels remain the same in the functional case, but the multiplicative constants depend in a very complex way on the high-order moments of the functions, and the arguments Andrews (1) used to approximate them cannot be readily extended. Once the kernel and the bandwidth have been selected, the eigenvalues $\hat\lambda_i$ can be computed.
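For concreteness, the flat top kernel (3.50) and a grid-based version of the kernel long-run covariance estimate can be sketched in a few lines. This is our own illustration (all function names are hypothetical, and curves are assumed to be stored as rows of an N × m array), not the code used for the simulations in this chapter:

```python
import numpy as np

def flat_top_kernel(t):
    # K(t) = 1 for 0 <= |t| < 0.1, 1.1 - |t| for 0.1 <= |t| < 1.1, 0 otherwise  (3.50)
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t < 0.1, 1.0, np.where(t < 1.1, 1.1 - t, 0.0))

def long_run_cov(X, h=None):
    """Kernel estimate of the long-run covariance of the curves in X (an N x m array),
    evaluated on the grid: gamma_0 + sum_i K(i/h) (gamma_i + gamma_i^T)."""
    N, m = X.shape
    if h is None:
        h = int(np.sqrt(N))                  # the bandwidth h = N^{1/2} used in the chapter
    E = X - X.mean(axis=0)                   # centered curves
    C = E.T @ E / N                          # lag-0 autocovariance gamma_0(t, s)
    for i in range(1, N):
        w = float(flat_top_kernel(i / h))
        if w == 0.0:                         # K vanishes for lags beyond 1.1 h
            break
        g = E[i:].T @ E[:-i] / N             # lag-i autocovariance gamma_i(t, s)
        C += w * (g + g.T)
    return C
```

Because the flat top kernel has compact support, only lags up to $1.1h$ enter the sum, which keeps the computation cheap even for long series.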
This allows us to compute the normalized statistics $T_N^0(d)$ and $M_N^0(d)$ and use the tests based on the asymptotic distribution of their limits. The critical values can be computed by using expansions analogous to (3.51) or (3.52) (without the $\hat\lambda_i$). Alternatively, since these limits do not depend on the distribution of the data, the critical values can be obtained by calculating a large number of replications of $T_N^0(d)$ and $M_N^0(d)$ for any specific functional time series. We used iid Brownian motions, and we refer to the tests which use the critical values so obtained as $T_N^0(d)$(AM) and $M_N^0(d)$(AM) (Alternative Method). This method is extremely computationally intensive if its performance is to be assessed by simulations; we needed almost two months of run time on the University of Utah Supercomputer (as of June 2013) to obtain the empirical rejection rates for $T_N^0(d)$(AM) and $M_N^0(d)$(AM) for samples of size 100 and 250 and values of $d$ between 1 and 10. The limits of the statistics $T_N$ and $T_N^\lambda$ must be approximated by the MC distribution of $\sum_{i=1}^{d} \hat\lambda_i \int B_i^2(x)\,dx$, and one must proceed analogously for $M_N$ and $M_N^\lambda$. Using the expansions discussed in Shorack and Wellner (49), pp. 210–211, we use the approximations

$\hat T_{d,J} = \sum_{i=1}^{d} \hat\lambda_i \sum_{j=1}^{J} \frac{Z_{i,j}^2}{j^2 \pi^2}$  (3.51)

and

$\hat M_{d,J} = \sum_{i=1}^{d} \hat\lambda_i \sum_{j=1}^{J} \frac{Z_{i,2j-1}^2 + Z_{i,2j}^2}{4 j^2 \pi^2},$  (3.52)

where the $Z_{i,j}$ are iid standard normal random variables. For large $J$, the sums over $j$ approximate the integrals of the functionals of the Brownian bridge and eliminate the need to generate its trajectories and to perform numerical integration. In our work, we used $J = 100$, and one thousand replications to obtain MC distributions. To select $d$, we use the usual "cumulative variance" approach recommended by Ramsay and Silverman (46) and Horváth and Kokoszka (25); $d$ is chosen so that roughly $v\%$ of the sample variance is explained by the first $d$ principal components.
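Sampling the approximations (3.51) and (3.52) requires only standard normal draws. The following is a minimal sketch (our own, with hypothetical function names); note that the sums over $j$ truncated at $J = 100$ already reproduce the means $1/6$ and $1/12$ of the untruncated limits to three decimal places:

```python
import numpy as np

rng = np.random.default_rng(1)

def T_hat(lam, J=100, reps=1000):
    """Replications of (3.51): sum_i lam_i * sum_j Z_{i,j}^2 / (j^2 pi^2)."""
    lam = np.asarray(lam, dtype=float)
    j = np.arange(1.0, J + 1)
    Z2 = rng.normal(size=(reps, lam.size, J)) ** 2
    return (lam[None, :, None] * Z2 / (j * np.pi) ** 2).sum(axis=(1, 2))

def M_hat(lam, J=100, reps=1000):
    """Replications of (3.52): the centered bridge has eigenvalues
    1/(4 j^2 pi^2), each of multiplicity two."""
    lam = np.asarray(lam, dtype=float)
    j = np.arange(1.0, J + 1)
    Z2 = rng.normal(size=(reps, lam.size, J)) ** 2 + rng.normal(size=(reps, lam.size, J)) ** 2
    return (lam[None, :, None] * Z2 / (4.0 * (j * np.pi) ** 2)).sum(axis=(1, 2))

# A Monte Carlo critical value at the 5% level for given estimated eigenvalues:
crit = np.quantile(T_hat([1.0, 0.5, 0.25], reps=5000), 0.95)
```

With $d = 1$ and $\hat\lambda_1 = 1$, the sample mean of the $\hat T_{d,J}$ draws is close to $1/6$ and that of the $\hat M_{d,J}$ draws close to $1/12$, consistent with the normal approximations given after Theorem 3.2.2.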
In our implementation, we estimated the 49 largest eigenvalues (the largest number for which the estimation is numerically stable), and used $d = d_v$ such that

$\frac{\hat\lambda_1 + \cdots + \hat\lambda_{d_v}}{\hat\lambda_1 + \cdots + \hat\lambda_{49}} \ge v.$

A general recommendation is to use $v$ equal to about 90%, but we report results for $v = 0.85, 0.90, 0.95$, to see how the performance of the tests is affected by the choice of $d$. This is a new aspect of the stationarity tests, which reflects the infinite dimensional structure of functional data, and which is absent in tests for scalar or vector time series.

3.4.2 Empirical size and power

We first compare the empirical sizes of the tests implemented as described above. We consider two data-generating processes (DGPs): 1) iid copies of the Brownian motion (BM), 2) the functional AR process of order 1 (FAR(1)). There are a large number of stationary functional time series that could be considered. In our small simulation study, the focus on the BM is motivated by the application to cumulative intraday returns considered in Section 3.5; they look approximately like realizations of the BM; see Figure 3.1. The FAR(1), with Brownian motion innovations, is used to generate temporal dependence: the tests should have correct size for general stationary functional time series, not just for iid functions. The FAR(1) process is defined by the equation

$X_i(t) = \int_0^1 \Psi(t,u)\,X_{i-1}(u)\,du + W_i(t), \quad 0 \le t \le 1,$  (3.53)

where the $W_i$ are independent Brownian motions on $[0,1]$, and $\Psi$ is a kernel whose operator norm is not too large. The precise condition is somewhat technical; see Bosq (9) or Chapter 13 of Horváth and Kokoszka (25). A sufficient condition for a stationary solution to equation (3.53) to exist is that the Hilbert–Schmidt norm of $\Psi$ be less than 1. We work with the kernel

$\Psi(t,s) = c\,\exp\Big( \frac{t^2 + s^2}{2} \Big)$

with $c = 0.3416$, so that the Hilbert–Schmidt norm of $\Psi$ is approximately 0.5. We consider functional time series of length $N = 100$ and $N = 250$.
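A grid simulation of the FAR(1) DGP (3.53) takes only a few lines. The following is our own sketch (the function name, grid size, and burn-in length are assumptions), with the integral operator replaced by a Riemann sum:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_far1(N, m=100, c=0.3416, burn=50):
    """Simulate X_i(t) = int_0^1 Psi(t,u) X_{i-1}(u) du + W_i(t) on an m-point grid,
    with Psi(t,s) = c exp{(t^2+s^2)/2} and Brownian-motion innovations W_i."""
    grid = (np.arange(m) + 1) / m
    psi = c * np.exp((grid[:, None] ** 2 + grid[None, :] ** 2) / 2.0)
    X = np.zeros(m)
    out = np.empty((N, m))
    for i in range(N + burn):
        # Brownian motion on the grid as cumulative sums of Gaussian increments
        W = np.cumsum(rng.normal(scale=np.sqrt(1.0 / m), size=m))
        X = psi @ X / m + W          # Riemann-sum approximation of the integral operator
        if i >= burn:
            out[i - burn] = X
    return out

X = simulate_far1(250)
```

The Hilbert–Schmidt norm can be checked numerically: $\big(\iint \Psi^2(t,s)\,dt\,ds\big)^{1/2} = c \int_0^1 e^{t^2}\,dt \approx 0.5$ for $c = 0.3416$.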
Each DGP is simulated one thousand times, and the percentage of rejections of the null hypothesis is reported at the significance levels of 10 and 5%. The empirical sizes are reported in Table 3.1, which leads to the following conclusions:

1. The tests $T_N^0$(AM) and $M_N^0$(AM) have reasonably good empirical size, which does not depend on $v$. Note that we used the BM processes to obtain the critical values, so it is not surprising that we observe good results when using the BM as the DGP. However, the observations of the FAR(1) series are no longer BMs.

2. If the limit distribution is used to calculate the critical values, the tests based on the MC distributions (statistics $T_N$, $M_N$, $T_N^\lambda$, $M_N^\lambda$) are less sensitive to the choice of the cumulative variance $v$.

3. The tests based on $M_N$ and $M_N^\lambda$ are generally too conservative at the 5% level.

4. Even though the statistic $T_N^\lambda$ is too conservative at the 5% level in the case of the FAR(1) model, it achieves a reasonable balance of empirical size at the 10 and 5% levels.

5. If the temporal dependence is not too strong, we recommend the statistic $T_N^\lambda$ with $v = 90\%$.

We now turn to the investigation of the empirical power. The number of DGPs that could be considered under the alternative of nonstationarity is enormous. In our simulation study, we consider merely two examples intended to illustrate the theory developed in Section 3.3. Under the change point alternative, $H_{A,1}$, the DGP is

$X_i(t) = \begin{cases} B_i(t) & \text{if } i < \lfloor N/2 \rfloor, \\ B_i(t) + \delta(t) & \text{if } i \ge \lfloor N/2 \rfloor, \end{cases}$

where the $B_i$ are iid Brownian bridges and $\delta(t) = 2t(1-t)$, so that the change in the mean function is comparable to the typical size of the Brownian bridge. Under the I(1) alternative, $H_{A,2}$, we consider the integrated functional sequence defined by

$X_i(t) = X_{i-1}(t) + B_i(t), \quad 1 \le i \le N,$

where $X_0(t) = B_0(t)$, and $\{B_i(t)\}_{i=0}^{\infty}$ are iid Brownian bridges. Again, each data-generating process is simulated 1000 times and the rejection rate of $H_0$ is reported at the significance levels of 10% and 5%.
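The two nonstationary DGPs above are equally simple to generate on a grid; the following sketch is our own illustrative code, not the chapter's:

```python
import numpy as np

rng = np.random.default_rng(3)

def brownian_bridge(m, rng):
    w = np.cumsum(rng.normal(scale=np.sqrt(1.0 / m), size=m))
    x = (np.arange(m) + 1) / m
    return w - x * w[-1]                    # B(x) = W(x) - x W(1)

def change_point_dgp(N, m=100):
    """H_{A,1}: X_i = B_i + delta * 1{i >= N/2}, with delta(t) = 2 t (1 - t)."""
    t = (np.arange(m) + 1) / m
    delta = 2.0 * t * (1.0 - t)
    X = np.array([brownian_bridge(m, rng) for _ in range(N)])
    X[N // 2:] += delta                     # the mean changes half way through the sample
    return X

def integrated_dgp(N, m=100):
    """H_{A,2}: X_i = X_{i-1} + B_i with X_0 = B_0 (a functional random walk)."""
    B = np.array([brownian_bridge(m, rng) for _ in range(N + 1)])
    return np.cumsum(B, axis=0)[1:]
```

Averaging the second half of a long change-point sample and subtracting the average of the first half recovers the jump function $\delta$ up to sampling noise, which is a quick sanity check on the generator.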
Table 3.2 shows the results of these simulations. The following conclusions can be reached:

1. Under the change point alternative, the T statistics have higher power than the M statistics. This is in perfect agreement with Theorems 3.3.1 and 3.3.2, which show that the leading terms of the T statistics are four times larger than those of the corresponding M statistics.

2. The same observation remains true under the integrated alternative, and again it agrees with the theoretical rates obtained in Theorems 3.3.3 and 3.3.4. The multiplicative constants of the leading terms of the T statistics are equal to second moments, and those of the M statistics to the corresponding variances.

3. As for the empirical size, the T statistics are not sensitive to the choice of $v$.

4. The test based on $T_N^\lambda$ has slightly lower power than those based on $T_N^0$ and $T_N$, but this is because the latter two tests have slightly inflated sizes.

Our overall recommendation remains to use $T_N^\lambda$ with $v = 0.90$. However, if very high power is of central importance, and computational time is not a big concern, the method $T_N^0$(AM) might be superior.

3.5 Application to intraday price curves

Some of the most natural and obvious functional data are intraday price curves; five such functions are shown in Figure 3.2. Not much quantitative research has, however, focused on the analysis of the information contained in the shapes of such curves, even though they very closely reflect the reactions and expectations of intraday investors. Extensive research has focused on scalar or vector summary statistics derived from intraday data, including realized volatility and noise variance estimation; see Barndorff-Nielsen and Shephard (6) and Wang and Zou (51), among many others. Several papers have, however, considered the shapes of suitably defined price or volatility curves; see Gabrys et al. (15), Müller et al. (38), Gabrys et al. (16), Kokoszka and Reimherr (32), and Kokoszka et al. (33).
This chapter focuses on statistical methodology and the underlying theory, and we cannot include a comprehensive empirical study of the functional aspects of intraday price data. We merely show that the application of our tests leads to meaningful and useful insights. Suppose $P_n(t_j)$, $n = 1, \ldots, N$, $j = 1, \ldots, m$, is the price of a financial asset at time $t_j$ on day $n$. Figure 3.2 shows five functional data objects constructed from the 1-minute average price of Disney stock interpolated by B-splines. In this case, the number of points $t_j$ used to construct each object is $m = 390$. Each object is viewed as a continuous curve, making these data an excellent candidate for functional data analysis. As daily closing prices form a nonstationary scalar time series, we would expect the daily price curves to form a nonstationary functional time series. When our tests are applied to sufficiently long periods of time, they indeed always reject the null hypothesis of stationarity. For shorter periods of time, $H_0$ is sometimes rejected and sometimes is not, most likely due to reduced power. To illustrate, Figure 3.3 displays the P-values for the test based on $T_N$ applied to consecutive nonoverlapping segments of length $N$ in the time period from 04/09/1997 to 04/02/2007, which comprises 2,510 trading days. This means that there are 50 segments of length $N = 50$, 25 segments of length $N = 100$, and 10 segments of length $N = 250$. If $N = 250$, $H_0$ is always rejected. We obtained very similar results for the other T statistics. When the M statistics are used, the rejection rates are marginally lower, but overall commensurate with those for the T statistics. We also applied the tests to several other stocks over the same period, including Chevron, Bank of America, Microsoft, IBM, McDonald's, and Walmart, and obtained nearly identical results. The results are also very similar for gold futures. The price of gold increased fivefold between 2001 and 2011, with an almost linear trend.
For segments of length $N = 100$, the null is sometimes not rejected if the curves do not show a clear increasing tendency over that period, but otherwise we obtained strong rejections. In order to fit stationary functional time series models to intraday price curves, a suitable transformation should be applied. Gabrys et al. (15) put forward the following definition.

Definition 3.5.1. Suppose $P_n(t_j)$, $n = 1, \ldots, N$, $j = 1, \ldots, m$, is the price of a financial asset at time $t_j$ on day $n$. The functions

$R_n(t_j) = 100\,[\ln P_n(t_j) - \ln P_n(t_1)], \quad j = 1, 2, \ldots, m,\ n = 1, \ldots, N,$

are called the cumulative intraday returns (CIDRs).

The idea behind Definition 3.5.1 is very simple. If the return from the start of a trading day until its close remains within the 5% range, $R_n(t_j)$ is practically equal to the simple return $100\,[P_n(t_j) - P_n(t_1)]/P_n(t_1)$. Since $P_n(t_1)$ is fixed for every trading day, the $R_n(t_j)$ have practically the same shape as the price curves; see Figure 3.1. However, since they always start from zero, level stationarity is enforced. The division by $P_n(t_1)$ helps reduce the scale inflation. It can thus be hoped that the CIDRs form a stationary functional time series, which is amenable to the statistical analysis of the shapes of the intraday price curves. We note that the CIDRs are not readily comparable to daily returns because they do not include the overnight price change. They are designed for the statistical analysis of the evolution of the intraday shapes of an asset. We wish to verify our conjecture of the stationarity of the CIDRs by applying our tests of stationarity. If the conjecture is true, the expectation is that the P-values will be roughly uniformly distributed on $(0,1)$. Figure 3.4 shows the results of the test based on $T_N$ when applied to sequential segments of the CIDR curves of the Disney stock. We see that the P-values appear to be uniformly distributed, which is consistent with the stationarity of the CIDRs.
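Definition 3.5.1 amounts to a single vectorized operation per day. A small sketch (our own, assuming prices are stored as an N × m array):

```python
import numpy as np

def cidr(P):
    """Cumulative intraday returns R_n(t_j) = 100 [ln P_n(t_j) - ln P_n(t_1)]
    for a price array P of shape (N days, m intraday points)."""
    logP = np.log(P)
    return 100.0 * (logP - logP[:, :1])   # subtract each day's opening log price

# Every CIDR curve starts at zero, which enforces level stationarity:
P = np.array([[100.0, 101.0, 99.5],
              [ 99.0,  99.9, 100.8]])
R = cidr(P)
```

For small intraday moves, the CIDR is close to the simple return in percent: here `R[0, 1]` is about 0.995 versus a simple return of 1.0.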
Again, the results for the other eight stocks are very similar.

3.6 Proofs of the results of Section 3.2

The proof of Theorem 3.2.1 is based on an approximation developed in Berkes et al. (8) (Theorem 3.6.1 below). Define

$\Gamma(x,t) = \sum_{i=1}^{\infty} \lambda_i^{1/2}\,W_i(x)\,\varphi_i(t),$  (3.54)

where the $W_i$ are independent and identically distributed Wiener processes (standard Brownian motions). Clearly, $\Gamma(x,t)$ is Gaussian with zero mean and

$E\Gamma(x,t)\Gamma(y,s) = \min(x,y)\,C(t,s).$

Theorem 3.6.1. If assumptions (3.1)–(3.4) hold, then

$\sum_{\ell=1}^{\infty} \lambda_\ell < \infty$  (3.55)

and for every $N$ we can define a sequence of Gaussian processes $\Gamma_N(x,t)$ such that

$\{\Gamma_N(x,t);\ 0 \le x,t \le 1\} \stackrel{\mathcal D}{=} \{\Gamma(x,t);\ 0 \le x,t \le 1\}$

and

$\sup_{0 \le x \le 1} \int \big( V_N(x,t) - \Gamma_N(x,t) \big)^2 dt = o_P(1),$

where

$V_N(x,t) = \frac{1}{N^{1/2}} \sum_{i=1}^{\lfloor Nx \rfloor} \varepsilon_i(t).$

(It follows immediately from (3.55) that $\sup_{0 \le x \le 1} \int \Gamma^2(x,t)\,dt < \infty$ a.s.)

Proof of Theorem 3.2.1. Let

$V_N^0(x,t) = V_N(x,t) - x V_N(1,t).$

Under $H_0$,

$Z_N(x,t) = V_N^0(x,t) + \mu(t)\,\frac{\lfloor Nx \rfloor - Nx}{N^{1/2}},$

and since $\mu \in L^2$, we get

$\sup_{0 \le x \le 1} \| Z_N(x,\cdot) - V_N^0(x,\cdot) \| \le \frac{1}{N^{1/2}}\,\|\mu\|.$

Hence

$T_N = \iint \big( V_N^0(x,t) \big)^2 dt\,dx + o_P(1)$

and

$M_N = \iint \Big( V_N^0(x,t) - \int V_N^0(y,t)\,dy \Big)^2 dx\,dt + o_P(1).$

Applying Theorem 3.6.1, we get immediately that

$T_N \xrightarrow{\mathcal D} \iint \big( \Gamma^0(x,t) \big)^2 dx\,dt \quad \text{and} \quad M_N \xrightarrow{\mathcal D} \iint \Big( \Gamma^0(x,t) - \int \Gamma^0(y,t)\,dy \Big)^2 dx\,dt,$

where $\Gamma^0(x,t) = \Gamma(x,t) - x\Gamma(1,t)$. We also note that, by the definition of $\Gamma(x,t)$ in (3.54), we have

$\Gamma^0(x,t) = \sum_{i=1}^{\infty} \lambda_i^{1/2}\,B_i(x)\,\varphi_i(t),$  (3.56)

where the $B_i$ are independent and identically distributed Brownian bridges. Using the fact that $\{\varphi_i(t);\ 0 \le t \le 1\}_{i=1}^{\infty}$ is an orthonormal system, one can easily verify that

$\iint \big( \Gamma^0(x,t) \big)^2 dx\,dt = \sum_{i=1}^{\infty} \lambda_i \int B_i^2(x)\,dx$

and

$\iint \Big( \Gamma^0(x,t) - \int \Gamma^0(y,t)\,dy \Big)^2 dt\,dx = \sum_{i=1}^{\infty} \lambda_i \int \Big( B_i(x) - \int B_i(y)\,dy \Big)^2 dx.$

The following lemma is an immediate consequence of the results in Section 2.7 of Horváth et al. (27), or of Dunford and Schwartz (13).

Lemma 3.6.1.
If assumptions (3.1)–(3.4), (3.13)–(3.16), (3.17), and $H_0$ hold, then

$\max_{1 \le i \le d} |\hat\lambda_i - \lambda_i| = o_P(1) \quad \text{and} \quad \max_{1 \le i \le d} \| \hat\varphi_i - \hat c_i \varphi_i \| = o_P(1),$

where $\hat c_1, \hat c_2, \ldots, \hat c_d$ are unobservable random signs defined as $\hat c_i = \mathrm{sign}(\langle \hat\varphi_i, \varphi_i \rangle)$.

Proof of Theorem 3.2.2. It follows from Theorem 3.6.1 that

$\sup_{0 \le x \le 1} |\langle S_N(x,\cdot) - \Gamma_N^0(x,\cdot), \varphi_i \rangle| \le \sup_{0 \le x \le 1} \| S_N(x,\cdot) - \Gamma_N^0(x,\cdot) \| = o_P(1),$

and by Lemma 3.6.1, we get

$\sup_{0 \le x \le 1} |\langle \Gamma_N^0(x,\cdot), \hat\varphi_i - \hat c_i \varphi_i \rangle| \le \sup_{0 \le x \le 1} \| \Gamma_N^0(x,\cdot) \|\, \| \hat\varphi_i - \hat c_i \varphi_i \| = o_P(1).$

It is immediate from (3.56) that, for all $N$,

$\big\{ \langle \Gamma_N^0(x,\cdot), \varphi_i \rangle;\ 0 \le x \le 1,\ 1 \le i \le d \big\} \stackrel{\mathcal D}{=} \big\{ \lambda_i^{1/2} B_i(x);\ 0 \le x \le 1,\ 1 \le i \le d \big\},$

where $B_1, B_2, \ldots, B_d$ are independent Brownian bridges. Thus we obtain that

$\sum_{i=1}^{d} \frac{1}{\hat\lambda_i} \langle \Gamma_N^0(x,\cdot), \hat c_i \varphi_i \rangle^2 \xrightarrow{D[0,1]} \sum_{i=1}^{d} B_i^2(x).$  (3.57)

The weak convergence in (3.57) now implies (3.18). The same arguments can be used to prove (3.19)–(3.21).

3.7 Proofs of the results of Section 3.3

Proof of Theorem 3.3.1. First we introduce the function

$\Delta_N(x,t) = \mu(t)\{\lfloor Nx \rfloor - Nx\} + \delta(t)\big\{ (\lfloor Nx \rfloor - k^*) I\{k^* \le \lfloor Nx \rfloor\} - x(N - k^*) \big\}.$

Under $H_{A,1}$, we can write

$Z_N(x,t) = V_N^0(x,t) + N^{-1/2} \Delta_N(x,t)$  (3.58)

and therefore,

$T_N = \iint Z_N^2(x,t)\,dt\,dx$  (3.59)

$= \iint \big( V_N^0(x,t) \big)^2 dt\,dx + \frac{2}{N^{1/2}} \iint V_N^0(x,t)\,\Delta_N(x,t)\,dx\,dt + \frac{1}{N} \iint \Delta_N^2(x,t)\,dt\,dx.$

It follows from Theorem 3.6.1 that

$\iint \big( V_N^0(x,t) \big)^2 dt\,dx = O_P(1).$  (3.60)

It is easy to check that

$\sup_{0 \le x \le 1} \Big\| \frac{1}{N} \Delta_N(x,\cdot) - \Delta(x,\cdot) \Big\| = O\Big( \frac{1}{N} \Big),$  (3.61)

where $\Delta(x,t)$ is defined in (3.23). Thus, applying Theorem 3.6.1, we conclude that

$\frac{1}{N} \iint V_N^0(x,t)\,\Delta_N(x,t)\,dx\,dt \xrightarrow{\mathcal D} \iint \Gamma^0(x,t)\,\Delta(x,t)\,dt\,dx.$  (3.62)

Also,

$\frac{1}{N} \iint \Delta_N^2(x,t)\,dt\,dx$  (3.63)

$= N \int \delta^2(t)\,dt \Big( \int_0^\theta x^2 (1-\theta)^2\,dx + \int_\theta^1 (1-x)^2 \theta^2\,dx \Big) + O(1).$

Now (3.24) is an immediate consequence of (3.59)–(3.63). The second part of Theorem 3.3.1 is proven analogously.

3.7.1 Variances of the limits in Theorem 3.3.1

The next lemma is used to show that the variances of the limits in Theorem 3.3.1 are strictly positive.

Lemma 3.7.1. Let $\zeta$ be an $L^2$-valued Gaussian process such that $E\zeta(t) = 0$ and $E\zeta(t)\zeta(s)$ is a strictly positive definite function on $[0,1]^2$. Let $g \in L^2$.
Then $\mathrm{var}\big( \int \zeta(t) g(t)\,dt \big) = 0$ if and only if $g = 0$ a.e.

Proof. By the Karhunen–Loève expansion and the assumption that $E\zeta(t)\zeta(s)$ is strictly positive definite, we may write

$\zeta(t) = \sum_{\ell=1}^{\infty} \tau_\ell\,N_\ell\,e_\ell(t), \quad 0 \le t \le 1,$

where the $N_\ell$ are iid standard normal random variables, $\{e_\ell(t)\}_{\ell=1}^{\infty}$ form an orthonormal basis, and $\tau_\ell > 0$ for all $\ell \ge 1$. It follows by a simple calculation that

$\int \zeta(t) g(t)\,dt = \sum_{\ell=1}^{\infty} \tau_\ell\,N_\ell\,\langle e_\ell, g \rangle,$

and hence

$\mathrm{var}\Big( \int \zeta(t) g(t)\,dt \Big) = \sum_{\ell=1}^{\infty} \tau_\ell^2\,\langle e_\ell, g \rangle^2.$

Since $\sum_{\ell=1}^{\infty} \tau_\ell^2 \langle e_\ell, g \rangle^2 = 0$ if and only if $g = 0$ a.e., the result follows.

It is easy to see that $\iint \Gamma^0(x,t)\,\Delta(x,t)\,dt\,dx$ is a normal random variable with zero mean. Its variance is thus equal to

$E\Big( \iint \Gamma^0(x,t)\,\Delta(x,t)\,dt\,dx \Big)^2$  (3.64)

$= \iiiint C(t,s)\,\Delta(x,t)\Delta(y,s)\,(\min(x,y) - xy)\,dt\,ds\,dx\,dy$

$= \iint C(t,s)\,\delta(t)\delta(s)\,dt\,ds \iint \psi(x)\psi(y)\,(\min(x,y) - xy)\,dx\,dy,$

where $\psi(x)$ is defined in (3.39). Similarly to (3.24), the limit in (3.25) is normally distributed with zero mean and variance equal to

$E\Big( \iint \Big( \Gamma^0(x,t) - \int \Gamma^0(y,t)\,dy \Big)\Big( \Delta(x,t) - \int \Delta(y,t)\,dy \Big) dt\,dx \Big)^2$

$= \iint C(t,s)\,\delta(t)\delta(s)\,dt\,ds \iint \psi(x)\psi(y) \Big[ \min(x,y) - xy - \int (\min(y,z) - yz)\,dz - \int (\min(x,z) - xz)\,dz + \iint (\min(z,z') - zz')\,dz\,dz' \Big] dx\,dy$

$= \iint C(t,s)\,\delta(t)\delta(s)\,dt\,ds \iint \psi(x)\psi(y) \Big[ \min(x,y) - xy - \frac{y(1-y)}{2} - \frac{x(1-x)}{2} + \frac{1}{12} \Big] dx\,dy.$

If the bivariate function $C(t,s)$ is strictly positive definite, then $\iint C(t,s)\delta(t)\delta(s)\,dt\,ds > 0$ if $\delta(t)$ is not the zero function in $L^2$. Observing that $\iint \psi(x)\psi(y)(\min(x,y) - xy)\,dx\,dy = \mathrm{var}\big( \int B(x)\psi(x)\,dx \big)$, where $B$ is a Brownian bridge, the positivity of (3.64) follows by Lemma 3.7.1, since $\psi(x)$ is not the zero function and the covariance function of the Brownian bridge is strictly positive definite. A similar application of Lemma 3.7.1 yields that

$\iint \psi(x)\psi(y) \Big[ \min(x,y) - xy - \frac{y(1-y)}{2} - \frac{x(1-x)}{2} + \frac{1}{12} \Big] dx\,dy > 0.$

Lemma 3.7.2. If assumptions (3.1)–(3.4), (3.13)–(3.16), (3.22), and $H_{A,1}$ hold, then

$\Big\| \hat C_N(t,s) - \Big( 2\theta(1-\theta)\,\delta(t)\delta(s) \sum_{i=1}^{N} K(i/h) + C_N(t,s) \Big) \Big\| = O_P(h/N^{1/2}),$

where

$C_N(t,s) = \gamma_0(t,s) + \sum_{i=1}^{N-1} K\Big( \frac{i}{h} \Big) \{ \gamma_i(t,s) + \gamma_i(s,t) \}$  (3.65)

with

$\gamma_i(t,s) = \frac{1}{N} \sum_{j=i+1}^{N} \big( \varepsilon_j(t) - \bar\varepsilon_N(t) \big)\big( \varepsilon_{j-i}(s) - \bar\varepsilon_N(s) \big), \quad 0 \le i \le N-1.$

Proof.
First we write $X_i(t) = \mu_i(t) + \varepsilon_i(t)$ with $\mu_i(t) = EX_i(t)$, and observe that

$\hat\gamma_i(t,s) = \frac{1}{N} \sum_{j=i+1}^{N} \big( \varepsilon_j(t) - \bar\varepsilon_N(t) + [\mu_j(t) - \bar\mu_N(t)] \big)\big( \varepsilon_{j-i}(s) - \bar\varepsilon_N(s) + [\mu_{j-i}(s) - \bar\mu_N(s)] \big)$

$= \frac{1}{N} \sum_{j=i+1}^{N} \big( \varepsilon_j(t) - \bar\varepsilon_N(t) \big)\big( \varepsilon_{j-i}(s) - \bar\varepsilon_N(s) \big) + \frac{1}{N} \sum_{j=i+1}^{N} \big( \varepsilon_j(t) - \bar\varepsilon_N(t) \big)\big( \mu_{j-i}(s) - \bar\mu_N(s) \big)$

$\quad + \frac{1}{N} \sum_{j=i+1}^{N} \big( \mu_j(t) - \bar\mu_N(t) \big)\big( \varepsilon_{j-i}(s) - \bar\varepsilon_N(s) \big) + \frac{1}{N} \sum_{j=i+1}^{N} \big( \mu_j(t) - \bar\mu_N(t) \big)\big( \mu_{j-i}(s) - \bar\mu_N(s) \big)$

$= \gamma_i(t,s) + \hat\gamma_i^{(1)}(t,s) + \hat\gamma_i^{(2)}(t,s) + \hat\gamma_i^{(3)}(t,s),$

with

$\bar\varepsilon_N(t) = \frac{1}{N} \sum_{\ell=1}^{N} \varepsilon_\ell(t) \quad \text{and} \quad \bar\mu_N(t) = \mu(t) + \frac{N - \lfloor N\theta \rfloor}{N}\,\delta(t).$

By the triangle inequality, we have

$\Big\| \hat\gamma_0^{(1)}(t,s) + \sum_{i=1}^{N-1} K\Big( \frac{i}{h} \Big) \big( \hat\gamma_i^{(1)}(t,s) + \hat\gamma_i^{(1)}(s,t) \big) \Big\| \le \| \hat\gamma_0^{(1)}(t,s) \| + \Big\| \sum_{i=1}^{N-1} K\Big( \frac{i}{h} \Big) \hat\gamma_i^{(1)}(t,s) \Big\| + \Big\| \sum_{i=1}^{N-1} K\Big( \frac{i}{h} \Big) \hat\gamma_i^{(1)}(s,t) \Big\|.$

Using Theorem 3.6.1, we get $\| \hat\gamma_0^{(1)}(t,s) \| = O_P(N^{-1/2})$. Using the triangle inequality again, we obtain that

$E\Big\| \sum_{i=1}^{N-1} K(i/h)\,\hat\gamma_i^{(1)}(t,s) \Big\| \le \sum_{i=1}^{N-1} K(i/h)\,E\| \hat\gamma_i^{(1)}(t,s) \|.$  (3.66)

Furthermore, by an application of the Cauchy–Schwarz inequality,

$E\| \hat\gamma_i^{(1)}(t,s) \| \le \Big\| \frac{1}{N} \sum_{j=i+1}^{N} \big( \mu_{j-i}(s) - \bar\mu_N(s) \big) \Big\|\, E\Big\| \frac{1}{N} \sum_{j=i+1}^{N} \big( \varepsilon_j(t) - \bar\varepsilon_N(t) \big) \Big\|.$

It is clear that

$\max_{1 \le i \le N} \Big\| \frac{1}{N} \sum_{j=i+1}^{N} \big( \mu_{j-i}(s) - \bar\mu_N(s) \big) \Big\| = O(1),$

and by Berkes et al. (2013),

$\max_{1 \le i \le N} E\Big\| \frac{1}{N} \sum_{j=i+1}^{N} \big( \varepsilon_j(t) - \bar\varepsilon_N(t) \big) \Big\| = O(N^{-1/2}).$

Combining these bounds with (3.66) and assumptions (3.13)–(3.15) gives

$E\Big\| \sum_{i=1}^{N-1} K(i/h)\,\hat\gamma_i^{(1)}(t,s) \Big\| = O(h/N^{1/2}),$

and hence, by Markov's inequality,

$\Big\| \sum_{i=1}^{N-1} K(i/h)\,\hat\gamma_i^{(1)}(t,s) \Big\| = O_P(h/N^{1/2}).$

Thus we conclude

$\Big\| \hat\gamma_0^{(1)}(t,s) + \sum_{i=1}^{N-1} K\Big( \frac{i}{h} \Big) \big( \hat\gamma_i^{(1)}(t,s) + \hat\gamma_i^{(1)}(s,t) \big) \Big\| = O_P(h/N^{1/2}).$  (3.67)

Similarly to (3.67), we have

$\Big\| \hat\gamma_0^{(2)}(t,s) + \sum_{i=1}^{N-1} K\Big( \frac{i}{h} \Big) \big( \hat\gamma_i^{(2)}(t,s) + \hat\gamma_i^{(2)}(s,t) \big) \Big\| = O_P(h/N^{1/2}).$  (3.68)

Using the definition of $\bar\mu_N(t)$ and $H_{A,1}$, we obtain that

$\max_{0 \le i \le h} \| \hat\gamma_i^{(3)}(t,s) - \theta(1-\theta)\,\delta(t)\delta(s) \| = O(h/N).$  (3.69)

The lemma now follows from (3.67)–(3.69).

Proof of Theorem 3.3.2. The proof of Theorem 3.3.2 is based on the asymptotic properties of $\hat C_N$ under $H_{A,1}$. It follows from Lemma 3.7.2 that (3.26) and (3.27) hold assuming only (3.16).
We write, by (3.58),

$\langle Z_N(x,\cdot), \hat\varphi_1 \rangle^2 = \langle V_N^0(x,\cdot), \hat\varphi_1 \rangle^2 + N^{-1} \langle \Delta_N(x,\cdot), \hat\varphi_1 \rangle^2 + 2 \langle V_N^0(x,\cdot), \hat\varphi_1 \rangle\,N^{-1/2} \langle \Delta_N(x,\cdot), \hat\varphi_1 \rangle.$

Combining Theorem 3.6.1 with the Cauchy–Schwarz inequality, we get

$\sup_{0 \le x \le 1} |\langle V_N^0(x,\cdot), \hat\varphi_1 \rangle| \le \sup_{0 \le x \le 1} \| V_N^0(x,\cdot) \| = O_P(1).$

Using (3.61), we conclude

$\int N^{-1} \langle \Delta_N(x,\cdot), \hat\varphi_1 \rangle^2\,dx = \frac{N}{3}\,\theta^2(1-\theta)^2 \langle \delta, \hat\varphi_1 \rangle^2 \big( 1 + O_P(1/N) \big).$

Theorem 3.6.1 and (3.27) yield

$N^{-1/2} \int \langle V_N^0(x,\cdot), \hat\varphi_1 \rangle \langle \Delta_N(x,\cdot), \hat\varphi_1 \rangle\,dx \xrightarrow{\mathcal D} \frac{1}{\|\delta\|^2} \int \langle \Gamma^0(x,\cdot), \delta \rangle \langle \Delta(x,\cdot), \delta \rangle\,dx$

$= \int \Big( \int \Gamma^0(x,t)\,\delta(t)\,dt \Big) \big[ (x-\theta) I\{x \ge \theta\} - x(1-\theta) \big]\,dx = \iint \Gamma^0(x,t)\,\Delta(x,t)\,dx\,dt.$

This completes the proof of (3.29). It follows from (3.29) that

$\hat\lambda_1 N^{-1/2} \Big( T_N^0(1) - \frac{N}{3\hat\lambda_1}\,\theta^2(1-\theta)^2 \langle \delta, \hat\varphi_1 \rangle^2 \Big) \xrightarrow{\mathcal D} 2 \iint \Gamma^0(x,t)\,\Delta(x,t)\,dx\,dt,$

and therefore, (3.29) implies (3.31). Similar arguments prove (3.30) and (3.32). If in addition we assume that $h/N^{1/2} \to 0$ as $N \to \infty$, then by Lemma 3.7.2 and Dunford and Schwartz (13), we have (3.26), (3.27), and, for every fixed $i \ge 2$,

$\hat\lambda_i \xrightarrow{P} \bar\lambda_i,$  (3.70)

where $\bar\lambda_2 \ge \bar\lambda_3 \ge \cdots \ge 0$ (different from the $\lambda_i$, $i \ge 2$), and

$\| \hat\varphi_i(t) - \hat c_i \bar\varphi_i \| = o_P(1), \quad i \ge 2,$  (3.71)

with some functions $\bar\varphi_2, \bar\varphi_3, \ldots$, where $\hat c_i = \mathrm{sign}(\langle \hat\varphi_i, \bar\varphi_i \rangle)$. (Of course, $\bar\varphi_i$ is only defined if $\bar\lambda_i > 0$.) Using again (3.58) with Theorem 3.6.1 and (3.71), we obtain that

$\int \langle Z_N(x,\cdot), \hat\varphi_i \rangle^2\,dx = \frac{N}{3}\,\theta^2(1-\theta)^2 \langle \delta, \hat\varphi_i \rangle^2 + O_P(N^{1/2}).$

Since $\delta$ and $\bar\varphi_i$ are orthogonal for all $i \ge 2$, (3.71) implies $\langle \delta, \hat\varphi_i \rangle = o_P(1)$. Hence (3.33) follows from (3.29). The results in (3.34)–(3.36) can be established similarly, so the proofs are omitted.

Proof of Remark 3.1. Let

$\bar\Delta_N(x,t) = \mu(t)\{\lfloor Nx \rfloor - Nx\} + \delta_N(t)\big\{ (\lfloor Nx \rfloor - k^*) I\{k^* \le \lfloor Nx \rfloor\} - x(N - k^*) \big\}.$

Using (3.59), with $\Delta_N(x,t)$ replaced by $\bar\Delta_N(x,t)$, and Theorem 3.6.1, we get

$T_N - \frac{1}{N}\|\bar\Delta_N\|^2 = \iint \big( \Gamma_N^0(x,t) \big)^2 dt\,dx\,(1 + o_P(1))$  (3.72)

$\quad + 2 N^{1/2} \iint \Gamma_N^0(x,t)\,\delta_N(t)\,\psi(x)\,dt\,dx\,(1 + o_P(1)).$

By the Cauchy–Schwarz inequality,

$\iint \Gamma_N^0(x,t)\,\delta_N(t)\,\psi(x)\,dt\,dx = O_P(\|\delta_N\|).$  (3.73)

Elementary arguments show that

$\frac{1}{N}\|\bar\Delta_N\|^2 = \|\psi\|^2\,N\|\delta_N\|^2\,(1 + o(1)),$  (3.74)

as $N \to \infty$. If $N^{1/2}\|\delta_N\| \to 0$ as $N \to \infty$, then by (3.72)–(3.74) we obtain immediately that $T_N \xrightarrow{\mathcal D} \iint (\Gamma^0(x,t))^2\,dt\,dx$. If $N^{1/2}\|\delta_N\| \to \infty$, then again by (3.72)–(3.74), we see that $T_N \xrightarrow{P} \infty$.
Since, for every fixed $N$, $\iint \Gamma_N^0(x,t)\,\delta_N(t)\,\psi(x)\,dt\,dx$ is normal with zero mean and variance

$\iint (\min(x,y) - xy)\,\psi(x)\psi(y)\,dx\,dy \iint \delta_N(t)\delta_N(s)\,C(t,s)\,dt\,ds,$

(3.37) follows. In the case when $N^{1/2}\delta_N \xrightarrow{L^2} \delta$, it follows from (3.74) that $(1/N)\|\bar\Delta_N\|^2 \to \tau = \|\psi\|^2\|\delta\|^2 > 0$. Now, by (3.72) and the representation of $\Gamma_N^0$ in (3.56), we conclude

$T_N \stackrel{\mathcal D}{=} \tau(1 + o(1)) + \sum_{\ell=1}^{\infty} \lambda_\ell^{1/2} \Big( \lambda_\ell^{1/2} \int B_\ell^2(x)\,dx + 2 \int B_\ell(x)\,\psi(x)\,dx \int \varphi_\ell(t)\,N^{1/2}\delta_N(t)\,dt \Big)(1 + o_P(1))$

$\xrightarrow{\mathcal D} \tau + \sum_{\ell=1}^{\infty} \big\{ \lambda_\ell \|B_\ell\|^2 + 2\lambda_\ell^{1/2} \langle B_\ell, \psi \rangle \langle \varphi_\ell, \delta \rangle \big\},$

which completes the proof of (3.38).

Lemma 3.7.3. If assumptions (3.1)–(3.4) hold, then

$\sup_{0 \le x \le 1} \int \Big( U_N(x,t) - \int_0^x \Gamma_N(u,t)\,du \Big)^2 dt = o_P(1),$  (3.75)

where

$U_N(x,t) = \frac{1}{N^{3/2}} \sum_{k=1}^{\lfloor Nx \rfloor} \sum_{i=1}^{k} \varepsilon_i(t),$

and the Gaussian processes $\Gamma_N(x,t)$ are defined in Theorem 3.6.1.

Proof. It is enough to verify that

$\sup_{0 \le x \le 1} \int \Big( U_N(x,t) - \int_0^x V_N(u,t)\,du \Big)^2 dt = \sup_{0 \le x \le 1} \Big\| U_N(x,\cdot) - \int_0^x V_N(u,\cdot)\,du \Big\|^2 = o_P(1)$

and

$\sup_{0 \le x \le 1} \int \Big( \int_0^x \{ V_N(u,t) - \Gamma_N(u,t) \}\,du \Big)^2 dt = o_P(1).$

Elementary arguments yield

$\Big\| U_N(x,\cdot) - \int_0^x V_N(u,\cdot)\,du \Big\| \le \frac{1}{N^{3/2}} \Big\| \sum_{i=1}^{\lfloor Nx \rfloor} \varepsilon_i(\cdot) \Big\|.$

It follows from Theorem 3.6.1 that

$\sup_{0 \le x \le 1} \Big\| N^{-1/2} \sum_{i=1}^{\lfloor Nx \rfloor} \varepsilon_i(\cdot) \Big\| = O_P(1),$

and therefore,

$\sup_{0 \le x \le 1} \Big\| U_N(x,\cdot) - \int_0^x V_N(u,\cdot)\,du \Big\| = O_P\Big( \frac{1}{N} \Big).$

Using the Cauchy–Schwarz inequality with Theorem 3.6.1, we conclude

$\int \Big( \int_0^x \big( V_N(u,t) - \Gamma_N(u,t) \big)\,du \Big)^2 dt \le \int \int_0^x \big( V_N(u,t) - \Gamma_N(u,t) \big)^2 du\,dt \le \iint \big( V_N(u,t) - \Gamma_N(u,t) \big)^2 du\,dt = o_P(1).$

Now the proof of Lemma 3.7.3 is complete.

Proof of Theorem 3.3.3. First we note that under $H_{A,2}$ we have

$\frac{1}{N^{3/2}} \sum_{k=1}^{\lfloor Nx \rfloor} X_k(t) = U_N(x,t) + \frac{\lfloor Nx \rfloor}{N^{3/2}}\,\mu(t).$  (3.76)

Therefore,

$\frac{1}{N} Z_N(x,t) = U_N(x,t) - x U_N(1,t) + \frac{\lfloor Nx \rfloor - xN}{N^{3/2}}\,\mu(t).$

Using (3.76), we get via the Cauchy–Schwarz inequality

$\Big| \iint \Big( \frac{1}{N} Z_N(x,t) \Big)^2 dt\,dx - \iint \big( U_N(x,t) - x U_N(1,t) \big)^2 dt\,dx \Big|$

$\le \iint \Big( \frac{1}{N} Z_N(x,t) - \big[ U_N(x,t) - x U_N(1,t) \big] \Big)^2 dt\,dx$

$\quad + 2 \Big( \iint \Big( \frac{1}{N} Z_N(x,t) - \big[ U_N(x,t) - x U_N(1,t) \big] \Big)^2 dt\,dx \Big)^{1/2} \Big( \iint \big( U_N(x,t) - x U_N(1,t) \big)^2 dt\,dx \Big)^{1/2}$

$= o_P(1) + o_P(1)\,O_P(1) = o_P(1),$

since by Lemma 3.7.3,

$\iint \big( U_N(x,t) - x U_N(1,t) \big)^2 dt\,dx = O_P(1).$

It also follows from Lemma 3.7.3 that

$\iint \big( U_N(x,t) - x U_N(1,t) \big)^2 dt\,dx \xrightarrow{\mathcal D} \iint \bar\Gamma^2(x,t)\,dt\,dx,$

which completes the proof of (3.41).
The proof of (3.42) is similar to that of (3.41), and therefore the details are omitted.

Lemma 3.7.4. Define
\[
I_N(z,t) = \int_0^z \Gamma_N(u,t)\,du - \int_0^1\!\Bigl( \int_0^v \Gamma_N(u,t)\,du \Bigr) dv,
\]
where the Gaussian processes \Gamma_N(x,t) are defined in Theorem 3.6.1. Let
\[
Q_N(t,s) = 2\int_0^c K(w)\,dw \int_0^1 I_N(z,t)\,I_N(z,s)\,dz.
\]
If assumptions (3.1)–(3.4), (3.13)–(3.16), and H_{A,2} hold, then
\[
\Bigl\| \frac{1}{Nh}\hat C_N(t,s) - Q_N(t,s) \Bigr\| = o_P(1).
\]

Proof. Since
\[
\bar X_N(t) = \mu(t) + \frac{1}{N} \sum_{j=1}^N \sum_{\ell=1}^j \epsilon_\ell(t),
\]
Theorem 3.6.1 yields
\[
\Bigl\| N^{-1/2}(\bar X_N(t) - \mu(t)) - \int_0^1\!\Bigl( \int_0^v \Gamma_N(u,t)\,du \Bigr) dv \Bigr\| = o_P(1),
\]
resulting in
\[
\max_{1\le i\le N-1} \Bigl\| \frac{1}{N}\hat\gamma_i(t,s) - \frac{1}{N} \sum_{j=i+1}^N I_N\Bigl(\frac{j}{N}, t\Bigr)\, I_N\Bigl(\frac{j-i}{N}, s\Bigr) \Bigr\| = o_P(1). \tag{3.77}
\]
Next we use the almost sure continuity of \Gamma_N together with \Gamma_N(0,t) = 0 to conclude
\[
\max_{1\le i\le ch} \Bigl\| \frac{1}{N} \sum_{j=i+1}^N I_N\Bigl(\frac{j}{N}, t\Bigr)\, I_N\Bigl(\frac{j-i}{N}, s\Bigr) - \frac{1}{N} \sum_{j=i+1}^N I_N\Bigl(\frac{j}{N}, t\Bigr)\, I_N\Bigl(\frac{j}{N}, s\Bigr) \Bigr\| = o_P(1). \tag{3.78}
\]
Putting together (3.77) and (3.78), we get
\[
\max_{1\le i\le ch} \Bigl\| \frac{1}{N}\hat\gamma_i(t,s) - \int_{i/N}^1 I_N(z,t)\,I_N(z,s)\,dz \Bigr\| = o_P(1)
\]
and
\[
\max_{1\le i\le ch} \Bigl\| \int_{i/N}^1 I_N(z,t)\,I_N(z,s)\,dz - \int_0^1 I_N(z,t)\,I_N(z,s)\,dz \Bigr\| = o_P(1).
\]
Since K satisfies conditions (3.14) and (3.15), the proof of Lemma 3.7.4 is complete.

Lemma 3.7.5. For every N \ge 1 we have
\[
\bigl\{ Q_N(t,s),\ 0\le t,s\le 1 \bigr\} \stackrel{\mathcal D}{=} 2\int_0^c K(w)\,dw \Bigl\{ \sum_{i,j=1}^{\infty} \lambda_i^{1/2}\lambda_j^{1/2} \varphi_i(t)\varphi_j(s)\,\xi_{i,j} \Bigr\}, \tag{3.79}
\]
where \lambda_1, \lambda_2, \dots, \varphi_1, \varphi_2, \dots are defined in (3.9) and for every i, j \ge 1,
\[
\xi_{i,j} \stackrel{\mathcal D}{=} \int_0^1 \Bigl[ \int_0^z W_i(u)\,du - \int_0^1\!\int_0^v W_i(u)\,du\,dv \Bigr] \Bigl[ \int_0^z W_j(u)\,du - \int_0^1\!\int_0^v W_j(u)\,du\,dv \Bigr] dz,
\]
where W_1, W_2, \dots are independent Wiener processes. Also, Q_N(t,s) is a non-negative definite function for all N with probability one.

Proof. The representation in (3.79) is an immediate consequence of (3.54). It follows from (3.79) that Q_N(t,s) is symmetric and Q_N \in L^2 with probability one.
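The random variables ξ_{i,j} in Lemma 3.7.5 are straightforward to simulate, and the diagonal terms make the non-negative definiteness transparent: ξ_{i,i} is the integral of a square. A Monte Carlo sketch (grid size and seed are illustrative):

```python
import math
import random

random.seed(1)

# Sketch: simulate I(z) = ∫₀ᶻ W(u) du − ∫₀¹ ( ∫₀ᵛ W(u) du ) dv for a Wiener
# process W on a grid, and form ξ_{i,j} = ∫₀¹ I_i(z) I_j(z) dz for independent
# copies.  The diagonal ξ_{i,i} is always nonnegative, which is the heart of
# the non-negative definiteness of Q_N.
def centered_integrated_wiener(n):
    dt = 1.0 / n
    w, integral, J = 0.0, 0.0, []
    for _ in range(n):
        w += random.gauss(0.0, math.sqrt(dt))  # W via a scaled random walk
        integral += w * dt                     # ≈ ∫₀ᶻ W(u) du
        J.append(integral)
    mean_J = sum(J) / n                        # ≈ ∫₀¹ ∫₀ᵛ W(u) du dv
    return [j - mean_J for j in J]

n = 1000
I1 = centered_integrated_wiener(n)
I2 = centered_integrated_wiener(n)
xi_11 = sum(a * a for a in I1) / n   # diagonal term: an integral of a square
xi_12 = sum(a * b for a, b in zip(I1, I2)) / n
print(xi_11 >= 0.0)  # True
```

Off-diagonal terms such as xi_12 are mean-zero but not sign-constrained; only the diagonal ones must be nonnegative.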
Also, for any g \in L^2 we have
\[
\iint Q_N(t,s)g(t)g(s)\,dt\,ds \stackrel{\mathcal D}{=} 2\int_0^c K(w)\,dw \int_0^1 \Bigl( \sum_{i=1}^{\infty} \lambda_i^{1/2} \Bigl[ \int_0^z W_i(u)\,du - \int_0^1\!\int_0^v W_i(u)\,du\,dv \Bigr] \int \varphi_i(t)g(t)\,dt \Bigr)^2 dz \ge 0,
\]
completing the proof.

Proof of Theorem 3.3.4. The result follows immediately from the proofs of Lemmas 3.7.3 and 3.7.4.

Proof of Theorem 3.3.5. The result in Theorem 3.3.4 and (3.43) yield that there are processes \Lambda_N(x,t) and Q_N(t,s) such that
\[
\bigl\{ \Lambda_N(x,t), Q_N(t,s),\ 0\le x,t,s\le 1 \bigr\} \stackrel{\mathcal D}{=} \bigl\{ \Lambda(x,t), Q_N(t,s),\ 0\le x,t,s\le 1 \bigr\}
\]
and
\[
\max_{0\le x\le 1}\Bigl\| \frac{1}{N}Z_N(x,\cdot) - \Lambda_N(x,\cdot) \Bigr\| = o_P(1) \quad\text{and}\quad \Bigl\| \frac{1}{Nh}\hat C_N(t,s) - Q_N(t,s) \Bigr\| = o_P(1).
\]
Similarly to (3.43) we define \lambda_{1,N}^* \ge \lambda_{2,N}^* \ge \dots and random functions \varphi_{1,N}^*(t), \varphi_{2,N}^*(t), \dots satisfying
\[
\lambda_{i,N}^* \varphi_{i,N}^*(t) = \int Q_N(t,s)\varphi_{i,N}^*(s)\,ds, \qquad 1\le i<\infty. \tag{3.80}
\]
Hence
\[
\max_{1\le i\le d} |\hat\lambda_i - \lambda_{i,N}^*| = o_P(1) \quad\text{and}\quad \max_{1\le i\le d} \|\hat\varphi_i - \hat c_i \varphi_{i,N}^*\| = o_P(1),
\]
where \hat c_1, \hat c_2, \dots are random signs. By construction,
\[
\bigl\{ \Lambda_N(x,t), Q_N(t,s), \lambda_{1,N}^*, \dots, \lambda_{d,N}^*, (\varphi_{1,N}^*(t))^2, \dots, (\varphi_{d,N}^*(t))^2,\ 0\le x,t,s\le 1 \bigr\}
\stackrel{\mathcal D}{=} \bigl\{ \Lambda(x,t), Q_N(t,s), \lambda_1^*, \dots, \lambda_d^*, (\varphi_1^*(t))^2, \dots, (\varphi_d^*(t))^2,\ 0\le x,t,s\le 1 \bigr\},
\]
which completes the proof.

3.8 Bibliography

[1] D. W. K. Andrews. Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica, 59:817–858, 1991.

[2] D. W. K. Andrews and J. C. Monahan. An improved heteroskedasticity and autocorrelation consistent covariance matrix estimator. Econometrica, 60:953–966, 1992.

[3] A. Antoniadis and T. Sapatinas. Wavelet methods for continuous time prediction using Hilbert-valued autoregressive processes. Journal of Multivariate Analysis, 87:133–158, 2003.

[4] A. Antoniadis, E. Paparoditis, and T. Sapatinas. A functional wavelet-kernel approach for time series prediction. Journal of the Royal Statistical Society, Series B, 68:837–857, 2006.

[5] A. Aue, S. Hörmann, L. Horváth, and M. Reimherr. Break detection in the covariance structure of multivariate time series models. The Annals of Statistics, 37:4046–4087, 2009.

[6] O. E. Barndorff-Nielsen and N. Shephard.
Econometric analysis of realized covariance: high frequency based covariance, regression and correlation in financial economics. Econometrica, 72:885–925, 2004.

[7] I. Berkes, R. Gabrys, L. Horváth, and P. Kokoszka. Detecting changes in the mean of functional observations. Journal of the Royal Statistical Society (B), 71:927–946, 2009.

[8] I. Berkes, L. Horváth, and G. Rice. Weak invariance principles for sums of dependent random functions. Stochastic Processes and their Applications, 2013. Under revision.

[9] D. Bosq. Linear Processes in Function Spaces. Springer, New York, 2000.

[10] R. M. de Jong, C. Amsler, and P. Schmidt. A robust version of the KPSS test based on indicators. Journal of Econometrics, 137:311–333, 2007.

[11] D. A. Dickey and W. A. Fuller. Distributions of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74:427–431, 1979.

[12] D. A. Dickey and W. A. Fuller. Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica, 49:1057–1074, 1981.

[13] N. Dunford and J. T. Schwartz. Linear Operators, Parts I and II. Wiley, New York, 1988.

[14] Y. Dwivedi and S. Subba Rao. A test for second order stationarity based on the discrete Fourier transform. Journal of Time Series Analysis, 32:68–91, 2011.

[15] R. Gabrys, L. Horváth, and P. Kokoszka. Tests for error correlation in the functional linear model. Journal of the American Statistical Association, 105:1113–1125, 2010.

[16] R. Gabrys, S. Hörmann, and P. Kokoszka. Monitoring the intraday volatility pattern. Journal of Time Series Econometrics, 2013. Forthcoming.

[17] L. Giraitis, P. S. Kokoszka, R. Leipus, and G. Teyssière. Rescaled variance and related tests for long memory in volatility and levels. Journal of Econometrics, 112:265–294, 2003.

[18] C. W. J. Granger and M. Hatanaka. Spectral Analysis of Economic Time Series. Princeton University Press, 1964.

[19] U. Grenander and M. Rosenblatt.
Statistical Analysis of Stationary Time Series. Wiley, New York, 1957.

[20] S. Hays, H. Shen, and J. Z. Huang. Functional dynamic factor models with application to yield curve forecasting. The Annals of Applied Statistics, 6:870–894, 2012.

[21] S. Hörmann and P. Kokoszka. Weakly dependent functional data. The Annals of Statistics, 38:1845–1884, 2010.

[22] S. Hörmann and P. Kokoszka. Functional time series. In C. R. Rao and T. Subba Rao, editors, Time Series, volume 30 of Handbook of Statistics. Elsevier, New York, 2012.

[23] S. Hörmann, L. Horváth, and R. Reeder. A functional version of the ARCH model. Econometric Theory, 29:267–288, 2013.

[24] S. Hörmann, L. Kidziński, and M. Hallin. Dynamic functional principal components. Technical report, Université libre de Bruxelles, 2013.

[25] L. Horváth and P. Kokoszka. Inference for Functional Data with Applications. Springer, New York, 2012.

[26] L. Horváth, M. Hušková, and P. Kokoszka. Testing the stability of the functional autoregressive process. Journal of Multivariate Analysis, 101:352–367, 2010.

[27] L. Horváth, P. Kokoszka, and R. Reeder. Estimation of the mean of functional time series and a two sample problem. Journal of the Royal Statistical Society (B), 75:103–122, 2013.

[28] K. Jönsson. Finite-sample stability of the KPSS test. Working Paper 2006:23, Department of Economics, Lund University, 2006.

[29] K. Jönsson. Testing stationarity in small- and medium-sized samples when disturbances are serially correlated. Oxford Bulletin of Economics and Statistics, 73:669–690, 2011.

[30] V. Kargin and A. Onatski. Curve forecasting by functional autoregression. Journal of Multivariate Analysis, 99:2508–2526, 2008.

[31] J. Kiefer. K-sample analogues of the Kolmogorov–Smirnov and Cramér–von Mises tests. Annals of Mathematical Statistics, 30:420–447, 1959.

[32] P. Kokoszka and M. Reimherr. Predictability of shapes of intraday price curves. Econometrics Journal, 2013. Forthcoming.

[33] P. Kokoszka, H. Miao, and X. Zhang.
Functional multifactor regression for intraday price curves. Technical report, Colorado State University, 2013.

[34] D. Kwiatkowski, P. C. B. Phillips, P. Schmidt, and Y. Shin. Testing the null hypothesis of stationarity against the alternative of a unit root: how sure are we that economic time series have a unit root? Journal of Econometrics, 54:159–178, 1992.

[35] D. Lee and P. Schmidt. On the power of the KPSS test of stationarity against fractionally integrated alternatives. Journal of Econometrics, 73:285–302, 1996.

[36] A. W. Lo. Long-term memory in stock market prices. Econometrica, 59:1279–1313, 1991.

[37] T. McMurry and D. N. Politis. Resampling methods for functional data. In F. Ferraty and Y. Romain, editors, Oxford Handbook on Statistics and FDA. Oxford University Press, 2010.

[38] H.-G. Müller, R. Sen, and U. Stadtmüller. Functional data analysis for volatility. Journal of Econometrics, 165:233–245, 2011.

[39] V. M. Panaretos and S. Tavakoli. Fourier analysis of stationary time series in function space. The Annals of Statistics, 41:568–603, 2013.

[40] V. M. Panaretos and S. Tavakoli. Cramér–Karhunen–Loève representation and harmonic principal component analysis of functional time series. Stochastic Processes and their Applications, 123:2779–2807, 2013.

[41] M. M. Pelagatti and P. K. Sen. Rank tests for short memory stationarity. Journal of Econometrics, 172:90–105, 2013.

[42] D. N. Politis. Adaptive bandwidth choice. Journal of Nonparametric Statistics, 25:517–533, 2003.

[43] D. N. Politis. Higher-order accurate, positive semidefinite estimation of large sample covariance and spectral density matrices. Econometric Theory, 27:1469–4360, 2011.

[44] B. Pötscher and I. Prucha. Dynamic Nonlinear Econometric Models: Asymptotic Theory. Springer, New York, 1997.

[45] M. B. Priestley and T. Subba Rao. A test for non-stationarity of time-series. Journal of the Royal Statistical Society (B), 31:140–149, 1969.

[46] J. O. Ramsay and B. W. Silverman.
Functional Data Analysis. Springer, New York, 2005.

[47] S. E. Said and D. A. Dickey. Testing for unit roots in autoregressive-moving average models of unknown order. Biometrika, 71:599–608, 1984.

[48] X. Shao and W. B. Wu. Asymptotic spectral theory for nonlinear time series. The Annals of Statistics, 35:1773–1801, 2007.

[49] G. R. Shorack and J. A. Wellner. Empirical Processes with Applications to Statistics. Wiley, New York, 1986.

[50] T. Teräsvirta, D. Tjøstheim, and C. W. J. Granger. Modeling Nonlinear Economic Time Series. Advanced Texts in Econometrics. Oxford University Press, 2010.

[51] Y. Wang and J. Zou. Vast volatility matrix estimation for high-frequency financial data. The Annals of Statistics, 38:953–978, 2010.

[52] W. Wu. Nonlinear system theory: another look at dependence. Proceedings of the National Academy of Sciences of the United States of America, 102, 2005.

[53] X. Zhang, X. Shao, K. Hayhoe, and D. Wuebbles. Testing the structural stability of temporally dependent functional observations and application to climate projections. Electronic Journal of Statistics, 5:1765–1796, 2011.

Figure 3.1. Five cumulative intraday returns constructed from the intraday prices displayed in Figure 3.2. [Plot: CIDRs over 4/14/1997–4/18/1997; vertical axis from −3 to 3.]

Figure 3.2. Five functional data objects constructed from the 1-minute average price of Disney stock. The vertical lines separate the days. [Plot: Price ($), 23.5–25.5, over 4/14/1997–4/18/1997.]

Figure 3.3. P-values for consecutive segments of length N of the price curves Pn(t) of the Disney stock computed using TN with v = .9. The horizontal line shows the 5% threshold. [Left panel: N = 50, 34/50 significant; right panel: N = 100, 21/25 significant.]
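The curves in Figure 3.1 are obtained from the prices in Figure 3.2 by the cumulative intraday return (CIDR) transformation, commonly taken as R_n(t_j) = 100[log P_n(t_j) − log P_n(t_1)], so every daily curve starts at zero and the overall price level is removed. A minimal sketch with synthetic 1-minute prices (all values illustrative):

```python
import math
import random

random.seed(2)

# Sketch: cumulative intraday returns (CIDRs), commonly defined as
#   R_n(t_j) = 100 * ( log P_n(t_j) - log P_n(t_1) ),
# so each day's curve starts at 0 regardless of the price level.
def cidr(prices):
    base = math.log(prices[0])
    return [100.0 * (math.log(p) - base) for p in prices]

# one synthetic day of 1-minute average prices near $24 (illustrative)
prices = [24.0]
for _ in range(389):
    prices.append(prices[-1] * math.exp(random.gauss(0.0, 0.0005)))

R = cidr(prices)
print(R[0] == 0.0, len(R) == 390)  # True True
```

Working with CIDRs rather than raw prices is what makes the stationarity tests of the chapter applicable, since the level-dependence visible in Figure 3.2 is removed.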
Figure 3.4. P-values for consecutive segments of length N of the CIDR curves Rn(t) for the Disney stock. The red line shows the 5% threshold. [Left panel: N = 50, 3/50 significant; right panel: N = 100, 2/25 significant.]

Table 3.1. Empirical sizes for the iid BM and FAR(1) DGPs. We used h = N^{1/2} and the flat-top kernel (3.50). The standard error is approximately 0.9% for the 10% level and 0.4% for the 5% level.

DGP:                BM                      FAR(1)
N:                  100         250         100         250
Nominal:            10%   5%    10%   5%    10%   5%    10%   5%
Statistic TN
  v = .85           13.9  5.6   13.3  5.3   12.3  4.2   11.8  4.1
  v = .90           12.6  5.4   12.5  5.2   11.8  3.7   11.0  4.1
  v = .95           12.7  4.6   12.0  4.9   11.3  3.6   10.4  3.7
Statistic MN
  v = .85            9.7  2.2   11.4  3.6   11.4  2.3   11.8  4.2
  v = .90            8.8  1.5   10.5  3.0    9.4  1.6   11.2  4.0
  v = .95            8.2  0.9    9.9  2.9    8.5  1.2   10.1  3.4
Statistic T̄N
  v = .85           11.3  4.9   10.8  4.6   10.2  3.4   10.3  3.7
  v = .90           11.2  4.4   10.7  4.8   10.0  3.4   10.3  3.4
  v = .95           11.8  4.4   11.1  4.5   10.6  3.2   10.0  3.6
Statistic M̄N
  v = .85            6.3  0.7    8.8  2.6   11.0  2.1   11.2  4.2
  v = .90            7.0  0.8    8.9  2.3    8.9  1.3   10.7  4.0
  v = .95            6.9  0.7    8.8  2.7    8.2  1.1   10.0  3.2
Statistic T0N
  v = .85           10.4  3.8   10.3  3.9   10.1  2.8    9.2  3.3
  v = .90            9.2  2.3    9.0  2.8    7.7  1.5    8.6  2.9
  v = .95            4.6  0.8    7.6  1.4    5.0  0.1    7.2  1.3
Statistic M0N
  v = .85            6.1  0.5    6.7  2.1    6.7  1.5    7.7  2.8
  v = .90            4.2  0.8    5.4  1.7    5.8  1.0    7.2  2.4
  v = .95            2.9  0.3    5.6  1.4    3.3  0.0    5.1  0.9
Statistic T0N (AM)
  v = .85           11.9  5.4   10.2  5.1   12.1  7.1   11.7  6.1
  v = .90           10.3  5.7    9.2  4.8   11.7  7.2    9.8  4.9
  v = .95            9.9  4.3    9.0  4.7   11.2  6.9    9.7  5.3
Statistic M0N (AM)
  v = .85            8.8  5.1   10.7  4.6   12.7  7.7   10.8  5.8
  v = .90            8.6  5.3   10.0  4.5   12.1  7.3   10.5  5.4
  v = .95            8.5  4.7    9.8  5.2   11.9  7.1   10.6  5.4

Table 3.2. Empirical power for change point and I(1) alternatives. We used h = N^{1/2} and the flat-top kernel (3.50).
DGP:                Change point            I(1)
N:                  100         250         100         250
Nominal:            10%   5%    10%   5%    10%   5%    10%   5%
Statistic TN
  v = .85           80.7  56.4  99.6  98.1  99.3  96.5  99.2  96.3
  v = .90           80.1  56.6  99.5  97.6  99.4  95.8  99.2  96.1
  v = .95           79.2  54.4  99.4  97.6  99.1  96.2  99.2  96.3
Statistic MN
  v = .85           50.6  14.7  95.2  84.1  93.8  68.3  97.7  92.5
  v = .90           46.7  11.0  94.7  82.7  92.8  64.5  97.6  92.4
  v = .95           79.2  54.4  99.4  97.6  90.9  61.4  99.2  96.3
Statistic T̄N
  v = .85           77.2  52.1  99.3  97.6  98.9  95.7  98.0  94.2
  v = .90           77.8  54.3  99.5  97.5  99.2  95.8  98.4  95.7
  v = .95           77.5  53.7  99.4  97.6  99.1  96.0  99.1  96.1
Statistic M̄N
  v = .85           39.6   8.5  93.7  78.1  93.5  67.9  94.9  88.2
  v = .90           39.9   7.7  93.9  79.6  92.7  63.9  96.2  89.3
  v = .95           40.8   6.8  94.5  79.8  90.5  61.2  96.6  90.0
Statistic T0N
  v = .85           85.8  55.1  99.8  98.9  99.5  98.1  98.6  96.2
  v = .90           86.6  52.0  100   99.6  99.7  98.8  99.3  98.7
  v = .95           74.7  31.3  99.9  98.4  100   96.0  99.9  99.8
Statistic M0N
  v = .85           35.0   7.8  97.2  77.7  86.1  75.2  97.9  92.7
  v = .90           31.0   5.9  98.0  71.5  90.9  73.4  99.2  95.4
  v = .95           21.1   4.8  93.0  63.0  96.8  75.9  100   98.5
Statistic T0N (AM)
  v = .85           96.6  91.6  100   100   99.6  99.3  99.8  99.7
  v = .90           94.9  85.5  100   100   100   99.9  100   100
  v = .95           82.5  70.0  100   100   100   100   100   100
Statistic M0N (AM)
  v = .85           85.3  71.3  99.9  99.8  93.4  85.0  99.8  99.3
  v = .90           68.7  52.3  99.7  98.8  96.3  90.3  99.9  99.9
  v = .95           43.7  28.5  97.0  92.0  98.6  96.3  100   100

CHAPTER 4

TESTING EQUALITY OF MEANS WHEN THE OBSERVATIONS ARE FROM FUNCTIONAL TIME SERIES

There are numerous examples of functional data in areas ranging from earth science to finance where the problem of interest is to compare several functional populations. In many instances, the observations are obtained consecutively in time, and thus the classical assumption of independence within each population may not be valid. In this chapter, we derive a new, asymptotically justified method to test the hypothesis that the mean curves of multiple functional populations are the same. The test statistic is constructed from the coefficient vectors obtained by projecting the functional observations into a finite dimensional space.
Asymptotics are established when the observations are considered to be from stationary functional time series. Although the limit results hold for projections into arbitrary finite dimensional spaces, we show that higher power is achieved by projecting onto the principal components of empirical covariance operators which diverge under the alternative. Our method is further illustrated by a simulation study as well as an application to electricity demand data.

The content of this chapter is based on joint research with Lajos Horváth.

4.1 Introduction

To this day, a frequently used tool to analyze multiple populations is the one-way analysis of variance (ANOVA), in which the means of k populations are compared. For scalar and vector valued data, this problem has been extensively studied under numerous conditions, and we refer to Anderson (2003) for a review of the subject. The methods developed therein are suitable to address a host of modern statistical questions; however, it is of increasing interest to consider data which take the form of functions or curves, for which finite dimensional approaches are not appropriate. For this reason, the theory of functional data analysis has been steadily growing in recent years, and much effort has been put forth to adapt classical statistical procedures, such as ANOVA, for functional data. In order to formally state the one-way functional analysis of variance (FANOVA) problem, we assume that we have observations from k functional populations which satisfy the one-way layout design
\[
X_{i,j}(t) = \mu_i(t) + \epsilon_{i,j}(t), \qquad t\in[0,1],\ 1\le i\le k \ \text{and}\ 1\le j\le N_i, \tag{4.1}
\]
where X_{i,j} is the jth observation from the ith population, \mu_i is the common mean function of the ith population, and \epsilon_{i,j} is a random error function satisfying
\[
E\,\epsilon_{i,j}(t) = 0, \qquad t\in[0,1],\ 1\le i\le k,\ \text{and}\ 1\le j\le N_i. \tag{4.2}
\]
The assumption that t \in [0,1] is made without loss of generality.
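The projection step described above can be sketched concretely: map each curve to a d-dimensional coefficient vector by integrating against basis functions, then compare the per-population coefficient means. A fixed cosine basis below stands in for the empirical principal components used in the chapter; all names and parameters are illustrative, and iid Gaussian noise curves play the role of data.

```python
import math
import random

random.seed(3)

# Sketch of the projection behind the FANOVA test (illustrative basis and data):
# each curve X_{i,j}(t) on an m-point grid is mapped to d coefficients
#   c_l = ∫ X(t) φ_l(t) dt,  with  φ_l(t) = sqrt(2) cos(π l t),
# and the test compares the per-population means of these coefficient vectors.
m, d = 100, 3
grid = [(i + 0.5) / m for i in range(m)]
basis = [[math.sqrt(2.0) * math.cos(math.pi * (l + 1) * t) for t in grid]
         for l in range(d)]

def project(curve):
    # coefficient_l ≈ grid average of X(t) φ_l(t)
    return [sum(x * b for x, b in zip(curve, basis[l])) / m for l in range(d)]

def population(n):
    # iid Gaussian noise curves standing in for one functional population
    return [[random.gauss(0.0, 1.0) for _ in grid] for _ in range(n)]

def mean_coeffs(pop):
    coeffs = [project(c) for c in pop]
    return [sum(v[l] for v in coeffs) / len(pop) for l in range(d)]

mu1 = mean_coeffs(population(50))
mu2 = mean_coeffs(population(50))
# squared distance between coefficient means: small when H0 (equal means) holds
dist = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
print(dist < 1.0)  # True
```

In the chapter the basis is data-driven (eigenfunctions of empirical covariance operators) and the distance is studentized to obtain a pivotal limit; the fixed basis here only illustrates the dimension-reduction idea.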
We also assume that observations from separate populations are independent, namely, the error sequences
\[
\{\epsilon_{i,j},\ 1\le j\le N_i\},\ 1\le i\le k,\ \text{are independent}. \tag{4.3}
\]
We wish to test the null hypothesis
\[
H_0:\ \mu_1(\cdot) = \mu_2(\cdot) = \dots = \mu_k(\cdot),
\]
where equality holds in the L^2 sense, versus the general alternative
\[
H_A:\ H_0 \ \text{does not hold}.
\]
Assuming two independent populations based on independent random functions, Fan and Lin (1998) and Hall and Van Keilegom (2007) developed testing procedures. Testing for differences between the means of several populations, Laukaitis and Račkauskas (2005), Abramovich and Angelini (2006), Antoniadis and Sapatinas (2007), and Martínez-Camblor and Corral (2011) expand the observations using wavelets, and the tests are based on the wavelet coefficients. Due to the complexity of the distribution of the test statistics used, the critical values are obtained by resampling. Cuevas et al. (2004) developed a test statistic using the L2 norm of the di