| Title | Cognitive workload modeling of task priority in detection and choice tasks |
| Publication Type | dissertation |
| School or College | College of Humanities |
| Department | Psychology |
| Author | Castro, Spencer C. |
| Date | 2019 |
| Description | Research concerning human performance in complex multitask environments relies heavily upon the fundamental psychological principles of limited-capacity attention and top-down mechanisms of attention allocation. To provide a suitable computational model for limited attention on a measure of cognitive workload, we implement a hierarchical Bayesian evidence accumulation framework for a discrete/continuous detection task (Experiment 1) and two simultaneous choice tasks (Experiment 2). We measure fluctuations in cognitive workload under instruction-induced task priority for a standard measure of cognitive workload in driving paired with a steering task (Experiment 1) and for simultaneous decisions in a computer-based task (Experiment 2). Evidence accumulation modeling provides evidence for changes in both information processing speed (drift rate) and certainty of responses (response threshold). The results indicate that both drift rate and response threshold vary with processing priority, with a greater contribution from response thresholds. The most robust finding suggests that, contrary to strictly resource-limited theories of attention, strategic allocation of resources can drive performance more than a dynamic slowing in the rate of information processing. |
| Type | Text |
| Publisher | University of Utah |
| Subject | attention; cognitive workload; detection response task; evidence accumulation modeling |
| Dissertation Name | Doctor of Philosophy |
| Language | eng |
| Rights Management | © Spencer C. Castro |
| Format | application/pdf |
| Format Medium | application/pdf |
| ARK | ark:/87278/s6arr40x |
| Setname | ir_etd |
| ID | 1724233 |
| OCR Text | COGNITIVE WORKLOAD MODELING OF TASK PRIORITY IN DETECTION AND CHOICE TASKS by Spencer C. Castro A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of Psychology The University of Utah December 2019 Copyright © Spencer C. Castro 2019 All Rights Reserved The University of Utah Graduate School STATEMENT OF DISSERTATION APPROVAL The dissertation of Spencer C. Castro has been approved by the following supervisory committee members: David L. Strayer, Chair (Date Approved: 10/22/2019); Joel M. Cooper, Member (Date Approved: 10/22/2019); Trafton Drew, Member (Date Approved: 10/22/2019); Brennan R. Payne, Member (Date Approved: 10/22/2019); Andrew Heathcote, Member (Date Approved: __________); and by Bert N. Uchino, Chair/Dean of the Department of Psychology; and by David B. Kieda, Dean of The Graduate School. ABSTRACT Research concerning human performance in complex multitask environments relies heavily upon the fundamental psychological principles of limited-capacity attention and top-down mechanisms of attention allocation. To provide a suitable computational model for limited attention on a measure of cognitive workload, we implement a hierarchical Bayesian evidence accumulation framework for a discrete/continuous detection task (Experiment 1) and two simultaneous choice tasks (Experiment 2). We measure fluctuations in cognitive workload for instruction-induced task priority for a standard measure of cognitive workload in driving and a steering task (Experiment 1) and simultaneous decisions of a computer-based task (Experiment 2). Evidence accumulation modeling provides evidence for changes in both information processing speed (drift rate) and certainty of responses (response threshold). The results indicate that both drift rate and response threshold vary with processing priority, with a greater contribution from response thresholds. 
The most robust finding suggests that—contrary to strictly resource-limited theories of attention—strategic allocation of resources can drive performance more than a dynamic slowing in the rate of information processing. “The art of being wise is the art of knowing what to overlook.” – William James, The Principles of Psychology (1890) TABLE OF CONTENTS ABSTRACT ..................... iii LIST OF TABLES ..................... vii LIST OF FIGURES ..................... viii ACKNOWLEDGEMENTS ..................... ix INTRODUCTION ..................... 1 History ..................... 2 CHALLENGES ..................... 7 APPROACH ..................... 12 EXPERIMENT 1 ..................... 18 Research Objectives ..................... 18 Aim 1 ..................... 19 Hypotheses ..................... 19 Method ..................... 
20 Participants ............................................................................................... 20 Materials ................................................................................................... 21 Steering task ................................................................................. 21 Detection response task ................................................................ 21 Design ....................................................................................................... 21 Measures ....................................................................................... 23 Results .................................................................................................................. 23 DRT Measures .......................................................................................... 24 Reaction time ................................................................................ 24 Hit rate .......................................................................................... 25 Steering Measures .................................................................................... 25 RMSE ........................................................................................... 25 Behavioral Discussion .............................................................................. 26 Modeling Results ...................................................................................... 28 Model estimation and selection .................................................... 31 Model comparisons. ..................................................................... 31 Parameter tests .............................................................................. 32 The underlying causes of priority effects. .................................... 33 Model parameter and pursuit tracking correlations. ..................... 
33 Modeling Discussion ................................................................................ 34 EXPERIMENT 2 .............................................................................................................. 36 Research Objectives ............................................................................................. 36 Aim 2 ........................................................................................................ 36 Hypotheses ................................................................................... 37 Method .................................................................................................................. 37 Participants ............................................................................................... 37 Materials ................................................................................................... 38 Procedure .................................................................................................. 38 Measures ....................................................................................... 38 Results .................................................................................................................. 39 Behavioral Measures ................................................................................ 39 Reaction time ................................................................................ 39 Accuracy ....................................................................................... 40 Behavioral Discussion .............................................................................. 41 Modeling Results ...................................................................................... 42 Priors............................................................................................. 44 Posterior inference ........................................................................ 
45 Model fit ..................... 45 Model estimation and selection ..................... 45 Model comparisons ..................... 48 Parameter tests ..................... 49 The underlying causes of priority effects ..................... 49 Modeling Discussion ..................... 50 GENERAL DISCUSSION ..................... 52 Modeling the Interaction of Attention Allocation and Capacity ..................... 54 Applications ..................... 56 Driving ..................... 56 Workplace Multitasking ..................... 56 Team Workload ..................... 57 Conclusions ..................... 57 REFERENCES ..................... 59 LIST OF TABLES Tables 1. Contrasts With Means and Standard Deviations by Task Priority (3) for the Detection Response Task and the Steering Task ..................... 24 2. The Difference Between DIC and the DIC for the Best (Bvt0 with no start-point noise) Model (DIC = -17058) for the Set of 9 Models ..................... 31 3. 
Reaction Time and Accuracy of the Simultaneous Color and Shape Discrimination Tasks ..................... 41 4. The Difference Between DIC and the DIC for the Best (ABvt0 with start-point noise) Model (DIC = 18730) for the Set of 8 Models ..................... 48 LIST OF FIGURES Figures 1. Schematics for the modified Wald-distributed single accumulator model (A; Logan, Van Zandt, Verbruggen, & Wagenmakers, 2014) and the linear ballistic accumulator model (B; Brown & Heathcote, 2008) for a 2AFC task ..................... 5 2. Representations of possible POCs based on trade-offs in performance and their accompanying performance-resource functions (PRFs) ..................... 20 3. Participants control the triangle in Photograph A in an attempt to keep its lateral position equivalent to the circle’s lateral position with the steering wheel ..................... 22 4. Both graphs depict performance on two tasks in standardized units with three conditions of priority ..................... 25 5. A schematic for the Wald EAM of the DRT represented in Castro et al. (2019) ..................... 29 6. An example trial presents a red circle to participants with the E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA) ..................... 39 7. Both graphs depict performance on two tasks in standard deviation units with three conditions of priority ..................... 40 8. A schematic for the two LBA evidence-accumulation models of the two choice tasks ..................... 43 9. 
Cumulative distribution functions for data (thick lines) and fits (thin black lines) of the LBA model with start-point variability (ABvt0) to the choice data for the color task ..................... 46 10. Cumulative distribution functions for the LBA model with start-point variability to the choice shape data ..................... 47 ACKNOWLEDGEMENTS When reflecting upon the vast infrastructure and network of support required for one individual to go to school, get an education, and receive a degree, the acknowledgements might be in danger of outgrowing the paper. However, I will keep it short. Academic – thank you to my advisor David L. Strayer for providing the resources, tacit knowledge, and personal support necessary for a sane and balanced approach to being in a graduate program; thank you to my committee for being willing to guide this dissertation project, actually read it, and improve the quality of its contents; thank you to my labmates for the collaborative atmosphere and commiseration in the process of intellectual growth; thank you to my research assistants for suffering tedium. Family – thank you to my ancestors, community, family, and parents for roots, a sense of self, and unparalleled advantages in life far beyond your means. I will always be indebted to you. Finally, thank you to my wife for showing me the way. INTRODUCTION Decades ago, researchers established that performing more than one attention-demanding task at a time can degrade performance on both tasks (Kahneman, 1973; Pashler, 1994; Wickens, 1991). Recently, others demonstrated that chronic multitasking could lead to difficulties in accomplishing goals (Ophir, Nass, & Wagner, 2009; Sanbonmatsu, Strayer, Medeiros-Ward, & Watson, 2013). However, people have not stopped performing multiple tasks simultaneously. 
For example, people often multitask with technology at work, thereby decreasing productivity (Duke & Montag, 2017). Studies of the underlying processes of multitasking performance can impact these real-world events, not only advancing basic psychological research but also improving everyday life. In psychology, several theoretical frameworks address the mechanisms of multitasking limitations. Some accepted theories relate performance to our limited capacity for attention (Kahneman, 1973; Navon & Gopher, 1979). However, multitasking requires allocating attention among two or more goals (e.g., Braver, 2012; Wickens & McCarley, 2019). Researchers argue that this process involves a fundamentally different mechanism than maintaining the effort to accomplish one goal (e.g., Howard, Evans, Innes, Brown, & Eidels, under review; Norman & Shallice, 1986). In multitasking, allocating attention may account for performance changes more than attention’s limited processing capacity. We propose to quantify limited capacity and attention allocation’s effect on performance (Wickens & McCarley, 2019) with computational models of simple detection and choice tasks (Brown & Heathcote, 2008; Logan, Van Zandt, Verbruggen, & Wagenmakers, 2014). We incorporate the performance operating characteristics (POCs) of a limited-capacity attentional framework (Kahneman, 1973; Navon & Gopher, 1979) to demonstrate consistency with the resource-limited account of our experimental manipulations. Then, we utilize a hierarchical Bayesian approach to evidence accumulation models (EAMs; e.g., Brown & Heathcote, 2008; Castro, Matzke, Strayer, & Heathcote, 2019) to operationalize attention allocation, limited capacity, and their relative effects on performance changes. 
Finally, we employ an applied method from human factors for measuring cognitive workload in the automobile—the International Organization for Standardization detection response task (DRT; Conti, Dlugosch, Vilimek, Keinath, & Bengler, 2012; ISO 17488, 2016). With this method, we aim to maximize its ecological validity for future use in real-world settings. History Miller (1956) famously established cognitive capacity limitations by demonstrating how many chunks of information can be concurrently held in short-term memory. Later, Kahneman (1973) identified voluntary, goal-directed attention as the limiting factor in cognition that leads to impairments when demand exceeds supply. Miller (1956) refers to short-term memory, and Kahneman (1973) identifies attention as the limiting factor of human performance. However, Barrouillet, Bernardin, Portrat, Vergauwe, and Camos (2007) propose that processing and maintaining information both rely on the same limited resource—the attention utilized by voluntary, controlled processes (e.g., Cowan, 1999; Engle, Kane, & Tuholski, 1999). This proposal predicts that limitations of performance in short-term recall, responses in simple decision-making (i.e., accuracy), and completion time (i.e., reaction time) all result from the limited capacity of attentional resources. There are two ways the performance of simple goal-directed behaviors can change: one can perform the task correctly, or accurately, and one can perform the task quickly (e.g., Luce, 1986), usually measured as reaction time (RT). Within this framework, any goal-directed and timed experiment results in longer completion times and reduced accuracy for tasks that possess a higher demand (e.g., Barrouillet et al., 2007). 
The current multitasking paradigm builds upon these theories of resource-limited information processing utilizing RT and accuracy; the more demanding one task becomes, the more that task reduces the available resources for other tasks (Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977). The relationship described by Schneider and Shiffrin (1977) can result in degraded accuracy and slower RTs in one or both tasks, depending upon priority (Posner, 1978). For example, several studies have demonstrated impairments in speed and accuracy with the addition of a second task (e.g., Castro, Cooper, & Strayer, 2016; Castro, Strayer, Matzke, & Heathcote, 2019). Also, RT does not necessarily increase linearly with the number of steps required by the task (i.e., task complexity), but rather as a function of task demands (Logan, 1980). Kahneman and Beatty (1966) refer to this mental effort as memory load, and Sheridan and Stassen (1979) define the information processing load placed on a human operator while performing a task as cognitive workload. Hart and Staveland (1988) define workload as “a hypothetical construct that represents the cost incurred by a human operator to achieve a particular level of performance” (p. 2). We refer to the cognitive cost—as opposed to the visual or manual cost—needed to perform a task as cognitive workload (Strayer, Watson, & Drews, 2011) and account for the visual and manual costs with a separate parameter in models described below (t0, Figure 1). Subsequent applied studies demonstrate cognitive workload’s correlation with observable behavior (e.g., Engström, Markkula, Victor, & Merat, 2017) and physiology like pupil dilation (e.g., Granholm, Asarnow, Sarkin, & Dykes, 1996; Padilla, Castro, Quinan, Ruginski, & Creem-Regehr, in press). 
Researchers have documented the pervasive effects of cognitive workload, from sociocultural phenomena, such as stereotyping (Biernat, Kobrynowicz, & Weber, 2003), to its interaction with self-efficacy and attitude in learning (Efendioglu, 2016). RTs measured under a cognitive workload also interact with myriad phenomena like the automaticity that practice and skill can provide for a consistently mapped and repetitive task (Shiffrin & Schneider, 1977) or the maintenance of multiple simultaneous goals (Kane & Engle, 2003). We account for these interactions with a computational model of cognitive workload. Within the information processing framework (e.g., Craik & Lockhart, 1972; Navon & Gopher, 1979), resource-limited tasks share limited capacity while operating in parallel (i.e., the proportions of attention allocated toward tasks dictate their speed of processing). Other theories predict that attention switches in an all-or-none manner among tasks. Researchers describe this framework with single-channel bottleneck theories (e.g., Pashler, 1994; Welford, 1952). In both cases, multiple tasks compete for a limited resource, overloading cognitive processes and leading to a smooth degradation of performance in accuracy and RT (Norman & Bobrow, 1975). Figure 1. Schematics for the modified Wald-distributed single accumulator model (A; Logan, Van Zandt, Verbruggen, & Wagenmakers, 2014) and the linear ballistic accumulator model (B; Brown & Heathcote, 2008) for a 2AFC task. The models share drift rate (v) and threshold (b) parameters, as well as nondecision time (t0). Panel B additionally demonstrates a variable start point (a) selected from a uniform distribution between 0 and A, but both models can employ start-point variability. We fit the model depicted in panel A to DRT data in Experiment 1 and the model depicted in panel B to 2AFC data from Experiment 2. The fact that performance 
often degrades continuously instead of reaching a discrete failure point can help make distinctions between proposed mechanisms of information processing. Because overloading cognitive capacity does not immediately result in failures to respond but merely in increased RT, response failures may be due to an independent or tangentially related mechanism. Additionally, Norman and Bobrow (1975) distinguish between data-limited and resource-limited processes, predicting that resource-limited processes result in longer processing time for increased accuracy. Researchers term this relationship the speed-accuracy tradeoff (SAT; for a review, see Heitz, 2014). Some researchers consider many real-world tasks (e.g., driving an automobile) resource-limited (e.g., Levy, Pashler, & Boer, 2006). They argue that our ability to process and react to environmental cues dictates performance due to a bottleneck of processing with metrics like brake reaction time (Levy et al., 2006). This framework terms the minimum time needed to maintain performance between two tasks as the psychological refractory period (PRP; Davis, 1956). PRP studies demonstrate that this period represents a hard limit upon simultaneous tasks by placing a short interval between their onset (i.e., a stimulus onset asynchrony or SOA). Within the PRP paradigm, early perceptual encoding can co-occur, but response selection must be queued, resulting in the response selection bottleneck (RSB; Pashler, 1994; Pashler & Johnston, 1989). With a sufficiently small SOA, the serialization of response selection slows the second task’s RT. However, others have demonstrated that people can perform sufficiently practiced simple tasks concurrently (e.g., Schumacher et al., 2001), and further studies claim that specific response-selection processes penetrate the RSB (Schubert, Fischer, & Stelzel, 2008). In both theories, increased cognitive workload degrades performance. 
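The serial central stage at the heart of the RSB account yields a concrete quantitative prediction for PRP experiments. A minimal sketch, assuming hypothetical stage durations (the values of E, C, and M below are made-up placeholders, not estimates from any study cited here):

```python
# Hypothetical stage durations in seconds; illustrative values only,
# not estimates from any experiment discussed in this text.
E = 0.10  # perceptual encoding (can overlap across tasks)
C = 0.15  # central response selection (strictly serial under the RSB)
M = 0.10  # motor execution

def rt2(soa):
    """Predicted Task 2 RT (measured from Task 2 onset) under a
    response selection bottleneck: Task 2's central stage waits for
    both its own encoding (done at soa + E) and the end of Task 1's
    central stage (at E + C after Task 1 onset).
    """
    central_start = max(E + C, soa + E)
    return central_start + C + M - soa

# For SOAs shorter than C, RT2 falls with slope -1 as SOA grows;
# once the bottleneck clears, RT2 flattens at E + C + M.
```

This reproduces the signature PRP pattern: at short SOAs, every unit of additional asynchrony buys back a unit of Task 2 RT, while at long SOAs Task 2 runs unimpeded.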
CHALLENGES As evidenced by multiple proposed theories and published studies on cognitive limitations and human performance, cognitive psychologists and neuroscientists have long sought to understand our attempts to achieve multiple goal-directed tasks simultaneously. Many of these theories rely on data produced by participants in the form of RTs as the essential dependent variable (e.g., Neisser, 1967; Pashler, 1993; Posner, 1980). Utilizing RT, the effects of multitasking have been demonstrated repeatedly in the laboratory (e.g., Tombu & Jolicœur, 2004) and in real-world scenarios (e.g., Strayer, Drews, & Johnston, 2003). Pachella (1973) describes RT as one of the most prevalent dependent variables in human experimental psychology and ascribes its resurgence in interest to psychology’s renewed focus on unobservable processes. However, a few underlying assumptions of RT as the dependent variable in a statistical model present challenges to its efficacy as a surrogate for the cognitive workload of multitasking. For example, we must account for SAT in a model of responses because hasty decisions lead to more errors in resource-limited tasks (Luce, 1986; Norman & Bobrow, 1975). Wickelgren (1977) outlines six basic methods of producing the function describing SAT: instructions, payoffs, deadlines, time bands, response signals, and partitioning RTs. In this outline, instructions influence priority (Posner, 1978), payoffs and deadlines provide the carrots and sticks of motivation, and time bands add a lower RT limit to deadlines. Response signals refer to an auxiliary cue to respond, and partitioning RTs refers to an analytic approach of averaging accuracy over RT bins. In this paper, we employ instructional manipulation to demonstrate how response priority interacts with a resource-limited task. 
By explicitly manipulating task priority in the manner of SAT and fitting a process model to the data, we can incorporate how the interaction of limited capacity and attention allocation processes affects performance limitations. However, other challenges exist within this paradigm that we must address. For example, an ongoing response is required in order to utilize RT as a good surrogate for insight into the current state of mental systems. This requirement remains the main critique of utilizing RT as an online measurement of cognitive workload (i.e., primary task intrusion; see O’Donnell & Eggemeier, 1986). Several passive methods for testing the cognitive causes of workload exist, but they can also be challenging to apply to naturalistic scenarios. For example, eye tracking can be sensitive to variable light conditions, and electroencephalography is susceptible to artifacts from muscle movements. Both of these variables can be difficult to control in naturalistic settings for realistic goal-directed behaviors. Lohani, Payne, and Strayer (2019) summarize the advantages and disadvantages of the most prominent psychophysiological measures in the context of driving and provide a list of challenges across validity, reliability, establishing baselines, sampling rates, and signal quality. If we wish to design an approach to quantifying the contributions of attentional processes to performance limitations that we can take out of the lab, we must consider these challenges. However, according to Schumacher et al. (2001), heavily practiced simple tasks like the ISO DRT (ISO 17488, 2016) can produce no detectable impairment of another task, which is what Stojmenova and Sodnik (2018) found for pupil dilation, average speed, and secondary task performance when comparing driving with the DRT to driving alone. Even when participants produce an RT as a surrogate for mental processes, there can be undetected cognitive changes when comparing mean RTs across conditions. 
For example, Kramer, Sirevaag, and Braune (1987) utilized the P300 component of an event-related potential (ERP) as a measure of processing resources, and therefore mental workload, in conjunction with RT measures in pilots. In this case, RTs and accuracy measures did not differ across the manipulation of flight simulator difficulty for the RT task. However, the amplitude of the P300 decreased when flight difficulty increased. This study found that tested RT differences were not sensitive to task difficulty, but ERPs did vary along with heading and altitude deviation. Kramer et al. (1987) suggest that mean RT differences across conditions were not a sensitive measure of cognitive workload in their study. However, the RT distribution and method of collection may present problems for traditional statistical methods. Researchers continue to develop statistical methods (e.g., Molenaar & Bolsinova, 2017; Ranger & Kuhn, 2012; Wang, Chang, & Douglas, 2013) to address heteroscedastic error terms and skewed distributions truncated by zero in RT data (Luce, 1986). We utilize the latest R-based Dynamic Models of Choice (DMC; Heathcote et al., 2019) software to fit computational models to the entire RT distribution, determine the goodness of fit, and detect changes in the hierarchically estimated posterior distributions of the model’s parameters. By utilizing these techniques, we may be able to detect where in the distribution RT differs and infer which attentional processes lead to the observed performance outcomes. Researchers first developed advanced modeling techniques to link aspects of the distribution of RTs to cognitive processes in decision-making with successful outcomes (e.g., Brown & Heathcote, 2008; Ratcliff & Rouder, 1998). This approach turns distributional challenges into advantages and allows for the proposal and testing of cognitive processes as mathematical formalizations. 
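The motivation for analyzing the entire RT distribution can be illustrated without any model at all. In the sketch below the data are simulated (the gamma parameters and the 0.3 s shift are arbitrary choices for illustration, not values from these experiments): two conditions share a mean RT, so a comparison of averages finds nothing, yet their decile-by-decile differences diverge sharply in the slow tail.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated RTs (seconds) for two hypothetical load conditions with
# matched means but different shapes; both equal 0.3 + shape * scale
# = 0.5 s on average, so mean RT cannot distinguish them.
low_load = 0.3 + rng.gamma(shape=4.0, scale=0.05, size=5000)
high_load = 0.3 + rng.gamma(shape=2.0, scale=0.10, size=5000)

# Decile-by-decile differences reveal a crossover: the more skewed
# high-load condition is faster at the low quantiles but much slower
# in the tail, exactly the kind of effect a mean comparison misses.
deciles = np.arange(0.1, 1.0, 0.1)
delta = np.quantile(high_load, deciles) - np.quantile(low_load, deciles)
print(np.round(delta, 3))
```

Model-based analyses such as those used here go one step further, attributing such distributional signatures to interpretable parameters rather than raw quantiles.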
These models were developed initially to explain the probability of simple binary decisions but have been expanded to measure cognitive workload (Castro et al., 2019; Tillman, Strayer, Eidels, & Heathcote, 2017). Models in which evidence toward a decision accumulates have supported important discoveries in decision-making, such as predicting neural pathways in the lateral intraparietal cortex responsible for choices (Beck et al., 2008; Bennur & Gold, 2011; Hanks, Ditterich, & Shadlen, 2006). We refer to all of these types of models under the umbrella term evidence accumulation model (EAM). EAMs maintain several advantages over traditional connectionist models (for a review, see McClelland & Cleeremans, 2009), including simplicity (e.g., Brown & Heathcote, 2008) and interpretability of their parameters. For example, drift rate—the rate at which evidence accumulates—can be interpreted as the limited resource of attention (Castro et al., 2019; Tillman et al., 2017) described by Kahneman (1973), creating a tendency toward a response. Parameter values fit to the data allow for testing in similar experimental designs or simulations for novel scenarios. This approach makes the values empirically testable, establishing their ability to provide evidence for the processes of decisions in goal-directed behavior. Therefore, EAMs are an excellent choice for fitting empirical data (Wagenmakers, 2009). Also, the fitting process occurs across the entire RT distribution, which allows for the detection of effects beyond comparisons of central tendency metrics, and additional parameters can account for failures of attention in detection and choice tasks as well (Castro et al., 2019; Matzke, Love, & Heathcote, 2017). 
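To make the accumulator logic concrete, here is a minimal simulation of a single trial in the spirit of the linear ballistic accumulator (Brown & Heathcote, 2008). The parameter values (b, A, s, t0, and the drift means) are illustrative placeholders, not estimates from these experiments, and redrawing trials with no positive drift is a common simplification rather than the only convention.

```python
import numpy as np

rng = np.random.default_rng(1)

def lba_trial(v_means, b=1.0, A=0.5, s=0.25, t0=0.2):
    """Simulate one linear ballistic accumulator (LBA) trial.

    Each accumulator starts at a ~ U(0, A) and races linearly toward
    threshold b with a drift drawn once per trial from N(v_i, s); the
    first accumulator to reach b determines the choice.
    """
    while True:
        drifts = rng.normal(v_means, s)
        if np.any(drifts > 0):  # redraw if no accumulator would finish
            break
    a = rng.uniform(0.0, A, size=len(v_means))
    # Accumulators with non-positive drift never finish (infinite time).
    times = np.where(drifts > 0, (b - a) / drifts, np.inf)
    winner = int(np.argmin(times))
    return winner, times[winner] + t0  # choice index, RT = decision + t0

# With a higher mean drift for option 0, most simulated responses
# should select option 0, mimicking the "correct" response.
choices, rts = zip(*(lba_trial([1.0, 0.4]) for _ in range(2000)))
```

Fitting inverts this logic: rather than simulating from known parameters, DMC-style software searches for the parameter values whose predicted choice and RT distributions best match the observed data.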
In conjunction with neurological, physiological, and subjective measures, we can provide converging evidence as to the nature of RT tasks across basic and applied implementations. However, the increased complexity of such endeavors introduces new challenges for researchers, which we can simplify through interpretable models. Utilizing RT, accuracy, and evidence accumulation modeling, we propose a simple but informative strategy for the measurement and analysis of the components of cognitive workload. This novel assessment of workload employs psychological theories of information processing to establish a limited-resource account of performance and demonstrates how this account interacts with response caution, early perceptual processes, and motoric actions for simple detection and choice responses. It also explicitly quantifies the theoretical constructs of limited resources and attention allocation. Finally, the approach establishes the feasibility of these methods for use in applied settings.

APPROACH

A process model combined with a data-driven approach to workload provides a theoretical framework for evaluating DRT responses. Even though the DRT is usually utilized to evaluate potentially distracting in-vehicle technology and how to instrument vehicles (Strayer et al., 2015; Strayer et al., 2017), only a few investigations of the DRT have attempted to provide a framework for its theoretical implications (Castro et al., 2019; Ratcliff & Strayer, 2014; Tillman, Strayer, Eidels, & Heathcote, 2017). Due to the DRT's importance as a standard for measuring workload and the allocation of attention in a myriad of dynamic environments, it is vital to understand the cognitive processes underlying the measurement. We can expand research into the theoretical underpinnings of the DRT with evidence accumulation modeling (EAM; e.g., Castro et al., 2019).
According to Huk, Katz, and Yates (2014), evidence accumulation in decision making consists of sequentially sampling sensory information until sufficient evidence has been acquired to select one option over another or many other options. In computational neuroscience, researchers established evidence accumulation as a fundamental aspect of cognition (e.g., Hanks, Ditterich, & Shadlen, 2006; Usher & McClelland, 2001), and mathematical models of how evidence accumulates account for behavioral phenomena in many laboratory tasks (e.g., Brown & Heathcote, 2008; Ratcliff & Rouder, 1998; Usher & McClelland, 2001). EAM historically provides a framework to model choice task responses requiring two or more response options (Brown & Heathcote, 2008; Leite & Ratcliff, 2010), but researchers have recently applied EAMs to tasks requiring only one response, which is more applicable to the DRT (e.g., Heathcote, 2004; Ratcliff, 2015). For both types of tasks, EAM proposes an initial encoding stage, an accumulation stage, and a response-production stage. In the first stage, the senses extract evidence from a stimulus. Then, the possible responses accrue evidence at a specific rate, dictated by stimulus quality or the difficulty of the tasks. Evidence accumulates at this rate (i.e., drift rate v) until it reaches some threshold (b), which initiates the response-production stage (tr; see Figure 1). RT equals the sum of the time to reach the threshold (i.e., decision time td) plus the encoding and response production times (i.e., nondecision time t0). Researchers often assume that limited resources underlie the effects of cognitive workload (e.g., Kahneman, 1973; Strayer & Johnston, 2001), which in EAMs relates to the rate of information processing parameter (drift rate; Castro et al., 2019; Tillman et al., 2017).
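The staged architecture just described can be sketched as a small simulation. The Python snippet below is illustrative only (it is not the DMC implementation used for the analyses reported here); the function name simulate_wald_trial and the parameter values are hypothetical, and the moment-to-moment noise has unit standard deviation, as in the models discussed later.

```python
import random

def simulate_wald_trial(v, b, t0, dt=0.001, rng=random):
    """One detection trial as a single-boundary diffusion: evidence starts
    at 0 and drifts toward threshold b at mean rate v, with unit-variance
    moment-to-moment Gaussian noise. RT = decision time + nondecision time t0."""
    evidence, t = 0.0, 0.0
    while evidence < b:
        evidence += v * dt + (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return t0 + t

random.seed(1)
# A higher drift rate (more attention or resources) yields faster mean RTs.
fast = [simulate_wald_trial(v=3.0, b=1.5, t0=0.15) for _ in range(500)]
slow = [simulate_wald_trial(v=1.5, b=1.5, t0=0.15) for _ in range(500)]
print(sum(fast) / 500 < sum(slow) / 500)  # True
```

Raising b instead of lowering v also slows mean RTs, which is the threshold (response caution) account that the modeling work here contrasts with the limited-resource (drift rate) account.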
However, in a prospective memory (PM) paradigm, Heathcote, Loft, and Remington (2015) demonstrated that dual-task slowing of RTs stemmed primarily from individuals delaying responses to ongoing tasks. The models suggest that participants allowed future PM responses to compete with the more salient current response by slowing the current response. Researchers initially theorized that PM slowed present tasks by sharing resources with a future goal and thus limiting overall performance, but strategic choices about the relative importance of each task may overshadow resource limitations. This conclusion calls into question prevailing theories of PM and the effect of a limited-capacity system. Tillman, Strayer, Eidels, and Heathcote (2017) found similar outcomes utilizing EAMs with DRT data during a conversation in a vehicle, contrary to capacity-sharing accounts of the DRT and driving performance (e.g., Strayer et al., 2011, 2013). Again, strategic shifts in priority (i.e., threshold) were demonstrated with little to no evidence of rate effects. Additionally, these more cautious, controlled responses differentiate the cognitive workload of multitasking from the cognitive workload of increased difficulty (Howard, Evans, Innes, Brown, & Eidels, under review). The addition of EAM simultaneously captures changes in limited resources and response certainty due to multitasking. It is also possible that workload slows early perceptual encoding (i.e., nondecision time) or causes failures to encode evidence from the stimulus (i.e., response omissions). For the DRT, the complement of HR (i.e., the miss rate) represents failures to encode the stimulus. Previous research (e.g., Castro, Cooper, & Strayer, 2016; Motzkus et al., 2018) suggests that HR, and therefore failures to encode evidence, may have a separate but correlated, or even independent, relationship to RT effects.
In fact, Prinzmetal, McCool, and Park (2005) propose this exact framework, demonstrating that involuntary (i.e., automatic) attention affects only RT while controlled attention changes both metrics. In 2-alternative forced-choice tasks (2AFCs), the quality of evidence may also be affected by attention. This effect occurs in models where independent accumulators race to a threshold in order to trigger a response (see Figure 1 Panel B; Brown & Heathcote, 2008; Leite & Ratcliff, 2010) and in single accumulator models that accumulate evidence competitively (see Figure 1 Panel A; Logan et al., 2014). Lower quality evidence can result in choice errors unless evidence is collected for a longer time, and so may indirectly cause slowing if participants raise their threshold to maintain accuracy (i.e., the speed-accuracy trade-off, SAT; Ratcliff & Rouder, 1998). A degraded signal may also cause false detection responses, especially under attention reduced by cognitive workload. Participants can compensate by increasing thresholds under greater cognitive workload to maintain accuracy. Questions about how these aspects of information processing interact, and which of them divided attention alters, remain open. Different types of workload demonstrate changes in the parameters of evidence accumulation models. For example, a purely visual workload affected accumulation rates in perceptual choice tasks, according to Eidels, Donkin, Brown, and Heathcote (2010). Schmiedek et al. (2007) also found correlations between individual differences and evidence-accumulation rates on a variety of tasks (working memory, reasoning, and psychometric speed) in verbal, numeric, and spatial choice paradigms.
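The racing-accumulator architecture and the speed-accuracy trade-off can be illustrated with a toy simulation. This Python sketch is hypothetical (the helper names and parameter values are invented for illustration): shrinking the rate advantage of the correct accumulator degrades accuracy, while raising the threshold b lengthens RTs, which is the compensation logic described above.

```python
import random

def race_trial(v_correct, v_error, b, t0, dt=0.002, rng=random):
    """Two independent single-boundary diffusions race to a shared threshold b
    (the Figure 1 Panel B architecture); the first arrival fixes response and RT."""
    acc = [0.0, 0.0]                      # evidence for the correct and error responses
    rates = (v_correct, v_error)
    t = 0.0
    while max(acc) < b:
        for i in (0, 1):
            acc[i] += rates[i] * dt + (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return acc[0] >= acc[1], t0 + t       # (correct?, RT)

def summarize(v_correct, v_error, b, n=300):
    trials = [race_trial(v_correct, v_error, b, t0=0.2) for _ in range(n)]
    accuracy = sum(correct for correct, _ in trials) / n
    mean_rt = sum(rt for _, rt in trials) / n
    return accuracy, mean_rt

random.seed(2)
acc_low, rt_low = summarize(v_correct=1.2, v_error=1.0, b=1.0)   # degraded evidence
acc_high, _ = summarize(v_correct=2.5, v_error=0.5, b=1.0)       # clear evidence
_, rt_careful = summarize(v_correct=1.2, v_error=1.0, b=3.0)     # raised threshold
print(acc_high > acc_low, rt_careful > rt_low)                   # True True
```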
Therefore, DRT measures benefit enormously from EAM: they can be influenced by multiple mechanisms of attention (Wickens & McCarley, 2019), are sensitive to different types of workload (Motzkus et al., 2018), and demonstrate effects not detectable in comparisons of means (Stojmenova & Sodnik, 2018). Castro, Strayer, Matzke, and Heathcote (2019) directly assessed the relationship between DRT measures of cognitive workload and parameters of EAMs in choice and simple detection tasks. The researchers measured cognitive workload effects in a dual-task paradigm by contrasting primary-task performance between conditions with and without a secondary task. Castro et al. (2019) had participants perform a primary tracking task and a secondary task of counting backward by threes. The researchers also utilized the tertiary DRT to assess cognitive workload in all the conditions. Prior researchers had validated only models that could differentiate between rate and threshold effects of workload for choice responses and not detection responses (Ratcliff & Strayer, 2014). In order to compare the effects of cognitive workload on drift rate and threshold parameters for the DRT, Castro et al. (2019) developed a choice response task (CRT) and compared the patterns of parameter changes under cognitive workload between the DRT and CRT for multiple models. The researchers concluded that a modified version of Logan, Van Zandt, Verbruggen, and Wagenmakers' (2014) Wald-distributed model most successfully described cognitive changes for the assessment of cognitive workload with the DRT. Cognitive workload affected the start-point (a), drift rate (v), threshold (b), and nondecision time (t0) parameters of the model. These changes indicate that participants monitor for evidence of the light's presence to varying degrees between counting backward and not counting, thus changing the start-point of evidence accumulation.
Participants also compensate for a decrease in the rate of evidence for detecting the light under workload by increasing the evidence required to make a response, and they potentially compensate for this slowing by making faster motor responses under a cognitive workload. Again, the threshold parameter accounted for the majority of the effect despite the addition of a purely cognitive secondary task. However, the drift rate also accounted for a portion of the cognitive workload effect, in contrast to studies that found no need for the drift rate parameter at all (Tillman et al., 2017). Researchers concluded that both predetermined attention allocation and limited capacity together most accurately accounted for the cognitive workload effect. In an effort to apply EAMs to cognitive workload in naturalistic settings, these studies make several assumptions about the relationship between decision making and cognitive workload. For example, some of these studies equate the processes of monitoring for a change and choice (e.g., Tillman et al., 2017) or find the same underlying patterns for both task types (Castro et al., 2019). Others point out that cognitive workload encompasses task difficulty and multitasking, despite evidence that these processes differ (Howard et al., under review). Finally, these studies demonstrate that strategic attention allocation determines performance changes without explicitly manipulating task priority to elicit these changes. In the current paper, we demonstrate EAM's ability to fit multitasking data with discrete detection tasks, continuous tasks, and choice tasks across task priority. Additionally, we quantify the proportions of attention allocation and limited capacity responsible for performance changes across task priority.
EXPERIMENT 1

Research Objectives

Previous experiments have explored the DRT's relationship to cognitive and visual-manual workload (e.g., Castro, Cooper, & Strayer, 2016), EAMs (Castro et al., 2019; Tillman et al., 2017), and the difference between difficulty and multitasking (Howard et al., under review). These studies demonstrate several outcomes and propose some challenges to measuring workload. For example, the DRT may impose a small but measurable workload (Castro et al., 2019) as an active task compared to a passive measure, and it may interact with priority for responses, which would appear as a threshold effect in EAMs. Cognitive workload also changes the perceptual encoding and motor response parameter (t0) in a compensatory fashion (Castro et al., 2019). These outcomes suggest that the DRT measures more than a pure information processing bottleneck, where the participant does not have enough available resources to parse the complex environment fast enough. We aim to validate parameter changes in the model when mapped to the established psychological procedure of instructional manipulation (Wickelgren, 1977). Through these methods, we demonstrate the proportion of cognitive workload accounted for by EAM parameters while explicitly manipulating task priority.

Aim 1

First, we establish the relationship of task priority to the parameters of EAMs and experimentally manipulate these relationships utilizing instructions of task priority. The model must account for the perceptual salience (signal strength) of the DRT stimulus, the potential interaction of DRT responses with measures of performance and other tasks (e.g., a continuous tracking task), and the DRT's interaction with visual and manual effects of workload. EAMs provide the flexibility and simplicity necessary to account for these phenomena.
To establish the feasibility of this approach, we manipulate task priority and analyze the subsequent RTs and HRs in terms of their performance operating characteristics (POCs; Navon & Gopher, 1979). Navon and Gopher (1979) demonstrated that in an optimally resource-limited system, one unit of improvement in one task decreases the other task by that unit (i.e., the objective substitution rate). In the neuroscience literature, researchers refer to this phenomenon as resource reciprocity (Sirevaag, Kramer, Coles, & Donchin, 1989). A linear POC describes a perfect utility trade-off (see Figure 2a), and a rectangular POC describes a complete lack of trade-off (see Figure 2c & 2d; Navon & Gopher, 1979). We manipulate the processing priority of the DRT and a tracking task across three levels of priority (i.e., DRT emphasis, equal emphasis, and steering emphasis). The behavioral POC plots demonstrate the extent to which the two tasks share limited resources, informing our predictions for parameter changes in EAMs.

Hypotheses. For behavioral outcomes when manipulating task priority, we predict that DRT RTs will trade off with the steering task in a linear fashion, implying that the two processes compete for the same cognitive resource. To test this hypothesis, we fit model parameters representing the rate of information processing (drift rate) and the amount of evidence required for a response (threshold) to the DRT data.

Figure 2. Representations of possible POCs based on trade-offs in performance and their accompanying performance-resource functions (PRFs). The figure is based on a schematic in Wickens and McCarley (2019, p. 118).
These parameters either vary with the priority manipulation or remain fixed. In line with previous research (Castro et al., 2019; Tillman et al., 2017), we predict that delays in responses to the DRT will be mostly due to a threshold shift across conditions, contradicting the behavioral interpretation of a linear POC.

Method

Participants

After Institutional Review Board approval, 20 participants (19-33 years old, M = 23.2) were recruited via psychology courses at the University of Utah (11 males, 9 females) and were compensated with class credit upon completion of a 1-hour session. All reported normal visual acuity and normal color vision.

Materials

Steering task. The materials for this study were previously used in Castro et al. (2019). Participants viewed a 101.6 cm Samsung LCD (1920 x 1080 pixels) that displayed the steering task (see Figure 3). The screen did not display a realistic driving simulation. Instead, the forward screen displayed a continuous lateral steering tracking task (see Figure 3A). Participants manipulated a steering wheel from a driving simulator (Figure 3B) to track a ball that moved continuously on the screen with a triangle cursor (see Figure 3A). Participants sat approximately 91 cm from the display. The user-controlled, equilateral triangle cursor possessed sides of 20 pixels (~0.96 cm), the same length as the diameter of the ball. The steering wheel updated the location of the cursor via a Sparkfun™ Electronics rotary encoder set to sample the position at 30 Hz.

Detection response task. The DRT device presented a dash-mounted red light (see Figure 3B) to simultaneously measure cognitive and visual workload (Motzkus et al., 2018). After a random interval of 3-5 seconds, the dash-mounted red-light stimulus turned on for 1 second or until a response was made; participants responded by pressing a microswitch attached to their dominant thumb.
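The stimulus schedule just described can be sketched as a short generator. This Python snippet is a simplification for illustration (the function name and the 300 s block length are our own conventions, and the 3-5 s delay here is measured from the previous onset rather than from the previous response or offset):

```python
import random

def drt_onsets(block_s=300.0, lo=3.0, hi=5.0, stim_dur=1.0, rng=random):
    """Stimulus onset times for one 5-minute DRT block: each onset follows the
    previous one by a uniform 3-5 s delay, and the light may stay on for 1 s."""
    onsets, t = [], 0.0
    while True:
        t += rng.uniform(lo, hi)
        if t + stim_dur > block_s:      # do not start a stimulus that cannot finish
            break
        onsets.append(t)
    return onsets

random.seed(4)
onsets = drt_onsets()
gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
print(all(3.0 <= g <= 5.0 for g in gaps))   # True; roughly 75 onsets per block
```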
Design

Upon arrival, participants completed a consent document and then were familiarized with the tracking task, the DRT, and the priority instructions. Participants performed both tasks in every condition. Participants were given an example of the effort to be allocated to each task by describing the difference as 80% effort to the priority task, 20% effort to the secondary task, or 50% effort to both tasks. First, participants completed three training blocks, one for each condition. Then, the experiment was conducted across nine counterbalanced blocks of 5 minutes each.

Figure 3. Participants control the triangle in Photograph A in an attempt to keep its lateral position equivalent to the circle's lateral position with the steering wheel. Photograph A also shows the simulator, steering wheel, and center screen used to display the steering task. Photograph B shows a closer view of the steering wheel, the thumb-mounted button, and the dash-mounted DRT stimulus for displaying the dim and bright, simple DRT and choice DRT. Figure from Castro et al. (2019).

Three conditions instructed participants to give priority to the steering task, the DRT, or both equally. The three conditions were counterbalanced using a balanced Latin square design. Each participant practiced the tasks individually, and then together, before testing began. The steering task was designed and used in previous studies to simulate steering on a moderately curvy road (Castro et al., 2016; Castro et al., 2019; Cooper et al., 2016). Participants were instructed to maintain the cursor as close as possible to a ball that moved horizontally across the screen at a slow constant rate of 100 pixels per second (see Figure 3A). As the ball approached the edge of the screen, it changed direction, following a normal distribution centered on the middle of the screen.
That is, the ball moved smoothly through the center third of the screen (corresponding to one standard deviation on either side of the middle) approximately 68% of the time, and through the center two thirds (corresponding to two standard deviations on either side of the middle) approximately 95% of the time.

Measures. RT to the dashboard light was recorded to the nearest millisecond. According to ISO standard 17488 (2016), RTs shorter than 100 milliseconds and trials with two or more responses were excluded from the analyses as anticipatory or guessing (0.13%), whereas RTs greater than 2500 milliseconds were treated as misses. The difference between the position of the cursor and the target was sampled at 30 Hz and used to compute the root mean squared error (RMSE) for the steering task. The RMSE was calculated with the following formula:

RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y}_i)^2}

where n averaged 9,537 observations over an approximately 5-minute block, y_i is the lateral position of the target, and ȳ_i is the lateral position of the cursor. RMSE tracking error observations that were three standard deviations above the individual participant's mean were also removed (1.63%).

Results

Analyses were performed in R (R Development Core Team, 2018), the statistical computing and graphics environment. We utilized the lme4 package (Bates, Maechler, Bolker, & Walker, 2015) to create a linear mixed-effects model (LMM) and reported Type II Wald chi-square tests of differences in RT, HR, and RMSE across conditions. Participants were included as a random effect. We also reported 95% confidence intervals in square brackets. Table 1 contains a summary of omissions, accuracy, and RT-mean comparisons.

Table 1
Contrasts With Means and Standard Deviations by Task Priority (3) for the Detection Response Task and the Steering Task

Dependent Variable        Task Priority       Mean    SD      p
Hit Rate (%)              DRT emphasis        99.2     8.9   <.01
                          Equal emphasis      98.4    12.6   <.01
                          Steering emphasis   98.1    13.8
DRT Reaction Time (ms)    DRT emphasis        289     128    <.01
                          Equal emphasis      324     147    <.01
                          Steering emphasis   350     165
Steering Error (RMSE)     DRT emphasis        3.53    2.96   <.01
                          Equal emphasis      2.73    2.24   <.01
                          Steering emphasis   2.49    2.01

Note. Comparisons between rows were performed with planned contrasts for pairwise comparisons of means, and p values represent these comparisons between the two adjacent rows.

The behavioral results demonstrated that participants prioritized one task over the other task as instructed. We also compared the priority of steering against the DRT in standardized units and observed which outcomes affect hypothesized parameters (see Figure 4).

DRT Measures

Reaction time. Statistical analyses were run on log-transformed RTs but are reported in milliseconds for clarity. The effect of priority upon log-RT (LRT) was calculated with the formula:

LRT_{ij} = \beta_0 + Participant_{0j} + \beta_1 Emphasis_{i} + e_{ij}

where Participant has a random intercept. The manipulation of task priority significantly affected RT, χ²(2) = 473.38, p < .001, such that from focusing upon the DRT (M = 289 ms, 95% CI [277, 300]) to focusing upon the steering task (M = 350 ms, 95% CI [340, 362]), participants' RTs slowed an average of 62 milliseconds. In pairwise contrasts, shifting priority from the DRT, to equal priority, to the steering task significantly increased RT (see Table 1).

Figure 4. Both graphs depict performance on two tasks in standardized units with three conditions of priority. Better performance is positive, and worse performance is negative. The left panel shows that DRT RT performance is highest when prioritizing the DRT and subsequently decreases steering performance by a standard deviation from the equal priority condition. The right panel demonstrates the same effect occurs in the hit rate to the DRT. Error bars are 95% confidence intervals around the mean utilizing the Cousineau-Morey method (Cousineau, 2005; Morey, 2008; Baguley, 2012).

Hit rate. The effect of priority on HR was significant, χ²(2) = 19.76, p < .001, such that from focusing on the DRT (M = 99.20%, 95% CI [98.88%, 99.52%]) to focusing upon the steering task (M = 98.07%, 95% CI [97.59%, 98.55%]), participants' HR decreased an average of 1.13%. Contrasts were fit with a binomial LMM by maximum likelihood using the Laplace approximation and found significant effects of priority from DRT to equal priority to the steering task for HR (see Table 1).

Steering Measures

RMSE. One participant's steering error failed to record and was removed from the analysis. Root mean squared error also differed across the priority manipulation, from 2.49, 95% CI [2.48, 2.50], in the steering condition to 3.53, 95% CI [3.52, 3.54], in the DRT condition, χ²(2) = 55733, p < .001, 95% CI [1.01, 1.07].

Behavioral Discussion

The traditional interpretation of the results suggests that the DRT competes for limited resources with steering, trading off performance in a linear fashion when prioritizing one or the other task according to the POCs (see Figure 4). We hypothesize that the DRT reflects changes in cognitive workload due to sharing limited resources with the steering task. However, the small difference between focusing on both tasks equally and prioritizing the steering task indicates that the DRT does not strongly affect testing where steering takes precedence. Also, the difficulty of the tasks appears to be asymmetrical, which may reflect a data limit for the DRT.
According to Navon and Gopher (1979), dual-task performance lies on a spectrum between a perfect trade-off of resources (i.e., a linear trade-off where 1 unit of increase in performance for task x results in 1 unit of decrease for task y) and no trade-off of resources (i.e., no change of x values for task x, and no change of y values for task y). The current data suggest that the DRT and the steering task more closely resemble the former scenario, indicative of resource reciprocity (see Figure 2). In an EAM, we might predict that a model that allowed only drift rate (v, see Figure 1) to vary across the conditions would more parsimoniously represent the data than a model with more free parameters. However, this model makes several assumptions about the participants' effort and capacity. For example, we assume that the amount of resource utilized remains constant across the conditions. Navon and Gopher (1979) formalize this phenomenon with indifference curves (i.e., equal-utility curves), where participants predetermine a finite level of intended performance. In EAMs of cognitive workload, threshold (b) represents these processes (Castro et al., 2019). Our manipulation of instructions (Wickelgren, 1977) assures that we can manipulate this level of performance for each task. These data also demonstrate another aspect of multitasking. In addition to maximizing the use of resources, a lower limit also exists on task performance. At some unknown point, dedicating effort toward one task to the exclusion of the other task will lead to diminishing returns on the former task's improvement (Navon & Gopher, 1979). This maxim implies that an optimal efficiency of resource budgeting may exist, where a decrease of 1 unit of cognitive workload for the DRT should lead to a 1 unit increase of cognitive workload for the steering task, but this exchange potentially deteriorates near the extremes of prioritization.
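The objective substitution rate can be computed directly from the condition means reported in Table 1. The following Python sketch (not part of the reported analyses) standardizes each task's three condition means using the sample SD across conditions and takes the slope between the two extreme priority conditions; a value near -1 indicates a near-linear POC.

```python
from statistics import mean, stdev

def standardize(xs):
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]

# Condition order: DRT emphasis, equal emphasis, steering emphasis (Table 1).
drt_rt = [289, 324, 350]        # DRT reaction time (ms); lower is better
steer_err = [3.53, 2.73, 2.49]  # steering RMSE; lower is better

# Flip signs so that higher standardized scores mean better performance.
drt_perf = [-z for z in standardize(drt_rt)]
steer_perf = [-z for z in standardize(steer_err)]

# Substitution rate between the two extreme priority conditions;
# -1 would be a perfect one-for-one resource trade-off (a linear POC).
rate = (steer_perf[2] - steer_perf[0]) / (drt_perf[2] - drt_perf[0])
print(round(rate, 2))  # -0.96
```

With these condition means the estimated rate falls close to -1, consistent with the near-linear trade-off described above.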
For the DRT, which can be described as a highly automated task (e.g., Prinzmetal et al., 2005), the standardized units of performance do not vary across conditions as much as they do for the continuous steering task. This smaller difference may indicate that the participants could allocate more resources toward DRT performance at the cost of steering performance, but DRT performance would not increase much. This proposition holds for both RT and HR, with HR averages hovering close to 100%. The data also partially support Prinzmetal et al. (2005) by demonstrating that more automated processes tend to result in RT differences more than accuracy differences. We propose that the appropriate model for fitting the data while maintaining theoretical plausibility will account for decreased performance due to the splitting of cognitive resources. Because we can model only tasks with discrete responses, the EAM will be applied only to the DRT responses. Posterior plausible-value correlations (Ly et al., 2017) between the steering task and model parameters will be calculated to demonstrate task priority's effect on steering and DRT trade-offs. Additionally, the model must account for the voluntary allocation of these resources to one task or the other. The changes in performance should occur due to a lack of available resources to process a stimulus (drift rate v) combined with the changing certainty of a response (threshold b). Finally, we may capture the relative level of preparedness in monitoring for a response as start-point variability. According to Castro et al. (2019), start-point variability should differ under cognitive workload in anticipation of the DRT.

Modeling Results

Castro et al. (2019) found that a model assuming stochastic within-trial evidence accumulation best fits the DRT. We fit the modified stochastic racing one-barrier diffusion (Leite & Ratcliff, 2010), or Wald model, used in Castro et al. (2019) and illustrated in Figure 5 to our DRT data across conditions of task priority.
One fit of the Wald race model in Castro et al. (2019) allowed the starting point of evidence accumulation to be sampled from a uniform distribution from trial to trial to represent anticipation of the stimulus. Identifying the configuration of the Wald model requires several steps. First, we fix the standard deviation of the moment-to-moment variability, represented by the dashed path in Figure 5, at 1 in order to estimate the mean rate of accumulation (v). In typical decision data, researchers estimate one average rate for the correct stimulus accumulator and another rate for the incorrect accumulator. For the DRT, however, we accumulate evidence for only one response and utilize a process-, or encoding-failure (pf), parameter to capture failures to respond. Parameter A represents the trial-to-trial varying starting point of evidence accumulation (see Figure 1). We also parameterize the accumulator's amount of evidence (B) as the distance between A and the threshold (b) (i.e., B = b - A > 0). Finally, we estimate nondecision time (t0) as the sum of the times to encode the stimulus and to produce a response.

Figure 5. A schematic for the Wald EAM of the DRT represented in Castro et al. (2019). The dashed evidence accumulation path represents a caricature of the more rapidly varying accumulation of evidence. Nondecision time (t0) results from the addition of the encoding time (te) and the response production time (tr). The distance from 0 to the threshold (b) represents the evidence required for a decision. When the mean rate (v) reaches the threshold (b), decision time (td) is determined.

As mentioned, participants occasionally fail to respond to the DRT, particularly under an increased workload. Castro et al. (2019) suggest that these omissions should be conceptualized either as a failure to sample evidence from the encoded stimulus (i.e., "trigger failure" in models of the stop-signal paradigm; Matzke, Love, & Heathcote, 2017; Matzke et al., 2017) or as a perceptual failure to even begin encoding the stimulus (i.e., inattentional blindness during distracted driving; Simons & Chabris, 1999; Strayer & Drews, 2007). In line with previous research (i.e., Brown & Heathcote, 2008; Ratcliff & McKoon, 2008), we assume that sensory information transmission creates an internal representation of evidence after stimulus onset. During that process, the evidence is accessed at some rate, resulting in accumulation. For the DRT, we assume that a sufficient change in the magnitude of relevant sensory information results in evidence accumulation and a response (i.e., a hit; Ratcliff & Van Dongen, 2011). Therefore, encoding failure (i.e., a miss) represents a lack of sufficient magnitude change, or that the evidence accumulation processes did not seek to access information about this change. Thus, the probability of these failures is accounted for with the parameter pf. We can directly observe this probability, and there is a small but significant difference between priority conditions. Therefore, pf was allowed to vary with priority. Within the model, a response (i.e., a hit) R at time t has likelihood (1 - pf) × L(R, t | θ); the probability of an omission is pf. Therefore, pf is the complement of HR (i.e., 1 - HR). For model estimation and selection, we report several analyses comparing fits of model permutations. We first parameterize several instantiations of the model, resulting in varying goodness of fit according to the Deviance Information Criterion (DIC) as described by Spiegelhalter, Best, Carlin, and Van Der Linde (2002; see Table 2). Through this process, we select the best parameterization for the task priority data.
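The mixture likelihood just described can be written out concretely. The Python sketch below uses the standard single-boundary Wald (inverse Gaussian) first-passage density with unit diffusion noise; the function names and the toy RTs and parameter values are illustrative, and the sketch omits the start-point (A) component of the full model for brevity.

```python
import math

def wald_density(t, v, b):
    """First-passage density at time t of a single-boundary diffusion with
    unit moment-to-moment noise, drift v, and threshold b."""
    return b / math.sqrt(2 * math.pi * t ** 3) * math.exp(-((b - v * t) ** 2) / (2 * t))

def hit_loglik(rts, v, b, t0, pf):
    """Log-likelihood of observed hits: a hit at time t contributes
    (1 - pf) * f(t - t0), per the mixture described in the text."""
    return sum(math.log((1 - pf) * wald_density(rt - t0, v, b)) for rt in rts)

def omission_loglik(n_misses, pf):
    """Each omission contributes probability pf."""
    return n_misses * math.log(pf)

# Toy data: four hits (RTs in seconds) and one miss.
rts = [0.28, 0.31, 0.35, 0.40]
ll = hit_loglik(rts, v=4.0, b=1.0, t0=0.15, pf=0.02) + omission_loglik(1, pf=0.02)
print(math.isfinite(ll))  # True
```

Because pf is the probability of an omission, the model-implied hit rate is simply 1 - pf.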
We then analyze the parameters of the best-fit model and compare the relative influence of each parameter in explaining the effect of priority. Finally, we correlate model parameters and steering error to demonstrate the tradeoff between the two tasks.

Table 2
The Difference Between DIC and the DIC for the Best (Bvt0, With No Start-Point Noise) Model (DIC = -17058) for the Set of Nine Models.

Model       Bvt0   aBvt0   Bv   ABvt0   Bt0   vt0   B     v     t0
Difference  0      4       7    23      28    74    139   220   585

Note. We compare Bayesian hierarchical models utilizing the Deviance Information Criterion (DIC) described in Spiegelhalter, Best, Carlin, and Van Der Linde (2002).

Model estimation and selection. We utilize Turner, Sederberg, Brown, and Steyvers' (2013) differential evolution algorithm to estimate the models in a Bayesian manner. The two-step sampling procedure, which is described in Castro et al. (2019), consists of fitting individuals before population estimates for a fully hierarchical model (also see Heathcote, Lin, Reynolds, Strickland, Gretton, & Matzke, 2018). We estimate start point (A), nondecision time (t0), omission probability (pf), threshold (B), and mean rate (v) separately for each priority condition, resulting in 15 total parameters.

Model comparisons. We fit the full model (i.e., with start-point noise and with priority effects on the probability of omission, rates, thresholds, and nondecision time) and eight simpler versions, dropping one or two effects of priority (pf always varies), with nondecision time bounded below at a fixed 50 ms. All other priors remain the same as the initial fits. The most complex model, named ABvt0, consists of 15 parameters. Start point (A) varies in the full model, is estimated as a single value shared across conditions in the second-most complex model (aBvt0) with 13 parameters, and is omitted in the next model (Bvt0) with 12 parameters.
The next level of models drops the priority effect from one parameter across conditions, resulting in vt0, Bt0, and Bv, each consisting of 9 parameters. Finally, the last three models maintain a priority effect on only one parameter; B, v, and t0 each consist of 6 parameters. Our results are consistent with previous work (e.g., Castro et al., 2019) and with the parameter analyses reported below; they provide evidence that the priority manipulation could be reliably captured by three parameters (i.e., a model that dropped the t0 parameter did not differ by an order of magnitude according to DIC). The DIC for the selected model was very similar to the initial fit of this model, and analyses of parameters replicated a very similar pattern of p values.

Parameter tests. Parameter estimates are reported as posterior medians with 95% credible intervals (in square brackets). The effects of task priority on parameter differences between conditions were tested with Bayesian p values (e.g., Matzke, Dolan, Batchelder, & Wagenmakers, 2015; Matzke et al., 2017; Klauer, 2010). The Bayesian p value represents the probability of one parameter being greater than another for the given sample; no difference would result in p = .5, and a small or large p would indicate evidence for a difference. We report the tail value to be consistent with the convention of small values representing significant differences.

The response omission parameter (pf) was 1.11% [.67, 1.42] higher when focusing on steering compared to focusing on the DRT (.08%, 95% CI [.02, 1.12] vs. 1.88%, 95% CI [1.68, 2.02], p = .02). The nondecision time (t0) parameter showed evidence of being 8 ms [3, 13] faster in the Equal condition (123 ms, 95% CI [119, 127]) than in the DRT condition (131 ms, 95% CI [128, 134]; p = .001), but it did not show evidence of differing from the Steering condition (a 1 ms difference, 95% CI [-7, 5]; 123 ms, 95% CI [119, 127] vs. 124 ms, 95% CI [119, 128], p = .41).
The response threshold (b) was higher in the Steering condition than the Equal condition (by 0.08, 95% CI [.02, .13]: .96, 95% CI [.92, 1.0] vs. .89, 95% CI [.85, .92], p = .003), and the Equal condition threshold was higher than the DRT condition (by 0.18, 95% CI [.13, .22]: .89, 95% CI [.85, .92] vs. .71, 95% CI [.68, .74], p < .001). The mean rate did not show evidence of differing between the Steering and Equal conditions (a difference of .13, 95% CI [-.06, .32]; 4.39, 95% CI [4.26, 4.52] vs. 4.52, 95% CI [4.39, 4.66], p = .09). The Equal condition rate (4.39, 95% CI [4.26, 4.52]) also did not show evidence of differing from the DRT condition rate (4.69, 95% CI [4.54, 4.84]; difference of .17, 95% CI [-.03, .38], p = .045).

The underlying causes of priority effects. In order to quantify the importance of each parameter in explaining the effect of priority on DRT RT and HR, we held one parameter constant (i.e., set to the average value between conditions) and simulated data from this model. We then subtracted the predicted priority effect in mean RT from the actual effect and expressed the difference as a proportion. The increased RT when focusing upon steering was due to both higher thresholds and lower accumulation rates. The 8 ms effect of t0 was compensatory, meaning that faster perceptual encoding/motor responses decreased the RT effect from DRT to Steering priority from 67 ms to 59 ms. Of the underlying 67 ms mean RT effect, around 57 ms (85.07%) was due to the higher threshold in the workload-present condition, and the remaining 10 ms (14.93%) was attributed to mean rate differences.

Model parameter and pursuit tracking correlations. We also employed plausible-value correlations (Ly et al., 2017) to relate model parameter differences to the changes in steering error. The parameters from the best-fit Wald model were used, and each participant's RMSE steering error served as the subject covariate.
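Because the mean RT of the shifted-Wald model has the closed form t0 + b/v (with the diffusion SD fixed at 1 and no start-point noise), the hold-one-parameter-constant decomposition can be illustrated analytically. The parameter values below are round numbers chosen in the spirit of the reported posteriors, not the actual estimates:

```python
def mean_rt(v, b, t0):
    # Closed-form mean RT of the shifted-Wald model: t0 + b / v
    # (diffusion SD fixed at 1, no start-point noise).
    return t0 + b / v

# Round illustrative parameter sets (NOT the fitted posteriors) for the
# DRT-priority and Steering-priority conditions.
drt = dict(v=4.69, b=0.71, t0=0.131)
steer = dict(v=4.39, b=0.96, t0=0.124)

full = mean_rt(**steer) - mean_rt(**drt)   # total priority effect (~60 ms)

for name in ("b", "v", "t0"):
    avg = (drt[name] + steer[name]) / 2    # hold this parameter at its average
    partial = mean_rt(**{**steer, name: avg}) - mean_rt(**{**drt, name: avg})
    share = (full - partial) / full        # portion of the effect it carried
    print(f"{name}: {share:+.1%} of the mean RT effect")
```

With these round values the threshold carries roughly 90% of the priority effect, the rate roughly 20%, and t0 a small negative (compensatory) share; the shares need not sum to 100% because the parameters interact.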
Correlations were tested between the means of the steering error and all of the parameters that were allowed to vary in the winning model. The full analysis procedure can be found in Castro et al. (2019). The correlations tested between RMSE and model parameters consisted of the posterior distributions for omission rate (pf), threshold (b), and evidence accumulation rate (v), separately in the DRT and Steering conditions. Ninety-five percent credible intervals (i.e., 95% CIs) are given in square brackets. Bayesian p values are tested with correlations between RMSE and the distribution of posterior parameter estimates. Again, we maintain the smaller-p-value convention to support the existence of differences; a p value close to zero would provide evidence for a strong negative or positive relationship, and a p value near .5 would provide evidence for no relationship. For the steering task, there was a negative correlation with mean rate (v) in the Steering priority condition, r(19) = -0.31, [-.48, -.13], p = .001. For the DRT priority condition, there was a negative correlation with the threshold (b) parameter, r(19) = -.37, [-.48, -.25], p < .001. Other correlations were weak (i.e., credible intervals included 0).

Modeling Discussion

We found further evidence that a Wald-distributed model of the DRT that accounts for response omissions, variable rates of evidence, and certainty of responses best fits RT and HR measures, even under manipulations of instructed priority. Our findings indicate that the DRT competes for limited capacity with the steering task, and participants voluntarily allocate attention to improve performance on either measure. An increased rate of evidence accumulation most strongly correlates with better steering performance. However, when focusing on the DRT, the threshold (b) dictates the relationship between DRT and steering performance. This outcome suggests that the process of dual-tasking changes depending upon task priority.
Usually, a cognitively demanding secondary task pulls our attention away from the driving context through bottom-up processes, and effortful voluntary attention is needed to return to driving while filtering out that task. In this scenario, rate effects seem to dictate steering performance until participants choose to focus on a secondary task rather than driving; in that case, the threshold of evidence necessary for a response dictates steering performance. Howard et al. (under review) predict that separate attentional processes underlie task difficulty and multitasking. In the case of using the DRT to measure workload, steering alone may be a task that increases difficulty, and therefore drift rate in an EAM, but as a second task is added, attention allocation dictates its performance. In Experiment 1, performance across task priority shifts in a linear manner (see Figure 4), which indicates resource reciprocity, but performance is also driven by the thresholds biasing one task above the other in the winning EAM. These findings indicate that both processes of attention allocation and limited capacity are required to accurately represent the effects of manipulations of instructed priority in multitasking. However, there are several aspects of Experiment 1 beyond task priority that could be responsible for the resource reciprocity demonstrated by the POCs, the selected model, and the dominant threshold effect. For example, the tasks used were discrete and continuous, respectively, which may be the main source of the demonstrated effects. Therefore, we use two simple discrete tasks in Experiment 2. The tested models have demonstrated similar effects in Castro et al. (2019) for a cognitive workload manipulation with both detection and choice tasks. Therefore, we test different models and choice tasks in Experiment 2 in order to provide converging evidence for the effect of task priority on attention allocation and limited capacity.
EXPERIMENT 2

Research Objectives

In measures of human performance, the underlying mechanisms that dictate speed and accuracy have been thoroughly studied with simple tasks (e.g., Navon & Gopher, 1979; Shiffrin & Schneider, 1977; Schneider & Shiffrin, 1977). Therefore, in testing the validity of our models of evidence accumulation for quantifying processes of task priority, we must employ a basic research paradigm. EAMs traditionally capture the dynamics of a 2-alternative forced-choice (2AFC) task. However, Castro et al. (2019) found similar parameter changes for the ISO DRT (ISO 17488, 2016) and a modified choice response task (CRT). In order to provide converging evidence, Experiment 2 also contains a 2AFC. However, unlike Castro and colleagues' (2019) CRT, we can increase the amount of data to be modeled by replacing the continuous steering task with another 2AFC. The 2AFCs are fit with a linear ballistic accumulator model (LBA; Brown & Heathcote, 2008) as described by Heathcote et al. (2019). By demonstrating similar parameter changes across different RT tasks and models, we provide further evidence for the underlying processes of limited capacity and task prioritization captured with EAMs.

Aim 2

To establish an EAM's ability to demonstrate cognitive limitations in simultaneous 2AFCs, we fit the model to data more closely resembling an experimental design from the PRP literature. According to Pashler (1994), when participants produce two responses in quick succession, the response-selection bottleneck (RSB) increases the second RT. This effect should produce specific changes in the drift rate (v) parameter of the model. However, multitasking has so far led to changes in the threshold (b) parameter of our models (Castro et al., 2019; Heathcote et al., 2015; Tillman et al., 2017). To test these outcomes, we present two simultaneous choices and require that participants change which task has priority for a response.
Unlike most PRP designs, we do not employ an SOA, to preserve the similarity of Experiment 2's manipulation to Experiment 1.

Hypotheses. The RSB framework dictates that capacity limitations reside with the decision/selection process. Therefore, evidence for one response or another should accumulate more slowly when trying to make two simultaneous responses. To match the previous manipulation of task priority, we utilize the same instruction manipulation and demonstrate that the tasks trade off in terms of their POCs. We predict that the two responses will trade off in performance similar to the DRT and steering deviation observed in Experiment 1. This fundamental research task should also demonstrate the same pattern of parameter changes in EAMs.

Method

Participants

As in Experiment 1, we recruited 20 participants from University of Utah undergraduate psychology courses. Participants received course credit as compensation upon completion of the 2-hour study. Participants were prescreened for colorblindness utilizing the SONA participant database.

Materials

The task was presented on a desktop computer utilizing E-Prime 2.0 software to present stimuli (Psychology Software Tools, Pittsburgh, PA). Responses were made using two configurations of the 'z', 'x', '.', and '/' keys on a desktop keyboard. These keys were covered with buttons representing blue, red, square, or circle.

Procedure

Participants respond to the color and the shape presented on the screen. Each stimulus presentation of a colored shape requires two responses. Then, the program displays feedback for both RTs and whether the responses are correct overall. The stimulus appears after a set interstimulus interval of 500 milliseconds. The experiment was conducted across four counterbalanced sessions. Each session had 80 trials of practice, a block prioritizing color, a block prioritizing shape, and a block with equal priority. Figure 6 demonstrates an example trial with a red circle stimulus.
For color priority, participants focus on responding to whether the stimulus is red or blue. For the shape emphasis, participants focus on responding to whether the stimulus is a circle or a square. In all conditions, participants respond to both the shape and the color of the stimulus (i.e., participants make two responses, one for each task).

Measures. Two RTs to the presentation of the stimulus were recorded, as well as the number of correct responses to the color and shape of the stimulus. Responses before 100 milliseconds were considered anticipatory and labeled "guessing," and responses after 3 seconds were labeled "too slow." These responses were removed from the analysis (1.2%).

Figure 6. An example trial presents a red circle to participants with the E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA).

Results

For the second experiment, we present two simultaneous choices and require that participants change which task has priority for a response. Results indicate that the choice tasks trade off in a similar manner to the DRT and steering (see Figure 7). However, the task priority manipulation shifted emphasis across a smaller range of standardized units (~1 vs. >~.5). In an EAM, the drift rate parameter of the model accounts for this effect. The stimulus appears after a set interstimulus interval of 500 milliseconds, so start-point variability (A) should be observed in the EAM. The caution level of the response (threshold) should also differ across tasks in the EAM.

Behavioral Measures

Reaction time. We found an effect of task priority, χ2(1) = 15.90, p < .001, and a significant interaction of task priority and response type, χ2(1) = 2348.47, p < .001.
In the color first condition (M = 883 ms, SD = 152 ms), color responses were 160 milliseconds, 95% CI [92 ms, 244 ms], faster, t(19) = 8.32, p < .001, than in the shape first condition (M = 1041 ms, SD = 162 ms). For shape responses, almost the exact opposite pattern occurred: responses in the shape first condition (M = 847 ms, SD = 146 ms) were faster than in the color first condition (M = 1048 ms, SD = 159 ms), t(19) = 9.32, p < .001 (see Table 3).

Figure 7. Both graphs depict performance on the two tasks in standard deviation units for the three conditions of priority. The left graph demonstrates a fairly linear tradeoff in RT performance between the two tasks, with some potential interference from attempting to perform the tasks simultaneously. The right panel demonstrates a slight tradeoff in accuracy for the two tasks. Error bars are 95% confidence intervals around the mean for repeated measures utilizing the Cousineau-Morey method (Cousineau, 2005; Morey, 2008; Baguley, 2012).

Accuracy. According to a Type II Wald chi-square test, responding to shape or color was not significantly different, χ2(1) = 3.83, p = .05. However, there was a strong interaction, χ2(1) = 146.93, p < .001, between the priority manipulation and the response type. When participants were instructed to respond to color first, accuracy for the color responses was 95.50%, 95% CI [94.9, 96.2]; when asked to respond to shape first, accuracy for the color responses decreased to 92.47%, 95% CI [92.3, 93.8], t(19) = 7.34,

Table 3
Reaction Time and Accuracy of the Simultaneous Color and Shape Discrimination Tasks.
Dependent Variable    Task Priority   Response   Mean              SD               p
Accuracy (%)          Color, Shape    Color      95.5%, 92.5%      20.7%, 26.4%     <.01
                                      Shape      92.8%, 96.5%      25.8%, 18.4%     <.01
Reaction Time (s)     Color, Shape    Color      883 ms, 1041 ms   152 ms, 162 ms   <.01
                                      Shape      1048 ms, 847 ms   159 ms, 146 ms   <.01

Note. Planned contrasts of levels within the winning LMM for pairwise comparisons of means were tested; p values represent these comparisons.

p < .01. The same pattern occurred for the shape priority condition, with 96.48%, 95% CI [95.3, 97.7], accuracy when responding to shape first and 92.83%, 95% CI [91.8, 93.7], accuracy when responding to color first, t(19) = 9.62, p < .01. This pattern demonstrates the effect of priority upon accuracy, with a consistent trade-off between color discrimination and shape discrimination.

Behavioral Discussion

Results suggest that priority traded off reciprocally between the two choice tasks in the same manner as in Experiment 1. This dataset should, therefore, have a similar configuration and resultant pattern in its model. When participants responded to the shape task first, they were faster and more accurate in their responses to the shape discrimination task. The same pattern occurred for color discrimination responses when participants were asked to respond to the color task first. In the equal priority condition, participants appear to have experienced slight interference from making decisions about two features of the same object. This interference can be seen as a slight dip in the POC for performance in the equal condition. In this case, a model that accumulated evidence across the two tasks and delayed response initiation until both sets of accumulators reached a threshold would probably be more representative of the underlying processes. Therefore, we fit different instantiations of the model to the unequal priority response conditions separately from the equal priority condition.
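The within-subject error bars used for Figure 7 follow the Cousineau-Morey method; a minimal sketch of that computation (not the dissertation's plotting code) is:

```python
import numpy as np

def cousineau_morey_halfwidths(data, z=1.96):
    """Within-subject CI half-widths (Cousineau, 2005; Morey, 2008).

    data: participants x conditions array. Scores are centered on each
    participant's own mean (removing between-subject spread), and the
    condition variances are then rescaled by C/(C-1), Morey's
    correction for the bias introduced by the centering.
    """
    data = np.asarray(data, dtype=float)
    n, c = data.shape
    centered = data - data.mean(axis=1, keepdims=True) + data.mean()
    sem = np.sqrt((c / (c - 1)) * centered.var(axis=0, ddof=1) / n)
    return z * sem

rng = np.random.default_rng(0)
base = rng.normal(0, 100, size=(30, 1))    # large between-subject spread
data = base + np.array([0.0, 0.5, 1.0]) + rng.normal(0, 0.5, size=(30, 3))
print(cousineau_morey_halfwidths(data))    # small within-subject CIs
```

With the synthetic data above, the raw between-subject spread would swamp the condition effects; the normalization yields error bars that reflect only the repeated-measures variability, as in Figure 7.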
The POCs also demonstrate a slight bias toward shape responses in accuracy, an asymmetric effect often captured by the response threshold (b) in an EAM. However, the shape and color tasks may not be psychometrically equivalent, which would be captured by the encoding and motor response time (i.e., nondecision time, t0) in an EAM. We can remove this effect by modeling the color task and the shape task separately across the priority conditions, thereby fitting models to changes in RT distributions based upon the priority manipulation regardless of the task. However, a model accounting for all of the effects beyond our manipulation would better represent the data. We fit the full model to instructional emphasis, the task type (i.e., color or shape), and the order of response (i.e., whether the response for the color task was red or blue first, and whether the response for the shape task was square or circle first). For subsequent analyses, we collapse across response type (i.e., averages of shape and color effects across task priority), determining model fits to priority responses and nonpriority responses.

Modeling Results

For the LBA model, each accumulator samples its starting point independently on each trial from a uniformly distributed interval 0-A (see Figure 8). There are two accumulators per trial. For the full model, start-point noise (A) is allowed to vary; for all other models, we assume the same value of start-point noise (A) for all conditions. Evidence accrues linearly at an independently drawn rate for each accumulator and trial.

Figure 8. A schematic for the two LBA evidence-accumulation models of the two choice tasks. A previous configuration of this model is represented in Castro et al. (2019). Each evidence accumulation path represents a linear accumulation of evidence. In this schematic, a decision for the circle key would be made, largely due to a change in the drift rate (v) parameter.
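The LBA assumptions described here (uniform start points on [0, A], linearly accruing rates drawn from a normal truncated below at zero) can be made concrete with a forward simulation of a single choice. The rate, threshold, and start-point values below are round illustrations, not fitted estimates:

```python
import numpy as np

rng = np.random.default_rng(7)

def trunc_normal(v, sv):
    # Rate drawn from a normal(v, sv) truncated below at zero
    # (simple rejection sampling).
    x = rng.normal(v, sv)
    while x <= 0:
        x = rng.normal(v, sv)
    return x

def lba_trial(A=0.5, b=1.3, vs=(2.7, 1.0), svs=(1.0, 1.0), t0=0.15):
    """Simulate one LBA trial with two accumulators (e.g., red vs. blue).

    Each accumulator draws a start point uniformly on [0, A] and a rate
    from a truncated normal, then accrues evidence linearly; the first
    to reach the threshold b determines the response and the RT.
    """
    starts = rng.uniform(0.0, A, size=2)   # trial-to-trial start noise
    rates = np.array([trunc_normal(v, sv) for v, sv in zip(vs, svs)])
    times = (b - starts) / rates           # time for each to reach b
    winner = int(np.argmin(times))         # 0 = matching accumulator
    return t0 + times[winner], winner

trials = [lba_trial() for _ in range(5000)]
rts = np.array([t for t, _ in trials])
accuracy = np.mean([w == 0 for _, w in trials])
print(round(rts.mean(), 3), round(accuracy, 3))
```

Collecting many such trials yields the RT distributions and response probabilities that the defective CDFs in Figures 9 and 10 summarize.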
Rates are selected from a normal distribution with mean v and rate standard deviation sv. Both rate distributions are truncated below at zero. The threshold (b) varies between accumulators to accommodate potential response biases (e.g., a lower threshold for the square accumulator would cause a bias to respond "square"), and it is parameterized in terms of the gap (B) between the top of the start-point noise and the threshold (i.e., B = b - A), with B allowed to vary across the conditions. We estimate the v parameter for matching and mismatching accumulators of each stimulus (i.e., color and shape). Nondecision time (t0) is assumed to be the same for all accumulators but allowed to vary with task priority. In order to identify the model (Donkin, Brown, & Heathcote, 2009), the sv parameter remains fixed at 1 for the accumulator that mismatches the stimulus (i.e., the red accumulator when the stimulus is blue and the circle accumulator when the stimulus is a square), but sv for the other (matching) accumulator is estimated. This parameter varies with the stimulus (i.e., color or shape).

Priors. We chose vague priors designed to have little influence on estimation. The individual parameter estimates were used as the mean parameters for the population distributions. The standard deviations of the hyperparameters (i.e., population-level parameters) are assumed to have exponential distributions with a scale parameter of one. In individual participant estimation, priors are normal distributions that are truncated below at zero for the B, A, t0, and sv parameters. The t0 prior is truncated above at 1 s, and no posterior samples ever approached this limit. No other truncations are assumed, so the v prior remains unbounded. For B, A, v for the mismatching accumulator, and all sv parameters, the prior mean equals 1. The v parameters for matching accumulators take a prior mean of 3 and for mismatching accumulators a prior mean of 1.
The t0 parameters have a prior mean of 1, and there is no pf parameter (corresponding to a 0% omission rate). All priors take a standard deviation of 2. The LBA model assumes that thresholds (B), mean rates (v), and nondecision time (t0) vary with task priority. Separate models were fit for each response type. Hence, the full LBA model has 29 total parameters: 16 vs (priority × response × match), 8 Bs (priority × match), 2 t0s (priority), 2 As (priority), and 1 sv.

Posterior inference. To take account of correlations between parameter estimates in within-subject comparisons, we first calculate parameter differences for each posterior sample for each participant and then average the differences for each posterior sample over participants. The p value corresponds to the proportion of average posterior sample differences above or below zero, depending on the direction that is consistent with the stated hypothesis. Ninety-five percent credible intervals are calculated from the values below and above which 2.5% of posterior samples (or differences) occurred.

Model fit. In Figures 9 and 10, the fit of the LBA model is shown using cumulative distribution function lines averaged over participants. Each panel contains a pair of cumulative functions, one for each response, and corresponds to a cell of the design. Note that these asymptote at the probability of that response, so the height of the lower (incorrect-response) function gives the error rate, and the sum of the two asymptotic heights corresponds to the probability that a response was made. In Figures 9 and 10, the thin black lines and solid black points denote the model predictions averaged over posterior samples, and the thick gray lines and open points represent the data. Percentile predictions for 500 randomly selected sets of posterior parameter samples are averaged for this calculation.

Model estimation and selection.
Again, we estimated the models utilizing a Bayesian approach and employed the differential evolution algorithm proposed by Turner, Sederberg, Brown, and Steyvers (2013).
Figure 9. Cumulative distribution functions for data (thick gray lines) and fits (thin black lines) of the LBA model with start-point variability (ABvt0) to the choice data for the color task. In the titles, COLOR represents color responses. Lowercase shape and color refer to the task priority manipulation. The s1 and s2 refer to either red or blue, but the button locations were switched in blocks to counterbalance finger responses. When the stimulus is s1 and the response is r2, that trial is an incorrect response. Each panel contains results for both correct (i.e., the higher curve) and incorrect (i.e., the lower curve) responses. Symbols mark the 10th, 30th, 50th, 70th, and 90th percentiles (solid for average fits, open for data). The black lines and solid black points are the average of 500 fits.
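The defective cumulative distribution functions plotted in Figures 9 and 10 can be computed directly from trial-level data; each curve asymptotes at the overall probability of its response. A minimal sketch with hypothetical data:

```python
import numpy as np

def defective_cdf(rt, resp, which, grid):
    """Defective CDF for one response: F(t) = P(resp == which and RT <= t).

    Unlike an ordinary CDF, the curve asymptotes at the overall
    probability of giving that response, as in Figures 9 and 10.
    """
    rt = np.asarray(rt, dtype=float)
    resp = np.asarray(resp)
    sel = rt[resp == which]                        # RTs for this response only
    return np.array([(sel <= t).sum() for t in grid]) / len(rt)

# Hypothetical trials: three r1 responses and one r2 response.
rt = [0.30, 0.50, 0.70, 0.90]
resp = ["r1", "r1", "r2", "r1"]
grid = [0.4, 1.0]
print(defective_cdf(rt, resp, "r1", grid))         # -> [0.25 0.75]
```

Because the divisor is the total trial count, the r1 curve here levels off at .75, the probability of an r1 response, rather than at 1.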
Figure 10. Cumulative distribution functions for data (thick gray lines) and fits (thin black lines) of the LBA model with start-point variability to the choice shape data. In the titles, SHAPE refers to shape responses. Lowercase shape and color refer to task priority, and s1 and s2 refer to either circle or square. When s matches r, the response is correct. Each panel contains results for both correct (higher) and incorrect (lower) responses. Symbols mark the 10th, 30th, 50th, 70th, and 90th percentiles (solid for average fits, open for data). The black lines and solid black points are the average of 500 fits.

As described in Castro et al. (2019), the two-step sampling procedure consists of fitting individuals before population estimates for a fully hierarchical model (also see Heathcote, Lin, Reynolds, Strickland, Gretton, & Matzke, 2018). For each priority condition, nondecision time (2 t0s), threshold (8 Bs), mean rate (16 vs), and start-point variability (2 As) were estimated separately, producing a total of 28 parameters.

Model comparisons. We fit the most complex model and seven simpler versions, dropping one or two effects of priority, with nondecision time bounded below at a fixed 0 s. All other priors were the same as the initial fits. According to DIC, the best-fitting model retained start-point noise and priority effects on thresholds, rates, and nondecision time (see Table 4).
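Model selection here, as in Experiment 1, rests on DIC differences (Table 4). Given posterior deviance samples, the criterion itself reduces to a few lines (a generic sketch, independent of any particular sampler):

```python
import numpy as np

def dic(deviances, deviance_at_mean):
    """Deviance Information Criterion (Spiegelhalter et al., 2002).

    deviances: -2 * log-likelihood evaluated at each posterior draw.
    deviance_at_mean: deviance at the posterior mean of the parameters.
    DIC = Dbar + pD, where Dbar is the mean posterior deviance and
    pD = Dbar - D(theta_bar) is the effective number of parameters;
    smaller DIC indicates a better trade-off of fit and complexity.
    """
    d_bar = float(np.mean(deviances))
    p_d = d_bar - deviance_at_mean
    return d_bar + p_d

print(dic([100.0, 102.0, 98.0], 95.0))  # -> 105.0
```

Tables 2 and 4 report each model's DIC minus the best model's DIC, so the winning model sits at a difference of 0.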
The ABvt0 model was the most complex and consisted of 29 parameters. The priority effect on start-point variability was dropped for the Bvt0 model, although A was still estimated, giving 28 parameters. In the next level of models, the priority effect on one further parameter was dropped, resulting in the vt0, Bt0, and Bv models, consisting of 24, 20, and 27 parameters, respectively. Finally, the last three models included a priority effect on only one parameter; B, v, and t0 consisted of 19, 23, and 16 parameters, respectively. In line with the parameter analyses reported below, the results of this analysis confirm the reliability of the priority effects on all four parameters. All of the models that drop one or more effects perform worse. The analyses of parameters replicate a similar pattern with p values. Further, the DIC for the initial fit and the DIC for the selected model are very similar.

Table 4
The difference between each model's DIC and the DIC for the best (ABvt0, with start-point noise) model (DIC = 18730) for the set of eight models.

Model:       ABvt0   Bvt0   Bv    Bt0    B      vt0    v      t0
Difference:  0       145    173   1247   1299   2756   2995   10063

Note. We compare Bayesian hierarchical models utilizing the Deviance Information Criterion (DIC) described in Spiegelhalter, Best, Carlin, and Van Der Linde (2002).

Parameter tests. We show the parameter estimates as posterior medians with 95% credible intervals in square brackets. Bayesian p values are used to test the effects of task priority on parameter differences between conditions (e.g., Matzke, Dolan, Batchelder, & Wagenmakers, 2015; Matzke et al., 2017; Klauer, 2010). For the given sample, the Bayesian p value gives the probability of one parameter being greater than another. For example, p = .5 indicates that no difference exists between parameters, while a smaller or larger p provides evidence for a difference. To be consistent with the convention of small values representing significant differences, we report the tail value.
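The tail-value convention described above can be sketched in a few lines; this is an illustration of the general idea rather than the dissertation's analysis code, and the function name and toy samples are hypothetical:

```python
import numpy as np

def bayes_p(samples_a, samples_b):
    """Posterior probability that parameter a exceeds parameter b,
    computed over paired posterior samples and reported as a tail
    value, so that small numbers indicate evidence for a difference."""
    p = float(np.mean(np.asarray(samples_a) > np.asarray(samples_b)))
    return min(p, 1.0 - p)  # tail convention: p = .5 means no difference
```

For example, if parameter a exceeds parameter b in 75% of paired posterior samples, the reported tail value is .25; a tail value near 0 corresponds to a credible difference in either direction.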
When focusing on shape versus focusing on color, there is not compelling evidence that the conditions differed in nondecision time (t0). For example, when focusing on red, responses to red are 1 ms [-2, 5] faster (151 ms [149, 152] vs. 152 ms [148, 153], p = .45). The response threshold (b) is higher for color in the shape priority condition than in the color priority condition (by 0.23 [.16, .27]: 1.35 [1.20, 1.41] vs. 1.12 [.96, 1.99], p < .001). The shape priority condition's mean rate for color responses is lower than the color priority condition's as well (by .14 [.06, .26]: 2.67 [2.61, 2.78] vs. 2.53 [2.45, 2.63], p < .001).

The underlying causes of priority effects. To more thoroughly explain the effects of priority on color and shape responses, we sought to understand the impact of each parameter. To do this, we held one parameter constant (i.e., set to the average value between conditions) and simulated data from the model. We then subtracted the predicted priority effects in mean RT from the actual values to obtain the proportion attributable to each parameter. Higher thresholds and lower accumulation rates accounted for the increased RT when focusing on the opposite task. The higher threshold in the shape priority condition accounted for around 50 ms (87.52%) of the underlying 57 ms mean RT effect for color responses, and the remaining 7 ms (12.48%) was due to the mean rate differences.

Modeling Discussion

We found further evidence that an LBA model that accounts for variable rates of evidence accumulation and the certainty of responses best fits RT and HR measures, even under shifting voluntary attention. This shifting of attention appears as a threshold change in our models, coupled with a smaller effect of limited capacity. However, unlike Castro et al. (2019), the shifting of priority does not require start-point variability. This could be due to participants making fewer anticipatory responses in a dual-choice task.
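The hold-one-parameter-constant decomposition described above can be sketched with a minimal LBA race simulation. This is a toy illustration, not the dissertation's DMC code: the parameter values are hypothetical (chosen loosely in the range of the reported fits), only the threshold differs between the two simulated priority conditions, and negative sampled rates are simply handled by conditioning on at least one positive rate:

```python
import numpy as np

rng = np.random.default_rng(1)

def lba_trial(b, A, v_correct, v_error, t0, s=1.0):
    """One LBA trial (Brown & Heathcote, 2008): two accumulators race
    from uniform start points toward threshold b with normal rates."""
    while True:
        rates = rng.normal([v_correct, v_error], s)
        if (rates > 0).any():  # condition on at least one positive rate
            break
    starts = rng.uniform(0.0, A, size=2)
    times = [(b - st) / r for st, r in zip(starts, rates) if r > 0]
    return t0 + min(times)   # RT of whichever accumulator finishes first

def mean_rt(b, n=20000, **kw):
    return float(np.mean([lba_trial(b, **kw) for _ in range(n)]))

# Hypothetical values; only threshold b varies across conditions here.
common = dict(A=0.3, v_correct=2.6, v_error=1.0, t0=0.15)
rt_focus = mean_rt(1.12, **common)                # task prioritized (low b)
rt_other = mean_rt(1.35, **common)                # other task prioritized (high b)
rt_avg_b = mean_rt((1.12 + 1.35) / 2, **common)   # b held at its cross-condition average
priority_effect = rt_other - rt_focus
threshold_share = (rt_other - rt_avg_b) / priority_effect
```

Comparing the simulated priority effect in mean RT with and without the parameter held at its average indicates how much of the effect that parameter carries; in the full analysis this is done for thresholds and rates jointly rather than for a threshold-only toy model.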
However, it is important to emphasize that, according to limited-capacity attention theories, the expected driver of RT is drift rate (v), reflecting a pool of resources shared across multiple tasks (e.g., Strayer, Watson, & Drews, 2011). The drift rate parameter maps to "the speed of information uptake" according to Voss, Nagler, and Lerche (2013, p. 4). Although drift rate (v) may explain increases in cognitive workload with increasing difficulty in a single task, it could be that multitasking involves other processes (e.g., Howard et al., under review). We find that the processing speed effect is overshadowed in our manipulation of task priority by the threshold (b) effect, which mirrors Experiment 1's findings with the Wald-distributed model for the DRT.

However, there may be some elements of the current experimental design that artificially force the outcome. For example, although participants do not appear to perform at ceiling, there are relatively few errors recorded, leading to poor fits for errors. Additionally, participants may be processing both the shape and the color simultaneously, but then waiting to respond to both in quick succession after both decisions have been made. Although this would explain the data and cast the winning model as a result of strategic response caution, this type of response caution minimizes RT differences that otherwise would have occurred. Therefore, the proportion of the effect accounted for by response thresholds may be inflated.

Despite these potential caveats, the pattern converges with Experiment 1, suggesting that detection multitasking and choice multitasking utilize similar mechanisms. These mechanisms rely heavily on threshold adjustments to counteract performance decrements due to dividing attention. In simultaneous choice tasks, these threshold effects are dominant.

GENERAL DISCUSSION

In Experiment 1, participants performed a continuous steering task and a discrete detection task (i.e., the DRT; ISO 17488, 2016).
In Experiment 2, participants responded to the color and shape of an object for two 2AFC tasks simultaneously presented on a computer screen. In both experiments, participants were asked to prioritize one task, the other task, and both tasks equally in counterbalanced blocks. Behavioral results from both experiments demonstrated similar patterns in terms of standardized units for POCs. According to Navon and Gopher (1979), the linear POCs represent resource reciprocity, where a finite resource trades off linearly between two tasks across task priority. Within this framework, limited capacity dictates the effect between prioritizing one task or the other. However, this framework underemphasizes the processes that determine the resources necessary to complete the task. Also, a linear POC indicates that equivalent amounts of the attentional resource were utilized across task priority in total, potentially representing 100% of the resources available. In practice, maximizing resources in dual-tasking leads to diverging measures of accuracy and RT (Fox, Park, & Lang, 2007), with fast errors increasing (i.e., guessing, or very low thresholds) and failures to respond to the secondary task becoming more likely (Castro et al., 2016; Castro et al., 2019).

According to these criteria, the data from our manipulations of task priority do not indicate cognitive overload, where the resource demanded by the tasks exceeds the amount available. Instead, we can infer fairly consistent performance across task priority (i.e., within one standard deviation) and propose that predetermined criteria for performing the task "well enough" dictate the overall attention utilized, as well as the proportions of attention allocated to each task across task priority. This proposal emphasizes that our capacity for attending to each task is not the main predictor of performance, but that a schema of acceptable performance predicts the effect of task priority. This schema is represented by the threshold parameter of EAM.
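The resource-reciprocity idea behind a linear POC can be expressed as a toy sketch, assuming (purely for illustration) that standardized performance on each task is proportional to the resource allocated to it; the function name and inputs are hypothetical:

```python
def linear_poc(priority_task1, total=1.0):
    """Resource reciprocity (Navon & Gopher, 1979): a fixed resource
    pool splits between two tasks, so performance on one task trades
    off linearly against the other across priority conditions."""
    r1 = priority_task1 * total   # resource (and toy performance) for task 1
    r2 = total - r1               # remainder goes to task 2
    return r1, r2

# Three priority conditions: favor task 2, equal priority, favor task 1.
points = [linear_poc(p) for p in (0.25, 0.5, 0.75)]
```

Every point lies on a straight line of slope -1 (the allocations always sum to the total pool), which is the signature a strictly resource-limited account predicts for the POC; deviations below that line would instead indicate interference or overload.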
However, effects of limited capacity in line with Kahneman (1973) or Welford's (1952) PRP framework cannot be wholly dismissed from these data. For example, the equal priority condition in Experiment 2 seems to demonstrate the dual-task interference described by Pashler (1993, 1994). There are several types of interference that could describe the decreased simultaneous performance of the color and shape tasks compared to the priority conditions. For example, the effects of early perceptual interference in determining different features of the same stimulus would be captured by nondecision time (t0) in EAM. Although Navon and Gopher (1979) largely dismiss interference as difficult to interpret at best, performing below the potential family of POC curves could be explained by interference, as Kantowitz and Knight (1976) demonstrate, or by changing task priority and task difficulty. However, the effect occurs only in accuracy, we only manipulated task priority, and the other conditions already demonstrate a large change in threshold. The explanation that participants solve both problems serially (e.g., Pashler, 1993, 1994; Karayanidis, Coltheart, Michie, & Murphy, 2003) and wait to press the keys at the same time should lead to slowed RT in addition to poorer accuracy, but would still be represented by an increase in the threshold. Wickens (2002, 2008) presents another alternative with multiple pools of resources for the different sensory modalities. Although all of the information presented is visual, features of a single object might be more related than different facets of two different objects. The behavioral data alone do not help to disentangle this phenomenon. Other features of the behavioral data stand out as well, such as the asymmetric effect of focusing on the DRT compared to the steering task in Experiment 1. When focusing on the DRT, steering suffers disproportionately.
We hypothesized that start-point variability would account for increased anticipation of the pseudo-randomly presented DRT stimuli, but the best-fitting model provided contrary evidence. The explanation for why the DRT does not change as drastically as steering may relate to the aforementioned discrete/continuous pairing and the simple, automated DRT task (Schumacher et al., 2001; Shiffrin & Schneider, 1977). Additionally, Navon and Gopher (1979) describe diminishing returns in trading units of resource in order to approach maximum performance. Focusing upon steering did not greatly impair DRT performance, which maps well to its intended use in the vehicle as a secondary task. Additionally, as a measure of cognitive workload, it would be helpful for the DRT to be somewhat resistant to the voluntary allocation of attention. However, EAMs of the effects created by changing task priority challenge exactly what the DRT is measuring.

Modeling the Interaction of Attention Allocation and Capacity

The studies described in this text demonstrate how the voluntary prioritization of one task over another shifts the importance of parameters in both Wald and LBA models. These shifts in the allocation of resources are mostly captured by the threshold parameter of our models. However, limited capacity must also be accounted for in order to accurately capture the data. In Experiment 1, response omissions do not seem to have as much importance in priority shifts compared to manipulations of cognitive workload (e.g., Castro et al., 2019). In Experiment 2, the consistency of the interstimulus interval may help to account for the selection of start-point variability (A), which is thought of as an anticipatory effect in EAM. Both models provided reasonably accurate representations of the data and demonstrated similar patterns across their parameters.
These configurations of parameters implicate the importance of attention allocation and the certainty of responding over limited cognitive capacity. These conclusions continue a line of research supporting control over capacity (e.g., Heathcote et al., 2015), which advances a more holistic conceptualization of human performance in multitasking. However, the threshold parameter could represent withholding cognitive resources due to response certainty, as there was no explicit time pressure or bias toward less automated tasks. In the latter case for Experiment 1, the quantity of steering-task updating and feedback outweighs the intermittent monitoring for DRT signals, creating a natural bias toward that task. However, this effect occurred in Experiment 2 as well; participants slightly favored the shape task over the color task in the accuracy of responses. Despite these challenges, an EAM account of multitasking performance with the manipulation of instructional task priority provides further insight into the relationship between the allocation of attention and limited capacity. Future studies should determine this approach's feasibility outside of the lab and what insights could be gained in naturalistic multitasking. However, competing explanations still exist that provide evidence for the mechanisms of performance limitations and should be further explored with this approach.

Applications

Driving

These findings have implications for measuring distraction in automobiles due to a prevalent confound in the distracted driving literature: when a person is distracted, they may be distracting themselves by allocating too many resources to noncritical tasks, or they may be overwhelmed by the combined tasks of driving and talking, giving voice commands, or listening to an audiobook.
Future research must take into account the fairly recent phenomenon identified by the late Clifford Nass:

we have this true design challenge that we've never encountered before, which is the entire field of automotive design has to switch from how can I, the designer, stop distracting you because you really want to pay attention to the road, to a radically different world in which the driver says I don't want to pay attention to the road, and the auto designer has to say how can I force you back onto paying attention to the road? (C. Nass, personal communication, May 10, 2013)

This assertion concludes that people no longer wish to allocate their resources toward the life-saving but ultimately boring task of moving from point A to point B via automobile. Before designers can begin to remove drivers from the responsibility of maintaining their own safety (e.g., via automation), it is important to determine that this is indeed the case. Through simple experimental designs and robust statistical methods, we can determine which human capabilities are affected by attentional limitations and distraction, but we must also test how these human-machine interactions occur in reality to determine how we might optimize them.

Workplace Multitasking

People often decrease their productivity at work by multitasking with technology (Duke & Montag, 2017). Interrupting a radiologist can lead to failures to detect cancerous nodules (Williams & Drew, 2017). Workplace multitasking can be personally initiated or externally driven (König, Oberacher, & Kleinmann, 2010). In these cases, determining whether workers are overloaded with tasks that have expended their limited capacity or whether inefficient prioritizing of tasks is hurting productivity could be invaluable to the office, lab, or emergency room. EAM of cognitive workload could then help to determine optimal task orders or whether attentional training would increase productivity.
Team Workload

Teams also flexibly allocate resources, and this approach could be expanded to incorporate the relative workload of n nodes in a system. Patterns in nature often repeat themselves at different scales, and determining the similarities and differences between individual multitasking and multiperson multitasking could have a strong impact on our structures for work, war, or play. In the field of team cognition (Fernandez et al., 2017), EAMs with n accumulators could demonstrate bottlenecks in the system instead of just the person. The relative workload of dyads, as well as overall workload, could be calculated for interlocutors.

Conclusions

The DRT coupled with models of evidence accumulation can account for previously uncontrolled confounds in measurements of cognitive workload, but only after determining a reasonable set of cognitively plausible, identifiable, and recoverable parameters. This paper establishes the characteristics of the proposed parameters in a task priority manipulation and finds that predetermined attention allocation predicts performance beyond the limited capacity of attention. However, both processes are needed to represent the data most accurately. Future research should build on this approach to understanding the role of cognitive workload in applied settings such as driving and interacting with technology. The understanding of cognitive workload in this context will improve people's physical safety, psychological stress, and the efficiency of the systems that currently cause us to multitask. Although the approach may be resource-, data-, and computation-intensive, our technological capabilities and integration of converging methods will only improve. Whether the goal is to elucidate some fundamental mechanism of how we perceive and act upon our environment or to answer specific questions about performance in applied settings, this flexible framework can support it.

REFERENCES

Baguley, T. (2012).
Calculating and graphing within-subject confidence intervals for ANOVA. Behavior Research Methods, 44(1), 158-175. Barrouillet, P., Bernardin, S., Portrat, S., Vergauwe, E., & Camos, V. (2007). Time and cognitive load in working memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(3), 570. Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01. Beck, J. M., Ma, W. J., Kiani, R., Hanks, T., Churchland, A. K., Roitman, J., ... Pouget, A. (2008). Probabilistic population codes for Bayesian decision making. Neuron, 60(6), 1142-1152. Bennur, S., & Gold, J. I. (2011). Distinct representations of a perceptual decision and the associated oculomotor plan in the monkey lateral intraparietal area. Journal of Neuroscience, 31(3), 913-921. Biernat, M., Kobrynowicz, D., & Weber, D. L. (2003). Stereotypes and shifting standards: Some paradoxical effects of cognitive load. Journal of Applied Social Psychology, 33(10), 2060-2079. Braver, T. S. (2012). The variable nature of cognitive control: A dual mechanisms framework. Trends in Cognitive Sciences, 16(2), 106-113. Brooks, S. P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7(4), 434-455. Brown, S. D., & Heathcote, A. (2008). The simplest complete model of choice response time: Linear ballistic accumulation. Cognitive Psychology, 57(3), 153-178. Castro, S. C., Cooper, J. M., & Strayer, D. L. (2016). Validating two assessment strategies for visual and cognitive load in a simulated driving task. Proceedings of the Human Factors and Ergonomics Society 56th Annual Meeting (pp. 1899-1903). Sage CA: Los Angeles, CA: SAGE Publications. Castro, S. (2017, September). How handheld mobile device size and hand location may 60 affect divided attention. 
In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (pp. 1370-1374). Sage CA: Los Angeles, CA: SAGE Publications. Conti, A., Dlugosch, C., Vilimek, R., Keinath, A., & Bengler, K. (2012). An assessment of cognitive workload using detection response tasks. In N. A. Stanton (Ed.), Advances in human aspects of road and rail transportation (pp. 735–743). Boca Raton, FL: CRC Press. Cooper, J. M., Castro, S. C., & Strayer, D. L. (2016, September). Extending the detection response task to simultaneously measure cognitive and visual task demands. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (pp. 1962-1966). Sage CA: Los Angeles, CA: SAGE Publications. Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. Tutorials in Quantitative Methods for Psychology, 1, 42-45. Cowan, N. (1999). An embedded-processes model of working memory. Models of Working Memory: Mechanisms of Active Maintenance and Executive Control, 20, 506. Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11(6), 671-684. Davis, R. (1956). The limits of the “psychological refractory period.” Quarterly Journal of Experimental Psychology, 8(1), 24-38. Donkin, C., Brown, S. D., & Heathcote, A. (2009). The overconstraint of response time models: Rethinking the scaling problem. Psychonomic Bulletin & Review, 16(6), 1129-1135. Duke, É., & Montag, C. (2017). Smartphone addiction, daily interruptions and self- reported productivity. Addictive Behaviors Reports, 690-695. doi:10.1016/j.abrep.2017.07.002 Efendioglu, A. (2016). How do the cognitive load, self-efficacy and attitude of pre-service teachers shift in the multimedia science learning process?. Educational Research and Reviews, 11(8), 743-764. Eggemeier, F. T. (1988). Properties of workload assessment techniques. In Advances in psychology (Vol. 52, pp. 41-62). 
Amsterdam, North-Holland: Elsevier. Eidels, A., Donkin, C., Brown, S. D., & Heathcote, A. (2010). Converging measures of workload capacity. Psychonomic Bulletin & Review, 17(6), 763-771. 61 Engle, R. W., Kane, M. J., & Tuholski, S. W. (1999). Individual differences in working memory capacity and what they tell us about controlled attention, general fluid intelligence and functions of the prefrontal cortex. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 102-134). New York, NY: Cambridge University Press. Engström, J., Johansson, E., & Östlund, J. (2005). Effects of visual and cognitive load in real and simulated motorway driving. Transportation Research: Part F, 8, 97120. doi:10.1016/j.trf.2005.04.012 Engström, J., Markkula, G., Victor, T., & Merat, N. (2017). Effects of cognitive load on driving performance: The cognitive control hypothesis. Human Factors, 59(5), 734-764. Fox, J. R., Park, B., & Lang, A. (2007). When available resources become negative resources: The effects of cognitive overload on memory sensitivity and criterion bias. Communication Research, 34(3), 277-296. Fernandez, R., Shah, S., Rosenman, E. D., Kozlowski, S. W., Parker, S. H., & Grand, J. A. (2017). Developing team cognition: A role for simulation. Simulation in Healthcare: Journal of the Society for Simulation in Healthcare, 12(2), 96. Granholm, E., Asarnow, R. F., Sarkin, A. J., & Dykes, K. L. (1996). Pupillary responses index cognitive resource limitations. Psychophysiology, 33(4), 457461. Hanks, T. D., Ditterich, J., & Shadlen, M. N. (2006). Microstimulation of macaque area LIP affects decision-making in a motion discrimination task. Nature Neuroscience, 9(5), 682. Harbluk, J. L., Noy, Y., Trbovich, P. L., & Eizenman, M. (2007). An on-road assessment of cognitive distraction: Impacts on drivers’ visual behavior and braking performance. Accident Analysis & Prevention, 39, 372-379. doi:10.1016/j.aap.2006.08.013 Hart, S. 
G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. Advances in Psychology, 52, 139183. Heathcote, A. (2004). Fitting Wald and ex-Wald distributions to response time data: An example using functions for the S-PLUS package. Behavior Research Methods, 36, 678–694. Heathcote, A., Brown, S.D. & Wagenmakers, E-J. (2015). An introduction to good practices in cognitive modeling. In B. U. Forstmann & E.-J.Wagenmakers (Eds.), An introduction to model-based cognitive neuroscience. New York, NY: 62 Springer. Heathcote, A., Lin, Y. S., Reynolds, A., Strickland, L., Gretton, M., & Matzke, D. (2019). Dynamic models of choice. Behavior Research Methods, 51(2), 961-985. Heathcote, A., Loft, S., & Remington, R. W. (2015). Slow down and remember to remember! A delay theory of prospective memory costs. Psychological Review, 122(2), 376-410. Heitz, R. P. (2014). The speed-accuracy tradeoff: History, physiology, methodology, and behavior. Frontiers in Neuroscience, 8, 150. Howard, Z. L., Evans, N. J., Innes, R., Brown, S., & Eidels, A. (in press). Is multitasking just a form of difficulty?. Huk, A.C., Katz, L.N., & Yates, J.L. (2014) Accumulation of evidence in decision making. In D. Jaeger & R. Jung (Eds.), Encyclopedia of computational neuroscience. New York, NY: Springer. International Organization for Standardization. (2016). Road Vehicles—Transport Information and Control Systems—Detection–Response Task (DRT) for Assessing Attentional Effects of Cognitive Load in Driving. Geneva, Switzerland: ISO; 2016. ISO 17488. Janssen, C. P., Gould, S. J., Li, S. Y., Brumby, D. P., & Cox, A. L. (2015). Integrating knowledge of multitasking and interruptions across different perspectives and research methods. International Journal of Human-Computer Studies, 79, 1-5. doi:10.1016/j.ijhcs.2015.03.002 Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall. Kahneman, D., & Beatty, J. (1966). 
Pupil diameter and load on memory. Science, 154(3756), 1583-1585. Kane, M. J., & Engle, R. W. (2003). Working-memory capacity and the control of attention: The contributions of goal neglect, response competition, and task set to Stroop interference. Journal of Experimental Psychology: General, 132(1), 47. Kantowitz, B., & Knight, J. (1976). On experimenter-limited processes. Psychological Review, 83(6), 502-507. Karayanidis, F., Coltheart, M., Michie, P. T., & Murphy, K. (2003). Electrophysiological correlates of anticipatory and poststimulus components of task switching. Psychophysiology, 40(3), 329-348. Kass, S. J., Cole, K. S., & Stanny, C. J. (2007). Effects of distraction and experience on situation awareness and simulated driving. Transportation Research Part F: 63 Traffic Psychology and Behaviour, 10(4), 321-329. doi:10.1016/j.trf.2006.12.002 Klauer, K. C. (2010). Hierarchical multinomial processing tree models: A latent–trait approach. Psychometrika, 75, 70-98. König, C. J., Oberacher, L., & Kleinmann, M. (2010). Personal and situational determinants of multitasking at work. Journal of Personnel Psychology, 9, 99103. Kramer, A. F., Sirevaag, E. J., & Braune, R. (1987). A psychophysiological assessment of operator workload during simulated flight missions. Human Factors, 29(2), 145-160. Laming, D.R.J. (1968). Information theory of choice-reaction times. London, England: Academic Press. Leite, F. P., & Ratcliff, R. (2010). Modeling reaction time and accuracy of multiplealternative decisions. Attention, Perception, & Psychophysics, 72(1), 246-273. Levy, J., Pashler, H., & Boer, E. (2006). Central interference in driving: Is there any stopping the psychological refractory period? Psychological Science, 17(3), 228235. Logan, G. D. (1980). Attention and automaticity in Stroop and priming tasks: Theory and data. Cognitive Psychology, 12(4), 523-553. Logan, G. D., Van Zandt, T., Verbruggen, F., & Wagenmakers, E. J. (2014). 
On the ability to inhibit thought and action: General and special theories of an act of control. Psychological Review, 121(1), 66-95. doi:10.1037/a0035230. Lohani, M., Payne, B. R., & Strayer, D. L. (2019). A review of psychophysiological measures to assess cognitive states in real-world driving. Frontiers in Human Neuroscience, 13. Luce, R. D. (1986). Response times: Their role in inferring elementary mental organization (No. 8). New York, NY: Oxford University Press. Ly, A., Boehm, U., Heathcote, A., Turner, B.M., Forstmann, B., Marsman, M., & Matzke, D. (2017). A flexible and efficient hierarchical Bayesian approach to the exploration of individual differences in cognitive-model-based neuroscience. In A.A. Moustafa (Ed.), Computational models of brain and behavior (pp. 467-480). London, England: Wiley Blackwell. Marsman, M., Maris, G., Bechger, T., & Glas, C. (2016). What can we learn from plausible values? Psychometrika, 81(2), 274-289. 64 Matzke, D., Dolan, C. V., Batchelder, W. H., & Wagenmakers, E.-J. (2015). Bayesian estimation of multinomial processing tree models with heterogeneity in participants and items. Psychometrika, 80, 205-235. Matzke, D., Hughes, M., Badcock, J.C., Michie, P. & Heathcote, A. (2017). Failures of cognitive control or attention? The case of stop-signal deficits in schizophrenia, Attention, Perception & Psychophysics, 79, 1078-1086. Matzke, D., Love, J. & Heathcote, A. (2017). A Bayesian approach for estimating the probability of trigger failures in the stop-signal paradigm. Behavior Research Methods, 49, 267-281. McClelland, J. L., & Cleeremans, A. (2009) Connectionist models. In T. Byrne, A. Cleeremans, & P. Wilken (Eds.), Oxford companion to consciousness. New York, NY: Oxford University Press. McLeod, P. (1987). Visual reaction time and high-speed ball games. Perception, 16(1), 49-59. Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. 
Psychological Review, 63(2), 81. Molenaar, D., Bolsinova, M., & Vermunt, J. K. (2018). A semi-parametric within-subject mixture approach to the analyses of responses and response times. British Journal of Mathematical and Statistical Psychology, 71(2), 205-228. Monsell, S. (2003). Task switching. Trends in Cognitive Sciences, 7(3), 134-140. doi:10.1016/S1364-6613(03)00028-7 Morey, R. D. (2008). Confidence intervals from normalized data: A correction to Cousineau (2005). Tutorial in Quantitative Methods for Psychology, 4, 61-64. Motzkus, C. J., Getty, D. J., Campos, A., Cooper, J. M., & Strayer, D. L. (2018, September). Utilizing a remote LED stimulus to concurrently measure cognitive and visual task demand. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (pp. 1-5). Los Angeles, CA: SAGE Publications. Nass, C. (2013, May 10). The Myth of Multitasking, (I. Flatow, Interviewer). National Public Radio. Retrieved from https://www.npr.org/2013/05/10/182861382/themyth-of-multitasking. Navon, D., & Gopher, D. (1979). On the economy of the human-processing system. Psychological Review, 86(3), 214. Neisser, U. (2014). Cognitive psychology: Classic edition. New York, NY: Psychology Press. 65 Norman, D. A., & Shallice, T. (1986). Attention to action. In Consciousness and selfregulation (pp. 1-18). Springer, Boston, MA. Ophir, E., Nass, C., & Wagner, A. D. (2009). Cognitive control in media multitaskers. Proceedings of the National Academy of Sciences, 106(37), 1558315587. Pachella, R. G. (1973). The interpretation of reaction time in information processing research (No. TR-45). Michigan University Ann Arbor Human Performance Center, Ann Arbor, MI. Padilla, L. M. K., Castro, S. C., Quinan, P. S., Ruginski, I. T., & Creem-Regehr, S. H. (2019). Toward objective evaluation of working memory in visualizations: A case study using pupillometry and a dual-task paradigm. IEEE Transactions on Visualization and Computer Graphics, 26(1), 332-342. 
Palada, H., Neal, A., Tay, R., & Heathcote, A. (2018). Understanding the causes of adapting, and failing to adapt, to time pressure in a complex multistimulus environment. Journal of Experimental Psychology: Applied, 24(3), 380.
Pashler, H. (1993). Dual-task interference and elementary mental mechanisms. In D. Meyer & S. Kornblum (Eds.), Attention and performance XIV (pp. 245-264). Cambridge, MA: MIT Press.
Pashler, H. (1994). Dual-task interference in simple tasks: Data and theory. Psychological Bulletin, 116(2), 220-244. doi:10.1037/0033-2909.116.2.220
Pashler, H., & Johnston, J. C. (1989). Chronometric evidence for central postponement in temporally overlapping tasks. The Quarterly Journal of Experimental Psychology, 41(1), 19-45.
Pashler, H., Johnston, J. C., & Ruthruff, E. (2001). Attention and performance. Annual Review of Psychology, 52, 629-651. doi:10.1146/annurev.psych.52.1.629
Pashler, H., Jolicœur, P., Dell'Acqua, R., Crebolder, J., Goschke, T., De Jong, R., ... Hazeltine, E. (2000). Task switching and multitask performance. In S. Monsell & J. Driver (Eds.), Control of cognitive processes: Attention and performance XVIII (pp. 275-423). Cambridge, MA: MIT Press.
Posner, M. I. (1964). Information reduction in the analysis of sequential tasks. Psychological Review, 71, 491-504.
Posner, M. I. (1978). Chronometric explorations of mind. Oxford, England: Lawrence Erlbaum.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32(1), 3-25.
Posner, M. I., & Rossman, E. (1965). Effect of size and location of informational transforms upon short-term retention. Journal of Experimental Psychology, 70, 496-505.
Prinzmetal, W., McCool, C., & Park, S. (2005). Attention: Reaction time and accuracy reveal different mechanisms. Journal of Experimental Psychology: General, 134(1), 73.
R Core Team (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. URL https://www.R-project.org/
Ranger, J., & Kuhn, J. T. (2012). A flexible latent trait model for response times in tests. Psychometrika, 77(1), 31-47.
Ranney, T. A., Baldwin, G. H., Smith, L. A., Mazzae, E. N., & Pierce, R. S. (2014). Detection response task (DRT) evaluation for driver distraction measurement application (Report No. DOT HS 812 077).
Ratcliff, R. (2015). Modeling one-choice and two-choice driving tasks. Attention, Perception, & Psychophysics, 77(6), 2134-2144.
Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation, 20(4), 873-922.
Ratcliff, R., & Rouder, J. N. (1998). Modeling response times for two-choice decisions. Psychological Science, 9(5), 347-356.
Ratcliff, R., & Strayer, D. L. (2014). Modeling simple driving tasks with a one-boundary diffusion model. Psychonomic Bulletin & Review, 21(3), 577-589. doi:10.3758/s13423-013-0541-x
Ratcliff, R., & Van Dongen, H. P. A. (2011). Diffusion model for one-choice reaction-time tasks and the cognitive effects of sleep deprivation. Proceedings of the National Academy of Sciences of the United States of America (pp. 11285-11290). http://doi.org/10.2307/27978781
Rogers, Y. (2009, November). The changing face of human-computer interaction in the age of ubiquitous computing. In Symposium of the Austrian HCI and Usability Engineering Group (pp. 1-19). Berlin/Heidelberg, Germany: Springer.
Rogers, R. D., & Monsell, S. D. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124(2), 207.
Rubinstein, J. S., Meyer, D. E., & Evans, J. E. (2001). Executive control of cognitive processes in task switching. Journal of Experimental Psychology: Human Perception and Performance, 27(4), 763-797. doi:10.1037/0096-1523.27.4.763
Sala, S. D., Baddeley, A., Papagno, C., & Spinnler, H. (1995). Dual-task paradigm: A means to examine the central executive. Annals of the New York Academy of Sciences, 769(1), 161-172.
Sanbonmatsu, D. M., Strayer, D. L., Medeiros-Ward, N., & Watson, J. M. (2013). Who multi-tasks and why? Multi-tasking ability, perceived multi-tasking ability, impulsivity, and sensation seeking. PLoS ONE, 8(1), 1-8. doi:10.1371/journal.pone.0054402
Schmiedek, F., Oberauer, K., Wilhelm, O., Süß, H.-M., & Wittmann, W. W. (2007). Individual differences in components of reaction time distributions and their relations to working memory and intelligence. Journal of Experimental Psychology: General, 136(3), 414-429.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: I. Detection, search, and attention. Psychological Review, 84(1), 1.
Schubert, T., Fischer, R., & Stelzel, C. (2008). Response activation in overlapping tasks and the response-selection bottleneck. Journal of Experimental Psychology: Human Perception and Performance, 34(2), 376.
Schumacher, E. H., Seymour, T. L., Glass, J. M., Fencsik, D. E., Lauber, E. J., Kieras, D. E., & Meyer, D. E. (2001). Virtually perfect time sharing in dual-task performance: Uncorking the central cognitive bottleneck. Psychological Science, 12(2), 101-108.
Sheridan, T. B., & Stassen, H. G. (1979). Definitions, models and measures of human workload. In Mental workload (pp. 219-233). Boston, MA: Springer.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84(2), 127.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28(9), 1059-1074. doi:10.1068/p2952
Sirevaag, E. J., Kramer, A. F., Coles, M. G., & Donchin, E. (1989). Resource reciprocity: An event-related brain potentials analysis. Acta Psychologica, 70(1), 77-97.
Sitterding, M. C., Ebright, P., Broome, M., Patterson, E. S., & Wuchner, S. (2014). Situation awareness and interruption handling during medication administration. Western Journal of Nursing Research, 36(7), 891-916. doi:10.1177/0193945914533426
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4), 583-639.
Stojmenova, K., & Sodnik, J. (2018). Detection-response task—Uses and limitations. Sensors, 18(2), 594.
Strayer, D. L., Biondi, F., & Cooper, J. M. (2017). Dynamic workload fluctuations in driver/nondriver conversational dyads. In D. V. McGehee, J. D. Lee, & M. Rizzo (Eds.), Driving Assessment 2017: International Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design (pp. 362-367). Iowa City, IA: The University of Iowa, Public Policy Center.
Strayer, D. L., & Drews, F. A. (2007). Cell-phone-induced driver distraction. Current Directions in Psychological Science, 16, 128-131.
Strayer, D. L., Drews, F. A., & Johnston, W. A. (2003). Cell phone-induced failures of visual attention during simulated driving. Journal of Experimental Psychology: Applied, 9(1), 23-32. doi:10.1037/1076-898X.9.1.23
Strayer, D. L., & Fisher, D. L. (2016). SPIDER: A framework for understanding driver distraction. Human Factors, 58(1), 5-12. doi:10.1177/0018720815619074
Strayer, D. L., & Johnston, W. A. (2001). Driven to distraction: Dual-task studies of simulated driving and conversing on a cellular telephone. Psychological Science, 12(6), 462-466.
Strayer, D. L., Turrill, J., Coleman, J. R., Ortiz, E. V., & Cooper, J. M. (2014). Measuring cognitive distraction in the automobile II: Assessing in-vehicle voice-based interactive technologies. Accident Analysis & Prevention, 372, 379.
Strayer, D. L., Turrill, J., Cooper, J. M., Coleman, J. R., Medeiros-Ward, N., & Biondi, F. (2015). Assessing cognitive distraction in the automobile. Human Factors, 57(8), 1300-1324. doi:10.1177/0018720815575149
Strayer, D. L., Watson, J. M., & Drews, F. A. (2011). Cognitive distraction while multitasking in the automobile. Psychology of Learning and Motivation: Advances in Research and Theory, 54, 29-58.
Strickland, L., Loft, S., Remington, R. W., & Heathcote, A. (in press). Racing to remember: A theory of decision control in event-based prospective memory. Psychological Review.
Strickland, L., Loft, S., Remington, R. W., & Heathcote, A. (2017). Accumulating evidence for the delay theory of prospective memory costs. Journal of Experimental Psychology: Learning, Memory, & Cognition, 43, 1616-1629.
Tillman, G., Strayer, D., Eidels, A., & Heathcote, A. (2017). Modeling cognitive load effects of conversation between a passenger and driver. Attention, Perception, & Psychophysics, 79(6), 1795-1803.
Tombu, M., & Jolicœur, P. (2004). Virtually no evidence for virtually perfect time-sharing. Journal of Experimental Psychology: Human Perception and Performance, 30(5), 795.
Turner, B. M., Sederberg, P. B., Brown, S. D., & Steyvers, M. (2013). A method for efficiently sampling from distributions with correlated dimensions. Psychological Methods, 18(3), 368-384.
Usher, M., & McClelland, J. L. (2001). The time course of perceptual choice: The leaky, competing accumulator model. Psychological Review, 108(3), 550.
Voss, A., Nagler, M., & Lerche, V. (2013). Diffusion models in experimental psychology. Experimental Psychology, 60, 385-402. https://doi.org/10.1027/1618-3169/a000218
Wagenmakers, E. J. (2009). Methodological and empirical developments for the Ratcliff diffusion model of response times and accuracy. European Journal of Cognitive Psychology, 21(5), 641-671.
Wang, C., Chang, H. H., & Douglas, J. A. (2013). The linear transformation model with frailties for the analysis of item response times. British Journal of Mathematical and Statistical Psychology, 66(1), 144-168.
Welford, A. T. (1952). The 'psychological refractory period' and the timing of high-speed performance—A review and a theory. British Journal of Psychology, 43(1), 2-19.
Wickelgren, W. A. (1977). Speed-accuracy tradeoff and information processing dynamics. Acta Psychologica, 41(1), 67-85.
Wickens, C. D. (1984). Processing resources in attention. In R. Parasuraman & R. Davies (Eds.), Varieties of attention (pp. 63-101). New York, NY: Academic Press.
Wickens, C. D. (1991). Processing resources and attention. In Multiple-task performance (pp. 3-34).
Wickens, C. D. (2002). Situation awareness and workload in aviation. Current Directions in Psychological Science, 11(4), 128-133.
Wickens, C. D. (2008). Multiple resources and mental workload. Human Factors, 50(3), 449-455. doi:10.1518/001872008X288394
Wickens, C. D., & McCarley, J. S. (2019). Applied attention theory. Boca Raton, FL: CRC Press.
Williams, L. H., & Drew, T. (2017). Distraction in diagnostic radiology: How is search through volumetric medical images affected by interruptions? Cognitive Research: Principles and Implications, 2(1), 12. |
| Reference URL | https://collections.lib.utah.edu/ark:/87278/s6arr40x |



