| Title | Privacy enabled crowdsourced transmitter localization using adjusted measurements |
| Publication Type | thesis |
| School or College | College of Engineering |
| Department | Computing |
| Author | Singh, Harsimran |
| Date | 2018 |
| Description | We address the problem of location privacy in the context of crowdsourced localization of spectrum offenders where participating receivers report received signal strength (RSS) measurements and their location to a central controller. We present a novel approach that we call the adjusted measurement approach, in which we generate pseudolocations for participating receivers and report these pseudolocations along with adjusted RSS measurements as if the measurements were made at the pseudolocations. The RSS values are adjusted by representing those as a weighted linear combination of the RSS values at the receivers, where receivers closer to the false location have a higher weight than those far away. We use two RSS datasets, one from a cluttered office (indoor) and another from roadways in Phoenix, Arizona (outdoor) to evaluate our approach. We compare the localization error of our approach with that of the naive approach that simply adds noise to locations. Our results demonstrate that location privacy can be preserved without a significant increase in the localization error. We also formulate an adversary attack that attempts to solve the inverse problem of determining the true locations of the receivers from their false locations. Our evaluations show that the adversary does no better than random guessing of true locations in the monitored area. |
| Type | Text |
| Publisher | University of Utah |
| Dissertation Name | Master of Science |
| Language | eng |
| Rights Management | © Harsimran Singh |
| Format | application/pdf |
| Format Medium | application/pdf |
| ARK | ark:/87278/s65r2wzd |
| Setname | ir_etd |
| ID | 1748483 |
| OCR Text | PRIVACY ENABLED CROWDSOURCED TRANSMITTER LOCALIZATION USING ADJUSTED MEASUREMENTS

by Harsimran Singh

A thesis submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Master of Science in Computer Science

School of Computing
The University of Utah
August 2018

Copyright © Harsimran Singh 2018. All Rights Reserved.

The University of Utah Graduate School

STATEMENT OF DISSERTATION APPROVAL

The dissertation of Harsimran Singh has been approved by the following supervisory committee members:

Sneha Kasera, Chair(s), date approved 10 May 2018
Neal Patwari, Member, date approved 10 May 2018
Aditya Bhaskara, Member, date approved 10 May 2018

and by Ross Whitaker, Chair/Dean of the Department/College/School of Computing, and by David B. Kieda, Dean of The Graduate School.

ABSTRACT

We address the problem of location privacy in the context of crowdsourced localization of spectrum offenders where participating receivers report received signal strength (RSS) measurements and their location to a central controller. We present a novel approach that we call the adjusted measurement approach, in which we generate pseudolocations for participating receivers and report these pseudolocations along with adjusted RSS measurements as if the measurements were made at the pseudolocations. The RSS values are adjusted by representing those as a weighted linear combination of the RSS values at the receivers, where receivers closer to the false location have a higher weight than those far away. We use two RSS datasets, one from a cluttered office (indoor) and another from roadways in Phoenix, Arizona (outdoor) to evaluate our approach. We compare the localization error of our approach with that of the naive approach that simply adds noise to locations. Our results demonstrate that location privacy can be preserved without a significant increase in the localization error.
We also formulate an adversary attack that attempts to solve the inverse problem of determining the true locations of the receivers from their false locations. Our evaluations show that the adversary does no better than random guessing of true locations in the monitored area.

For my parents, family and friends.

CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF TABLES
NOTATION AND SYMBOLS
ACKNOWLEDGEMENTS

CHAPTERS

1. INTRODUCTION
2. RELATED WORK
   2.1 Location Privacy
   2.2 Transmitter Localization
3. ADVERSARY MODEL
   3.1 Model 1: Malicious Central Controller
   3.2 Model 2: Malicious Third Party Applications
4. PRIVACY METRIC
5. METHODOLOGY
   5.1 Naive Approach: Adding Noise
   5.2 Adjusted Measurements
      5.2.1 Our Method: Adjusted Measurement with Random Locations
6. EVALUATION
   6.1 Indoor: Cluttered Office Space
      6.1.1 Simple Noise Addition Approach
      6.1.2 Adjusted Measurement Approach
   6.2 Outdoor: Phoenix, Arizona
      6.2.1 Simple Noise Addition Approach
      6.2.2 Adjusted Measurement Approach
7. ADVERSARY ATTACK
   7.1 Inverse Problem
   7.2 Inverse Attack
   7.3 Evaluation
8. USER TUNABLE PRIVACY SETTINGS
9. PRACTICAL CONSIDERATIONS
10. CONCLUSION

REFERENCES

LIST OF FIGURES

1.1 Adjusted measurement idea
5.1 RSS field contour plots in an area
5.2 Adjusted measurement approach
5.3 Error in RSS value estimation for different methods of interpolation (a) weighing method 1: $w_i = d_i^{-c}$ (b) weighing method 2: $w_i = e^{-d_i/c}$
6.1 Localization error for varying group size for different noise levels in (a) simple noise addition approach (b) adjusted measurement approach
6.2 Location error (m) for varying group sizes when false locations are randomly sampled
6.3 Transmitter location (Red) along with receiver positions (Blue)
7.1 Loss function vs. iteration of attack algorithm
7.2 Adversary guesses, true locations and reported false location in an area
7.3 Mapping of the adversary’s guesses to true receiver locations
7.4 Matching cost between the true locations and (a) guesses from the adversary’s inverse attack and (b) random guesses of true locations
7.5 Matching cost between (a) adversary guesses of true locations and false locations reported (b) adversary guesses to one another for 100 runs of adversary’s inverse attack
7.6 Matching distance CDF for adversary attack for varying group sizes
7.7 Matching distance CDF for random guess of true locations for varying group sizes

LIST OF TABLES

6.1 Localization error (in m) for various transmitter locations for simple noise addition method
6.2 Localization error (in m) for various transmitter locations for adjusted measurement approach
6.3 Localization error (in m) for various transmitter locations for adjusted measurement approach with random sample

NOTATION AND SYMBOLS

L_t: vector of true locations of the receivers
R_t: RSS values observed by receivers at true locations L_t
L_f: vector of false locations reported to the central controller
R_f: vector of RSS values reported along false locations L_f
L_a: vector of locations representing adversary’s guess of true locations
R_a: vector of RSS values representing adversary’s guess of RSS values at L_a
R_c: vector of RSS values calculated at L_f using L_a and R_a
f: algorithm used to adjust RSS measurements at the false locations

ACKNOWLEDGEMENTS

It feels like yesterday when I started my Masters. Time just flew by and two years have passed. These two years have been a wonderful experience, and I would like to thank several people who have contributed to it. I have been fortunate to have great advice from brilliant researchers, and the undying support of my friends and family to keep my spirits up throughout my stay at the U.

First of all, I would like to thank my advisor Sneha Kasera for his guidance and support. His remarkable intuition about various problems and the simplicity with which he approaches them have always amazed me. His positive attitude and constant push and motivation during hard times to keep me on track have really helped me a lot during my thesis. It is one of the several simple yet great lessons to keep in mind that I have learned from him for my journey ahead.
Next, I would also like to thank Aditya Bhaskara for several discussions which gave me valuable insights about my research. Many times, when I was stuck, these discussions resulted in a guiding direction for my thesis. I have always admired his ability to quickly grasp a problem even in a new domain and approach it from basics in a calm and meticulous way, and I can only hope that I learned some of it too. I would also like to thank Neal Patwari for being extremely patient with me and for dealing with my somewhat silly basic questions about RF sensing. This helped me understand basics which proved very helpful in making progress throughout my research.

I am thankful to my friends for the fun we have had, the assignments we have solved and the deadlines we have gone through together. Last but not least, I would like to thank my family, my grandparents, parents and my little sister for their immense support and encouragement throughout, and especially my mother for never failing to keep a check on me. It is to them that I dedicate this thesis.

CHAPTER 1 INTRODUCTION

Many users are willing to have their networked devices participate in distributed sensing and data collection applications. As an example, car drivers would like to report traffic conditions and car velocities, if that allows city and state authorities to better plan the road network. Such reporting can also lead to short-term gains for car drivers in terms of a centralized service providing safer, less congested routes. Such a distributed and crowdsourced/crowdsensed data collection is expected to grow in the future.

However, the participants may be seriously concerned about their privacy. They do not wish to have their data associated with their identities or locations. In many cases, a participating node reports its identity, its location, and its measurement (car velocity measurements, could also be radio frequency measurements, etc.)
to a central controller which then makes these data available for different applications. However, such a data collection system does not necessarily preserve the location privacy of the participating users [8, 20, 34]. Users can be linked to their locations, and multiple pieces of such information over a period of time can be linked together to profile users, which leads to unsolicited targeted advertisements or price discrimination [3]. Even worse, a user’s habits, personal and private preferences, religious beliefs, and political affiliations can be inferred from the user’s whereabouts. Therefore, users who are willing to participate in the crowdsourcing system for societal good or some incentives will be uncomfortable, or in the worst case might not even participate, if they feel that their privacy is compromised.

The traditional way to preserve location privacy is to add noise to the location with the hope that the measured data would still be useful and would not severely reduce the quality of the service or the accuracy/utility of the application [10]. Location-based services (LBS), where the application response is based on the user’s geographic location [16, 18, 34], use such an approach. Some examples of LBS include location-aware task reminders (pick up groceries when near a store), advertisements, and emergency services. For these applications, the coarse location of the user is acceptable. Even for applications like building a weather map of a city, which use both location and the measurements, the measurement does not change over a few hundred meters. Thus, adding noise to the location does not significantly reduce the accuracy of the application.

However, for some applications the measurements are closely tied to the location, i.e., there might be a significant change in measurements made at two locations only a few meters apart.
For example, various localization applications based on wireless sensor networks rely heavily on the accuracy of the reported sensor measurements as well as the sensors’ locations [4, 5, 19]. The traditional method of adding noise as done in LBS may lead to a large drop in utility when used in applications that are highly sensitive to location information, as the measurements at the false locations are expected to be significantly different from those at the actual locations.

We address the problem of location privacy in the context of crowdsourced localization of spectrum offenders as in the work of Khaledi et al. [19]. In this context, participating wireless receivers report their location and the received signal strength (RSS) they measure of signals emitted from transmitters within their range to a central controller. The central controller collects the RSS and location data and feeds this information as input to localization algorithms to localize spectrum offenders. It is important to note that the central controller has the location data of all the participating receivers.

In this thesis, we consider two adversary models where different entities in the system can be adversarial. In our first model, we consider the central controller to be the adversary. In our second model, the central controller is trusted but the third-party applications using the data at the central controller are adversarial and try to infer the location of the receivers/users from the data. As we show in the thesis, the solution to the location privacy problem corresponding to the first adversary model also applies to the second adversary model. Therefore, unless we mention the second model explicitly, in formulating the problem, in developing the location privacy solutions, and in our evaluations, we assume the first adversary model.

We present a novel approach towards solving the problem of preserving a participating receiver’s location privacy.
In our approach, which we call the adjusted measurement approach, we generate pseudolocations and report the pseudolocations along with adjusted measurements as if the measurements were made at the pseudolocations. This idea of adjusting RSS measurements for pseudolocations is shown in Fig. 1.1. In this figure, receivers, Rx1-4, are measuring the RSS of the signals transmitted by the transmitter, Tx, and reporting their locations and the measured RSS values to the data collector. However, for the purpose of protecting its location privacy, an adjusted measurement RSS2’ along with the pseudolocation (x2’, y2’) is reported for Rx2 instead of its actual location (x2, y2) and its actual measurement RSS2.

Figure 1.1. Adjusted measurement idea

The key challenge here is to adjust the RSS measurements of signals from an unknown transmitter suitably to minimize the impact on the transmitter localization accuracy while preserving the participating receiver’s privacy. Ideally, the variation of RSS values across the monitored area can be approximated using the path loss model for radio waves [26]. Approximating the RSS values using a path loss model requires knowledge of the transmitter’s characteristics including transmit power and antenna gain. However, for localizing an unknown transmitter for the purpose of monitoring, its characteristics are unavailable.

Existing work has proposed forming a collaborative group to achieve k-anonymity where a reported (location, measurement) pair could belong to any one of the k receivers. However, reporting true locations can still lead to privacy violations.
Adversaries can correlate the reported locations with other meta information to identify the participating receivers [12], e.g., a home owner participating in crowdsourcing from his home can be identified by the reported location and directory information. Therefore, it is important that true locations are not reported.

We propose a collaborative approach where the receivers form groups. Within each group, the receivers pick one among themselves as a leader and report their true locations and true RSS measurements to the leader. The leader, then, chooses false locations for these receivers by randomly sampling in a region that includes all the group members and adjusts the corresponding RSS values for the false locations, based on the true RSS measurements it receives. The RSS values are adjusted by representing those as a weighted linear combination of the true RSS values at the receivers within a group, where receivers closer to the false location have a higher weight than those far away. The details of this approach are described in Chapter 5. Finally, the leader reports the falsified locations and the corresponding adjusted RSS values to the central controller. The leaders of all groups perform the above tasks.

We use two RSS datasets, one from a cluttered office (indoor) and another from the city of Phoenix, Arizona (outdoor) to evaluate our adjusted measurement approach. We also compare the localization error of our adjusted measurement approach with the naive approach which simply adds noise to locations for varying levels of noise addition. We observe that the transmitter localization error increases arbitrarily with increasing noise levels for both the datasets for the naive approach. In the indoor environment, we find that the localization error increases from 1.73 meters to 8.5 meters, and in the outdoor environment it increases from 134.24 meters to 232.77 meters.
However, our adjusted measurement approach significantly reduces this increase in localization error in both the indoor and outdoor settings. Specifically, in the indoor environment, with location noise uniformly distributed in (-14, 14) meters along both latitude and longitude, the localization error reduces from 8.5 meters to 3 meters. In the outdoor environment, with location noise uniformly distributed in (-350, 350) meters along both latitude and longitude, the localization error reduces from 232.77 meters to 167.02 meters. Our method, which uses randomly selected false locations for the receivers, has a localization error of 1.8 meters and 155.60 meters in the indoor and outdoor settings, respectively.

We also formulate an adversary attack that attempts to solve the inverse problem of determining the true locations of the receivers from their false locations. Our evaluations show that the adversary does no better than random guessing of true locations in the monitored area.

In summary, in this thesis, we make the following contributions:

• We develop a collaborative interpolation based adjusted measurement approach to adjust RSS measurements at false locations such that the transmitter localization accuracy is least affected.
• We evaluate our adjusted measurement approach in both indoor and outdoor environments to show that the adjusted measurement approach significantly improves accuracy in comparison to simple noise addition.
• We formulate an inverse attack, by an adversary, to determine the true receiver locations using false locations and their adjusted RSS measurements. We show that the adversary does no better than random guessing of true locations in the monitored area, thereby demonstrating the privacy preserving nature of our approach.

CHAPTER 2 RELATED WORK

Related work includes (a) different threats associated with location sharing and techniques used to preserve the privacy of the user, and (b) various approaches and applications for transmitter localization.
2.1 Location Privacy

Sharing of location can have various threats associated with it. These threats are well studied in the existing literature [3, 8, 17, 20, 34]. Users can be identified even if they share their location sporadically [8]. To reduce the threat to location privacy, certain applications anonymize or obfuscate their data [6, 7, 9]. However, knowledge of the social graph of the users (relations among the users) can help an adversary to deanonymize their location traces [30]. Counterintuitively, a user’s location sharing also has the potential to diminish the privacy of others [33].

In the obfuscation approaches, a user can report its true identity but, instead of the true location, it reports a nearby but false location [7]. Apart from being ineffective in preventing absence disclosure [27], obfuscation based approaches can cause considerable degradation in utility, which can be a deterrent to their deployment. k-anonymity approaches have been used to make a user indistinguishable from k-1 other users. These also incorporate a user defined privacy level based on the choice of k. Gedik et al. [9] show one such customizable k-anonymity system and alongside implement a spatial-cloaking algorithm which anonymizes the location and cloaks it in an area before forwarding the location information to an LBS server.

Collaborative approaches have also been used to preserve privacy for LBS [6, 29]. Shokri et al. [29] describe a collaborative privacy preserving approach called MobiCrowd which forms a peer-to-peer network and only queries the LBS if none of the peers has the required information for a given location. Chow et al. [6] have a similar approach of forming a peer-to-peer network to form a spatial cloaking region. The user can then filter out the results based on its precise location.
The above approaches are designed for LBS (where the user is the beneficiary of the data/information) and hence cannot be used directly for ‘reporting data’ at a false location. For our application, we need to report measurements at a false location, so we use a collaborative approach to adjust the measurements for the false locations before reporting them.

2.2 Transmitter Localization

Localization of an RF source has been extensively studied over several decades [32], primarily using time and time-difference measurements. RSS measurements preserve privacy in the sense that the receiver does not need to record the received signal itself, which may contain private data, and can provide accurate localization due to the high density of transceivers in our environments, for example using WiFi fingerprinting [2], in sensor networks [25], or as a complement to GPS for cellular localization [35].

More recently, opportunistic spectrum reuse emerged as a means to improve the efficiency of our use of the radio spectrum [11]. A collaborative sensing algorithm can identify the “holes” where secondary use of the spectrum may occur [23]. While [23] simply locates these holes, an alternative approach is to locate the transmitters and identify their gain patterns so that their coverage area can be calculated [22].

Primarily, it is assumed that one transmitter is located at a time, or that, if multiple transmitters are to be located, their signals can be separated at the receivers. During jamming attacks, a sophisticated adversary can make this impossible. When multiple signals cannot be separated, methods [19, 24] can localize multiple transmitters from RSS measurements. While [24] has relatively high time complexity and assumes that the number of transmitters is known, [19] estimates the number of transmitters and separates the problem over space into individual transmitter localization problems.
None of the above spectrum sensing approaches addresses the privacy of the user who participates in the system, which is seen as one of the major issues limiting the deployment of cognitive radio networks [13]. One privacy vulnerability is that the RSS measurements (without the coordinate) can be used to locate a receiver [14, 21], for example using RSS fingerprinting methods like [2] or maximum likelihood estimators as in [25]. Cryptographic methods can help by limiting the resolution of RSS information provided to the central coordinator [14]. Our method modifies both the RSS measurements and the provided coordinate in an effort to make the attacker unable to estimate the true receiver coordinate any better than the provided coordinate.

CHAPTER 3 ADVERSARY MODEL

We consider the following two adversary models in this thesis.

3.1 Model 1: Malicious Central Controller

In our work, users (corresponding to the receivers) wish to protect their location information from the central controller. Thus, we treat the controller as an adversary. The controller has access to all the readings of the receivers participating in the crowdsourcing system. These readings are reported in the form (Timestamp, Location, RSS), where Timestamp is the time when a measurement is made, and Location is an (x, y) tuple representing the latitude and longitude of the receiver’s location. We also assume that the central controller knows the number of participating receivers and the algorithm used to adjust the RSS measurements for the false location.

That the adversary has a complete knowledge of the algorithm used to adjust the RSS measurements is an important assumption. Since the output of this algorithm is a function of the true receiver locations and the true RSS values, the adversary could try to reverse engineer the algorithm and find out the true locations of the receivers (indeed, we consider such an attack in Chapter 7).
We assume that the adversary does not deploy nodes to locate the participating receivers when they are sending measurements to their leader or the central controller, as such a threat could exist with or without the adjusted measurement approach. In our proposed collaborative approach, we assume that the communication in each receiver group is secure, that the participants, including the group leader, are trustworthy, and that they do not collude with the adversary (i.e., the central controller).

3.2 Model 2: Malicious Third Party Applications

In this adversary model, the central controller is not an adversary but the applications that use the collected data are adversarial. This adversary model would then allow the participating receivers to report their true data and location to the central controller, which now adjusts the measurements. Such a model does not require group members to trust any leader receiver or each other. This model represents many scenarios in which users are willing to trust a central service but not necessarily other peers. Moreover, the receivers in this model need not communicate with each other. However, very importantly, the methodology for adjustment of measurements that we develop for the first model applies to this second model as well.

Note that each of the two models is considered disjointly and not in conjunction with the other model.

CHAPTER 4 PRIVACY METRIC

To measure the privacy of a user, we propose two metrics:

1. k-anonymity: This is one of the most widely used privacy metrics. Simply put, k-anonymity means that an adversary can narrow the identity of an individual down to a set of k people, but no smaller [31]. In our proposed solution, the central controller will be able to associate a measurement with a group, but not to any smaller subset of receivers in the group. So, our method achieves k-anonymity, where k is the number of receivers in the group.
By increasing k, we reduce the adversary’s ability to associate a measurement to a single receiver or user.

2. Proximity to true locations: Another way to measure privacy is by the extent to which the adversary can determine/guess the receiver locations. In our evaluation, we consider the “matching cost” between the true locations and the adversary’s guesses. For formal definitions, we refer to Chapter 6.

CHAPTER 5 METHODOLOGY

In this chapter, we first describe a simple noise addition approach followed by our proposed adjusted measurement approach and then our final method of random selection of false locations with adjusted measurements. In the simple noise addition approach, receivers simply add noise to their true locations in order to protect their privacy.

5.1 Naive Approach: Adding Noise

A simple way for a user to preserve his/her location privacy is to report a false location. Specifically, for some chosen noise level, the user can report a latitude, lat, and a longitude, lon, given by the following equations:

    lat = lat + random.uniform(-noise level, noise level)    (5.1)
    lon = lon + random.uniform(-noise level, noise level)    (5.2)

The higher the noise level, the further the false location is from the true one, on average, and thus the higher the privacy. However, as we move away from the true location, the RSS measurements which were made at the true location slowly stop making sense at the false location. Therefore, with increasing noise levels the localization error increases rapidly (see Chapter 6).

Besides the drop in the utility (transmitter localization), this approach has other concerns as well. Since the location is reported by the device itself and the false location is obtained by adding random noise to the true location, there is a possibility of averaging and linkage attacks by the central controller [12]. Furthermore, in this approach, the user is unaware of the number of other users in the system in its vicinity.
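As an illustration, Eqs. (5.1) and (5.2) can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in the thesis: the function name `add_location_noise` and the optional `rng` argument are assumptions introduced here, and the coordinates are treated as plain numbers in the same unit as the noise level.

```python
import random


def add_location_noise(lat, lon, noise_level, rng=random):
    """Return a false (lat, lon) following Eqs. (5.1)-(5.2).

    Each coordinate receives an independent offset drawn uniformly
    from [-noise_level, noise_level]. `rng` is a hypothetical hook
    (any object with a `uniform` method) used only for repeatability.
    """
    false_lat = lat + rng.uniform(-noise_level, noise_level)
    false_lon = lon + rng.uniform(-noise_level, noise_level)
    return false_lat, false_lon
```

Note that the RSS value reported with the false location is left unchanged in this approach, which is the source of the growth in localization error discussed above.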
Thus, the user may face difficulty in choosing the right noise level for the desired privacy guarantee.

5.2 Adjusted Measurements

It is challenging to preserve utility (i.e., localization accuracy) while reporting false locations. Our approach towards achieving both utility and privacy is to report a noisy location while carefully adjusting the reported RSS measurement. While natural, this idea can be tricky to implement. The RSS field in a region can exhibit complex behavior, making it difficult to determine a plausible RSS value at the false location. For instance, Fig. 5.1(a) and Fig. 5.1(b) show contour plots of the RSS field in a cluttered office space. The transmitter locations in these figures are (5.5, 4.1) and (3.2, 9.1), respectively. Note that the RSS contour lines deviate significantly from the concentric circles expected from a standard log-distance path loss model, showing that contours in the real world can be complex. This makes it hard, if not infeasible, to model the variation of RSS values across the monitored area with well-known propagation models. Moreover, since we have no prior information on the offending transmitter's characteristics, including transmit power, antenna type, and angle, we cannot use path loss models, which rely on the transmitter's location and characteristics.

Our proposal to overcome this challenge is to use collaboration among small groups of receivers (i.e., users). This idea is illustrated in Fig. 5.2. The participating receivers form a group and select a leader. They then report their (loc, rss) pairs to the leader, who is responsible for reporting these readings to the central controller. After receiving the readings from all the users in the group, the leader chooses a false location for each receiver by adding some noise (as in the simple noise addition method), but also adjusts the RSS measurement for the false location.
The leader then reports these false locations and their corresponding adjusted RSS values to the central controller.

At first, it seems that we have simply transferred the difficulty of estimating RSS values at the false locations to the leaders. However, the key advantage now is that a leader has access to the RSS measurements at k different locations in a region, and can thus interpolate (as described below) to estimate the RSS values. A leader reporting values has another advantage: the adversary has no way to determine which user a particular measurement belongs to, which inherently prevents averaging and linkage attacks.

We now describe the interpolation procedure that our leader uses. Our method is based on a simple yet powerful idea: the RSS value at a desired location is closer to the RSS values at locations near it than to those far away.

[Figure 5.1. RSS field contour plots in an area, panels (a) and (b). Axes: X-coordinate, Y-coordinate; color scale: RSS values in dBm.]

Therefore, given a false location f (at which the RSS value is unknown), we may express the RSS value at f as a weighted linear combination of the RSS values at the receivers, where the receivers closer to f have a higher weight than those far away. Specifically, a receiver i at a distance $d_i$ from the point f contributes to the estimated RSS value at f in proportion to a weight $w_i$. Formally, if we have n receivers in an area, the RSS value at a false location f is:

\[ RSS_f = \frac{\sum_{i=1}^{n} w_i \, RSS_i}{\sum_{i=1}^{n} w_i} \tag{5.3} \]

[Figure 5.2. Adjusted measurement approach. (1) Form a group. (2) Select a leader (orange). (3) Receivers report their (loc, rss) pairs to the leader. (4) The leader selects a false location for each receiver and adjusts the RSS values (green). (5) The leader reports these false (loc', rss') pairs to the central controller.]

We tried two weighing methods. In the first method, $w_i = d_i^{-c}$, where $d_i$ is the distance between the ith receiver and the false location f, and c is a constant dependent on the environment. This constant determines how quickly the signal strength decays. It tends to be higher in an obstructed area, such as a downtown area, than in a relatively open environment, such as a flat rural area. In the second method, $w_i = e^{-d_i/c}$, where $d_i$ is the distance between the ith receiver and the false location f, and c is a constant equal to half the average distance between neighboring receivers. Figure 5.3 shows the error in estimated RSS with varying group size for both interpolation methods. Since the $d_i^{-c}$ method is slightly better, we use it as our interpolation method for the remainder of this thesis.

The method to choose false locations for the receivers and to adjust the RSS values is summarized in Algorithm 1. In Algorithm 1, the variable totalWeights is the denominator of eq. 5.3 and the variable weightedRSS is the numerator. These false locations and adjusted RSS values are then reported to the central controller.

[Figure 5.3. Error in RSS value estimation for different methods of interpolation: (a) weighing method 1: $w_i = d_i^{-c}$; (b) weighing method 2: $w_i = e^{-d_i/c}$. Axes: group size/number of receivers vs. RSS estimation error (dB).]

5.2.1 Our Method: Adjusted Measurement with Random Locations

The final algorithm we propose is a slight variant of the one above. The users form groups, and each group elects a leader, who reports the perturbed locations of the points, along with the RSS measurements computed as above. However, to choose a perturbed location for a receiver, we do not add noise to its true location, but instead take a more global approach. Our approach also hides the precise number of users in the group.
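The interpolation in eq. 5.3 can be sketched in a few lines of Python. This is a minimal illustration and not the thesis implementation: the function name `adjusted_rss`, the planar (x, y) coordinates, the default value of `c`, and the sample readings are all assumptions made for the example. For the first weighing method, a false location that coincides with a receiver simply takes that receiver's RSS, mirroring the zero-distance check in Algorithm 1.

```python
import math

def adjusted_rss(false_loc, receivers, c=2.0, method="power"):
    """Estimate the RSS at false_loc as a weighted average (eq. 5.3).

    receivers: list of ((x, y), rss_dbm) pairs held by the group leader.
    c: assumed constant; the exponent for "power" (w = d^-c),
       the length scale for "exp" (w = e^(-d/c)).
    """
    weighted_rss = 0.0   # numerator of eq. 5.3
    total_weight = 0.0   # denominator of eq. 5.3
    for (x, y), rss in receivers:
        d = math.hypot(false_loc[0] - x, false_loc[1] - y)
        if method == "power":
            if d == 0:
                # False location coincides with a receiver: use its RSS.
                return rss
            w = d ** (-c)
        else:
            w = math.exp(-d / c)
        weighted_rss += w * rss
        total_weight += w
    return weighted_rss / total_weight

# Hypothetical readings: two receivers 2 m apart.
readings = [((0.0, 0.0), -40.0), ((2.0, 0.0), -60.0)]
midpoint = adjusted_rss((1.0, 0.0), readings)     # equal weights
near_first = adjusted_rss((0.5, 0.0), readings)   # dominated by first receiver
```

At the midpoint both receivers are equally weighted, so the estimate is the plain average of the two readings; moving the false location toward the first receiver pulls the estimate toward that receiver's value.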
Formally, we consider the spatial region R corresponding to the group (in our experiments, we use a slightly enlarged bounding box), and we select k ≤ n random locations from R as the points we report. We then use the interpolation procedure described above to compute the RSS values to report. The details can be found in Algorithm 2. This approach preserves privacy better, as we no longer give out information such as the approximate positions of the receivers or even their number. As we see in our experiments, it does not reduce the utility in any significant manner.

Algorithm 1 Adjusted Measurement Approach
procedure AdjustedMeasurement(receiver_list, noise_level)
    newLoc ← []
    newRss ← []
    for receiver in receiver_list do
        latitude ← receiver.latitude + random.uniform(-noise_level, noise_level)
        longitude ← receiver.longitude + random.uniform(-noise_level, noise_level)
        weightedRSS ← 0
        totalWeights ← 0
        for recv in receiver_list do
            distance ← euclideanDist(latitude, longitude, recv.latitude, recv.longitude)
            if distance == 0 then
                weightedRSS ← recv.rssVal
                totalWeights ← 1
                break
            weightedRSS ← weightedRSS + (1 / distance^c) * recv.rssVal
            totalWeights ← totalWeights + 1 / distance^c
        modifiedRSS ← weightedRSS / totalWeights
        newLoc.append((latitude, longitude))
        newRss.append(modifiedRSS)
    return newLoc, newRss

Algorithm 2 Adjusted Measurement with Random Sampling
procedure RandomSample(R, receiver_list, num_to_sample)
    RandLoc ← []
    AdjRSS ← []
    for i ← 0 to num_to_sample − 1 do
        latitude ← random.uniform(R.xmin, R.xmax)
        longitude ← random.uniform(R.ymin, R.ymax)
        weightedRSS ← 0
        totalWeights ← 0
        for recv in receiver_list do
            distance ← euclideanDist(latitude, longitude, recv.latitude, recv.longitude)
            if distance == 0 then
                weightedRSS ← recv.rssVal
                totalWeights ← 1
                break
            weightedRSS ← weightedRSS + (1 / distance^c) * recv.rssVal
            totalWeights ← totalWeights + 1 / distance^c
        modifiedRSS ← weightedRSS / totalWeights
        RandLoc.append((latitude, longitude))
        AdjRSS.append(modifiedRSS)
    return RandLoc, AdjRSS

CHAPTER 6 EVALUATION

For the evaluation of utility, our baseline (the 'gold standard') is the localization accuracy obtained when the receivers report their true locations and RSS measurements to the central controller, at which point all privacy is lost. We compare our adjusted measurement approach against the simple noise addition approach, in both indoor and outdoor settings.

6.1 Indoor: Cluttered Office Space

For our indoor experiments, we use a public dataset [25] that was collected in an office area cluttered with desks, bookcases, filing cabinets, computers, and equipment. In the experimental setup for this data collection, 44 sensors were placed randomly in a 15m by 14m area. The sensors transmitted sequentially. When one sensor transmitted, an RSS measurement was made by all the remaining sensors. Each of the 44 sensors transmitted once, thereby providing us with 44 transmitter locations. Our aim here is to evaluate the adjusted measurement approach. For every experiment, we localize the transmitter at each of the 44 transmitter locations and, for each group size, take the average of the localization error over all 44 locations. The group size is the number of receivers collaborating with the leader, including the leader himself. We vary the group size to show that the localization error generally decreases with increasing size; this demonstrates the power and potential of the collaborative approach to obtain both privacy and accuracy.

6.1.1 Simple Noise Addition Approach

Figure 6.1(a) shows the results when each receiver simply adds noise to its location before reporting the measured RSS value. The noise is added to the location to get a false location according to (5.1) and (5.2). Each user independently adds noise to the location before reporting to the server.
We add noise of varying levels in the range (0m, 14m). We can see that the localization error initially drops as the number of participating receivers increases but gradually flattens out at high error values, as expected. For large noise levels, the improvement in localization accuracy expected from increasing the number of receivers is almost negligible. With smaller noise, the pattern is comparable to noise-free localization, where the localization error decreases with an increasing number of receivers. However, the final error is still considerably higher than in noise-free localization; even with a noise level of 6m, the localization error increases by roughly 2.5 times. This is because the RSS measurements made at the true location stop being meaningful at the false location as we gradually increase the noise level.

[Figure 6.1. Localization error for varying group size for different noise levels (0m, 3m, 6m, 10m, 14m) in (a) simple noise addition approach (b) adjusted measurement approach. Axes: group size vs. localization error (m).]

6.1.2 Adjusted Measurement Approach

Figure 6.1(b) shows the same experiment but with our adjusted measurement approach for varying noise levels. We see that the error, though initially high for smaller group sizes, as expected, gradually decreases as we increase the number of receivers. This method clearly improves significantly over the simple noise addition method. Even at the highest noise level of 14m, the localization error is under 3m. With increasing noise levels, we also see a slight increase in the localization error. This is due to the difficulty of predicting the RSS measurements even with our interpolation method. The most interesting case is the one where the false locations are randomly sampled in an area (Algorithm 2) instead of adding noise to the receiver location.
In this case, all the false locations are within the area, and therefore the RSS adjustments are more accurate. Figure 6.2 shows the localization error vs. the group size. We can see that the localization error decreases with increasing group size. Random sampling also improves the localization error from 3m (for the adjusted measurement approach with a 14m noise level) to 2m. Moreover, the localization error starts to flatten out at about 25-30 points. Hence, when reporting to the central controller, the leader can sample fewer false locations than there are actual receivers, minimizing the information exposed about the true receiver locations without compromising the localization accuracy.

[Figure 6.2. Location error (m) for varying group sizes when false locations are randomly sampled (random sample with adjustment). Axes: group size vs. localization error (m).]

6.2 Outdoor: Phoenix, Arizona

For the outdoor setting, we obtained a dataset collected on the roadways of the city of Phoenix, Arizona. This dataset was collected by placing a transmitter at a fixed location, driving around the city area, and recording the RSS measurements. This was repeated for multiple transmitter locations, and the readings were recorded for 1 second at each location. Figure 6.3(a) and Fig. 6.3(b) show two transmitter locations. The red marker shows the transmitter, and the blue markers show the locations where RSS measurements were recorded in each case.

The outdoor dataset helps us further validate our adjusted measurement approach and shows that it scales well to larger areas without any significant reduction in efficiency or accuracy. In our collaborative setup, we restrict the communication range between the participating receivers in the outdoor, city-scale setting. For our outdoor dataset we restrict this range to 350m.
Essentially, we select a receiver at a local RSS maximum as the leader in an area, and then select the receivers around it within a range of 350m (see Chapter 9). However, if our central controller is nonadversarial and privacy must be preserved from the applications that use the data collected at the central controller, our participating nodes do not need to communicate with each other. The adjustment of the RSS values can then be shifted from the leader to the central controller.

[Figure 6.3. Transmitter location (red) along with receiver positions (blue), panels (a) and (b).]

For the outdoor setting, we compare our adjusted measurement approach with the simple noise addition approach. The performance of the localization algorithm with no noise serves as the performance (localization accuracy) baseline in this comparison.

In general, as we move away from the transmitter, the RSS values become noisier. For this reason, considering receivers around the local maximum is a reasonable choice for the purpose of localizing the transmitter. However, from our experiments we observe that, in certain scenarios, using only a set of receivers around the local maximum can lead to poorer results (localization accuracy of the transmitter) than using all the receivers in the monitored area. These scenarios occur when the transmitter is at an edge or the number of receivers near the transmitter is small. One such example scenario is shown in Fig. 6.3(b). The explanation is that, in scenarios like Fig. 6.3(b), when the number of receivers around the local maximum is low, contributions from other receivers (away from the local maximum) help in localizing the transmitter. So, the performance of our baseline, i.e., the localization algorithm with no noise, is evaluated in the following way.
We run our localization algorithm using two methods: (a) using all the receivers in the monitored area and (b) selecting a local maximum and using the receivers around it within a fixed radius. The better-performing of these two methods is taken as the baseline. The column labelled "No Noise" in Table 6.1, Table 6.2 and Table 6.3 contains the minimum localization error obtained from these two methods.

6.2.1 Simple Noise Addition Approach

Table 6.1 contains the results for the addition of noise with varying noise levels. We show results for noise levels of 250m, 300m and 350m. Recall that the noise is added to both the latitude and the longitude of the receiver, where the noise is picked randomly from a uniform distribution in the range (-noise_level, noise_level). As expected, the localization error increases considerably with increasing noise levels. In quite a few cases, like cases 1 and 6, the localization error almost doubles at a noise level of 350m. On average, with 350m of noise, the localization error increases by 98.52 meters (approx. 76%).

6.2.2 Adjusted Measurement Approach

Table 6.2 and Table 6.3 contain the results for the adjusted measurement approach. We present the results for the highest noise setting, i.e., 350m. The results are shown for both cases: false locations obtained by adding 350m of noise, and false locations obtained by random sampling. The adjusted measurement approach significantly improves the localization error over the simple noise addition approach. We see that the average increase in localization error drops from 98.52 meters to 32.77 meters for 350m noise addition with adjusted measurement, and to 21.36 meters for randomly sampled false locations with adjusted measurement.

Table 6.1. Localization error (in m) for various transmitter locations for simple noise addition method

| TX loc | No noise | Noise 250m | Noise 300m | Noise 350m | Error increase for Noise 350m w.r.t. No Noise |
| 1 | 136.27 | 223.28 | 239.58 | 270.34 | 134.07 |
| 2 | 146.58 | 229.68 | 234.48 | 235.11 | 88.53 |
| 3 | 147.03 | 229.81 | 236.22 | 238.90 | 91.87 |
| 4 | 101.57 | 148.21 | 169.91 | 193.79 | 92.22 |
| 5 | 161.41 | 192.32 | 208.96 | 225.18 | 63.77 |
| 6 | 112.63 | 219.33 | 226.57 | 233.31 | 120.68 |
| Avg. | 134.24 | 207.10 | 219.28 | 232.77 | 98.52 |

Table 6.2. Localization error (in m) for various transmitter locations for adjusted measurement approach

| TX loc | No noise | Adjusted measurement and noise of 350m | Error increase w.r.t. No Noise |
| 1 | 136.27 | 147.72 | 11.45 |
| 2 | 146.58 | 159.76 | 13.18 |
| 3 | 147.03 | 194.03 | 47.0 |
| 4 | 101.57 | 170.84 | 69.27 |
| 5 | 161.41 | 171.08 | 9.67 |
| 6 | 112.63 | 158.69 | 46.06 |
| Avg. | 134.24 | 167.02 | 32.77 |

Table 6.3. Localization error (in m) for various transmitter locations for adjusted measurement approach with random sample

| TX loc | No noise | Adjusted measurement with random sample and noise of 350m | Error increase w.r.t. No Noise |
| 1 | 136.27 | 142.93 | 6.66 |
| 2 | 146.58 | 147.28 | 0.70 |
| 3 | 147.03 | 181.17 | 34.14 |
| 4 | 101.57 | 136.31 | 34.74 |
| 5 | 161.41 | 168.80 | 7.39 |
| 6 | 112.63 | 157.14 | 44.51 |
| Avg. | 134.24 | 155.60 | 21.36 |

It is interesting to see that in certain cases (1, 2 and 5), the increase in localization error is almost negligible. For certain transmitter locations, like Fig. 6.3(b), where the transmitter is on the edge, the localization error is relatively high but still significantly less than with the simple noise addition method. This is because there are not enough receivers around the transmitter to accurately interpolate the RSS values. Also, such cases are unlikely to occur in a crowdsourced environment where receivers are spread around more uniformly. Interestingly, the random sample approach is able to bring down errors in certain cases, like case 4, which happens to be a situation like Fig. 6.3(b).

CHAPTER 7 ADVERSARY ATTACK

The adversary (in our case, the central controller) receives k readings from a region R from the group leader.
The false locations of these readings (as explained in the methodology) are chosen at random from R. Therefore, given the locations alone, the only knowledge the adversary could gain is that the group leader is in the region R, and that there are k users in the region. However, note that the adversary is not just aware of the false locations. It also receives the adjusted RSS measurements (which the group leader computes using the true locations). Thus, it is natural to ask whether the adversary can use the RSS values to set up an inverse problem to solve for the true locations.

7.1 Inverse Problem

We now consider an attack based on the idea above. Note that for each group, the adversary receives the false locations and the adjusted RSS measurements. The adversary knows the interpolation procedure used to generate the RSS values from the true measurements. The adversary thus needs to solve an inverse problem: given the false locations and the corresponding RSS values, what are the different 'configurations' of true locations and RSS values that could have produced them? The adversary's hope is that only a few configurations can explain the reported values. On the other hand, if there are several distinct configurations, the adversary has no real way to know which configuration corresponds to the true locations. In our experiments, we show that the latter situation dominates, indicating privacy preservation. Next, we describe the attack formulation and evaluation in detail. As the different groups have no interaction, we focus on one group of receivers.

7.2 Inverse Attack

The controller has access to the following information:
• number of receivers/reported noisy locations (n),
• reported noisy locations (represented by vector $L^f$),
• corresponding adjusted RSS values (represented by vector $R^f$), and
• knowledge of Algorithm 1 (represented by f) used to generate the false locations and the corresponding adjusted RSS values.
Next, we describe the adversary's attack formulation.

1. The adversary initializes its guess of the true locations ($L^a$) with random guesses within the area being monitored. Each element of the vector $L^a$ is a tuple of latitude and longitude (lat, long). Based on the reported false locations $L^f$, the adversary computes the bounding box represented by x_min, y_min, x_max and y_max. Each location is initialized randomly using a uniform distribution as follows:

lat = uniform(x_min, x_max) (7.1)
long = uniform(y_min, y_max) (7.2)

2. The adversary then initializes the RSS (represented by vector $R^a$) at each location guess in $L^a$ with the RSS value of the false location nearest to that guess. This is a better initialization than a random one, because the RSS value at a location guess is more likely to be similar to the RSS values at nearby locations than to some random RSS value.

3. Next, using the expression from Algorithm 2, the adversary calculates the RSS value at each of the reported false locations based on its current guess of the true locations and their corresponding RSS values.

4. The loss function is defined as the sum of the squared differences between the adversary's calculated RSS and the actual reported false RSS over all reported false locations. The loss ($\mathcal{L}$) is given in eq. 7.3:

\[ \mathcal{L} = \sum_{i=1}^{n} \left( R_i^c - R_i^f \right)^2 \tag{7.3} \]

The adversary calculates the RSS at the false location, $R_i^c$, using the same method that the receivers use to adjust their RSS values before reporting to the central controller.

5. In the next step, to update its ith guess in the vectors $L^a$ and $R^a$, the adversary takes the gradient of $\mathcal{L}$ w.r.t. $L_i^a$ and $R_i^a$. The updates are made according to eq. 7.4 and eq. 7.5:

\[ L_i^a = L_i^a - \eta \frac{\partial \mathcal{L}}{\partial L_i^a} \tag{7.4} \]

\[ R_i^a = R_i^a - \eta \frac{\partial \mathcal{L}}{\partial R_i^a} \tag{7.5} \]

6. This process is repeated from step 3 until the loss function decreases and flattens out.

The formal algorithm for the attack is given in Algorithm 3, and the results are presented in Section 7.3.
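The six steps above can be sketched as a small, dependency-free Python routine. This is an illustrative sketch under stated assumptions and not the thesis code: the names `interpolate`, `loss` and `inverse_attack` are hypothetical, the gradient of eq. 7.3 is approximated by central finite differences instead of an analytic derivative, the exponent `c = 2` is arbitrary, and a step is only accepted when it lowers the loss (otherwise the step size is halved), a safeguard that the thesis does not describe.

```python
import math
import random

def interpolate(point, locs, rss, c=2.0):
    """Weighted-average RSS at point from guessed (locs, rss), as in eq. 5.3."""
    num = den = 0.0
    for (x, y), r in zip(locs, rss):
        d = max(math.hypot(point[0] - x, point[1] - y), 1e-9)  # avoid 1/0
        w = d ** (-c)
        num += w * r
        den += w
    return num / den

def loss(params, false_locs, false_rss):
    """Eq. 7.3. params is a flat list [x_1, y_1, rss_1, x_2, y_2, rss_2, ...]."""
    locs = [(params[i], params[i + 1]) for i in range(0, len(params), 3)]
    rss = [params[i + 2] for i in range(0, len(params), 3)]
    return sum((interpolate(f, locs, rss) - r) ** 2
               for f, r in zip(false_locs, false_rss))

def inverse_attack(false_locs, false_rss, n_guess, iters=100, eta=0.01,
                   h=1e-4, seed=0):
    rng = random.Random(seed)
    xs = [p[0] for p in false_locs]
    ys = [p[1] for p in false_locs]
    params = []
    for _ in range(n_guess):
        # Step 1: uniform guess inside the bounding box of the false locations.
        gx = rng.uniform(min(xs), max(xs))
        gy = rng.uniform(min(ys), max(ys))
        # Step 2: start from the RSS of the nearest false location.
        nearest = min(range(len(false_locs)),
                      key=lambda j: math.hypot(gx - xs[j], gy - ys[j]))
        params += [gx, gy, false_rss[nearest]]
    current = loss(params, false_locs, false_rss)
    history = [current]
    for _ in range(iters):
        # Steps 3-5: numerical gradient of the loss w.r.t. every guess.
        grad = []
        for k in range(len(params)):
            up = params[:]
            up[k] += h
            down = params[:]
            down[k] -= h
            grad.append((loss(up, false_locs, false_rss)
                         - loss(down, false_locs, false_rss)) / (2 * h))
        candidate = [p - eta * g for p, g in zip(params, grad)]
        new_loss = loss(candidate, false_locs, false_rss)
        if new_loss < current:
            params, current = candidate, new_loss
        else:
            eta *= 0.5  # assumed safeguard: shrink the step on failure
        history.append(current)
    return params, history

# Hypothetical report from one group: four false locations with adjusted RSS.
reported_locs = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
reported_rss = [-40.0, -45.0, -50.0, -55.0]
guess, loss_history = inverse_attack(reported_locs, reported_rss,
                                     n_guess=3, iters=50)
```

Because rejected steps are discarded, `loss_history` never increases in this sketch; how low it ends up depends on the initialization and on the assumed step size.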
We observe that one way to achieve zero loss is to set the guesses to be precisely the false locations and the corresponding RSS values (and of course, this solution does not give any insight into the true locations). The adversary can thus reinitialize the guessed locations that are too close to false locations, and hope that this helps identify the true locations. However, we did not observe any improvement in the solution, and thus we stick to Algorithm 3.

Algorithm 3 Adversary Attack
procedure AdversaryAttack(false_location, false_rss, num_recv, num_iters)
    x_min, x_max, y_min, y_max ← bounding box of false_location
    true_loc ← []
    true_rss ← []
    for i in range(num_recv) do
        l ← (rand.uniform(x_min, x_max), rand.uniform(y_min, y_max))
        true_loc.append(l)
    for i in range(num_recv) do
        r ← index of the false location closest to true_loc[i]
        true_rss.append(false_rss[r])
    η ← 0.01
    while num_iters > 0 do
        grad_loc ← []
        grad_rss ← []
        for i in range(num_recv) do
            grad_loc.append(∂L/∂true_loc[i])
            grad_rss.append(∂L/∂true_rss[i])
        for i in range(num_recv) do
            true_loc[i] ← true_loc[i] − η * grad_loc[i]
            true_rss[i] ← true_rss[i] − η * grad_rss[i]
        num_iters ← num_iters − 1
    return true_loc

7.3 Evaluation

Figures 7.1 and 7.2 show the results from one scenario of transmitter localization. Figure 7.1 shows the loss function for one run of the attack. The x-axis shows the iteration number and the y-axis the value of the loss function at each iteration of the adversary attack. We see that the loss drops to a very low value, implying that the adversary's final guesses explain the RSS measurements at the false locations well. Having obtained low loss values, we evaluate how close the adversary's estimates are to the true locations. Figure 7.2 shows the final guesses of the adversary's attack along with the reported false locations and the true locations of the receivers.
We can see that even with a very low loss, the true locations of the receivers are significantly different from the adversary's guesses of their true locations. Though some of the guessed locations seem quite close to the true locations, the adversary has no way to determine which of them have this property.

[Figure 7.1. Loss function vs. iteration of attack algorithm. Axes: iteration number vs. loss calculated on RSS (dBm).]

[Figure 7.2. Adversary guesses, true locations and reported false locations in an area. Axes: X-coordinate (m), Y-coordinate (m).]

To quantify the proximity of the adversary's guesses to the true locations, we formulate a minimum distance weighted bipartite matching problem. The adversary's guesses form a set A and the true locations form the other set, T. There is a mapping cost $w_{at}$ to assign an adversary's guess a to a true location t, where $w_{at}$ is the Euclidean distance between location a and location t. The objective is to map each of the adversary's guesses to a true location such that the mapping cost is minimized. The mapping problem can be formalized as a linear program, where the binary variable $x_{at} = 1$ if the adversary's guess a is assigned to true location t. In our setup, there is an edge from each guessed location a to every true location t, and the edge weight is $w_{at}$. It is a special case of the minimum cost flow problem and has polynomial complexity [1].

\[ \text{minimize} \quad \sum_{a \in A} \sum_{t \in T} w_{at} \, x_{at} \tag{7.6} \]

\[ \text{subject to} \quad \sum_{a \in A} x_{at} = 1, \; t \in T; \qquad \sum_{t \in T} x_{at} = 1, \; a \in A \tag{7.7} \]

The matching cost is then scaled by the number of receivers in an area: error = minimum matching cost / number of receivers. Figure 7.3 shows the mapping of the adversary's guesses to the true locations of the receivers.
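As a concrete illustration of the matching cost in eq. 7.6 and eq. 7.7, the following Python sketch finds the minimum-cost assignment by exhaustive search over all permutations. This is an assumption made to keep the example self-contained: exhaustive search is only practical for a handful of points, and the function name `matching_error` and the sample coordinates are hypothetical.

```python
import itertools
import math

def matching_error(guesses, truths):
    """Return (error, assignment) for the minimum-cost bipartite matching.

    assignment[a] is the index of the true location matched to guess a;
    error is the minimum matching cost divided by the number of receivers.
    Exhaustive search over permutations: feasible only for small groups.
    """
    n = len(guesses)
    # w_at: Euclidean distance between guess a and true location t.
    cost = [[math.hypot(g[0] - t[0], g[1] - t[1]) for t in truths]
            for g in guesses]
    best_cost, best_perm = math.inf, None
    for perm in itertools.permutations(range(n)):
        total = sum(cost[a][perm[a]] for a in range(n))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost / n, best_perm

# Two hypothetical guesses, each 1 m away from a different true location.
guesses = [(0.0, 0.0), (3.0, 0.0)]
truths = [(3.0, 1.0), (0.0, 1.0)]
error, assignment = matching_error(guesses, truths)
```

Here the first guess is matched to the second true location and vice versa, giving a total cost of 2 m and a per-receiver error of 1 m. For groups of the size used in this chapter, a polynomial-time assignment solver (for example the Hungarian algorithm, available as `scipy.optimize.linear_sum_assignment`) would be the natural replacement; the thesis does not state which solver was used.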
[Figure 7.3. Mapping of the adversary's guesses to true receiver locations. Legend: adversary guesses, true locations; axes: X-coordinate (m), Y-coordinate (m).]

To determine whether the attack above learns some structure about the true locations, we compare the matching cost above with the corresponding cost when the guesses are completely random points (chosen uniformly and independently) from the given area. Figures 7.4(a) and 7.4(b) show the matching cost over 100 runs for the final guesses from the attack above and for random locations, respectively. We can see that the range of the matching cost is nearly identical in both cases. The average matching cost for randomly sampled locations is 1.81 and that of the attack is 1.84.

We also compute the matching cost between the adversary's guesses and the false locations reported to the controller. Figure 7.5(a) shows the matching cost for 100 runs, as before. We see that the adversary's guesses do not converge to the false locations. This indicates that there are multiple different patterns of locations that are solutions to the inverse problem set up by the adversary (described at the start of this chapter). This gives further evidence of the privacy-preserving nature of our method. Figure 7.5(b) shows the matching cost of the adversary's guesses to one another across runs of the inverse attack. The guesses from one run are compared with the guesses from all other runs by taking the matching cost between them. We can see that, on average, the adversary's guesses for each run converge to a different set of locations and RSS values.
This further strengthens our claim that there exist multiple patterns of locations and RSS values that yield a low error in the RSS calculated at the false locations.

[Figure 7.4. Matching cost between the true locations and (a) guesses from the adversary's inverse attack and (b) random guesses of true locations. Axes: min cost matching vs. count.]

[Figure 7.5. Matching cost between (a) adversary guesses of true locations and false locations reported (b) adversary guesses to one another for 100 runs of adversary's inverse attack. Axes: min cost matching vs. count.]

Next, we also varied the group size and studied how close the adversary's guesses are to the true locations. To measure this closeness, we take the Euclidean distance between each adversary's guess and the true location it is mapped to in the minimum-cost bipartite matching. Figure 7.6 and Fig. 7.7 show the cumulative distribution function (CDF) of these distances obtained from the inverse attack by an adversary and from random guessing of true locations, respectively. The experiment is run for varying group sizes for 100 iterations. We can see that the CDF plots are nearly identical in both cases. Compared with random guessing, the adversary gets no closer to the true locations with the inverse attack. This further strengthens our claim that multiple (location, RSS) configurations exist which explain the RSS values at the chosen false locations.

To summarize via the terminology of [28], the adversary attack has a high uncertainty (as there are many different solutions to the inverse problem). It also has a low correctness, as the solutions obtained have nearly the same matching cost as random points.
[Figure 7.6. Matching distance CDF for adversary attack for varying group sizes (10, 15, 25, 30, 35, 40). Axes: distance (m) vs. cumulative fraction.]

[Figure 7.7. Matching distance CDF for random guess of true locations for varying group sizes (10, 15, 25, 30, 35, 40). Axes: distance (m) vs. cumulative fraction.]

CHAPTER 8 USER TUNABLE PRIVACY SETTINGS

Until this point, we have presented our adjusted measurement approach and shown that the users who participate in the crowdsourcing effort have their privacy preserved. The adversary cannot do better than random guessing of the receiver locations in the region 'R' of the group 'G'. However, different users may have different privacy requirements; in other words, a particular privacy setting might be acceptable to one user but not to another. Therefore, we let each user define two parameters, 'k' and 'a', based on his privacy requirements. The parameter 'k' is the same as in k-anonymity: a user expects at least k users in the group, i.e., the user is hiding among at least k users. The parameter 'a' is the minimum area of the region a user wants to hide in. The area in which a user hides is the area of the region 'R' formed by the group 'G'.

Since all the users report their RSS measurements along with their locations to the group leader, the group leader is responsible for making sure that the privacy constraints specified by the users are satisfied. Each user i, along with its RSS measurement and location, also sends its desired values of $k_i$ and $a_i$. Using the reported locations of the receivers, the group leader calculates the area of the region 'R'. The group leader then checks, for each user i, whether the user's parameter $a_i$ satisfies the condition $a_i \le \mathrm{area}(R)$. The receivers which do not satisfy this condition are dropped from further consideration.
Now, with the pruned list of receivers, the group leader checks the k-anonymity condition. To do so, it sorts the pruned list by the parameter k in descending order. The group leader also keeps the current count of receivers in the pruned list in a variable current_list_length. The group leader then goes over the receivers one by one and checks whether k_i ≤ current_list_length. If the condition is not satisfied, the receiver is popped from the list, current_list_length is decreased by one, and the group leader proceeds to the next receiver. If the condition is satisfied, the group leader stops the scan of the list. The remaining receivers are guaranteed to have their condition on k_i satisfied because the pruned list is sorted: the k_i of every subsequent receiver is no greater than that of the receiver at which the scan stopped, and current_list_length does not change once no more receivers are removed.

However, removing the receivers that do not satisfy the k-anonymity condition may change the region R and, consequently, the area of R. Therefore, the removal process is repeated until no changes are observed in the receiver list. The process is summarized in Algorithm 4. The algorithm takes as arguments the receiver list and the users' parameters 'a' and 'k', denoted by 'A' and 'K' respectively. The algorithm returns the list of the receivers whose privacy constraints are satisfied.
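The group formation procedure described above can be sketched in Python as follows, using the non-strict comparisons of Algorithm 4. This is a minimal sketch under stated assumptions, not the thesis implementation: the names `form_group` and `bounding_box_area` are hypothetical, and the area of the region R is approximated here by an axis-aligned bounding box purely for illustration, since the actual construction of R is defined elsewhere in the thesis. A different area function can be passed in through `area_fn`.

```python
def bounding_box_area(locations):
    """Assumed stand-in for the area of region R: axis-aligned bounding box."""
    if not locations:
        return 0.0
    xs = [x for x, _ in locations]
    ys = [y for _, y in locations]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def form_group(receivers, area_fn=bounding_box_area):
    """Return ids of receivers whose (a_i, k_i) constraints are all satisfied.

    receivers maps a receiver id to a tuple (location, a_i, k_i).
    """
    group = list(receivers)
    changed = True
    while changed:
        changed = False
        area = area_fn([receivers[r][0] for r in group])
        # Area check: keep receivers whose minimum area a_i fits within R.
        kept = [r for r in group if receivers[r][1] <= area]
        # k-anonymity check: scan in descending order of k_i and pop
        # receivers until the first one whose k_i is satisfied.
        kept.sort(key=lambda r: receivers[r][2], reverse=True)
        while kept and receivers[kept[0]][2] > len(kept):
            kept.pop(0)
        # Any removal may shrink R, so repeat until the list is stable.
        if len(kept) != len(group):
            changed = True
        group = kept
    return group


# Hypothetical receivers: id -> ((x, y) in meters, a_i in square meters, k_i).
receivers = {
    "r1": ((0, 0), 50, 2),
    "r2": ((10, 0), 80, 3),
    "r3": ((10, 10), 150, 3),
    "r4": ((0, 10), 60, 3),
    "r5": ((30, 30), 100, 6),
}
group = form_group(receivers)
```

In this example r5 is removed first because its k_i of 6 exceeds the group size of 5. Its removal shrinks the bounding box from 900 to 100 square meters, which in turn violates the area requirement of r3, showing why the procedure has to be repeated until the list stops changing.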
Algorithm 4 Group Formation
procedure GroupFormation(receiver_list, A, K)
    changed ← True
    receiver_list ← receiver_list sorted by K in descending order
    while changed do
        changed ← False
        pruned_list ← []
        area_R ← CalculateArea(receiver_list)
        for receiver i in receiver_list do
            if A[i] ≤ area_R then
                pruned_list.append(i)
            else
                changed ← True
        Length ← length(pruned_list)
        for receiver i in a copy of pruned_list do
            if K[i] > Length then
                pruned_list.remove(i)
                Length ← Length − 1
                changed ← True
            else
                break
        receiver_list ← pruned_list
    return receiver_list

CHAPTER 9 PRACTICAL CONSIDERATIONS

In this chapter, we consider the practical issues tied to communication among participating receivers, both for choosing a leader among themselves and for communicating with the leader once it is chosen. In indoor environments, we expect users to communicate using WiFi, WiFi Direct, or any short-range wireless technology. For outdoor, city-scale settings, we expect users to communicate using a protocol like Dedicated Short Range Communication (DSRC) for vehicular communication. DSRC operates in the 5.9 GHz band with an ideal range of 1000 meters [15]. For our evaluation, we choose a DSRC communication range of 350 meters.

If our central controller is nonadversarial and privacy must instead be preserved from applications that use the data collected at the central controller, our participating nodes do not need to communicate with each other. They can send their location and measured data directly (using cellular or WiFi networks) to the central controller, which runs the adjustment algorithm in place of the distributed leaders. Thus, under this adversary model, our approach can be implemented without requiring any direct device-to-device communication method.
CHAPTER 10 CONCLUSION

In this thesis, we addressed the problem of location privacy in the context of crowdsourced localization of spectrum offenders where participating receivers report RSS measurements and their location to a central coordinator. We presented a novel adjusted measurement approach in which noise is added to the participating receiver's location coordinates and pseudolocations are reported along with adjusted RSS measurements as if the measurements were made at the pseudolocations. We used two RSS datasets, one from a cluttered office and another from roadways in Phoenix, Arizona, to evaluate our approach. Our results show that location privacy can be preserved without a significant increase in the localization error. We also formulated an adversary attack that attempted to solve the inverse problem of determining the true locations of the receivers from their false locations. Our evaluations showed that the adversary does no better than random guessing of true locations in the monitored area. We considered two adversarial situations. Primarily, we assumed that the central coordinator was an adversary and used local leaders to collect and adjust RSS measurements. However, our approach also applies to situations where the central coordinator is not an adversary but the applications using the collected data are adversarial.

REFERENCES

[1] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network flows, Elsevier, 2014.
[2] P. Bahl and V. N. Padmanabhan, RADAR: An in-building RF-based user location and tracking system, in 19th Joint Conf. IEEE Computer and Communications Societies (INFOCOM 2000), vol. 2, 2000, pp. 775–784.
[3] E. Beatrix Cleff, Privacy issues in mobile advertising, International Review of Law Computers and Technology, 21 (2007), pp. 225–236.
[4] M. Bocca, O. Kaltiokallio, and N. Patwari, Radio tomographic imaging for ambient assisted living, in International Competition on Evaluating AAL Systems through Competitive Benchmarking, Springer, 2012, pp. 108–130.
[5] M. Bocca, O. Kaltiokallio, N. Patwari, and S. Venkatasubramanian, Multiple target tracking with RF sensor networks, IEEE Transactions on Mobile Computing, 13 (2014), pp. 1787–1800.
[6] C.-Y. Chow, M. F. Mokbel, and X. Liu, A peer-to-peer spatial cloaking algorithm for anonymous location-based service, in Proceedings of the 14th Annual ACM International Symposium on Advances in Geographic Information Systems, ACM, 2006, pp. 171–178.
[7] M. Duckham and L. Kulik, A formal model of obfuscation and negotiation for location privacy, in International Conference on Pervasive Computing, Springer, 2005, pp. 152–170.
[8] J. Freudiger, R. Shokri, and J.-P. Hubaux, Evaluating the privacy risk of location-based services, in International Conference on Financial Cryptography and Data Security, Springer, 2011, pp. 31–46.
[9] B. Gedik and L. Liu, A customizable k-anonymity model for protecting location privacy, tech. rep., Georgia Institute of Technology, 2004.
[10] Q. Geng, P. Kairouz, S. Oh, and P. Viswanath, The staircase mechanism in differential privacy, IEEE Journal of Selected Topics in Signal Processing, 9 (2015), pp. 1176–1184.
[11] A. Ghasemi and E. S. Sousa, Collaborative spectrum sensing for opportunistic access in fading environments, in New Frontiers in Dynamic Spectrum Access Networks, 2005. DySPAN 2005. 2005 First IEEE International Symposium on, IEEE, 2005, pp. 131–136.
[12] P. Golle and K. Partridge, On the anonymity of home/work location pairs, in International Conference on Pervasive Computing, Springer, 2009, pp. 390–397.
[13] M. Grissa, B. Hamdaoui, and A. A. Yavuz, Location privacy in cognitive radio networks: A survey, IEEE Communications Surveys & Tutorials, 19 (2017), pp. 1726–1760.
[14] M. Grissa, A. A. Yavuz, and B. Hamdaoui, Preserving the location privacy of secondary users in cooperative spectrum sensing, IEEE Transactions on Information Forensics and Security, 12 (2017), pp. 418–431.
[15] J. Guo and N. Balon, Vehicular ad hoc networks and dedicated short-range communication, University of Michigan (2006).
[16] A. Harter, A. Hopper, P. Steggles, A. Ward, and P. Webster, The anatomy of a context-aware application, Wireless Networks, 8 (2002), pp. 187–197.
[17] B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady, Enhancing security and privacy in traffic-monitoring systems, IEEE Pervasive Computing, 5 (2006), pp. 38–46.
[18] R. Jose and N. Davies, Scalable and flexible location-based services for ubiquitous information access, in International Symposium on Handheld and Ubiquitous Computing, Springer, 1999, pp. 52–66.
[19] M. Khaledi, M. Khaledi, S. Sarkar, S. Kasera, N. Patwari, K. Derr, and S. Ramirez, Simultaneous power-based localization of transmitters for crowdsourced spectrum monitoring, in Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, ACM, 2017, pp. 235–247.
[20] J. Krumm, Inference attacks on location tracks, in International Conference on Pervasive Computing, Springer, 2007, pp. 127–143.
[21] S. Li, H. Zhu, Z. Gao, X. Guan, K. Xing, and X. Shen, Location privacy preservation in collaborative spectrum sensing, in IEEE Intl. Conf. on Computer Communications (INFOCOM 2012), 2012, pp. 729–737.
[22] R. K. Martin and R. Thomas, Algorithms and bounds for estimating location, directionality, and environmental parameters of primary spectrum users, IEEE Transactions on Wireless Communications, 8 (2009).
[23] A. O. Nasif and B. L. Mark, Collaborative opportunistic spectrum access in the presence of multiple transmitters, in Global Telecommunications Conference, 2008. IEEE GLOBECOM 2008. IEEE, IEEE, 2008, pp. 1–5.
[24] J. K. Nelson, M. R. Gupta, J. E. Almodovar, and W. H. Mortensen, A quasi EM method for estimating multiple transmitter locations, IEEE Signal Processing Letters, 16 (2009), pp. 354–357.
[25] N. Patwari, A. O. Hero, M. Perkins, N. S. Correal, and R. J. O'Dea, Relative location estimation in wireless sensor networks, IEEE Transactions on Signal Processing, 51 (2003), pp. 2137–2148.
[26] T. S. Rappaport et al., Wireless Communications: Principles and Practice, vol. 2, Prentice Hall PTR New Jersey, 1996.
[27] R. Shokri, J. Freudiger, M. Jadliwala, and J.-P. Hubaux, A distortion-based metric for location privacy, in Proceedings of the 8th ACM Workshop on Privacy in the Electronic Society, ACM, 2009, pp. 21–30.
[28] R. Shokri, G. Theodorakopoulos, J.-Y. Le Boudec, and J.-P. Hubaux, Quantifying location privacy, in Security and Privacy (SP), 2011 IEEE Symposium on, IEEE, 2011, pp. 247–262.
[29] R. Shokri, G. Theodorakopoulos, P. Papadimitratos, E. Kazemi, and J.-P. Hubaux, Hiding in the mobile crowd: Location privacy through collaboration, IEEE Transactions on Dependable and Secure Computing, 11 (2014), pp. 266–279.
[30] M. Srivatsa and M. Hicks, Deanonymizing mobility traces: Using social network as a side-channel, in Proceedings of the 2012 ACM Conference on Computer and Communications Security, ACM, 2012, pp. 628–637.
[31] L. Sweeney, k-anonymity: A model for protecting privacy, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10 (2002), pp. 557–570.
[32] D. J. Torrieri, Statistical theory of passive location systems, IEEE Transactions on Aerospace and Electronic Systems, (1984), pp. 183–198.
[33] N. Vratonjic, K. Huguenin, V. Bindschaedler, and J.-P. Hubaux, How others compromise your location privacy: The case of shared public IPs at hotspots, in International Symposium on Privacy Enhancing Technologies Symposium, Springer, 2013, pp. 123–142.
[34] R. Want, A. Hopper, V. Falcao, and J. Gibbons, The active badge location system, ACM Transactions on Information Systems (TOIS), 10 (1992), pp. 91–102.
[35] J. Zhu and G. D. Durgin, Extended indoor/outdoor location of cellular handsets based on received signal strength at Greenville, SC, tech. rep., Georgia Institute of Technology, 2005. |
| Reference URL | https://collections.lib.utah.edu/ark:/87278/s65r2wzd |



