| Title | Mobility management entity function-as-a-service |
| Publication Type | dissertation |
| School or College | College of Engineering |
| Department | Computing |
| Author | Jindal, Sonika |
| Date | 2019 |
| Description | Serverless cloud computing is an evolving paradigm that brings advantages in terms of scalability, flexibility of resource usage, and cost. Serverless computing means that the services are managed by the cloud provider. It is the next level of offering by cloud providers, where the infrastructure and platform are fully managed by the provider. The services include Function as a Service, databases, networking, and security. Some of the managed services from AWS are Lambda, S3, DynamoDB, and IAM. The control plane for mobile wireless (e.g., cellular) networks faces many challenges with respect to scaling, burst handling, and robustness. In this work, we show how serverless computing is a natural fit for this type of control plane: client devices are modeled using a finite state machine (FSM), and transitions in this state machine map well to serverless functions. Using a prototype of the LTE/EPC Mobility Management Entity (MME), we demonstrate how to architect a mobile control plane using serverless functions and show its practicality. To do this, we first evaluate a serverless platform for the features it promises in terms of scaling and cost. Then we redesign the Mobility Management Entity of LTE as stateless functions running on the serverless platform. These platforms provide APIs for users to run code or functions and return the results. The caller uses different invocation triggers to invoke the functions. This request-and-response model fits well with the network control plane use case. We also use a cloud-managed datastore. We evaluate this implementation to draw conclusions about how a network entity like the MME can benefit from the advantages of a serverless platform. This model promises operational ease with "unlimited" scaling capabilities. With our experiments and evaluations, we demonstrate the feasibility of redesigning old implementations to leverage the benefits of new paradigms. |
| Type | Text |
| Publisher | University of Utah |
| Dissertation Name | Doctor of Philosophy |
| Language | eng |
| Rights Management | © Sonika Jindal |
| Format | application/pdf |
| Format Medium | application/pdf |
| ARK | ark:/87278/s6pp56rv |
| Setname | ir_etd |
| ID | 1713236 |
| OCR Text | MOBILITY MANAGEMENT ENTITY FUNCTION-AS-A-SERVICE

by Sonika Jindal

A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Master of Science in Computer Science. School of Computing, The University of Utah, August 2019. Copyright © Sonika Jindal 2019. All Rights Reserved.

STATEMENT OF DISSERTATION APPROVAL

The dissertation of Sonika Jindal has been approved by the following supervisory committee members: Robert P. Ricci, Chair (9 Apr 2019, Date Approved); Ryan Stutsman, Member (9 Apr 2019, Date Approved); Jacobus Van der Merwe, Member (9 Apr 2019, Date Approved); and by Ross Whitaker, Chair/Dean of the Department/College/School of Computing, and David B. Kieda, Dean of The Graduate School.

ABSTRACT

Serverless cloud computing is an evolving paradigm that brings advantages in terms of scalability, flexibility of resource usage, and cost. Serverless computing means that the services are managed by the cloud provider. It is the next level of offering by cloud providers, where the infrastructure and platform are fully managed by the provider. The services include Function as a Service, databases, networking, and security. Some of the managed services from AWS are Lambda, S3, DynamoDB, and IAM. The control plane for mobile wireless (e.g., cellular) networks faces many challenges with respect to scaling, burst handling, and robustness. In this work, we show how serverless computing is a natural fit for this type of control plane: client devices are modeled using a finite state machine (FSM), and transitions in this state machine map well to serverless functions. Using a prototype of the LTE/EPC Mobility Management Entity (MME), we demonstrate how to architect a mobile control plane using serverless functions and show its practicality. To do this, we first evaluate a serverless platform for the features it promises in terms of scaling and cost.
Then we redesign the Mobility Management Entity of LTE as stateless functions running on the serverless platform. These platforms provide APIs for users to run code or functions and return the results. The caller uses different invocation triggers to invoke the functions. This request-and-response model fits well with the network control plane use case. We also use a cloud-managed datastore. We evaluate this implementation to draw conclusions about how a network entity like the MME can benefit from the advantages of a serverless platform. This model promises operational ease with "unlimited" scaling capabilities. With our experiments and evaluations, we demonstrate the feasibility of redesigning old implementations to leverage the benefits of new paradigms.

Dedicated to my loving husband, Rahul, for all his support, and to my little son, Ishaan, for being so brave.

CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF TABLES

CHAPTERS
1. THESIS STATEMENT
2. INTRODUCTION
3. RELATED WORK
4. BACKGROUND RESEARCH AND MOTIVATION
  4.1 Inherent auto-scalability
  4.2 Cost comparison
    4.2.1 Assumptions in cost calculations
  4.3 Faster deployment
  4.4 Cloud regions
  4.5 3GPP timers
  4.6 A note on 5G
  4.7 Summing up
5. DESIGN
  5.1 Design choices
    5.1.1 Asynchronous response from MME
    5.1.2 State passing
    5.1.3 Optimistic concurrency control
    5.1.4 Timer
    5.1.5 Evaluation with attach procedure
    5.1.6 SCTP connection
  5.2 Platform choices
    5.2.1 Cassandra
      5.2.1.1 Consistency
      5.2.1.2 Gocql
    5.2.2 OpenFaas
    5.2.3 AWS Lambda
    5.2.4 DynamoDB
    5.2.5 GoLang
  5.3 Latency
  5.4 Designing the eNB App
6. IMPLEMENTATION
  6.1 Components
    6.1.1 eNB App
    6.1.2 MME function
    6.1.3 Timer function
    6.1.4 Datastore
    6.1.5 SPGW-HSS stub
  6.2 Failure cases
    6.2.1 Attach request resent
    6.2.2 SPGW response timeout
    6.2.3 Service level agreement
7. EVALUATION
  7.1 Database
  7.2 MME solution
    7.2.1 OpenFaas functions
    7.2.2 AWS Lambda
    7.2.3 Concurrent execution
    7.2.4 Sensitivity analysis with DynamoDB
  7.3 Comparison with other solutions
  7.4 Cost
    7.4.1 Asynchronous implementation
    7.4.2 Synchronous implementation
    7.4.3 Other control messages
  7.5 Testing limitations
  7.6 Final remarks
8. CONCLUSION
  8.1 Discussion
  8.2 Disadvantages of FaaS
  8.3 Future work
  8.4 Conclusion

APPENDICES
A. MESSAGE PASSING
B. SOFTWARES AND INSTALLATION
REFERENCES

LIST OF FIGURES
2.1 Serverless pipeline
2.2 Abstraction levels
2.3 EPC architecture
4.1 AWS Rps, containers, and VMs with concurrency and 10k reqs
4.2 Azure and GCP Rps with concurrency and 10k reqs
4.3 Response times of 10k reqs on AWS Lambda
4.4 UE-initiated service request procedure
4.5 Network-initiated service request procedure
5.1 Message flow during attach procedure
5.2 eNB Test App with workers
6.1 MME FaaS architecture and different components
6.2 Message flow during attach procedure with MME FaaS
7.1 Cassandra insert, update, and read
7.2 Cassandra inserts with new connection for each query
7.3 OpenFaaS sync async
7.4 Prometheus logs of OpenFaaS scaling
7.5 MME function with synchronous responses
7.6 AWS Lambda different datastore
7.7 DynamoDB RCU error
7.8 DynamoDB WCU error
7.9 AWS Lambda and CassandraDB
7.10 AWS Lambda and DynamoDB
7.11 AWS minimum, average and median comparison with DynamoDB and Cassandra
7.12 AWS maximum comparison with DynamoDB and Cassandra
7.13 OpenFaas with concurrent workers
7.14 Sensitivity analysis with DynamoDB

LIST OF TABLES
3.1 Related work
4.1 Resource instantiation delay using traditional model
4.2 Latency across regions in Azure (ms)
4.3 3GPP timers for attach/detach
7.1 AWS services cost
7.2 Cost breakup

CHAPTER 1
THESIS STATEMENT

Serverless computing [59] provides a promising platform for deploying the control plane for the core of a mobile network (Evolved Packet Core), in terms of scalability and cost.

CHAPTER 2
INTRODUCTION

Serverless computing is an execution model in which the cloud provider manages the provisioning of resources dynamically. The management and low-level infrastructure are abstracted away from the developers. The cloud provider manages the scaling of resources as per the application load. Resources are not preallocated to applications; rather, they are provisioned dynamically as needed. All the services are managed by the cloud provider. A simple serverless pipeline is shown in Figure 2.1; it has multiple managed services connected by event triggers.

Function-as-a-Service, or FaaS, is a form of serverless computing that provides a platform for customers to develop, run, and manage application functionality without the complexity of building and maintaining the infrastructure. The application developer leverages this model to deploy independent actions or business logic in the form of stateless functions. These functions are modular and can be deployed and run independently by the cloud provider. The functions run in a sandboxed environment such as containers or lightweight VMs. The exact details of the environment in which functions execute are proprietary to the cloud provider. Such a runtime environment is expected to start within milliseconds, execute the task, and destroy the setup for this function.
Therefore, instead of setting up monolithic servers to handle the potential peak load, the application logic can be broken down into pieces and deployed as functions, which can scale automatically and independently.

[Figure 2.1. Serverless pipeline: Producer → Managed Queue → Processing → Database]

Functions are the highest level of abstraction given to the user by the cloud provider, as shown in Figure 2.2. With its promise of inherent scalability, availability, pay-per-invocation billing, and higher developer velocity, serverless computing is a fair choice for implementing some network functions. In this work, we examine how the control plane of the LTE/EPC mobile network—and future 5G networks—can be implemented using serverless functions.

The overall LTE network architecture consists of the Radio Access Network (RAN) and the Evolved Packet Core (EPC), as shown in Figure 2.3. The RAN consists of eNodeB (eNB) "base stations" and User Equipment (UE) such as phones, IoT devices, etc. Functionality in the EPC is divided into a control plane and a data plane. The control plane handles authentication of UEs (through the Home Subscriber Service—HSS), attaching them to the network and managing their mobility (Mobility Management Entity—MME), and policy and billing (Policy and Charging Rules Function—PCRF). The main task of the control plane is to configure the data plane, which forwards the actual traffic between UEs and other networks, such as the Internet.

We focus on the MME, as it is the key element of the control plane for managing the state of UEs. With the huge growth in IoT and mobile devices, the signaling traffic in the control plane has increased considerably, and it is projected to increase further [37]. If an MME becomes overloaded, new devices cannot attach, existing devices may not be able to migrate to new base stations as they move, and the network degrades even if there is plenty of data plane capacity.
For this reason, commercial networks place a great emphasis on making these devices reliable and scalable, and vastly over-provision them (60-100% as per [50]) to respond to bursts in traffic.

[Figure 2.2. Abstraction levels: bare-metal → VMs → containers → functions]

[Figure 2.3. EPC architecture: UE and eNodeB in the Radio Access Network; MME pool, HSS, SGW, and PGW in the Evolved Packet Core, connected to the Internet]

Traditionally, MMEs have been large, extremely expensive, dedicated hardware devices designed as monolithic services. A provider might typically have O(10) MMEs to serve a country the size of the United States and must put significant resources into ensuring that those few devices do not fail. Recently, providers have been moving towards hosting these in NFV environments to save money and make them easier to upgrade, but the basic monolithic architecture has remained the same, making it difficult (and expensive) to scale, replicate, etc. these services. In the NFV world, network capabilities are virtualized and run on general-purpose servers.

Extending the virtualization of network elements to the next level, we build a cloud-native MME solution that can run on a serverless platform. In this work, we analyze serverless platforms to see if they are a good fit for an LTE network control plane entity. We approach the problem of building a mobile control framework from the cloud-native direction, using serverless computing at the base. We build a simple model of some MME functionality. To do this, we first evaluate the serverless platform for the features it promises as part of background research and motivation. Then we redesign the MME as stateless functions (using a remote datastore) running on the serverless platform. The MME maintains and manages the state of a UE across different messages; we make it stateless by having the functions store and retrieve this state in a remote database.
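The stateless design just described can be made concrete with a minimal Go sketch (our own illustration, not code from the prototype): each invocation loads the UE's state from a datastore, applies one FSM transition, and writes the state back, so any function instance can serve any UE. The state names and messages are a simplified toy slice of the EMM state machine, and an in-memory map stands in for the cloud datastore.

```go
package main

import (
	"fmt"
	"sync"
)

// Store stands in for a cloud datastore (e.g., DynamoDB or Cassandra);
// a real function would issue remote reads and writes instead.
type Store struct {
	mu sync.Mutex
	m  map[string]string
}

func NewStore() *Store { return &Store{m: make(map[string]string)} }

func (s *Store) Get(k string) string { s.mu.Lock(); defer s.mu.Unlock(); return s.m[k] }
func (s *Store) Put(k, v string)     { s.mu.Lock(); defer s.mu.Unlock(); s.m[k] = v }

// transitions is a toy slice of the UE state machine: the current state
// and an incoming message determine the next state.
var transitions = map[[2]string]string{
	{"DEREGISTERED", "AttachRequest"}: "ATTACHING",
	{"ATTACHING", "AttachComplete"}:   "REGISTERED",
	{"REGISTERED", "DetachRequest"}:   "DEREGISTERED",
}

// HandleMessage is the stateless "MME function": all per-UE state lives
// in the store, so nothing survives between invocations in the instance.
func HandleMessage(st *Store, imsi, msg string) (string, error) {
	cur := st.Get(imsi)
	if cur == "" {
		cur = "DEREGISTERED" // an unknown UE starts deregistered
	}
	next, ok := transitions[[2]string{cur, msg}]
	if !ok {
		return cur, fmt.Errorf("no transition from %s on %s", cur, msg)
	}
	st.Put(imsi, next)
	return next, nil
}

func main() {
	st := NewStore()
	s1, _ := HandleMessage(st, "001010123456789", "AttachRequest")
	s2, _ := HandleMessage(st, "001010123456789", "AttachComplete")
	fmt.Println(s1, s2) // ATTACHING REGISTERED
}
```

Because each call re-reads the state, the sketch also shows why a slow or unavailable datastore directly affects per-message latency, a trade-off examined in the evaluation chapters.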
Then we evaluate the MME FaaS solution on OpenFaaS [44] and AWS Lambda [53] with different datastores.

CHAPTER 3
RELATED WORK

In the past, people have taken different approaches to redesigning the LTE/EPC components, whether the data plane, the control plane, or both. In PEPC [46], a state-driven packet core is designed such that all the state associated with a user is consolidated into a single place, called a slice. Any control or data traffic for this user is handled by the same slice. In C3PO [40], the bottlenecks in mobile core packet performance are addressed by separating and independently scaling the data plane and control plane. It is built on the observation that the S-PGW (mainly a data plane component) in the EPC handles a lot of control plane traffic, which becomes a bottleneck for data traffic [47].

There are some scalable control-plane-specific solutions, such as DMME [4] and SCALE [22]. The main idea behind DMME is to split the control plane processing task among multiple servers, which maintain the UE mobility state independently. The states are transferred between DMME replicas. In SCALE [22], the control plane functionality is divided into a front-end load balancer, which maintains the standard interfaces, and a back-end elastic MME processing cluster. The MME processing cluster is a pool of VMs managed by the load balancer. Consistent hashing is used to spread the load among the VMs in the pool. In this model, the state information about each UE is stored at the VM.

There have also been attempts to build a cloud-native solution for the MME, such as CNS-MME [3]. CNS-MME also makes the MME stateless and runs it as microservices in a VNF pool. Since it is based upon a cloud-native architecture, CNS-MME provides auto-scaling and availability. These stateless microservices are instantiated behind an L7 load balancer that classifies packets and forwards them to the respective VNF.
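The consistent-hashing idea that SCALE uses to spread UEs across its VM pool can be sketched generically in Go (our illustration of the general technique; the node names, virtual-replica count, and hash function are placeholders, not SCALE's actual scheme): each VM is placed at several points on a hash ring, and a UE identifier is served by the first point at or after its own hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Ring is a minimal consistent-hash ring: each backend VM occupies
// several pseudo-random points; a key is owned by the first point at
// or after its hash, wrapping around the ring.
type Ring struct {
	points []uint32
	owner  map[uint32]string
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func NewRing(vms []string, replicas int) *Ring {
	r := &Ring{owner: make(map[uint32]string)}
	for _, vm := range vms {
		for i := 0; i < replicas; i++ {
			p := hash32(fmt.Sprintf("%s#%d", vm, i))
			r.points = append(r.points, p)
			r.owner[p] = vm
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Lookup maps a UE identifier (e.g., a GUTI) to a VM; the same UE is
// always routed to the same VM as long as the pool is unchanged.
func (r *Ring) Lookup(guti string) string {
	h := hash32(guti)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.points[i]]
}

func main() {
	ring := NewRing([]string{"mme-vm-1", "mme-vm-2", "mme-vm-3"}, 8)
	fmt.Println(ring.Lookup("guti-4711"))
}
```

The key property for an MME pool is stability: adding or removing one VM remaps only the keys near its ring points, so most UEs keep hitting the VM that holds their state.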
Our approach starts from a different direction: rather than starting from the telecom-inspired monolithic design of today's MME and adapting it to cloud technologies, we start from a purely cloud-native design. The work we present here represents an early proof-of-concept of this design; while it is not a complete EPC or MME implementation, by using a different fundamental design, further development of this idea will result in a control plane with substantially different scaling properties. In Table 3.1, we show the main differences between MME FaaS and other work done in this area.

Different designs address different shortcomings by rearchitecting all or a few components of the EPC. Some of the above designs are complex because of built-in scaling and load balancing. A cloud-native approach, on the other hand, keeps the design simple by leaving the scaling and management tasks to the cloud provider. Such designs may require reliance on an external datastore. While storing the state in memory at the VM can result in faster access than an external datastore, it can hinder state migration and suffers from a single point of failure: if the VM crashes, all the state is lost. Storing the state in the cloud has several other advantages compared to local storage. The availability and scalability of cloud storage solutions are unparalleled. Cloud datastores use redundancy to minimize the impact of points of failure. A distributed database provides the scalability, resiliency, and performance required for a network function, which are difficult to achieve with local state storage. Separating state management from message processing helps in achieving auto-scaling of independent components. With the industry trending towards the cloud, providers are getting ready to offer customers best-in-class features at attractive prices. This means that it makes more sense than ever before to explore cloud-native solutions.
Thus, we build a serverless solution for the MME. In the serverless model, we leave load balancing to the cloud provider and rely on the auto-scalability guarantees of the serverless platform. The functions are executed in short-lived instances (containers, VMs, or a proprietary runtime), which are alive only for the duration of the execution. Although the framework might keep them warm even after the task is done, the next execution does not depend upon any state from the previous call. Thus, for any state that might be needed across different instantiations, an external store is required. To our knowledge, there are no solutions for NFV or the mobile network control plane implemented as FaaS. We will go through the specifics of the implementation and perform evaluations with the serverless-based solution.

Table 3.1. Related work
| Name | Target Unit | Main Idea | Scaling method | Load Balancing | State Management |
| PEPC | EPC | Consolidating per-device state in a single location and re-organizing EPC functions for efficient access to this consolidated state | Add more cores for job processing | Built into the design to assign jobs to different slices; each UE is handled by the same slice every time | In slice |
| C3PO | EPC | Control and user separation; data plane scales with the increase in workers; control plane programs sessions, bearers, and QoS into the data plane | Add more cores for data processing | N/A | In VNF |
| DMME | MME | Distributed replicas; split the control plane processing among a large number of servers that manage the user mobility state independently; state machine checkpointing | Not tested | Built into the design to assign UE traffic to different workers based upon a unique UE id | External datastore |
| SCALE | MME | Front-end load balancer that maintains standard interfaces and a back-end elastic MME processing cluster; state replication and placement on replica VMs | Pro-actively provision VMs based on expected load | Built into the design to assign UE traffic to different replica VMs based upon GUTI | In VM |
| CNS-MME | MME | Cloud-native architecture; pool of lightweight VNFs in the form of microservices; L7 load balancer to distribute the requests to the VNFs | Auto-scaling as provided by the cloud provider | Separate L7 load balancer | External datastore |
| MME FaaS | MME | Cloud-native architecture; stateless event-driven independent functions executed in short-lived instances | Auto-scaling as provided by the cloud provider | Serverless platform handles load by scaling up and down | External datastore |

CHAPTER 4
BACKGROUND RESEARCH AND MOTIVATION

Some of the major driving factors for the adoption of serverless platforms are server abstraction, event-driven execution, scaling, and cost. The serverless architecture provides a complete abstraction of servers away from the developers, which greatly reduces the operational burden. Users only have to think about business logic rather than the underlying infrastructure and plumbing. Services are written as event-driven functions. This event-driven nature of the services aligns with the signaling model of the control plane entities of mobile networks. In contrast to monolithic applications and microservices, functions are executed only when there is a need. For all large-scale applications, scaling is a major concern. In the FaaS model, scaling is the responsibility of the cloud provider, and it is completely dynamic. Applications are written in the form of independently scalable functions. Therefore, rather than scaling the entire app, only the necessary functions can be scaled. Also, billing is based on consumption and executions, not on instance size; thus, you never pay for idle resources.
Telecom networks contain heavy purpose-built infrastructure comprising devices from different third-party vendors. This results in a lot of integration effort, which reduces the agility of deployments. Increasing capacity is costly and slow, as it involves the physical installation and configuration of new devices, and many services are designed in a monolithic fashion that makes it difficult to flexibly balance the load. While network operators are moving towards softwarization (VNF and SDN) and Infrastructure-as-a-Service to bring agility, IaaS requires overprovisioning to meet bursty traffic demands. Thus, using managed services and a serverless platform presents an effective way to reap the benefits of the cloud, mainly scalability and cost-effectiveness.

To strengthen our argument, we performed some experiments on the features relevant to the use case of a network node implementation. We first analyze the scaling and cost aspects of a serverless platform, because this supports our thesis of a scalable and cost-effective network node. Next, we look at the time restrictions and how a serverless deployment can perform within those limits by going through the timers, instantiation time, and cloud regions.

4.1 Inherent auto-scalability

For FaaS, scaling is handled by the cloud provider. We performed scalability tests with AWS Lambda [53], Azure Functions [20], and Google Cloud Functions [33]. Our first experiment was to examine the scaling provided by AWS Lambda using the method proposed by Wang L. et al. [52]. The method involves reading 'procfs' from inside the function to identify the container and VM instance where the function is executed. In Figure 4.1, we plot the total number of containers created for processing 10,000 requests at different concurrency levels. We observe that as the concurrency is increased, the number of containers increases, i.e., the system scales up to meet the increased requirement.
This scale-up, in turn, increases the throughput. The latency of each request ranges roughly from 65 ms to 75 ms. We see the same pattern of an increasing number of requests handled per second with Azure Functions and Google Cloud Functions as concurrency is increased (Figure 4.2). We could not read 'procfs' reliably enough to count containers and VMs on Azure and Google Cloud. Therefore, based on the results of the throughput test, we assume that Azure and GCP employ scaling triggers similar to AWS's, and thereby throughput is increased.

The plot in Figure 4.3 shows the response time of 10k requests at a concurrency level of 200 for AWS Lambda. As we can see, there are a few spikes at the start and after 3000 requests that indicate the spawning of new containers to handle the increased load. Once the scale-up is performed, the response times stabilize. The infrastructure for network functions is built with the highest load in mind, although the average utilization is much lower. The serverless model makes scale-up and scale-down dynamic without the need for preprovisioned resources. Thus, it can be used for network functions if the scaling is seamless and does not affect performance.

[Figure 4.1. AWS Rps, containers, and VMs with concurrency and 10k reqs]

[Figure 4.2. Azure and GCP Rps with concurrency and 10k reqs]

[Figure 4.3. Response times of 10k reqs on AWS Lambda]

4.2 Cost comparison

Since the infrastructure is fully managed, the cost can vary depending upon usage. In this section, we try to understand the cost implications of using a fully managed service for an LTE network control plane node. We only calculate a rough estimate in this section and will perform more precise calculations during our evaluations with the solution.
We perform this preliminary loose estimation to make sure that the solution is an affordable one; it serves as a sanity check on the price of a FaaS application for a control plane. In the US, as of September 2018, there were 215K towers [23]. According to data from the top telecommunication companies, there are about 400M subscribers [57]. As per the researchers in [38], there are about 50 EPCs, leading to at least 50 SGWs and 50 PGWs. They estimate 3 x 50 MMEs, considering 3 MMEs in an MME pool. Therefore, each MME handles data from 2.7M UEs. According to [36], there is a session establishment (service request) from a UE every 106.9 seconds on average. Using this data, each MME sees 25K service requests per second (in 1 sec there are 1/106.9 requests per UE, so for 2.7M UEs there are 2700000/106.9 ≈ 25K requests per second). Thus, for a month, there will be 64.8M service requests (25 x 60 x 60 x 24 x 30 K). As shown in Figure 4.4, there are 3 calls (Service Request, Initial Context Setup Response, and Modify Bearer Response) in the case of UE-triggered service establishment, while there are 4 calls in the case of a network-triggered Service Request due to the Downlink Data Notification (shown in Figure 4.5) to the MME; each of these triggers a function call. Since [36] counts requests from the UE, let us consider 3 calls to the MME. Therefore, the total number of calls to the serverless function will be 3 times 64.8M = 194.4M. Based upon this data, we computed the price of a serverless MME on AWS Lambda. As per AWS published pricing [13], there are two components involved when deploying serverless functions:

1. Requests: $0.20 per million requests. For 194.4M requests we get 0.20 x 194.4 = $38.88 per month.

2. Duration: $0.00001667 for every GB-sec.
Assuming 128MB and 10ms of compute time, we get 0.00125 GB-sec per request ((128/1024) GB * (10/1000) sec). Thus, 0.00125 * 194.4M = 243000 GB-sec per month. Therefore, 243000 * 0.00001667 ≈ $4 per month. Thus, the total price for an EPC control plane component (for the service establishment procedure) is about $43 per month.

Figure 4.4. UE-initiated service request procedure

Figure 4.5. Network-initiated service request procedure

This cost is competitive compared to Amazon EC2 Dedicated Instances [10]. For example, it costs $1440 per month for an m5.large, which is a 2 vCPU/8G mem instance. We chose an EC2 M5 instance [11] for the cost comparison because an M5 offers balanced compute, memory, and networking resources for general-purpose workloads. Such an instance is good enough to fulfill the requirement of a network function, but with added resources for availability, load balancing, etc. Cloud providers have different models of instance assignment, and hence costs can vary hugely depending upon the requirements. For example, T2 instances [12] are the lowest-cost Amazon EC2 instance option and can cost as low as $8.352 per month.
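The back-of-the-envelope arithmetic above can be reproduced as a quick sketch; the figures are the 2018 list prices quoted in the text:

```go
// Reproduction of the Section 4.2 Lambda estimate. Pricing constants
// are the published 2018 numbers cited in the text; the 194.4M calls,
// 128MB, and 10ms figures are the assumptions stated above.
package main

import "fmt"

const (
	pricePerMillionReq = 0.20       // USD per 1M requests
	pricePerGBSec      = 0.00001667 // USD per GB-second
)

// lambdaCost estimates the monthly bill for `reqs` invocations, each
// using memMB megabytes of memory for durMS milliseconds.
func lambdaCost(reqs, memMB, durMS float64) (reqCost, durCost float64) {
	reqCost = reqs / 1e6 * pricePerMillionReq
	gbSec := reqs * (memMB / 1024.0) * (durMS / 1000.0)
	durCost = gbSec * pricePerGBSec
	return
}

func main() {
	req, dur := lambdaCost(194.4e6, 128, 10)
	fmt.Printf("requests: $%.2f  duration: $%.2f  total: $%.2f\n",
		req, dur, req+dur)
	// prints: requests: $38.88  duration: $4.05  total: $42.93
}
```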
But the total cost of ownership in the case of an IaaS-based deployment [56] can be much higher due to the management of scaling, clustering, and load balancing. Real-world workloads can be unpredictable, leading to over-provisioning of IaaS-based compute to handle the fluctuations. To deliver the required availability, multiple load-balanced EC2 instances might be required in different availability zones. Having more instances also leads to under-utilization of resources.

4.2.1 Assumptions in cost calculations

While calculating the cost, we cannot take all the factors into account upfront. We update the costs as per the actual usage with our implementation. During background research, we focused only on the cost of the serverless invocations and made the following assumptions:

1. The function will not need more than 128MB of memory.

2. The duration of execution will be known only when the actual function is executed. For now, we assume 10ms for each run.

3. The price of the datastore is not taken into account.

4. The API gateway cost is an additional component, which is required for having a REST endpoint to the function.

4.3 Faster deployment

With serverless computing, the abstraction level has moved from the infrastructure level to the application level. When deploying a serverless application, one does not need to provision containers, set system policies and availability levels, or handle any back-end server tasks. To handle huge workloads, a typical monolithic deployment requires multiple VMs and load balancers. Thus, we performed experiments to determine the time it takes to launch VMs and load balancers in AWS. Table 4.1 shows that the provisioning of resources introduces a huge delay in deployment. In the case of serverless, one does not need to worry about spinning up and configuring instances. We can see in Figure 4.3 that the spin-up of new containers affects only a few calls, initially and at 3000 requests. The maximum delay experienced by any request is about 1.4s.
4.4 Cloud regions

We examined the latency between different regions for HTTP requests in Azure (Table 4.2). We selected West US 2 and East US 2, which are the farthest apart, as well as the Central US region. We created VMs in these three regions and performed HTTP latency tests using ApacheBench [6] and Python's SimpleHTTPServer [45]. A call within the same region takes 4 ms, while a call processed by a node element in the farthest region takes 149 ms. Typically, each big cloud provider has about 10 regions across the US [21], [18], [34]. Thus, we can place an MME in each region, and each can scale independently. This ensures that the control plane processing is done as close as possible to the user.

Table 4.1. Resource instantiation delay using traditional model

Operation                                Duration (min)   Comments
EC2 VM launch                            1-3              Varies with size of VM and EC2 load
Load Balancer launch                     3-5              -
Instance registration and health check   -                Per VM instance behind this Load Balancer

Table 4.2. Latency across regions in Azure (ms)

            Central   West US 2   East US 2
Central        4         78          81
West US 2     78          4         149
East US 2     81        149          4

4.5 3GPP timers

3GPP defines timers that set the maximum limits for different procedures to execute. The timers are defined for different entities and methods. The timers involved in the attach procedure are shown in Table 4.3. A UE can thus wait up to 15 s for an attach request to succeed, so we need to build an MME that can perform its processing within this time limit.

4.6 A note on 5G

Another point worth noting is that 5G's specifications are taking shape. The components being defined are fairly different from those in the 4G world. As per the 3GPP specs for the 5G core (TS 23.501 [2]), a Service Based Architecture (SBA) is followed for connecting the control plane entities using Service Based Interfaces (SBIs). In SBA, software is broken down into communicating services.
With this approach, the services can be mixed and matched from different vendors into a single product. The 5G core is a mesh of interconnected services. The protocol used for communication between components is HTTP [7], and some work is being done to evolve HTTP to support 5G requirements [30]. Thus, a serverless approach to a network entity supporting 4G control plane functionality gives a good lead into 5G innovations. With this effort, we hope to demonstrate that serverless can be a feasible approach for 5G entities as well. In the white paper [49], the 5G Software Network Working Group presents insights about a 5G cloud-native design. They mention that some telco-grade features need to be integrated into current container management frameworks. Such features include data plane acceleration, support for multiple networks per service to enable service function chaining, etc. But with the efforts going into shaping the 5G specs to follow a service-based architecture, it is clear that a cloud-native approach is a desirable solution for the deployment of 5G network functions.

Table 4.3. 3GPP timers for attach/detach

Entity  Timer  Trigger                             Stop                           Time  On Expiry
UE      T3410  Attach Request sent                 Attach Accept/Reject received  15s   Start T3411 or T3402
MME     T3450  Attach Accept sent                  Attach Complete received       6s    Retransmission of Attach Accept
UE      T3421  Detach Request sent (UE initiated)  Detach Accept received         15s   Retransmission of req
MME     T3422  Detach Request sent (nw initiated)  Detach Accept received         6s    Retransmission of req

4.7 Summing up

We have performed the background experiments and conclude that serverless platforms can provide a powerful medium to deploy an EPC control plane. This leaves the control of orchestration and scaling entirely to the cloud providers. However, current platforms like AWS Lambda, Azure Functions, and Google Cloud Functions are limited in terms of flexibility to deploy custom containers.
The delay introduced by the platform itself (internal pipelining in the framework and runtime environment) also adds to the overall user-perceived latency. Another factor that adds delay is the cold start of the containers or VMs. Since these platforms are evolving, we believe that they will mature to customer requirements and become a better fit for latency-sensitive VNF deployments. The authors of [52] observed improvements and fixes when they repeated their experiments after 6 months. Based on our analysis of cost, latency, and scalability, a platform like AWS Lambda seems to be a viable choice to reliably support high-throughput control plane functions.

CHAPTER 5

DESIGN

Redesigning the MME as a serverless function is challenging because of the stateless nature of functions. While building this solution, we need to think about many different interacting components, and each problem can have different solutions. We picked some solutions based upon our experience, some based on our tests, some after multiple iterations, and some based upon popularity. In this chapter, we discuss the choices we adopted while designing the system.

5.1 Design choices

To handle the state without affecting the fundamental aspects of a functional MME, we incorporated different design ideas while building MME FaaS. We discuss these design choices in this section.

5.1.1 Asynchronous response from MME

Serverless functions are invoked to execute a task and then exit. In the usual case, they are not expected to generate any new request to the caller. A network control plane, however, must process the incoming request and generate a new response/request to the caller or other network components. This can be achieved by using asynchronous responses from the functions. OpenFaaS supports async replies by passing the X-Callback-Url header in the HTTP POST call.
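A minimal sketch of such an asynchronous invocation follows; the gateway host, callback host, and function name "mme" are illustrative placeholders:

```go
// Sketch: invoking an OpenFaaS function asynchronously. The gateway
// accepts the request, returns 202 Accepted immediately, and later
// POSTs the function's result to the URL given in X-Callback-Url.
// All host names and the function name are illustrative placeholders.
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newAsyncRequest builds a POST to the OpenFaaS async endpoint for
// the given function, asking for the reply at callbackURL.
func newAsyncRequest(gateway, function, callbackURL string, body []byte) (*http.Request, error) {
	url := fmt.Sprintf("%s/async-function/%s", gateway, function)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// The function's output will be delivered asynchronously here.
	req.Header.Set("X-Callback-Url", callbackURL)
	return req, nil
}

func main() {
	req, err := newAsyncRequest("http://gateway:8080", "mme",
		"http://enb-app:9000/response", []byte(`{"msg":"attach-request"}`))
	if err != nil {
		panic(err)
	}
	// In the real flow the caller would now do:
	//   resp, _ := http.DefaultClient.Do(req)
	// and expect a 202 Accepted status from the gateway.
	fmt.Println(req.URL.String())
	// prints: http://gateway:8080/async-function/mme
}
```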
The function does not need to be modified to use async replies; only the HTTP endpoint for the function changes, from http://gateway/function/{function_name} to http://gateway/async-function/{function_name}. Similarly, AWS Lambda supports async responses using dead letter queues [15]. Another reason for an asynchronous implementation is that even though the function timeout is 15 min in AWS Lambda, the AWS API Gateway times out after 29 sec. Therefore, the calling entity (eNB, HSS, or SGW) might get a failure message even though the MME function is still waiting for a response from other entities.

5.1.2 State passing

The S1AP messages exchanged between the MME and eNB are small, < 350 bytes. The stored state of the UE (the UE context structure) is less than 300 bytes. Since the functions are stateless, no UE context is stored in memory; all the state is saved to the datastore. When the function is invoked by an incoming request/response, the state is retrieved using the MME-UE-S1AP-ID and updated before exiting. A few of these datastore accesses can be avoided using the "state passing" concept between components. For example, when the MME sends an Update Location Request to the HSS, it passes the state to the HSS. The HSS updates the location of this UE at its end and sends an Update Location Answer back to the MME along with the state. Now this MME instance does not need to retrieve the UE entry from the datastore; it updates the entry in the database before sending the next message. This uses the "Continuation-Passing Style" [55] concept from functional programming.

5.1.3 Optimistic concurrency control

An MME UE S1AP Application Protocol Identity (MME-UE-S1AP-ID) is allocated when a new UE-associated logical connection is created in an MME. This ID uniquely identifies a logical connection associated with a UE over the S1 interface within the MME. To create an ID that is unique across different instances of the MME function, we use the "Optimistic Concurrency Control" (OCC) method [58].
Using this approach, we generate a random number for the ID and insert the UE context with this MME-UE-S1AP-ID only if it does not already exist. If another UE with this ID exists in the database, we regenerate the ID and try again. We use a 64-bit unsigned integer for the ID, which makes the probability of a clash very small: each MME typically handles 2.7M UEs, so a 64-bit unsigned ID can create enough unique identifiers. The use of OCC guarantees that no two MME instances generate the same ID for different UEs.

5.1.4 Timer

3GPP defines timers for different control procedures on the UE as well as the network side. For mobility management, LTE EPS mobility management (EMM) timers are defined. The respective entity starts an EMM timer after sending a request and waits for a response. If the response does not arrive before timer expiration, corrective action is taken. To implement a timer in a serverless setting, we created a lightweight function that sleeps for the timer duration and calls the MME function on expiry. Commercial serverless platforms support timer triggers for functions; OpenFaaS does not have this feature, providing only scheduled invocation using Kubernetes cron jobs. Therefore, we use the timer function for our use case. It can also be used to check for responses from other EPC components like the HSS and SGW: if a response from the HSS/SGW takes too long, a timeout is triggered and actions can be taken based upon the current state of the UE. This is useful because the interactions with other components are asynchronous.

5.1.5 Evaluation with attach procedure

The MME handles various control procedures like attach, detach, handovers, authentication, location updates, and bearer management. In this work, we consider only the attach procedure because an attach is itself a collection of many messages: authentication, session creation, default bearer creation, and location update.
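The ID-allocation retry from Section 5.1.3 can be sketched as follows; for a runnable illustration, a mutex-guarded map stands in for Cassandra's conditional insert, and all names are our own:

```go
// Sketch of the OCC-based MME-UE-S1AP-ID allocation (Section 5.1.3).
// A real deployment would use the datastore's conditional insert;
// here an in-memory map stands in so the idea is runnable anywhere.
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

type store struct {
	mu  sync.Mutex
	ctx map[uint64][]byte
}

// insertIfAbsent mimics a conditional insert: it succeeds only when
// no context already exists under the given ID.
func (s *store) insertIfAbsent(id uint64, ueCtx []byte) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.ctx[id]; exists {
		return false
	}
	s.ctx[id] = ueCtx
	return true
}

// allocateID draws random 64-bit IDs until the conditional insert
// succeeds, so no two callers can end up with the same ID.
func allocateID(s *store, ueCtx []byte) uint64 {
	for {
		id := rand.Uint64()
		if s.insertIfAbsent(id, ueCtx) {
			return id
		}
	}
}

func main() {
	s := &store{ctx: make(map[uint64][]byte)}
	id := allocateID(s, []byte("ue-context"))
	fmt.Printf("allocated MME-UE-S1AP-ID %d\n", id)
}
```

With a 64-bit ID space and only millions of UEs per MME, the retry loop almost never iterates more than once.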
The messages involved in an attach procedure are shown in Figure 5.1. There are 4 messages that originate from the UE/eNB and are destined for the MME. These messages trigger calls to the serverless MME function. Therefore, we need to take these 4 external interactions/triggers into account while calculating the latency and cost.

5.1.6 SCTP connection

The eNB connects to the MME over an SCTP connection. Serverless functions are invoked by many different triggers, such as HTTP calls, database events, CloudWatch logs, etc. [19]. To simplify the implementation, we assume that the eNB can talk directly to the MME using REST APIs. If an SCTP connection is a necessity, an SCTP proxy can be built and run near the eNB, where the SCTP connections from the eNB terminate and the corresponding HTTP requests are originated to the serverless functions.

5.2 Platform choices

To evaluate our solution, we selected some platforms and frameworks. We discuss the reasons behind our selection in this section.

Figure 5.1. Message flow during attach procedure

5.2.1 Cassandra

Since the functions are stateless, having a stable and scalable database is very important. Apache Cassandra is a highly scalable, high-performance distributed NoSQL database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. As per the benchmarking tests in [25], the Cassandra [5] database performs better than other NoSQL databases.
The tests also show that Cassandra's performance scales linearly as the number of nodes is increased. All Cassandra nodes are equal, and any of them can function as a coordinator that communicates with the client application. Without master nodes, there is no single point of failure, which allows Cassandra to be always (or almost always) available. Thus, we used Cassandra for our solution.

5.2.1.1 Consistency

Cassandra is an AP system according to the CAP theorem [54], providing high availability and partition tolerance. Cassandra does have flexibility in its configuration, though, and can perform more like a CP (consistent and partition-tolerant) system, depending on the application requirements. This is achieved using two consistency features: tunable consistency and linearizable consistency. With tunable consistency, one can set the consistency level for each read and write request; thus, Cassandra gives a lot of control over how consistent the data is. Since we have a single-node cluster deployment, we use a consistency level of ONE [26], which means a write must be written to the commitlog and memtable of at least one replica node. In general, the choice of consistency is dictated by the network's requirements for latency versus accuracy of the data.

5.2.1.2 Gocql

We used the gocql [32] package to implement the Cassandra client. The gocql driver is asynchronous but provides a synchronous API, and it can execute many queries concurrently. Since each MME function has a single read/write active at any point in time, a synchronous API is a good fit: the function is allowed to exit only when the data operation completes (or times out). Some connections to the database can take longer, and the function timeout can be adjusted as needed.

5.2.2 OpenFaaS

We used OpenFaaS for our evaluation because it provides flexibility of scaling, deployment of custom images, an active developer community, easy setup, and good documentation.
OpenFaaS was deployed on a CloudLab server with a single-node Kubernetes cluster. This allowed us to understand the scaling of MME functions and other metrics. The asynchronous response feature of OpenFaaS fits well with the request/response model of a network component. OpenFaaS functions run in containers, which allows a program written in any language to be packaged as a function by building a Docker container and deploying it. This is useful because one can fully reuse existing code to consume a wide range of web service events without rewriting it. For orchestration, OpenFaaS supports both Docker Swarm [27] and Kubernetes; for performance tests, Kubernetes is recommended [43]. Therefore, we deployed an OpenFaaS stack with Kubernetes on a CloudLab server running Ubuntu 16.04 in the Utah cluster.

5.2.3 AWS Lambda

AWS Lambda is an event-driven cloud computing service from Amazon Web Services that allows developers to deploy functions on a pay-per-invocation basis without having to provision storage or compute resources to support them. AWS was the first to introduce serverless functions, in 2014, followed by Google and Azure in 2016. During our tests in Section 4.1, we saw that AWS Lambda performs better than Azure Functions and Google Cloud Functions in terms of scaling and response time. Therefore, we chose AWS Lambda to test the MME FaaS solution on a commercial platform.

5.2.4 DynamoDB

Amazon DynamoDB is a database service offered by AWS. It has the advantages of a hosted service and an easy start, which avoids the database management burden; the built-in monitoring metrics are a plus. Like Cassandra, DynamoDB follows the AP model of the CAP theorem but is configurable for consistency. Since we use AWS Lambda as our compute platform for functions, we use Amazon's DynamoDB as our second choice of datastore.

5.2.5 GoLang

All the code is written in GoLang.
We chose GoLang because it is easy to use and all the cloud providers support it for serverless functions. GoLang provides high performance, efficient concurrency handling, and ease of coding. It also allows us to evaluate our application on other cloud platforms with few changes.

5.3 Latency

The MME FaaS solution is feasible only if it can deliver the low latency needed by the control procedures. Therefore, here we try to understand the factors affecting the latency. We can split the total time experienced by the UE for any control procedure into:

• Network latency: We measured the response times between regions to find the maximum in-flight time (Table 4.2). We observed that a call to a server in the same region finishes in 4 ms (99th percentile). We also measured the latency between two CloudLab servers in the Utah cluster and found the 99th-percentile latency between them to be 4 ms. Therefore, if we deploy the functions in regions close to the eNodeB, the network latency can be kept minimal. Since the latency between two servers in the Utah cluster of CloudLab is the same as between two servers in the same AWS region, we can assume that our implementation, when running on AWS, will experience network latency similar to our experimental setup.

• Serverless framework: To measure the time taken by the serverless framework, we collected a ‘tcpdump’ capture and measured the time between the arrival of an HTTP request and the departure of the response. We performed this experiment with a simple "hello-world" function deployed on OpenFaaS, with the OpenFaaS stack deployed on a CloudLab server. We found that it takes 11 ms on average, and 99% of requests finish within 14 ms. Since we used a very simple function, we can say that the time taken by the OpenFaaS framework is on average 11 ms. Note that the container was kept warm before measuring, so this does not include the cold start delay.
• Message processing: This is the actual time for which the MME FaaS function executes and is billed. We find this out when we deploy the application on the serverless platform.

• Datastore access: Since a network element is made stateless to run on a serverless platform, datastore access time must also be considered. We calculate it with our solution during our evaluations.

Combining the network and framework delays, we can see that each message will incur 18 ms of latency. We need to measure the message processing and datastore access times with our solution. Compared to a traditional VNF solution, serverless adds extra processing time in terms of framework-related operations for handling the requests. A VNF solution typically does not need to rely completely on an external datastore and can also benefit from local caching, while for serverless functions, access to a reliable datastore is mandatory. Therefore, the latency introduced by retrieving and updating state also adds up as extra overhead in a serverless implementation of a network function.

5.4 Designing the eNB App

The eNB App went through multiple changes to accommodate parallelism and to fit our use case; its design changed as we performed tests and found issues. We discuss the final design of the eNB App and the previous designs that did not work. The eNB App starts worker routines for the level of concurrency asked of it (Figure 5.2). These workers independently handle the complete attach procedure; we start as many workers as the concurrency specified as a runtime argument. Each worker starts and serves the state machine for the attach procedure. Once finished with an attach-accept reply, it posts to a channel and exits. If a procedure fails in between due to unhandled responses or timeouts, a signal is posted to the channel and the worker exits.
Upon reading the channel, the main thread starts another worker to handle another UE attach. This limits the number of threads running concurrently, and each one individually behaves like a UE; the number of active procedures at any time is thus limited to the concurrency level. This ensures controlled testing of the different components in the solution.

Figure 5.2. eNB Test App with workers

We started building the eNB App by creating a goroutine for sending every message. GoLang's HTTP package creates an HTTP connection for each client, which resulted in too many open TCP sockets very quickly and overwhelmed the application. Next, we moved to an approach of limiting the connections for sending attach messages by supplying a concurrency argument. This ensured we only created a limited number of goroutines, but to process the responses we were still spawning new goroutines, which would send the responses too. This again led to too many connections: we were only limiting the sending of initial attach messages and not controlling the responses. Then we moved to another model of creating worker routines responsible for sending any message out of the eNB App, with the number of workers defined by the concurrency. We used the approach in [24] to create a dispatcher and handlers. All the messages to be sent are written to channels, and the workers pick up the tasks from these channels and send them to the other end. This gave us a controlled environment to test the messages with concurrency. While this gave the desired number of simultaneously running workers, it is not good enough for a real test scenario: all the initial messages line up quickly and block the replies to incoming requests from going out.
Looking at the output, it appears that all the attach requests are sent first and only after that are the replies processed and responded to. This delays the entire attach procedure. Finally, we moved to the approach where concurrent workers each process the entire attach procedure. We performed all our evaluations using this version of the application.

CHAPTER 6

IMPLEMENTATION

Based on the research and knowledge gained so far, we designed the MME App and the supporting components. We discuss the implementation aspects of these components in this chapter.

6.1 Components

The MME FaaS application is required to work without depending upon cached state. Figure 6.1 gives the overall architecture of the MME FaaS implementation. It shows the five main components that we built in order to create a prototype. All the code is written in GoLang. We go through each component in detail in the following subsections.

6.1.1 eNB App

This application generates attach requests and handles responses from the MME. It implements the state machine at the UE/eNB side for the attach procedure. It is a model of an eNB with different tuning parameters to match our testing scenarios. Two of the inputs to the application, ‘concurrency’ and ‘num ue’, define how many UEs this app simulates and how many of them interact concurrently with the MME. This helps in evaluating an impact similar to real deployments, where many eNodeBs contact an MME simultaneously. Another important input to the application is the ‘sync option’. If it is 1, the responses from the MME are received asynchronously: the serverless function replies with a 202 Accepted, and the HTTP connection used to send the request is closed; the response to the original request arrives as a separate HTTP POST from the MME. In the case of synchronous tests, the response is sent in the body of a 200 OK response. The eNB generates an ENB-UE-S1AP-ID for each UE and sends the attach request using this ID.
This is the unique ID that identifies a UE at an eNB. When a response arrives, the app pulls the UE information using this ENB-UE-S1AP-ID. An array of UE contexts is stored at the eNB.

Figure 6.1. MME FaaS architecture and different components

6.1.2 MME function

This is the main function that implements the MME state machine and models interactions with other EPC components. It is triggered when the eNB sends any message to the API gateway endpoint for this function. When the first attach request arrives, the MME generates a unique ID to identify the UE at this MME, called the MME-UE-S1AP-ID. The flow of messages between all components is shown in Figure 6.2. Different instances of the MME function are spawned to handle different messages. MME-1 builds the UE state and passes it to the HSS-SPGW stub while sending the Authentication Info Req. The HSS stub sends the Authentication Info Answer along with the UE state. This message is handled by another MME instance: MME-2 stores the state to the database and also sends the Authentication Request to the UE. All the interactions with the datastore are shown by blue arrows, and the red arrows show the invocation of timers whenever the MME needs them.

Figure 6.2. Message flow during attach procedure with MME FaaS

6.1.3 Timer function

This serverless function is written to simulate the LTE timers. Some cloud providers support timer triggers for serverless functions.
Since OpenFaaS does not support them, we deployed a simple function that takes a duration as a parameter and sleeps for that many seconds. Once it wakes up, it sends the response asynchronously to the MME function. The MME uses this function to start timer T3450 after sending the ‘attach-accept’ message to the UE. If the timer expires before an ‘attach-complete’ message is received, the MME resends the ‘attach-accept’. The T3450 conditions say that on the 1st, 2nd, 3rd, and 4th expiry of this timer, the ‘attach-accept’ is resent; on the 5th expiry, the attach procedure is aborted.

6.1.4 Datastore

We used a single-node Apache Cassandra setup for storing the state of the UE. The "ue info" table consists of entries of the form [key bigint PRIMARY KEY, info blob]. The key holds the unique UE ID at this MME, i.e., the MME-UE-S1AP-ID. The info field is a blob that holds the complete UE context structure. We used the gocql GoLang package in the function; gocql implements a fast and robust Cassandra client for the Go programming language.

6.1.5 SPGW-HSS stub

This is a simple stub application to simulate the EPC components that the MME interacts with. It listens for TCP connections from the MME.

6.2 Failure cases

The MME FaaS application is not a fully 3GPP-compliant solution, and we do not handle all the failure scenarios. But we studied such cases to understand whether fixing them would require fundamental design changes. We could not find cases that cannot be solved using the same approach of relying on hosted services; each case can be addressed with simple additions to the design. We present a few such cases and discuss solutions.

6.2.1 Attach request resent

When the UE first sends an attach request, it starts an EMM timer called T3410 for 15 s. If an ‘attach-accept’ is not received during this time, the UE starts another timer, T3411, for 10 s, and on the expiry of T3411 it resends the attach request and restarts T3410.
Thus, the MME can receive more than one attach request if any of the components takes an unusually long time to complete. Subclause 5.5.1.2.7 of TS 24.301 [1] defines “Abnormal cases on the network side” as follows: if more than one attach request is received and no attach accept or attach reject message has been sent, then:

- If one or more of the information elements in the attach request message differ from the ones received within the previous attach request message, the previously initiated attach procedure shall be aborted and the new attach procedure shall be executed.
- If the information elements do not differ, the network shall continue with the previous attach procedure and shall ignore the second attach request message.

We store the attach request message with the UE context in the database. If another attach is received, we can use the IMSI to look in the database table for an ongoing request. Currently, we use the MME-UE-S1AP-ID as the key for the UE context. To handle this failure case, we can create another table of IMSIs that are currently being processed, at the cost of another access to the datastore.

6.2.2 SPGW response timeout

Based on the subscription information, the MME establishes an EPS session and a default EPS bearer for the user. For handling such timeouts in an asynchronous, stateless environment, we can initiate timer triggers. When the timer fires, the UE state can be verified and corrective action taken.

6.2.3 Service level agreement

As per the SLAs of AWS Lambda [14] and DynamoDB [9], the Monthly Uptime Percentage is 99.95% for Lambda and 99.99% for DynamoDB. If a failure condition occurs, the executing Lambda can silently shut down. Similarly, the datastore can become inaccessible during the downtime. Since the UE starts a timer after sending the attach request, if one of the Lambdas dies during the process, the UE will resend the request.
This duplicate request can be handled as described in Section 6.2.1. To handle database failures, the AWS SDK performs retries; if the failure persists, the Lambda replies with a failure message to the caller, where corrective action can be taken. The functions are idempotent, so such failures do not impact correct functioning. Overall, many failure cases may occur that need to be handled on a case-by-case basis, but we did not find any specific issue that cannot be fixed within this design. More details on the implementation and the software used during this work are presented in the appendices.

CHAPTER 7

EVALUATION

In this chapter, we look at the different experiments we performed with our implementation. We assess the latency and cost impact of the MME FaaS solution with different datastores and deployment platforms. Our evaluations were performed on a CloudLab server with an Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz with 20 cores, running Ubuntu 16.04. The focus of our tests was to analyze the total time taken for the attach procedure under a variety of conditions: serverless platforms, datastores, concurrency, and synchronicity. The closest representation of a real deployment is using a commercial platform. Our main observation is that a stable and scalable datastore is critical to the design of such cloud-native applications.

7.1 Database

During our background research, we did not evaluate our database choices. Thus, we start by evaluating the impact of different interactions with Cassandra. For our experimental setup, we deployed Apache Cassandra on an Ubuntu server; note that it is a single-node cluster configuration. A keyspace and a UE context table were created using cqlsh queries. A keyspace in Cassandra is a namespace that defines data replication on nodes. Once the table was ready, we performed the operations that MME FaaS performs.
That is: “insert if not exists,” “update,” and “select.” The boxplots in Figure 7.1 show the range of time taken for 10,000 operations of each query. The UE context in our case is 250 bytes. We added entries in batches of 10k up to a total of 200k; the distribution of time taken did not change as the number of records grew.

[Figure 7.1. Cassandra insert, update, and read latencies per operation, in the range of roughly 400–1800 us.]

In the above experiment, the queries were performed on the same established session to the cluster. To better approximate the interactions from the function, we performed the next experiment by opening a new connection for each insert. Due to the limitations of our setup, we could not create many concurrent sessions. The connection attempts kept failing with connection timeouts and socket errors:

unable to create session: unable to connect to initial hosts: i/o timeout
unable to create session: unable to connect to initial hosts: socket: too many open files

We tested inserts by creating a different thread for each request; we could test only 300 concurrent queries from a single server. This is shown by the blue line in Figure 7.2. In another test (shown by the yellow line), we ran a loop around create session, insert, and close. Even with this approach, we could not test more than 1000 requests. Although we closed the connections after each insert using session.Close(), the sessions remained hanging, which blocked the creation of new connections after 1017 inserts. Adding a delay did not help in releasing the sessions; this could be a limitation of the gocql package. From Figure 7.2, the time taken by each insert request varies from 2 ms to 45 ms, and it can increase as the same node handles more and more independent connections. Such a problem is usually solved by pooling connections to the database. AWS Lambda allows this by sharing variables defined outside the handler function.
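A minimal sketch of this sharing pattern, with a stand-in Conn type so the example is self-contained; in the real function, the shared variable would hold a *gocql.Session created once per container rather than once per request:

```go
package main

import "fmt"

// Conn stands in for an expensive resource such as a gocql session or
// a DynamoDB client; dialCount tracks how often we actually "connect".
type Conn struct{ id int }

var (
	conn      *Conn
	dialCount int
)

// getConn lazily creates the connection on first use and reuses it on
// every later invocation served by the same warm container.
func getConn() *Conn {
	if conn == nil {
		dialCount++
		conn = &Conn{id: dialCount}
	}
	return conn
}

// Handler models a Lambda handler body: every request grabs the shared
// connection instead of dialing the datastore again.
func Handler(req string) string {
	c := getConn()
	return fmt.Sprintf("%s handled via conn %d", req, c.id)
}

func main() {
	fmt.Println(Handler("attach-1"))
	fmt.Println(Handler("attach-2"))
	fmt.Println("dials:", dialCount)
}
```

Because the variable lives outside the handler, warm invocations pay no connection-setup cost; only a cold start dials the datastore.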
By declaring variables outside the handler function, we can share state (like database connections) between requests.

[Figure 7.2. Cassandra inserts with a new connection for each query; concurrent sessions (blue) and serialized sessions (yellow) take between roughly 2 ms and 45 ms per insert.]

To summarize these tests: since Cassandra can provide a consistent data state using its tunable consistency feature and is known to scale linearly, it is a good choice for our implementation. It is widely used in large-scale organizations with growing data (Netflix, Facebook, etc.).

7.2 MME solution

We next assess the time taken by the complete attach procedure and analyze limitations of OpenFaaS and AWS Lambda. These tests help us understand the load handled by our solution and give insight into how the system scales to process requests gracefully. We performed evaluations with both open-source and commercial platforms, which helps in building expertise with each and in weighing their pros and cons. Thus, we use OpenFaaS and AWS Lambda as the FaaS framework choices for open-source and proprietary platforms, and Cassandra and DynamoDB as the datastore choices for open-source and proprietary offerings. We first perform tests with 1k UEs to understand latencies, and then move to concurrency tests to analyze the system under load.

7.2.1 OpenFaaS functions

OpenFaaS supports asynchronous responses from the functions. As described in the previous chapter, we use this feature to send requests/responses from the MME to the eNB/UE. We capture the time taken using both synchronous and asynchronous interactions with the function. In the synchronous mode, the eNB App sends requests via HTTP POST and receives responses in the body of the 200 OK replies.
In the asynchronous mode, when the eNB App sends a request to the MME function, it includes a callback URL in the HTTP POST header to OpenFaaS. If the message is accepted by the gateway, a “202 Accepted” reply is sent, assuring the caller that the message will invoke the function. The eNB App listens on this URL for incoming HTTP messages. When the MME function returns, the OpenFaaS framework uses this callback URL to send a POST message with the response body.

The plot in Figure 7.3 shows that the async mode takes longer for the attach procedure to finish. This is because there are in total 9 calls to MME functions and 9 separate asynchronous responses, and each message adds delay due to the HTTP connection setup for sending a response. The latencies in the two modes are nevertheless comparable, and we suggest the async mode for the reasons discussed in Section 5.1.1. Due to our deployment setup of a single-node Kubernetes cluster, the number of requests handled simultaneously is limited.

We also performed tests to understand scaling at OpenFaaS. The framework starts more containers in batches of 4. For 1000 requests, it starts a total of 4 containers; for 5k and 10k requests, it starts 20 containers in 5 steps. The scaling trigger configured at OpenFaaS is:

sum by(function_name) (rate(gateway_function_invocation_total{code="200"}[10s])) > 5

This says that if the function is invoked more than 5 times per second, an alert is triggered. Therefore, as we increase the number of requests, this rule keeps firing and more containers are spawned in steps, as seen in Figure 7.4. In these tests, we are limited by our single-server test infrastructure, but we get a basic idea of how scaling works as the number of requests grows. At OpenFaaS, it is configurable; also, since OpenFaaS runs on Kubernetes, the built-in Horizontal Pod Autoscaler of Kubernetes can be used.
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization.

[Figure 7.3. OpenFaaS attach latency in sync and async modes, roughly 20–100 ms.]

[Figure 7.4. Prometheus logs of OpenFaaS scaling.]

7.2.2 AWS Lambda

Our single-server setup is not optimal for performance testing, so we next deployed the MME FaaS solution on AWS Lambda. This required some changes to the code: the deployment method is different, and the runtime has different expectations in terms of naming conventions and function input/output. Thus, we ported the MME function to AWS Lambda. Also, asynchronous responses are not natively supported by AWS Lambda, so we used synchronous calls for this evaluation, as shown in Figure 7.5. To add more versatility to the tests, we used both Cassandra and DynamoDB for state storage. As in the OpenFaaS tests, Cassandra runs on the CloudLab server, while DynamoDB was configured on AWS. We created a table named “ue info” with entries of the form [key, Info], where “Key” is of type Number and “Info” is a JSON structure. At AWS Lambda, we perform tests in two categories: Lambda using Cassandra and Lambda using DynamoDB. Adding support for DynamoDB again required a few AWS SDK packages.

Figure 7.6 shows the spread of time taken by 1000 requests with the DynamoDB and Cassandra datastores. These requests were sent serially to the REST endpoint for the MME FaaS function. DynamoDB was deployed in the same region as the MME FaaS Lambda (US-East-2).
This resulted in faster query responses, and hence the time taken to finish the procedure with DynamoDB is much less than with Cassandra, which was running at the Utah cluster in US West.

[Figure 7.5. MME function with synchronous responses: MME-1 through MME-4 handle the attach exchanges between the eNB App, the database, the timer function, and the SPGW-HSS stub.]

[Figure 7.6. AWS Lambda attach latency with different datastores (DynamoDB and Cassandra), roughly 200–1000 ms.]

Another thing to note is that with DynamoDB, the 95th-percentile latency extends up to about 650 ms while the median remains at 250 ms; DynamoDB can take a variable amount of time for different queries. The total time to complete an attach with AWS Lambda is 250 ms. Each round trip to AWS Lambda takes about 60 ms as per Figure 4.1, which equates to the 4-5 calls to the MME FaaS Lambda function shown in Figure 7.5. Figures 7.7 and 7.8 show the alarms raised by DynamoDB for the high rate of usage. The database denies queries with a ProvisionedThroughputExceededException, which hinders performance testing of MME FaaS on AWS Lambda at high message rates. Provisioned mode is a good choice if the capacity requirements can be forecast and the load is predictable; we discuss the capacity required for our tests in the next section, where we increase the load. From the tests so far, we infer that the time taken to finish an attach procedure is about 250 ms if the database is in the same region, although with DynamoDB the variance in the results is quite large. The whiskers of the plot run from the 5th to the 95th percentile, and the 95th-percentile line extends far above the third-quartile box.
Also, there are many outliers in the case of DynamoDB, while the results with Cassandra are much more stable: although the median lies at about 500 ms, all requests finish between 500 ms and 600 ms (except one). We did not test DynamoDB’s on-demand capacity mode, which might give better and more stable results with MME FaaS. Since these latency ranges fall well within the limit of the timer (15 s), such a solution is a viable option.

[Figure 7.7. DynamoDB RCU error.]

[Figure 7.8. DynamoDB WCU error.]

7.2.3 Concurrent execution

In this section, we discuss tests that incorporate concurrency to simulate multiple UEs interacting simultaneously with the MME. We performed these tests with both OpenFaaS and AWS Lambda. Concurrency is achieved by starting worker threads, each responsible for handling the complete procedure for a UE; only the requested number (the concurrency parameter) of worker threads are launched. We went over the details of the eNB App design in Section 5.4.

[Figure 7.9. AWS Lambda and Cassandra attach latency at concurrency levels 10–400.]

[Figure 7.10. AWS Lambda and DynamoDB attach latency at concurrency levels 10–400.]

Figures 7.9 and 7.10 are box plots for Cassandra-based and DynamoDB-based MME FaaS at concurrency levels of 10 to 400, deployed on AWS Lambda. The AWS Lambda and DynamoDB are deployed in the US-East-2 region. The total time taken to finish the attach procedure with DynamoDB is much less than with the AWS Lambda and Cassandra setup. This can be attributed to the fact that AWS Lambda is deployed in the US-East-2 region while Cassandra runs on a CloudLab server at the Utah cluster in US-West. In both tests, as the concurrency increased, the time taken to complete the procedure remained stable up to a concurrency of 100; it jumps up as we increase the concurrency further.
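The worker scheme described above can be sketched as follows. The function and parameter names are illustrative, and the attach callback stands in for the real HTTP exchange with the MME function endpoint:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runWorkers drains a job queue with exactly `concurrency` workers,
// each driving the full attach procedure for one UE at a time, and
// returns the per-attach latencies.
func runWorkers(concurrency int, ueIDs []int, attach func(ue int)) []time.Duration {
	jobs := make(chan int)
	latencies := make([]time.Duration, len(ueIDs))
	var wg sync.WaitGroup
	for w := 0; w < concurrency; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				start := time.Now()
				attach(ueIDs[i]) // complete attach exchange for this UE
				latencies[i] = time.Since(start)
			}
		}()
	}
	for i := range ueIDs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return latencies
}

func main() {
	ues := make([]int, 100)
	for i := range ues {
		ues[i] = i
	}
	lat := runWorkers(10, ues, func(ue int) { time.Sleep(time.Millisecond) })
	fmt.Println("completed", len(lat), "attaches")
}
```

Only the requested number of workers ever runs, so the concurrency parameter bounds the number of simultaneous attach procedures in flight.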
The plots in Figure 7.11 show the minimum, average, and median time for 10k requests at different concurrency levels. The minimum time remains almost constant in both cases, while the median and average grow quickly as simultaneous requests increase. This shows that at higher concurrency, the databases are not able to scale enough; Cassandra runs on a single server, which might be limiting the scale-up. The graphs in Figures 7.11 and 7.12 show that the Cassandra-based results are more stable. The plot in Figure 7.12 shows the maximum time taken; in both cases it looks random and does not follow any specific pattern, but in general the DynamoDB-based tests have a much higher maximum time than the Cassandra-based tests. We also performed concurrency tests with OpenFaaS. The results, shown in Figure 7.13, look different from AWS Lambda: the processing time increases with the concurrency level. In this case, both Cassandra and OpenFaaS contribute to the latency.

[Figure 7.11. Minimum, average, and median attach time on AWS with DynamoDB and Cassandra at concurrency levels 10–200.]

[Figure 7.12. Maximum attach time on AWS with DynamoDB and Cassandra at concurrency levels 10–200.]

[Figure 7.13. OpenFaaS attach time with concurrent workers at concurrency levels 10–200.]

7.2.4 Sensitivity analysis with DynamoDB

Tables in DynamoDB are configured with RCUs and WCUs, which limit the number of read/write accesses per second to the database. Amazon DynamoDB has two read/write capacity modes for processing reads and writes on tables: provisioned and on-demand. Since the provisioned mode is free-tier eligible, we chose it for our testing.
For provisioned-mode tables, you specify throughput capacity in terms of read capacity units (RCUs) and write capacity units (WCUs). One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size. One write capacity unit represents one write per second for an item up to 1 KB in size. A working configuration for our tests was 200 RCUs and 400 WCUs. We performed a sensitivity analysis by varying these capacity units, shown in Figure 7.14. As can be seen, the latency of the attach procedure remains constant with 100 or more RCUs and 200 or more WCUs, but we observed a few spikes in some tests with 100 RCUs and 200 WCUs. Therefore, we performed all our testing with 200 RCUs and 400 WCUs. With lower capacity units, we observed many failures due to throttled requests at the datastore: for 10k requests, we saw 1523, 1010, and 502 failures with 25 RCU/50 WCU, 25 RCU/50 WCU with auto-scale, and 50 RCU/100 WCU, respectively. Thus, if the load can be predicted, provisioned mode works reliably with the right capacity configuration.

[Figure 7.14. Sensitivity analysis with DynamoDB capacity configurations ranging from 25 RCU/50 WCU to 200 RCU/500 WCU, with and without auto-scale.]

The key takeaway from these tests is that AWS Lambda with Cassandra is a stable choice for running a control plane node. DynamoDB looks better at the start, probably because it is deployed in the same region as the Lambda, but Cassandra handles the load more gracefully as it increases. Also, since Cassandra is known to be linearly scalable, it is a good choice. But if using a hosted service is a requirement, then a higher provisioned capacity at DynamoDB will be a good idea.
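These capacity-unit definitions let us sanity-check a provisioned configuration against the MME's access pattern. A rough sizing sketch, assuming the 4 reads and 5 writes per attach from Section 7.4 and the ~400-byte UE context (so each access costs a single unit); the attach rate and headroom factor are illustrative inputs, not measured numbers:

```go
package main

import (
	"fmt"
	"math"
)

// For the ~400-byte UE context, one strongly consistent read (item
// <= 4 KB) costs 1 RCU and one write (item <= 1 KB) costs 1 WCU.
const (
	readsPerAttach  = 4
	writesPerAttach = 5
)

// provisionedUnits returns the RCUs and WCUs needed to sustain the
// given attach rate, padded by a headroom factor to absorb bursts.
func provisionedUnits(attachesPerSec, headroom float64) (rcu, wcu int) {
	rcu = int(math.Ceil(attachesPerSec * readsPerAttach * headroom))
	wcu = int(math.Ceil(attachesPerSec * writesPerAttach * headroom))
	return
}

func main() {
	rcu, wcu := provisionedUnits(40, 1.0)
	fmt.Printf("40 attaches/s needs about %d RCUs and %d WCUs\n", rcu, wcu)
}
```

A steady 40 attaches/s, for example, already needs on the order of the 200 RCU/400 WCU working configuration once some headroom is added, which is consistent with the throttling we saw at lower settings.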
There are some hosted Cassandra options available from different organizations, which can also be evaluated if managing infrastructure needs to be avoided. Another important point from these tests is that the time taken to process the complete attach is well under the time limit set by the UE’s T3410 timer of 15 s, although in these tests we do not consider the time taken by the HSS or SGW: we make calls to the stubbed HSS and S-GW and get a response immediately. The overall time will increase with the processing time in these components, but due to the asynchronous implementation, individual requests would not time out.

7.3 Comparison with other solutions

To compare our latency results with other EPC solutions, we performed the UE attach using OpenEPC [31] and OpenAirInterface [42]. We used the setups suggested in [29] and [28] to attach a simulated UE by running experiments on the PhantomNet testbed. In our experiment with OpenEPC, an attach procedure takes 450 ms to complete, while with OpenAirInterface’s OAISIM it takes 1800 ms. Since all the nodes in this experiment run on Emulab servers in the Utah cluster, it is fair to compare these results with our OpenFaaS-based MME FaaS results, which also ran on Utah cluster nodes. From Figure 7.3, it takes about 42 ms to finish the attach procedure. This is much less than OAISIM and OpenEPC, although all external connections are simulated in our tests.

7.4 Cost

After the evaluation of latency and scaling, we move on to calculate the actual cost of the deployment. We calculate the cost of the attach procedure per subscriber by breaking it down into all components. Since our solution supports only the attach procedure, we look at the detailed cost for attach only and later estimate the cost of other procedures.
7.4.1 Asynchronous implementation

In the background research (Chapter 4), we estimated cost based only on the serverless function calls. After implementation and deployment, we have a better idea of the function’s billed memory and duration, and we can now calculate the extra costs associated with this solution. The cost of the MME deployment includes the API gateway, the number of function invocations, and the database calls per procedure, along with the FaaS cost. From the message flow in Figure 6.2, we know that there are 9 calls to the MME FaaS function, 5 write calls to the database, 4 read calls to the database, and 1 call to the timer function. The SPGW-HSS stub runs in a separate VM. Here we calculate the cost of all the components of the MME, broken down as follows:

1. Requests: $0.20 per million requests. For 1 attach procedure (10 calls), 10 x $0.20/10^6 = 0.0002 cents.

2. Duration: $0.00001667 for every GB-sec. In our tests with AWS Lambda and DynamoDB, a typical call log in CloudWatch looks like:

REPORT RequestId: 65383542-1a7a-49a5-8301-e2551819ecbd Duration: 11.51 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 33 MB

The function is configured with a 15 s timeout and 128 MB of memory. Since the logs say that the billed duration is 100 ms and the memory is 128 MB, each request is billed (128/1024) GB x (100/1000) s = 0.0125 GB-sec. Thus 0.0125 x 10 calls = 0.125 GB-sec per attach, and 0.125 x $0.00001667 ≈ 0.0002 cents.

3. API Gateway: the cost of the API gateway [8] is calculated from the number of calls. For 10 calls at a price of $3.50 per million, the API gateway cost comes to 0.0035 cents.

4. DynamoDB: there are two pricing modes, on-demand and provisioned. For predictable capacity needs, the provisioned mode is preferred; its price depends on the number of RCUs and WCUs configured.
We configured 200 RCUs and 400 WCUs for the ue info table, using the DynamoDB metrics to come up with the required number of RCUs and WCUs. The logs show the actual consumption; using the price chart at [17], the cost of the datastore would be 0.00065 x 400 x 24 x 30 + 0.00013 x 200 x 24 x 30 = $205.92 per month. The provisioned-mode cost does not work at the granularity of each access; since we are trying to calculate the cost of the attach procedure alone, comprising many messages, we use the on-demand pricing. Using [16], reads cost 0.000125 cents and writes cost 0.000625 cents for the attach implementation.

5. Data storage: the first 25 GB stored per month is free, and $0.25 per GB-month thereafter. Since the UE context is small (about 400 bytes), we can keep 62.5M UE records for free. Each MME needs to maintain about 2.7M UEs as per the previous data, so data storage costs nothing extra.

Therefore, the total cost of the above components is 0.00465 cents per attach procedure in a serverless implementation, or $46.50 for 1M attaches. An attach happens when the UE first registers to the network; the UE goes to the ecm-idle state after connecting, and a service request is used to move from idle to connected. Thus, an attach may occur once a day per UE or even less. As per Section 4.2, each MME handles messages from 2.7M UEs. Assuming one attach per day per UE, we estimate the total cost of attaches at an MME to be $3766.50 for a month. Using the above numbers, we can break down the cost between services as follows:

Serverless requests: 4.5%
Serverless duration: 4.5%
API Gateway: 75%
DynamoDB reads: 2.5%
DynamoDB writes: 13.5%

With this decomposition, we see that the API Gateway is the biggest contributor to cost in a serverless implementation; other function triggers need to be explored to reduce it.
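The arithmetic above can be captured in a small cost model. The rates are the AWS prices quoted in this section, and the total reproduces the ≈0.00465 cents figure (up to rounding of the duration term):

```go
package main

import "fmt"

// Prices in cents: $0.20 per 1M invocations, $0.00001667 per GB-sec,
// $3.50 per 1M API Gateway calls, and the DynamoDB on-demand totals
// for one attach from the breakdown above.
const (
	centsPerRequest  = 0.20 / 1e6 * 100
	centsPerGBSec    = 0.00001667 * 100
	centsPerAPICall  = 3.50 / 1e6 * 100
	centsDynamoReads = 0.000125
	centsDynamoWrite = 0.000625
)

// asyncAttachCost prices the asynchronous design: 10 invocations
// (9 MME calls plus 1 timer call), each billed 100 ms at 128 MB,
// plus the datastore accesses.
func asyncAttachCost() float64 {
	const calls = 10
	gbSec := calls * (128.0 / 1024.0) * (100.0 / 1000.0) // 0.125 GB-sec
	return calls*centsPerRequest + gbSec*centsPerGBSec +
		calls*centsPerAPICall + centsDynamoReads + centsDynamoWrite
}

func main() {
	c := asyncAttachCost()
	fmt.Printf("per attach: %.5f cents, per 1M attaches: $%.2f\n", c, c*1e6/100)
}
```

Keeping the model in code makes it easy to re-run with updated AWS prices or a different number of calls per procedure.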
7.4.2 Synchronous implementation

Next, let us analyze the cost of another model where the calls to the SPGW are synchronous. In this case, the MME FaaS function incurs an extra wait for the SPGW to respond, but this reduces the number of calls to the function (Figure 7.5). This model invokes the function 4 times, which results in 4 + 1 calls to the serverless functions and the API gateway (the extra 1 is the timer function). There are 4 request/response exchanges from the MME to the HSS-SGW. Assuming the HSS or SGW takes 1 second to respond, the duration each function is billed for becomes 1100 ms (100 ms originally, plus 1 second waiting on the HSS-SGW). We therefore recalculate the API gateway and serverless platform cost:

1. Serverless requests: for 5 calls, 5 x $0.20/10^6 = 0.0001 cents.

2. Serverless duration: we have 4 calls of 1100 ms each and 1 call of 6000 ms (the timer runs for 6 s). (128/1024) GB x (4 x 1.1) s = 0.55 GB-sec for each attach procedure, and 0.55 x $0.00001667 ≈ 0.00091 cents.

3. API Gateway: for 5 calls at a price of $3.50 per million, the API gateway cost comes to 0.0018 cents.

4. DynamoDB: with the function calls halved, the DB interactions are halved as well, so the cost with on-demand pricing is 0.000375 cents.

The total cost is 0.00318 cents for a synchronous attach procedure. This comes out a little lower even though the function had an extra wait, but the difference (0.00318 vs. 0.00465) is not large enough to give up the benefits of asynchronicity. The asynchronous approach is less prone to variable costs due to delays at the HSS and SPGW: in the synchronous model, the MME instance remains idle and incurs cost while not doing any real work, while with the asynchronous approach the instance is unblocked and can be released.
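The synchronous arithmetic can be checked the same way. This sketch follows the thesis accounting, in which the timer call's 6 s sleep is excluded from the GB-sec term, and lands near the 0.00318 cents total (small differences come from rounding):

```go
package main

import "fmt"

// Prices in cents: $0.20 per 1M invocations, $0.00001667 per GB-sec,
// and $3.50 per 1M API Gateway calls.
const (
	centsPerRequest = 0.20 / 1e6 * 100
	centsPerGBSec   = 0.00001667 * 100
	centsPerAPICall = 3.50 / 1e6 * 100
	centsDynamo     = 0.000375 // datastore accesses halved vs. async
)

// syncAttachCost prices the synchronous design: 5 invocations (4 MME
// calls plus the timer call), with the 4 MME calls each billed
// 1100 ms at 128 MB because of the 1 s synchronous wait on the
// HSS/SGW stub.
func syncAttachCost() float64 {
	const calls = 5
	gbSec := 4 * (128.0 / 1024.0) * 1.1 // 0.55 GB-sec per attach
	return calls*centsPerRequest + gbSec*centsPerGBSec +
		calls*centsPerAPICall + centsDynamo
}

func main() {
	fmt.Printf("synchronous attach: about %.5f cents\n", syncAttachCost())
}
```

The model makes the trade-off explicit: halving the invocations shrinks the dominant API Gateway term, while the synchronous wait inflates the duration term.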
7.4.3 Other control messages

In an LTE network, although the attach procedure is a set of many messages, attaches are less frequent than other messages like service request and release, handover, paging, and bearer activation, modification, and deactivation. In [48], it is shown that during busy hours the number of service requests/releases can be about 60 per hour compared to 2 attaches per hour; similarly, there are about 100 mobility events (handovers) per busy hour per subscriber, although the load at nonbusy hours can be much lower. Also, the attach procedure requires many different interactions, while other messages generally require just a request/response. From [47], we know that an attach results in 10 control messages at the MME, while the rest of the procedures contribute 24 control messages combined. Thus, the messages that are more frequent contribute fewer function invocations each. Statistics from 2013 [39] suggest that attach is about 1% of the total signalling messages processed by an MME, but we do not have any reference for the total number of signalling events at an MME.

Table 7.1 lists the cost of the AWS services. To get a better view of the cost, we break down the cost per message in Table 7.2. We use the on-demand [16] pricing of DynamoDB to create a better per-request cost model. Since the MME FaaS function is billed for 100 ms and 128 MB of memory, we use 0.0125 GB-sec as the duration of each Lambda. For every two signalling messages, we assume on average one read and one write to the datastore. Using the AWS prices and the number of control messages per NAS procedure (from Table 1 of [47]), we estimate the cost of handling each of these procedures in Table 7.2.

Table 7.1. AWS services cost

Service                         Price per request (cents)
API Gateway                     0.00035
AWS Lambda Requests             0.00002
AWS Lambda Duration (GB-sec)    0.001667
DynamoDB On-demand Read         0.000025
DynamoDB On-demand Write        0.000125

Table 7.2. Cost breakup (cents)

Procedure                  Messages  API Gateway  Serverless Requests  Serverless Duration  DynamoDB On-Demand   Total
Attach                     10        0.0035       0.0002               0.0002               0.000125 + 0.000625  0.00465
Dedicated Bearer Setup     2         0.0007       0.00004              0.00002              0.000025 + 0.000125  0.00091
Additional Default Bearer  4         0.0014       0.00008              0.00004              0.000050 + 0.00025   0.00182
Tracking Area Update       2         0.0007       0.00004              0.00002              0.000025 + 0.000125  0.00091
S1 based Handover          8         0.0028       0.00016              0.00008              0.000100 + 0.0005    0.00364
X2 based Handovers         2         0.0007       0.00004              0.00002              0.000025 + 0.000125  0.00091
Paging                     2         0.0007       0.00004              0.00002              0.000025 + 0.000125  0.00091
Idle to Connected          3         0.00105      0.00006              0.0000625            0.000025 + 0.000250  0.00144
Connected to Idle          3         0.00105      0.00006              0.0000625            0.000025 + 0.000250  0.00144

Without information about the number of signalling messages handled by an MME, we perform rough estimates for 1 million control messages based upon the example signalling distribution from [41]. The percentages of events seen by the MME in the example are as follows:

Mobility and Handover - 30%
Paging - 30%
Service Request - 20%
Service Release - 10%
Attach - 4%
Detach - 1%
Bearer Activation/Deactivation/Modification - 5%

Using Table 7.2, the total cost for 1 million events distributed as per the above percentages is about $12.5. Modelling of this kind gives a clear picture of the total cost involved for the capacity expected of an MME. As per [41], “In an LTE network, operators can expect one million smartphone users to generate 31,000 transactions per second during busy hours.” Using this, we can expect 111.6M transactions per busy hour for 1M subscribers, which gives a total cost for handling busy-hour traffic for 1M subscribers of about $1400. This price is quite reasonable given the national scale of these networks. Cost at off-peak hours can be assumed to be much lower, though no reliable utilization numbers are publicly available. From [48], we know that service requests are seen once every 106 seconds, which is about 0.44 times per minute.
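The ~$12.5 figure can be reproduced with one plausible mapping of the event mix onto the Table 7.2 rows. The mapping involves judgment calls not spelled out in the text (handovers priced as X2 handovers, service request and release priced as the idle/connected transitions, detach assumed to cost the same as a 2-message procedure), so treat it as illustrative:

```go
package main

import "fmt"

// costPerMillionEvents sums per-event costs (cents, from Table 7.2)
// weighted by the example signalling distribution from [41], and
// returns the total in dollars.
func costPerMillionEvents() float64 {
	mix := []struct {
		share float64 // fraction of 1M events
		cents float64 // cost per event
	}{
		{0.30, 0.00091}, // mobility and handover (priced as X2)
		{0.30, 0.00091}, // paging
		{0.20, 0.00144}, // service request (idle to connected)
		{0.10, 0.00144}, // service release (connected to idle)
		{0.04, 0.00465}, // attach
		{0.01, 0.00091}, // detach (assumed 2-message procedure)
		{0.05, 0.00182}, // bearer activation/modification/deactivation
	}
	total := 0.0
	for _, m := range mix {
		total += m.share * 1e6 * m.cents
	}
	return total / 100 // cents to dollars
}

func main() {
	fmt.Printf("approx $%.1f per million control messages\n", costPerMillionEvents())
}
```

Scaling the same per-million figure by the 111.6M busy-hour transactions gives the ~$1400 busy-hour estimate quoted above.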
In our example, we consider service requests/releases to be 30% of total messages; thus, the total is 1.4 events per minute. During busy hours, there are about 2 events per minute (31,000 per second per million UEs gives 1.86 events per minute per UE). The difference between average and busy hour is not large using our example distribution, but in reality it can be different; we have no source to confirm it. Still, the above calculations and discussions give a fair idea of the cost and the number of events handled by an MME.

7.5 Testing limitations

With the tests in the above sections, we demonstrated the effectiveness of a FaaS implementation of the MME in terms of cost and load handling. In this section, we highlight some limitations we faced due to the test setup and subscription. Following are a few of the issues:

1. Function timeout: the initial Lambda function was deployed with a default timeout of 15 s. With this setting, many of the function calls failed with a “Task timed out” error because some update instructions to DynamoDB took a long time. We increased the timeout to 60 seconds and tested more, but this did not help: the updates continued to fail, if not time out. These failures were due to a higher load on the datastore than the capacity provisioned.

2. eNB App design: the eNB App maintains a number of workers that read from the job queue and send messages to the MME function. Since we execute the application from a single server, we cannot simulate many simultaneous connections.

3. OpenFaaS scaling: while running concurrent executions with OpenFaaS, the Prometheus logs showed that scaling was triggered only for the case with 10k requests at concurrency 10. When concurrency is increased, the rate of invocations seen by the gateway is reduced. This might be one of the reasons for the longer processing time at higher concurrency.

4.
Provisioned throughput exceeded exception: DynamoDB hits the "Exceeding Provisioned Throughput" error very quickly under concurrency, and hence many of the Attach procedures fail to complete. We started with a setting of 5 RCU and 5 WCU, where out of 1000 requests only 518 completed at concurrency 100, 743 at concurrency 50, and 887 at concurrency 10. After analyzing the CloudWatch metrics and further trials, we found that 200 RCU and 400 WCU are sufficient for testing 10k requests at multiple concurrency levels up to 200. Figure 7.10 shows the results of the tests at the different concurrency levels; this time we did not see any failures.

7.6 Final remarks

We presented a design which can run in a truly serverless environment using only hosted services. We performed tests to demonstrate various use cases and discussed the scalability and cost of such a deployment. The tests were done with a maximum concurrency of 400. Due to the limitations of our setup we could not increase the load further, but the results give an estimate for a simple deployment on AWS. The highest latency at this level turns out to be about 900 ms with DynamoDB and 600 ms with CassandraDB. This is comparable to the latencies of OpenEPC (450 ms) and OpenAirInterface (1800 ms), even though those were run in a local cluster while our AWS deployment runs in the US East region.

CHAPTER 8

CONCLUSION

In this thesis, we redesigned a network control plane entity to run on a fully managed end-to-end serverless platform using compute, storage, messaging, and monitoring. On AWS, these are provided by AWS Lambda, DynamoDB, API Gateway, and CloudWatch, respectively. During the research and development of the MME FaaS solution, we gained experience with and a deep understanding of the philosophy behind serverless platforms. We came across different challenges in terms of setup, the evolving nature of the technology, the lack of data about usage patterns in mobile networks, etc.
By going through the different design requirements and the implementation of MME FaaS, we were able to demonstrate the feasibility of a serverless deployment. Our results show that an Attach takes about 800 ms under a reasonable load from different servers. Without industrial data for comparison, we do not know how this deployment would perform under a real workload, but we do know that it lies well within the timer limits set by 3GPP. Our tests also show scaling on these platforms. Scaling at AWS Lambda is fully automated, with no need for any user intervention. This is good news for small organizations that want to get their business logic up and running in no time. The cost model of such a deployment gives organizations flexibility. For latency-sensitive applications, the overhead of API endpoint calls might be prohibitive, but for procedures that can tolerate higher latency it can be a great choice for keeping costs in check. The focus can then be entirely on the development of the application rather than on the management and operational burden.

8.1 Discussion

While cloud computing enabled the industry to rent compute, networking, and storage infrastructure, serverless computing takes a step further by relieving developers of any management and maintenance effort. All the services are managed entirely by the provider. This increases developer productivity through zero server management and zero-configuration deployments, and enables developers and organizations to reach customers and end users quickly. Serverless platforms are evolving and the supported feature set is growing continuously, making them usable for a vast variety of applications. Shifting an application to this model mostly requires rearchitecting and breaking the application into simpler independent components. This means building systems composed of small, independent units of functionality, each focused on doing one thing well.
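The decomposition into small independent units is exactly what the FSM view of the control plane enables: each state transition can become its own unit. The sketch below is our illustration of that idea, not the prototype's actual code; the state names, events, and transition subset are hypothetical, and each handler stands in for what could be deployed as a separate serverless function:

```go
package main

import "fmt"

// UeState and Event model a small subset of the MME's UE state machine.
type UeState int
type Event string

const (
	Deregistered UeState = iota
	Idle
	Connected
)

// Handler is one independent unit of functionality: it applies a single
// FSM transition and returns the UE's next state.
type Handler func() UeState

// transitions maps (current state, event) to the handler for that
// transition, mirroring how FSM transitions map to functions.
var transitions = map[UeState]map[Event]Handler{
	Deregistered: {"attach_request": func() UeState { return Connected }},
	Connected:    {"release": func() UeState { return Idle }},
	Idle:         {"service_request": func() UeState { return Connected }},
}

// Dispatch routes an event to the right handler, much as an API gateway
// or event trigger routes a request to the right function.
func Dispatch(s UeState, e Event) (UeState, error) {
	h, ok := transitions[s][e]
	if !ok {
		return s, fmt.Errorf("no transition from state %d on %q", s, e)
	}
	return h(), nil
}

func main() {
	s, _ := Dispatch(Deregistered, "attach_request")
	fmt.Println(s == Connected)
}
```

Because each entry in the table is independent, the units can be developed, deployed, and scaled separately, which is the property the discussion above relies on.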
Apart from providing autoscaling and provisioning, cloud-native applications win because of their inherent resilience to failures: in the event of a failure, application processing moves instantly from one data center to another without interrupting service.

On the cost side, the biggest contributors for our implementation turned out to be the data store and the API Gateway, an observation other organizations have made as well [35], [51]. Data store costs may be unavoidable if we want reliable, available storage, but the API Gateway cost looks redundant and wasteful. At $3.50 per million calls for AWS's API Gateway, and with applications broken down into small independent units, this cost grows drastically. Even so, managed services can still win the cost battle for small organizations because of the zero management burden.

Another big obstacle for our network control plane use case has been the supported interfaces. Like any managed service, functions expose only a limited number of ways to interact with them, whereas a network supports various protocols for different functionality, and the control and data planes of a network work separately on different protocols. This means that network functions cannot be directly ported to managed platforms that expose limited interfaces; specifications need to be reshaped to leverage the real benefits of serverless platforms. Foreseeing this need, 3GPP is redefining 5G using a Service Based Architecture, which allows easier adoption of managed platforms.

The delays introduced by the serverless framework and by cold starts are another concern for organizations. In its current form, the model is not usable for latency-sensitive network functions such as data plane applications. AWS Lambda takes about 60 ms to process a simple request and respond, and this delay is higher with Azure Functions and Google Cloud Functions. Thus, it is best suited for background operations and tasks that are not time-sensitive.
8.2 Disadvantages of FaaS

Until now we have discussed the advantages of the FaaS approach for building a mobile network control plane, along with the challenges we faced and the issues we foresee. But it is important to understand both the pros and cons of this technology before adopting it. Here we list the main disadvantages of using Function-as-a-Service:

1. Decreased transparency: Since the deployment is managed by the cloud provider, it is difficult to analyze the underlying infrastructure.

2. Tough to debug: Debugging managed services is difficult due to the lack of tools. It is mainly done through logging, which is often time-consuming.

3. Auto-scaling leads to auto-scaling of cost: On one hand, auto-scaling relieves the operational burden; on the other, it can inflate cost through uncontrolled resource usage.

4. No caching of state: The stateless nature of functions forces applications to be redesigned around a datastore. This adds complexity because of the database interactions.

5. Cold start time: Since containers are not preprovisioned, runtime scaling can add latency due to cold starts.

6. Difficult to move across providers: Different cloud providers have their own runtimes for serverless applications, and the supported languages may also vary. Even with the same language support, applications written for one platform do not work on another; some porting work is always involved when moving from one provider to another.

Some of these points may worry developers, while others may force them to redesign their applications. Still, the independent pieces can be developed, patched, deployed, and scaled independently. Overall, such an implementation offers a better opportunity for a continuous integration and deployment model, and the specific job of infrastructure management is left to the experts at the cloud provider rather than everyone handling their own.
8.3 Future work

With this initial analysis of MME FaaS for cost, latency, and scalability at hand, serverless platforms need to be explored further to support a fully functional data plane and control plane. The strict latency and bandwidth requirements of the data plane need further investigation to fit the serverless model. Apart from functionality such as supporting all network procedures and protocol compliance, more work is needed on breaking down the components to reduce the number of interactions, since each interaction carries a cost in the serverless model. Reducing cold start latency also needs more work on the platform side, to minimize the latency of many interacting components. Selecting the right database requires additional investigation based on the application architecture. There are also opportunities for saving cost; for instance, event triggers other than an API gateway can be examined. Another important area for research is availability: telecom-grade services need the high availability of "five nines," which is well beyond what an IT cloud typically offers (about 99.95% availability). Overall, our current work establishes a direction for further research on fulfilling the requirements of the telecom world using IT-grade cloud services.

8.4 Conclusion

We demonstrated the feasibility of an MME FaaS solution with different database backends. The implementation focuses mainly on the dependencies and architectural challenges of serverless platforms; standards compliance is not specifically addressed in this work, although for protocol handling the appropriate libraries can be incorporated as required. We provided a framework, described its interactions with external entities, and pointed out specific use cases. We also forecast costs for such a solution, which are very competitive with the traditional model. The FaaS model is a powerful way to provide a platform on which developers can produce output quickly.
With the abundant managed services offered by cloud providers, it is becoming easier and easier for application developers to build their logic and deploy it. Even for large-scale systems, a serverless approach is attractive because of the zero operational cost.

APPENDIX A

MESSAGE PASSING

We have defined the UE context as a structure of the fields required for handling the Attach procedure. We also defined each message structure with the important fields, to showcase the functionality without worrying about compliance. The following structures show the UE context and some of the messages.

type Ue_info struct {
	Ue_id          string /* IMSI or GUTI */
	Enb_ue_s1ap_id uint64
	Mme_ue_s1ap_id uint64
	Plmn_id        uint8
	Tai            uint8
	Message        Message_union_t
	Datalen        int
	Ue_state       Ue_state_t
}

type Attach_req_t struct {
	Imsi           string
	Enb_ue_s1ap_id uint64
	Plmn_id        uint8
	Tai            uint8
	Net_cap        uint8
}

type Attach_accept_t struct {
	Enb_ue_s1ap_id uint64
	Mme_ue_s1ap_id uint64
	Ambr           uint8
	Sec_cap        uint8
}

The eNB App performs the JSON encoding of the message and passes it in the body of the HTTP POST request to the MME function:

	json.NewEncoder(form).Encode(msg)

At the MME function, this message is decoded using JSON unmarshalling into the message structure:

	json.Unmarshal(req, &msg)

The MME also sends the asynchronous responses to the eNB by encoding the message structure as JSON. If there is a need to support full S1AP protocol parsing, a library (or a GoLang package) can be added to perform the decoding at the MME function.
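The encode/decode pair above can be exercised end to end. The snippet below is a self-contained sketch that round-trips an Attach request through the same json calls the eNB App and MME function use; the HTTP transport is omitted, and the helper function names are ours:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Attach_req_t mirrors the message structure defined in this appendix.
type Attach_req_t struct {
	Imsi           string
	Enb_ue_s1ap_id uint64
	Plmn_id        uint8
	Tai            uint8
	Net_cap        uint8
}

// EncodeAttach writes the message as JSON, as the eNB App does before
// POSTing the body to the MME function's endpoint.
func EncodeAttach(msg Attach_req_t) ([]byte, error) {
	var form bytes.Buffer
	if err := json.NewEncoder(&form).Encode(msg); err != nil {
		return nil, err
	}
	return form.Bytes(), nil
}

// DecodeAttach unmarshals the request body on the MME function side.
func DecodeAttach(req []byte) (Attach_req_t, error) {
	var msg Attach_req_t
	err := json.Unmarshal(req, &msg)
	return msg, err
}

func main() {
	in := Attach_req_t{Imsi: "001010123456789", Enb_ue_s1ap_id: 42, Plmn_id: 1, Tai: 7, Net_cap: 3}
	body, _ := EncodeAttach(in)
	out, _ := DecodeAttach(body)
	fmt.Println(out == in) // prints true
}
```

Since all fields are exported, encoding/json preserves them through the round trip, so the decoded structure compares equal to the original.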
APPENDIX B

SOFTWARE AND INSTALLATION

Docker

curl -sSL https://get.docker.com/ | sh

GoLang

sudo apt-get install -y software-properties-common
sudo add-apt-repository ppa:longsleep/golang-backports
sudo apt-get update
sudo apt-get install golang-go -y

Kubernetes

sudo apt-get update && sudo apt-get install -y apt-transport-https && curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list && sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubernetes-cni
sudo kubeadm reset
sudo swapoff -a
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
rm -rf $HOME/.kube && mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml
kubectl get nodes
kubectl get all --namespace=kube-system
kubectl taint nodes $(hostname) node-role.kubernetes.io/master:NoSchedule-

OpenFaaS

git clone https://github.com/openfaas/faas-netes
kubectl apply -f https://raw.githubusercontent.com/openfaas/faas-netes/master/namespaces.yml
kubectl apply -f ./yaml

OpenFaaS function creation

faas-cli new --lang go cass-func-go
go get -u github.com/golang/dep/cmd/dep
export GOPATH=/users/sonika05/lte-enb-mme
~/go/bin/dep init
~/go/bin/dep ensure -add github.com/gocql/gocql

Kubernetes monitoring

kubectl get svc -n openfaas : lists the services in the openfaas namespace with their IP addresses
while true; do kubectl describe node; sleep 1; done : shows the master node status
kubectl get nodes : gives node information
kubectl get pods --all-namespaces : shows all running/evicted pods

CassandraDB

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
echo "deb http://www.apache.org/dist/cassandra/debian 21x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
curl https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
sudo apt-get update
sudo apt-get install cassandra
nodetool status

CQLSH

$ cqlsh <master-ip>
cqlsh> CREATE KEYSPACE mme_faas WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
cqlsh> CREATE TABLE mme_faas.ue_info ( key bigint PRIMARY KEY, info blob );
cqlsh> TRUNCATE mme_faas.ue_info;
cqlsh> SELECT * FROM mme_faas.ue_info;

Gocql

go get github.com/gocql/gocql

GoLang AWS Lambda

go get github.com/aws/aws-sdk-go
env GOOS=linux GOARCH=amd64 go build -o dynamo_app_aws/main dynamo_app_aws

HTTP load test tools

1. ApacheBench:
sudo apt-get install apache2-utils

2. Hey:
go get -u github.com/rakyll/hey

3. loadtest:
curl -sL https://deb.nodesource.com/setup_4.x | sudo -E bash
sudo apt-get install -y nodejs
sudo npm install -g loadtest
loadtest -n 10000 --rps 200 http://128.110.154.132:31112/function/hello-python

REFERENCES

[1] 3GPP. 3GPP TS 24.301 V8.6.0 (2010-06). https://www.scribd.com/document/397199061/24301-860-pdf, 2010. Retrieved June 28, 2019.

[2] 3GPP. 5G System; Principles and Guidelines for Services Definition. https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3341, 2017. Retrieved June 28, 2019.

[3] Amogh, P. C., Veeramachaneni, G., Rangisetti, A. K., Tamma, B. R., and Franklin, A. A. A cloud native solution for dynamic auto scaling of MME in LTE. In 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC) (Oct 2017), pp. 1–7.

[4] An, X., Pianese, F., Widjaja, I., and Acer, U. G. DMME: A Distributed LTE Mobility Management Entity. Bell Labs Tech. J. 17, 2 (Sept 2012), 97–120.

[5] Apache. Apache Cassandra. http://cassandra.apache.org, 2016. Retrieved June 28, 2019.

[6] Apache. ab - Apache HTTP Server Benchmarking Tool. https://httpd.apache.org/docs/2.4/programs/ab.html, 2019. Retrieved June 28, 2019.

[7] APISTraining. HTTP in the 5G Core. https://apistraining.com/5g-core/, 2018. Retrieved June 28, 2019.

[8] AWS. Amazon API Gateway Pricing. https://aws.amazon.com/api-gateway/pricing/, 2019. Retrieved June 28, 2019.

[9] AWS. Amazon DynamoDB Service Level Agreement. https://aws.amazon.com/dynamodb/sla/, 2019. Retrieved June 28, 2019.

[10] AWS. Amazon EC2 Dedicated Instances. https://aws.amazon.com/ec2/purchasing-options/dedicated-instances/, 2019. Retrieved June 28, 2019.

[11] AWS. Amazon EC2 M5 Instances. https://aws.amazon.com/ec2/instance-types/m5/, 2019. Retrieved June 28, 2019.

[12] AWS. Amazon EC2 T2 Instances. https://aws.amazon.com/ec2/instance-types/t2/, 2019. Retrieved June 28, 2019.

[13] AWS. AWS Lambda Pricing. https://aws.amazon.com/lambda/pricing/, 2019. Retrieved June 28, 2019.

[14] AWS. AWS Lambda Service Level Agreement. https://aws.amazon.com/lambda/sla/, 2019. Retrieved June 28, 2019.

[15] AWS. Dead Letter Queues. https://docs.aws.amazon.com/lambda/latest/dg/dlq.html, 2019. Retrieved June 28, 2019.

[16] AWS. Pricing for On-Demand Capacity. https://aws.amazon.com/dynamodb/pricing/on-demand/, 2019. Retrieved June 28, 2019.

[17] AWS. Pricing for Provisioned Capacity. https://aws.amazon.com/dynamodb/pricing/provisioned/, 2019. Retrieved June 28, 2019.

[18] AWS. Regions and Availability Zones. https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html, 2019. Retrieved June 28, 2019.

[19] AWS. Supported Event Sources. https://docs.aws.amazon.com/lambda/latest/dg/invoking-lambda-function.html, 2019. Retrieved June 28, 2019.

[20] Azure. Azure Functions. https://azure.microsoft.com/en-us/blog/introducing-azure-functions/, 2016. Retrieved June 28, 2019.

[21] Azure. Azure Regions. https://azure.microsoft.com/en-us/global-infrastructure/regions/, 2019. Retrieved June 28, 2019.
[22] Banerjee, A., Mahindra, R., Sundaresan, K., Kasera, S., Van der Merwe, K., and Rangarajan, S. Scaling the LTE Control-plane for Future Mobile Access. In Proceedings of the 11th ACM Conference on Emerging Networking Experiments and Technologies (New York, NY, USA, 2015), CoNEXT '15, ACM, pp. 19:1–19:13.

[23] Brain, S. Cell Phone Tower Statistics. https://www.statisticbrain.com/cell-phone-tower-statistics/. Retrieved June 28, 2019.

[24] Castilho, M. Handling 1 Million Requests Per Minute With Go. http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/, 2015. Retrieved June 28, 2019.

[25] Datastax. Apache Cassandra NoSQL Performance Benchmarks. https://academy.datastax.com/planet-cassandra/nosql-performance-benchmarks, 2019. Retrieved June 28, 2019.

[26] Datastax. How is the Consistency Level Configured? https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlConfigConsistency.html, 2019. Retrieved June 28, 2019.

[27] Docker. Swarm Mode Overview. https://docs.docker.com/engine/swarm/, 2019. Retrieved June 28, 2019.

[28] Emulab. End-to-end LTE/EPC Network with OpenAirInterface (OAI) Simulated eNB/UE and OAI's EPC. https://wiki.emulab.net/wiki/phantomnet/oepc-protected/oai-sim-epc, 2018. Retrieved June 28, 2019.

[29] Emulab. OpenEPC Tutorial - Using the Profile Driven PhantomNet Portal. https://wiki.phantomnet.org/wiki/phantomnet/oepc-protected/openepc-tutorial-profile, 2018. Retrieved June 28, 2019.

[30] English, J. Evolution of HTTP and DNS for 5G. https://www.netscout.com/news/article/http-and-dns-5g-world, 2018. Retrieved June 28, 2019.

[31] GmbH, C. N. D. OpenEPC. https://www.corenetdynamics.com, 2018. Retrieved June 28, 2019.

[32] The gocql Authors. Gocql. https://github.com/gocql/gocql, 2012. Retrieved June 28, 2019.

[33] Google. Google Functions. https://cloud.google.com/functions/, 2019. Retrieved June 28, 2019.

[34] Google. Google Locations. https://cloud.google.com/about/locations/, 2019. Retrieved June 28, 2019.
[35] Hammond, E. How Much Does It Cost To Run A Serverless API on AWS? https://alestic.com/2016/12/aws-invoice-example/, 2016. Retrieved June 28, 2019.

[36] Li, Y., Yuan, Z., and Peng, C. A Control-plane Perspective on Reducing Data Access Latency in LTE Networks. In Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking (New York, NY, USA, 2017), MobiCom '17, ACM, pp. 56–69.

[37] Lueth, K. L. State of the IoT 2018. https://iot-analytics.com/state-of-the-iot-update-q1-q2-2018-number-of-iot-devices-now-7b/, 2018. Retrieved June 28, 2019.

[38] Mohammadkhan, A., Ramakrishnan, K. K., Rajan, A. S., and Maciocco, C. Considerations for Re-designing the Cellular Infrastructure Exploiting Software-based Networks. In 2016 IEEE 24th International Conference on Network Protocols (ICNP) (Nov 2016), pp. 1–6.

[39] Nokia. Managing LTE Core Network Signaling Traffic. https://www.nokia.com/blog/managing-lte-core-network-signaling-traffic/, 2013. Retrieved June 28, 2019.

[40] Norton, A. Sprint Launches C3PO. https://newsroom.sprint.com/sprint-launches-c3po-open-source-nfvsdnbased-mobile-core-reference-solution.htm, 2017. Retrieved June 28, 2019.

[41] NSN. Signaling is Growing 50% Faster Than Data Traffic. https://docplayer.net/6278117-Signaling-is-growing-50-faster-than-data-traffic.html, 2012. Retrieved June 28, 2019.

[42] OpenAirInterface. OpenAirInterface. https://www.openairinterface.org, 2019. Retrieved June 28, 2019.

[43] OpenFaaS. Notes on Load-Testing or Performance Testing. https://docs.openfaas.com/architecture/performance/, 2019. Retrieved June 28, 2019.

[44] OpenFaaS. OpenFaaS - Serverless Functions Made Simple. https://github.com/openfaas/faas, 2019. Retrieved June 28, 2019.

[45] Python. SimpleHTTPServer. https://docs.python.org/2/library/simplehttpserver.html, 2019. Retrieved June 28, 2019.

[46] Qazi, Z. A., Walls, M., Panda, A., Sekar, V., Ratnasamy, S., and Shenker, S. A High Performance Packet Core for Next Generation Cellular Networks. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication (New York, NY, USA, 2017), SIGCOMM '17, ACM, pp. 348–361.

[47] Rajan, A. S., Gobriel, S., Maciocco, C., Ramia, K. B., Kapury, S., Singhy, A., Ermanz, J., Gopalakrishnanz, V., and Janaz, R. Understanding the Bottlenecks in Virtualizing Cellular Core Network Functions. In The 21st IEEE International Workshop on Local and Metropolitan Area Networks (April 2015), pp. 1–6.

[48] Raza, M. T., Kim, D., Kim, K., Lu, S., and Gerla, M. Rethinking LTE Network Functions Virtualization. In 2017 IEEE 25th International Conference on Network Protocols (ICNP) (Oct 2017), pp. 1–10.

[49] Sayadi, B., Stasinopoulos, N., Al Jammal, B., DeiA, T., Ropodi, A., Fajjari, I., Patachia, C., Griffin, D., Breitgand, D., Martrat, J., Vilalta, R., Siddiqui, M. S., Baldoni, G., and Sayyad Khodashenas, P. From Webscale to Telco, the Cloud Native Journey. https://5g-ppp.eu/wp-content/uploads/2018/07/5GPPP-Software-Network-WG-White-Paper-23052018-V5.pdf, July 2018.

[50] SearchNetworking. In Mobile Networks, SDN and NFV Mean Service Orchestration. https://searchnetworking.techtarget.com/tip/In-mobile-networks-SDN-and-NFV-mean-service-orchestration, 2014. Retrieved June 28, 2019.

[51] Shachar, A. The Hidden Costs of Serverless. https://medium.com/@amiram_26122/the-hidden-costs-of-serverless-6ced7844780b, 2018. Retrieved June 28, 2019.

[52] Wang, L., Li, M., Zhang, Y., Ristenpart, T., and Swift, M. Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference (USENIX ATC 18) (Boston, MA, 2018), USENIX Association, pp. 133–146.

[53] Wikipedia. AWS Lambda. https://en.wikipedia.org/wiki/AWS_Lambda, 2019. Retrieved June 28, 2019.

[54] Wikipedia. CAP Theorem. https://en.wikipedia.org/wiki/CAP_theorem, 2019. Retrieved June 28, 2019.

[55] Wikipedia. Continuation-Passing Style. https://en.wikipedia.org/wiki/Continuation-passing_style, 2019. Retrieved June 28, 2019.

[56] Wikipedia. Infrastructure As a Service. https://en.wikipedia.org/wiki/Infrastructure_as_a_service, 2019. Retrieved June 28, 2019.

[57] Wikipedia. List of United States Wireless Communications Service Providers. https://en.wikipedia.org/wiki/List_of_United_States_wireless_communications_service_providers, 2019. Retrieved June 28, 2019.

[58] Wikipedia. Optimistic Concurrency Control. https://en.wikipedia.org/wiki/Optimistic_concurrency_control, 2019. Retrieved June 28, 2019.

[59] Wikipedia. Serverless Computing. https://en.wikipedia.org/wiki/Serverless_computing, 2019. Retrieved June 28, 2019.
| Reference URL | https://collections.lib.utah.edu/ark:/87278/s6pp56rv |



