Description |
Managing networks in practice continues to be complex and fragile, largely because networks themselves keep growing more complex. As new services, devices, and protocols emerge, more information is collected to monitor them. Embedded within this mass of data are details about service-impacting events, whether performance or security related. These details provide the context that is essential for understanding the nature of an event, an important first step in troubleshooting and mitigation. However, gathering this context involves more than making simple queries and perusing logs. With a wide range of possible events, each exhibiting different symptoms across various monitoring datasets, operators must expend more effort reasoning about complex situations. In this dissertation, we tackle this complexity by addressing three main challenges associated with inferring and understanding context in network management: finding the context of anomalous events across multiple dimensions, composing context about the network as observed by multiple diverse management applications, and gathering context about high-priority security-impacting events in real time. We show that deepening our understanding of the context surrounding network events simplifies network management operations and makes them more amenable to automation. |