Submitted by Callum Whyte
Delivering maximum uptime and the fastest response to our end users all over the world is very important, so we deploy our application to powerful servers and hope for the best. But managing and visualising each component of our infrastructure, understanding how everything hangs together, detecting and handling failure, and running deployments can all be a challenge.
Callum will share his experience of adding world-class resilience to globally distributed (ASP.NET) web applications. Everyone will come away with actionable steps they can implement in their organization to improve resilience and performance, regardless of whether they're building websites in Node or microservice systems in .NET.
Rolling out changes to distributed systems takes time and planning, so we'll start by configuring a rock-solid deployment process with Azure Pipelines - including handling approvals and failures across multiple concurrent targets.
We'll look at using tools like Cloudflare and Azure Front Door to monitor our environments and optimize delivery to users - bridging multiple cloud providers and hosting setups (Azure PaaS, AWS VM, Kubernetes cluster), routing users to the best available location, and self-healing for maximum uptime.
Working at distributed scale brings challenges and trade-offs - from handling cache purging to balancing data consistency vs. performance. We will explore high-performance, geo-replicated data stores such as Cosmos DB and Table Storage as alternatives to traditional data stores, and how they can benefit systems (such as membership) that require real-time data reading/writing.