Blog | 06.21.2021

Building Resilience Through the Cloud in a Post-Pandemic World

As we look to a post-pandemic world, one of the areas of investment we can expect to see is in building resilience to destructive attacks. 2020 saw a record number of distributed denial-of-service (DDoS) and ransomware attacks, a trend that is only expected to continue through the rest of this decade. Many organizations are now looking to the cloud to help achieve resilience to these attacks. But what is it about the cloud and cloud-native architectures that makes them resilient to such attacks?

Three attributes come to mind: distributed, immutable, and ephemeral.

Distributed – Applications and Services: If your applications use a distributed delivery model (for example, cloud-based services such as content delivery networks, or CDNs), then you have less to worry about from DDoS attacks, which work best when they can concentrate their firepower on a single target.

Immutable – Data Sets: If your applications use solutions that do not modify records but instead are “append-only” (in other words, your data set is immutable), then you have less to worry about from attacks on the integrity of that data, because tampering with an immutable data set is easier to detect and surface (a minimal sketch appears below).

Ephemeral – Workloads: Finally, if your workloads are ephemeral, you worry less about attackers establishing persistence and moving laterally. The value of confidential information, such as tokens associated with a given application instance, is also reduced, since those assets are simply decommissioned and replaced with new ones within a relatively short time frame.

Therefore, by leveraging modern cloud-native architectures that are distributed, immutable, and ephemeral, you help address availability, integrity, and confidentiality, respectively: the foundational triad of cybersecurity.
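To make the immutability point concrete, here is a minimal sketch of an append-only data set, with a hash chain added so that any in-place modification of an existing record is detectable. The class and field names are purely illustrative, not a specific product’s API:

```python
# An append-only log: records are only ever added, never modified.
# Each entry stores a hash chained to the previous entry, so editing
# any historical record breaks the chain and is surfaced on verify().
import hashlib
import json

def _digest(prev_hash: str, record: dict) -> str:
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class AppendOnlyLog:
    def __init__(self):
        self._entries = []  # list of (record, chained hash)

    def append(self, record: dict) -> None:
        prev_hash = self._entries[-1][1] if self._entries else ""
        self._entries.append((record, _digest(prev_hash, record)))

    def verify(self) -> bool:
        """Recompute the chain; False means a record was altered in place."""
        prev_hash = ""
        for record, stored in self._entries:
            if _digest(prev_hash, record) != stored:
                return False
            prev_hash = stored
        return True

log = AppendOnlyLog()
log.append({"account": "a1", "delta": 100})
log.append({"account": "a1", "delta": -40})
assert log.verify()

# Simulate an attacker editing a historical record in place:
log._entries[0] = ({"account": "a1", "delta": 999}, log._entries[0][1])
assert not log.verify()  # the integrity attack is surfaced immediately
```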

So, how are companies manifesting these attributes in their applications? Modern cloud architectures are moving from monolithic, tiered models to distributed, microservices-based architectures, where each microservice can scale independently, within a geographic region or across regions. Each microservice can also have its own storage and database optimized for that service, allowing the service to run stateless (or, perhaps more accurately, to use a shared-state model, where state is shared among the running instances via the storage/database layer). This is what allows those services to become truly ephemeral and distributed.
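As a rough illustration of that shared-state model, consider two service instances that hold no state of their own and funnel every read and write through a common store. All names here are hypothetical, and the plain dict stands in for a managed database or cache:

```python
# Service instances keep no local state; everything goes through a
# shared store, so any instance can serve any request and instances
# can be killed and replaced at will.
class StatelessService:
    def __init__(self, instance_id: str, state_store: dict):
        self.instance_id = instance_id
        self.state = state_store  # shared, not per-instance

    def add_to_cart(self, user: str, item: str) -> None:
        self.state.setdefault(user, []).append(item)

    def view_cart(self, user: str) -> list:
        return self.state.get(user, [])

shared_store = {}  # stands in for a managed database or cache layer
a = StatelessService("instance-a", shared_store)
b = StatelessService("instance-b", shared_store)

a.add_to_cart("alice", "widget")  # a request lands on one instance...
print(b.view_cart("alice"))       # ...and any other instance sees the result
```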

Pets vs. Cattle

This brings us to a concept that has been talked about for some time in the context of the cloud: pets versus cattle. Pets have cute names and can be recognized individually. If a pet falls ill, the owner takes it to the vet. Owners give pets a lifetime of care and make sure they live healthy lives for as long as possible. Traditional applications are like pets. Each instance is unique, and if it gets infected, it is taken to the cyber vet. “Patching in place” is common with traditional applications, and it is precisely what makes these instances unique. The job of IT is to keep the applications up and running for as long as possible.

Cattle, on the other hand, don’t have names. They have an obscure number, you generally cannot tell one animal in the herd from another, and you don’t build relationships with them. If cattle fall ill or get infected, you cull the herd. Modern cloud applications are like cattle. You create many running instances of a service, and each instance is indistinguishable from the others. They are all manifested from a golden repository. You never patch in place; that is, you never make the instances bespoke. Your job is to keep the instances ephemeral, killing them quickly and creating new ones, as sketched below. In doing so, you build resilient systems, which is in many ways the opposite of keeping applications up for as long as possible; systems run that way tend to be more fragile.
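Here is a minimal sketch of that discipline, with purely illustrative names: instances are stamped out of a golden image and can never be mutated; a change means building a new image and replacing the fleet wholesale rather than patching anything in place:

```python
# Cattle-style lifecycle: immutable instances, replaced rather than patched.
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: an instance cannot be mutated ("patched")
class Instance:
    instance_id: str
    image_version: str

def launch(image_version: str) -> Instance:
    return Instance(instance_id=str(uuid.uuid4())[:8], image_version=image_version)

def roll_fleet(fleet: list, new_image: str) -> list:
    """Replace every instance from the new golden image; patch nothing."""
    return [launch(new_image) for _ in fleet]

fleet = [launch("golden-v1") for _ in range(3)]
fleet = roll_fleet(fleet, "golden-v2")   # v1 instances are culled, not fixed
print({i.image_version for i in fleet})  # {'golden-v2'}
```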

The Benefits of the Cloud

The cloud offers many tools to help build systems that follow this paradigm. For example, Amazon recently announced “chaos engineering” as a service (AWS Fault Injection Simulator), which allows organizations to introduce elements of chaos into their production workloads, such as taking down running instances, to verify that overall performance isn’t impacted and that the workloads become resilient to these kinds of operational setbacks over time.
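As a rough sketch of what driving such an experiment might look like with boto3, assuming an experiment template (for example, “stop a random subset of instances in this service”) has already been defined in AWS Fault Injection Simulator; the template ID below is a placeholder:

```python
# Kick off a pre-defined fault-injection experiment and wait for it to
# finish; health-based stop conditions are typically defined on the
# experiment template itself.
import time
import uuid
import boto3

fis = boto3.client("fis")
TEMPLATE_ID = "EXT000000000000000"  # placeholder experiment template ID

response = fis.start_experiment(
    clientToken=str(uuid.uuid4()),  # idempotency token
    experimentTemplateId=TEMPLATE_ID,
)
experiment_id = response["experiment"]["id"]

while True:
    state = fis.get_experiment(id=experiment_id)["experiment"]["state"]
    if state["status"] in ("completed", "stopped", "failed"):
        print("experiment finished:", state["status"])
        break
    time.sleep(30)
```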

Getting to this point is a journey, and it may be accomplished in multiple steps. For example, organizations may first move their “pets” (their traditional applications and workloads) from an on-premises world to the cloud without significantly altering the architecture of the applications; the common term for this is “lift and shift.” Once the applications are in the cloud and organizations have built familiarity with cloud-native tools, they can re-architect those traditional applications (pets) into modern architectures that are distributed, immutable, and ephemeral (cattle). In other words, they can move from pets-in-the-cloud to cattle-in-the-cloud. Organizations do need to make sure that once they get to this point, they don’t regress into pet creation, for example by patching in place or by keeping instances up and running for longer than necessary.

Maintaining real-time or near real-time visibility at each step of the journey is critical to detecting pets, or pet-like behavior, early. As new workloads move to the cloud in a lift-and-shift model, or as workloads are re-architected into modern microservice-style architectures, understanding the internal and external dependencies (that is, the interactions between users and the applications, and among the different application components themselves) is important both for enforcing the right policies and for detecting and disincentivizing pet creation. While there are many ways to do this, the network activity footprint of these applications provides a ground-truth basis for mapping this out, as the sketch below illustrates.
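As a simple illustration, here is a sketch of deriving a service-dependency map from observed network flows and comparing it against the intended architecture. The record format and service names are hypothetical; in practice the flows might come from VPC flow logs, a network tap, or an observability pipeline:

```python
# Build a dependency graph from (source, destination, port) flow records
# and flag edges that fall outside the intended architecture.
from collections import defaultdict

flows = [
    ("web-frontend", "orders-api", 443),
    ("orders-api", "orders-db", 5432),
    ("web-frontend", "orders-db", 5432),  # frontend bypassing the API layer
]

dependencies = defaultdict(set)
for src, dst, port in flows:
    dependencies[src].add((dst, port))

# The intended (allowed) dependencies per service.
allowed = {
    "web-frontend": {("orders-api", 443)},
    "orders-api": {("orders-db", 5432)},
}

# Anything observed but not intended is drift worth investigating,
# and one concrete signal of pet-like behavior.
for src, edges in dependencies.items():
    for dst, port in edges - allowed.get(src, set()):
        print(f"unexpected dependency: {src} -> {dst}:{port}")
```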
