Cavirin Blog

With increasing reliance on the cloud, and in many cases on a single cloud service provider, the probability of a widespread (though infrequent) outage grows. On Tuesday, AWS S3 storage experienced a major outage, taking down the back-ends of many sites, including Netflix, Slack, and HubSpot, two of which we use at Cavirin. Enterprises that were single-threaded just had to wait it out, and though the actual outage lasted only 4 hours, it took many of them the remainder of the day to recover. To give you an idea of the magnitude of the impact, AWS S3 supports over 150K sites and upwards of three trillion data elements. Thousands of tweets questioned whether the Internet had gone down, just as with the Mirai outage last October. Compounding the problem, the storage service is shared across multiple AWS zones, so although an enterprise may distribute compute across geographies, for practical or cost reasons it may depend upon a single storage instance.

Despite immense amounts of automation, the human element may still be the weak link. As reported by USA Today: “The most common causes of this type of outage are software related,” said Lydia Leong, a cloud analyst with Gartner. “Either a bug in the code or human error. Right now, we don't know what it was.” The publication Slate took a more somber view: “At this point, we practically expect that whatever personal information we enter into websites will be stolen.”

So how can enterprises combat these types of outages, as well as human risk?

First off, the larger enterprises do in fact have a cloud DR strategy. For example, if AWS fails, the enterprise may have warm-standby capability on GCP, Microsoft Azure, or perhaps on-premises. Though most DR programs fail over into the cloud, nothing precludes a scenario where an enterprise runs critical applications on-premises and less critical ones in the cloud, with the option to rehome the latter on-premises in times of emergency.

What this implies is that the enterprise must have a security compliance architecture that spans these multiple domains.

The success of any sound DR strategy depends on continuous replication of critical data, so that systems can fail over during a disaster and be restored once it is rectified, and business continuity is maintained. In addition, the replicated systems must be held to the same rigorous, continuous security monitoring and assessment requirements that are expected of live production systems. That way, when failover happens during an outage, the restored systems and services do not introduce vulnerabilities that production would not have had. The scope of any security platform such as Cavirin must therefore include DR-replicated systems in addition to live production assets.
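One way to make this scope requirement concrete is to check an asset inventory for DR replicas that are not being monitored. The following is a minimal sketch, not Cavirin's implementation: the `Asset` record, the role labels, and the sample inventory are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    # Hypothetical inventory record; field names are assumptions for this sketch.
    name: str
    role: str        # "production" or "dr-replica"
    monitored: bool  # True if the asset is in scope for continuous security monitoring


def unmonitored_replicas(assets):
    """Return the sorted names of DR replicas that are not under monitoring."""
    return sorted(a.name for a in assets if a.role == "dr-replica" and not a.monitored)


# Illustrative inventory: one replica has been left out of monitoring scope.
inventory = [
    Asset("web-01", "production", True),
    Asset("web-01-dr", "dr-replica", True),
    Asset("db-01", "production", True),
    Asset("db-01-dr", "dr-replica", False),
]

print(unmonitored_replicas(inventory))
```

Run regularly, a check like this would flag a replica that drifted out of scope long before a failover depends on it.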

If enterprises have implemented AWS Hardening Benchmarks and their workloads move to GCP, they should ensure that the same protections are in place. This applies not only to conventional virtualized workloads but to containers as well. They need to ensure that the hardening applied to a given OS on one cloud provider is also available on another, and that compliance assessment is agentless and continuous, so that they can quickly build the baseline and identify any risk.
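As a rough illustration of keeping protections equivalent across providers, the sketch below compares the hardening results of a reference environment against those of a failover environment. The check IDs and result maps are invented for this example; they are not identifiers from any published benchmark or from the Cavirin platform.

```python
def baseline_gaps(reference, candidate):
    """Compare two {check_id: passed} maps.

    Returns the checks that exist in the reference baseline but are absent
    from the candidate, and the checks that pass in the reference but fail
    in the candidate.
    """
    missing = sorted(c for c in reference if c not in candidate)
    failing = sorted(
        c for c, ok in candidate.items()
        if c in reference and reference[c] and not ok
    )
    return {"missing": missing, "failing": failing}


# Hypothetical assessment results for the same OS image on two providers.
aws_results = {
    "ssh-root-login-disabled": True,
    "disk-encryption": True,
    "audit-logging": True,
}
gcp_results = {
    "ssh-root-login-disabled": True,
    "disk-encryption": False,
}

print(baseline_gaps(aws_results, gcp_results))
```

Here the failover environment is missing one check entirely and fails another, which is the kind of gap that should surface before workloads move rather than during an outage.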

It is in times of outages that IT is stressed the most and most likely to make mistakes.

Here, automation of the security compliance process is critical. In the same way, if workloads move from the cloud to on-premises or vice versa, the same benchmarks, rules, and automation must span these different domains. Having to use one tool on one CSP and another in-house is yet another area of potential failure.

We will never be able to totally prevent outages, but by implementing best practices based on available security tools, the enterprise will be able to more effectively protect against negative customer impact or worse.

Snippet of AWS Eastern US status during outage.

As many noted, even accurate reporting of the outage was unavailable for a while, which harkens back to the Mirai US DNS outage last October.


The CISO is under immense pressure, expected to manage a dozen or more vendors across perimeter, endpoint, network, application, and data security, not to mention having to be an expert on policy and operations.  Hackers in many cases have the upper hand, and the human element is still the weak link. 

Because of this, more and more enterprises are realizing that what we offer to automate some of this is no longer a nice-to-have. It is a must-have! At the same time, we’re able to clearly show our differentiation from the vulnerability assessment (VA) vendors, and we are more versatile than the cloud-only solutions. One of our customers, Cepheid, articulated it best: VA will tell you how many windows and doors you have, and which are open. We take the next step and tell you how to close them. And, if you are so inclined, we’ll do the closing.

The API-first architecture of our new Pulsar platform was also a leading topic of discussion, with potential ecosystem partners realizing the need for a unified view of overall security compliance, be it server, endpoint, identity, or vulnerability, and across all clouds and containers. If you missed it, check out our Pulsar General Availability PR. In all, it was a more than successful first day for Cavirin’s first RSA presence, based on both the quantity and, more importantly, the quality of discussions and demos.

(Breaches photo from SS8 shirt at RSA - thanks!)







The Hackers – Time Magazine person of the year runner-up, and what it means for the rest of us

This last week, Time announced its Person of the Year, and as expected, President-elect Donald Trump got the nod. More interesting was the selection of Hackers as number three. In fact, cybersecurity also touches Donald Trump, the Person of the Year, and Secretary Hillary Clinton, the runner-up, both knee-deep in the conversation and controversy: Trump with his ties to Putin and the attacks against the DNC, and Clinton with her private email server. 2016 also saw terms such as ransomware and IoT botnets enter water-cooler conversation, and the credit card hacks of the past were eclipsed by an order of magnitude when Yahoo admitted the breach of over 500 million email accounts. Even the Internet was not immune, with a denial-of-service attack in October cutting off connectivity to many well-known web properties.

Cavirin provides security management across physical, public, and hybrid clouds, supporting AWS, Microsoft Azure, Google Cloud Platform, VMware, KVM, and Docker.


