Managing Uncertainty Podcast - Episode #115: Ransomware and Backups

In this episode of the Managing Uncertainty Podcast, Bryghtpath Principal & Chief Executive Bryan Strawser discusses Brian Krebs’s recent article “Don’t wanna pay ransom gangs? Test your backups”.

Topics discussed include dealing with ransomware as a disaster recovery and business continuity problem. Do you have the proper backups in place? Do you really know what recovery will require in the event of a large-scale ransomware issue?

Related Episodes & Blog Posts

Episode Transcript

Hello, and welcome to the Managing Uncertainty Podcast. This is Bryan Strawser, Principal and Chief Executive here at Bryghtpath. In this week’s episode, I want to talk about ransomware, specifically about backup strategies and ransomware. What raised this to my attention earlier this week is an article by information security journalist Brian Krebs, who writes at KrebsOnSecurity.com. His article is entitled Don’t Wanna Pay Ransom Gangs? Test Your Backups. And Brian hits upon some really important facts and underlying issues with the way we think about ransomware right now.

When we’re thinking about ransomware, I think we think about it primarily as an information security problem, and it is, but it’s also a disaster recovery and business continuity problem. And it’s a bigger issue than we think of in the BCDR space. So, I want to talk through some key points that Brian makes, then I’ll add a little bit of other context to wrap up this episode. Krebs writes: look at the comments on almost any story about a ransomware attack and you’ll almost surely encounter the view that the victim organization could have avoided paying the extortionists the ransom if they’d only had proper data backups.

The ugly truth, as he writes, is that there are many non-obvious reasons why victims wind up paying even though they’ve done nearly everything right from a data backup perspective. His story, he points out, is not about what companies are doing in response to cyber criminals that are holding their data hostage, which has become a best practice now, or about the approach to how you respond. Rather, it is about why victims will still pay for a key, the key that’s needed to decrypt their systems, even when they already have the means to restore anything from backups on their own.

What experts are saying, according to Krebs, is that the biggest reason ransomware targets or their insurance providers are still paying, even when they already have reliable backups, is that nobody at the victim organization has bothered to test in advance how long that data restoration process might take. Krebs quotes Fabian Wosar, the chief technology officer at Emsisoft. And Fabian says, in a lot of cases companies do have backups, but they have never tried to actually restore their network from backups before. So they have no idea how long it will take.

Suddenly the victim notices they have a couple of petabytes of data they need to restore over the Cloud. And they realize that even with their fast connections, it’s going to take three months to download all the backup files. A lot of IT teams have never even made a back of the napkin calculation on how long it would take to restore from a data rate perspective. Wosar said the next most common scenario involves victims who have offsite encrypted backups of their data, but discover that the digital key needed to decrypt their backups was stored in the same local file sharing network that’s now been encrypted by the ransomware. And the third impediment to victim organizations being able to rely on their backups, according to Wosar, is that the ransomware purveyors managed to corrupt the backups as well.
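Editor’s note: the back of the napkin calculation mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a sizing tool: the function name `estimate_restore_days`, the 10 Gbps link, and the 20 percent sustained efficiency are hypothetical values chosen for the example, and a real restore also depends on storage, decryption, and validation time that this ignores.

```python
def estimate_restore_days(data_bytes: float,
                          link_bits_per_second: float,
                          efficiency: float = 1.0) -> float:
    """Estimate days needed to download a backup set over a network link.

    efficiency is the assumed fraction of the nominal link speed that the
    restore actually sustains (hypothetical; measure it in a real test).
    """
    if data_bytes < 0:
        raise ValueError("data_bytes must not be negative")
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    effective_bits_per_second = link_bits_per_second * efficiency
    seconds = (data_bytes * 8) / effective_bits_per_second
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    two_petabytes = 2 * 10**15   # "a couple of petabytes", decimal units
    ten_gbps = 10 * 10**9        # assumed nominal link speed
    days = estimate_restore_days(two_petabytes, ten_gbps, efficiency=0.2)
    print(f"Estimated restore time: {days:.0f} days")
```

With these assumed numbers the estimate lands at roughly three months, which is the order of magnitude Wosar describes. The point of the sketch is that the estimate takes minutes to produce, so it can be done long before an incident.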

Now Wosar says that’s somewhat rare, but it does happen; it’s been more of the exception than the rule. Unfortunately, it’s been quite common to have backups in some form and for one of these three reasons that he outlines to prevent them from being useful. I’ll add a fourth, which happened recently to a colleague’s organization, where they had a ransomware attack that encrypted their production systems in a manufacturing environment. And of course their immediate response was, we’ll just restore from backup. So they went to obtain their offsite backups, only to find out that the ransomware group had social engineered their way into the offsite backup provider and had deleted their account.

They had terminated the service and deleted their accounts, and their backups were gone. So there was no backup. And now they were forced to pay the ransom in order to regain access to their systems. Krebs cites Bill Siegel, CEO and founder of Coveware, which is a company that negotiates ransomware payments for victims. And Siegel says most companies that pay either don’t have properly configured backups, or they haven’t tested their resiliency or the ability to recover their backups against the ransomware scenario. Siegel says there are lots of software applications that you use to do a restore, and some of those applications are in your network that got encrypted.

So you’re like, “Oh great, I have these backups, the data’s there, but the application to actually do the restoration is encrypted.” So there’s all these little things that trip you up and prevent you from doing a restore when you don’t practice. Wosar, quoted earlier, says all organizations need to test their backups and develop a plan for prioritizing the restoration of critical systems needed to rebuild their network. So think about this for a minute in your own disaster recovery environments. A lot of us have requirements from a disaster recovery standards perspective, where you have a backup strategy that is ransomware resilient, and by that, I mean you have onsite and offsite backups.

You have rotating backups. So you’ve got a continuous, to a daily, to a weekly in a standard backup scheme. You have three separate generations of backups, at least one of those backups is stored offsite, and these backups are immutable, meaning that they can’t be modified, and you’re able to restore those backups. And then our testing strategy is usually: well, I can restore one database, or I can restore one stack, or I can restore one application, so I assume that my backup strategy works. I know earlier in my career, that’s what my backup strategy looked like.
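Editor’s note: the policy checklist described above (three generations, at least one offsite, all immutable) is simple enough to express as a small check. The sketch below is hypothetical throughout: `BackupCopy`, `policy_gaps`, and the copy names are invented for illustration and do not correspond to any particular backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One backup generation, described only by the attributes the policy cares about."""
    name: str
    offsite: bool
    immutable: bool


def policy_gaps(copies: list[BackupCopy]) -> list[str]:
    """Return the ways a set of backup copies falls short of the stated policy."""
    gaps = []
    if len(copies) < 3:
        gaps.append("fewer than three backup generations")
    if not any(copy.offsite for copy in copies):
        gaps.append("no offsite copy")
    mutable = [copy.name for copy in copies if not copy.immutable]
    if mutable:
        gaps.append("mutable copies: " + ", ".join(mutable))
    return gaps


if __name__ == "__main__":
    copies = [
        BackupCopy("continuous", offsite=False, immutable=True),
        BackupCopy("daily", offsite=False, immutable=True),
        BackupCopy("weekly", offsite=True, immutable=False),
    ]
    for gap in policy_gaps(copies):
        print("Policy gap:", gap)
```

A clean result from a check like this only confirms the letter of the policy. It says nothing about how long a full restore would take, which is the gap this episode is about.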

That’s what my DR testing strategy looked like. But in the ransomware era, we need to go deeper than that. One of our clients operates a large cloud-based software platform and disaster recovery is still a little immature in their world, and they’re still working through getting a solid approach implemented. Can they restore an individual database? Yes. Can they restore a stack of equipment? And I shouldn’t say equipment, but a stack of data between application servers, web servers and the backend databases. Yes. If they were to lose a data center or the entire data center was encrypted, could they restore all of that? Theoretically, yes. How many days would it take?

When I started looking at that after reading the Krebs article, it’s actually weeks to months. So that backup strategy might be acceptable in terms of the letter of their policy and the standards that they’re following and the regulatory frameworks that they’re required to adhere to. But is it acceptable in the ransomware era? Do they have the depth of backup necessary, and can they restore that in a time period that will be acceptable for their clients and for their providers that they are responsible for supporting? Wosar goes on to say in the Krebs story that in a lot of cases, companies don’t even know their various network dependencies, and so they don’t know in which order they should restore systems.

They don’t know in advance, Hey, if I get hit and everything goes down, these are the systems and services that are priorities for a basic network that we can build off of. Okay. Here’s a really interesting continuity and disaster recovery challenge. If you’re doing effective business impact analysis and business continuity planning and you’re including your technology dependencies in this discussion, and you’re then following that chain over to your IT team, and you’re going down the line for disaster recovery from the application or platform that you’re trying to recover, and you start looking at the individual components, you should be able to answer that question.

What are the dependencies, and in what order should we restore the systems? How does that line up with what your restoration capabilities actually look like? Wosar said that it’s essential that organizations drill their breach response in periodic tabletop exercises, and I completely agree with that. And it is in these exercises that companies can start to refine their plans. For example, he says, if the organization has physical access to their remote data center, it may make sense to develop processes for physically shipping or moving the backups to the restoration location. He’s dead on here in terms of creative thinking: can you actually do that and move it to the location?
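Editor’s note: the restore-order question raised above is, in computing terms, a dependency ordering problem. As a minimal sketch, assuming the dependencies have already been captured during business impact analysis, Python’s standard `graphlib` module (Python 3.9 or later) can produce a valid order. The system names and the `restore_order` helper below are hypothetical examples, not a description of any real environment.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each system lists what must be restored before it.
EXAMPLE_DEPENDENCIES: dict[str, set[str]] = {
    "dns": set(),
    "directory": {"dns"},
    "database": {"directory"},
    "app_server": {"database", "directory"},
    "web_server": {"app_server"},
}


def restore_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every system comes after the systems it depends on.

    Raises graphlib.CycleError if the map contains a circular dependency,
    which is itself a useful finding to surface before an incident.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, system in enumerate(restore_order(EXAMPLE_DEPENDENCIES), start=1):
        print(f"{step}. restore {system}")
```

The ordering is only as good as the dependency map behind it, so the value comes from keeping that map current and walking through the resulting order in a tabletop exercise.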

Are you running off of that backup or are you able to physically relocate that equipment or the storage devices in order to help facilitate recovery of your prime location? Wosar goes on and concludes that many victims see themselves confronted with having to rebuild their network in a way they didn’t anticipate, and that’s usually not the best time to have to come up with this sort of plan. That’s why tabletop exercises are critically important. We recommend creating an entire playbook so that you know what you need to do to recover from a ransomware attack, and that’s dead-on, that is exactly the approach that you want to be thinking about.

Your ransomware playbook, your breach playbook, your DR playbook, however you want to think about that, cannot just be about the management of the reputation of your organization in the incident, although that’s important. It is about the ability to recover the technological capabilities that your clients and customers need to do their work. It is to recover the technological capabilities that allow your business teams to service your customers and generate the revenue that makes your organization go round. Those are the things we have to keep in mind as we’re thinking about those plans.

If we’re not exercising those plans and having an honest lessons learned process, and then setting up prioritized actions across the organization to address the things that you learn, as Wosar is laying out here with tabletop exercises, your program will not advance, and disruptive incidents like a ransomware attack will continue to wreak havoc on our industries and in your organization. We’ll link to the article in the show notes for further reading, but I thought it was a great insight from Brian Krebs that really continues to play on that convergence of information security, physical security, business continuity, crisis management and disaster recovery. That’s it for this edition of the Managing Uncertainty Podcast. We’ll be back next week with another new episode. Be well.

Other Episodes

Managing Uncertainty Podcast - Episode #105: Taking care of yourself during a crisis

Managing Uncertainty Podcast – Episode #284: Is it time for a Chief Resilience Officer?

Managing Uncertainty Podcast - Episode #239: Thinking about Business Strategy