Trust but Verify: How To Handle Managed Hosting Providers

    


Every company I have worked for has decided to outsource their datacenter to a managed hosting provider, in some manner or another. Some only outsourced their websites, others went all in and had them manage their entire infrastructure. 

Most managed hosting providers will manage everything from the operating system down the stack to the hardware, the cooling and the electrical for you. Many will use virtualization technology of some kind to help optimize usage and resources for you. They even help companies "right size" their virtual machines when they move into the hosting environment.

There are many advantages to having a hosting provider handle your environment. Typically, it takes what is normally a capital investment and turns it into an operational expense. Companies also gain the provider's know-how. The best providers have a deep bench of administrators and engineers who do this work every day and are extremely knowledgeable and experienced with the subject.

 Sounds pretty good, right?

Companies can reduce overhead, gain skills and knowledge they did not have access to before, and move their environment out of their office building and into a location that is secure and redundant at every level.

What could possibly go wrong? 

Unfortunately, lots of things (you knew there had to be a catch, didn't you?)

Now, I am going to point out some specific items that I have witnessed first-hand. These are specific items and not blanket generalizations about hosting providers. My intention is to educate you on the potential risks of using hosting providers, not to bash the providers. 

OK, on with the not-quite-bashing of providers.

Right-Sizing Your Environment

The first item that's worth noting is the right-sizing of your environment.

Anyone who has worked with virtualization environments has provisioned servers without knowing what they really need to run. We provision servers with 32GB of RAM, 8 vCPUs and hundreds of GB of disk space, only to find out later that the server really only needs 1 vCPU, 4GB of RAM and 25GB of disk space. And of course, after the server is up and running, the business owners are never going to let you downsize the machine to fit what it really needs.

Hosting providers have many tools that allow them to look at your environment and know exactly how they should provision servers for you. This will help reduce a lot of that unused overhead. Here comes the catch: they are going to give you the absolute bare minimum resources every time. They are in business to make money, so they are going to do their best to pack as many guests onto a single host as possible. They are also going to oversubscribe their hosts as much as they can, because their priority is cost reduction.


Most of the time, it's fine. The tools allow them to see what the utilization is and plan accordingly. From personal experience, this works about 9 out of 10 times. There is always a server or servers in your environment that need more due to their workload.

Personally, I spent hours trying to understand what was happening with one of the financial applications a previous employer had moved to a hosting company. For no apparent reason, the server would slow to a snail's pace every Tuesday around 11:30am and recover two hours later. The issue was incredibly difficult to troubleshoot because the server was so resource-starved during those two hours that it was almost inaccessible. After what seemed like weeks of troubleshooting and being unable to replicate the issue, I stumbled across one of the company's accountants, who ran a major report every Tuesday. She would kick off the report and then head to lunch, because she knew it took a long time to generate. She happened to mention that the report had taken only an hour before the server was moved and was taking twice as long now. Mystery solved! The hard part was what actually came next.
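A pattern like this is easier to spot from utilization history than from a login session on a starved server. Here is a minimal sketch of the idea in Python, assuming you can export hourly CPU samples as (timestamp, percent) pairs from whatever monitoring tool you have. The function name, the 90% threshold and the three-week minimum are illustrative assumptions, not something from a particular product.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def recurring_hot_slots(samples, threshold=90.0, min_weeks=3):
    """Return (weekday, hour) slots that were busy in several different weeks.

    samples: iterable of (datetime, cpu_percent) pairs.
    A slot is reported when CPU was at or above `threshold` in at least
    `min_weeks` distinct ISO weeks. weekday follows datetime.weekday(),
    so Monday is 0 and Tuesday is 1.
    """
    weeks_by_slot = defaultdict(set)
    for ts, cpu_pct in samples:
        if cpu_pct >= threshold:
            year, week, _ = ts.isocalendar()
            weeks_by_slot[(ts.weekday(), ts.hour)].add((year, week))
    return sorted(
        slot for slot, weeks in weeks_by_slot.items() if len(weeks) >= min_weeks
    )


def synthetic_samples(weeks=4):
    """Made-up data: quiet all week except Tuesdays from 11:00 to 13:59."""
    start = datetime(2024, 1, 1)  # a Monday
    out = []
    for h in range(weeks * 7 * 24):
        ts = start + timedelta(hours=h)
        busy = ts.weekday() == 1 and 11 <= ts.hour <= 13
        out.append((ts, 98.0 if busy else 20.0))
    # A one-off spike that should NOT be reported as recurring.
    out.append((datetime(2024, 1, 4, 3), 99.0))
    return out


if __name__ == "__main__":
    for weekday, hour in recurring_hot_slots(synthetic_samples()):
        print(f"recurring load: weekday={weekday} hour={hour:02d}:00")
```

With a report like this in hand, the conversation shifts from "the server is slow sometimes" to "who runs something every Tuesday around lunchtime?"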

You would think that I could just put in a ticket with the service provider to increase the resources for that box and schedule a reboot. It's never that easy. What I found out next boggled my mind. The hosting provider had provisioned all our servers in two specific clusters in their environment. This server was on a "low usage" cluster that was already grossly oversubscribed, with no resources left to give it. The other cluster was on an entirely different network. The server couldn’t be moved to that cluster without a "major downtime" for the application, which included redefining the networking stack on that cluster, upgrading the OS on the server and moving three other servers from the financial app environment to that cluster.

While it took me over a month to troubleshoot, it took almost two more months to get it resolved due to all the complications with "simply adding more RAM" to that server. And of course, there was an additional uplift cost for the server because we were adding more resources to it, which had to be approved all the way up to the CIO because of how the contract was written.

The ability to provision resources in an agile fashion is important for a hosting provider. Most can do it easily, but sometimes snags like this can occur. As I was told, “measure twice, cut once.” This old saying definitely applies when you are planning an exercise like this. It's a continual effort to ensure that your environment meets your business needs, not just at the initiation of a contract but throughout its life.
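One way to make that continual effort concrete is a periodic sizing review that you run yourself rather than relying only on the provider's tools. The sketch below is a rough illustration in Python, assuming you can collect utilization samples (as percentages of what is provisioned) per server. The server names, the 95th percentile and the 85%/20% cut-offs are hypothetical choices for the example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def sizing_review(usage_by_server, high=85.0, low=20.0):
    """Classify each server by its 95th-percentile utilization.

    usage_by_server: dict mapping a server name to a list of utilization
    percentages (for example, memory used as a share of memory provisioned).
    Returns a dict mapping each server name to 'needs more', 'oversized' or 'ok'.
    """
    report = {}
    for name, samples in usage_by_server.items():
        p95 = percentile(samples, 95)
        if p95 >= high:
            report[name] = "needs more"
        elif p95 <= low:
            report[name] = "oversized"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    # Made-up numbers: fin-app is mostly idle but spikes hard now and then.
    usage = {
        "fin-app": [30.0] * 18 + [97.0, 99.0],
        "web": [40.0] * 20,
        "batch": [5.0] * 20,
    }
    for server, verdict in sizing_review(usage).items():
        print(f"{server}: {verdict}")
```

Using a high percentile instead of an average is deliberate here: a server that idles all week and starves for two hours looks perfectly healthy on average.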

Know Your Security Responsibilities

The second area is all about security (I know, you are surprised I’m writing about security).

The move from your on-premises datacenter to a hosting provider is often a fantastic security opportunity. Hosting providers won’t let you deploy servers in their environment on operating systems they can’t support and will force you to move to a current, supported version of the operating system. This allows companies to get rid of all the legacy systems that they can’t patch or upgrade and get a clean start on new systems that are fully patched.

Now the bad news: That's often where the security benefit ends. You will need to work with the service provider frequently to determine patching schedules and work through how they lock down your hosts. They often buy anti-malware tools on volume pricing, so they might not be using the products your company has come to expect.

When it comes to hosting providers, the real concern with security is what is now commonly referred to as the "shared security model" (Amazon and Microsoft call it the shared responsibility model). Both have been very forthcoming and have spent a lot of time documenting it, laying out what they are responsible for in security and what the customer is responsible for. This type of model has always been in place with hosting providers, but until recently, it wasn’t well documented.


Your hosting provider isn't going to tell you specifically what your responsibilities are when you engage with them the first time. This is a lesson that has been learned the hard way.

Recently, a customer’s internal host-based IPS (HIPS) started throwing alerts for systems all over their environment. It looked like a massive malware outbreak across the company. Once the alerts were under control, it was clear that the HIPS was doing its job and preventing the breach of thousands of systems. The next question on everyone's mind was, "Where did it come from?" Often, the source of a breach can take a long time to determine, but in this case the HIPS was verbose enough to leave a trail of breadcrumbs right back to the source. It was one of the hosted systems, which an engineer from the hosting company had (for reasons still unknown) dual-homed to a questionable network, and it got righteously hacked. The issue was mitigated and cleaned up with no real damage done, thankfully. If this company hadn't had the HIPS in place, this malware would have infected absolutely everything on their network, all because of a misconfiguration on one of the hosted systems.

What can you do about human error? It happens to everyone. This error was clearly an accident by some administrator and nothing deliberate. To protect yourself, don't trust their environment. Companies should secure their uplinks to their hosting provider as if they were connections to the internet and have controls in place to stop this kind of spread. And of course, if something does happen, you should already have an established agreement that allows your company to receive compensation for problems like this. Check your contracts to ensure that the provider is assuming the risk and is liable for situations like the ones described.
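Treating the uplink like the internet usually means default deny: only traffic you have explicitly approved gets through, and everything else is blocked or at least flagged. As a rough sketch of that idea, the Python below checks observed flows against an allowlist. The networks, port and flow records are invented for the example, and in practice the enforcement would live in your firewall rather than in a script.

```python
import ipaddress

# Hypothetical allowlist: (source network, destination network, destination port).
# Here the hosted servers may only reach one internal subnet over HTTPS.
ALLOWED = [
    ("10.50.0.0/24", "10.1.20.0/24", 443),
]


def is_allowed(src, dst, port, rules=ALLOWED):
    """Return True only if the flow matches an explicit allow rule."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    return any(
        src_ip in ipaddress.ip_network(rule_src)
        and dst_ip in ipaddress.ip_network(rule_dst)
        and port == rule_port
        for rule_src, rule_dst, rule_port in rules
    )


def unexpected_flows(flows, rules=ALLOWED):
    """Return the flows that no rule permits (default deny)."""
    return [flow for flow in flows if not is_allowed(*flow, rules=rules)]


if __name__ == "__main__":
    observed = [
        ("10.50.0.7", "10.1.20.5", 443),  # expected application traffic
        ("10.50.0.7", "10.1.30.9", 445),  # hosted server probing file shares
    ]
    for src, dst, port in unexpected_flows(observed):
        print(f"not on the allowlist: {src} -> {dst}:{port}")
```

The point of the design is that a compromised hosted system can only reach what the allowlist names, so one provider-side mistake does not automatically become your outbreak.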

Trust but verify. Competent hosting providers have a great deal of talent and have been in business for a long time because they do their job right. They have the resources to help you, both in assets and personnel. Despite this, accidents and mistakes are inevitable. Ensure that you are monitoring all the activity and performing health checks on your hosting environment regularly. At the end of the day, it’s your data they are holding.
