The Technical Blog

Orchestration without Kubernetes

Tue 25-Feb-2020 10:23 AM +00:00

A couple of huge trends have swept through coding over the past several years: microservices and containers.

For many developers and shops now, "microservice" has come to mean "container", and once you have containers, well... you have to orchestrate them, right? And that means Kubernetes.

Kubernetes is great. It's come a long way. The hosted versions of it, like Azure Kubernetes Service, completely manage the VMs for you so you can focus on deploying your pods.

And yet... Kubernetes has paper cuts. It works, but once you get past the Kubernetes 101 demos, and you get into production, you start to see the levels of care and feeding that you're responsible for. It takes some skill and experience to run it well.

Questioning the assumptions

So... Kubernetes good. Kubernetes kinda hard sometimes.

Is it necessary? Well... not really. Not for most solutions, I'd say. And I want to talk about why, if for no other reason than to have an interesting discussion about solutions architecture. I'm not saying "don't use Kubernetes" at all... I'm just thinking through the choices.

And does "microservice" have to = "container"? I like what a friend of mine said about containers: "I love them for dev envionments for my team, because it eliminates setup time, and I can trust the versions, but I don't deploy them to production. They're unnecessary."

So, yes, "microservice" = "container" for dev, but maybe not for prod. I like that perspective. (And in case you're still not using containers even in dev... don't worry about it. You don't have to.)

In other words, remember when we used to just run servers, and we made sure they had the right runtime versions installed? And it worked? It still does, and you don't need an orchestrator for it.

Orchestration using Platform-as-a-Service

I spent years of my career as a network administrator, and I've installed and managed more servers in my life than I can remember. I still do it when I have to, but if I don't have to, I use Platform-as-a-Service. And almost all of the time, I don't have to.

I'm going to use Azure App Service and Azure Functions as my example deployment vehicles for this exercise. You might be familiar with other PaaS application services, like Heroku, so feel free to swap the product names in your head.

Azure App Service lets me deploy just-my-code to a managed, fully secure, highly tuned VM with predictable versions of runtimes installed by Microsoft.

And if I want to run a container on App Service, great. And if I want to bring-my-own-runtime, great. But I don't need to.

What do I want from containers?

What I really want from a container is: a process, running on some computer, with predictable versions of runtimes (so I don't waste time chasing runtime version issues).

With App Service, I get exactly that: a process, running on some computer, with predictable versions of runtimes (installed by Microsoft instead of docker pull).

I don't need containers in production to get the benefits I want from containers. There are other ways to get those benefits that keep things much simpler. No YAML, no multi-gigabyte network transfers to get things into test and production.

What do I want out of orchestration?

So that's what I want out of containers... what about orchestration? What do we get out of it? Can we get that some other way?

Kubernetes gives fine-grained control over how many versions of each pod are running on which nodes. That's awesome. If you need that level of control, it's there.

Do we always need that level of control over exactly how services get deployed? I don't think that we do. Sometimes we do, for sure, but not always.

What I think we really want from orchestration, most of the time, is:

  • Can I make sure I have more than one instance of a service running, for redundancy?
  • Does each service have enough resources (CPU, memory, network bandwidth, etc.) to run well?
  • Can I auto-scale capacity up when I need to deliver more resources? Can I auto-scale down when things are slower?
  • Can I roll out updates in a safe way that doesn't cause site outages?

Something like that. You might have more requirements, but that's mostly what we get from Kubernetes.

And, for this exercise, I'll suggest that Azure App Service (with Azure Monitor for fine-grained control over auto-scaling) gives you those features without having to write a single line of YAML.
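
To make that concrete, here's roughly what one of those auto-scale rules looks like if you create it from Python with the azure-mgmt-monitor SDK instead of clicking through the portal. This is a sketch from memory, not a drop-in script: the subscription, resource group, plan name, and region are made up, and the exact payload shape should be checked against the SDK docs.

    # Sketch: CPU-based auto-scale on an App Service Plan via azure-mgmt-monitor.
    # Resource names are invented and the payload shape is from memory; verify
    # against the SDK docs before using it for real.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.monitor import MonitorManagementClient

    SUBSCRIPTION_ID = "<subscription-id>"  # placeholder
    PLAN_ID = (
        "/subscriptions/<subscription-id>/resourceGroups/my-rg"
        "/providers/Microsoft.Web/serverfarms/cpu-bound-plan"  # hypothetical plan
    )

    monitor = MonitorManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

    monitor.autoscale_settings.create_or_update(
        "my-rg",
        "cpu-bound-plan-autoscale",
        {
            "location": "westeurope",  # placeholder region
            "target_resource_uri": PLAN_ID,
            "enabled": True,
            "profiles": [{
                "name": "default",
                # Never fewer than 2 instances (redundancy), never more than 10.
                "capacity": {"minimum": "2", "maximum": "10", "default": "2"},
                "rules": [{
                    # Scale out by one instance when average CPU > 70% for 10 minutes.
                    "metric_trigger": {
                        "metric_name": "CpuPercentage",
                        "metric_resource_uri": PLAN_ID,
                        "time_grain": "PT1M",
                        "statistic": "Average",
                        "time_window": "PT10M",
                        "time_aggregation": "Average",
                        "operator": "GreaterThan",
                        "threshold": 70,
                    },
                    "scale_action": {
                        "direction": "Increase",
                        "type": "ChangeCount",
                        "value": "1",
                        "cooldown": "PT5M",
                    },
                }],
                # A mirrored rule ("LessThan 30", direction "Decrease") handles
                # scaling back in when things are quiet.
            }],
        },
    )

That covers the first three bullets; for the fourth, App Service deployment slots let you stage an update and swap it in without an outage.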

Group services by their scaling requirements

Imagine that we have 100 services that we need to deploy onto a set of App Service machines. How should we organize them for the best experience for our users, and our lowest Azure bill at the end of the month?

I'll start by suggesting that you think about your services in terms of which resource they're likely to run out of first: CPU, memory, or bandwidth; or which external resources indicate pressure, like queue length or task pool backlog.

I think these are a good first guess for most organizations on how to deploy services together, but - I want to be clear - they're not intended to be an authoritative list. At all. Once you start thinking about your services, you might find other perspectives that help you scale them up and down properly. But let's proceed with that list.

Services with similar scaling requirements get deployed together.

So, back to the 100 services, let's imagine that they break down like this (I'll sketch the grouping in code right after the list):

  • 55 are CPU-bound with low calls/sec
  • 20 are CPU-bound with high calls/sec
  • 5 are memory-bound
  • 5 are bandwidth-bound
  • 10 are message-queue-length-bound
  • 5 are task pool bound
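
Just to make the bucketing concrete, here's a tiny sketch of the idea: tag each service with the resource it's most likely to exhaust, then give each profile its own App Service Plan. The service and plan names are invented for illustration.

    # Toy sketch: group services into App Service Plans by their likely bottleneck.
    # Service names and plan names are invented; the point is the bucketing.
    from collections import defaultdict

    services = {
        "pricing-api": "cpu-low-traffic",
        "search-api": "cpu-high-traffic",
        "report-builder": "memory",
        "media-proxy": "bandwidth",
        "order-events": "queue-length",
        "thumbnail-worker": "task-pool",
        # ...the other 94 services get tagged the same way.
    }

    # One App Service Plan per scaling profile; each plan auto-scales on its own metric.
    plan_for_profile = {
        "cpu-low-traffic": "plan-cpu-low",
        "cpu-high-traffic": "plan-cpu-high",
        "memory": "plan-memory",
        "bandwidth": "plan-bandwidth",
        "queue-length": "plan-queue",
        "task-pool": "plan-taskpool",
    }

    deployment = defaultdict(list)
    for service, profile in services.items():
        deployment[plan_for_profile[profile]].append(service)

    for plan, members in sorted(deployment.items()):
        print(f"{plan}: {', '.join(members)}")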

For the services that are CPU-bound, what we're saying is: if I have sufficient CPU available for this service, it should run fine. If two or three services that are all CPU-bound run together on the same servers, and they need to auto-scale up, it doesn't matter which service caused it... they'll all run on additional CPUs and all have enough to run well.

I definitely don't need to decide up-front how much memory each server needs; the Windows and Linux memory managers are far better at allocating memory across dozens of processes than I will ever be.

And if 50 (or more) services were all CPU-bound, stateless, and relatively low in calls/sec, we could run all of them together in one Azure App Service Plan with a large enough VM size, and all we have to do is set auto-scaling based on CPU %. Or split them across two Plans. A server can run 50 processes without a problem.

Whatever. It works. They're just processes. I never have to think about it.

For the services that are memory-bound, we can deploy them all together on sufficiently large VMs that they run well, and we can use Azure Monitor to set up auto-scale parameters around not just CPU, but also things like available memory and process memory sizes.

For the services that are bandwidth-heavy, same thing. We use Azure Monitor to keep an eye on network traffic, and scale accordingly. The services requiring the bandwidth don't have to care which service caused a scaling event, as long as they all have sufficient available bandwidth.
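
If it helps to see the whole mapping in one place, here's a sketch of which Azure Monitor metric each hypothetical plan's auto-scale rule would key off. The metric names are the ones I remember App Service Plans and Service Bus exposing (CpuPercentage, MemoryPercentage, BytesSent, ActiveMessages); double-check them against the metric list for your own resources before wiring up rules.

    # Which Azure Monitor metric each plan's auto-scale rule keys off.
    # Metric names are from memory; verify against the Azure Monitor metric
    # reference for App Service Plans and Service Bus before relying on them.
    scale_signal = {
        "plan-cpu-low":   ("CpuPercentage", "the plan itself"),
        "plan-cpu-high":  ("CpuPercentage", "the plan itself"),
        "plan-memory":    ("MemoryPercentage", "the plan itself"),
        "plan-bandwidth": ("BytesSent", "the plan itself"),
        # Queue-length scaling keys off a different resource's metric entirely:
        "plan-queue":     ("ActiveMessages", "the Service Bus namespace"),
        # Task-pool pressure usually has to be published as a custom metric first:
        "plan-taskpool":  ("pending-task-count (custom metric)", "your own telemetry"),
    }

    for plan, (metric, source) in scale_signal.items():
        print(f"{plan}: scale on {metric} from {source}")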

As an aside, in many cases, being queue-length-bound or task-pool-bound shows up as just using a lot of CPU, and if you throw more CPU at either problem, it should go away. But it's entirely possible to be bound by some external service or resource that shows up in your system as a queue-length issue.

Fine tuning

Start simple. When you identify a service or two that really should be run separately from the others, move it to another App Service Plan. No big deal. Let evidence be your guide.

What about Azure Functions?

I'll bet that some of those services don't get all that many calls every day, and I'll bet that some of them work asynchronously - responding to queue messages, or handling tasks not directly related to UX - and for those services, Azure Functions is great. Let Azure handle everything, and only pay for the actual CPU time and memory your code is using.
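
For the queue-driven ones, the code itself stays tiny. Here's a minimal sketch of a queue-triggered function using the Azure Functions Python (v2) programming model; the queue name and connection setting name are placeholders I made up.

    # Minimal sketch of a queue-triggered Azure Function (Python v2 programming model).
    # "orders" and "AzureWebJobsStorage" are placeholder names for this example.
    import logging

    import azure.functions as func

    app = func.FunctionApp()

    @app.queue_trigger(arg_name="msg", queue_name="orders",
                       connection="AzureWebJobsStorage")
    def process_order(msg: func.QueueMessage) -> None:
        # The platform scales instances with queue depth; the function only
        # has to handle one message at a time.
        body = msg.get_body().decode("utf-8")
        logging.info("Processing queue message: %s", body)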

And if you want to use Azure Functions for your services, but need to make sure that you never hit a cold start, you can deploy an Azure Function project on an App Service Plan that you own exclusively. You can deploy them side-by-side with Web Apps on the same Plan. Totally up to you. "Serverless" model, but on servers you have to yourself.

Wrapping up

Again, I'm not saying that Kubernetes isn't good at what it does, or that it's not an appropriate choice for some solutions. I'm just saying that it does come with quite a bit of care and feeding required, and for many solutions - many more than we seem to have settled on as an industry - you don't need it. You can deploy dozens of services, handling thousands of requests/second, that will run very well without an orchestrator, an IaaS VM, or a YAML file in sight.

Really, what I'm encouraging you to do is to think for yourself. Don't just default to what's in fashion right now. Every piece of technology you choose to operate yourself incurs tech debt, takes up engineering time, causes outages, requires updates, etc. After a lot of years in the business, I want as little of that as possible.

Azure App Service and Azure Functions are amazing Platform-as-a-Service pieces to build on that give you utility and performance with as little management overhead as possible. I use them whenever I can, but, either way... think about PaaS instead of IaaS, think about what you really need, not just what the cool kids are doing, and keep things simple.