tl;dr: A salty salute to Kubernetes, Marathon, and Swarm regarding containers and persistence. Your data is safer and more sound than ever with containers, thanks to virtual volume links, snapshots, and cloud services. Developers everywhere are hacking away to get their code into production. Unfortunately, for the vast majority of enterprise devs, even if they perfect their applications, get through code review, write unit tests, conduct integration testing, and meet all of those unclear requirements, they are still going to have a fruitless Friday experience thanks to infrastructure and DevOps. Let's fix that.

Swarm Fish, Marathon Fish, Kubernetes Fish, ECS Fish?

I can't express how great ALL of these technologies are. However, we have found Kubernetes to be easily accessible, well documented, and well supported. It works with a wide selection of solutions, which allows us to quickly adapt to the latest trends without rebooting our designs, and there is a vast community constantly improving and augmenting it. The utilities bundled into the CoreOS ecosystem are more powerful than most licensed solutions on the market today. We can easily deploy, monitor, and maintain thousands of systems within a Kubernetes stack without ever losing any sleep (and we get Slack notifications for everything!).

As a side note, we initially started our analysis with Mesos for its out-of-the-box functionality, but soon found that supporting DC/OS wrapped around Kubernetes would give us the most utility.

We are constantly experimenting with new solutions as they appear on our feed, and we enjoy giving you insight into our research and design.

The White Fish in the Room

Containers are virtual, and "virtual" usually gets filed under "temporary"; your IT department is inclined not to trust anything that risks destroying data.

At Graphstory, data is our business model. There are many like it, but this data is YOURS. We get that. We promise to safeguard it with our lives.

Cast a line... Got a bite?

What are some claims against it? Here's what I've seen:

  • Questionable persistence
  • Fairly young
  • Goes against much of what the IT industry has been taught to believe
  • Requires research to implement and support

  • "It works on my local machine"
  • "Infrastructure needs to set up a server for dev"
  • "We don't have a license"

Remember your LimeWire collection that you backed up on external hard drives?

That's right! I am encouraging you to tear down your precious production systems, surrender all your rights to AI, and abandon everything you learned in that college databases class.

Why? Higher performance, scalability, and simplicity.

"There's no place like 127.0.0.1" and, for many shops, there's a constant refrain of "It worked in dev". Well, here "Works on my machine = Works in Production"

Cross-platform? No problem! Get rid of "wrong flavor of Linux" and "it doesn't work on Windows"; those objections don't hold water here.
You also get a modular design that allows for plugins, additions, and changes with zero impact, and you've optimized your resource usage!

Corporate architecture design meetings are a thing of the past. Most applications don't require special design or architecture to scale with containerized deployments. Need to run 1 instance? Easy. Need to run 1 million? Test it beforehand.
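
To make that concrete, here's a minimal sketch of what scaling looks like in Kubernetes. The names (my-app, my-app:1.0) are placeholders, not anything from our stack; the point is that the instance count is a single field in a Deployment, and scaling is just changing that number and re-applying.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app                # hypothetical application name
    spec:
      replicas: 1                 # need a million? change this number and re-apply
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: my-app
              image: my-app:1.0   # placeholder image tag
              ports:
                - containerPort: 8080

In practice you'd load test first (and possibly pair this with a HorizontalPodAutoscaler) before committing to very large replica counts.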

There's also the traditional: "We're assigning a support phone to employees to prevent production outages." To that point, with proper design, high availability doesn't require constant hands-on support.

Need more?

  • Developers aren't limited by solutions that are already available on the corporate network.
  • Backups aren't necessary for builds
  • Deployments can be fully automated without billions of scripts
  • Continuous integration can happen asynchronously with no impact
  • Continuous deployment becomes a feature, not a requirement
  • Easily accessible, tested, and repeatable
  • Eliminate environment-related issues
  • Controlled releases with better tracking
  • Simple versioning, plug-n-play
  • Self-healing systems
  • Lightning Fast Recovery
  • No overhead to test third-party dependencies/frameworks/tools before sourcing a license
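
As a rough sketch of how the self-healing and fast-recovery items above play out in Kubernetes terms (the /healthz path, port, and image are assumptions for illustration): a liveness probe tells the cluster when a container has gone bad, and the kubelet restarts it automatically, with no pager involved.

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-app
    spec:
      restartPolicy: Always        # crashed containers come back without a human on call
      containers:
        - name: my-app
          image: my-app:1.0        # placeholder image tag
          livenessProbe:           # if this check fails, the kubelet restarts the container
            httpGet:
              path: /healthz       # assumed health endpoint
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5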

Finally, corporate memory is built-in. No more of this:

  • "There's was only one guy here that understood how to set it up"
  • "Our linux guy left a few months ago"
  • "We don't know which branch to deploy"

Still Skeptical?

"Why isn't everyone doing this?"

See "Evidence" above.
A multitude of new conventions.
Fear of drastic system changes.
Learning curve associated with adopting.
High degree of research required to support.

"What about my data? When the container shuts down everything will be lost."

The immediate assumption with any containerized solution is that it requires a stateless application to operate. This is no longer true in today's world. With Docker on a local machine, you can add virtual directories that persist data storage to localhost. Shut down the container, bring it back to life, and your data is still there. You can stop holding your breath and keep on keepin' on. Even in the event of a total region/provider outage (you really should be deploying to multiple regions if you care about high availability, aka your sanity), once the systems are automatically restored, we can immediately restore stateful applications and replicate data as needed, thanks to containerized deployments using persistent volumes. We're even exploring options for sharing storage volumes within our clusters to minimize the ingress/egress impact of replication.
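
Here's a minimal sketch of the local case, assuming a Docker Compose setup (the service name, image, and paths are illustrative, not our production configuration). A named volume keeps the database files outside the container's writable layer, so they survive the container being removed and recreated:

    services:
      db:
        image: neo4j:latest        # placeholder image; /data is where this image keeps its database files
        volumes:
          - graph-data:/data       # data lives in the named volume, not inside the container
    volumes:
      graph-data: {}               # survives `docker compose down` followed by `up`

The Kubernetes equivalent is a PersistentVolume/PersistentVolumeClaim pair, which is what the cloud sketch below builds on.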

"But, the cloud!"

Whether it's AWS, Azure, or GCE/gcloud, containers, and persistence from within them, are still FULLY supported. Thanks to great advances in Mesos, Tectonic, and Docker, your data ends up in the same reliable systems you're already familiar with: EBS, EFS, Glacier, S3, Azure Premium Storage, etc.
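
As one hedged example of what that looks like on Kubernetes (the claim name, size, and the gp2 storage class, a common EBS-backed default on AWS clusters, are assumptions): a PersistentVolumeClaim requests a cloud disk, and a Pod mounts it like any other volume. The disk outlives the Pod.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: graph-data             # hypothetical claim name
    spec:
      accessModes:
        - ReadWriteOnce            # single-node read/write, typical for block storage such as EBS
      storageClassName: gp2        # assumed EBS-backed class; Azure and GCE classes work the same way
      resources:
        requests:
          storage: 20Gi
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: db
    spec:
      containers:
        - name: db
          image: neo4j:latest      # placeholder image
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: graph-data  # reattaches to the same data if the Pod is rescheduled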

"What about security?"

In most cases, containers are actually MORE secure than traditionally run applications. They can be updated with security patches more regularly, with no impact on the running system. There are also additional inbound/outbound port rules involved in running containers that restrict access, even from the machine they run on.
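
As a sketch of that inbound/outbound point in Kubernetes terms (the labels, policy name, and port are assumptions for illustration), a NetworkPolicy can lock a set of Pods down to a single port from a single client and deny everything else in both directions:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: db-allow-app-only      # hypothetical policy name
    spec:
      podSelector:
        matchLabels:
          app: db                  # applies to the database Pods
      policyTypes:
        - Ingress
        - Egress
      ingress:
        - from:
            - podSelector:
                matchLabels:
                  app: my-app      # only Pods labeled app=my-app may connect
          ports:
            - protocol: TCP
              port: 7687           # assumed client port (Bolt, in a Neo4j example)
      egress: []                   # no outbound traffic permitted at all

Note that enforcement depends on the cluster's network plugin supporting NetworkPolicy.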

Sources:

https://kubernetes.io/docs/concepts/storage/persistent-volumes/
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

Inspirations:

The Twelve-Factor App