Shipping NodeJS with 
Docker and CoreOS 
@RossKukulinski 
BayNode Talk Night 
November 20, 2014
@RossKukulinski 
@RossKukulinski 
SpeakIt.io Founder 
BayNode Co-Organizer 
Soccer Fanatic 
Node-Forward Mentor
What I’m going to Cover 
@RossKukulinski 
• Our Story 
• Background on Docker & CoreOS 
• Tips & Tricks / Lessons Learned
* History of SpeakIt 
* We were the Future-Tech / Labs group, based in CA, but HQ in Virginia 
* Tried lots of audio/video conferencing tools; none worked well enough for us 
* We wanted 
* high-quality audio 
* no user accounts or 18-digit codes 
* runs in a web browser, one click on a link 
* support for 1-on-1 as well as company-wide meetings 
* Parent company builds real-time communication for flight simulators and military training 
* why don’t we build our own tool? 
* NodeJS, websockets/socket.io 
* WebRTC 
* Proprietary high-performance audio mixing platform
The internal tool that wasn’t 
internal anymore 
@RossKukulinski 
SpeakIt was well received, and we used it internally 
But then we started using it for customer calls and open source projects 
Other people wanted in… so we opened SpeakIt up to the public via a closed beta 
It was a side project that grew with new features, capabilities, etc., while staying one big monolithic app
@RossKukulinski 
When we decided to spin SpeakIt off, we realized that to actually build and iterate quickly we needed to break up our monolithic application into smaller microservices. 
At the time, new devs needed to understand the whole application to make changes, which slowed down our on-boarding process. 
We also knew this re-architecture would require a complete rework of our build, test, and deployment infrastructure. 
So we took a deep breath, planned things out, and got to work.
@RossKukulinski 
12factor.net 
As we were planning out the reworked architecture for the new SpeakIt, we drew heavily from the idea of a 12-factor app. 
I highly recommend you take a close look at this site; in my opinion it offers great guidelines for building networked applications.
@RossKukulinski 
Our Goals 
• Reduce application complexity (do one thing well!) 
• Scalable 
• Fault tolerant 
• Support running multiple versions of the same app 
• Consistent app from dev → test → staging → prod 
• Minimize time spent doing ‘devops’ 
People from the Node community should be familiar with the idea of building small modules that do one thing well. We wanted to take the same approach at the 
application layer. Each app should do one thing and do it well. For example, we have an app that manages file uploads, another for managing user accounts, 
another for managing conference rooms, etc.
@RossKukulinski 
Docker 
I’ve been tracking the Docker project since it was first open sourced in 2013. We did some experiments with version 1.0 and were pretty happy.
VM vs Docker 
@RossKukulinski 
https://docker.com/whatisdocker/ 
In some ways, you can think of Docker containers as lightweight virtual machines. Docker was originally based on LXC, or Linux Containers, but earlier this year they 
moved to their own libcontainer library. 
One big difference is that Docker containers usually run only _one_ process, whereas your VM might run lots of processes. Since Docker containers are so lightweight, 
it’s easy to run lots of containers, one per process.
@RossKukulinski 
• Containers start quickly 
• Containers have small footprint 
• Dockerized applications run anywhere 
• Really fast builds via cached images 
• Registry for storing images from build pipeline 
• Images can be layered 
• Abstracts app networking from system networking
@RossKukulinski 
Our Goals 
• Reduce application complexity (do one thing well!) 
• Scalable 
• Fault tolerant 
• Run multiple versions of the same app 
• Consistent app from dev → test → staging → prod 
• Minimize time spent doing ‘devops’ 
So by using Docker as our runtime, we could split up our monolithic application into several smaller pieces, each running inside its own container. 
Because Docker abstracts the networking and filesystem for each app, we could run multiple versions of the same app on the same host. This is helpful for A/B testing 
of changes as well as allowing us to scale our applications out horizontally. 
Finally, because containers are build once, run anywhere… this helps eliminate the ‘but it runs on my machine’ problem.
How do you ship 
docker containers? 
@RossKukulinski 
Bash scripts (ugh) 
Ansible / Puppet / Chef 
So then the question became: OK, how do we actually get our containers into the wild? 
Obviously, the Docker registry would be a central tenet, but there need to be some orchestration tools around it.
Linux for Massive Server Deployments 
@RossKukulinski 
This is where CoreOS entered the picture for us. We had an opportunity to sit down with some seasoned Rackspace developers who had worked on their onMetal 
product. onMetal launched supporting CoreOS, and Rackspace has been really helpful to us in planning and building our new architecture. After a lengthy discussion 
with them, we came to the conclusion that we should try out CoreOS.
@RossKukulinski 
• Minimal Operating System 
• Automated software updates 
• Runs docker containers 
• Supported by all major cloud providers 
• Can also run on bare metal 
https://coreos.com/
@RossKukulinski 
Fault Tolerant 
• Clustered by default 
• Support for multiple HA zones 
• Distributed tools like etcd & fleet 
• HTTP Key-Value Store 
• Service Discovery 
• Application Scheduling 
https://coreos.com/ 
etcd is a distributed, highly available key-value store that runs in your CoreOS cluster. It’s similar to ZooKeeper in that it can be used for service discovery and shared 
configuration.
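For example, a service instance can announce itself by writing a key that other services then read. A hypothetical sketch using etcdctl (the key layout, address, and TTL are illustrative): 
  # announce a running instance; the TTL makes the entry expire if it isn't refreshed 
  etcdctl set /services/accounts/instance-1 '10.0.0.5:49153' --ttl 60 
  # any other service can now discover it 
  etcdctl get /services/accounts/instance-1 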
@RossKukulinski 
Scalable 
https://coreos.com/ 
What’s cool about CoreOS is that the topology is immensely scalable. 
The core cluster running etcd is limited to 9 nodes, but worker machines can join as consumers of etcd and have fleet schedule Docker containers onto them. This allows you to 
scale (and scale down) your resources quickly and efficiently.
@RossKukulinski 
Goals 
• Reduce application complexity (do one thing well!) 
• Scalable 
• Fault tolerant 
• Run multiple versions of the same app 
• Consistent app from dev → test → staging → prod 
• Minimize time spent doing ‘devops’ 
So with CoreOS and Docker, we’ve checked everything off our list. 
There are definitely some missing pieces of the puzzle. There’s no great automated way of hooking your build/test system (Jenkins, for us) up to deploy new services to 
your CoreOS cluster. 
There are a few systems that are trying to take this on but are pretty early in development. In the meantime, we’re building our own (open source) tool to handle these 
rolling deploys.
Now for the good stuff 
@RossKukulinski 
Lessons Learned / Tips & Tricks
Docker Registry 
@RossKukulinski 
• Public Registry 
• Private Registry 
• Quay.io / DockerHub 
• Run your own 
• Take advantage of layering docker images 
* The Docker registry is great (though not as great as npm!) 
* Published Docker images are a great starting point 
* Definitely use them for your underlying OS 
* But for software I’ve found them to be out of date, and tagged versions _tend_ to be old or not quite usable 
* I often copy/paste a public Dockerfile and make my own version, then push to either the public registry or our own private one 
* Private registries seem to be hit or miss. We tried Quay.io and Docker Hub and were not impressed with their performance for downloading images. I spoke with some 
senior folks at Quay (now owned by CoreOS) about our performance problems and they say they’ve made improvements, especially on the AWS <-> Rackspace network 
connection. 
* We ultimately decided to run our own registry _inside_ our datacenter for fast docker push/pulls. 
* I do recommend using SSD-backed storage for that, however, and you should probably take advantage of elastic block storage (or whatever it’s called in your hosting 
system). 
* I should also note that the registry runs _inside_ a Docker container. So that means you’re storing Docker images inside a Docker container. Yo dog, I heard 
you like images.
@RossKukulinski 
speakit/nodejs 
This is an example Dockerfile for building a NodeJS image. This is what we use for our base NodeJS images, and we have published it on GitHub and Docker Hub. 
An important note regarding Dockerfile builds: docker build by default caches each build step (each RUN, ADD, and so on). This is great, because subsequent builds 
only need to start at the first step that has changed. Unfortunately, if your build does things over the network (like yum update / yum install, npm install), Docker doesn’t 
know whether the network resources have changed — so it just uses the cache. This is why we added an ENV LAST_UPDATED to line 4. If we want to update this 
image’s packages, we just need to bump the LAST_UPDATED env line, and then the next Docker build will start at line 4 and grab the latest packages. 
It’s important to note that docker build does have a --no-cache flag which will force all steps to be rebuilt every time. This ensures your builds are always up to date… but 
they will also take longer to complete. The choice is yours.
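The slide itself showed the Dockerfile, which doesn't survive in this transcript; here is a minimal sketch of what a base Node image using the LAST_UPDATED trick might look like (the base image, Node version, and packages are illustrative, not our exact file): 
  FROM centos:centos7 
  
  # Bump this date to bust the build cache and pull fresh packages on the next build 
  ENV LAST_UPDATED 2014-11-20 
  
  RUN yum -y update && yum -y install tar curl 
  RUN curl -sL http://nodejs.org/dist/v0.10.33/node-v0.10.33-linux-x64.tar.gz \ 
      | tar xz --strip-components=1 -C /usr/local 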
Node App Dockerfile 
@RossKukulinski 
This is an example Dockerfile for one of our NodeJS apps. 
It’s important to note that we add our package.json file first and run npm install (which can take a while), and only then add our source code. 
This way, if you change your source, a new docker build only needs to copy in your new source and can reuse the cached npm install layer. 
Note that you might also want to add ENV NPM_INSTALLED <date> above the npm install step; otherwise subsequent builds will not grab the latest packages.
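Again, the original slide carried the Dockerfile itself; a minimal sketch of that layering (the base image, paths, port, and entry point are illustrative): 
  FROM speakit/nodejs 
  
  # Dependencies first: this layer stays cached until package.json changes 
  ADD package.json /app/package.json 
  RUN cd /app && npm install --production 
  
  # Source changes only invalidate the layers from here down 
  ADD . /app 
  WORKDIR /app 
  
  EXPOSE 8080 
  CMD ["node", "server.js"] 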
Private GitHub Repos 
@RossKukulinski 
If you need to npm install or git clone from private repos, then you’ll need to add your private key to the image. We have a special SSH key for our automated build system (that’s 
tied to a robot GitHub account in our organization). 
This allows us to git clone private GitHub repos in later images. 
We remove the id_rsa key in our application Dockerfiles so that our private key is never in a container outside our private network. 
It’s been pointed out to me that we could take advantage of Docker’s new ONBUILD instruction in this Dockerfile. ONBUILD would allow us to add package.json, npm install, 
git clone, and then remove the SSH key in an automated way. That’s something we might experiment with.
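One way this pattern can look; a hedged sketch (the key path, base image, and steps are assumptions, and note that a removed key still exists in earlier image layers, so keep such images inside your private registry): 
  FROM speakit/nodejs 
  
  # Deploy key for our build robot; used only inside the private build network 
  ADD id_rsa /root/.ssh/id_rsa 
  RUN chmod 600 /root/.ssh/id_rsa && \ 
      ssh-keyscan github.com >> /root/.ssh/known_hosts 
  
  # npm install can now pull private git dependencies 
  ADD package.json /app/package.json 
  RUN cd /app && npm install --production 
  
  # Remove the key so it isn't present in the running container 
  RUN rm /root/.ssh/id_rsa 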
> docker pull image:latest 
@RossKukulinski 
The docker pull command by default pulls all tagged versions of your image. If you have automated builds, this pull can take a _long_ time. I highly recommend only 
pulling the :latest tag, or, if you know the specific version you want, pulling that tag. 
For us, our automated builds tag images with the git SHA and Jenkins build number so we can easily pull down any version from our private registry.
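For example (the registry host and tag scheme here are illustrative): 
  # Pull only the tag you need instead of every tag in the repository 
  docker pull registry.example.com:5000/speakit/accounts:latest 
  
  # Or pull one exact build, tagged with git SHA plus Jenkins build number 
  docker pull registry.example.com:5000/speakit/accounts:3fa2c71-112 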
Local CoreOS Dev 
• Can use Vagrant with a single (or multi) node cluster 
• Digital Ocean pretty cheap for small cluster 
@RossKukulinski 
In theory you can use Vagrant for local CoreOS development testing. We’ve gotten it to work, but we ran into so many problems over the last 2-3 weeks that we’ve 
scrapped it. 
Instead we’re taking advantage of cost-effective VMs in the cloud to provision a CoreOS machine (or three) per developer on demand.
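If you do want to try the Vagrant route, the coreos-vagrant repo is the usual starting point; a quick sketch (cluster size and other settings live in config.rb): 
  git clone https://github.com/coreos/coreos-vagrant.git 
  cd coreos-vagrant 
  
  # config.rb controls instance count; user-data holds your cloud-config 
  cp config.rb.sample config.rb 
  cp user-data.sample user-data 
  
  vagrant up 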
@RossKukulinski 
Monitoring 
• CLI tools (fleetctl via ssh) 
• CoreGI (github.com/astilabs/CoreGI) 
• cAdvisor (github.com/google/cadvisor) 
I’m all for having great command-line tools. Fleet & etcd are pretty decent overall, but if you’ve got more than 20 services or so (we run almost 100 in production), it 
quickly becomes annoying to monitor the state of your cluster. 
We built an open source tool called CoreGI that is a simple NodeJS/AngularJS application that runs in a container (duh) on every host in your CoreOS cluster. CoreGI 
provides an easy at-a-glance display of the machines, units, and unit-files in your cluster. 
If you want a more in-depth look at the performance of your containers, take a look at cAdvisor, which gives per-container CPU, memory, and network activity. cAdvisor 
has a REST API, so I expect we’ll eventually add that information to CoreGI.
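For the CLI route, fleetctl can drive the whole cluster over SSH from your laptop; a quick sketch (the tunnel address and unit name are illustrative): 
  # Tunnel fleetctl through any machine in the cluster 
  export FLEETCTL_TUNNEL=203.0.113.10 
  
  fleetctl list-machines               # hosts in the cluster 
  fleetctl list-units                  # state of every scheduled unit 
  fleetctl journal accounts@1.service  # logs for a single unit 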
Service Discovery 
@RossKukulinski 
• Don’t hardcode the host port of your container 
• Sidekick pattern -> Write to etcd 
• Confd (github.com/kelseyhightower/confd) 
• Vulcan (vulcanproxy.com) 
https://coreos.com/ 
One of the new things for me when building out our new application stack was the idea of service discovery. Each time an instance of a service comes up (e.g. user 
account management), it (or its sidekick) registers the IP & port of the running container with etcd. This allows any other service that needs to make account queries to 
find a running instance of the account REST API. 
This also allows you to quickly scale any one particular service out horizontally by dynamically updating load balancers / proxies (like nginx / HAProxy) based on etcd 
information. A great tool for doing that is confd, written by Kelsey Hightower. 
Another tool that’s still under active development is Vulcan Proxy, which could really help in this area. It was a little _too_ new even for us; nginx/HAProxy are really good 
at what they do.
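The sidekick itself is usually a tiny fleet unit scheduled next to the service it announces. A sketch of the pattern (unit names, the etcd key, address, and port are illustrative; in practice you would discover the mapped host port, e.g. via docker port, rather than hardcode it, and the Redis gist in the resources shows a full example): 
  # accounts-sidekick@.service (runs on the same machine as accounts@%i.service) 
  [Unit] 
  Description=Announce accounts@%i in etcd 
  BindsTo=accounts@%i.service 
  After=accounts@%i.service 
  
  [Service] 
  EnvironmentFile=/etc/environment 
  ExecStart=/bin/sh -c "while true; do etcdctl set /services/accounts/%i ${COREOS_PRIVATE_IPV4}:49153 --ttl 60; sleep 45; done" 
  ExecStop=/usr/bin/etcdctl rm /services/accounts/%i 
  
  [X-Fleet] 
  MachineOf=accounts@%i.service 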
Cloud Load Balancers 
• How do your users access services in CoreOS? 
• Could run Global service with proxy on 80/443 
• Or update cloud load balancers dynamically based on etcd 
• Soon: github.com/astilabs/CoreOS-Cloud-LB 
@RossKukulinski 
So with service discovery, all of the apps/proxies within your cluster can dynamically reconfigure themselves as applications start/stop across your cluster. 
But how do your end-users actually connect to your system? You *could* run a global service on every machine that listens on port 80/443 and proxies traffic 
accordingly. Then your DNS system does round-robin balancing across all the hosts in your system. 
I’m not a huge fan of this simply because of the delays in DNS propagation if a host goes down. This is especially true if you’re taking advantage of auto-scaling of your 
cluster. 
Instead, we’ve created a package called CoreOS-Cloud-LB (we’ll probably rename it to etcd-cloud-lb to avoid a trademark conflict). This tool monitors etcd for load-balancer 
name-to-ip:port mappings and dynamically updates your cloud load balancers accordingly. Right now this tool only supports Rackspace load balancers, but under the 
hood it uses a cross-cloud library called pkgcloud. Once we clean up the documentation a bit, we will be open sourcing this tool (probably the week of 11/23/2014).
npm install -g coreos-cluster-cli 
@RossKukulinski 
If you’re a developer and you quickly want to bring up a CoreOS cluster, check out this cool tool built by Ken Perkins using pkgcloud. 
Right now it’s Rackspace only, but there’s no reason it couldn’t be expanded to other platforms.
Things to watch 
@RossKukulinski 
• Kubernetes 
• Google Container Engine 
• Vulcan Proxy 
• Paz (paz.sh) 
• Panamax (panamax.io) 
• Mesosphere (mesosphere.com) 
I mentioned earlier that the deployment (and rolling deployment) of services is still fairly “roll your own” in the CoreOS ecosystem. Kubernetes is the project to watch in 
this area — it’s what Google is using to automate their new Google Container Engine. I talked with a few of the folks working on the project and they recommended 
taking a serious look at “production” use of Kubernetes in the late-spring / early-summer 2015 timeframe. 
Paz.sh is a project that’s being worked on by the folks at http://www.yld.io/. It looks to be a slightly different take on CoreGI in that it’s focused on the build/delivery 
pipeline of containers to your CoreOS system. I understand that they’re planning on open sourcing Paz in early 2015. 
Panamax is another commercial (open source?) project that’s supposed to help with the automated deployment of containers. We haven’t looked too closely at it, but it’s 
something we’re keeping an eye on. 
Finally, be sure to check out Mesosphere. It’s a combination of CoreOS and Apache Mesos that in theory will be amazingly awesome. It’s still new, and to be honest their 
documentation / walkthrough tutorials are still very immature (nearly non-existent). I sat down with a friend from New Relic a couple weeks ago and we were unsuccessful in 
deploying Node & Redis to a Mesosphere cluster. *sigh*
@RossKukulinski 
Resources 
• Example cloud_config 
• https://gist.github.com/rosskukulinski/9ddff8e5f67a24cc7bb7 
• Full example of sidekick pattern for Redis 
• https://gist.github.com/rosskukulinski/96f7709fa20d7def6b9e 
• PXE Booting CoreOS Post coming soon… 
The example cloud-config includes Rackspace monitoring, using a second hard drive for Docker image storage, private Docker Hub credentials, and dynamically detecting private IPs for 
use in apps. 
Another common pattern with CoreOS is service announcement/discovery through etcd. The exact systemd/etcd configuration can be a little tricky. 
Finally, we’ve gotten PXE booting of CoreOS working in a way that matches our production cloud configuration. We’ll be blogging about / open sourcing some tools in that space very soon.
Other Resources 
@RossKukulinski 
• CoreOS Docs: https://coreos.com/docs/ 
• CoreOS User Google Group 
• #coreos & #docker on FreeNode (I’m ‘rossk’) 
• SpeakIt GitHub (https://github.com/astilabs) 
• SpeakIt Blog (https://blog.speakit.io) 
Definitely check out the CoreOS docs as well as the CoreOS User & Dev Google Groups. 
#coreos on FreeNode has a pretty active community. If you’ve got questions, ask there — please feel free to also ping me directly (I’m ‘rossk’ on FreeNode). 
Check out our GitHub and our blog for more posts as we explore the CoreOS / Docker ecosystem.
@RossKukulinski 
Thanks! 
Questions? 
I hope this material proves helpful. 
I’d love to hear your feedback, comments, suggestions, and corrections! You can find me on twitter / github @rosskukulinski and you can email me at ross@speakit.io.
