A New Model for Image Distribution
Docker Registry 2.0
Stephen Day
Distribution, Tech Lead
Docker, Inc.
stephen@docker.com
@stevvooe
github.com/stevvooe
Overview
• What is Docker?
• What is an Image?
• What is the Docker Registry?
• History
• Docker Registry API V2
• Implementation
• The Future
What is Docker?
https://www.docker.com/whatisdocker/
What is an Image?
A runnable component with a filesystem
What is an Image?
• Containers, the runtime of Docker, are created from images
• Filesystem made up of “layers”
– Just tar files
– Layers can be shared between images
• Includes a description organizing layers into an image
• Identified by a name (ubuntu, redis, stevvooe/myapp)
• docker run ubuntu
– Runs a container, created from image ubuntu
What is the Docker Registry?
A central place to store and distribute docker images
What is the Docker Registry?
• Stores the layers and the description of how they make up an image
• Implements a common API agreed upon by Docker clients
• Several Implementations
– A simple web server to make images available
– A complete web application
– Services like the Docker Hub contain a registry
• Documentation: https://docs.docker.com/registry/
History
Docker Registry API V1
Docker Registry API V1
• Layer Oriented
• Layer IDs are randomly assigned
• JSON object corresponding to each layer referencing a parent
• Naming accomplished through tags
History
[Diagram: four layers, each paired with a JSON object; Fetch(ID)]
Registry API V1 URL Layout
Methods            URL
GET                /v1/_ping
GET, PUT           /v1/images/(image_id)/layer
GET, PUT           /v1/images/(image_id)/json
GET                /v1/images/(image_id)/ancestry
GET                /v1/repositories/(namespace)/(repository)/tags
GET, PUT, DELETE   /v1/repositories/(namespace)/(repository)/tags/(tag*)
DELETE             /v1/repositories/(namespace)/(repository)/
GET                /v1/search
https://docs.docker.com/reference/api/registry_api/
Docker Registry API V1
• Performance
– Fetch a layer, fetch the parent, fetch the parent, …
• Security
– Image IDs must be kept secret
– Who assigns the layer IDs?
– Hard to audit, verify
• Implementation in Python
– Affected ease of deployment
– Reduced sharing with main Docker Project
• More available through https://github.com/docker/docker/issues/8093
Problems
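The performance problem on this slide can be illustrated with a minimal sketch. The in-memory `V1_STORE` dict, the layer IDs, and `fetch_chain` are hypothetical stand-ins for a V1 registry, not real API calls. The point is only that each parent ID is learned after the previous fetch returns, so the round trips happen one after another.

```python
# Hypothetical in-memory stand-in for a V1 registry: each layer's JSON
# names its parent, so the client learns the next ID only after a fetch.
V1_STORE = {
    "c3": {"parent": "b2"},
    "b2": {"parent": "a1"},
    "a1": {"parent": None},
}


def fetch_chain(store, image_id):
    """Walk parent pointers one fetch at a time, as the slide describes."""
    chain = []
    current = image_id
    while current is not None:
        layer_json = store[current]  # one simulated round trip per layer
        chain.append(current)
        current = layer_json["parent"]
    return chain


chain = fetch_chain(V1_STORE, "c3")
print(chain, "serial round trips:", len(chain))
```

With three layers the client needs three serial fetches; nothing can be requested ahead of time.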
Docker Registry API V2
Design
Docker Registry API V2
• Simplicity
– Easy to implement
– Works with static host
• Distribution
– Separate location of content from naming
• Security
– Verifiable Images
– Straightforward access control
• Performance
– Remove the single track (fetching one parent after another)
• Implementation
– Move to Go to increase code sharing with Docker Engine
Goals
Docker Registry API V2
• Layers are treated as content-addressable blobs
– Much better for security
– Permits safe distribution through untrusted channels
• All data can be verified
– Improved cache-ability
• Content address is known as the “digest”
Content-Addressable
Docker Registry API V2
• Uniquely identifies content
• A cryptographically strong hash
– Chose a name, digest, that does not conflict with other concepts (map, dict, crc, etc.)
• For Registry V2, simply using sha256(bytes)
– Easy to implement
– Easy to verify
• Independently Verifiable
– If you and I agree on a common algorithm, we can choose IDs for content without coordinating
• Strongly-typed with tools to parse and verify
– http://godoc.org/github.com/docker/distribution/digest
What is a digest?
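A minimal sketch of the idea, assuming the `sha256:<hex>` text form that appears in the `docker pull ubuntu@sha256:...` example in this deck. The function names `compute_digest` and `verify` are made up for illustration and are not the API of the Go `digest` package linked above.

```python
import hashlib


def compute_digest(data: bytes) -> str:
    """Return a digest string of the form 'sha256:<hex>' for the given bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Recompute the digest locally and compare it with the expected one."""
    return compute_digest(data) == expected


layer = b"example layer bytes"
layer_digest = compute_digest(layer)
print(layer_digest)
print(verify(layer, layer_digest), verify(layer + b"!", layer_digest))
```

Because the ID is derived from the bytes alone, two parties who agree on the algorithm arrive at the same ID without talking to each other.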
Docker Registry API V2
• Describes the components of an image in a single object
– Layers can be fetched immediately, in parallel
Manifests
[Diagram: four layers described by a single JSON object; Fetch(ID)]
Docker Registry API V2
• Content-addressable, as well
– docker pull ubuntu@sha256:8126991394342c2775a9ba4a843869112da8156037451fc424454db43c25d8b0
– The above command will pull the exact same image that I have on my laptop
• Leverages a Merkle DAG
– Because the digests of the layers are in the manifest, if any bit in a layer changes, the digest of the manifest changes
– Similar to git, ipfs, camlistore and a host of other projects
• Tags are in the manifest
– This is going away
Manifests
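The Merkle property can be sketched as follows. This is an illustration under assumptions: the manifest here is a simplified dict with `name`, `tag`, and `fsLayers`/`blobSum` fields, and `json.dumps(..., sort_keys=True)` is only a convenient stable serialization for the sketch, not how the registry canonicalizes manifests.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Content address in 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def manifest_digest(name, tag, layers):
    """Build a simplified manifest listing each layer's digest, then hash it."""
    manifest = {
        "name": name,
        "tag": tag,
        "fsLayers": [{"blobSum": digest(layer)} for layer in layers],
    }
    # sort_keys gives a stable byte encoding for this sketch only
    encoded = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return digest(encoded)


original = manifest_digest("ubuntu", "latest", [b"layer-a", b"layer-b"])
tampered = manifest_digest("ubuntu", "latest", [b"layer-a", b"layer-B"])
print(original != tampered)
```

Changing a single byte in one layer changes that layer's digest, which changes the manifest bytes, which changes the manifest digest.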
Docker Registry API V2
Manifests
{
  "name": <name>,
  "tag": <tag>,
  "fsLayers": [
    {
      "blobSum": <digest>
    },
    ...
  ],
  "history": [<v1 image json>, ... ]
}
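As a sketch of how a client might use this object: once the manifest is parsed, every layer digest is known up front, so the blobs can be requested concurrently. The sample manifest, the shortened placeholder digests such as `sha256:aaaa`, and the in-memory `BLOBS` store are all assumptions for illustration; a real client would issue HTTP requests instead.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical manifest with shortened placeholder digests.
MANIFEST_JSON = """
{
  "name": "stevvooe/myapp",
  "tag": "latest",
  "fsLayers": [
    {"blobSum": "sha256:aaaa"},
    {"blobSum": "sha256:bbbb"},
    {"blobSum": "sha256:aaaa"}
  ],
  "history": []
}
"""

# Hypothetical blob store keyed by digest, standing in for the registry.
BLOBS = {"sha256:aaaa": b"layer one", "sha256:bbbb": b"layer two"}


def layer_digests(manifest: dict) -> list:
    """Return the distinct layer digests in manifest order."""
    seen = set()
    ordered = []
    for entry in manifest["fsLayers"]:
        blob_sum = entry["blobSum"]
        if blob_sum not in seen:
            seen.add(blob_sum)
            ordered.append(blob_sum)
    return ordered


def fetch_all(digests, fetch):
    """Fetch every digest concurrently; no fetch waits on another's result."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(zip(digests, pool.map(fetch, digests)))


manifest = json.loads(MANIFEST_JSON)
digests = layer_digests(manifest)
layers = fetch_all(digests, BLOBS.__getitem__)
print(sorted(layers))
```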
Docker Registry API V2
• All content is now part of a named repository
– Image IDs are no longer a secret
– Simplified authorization model
• name + operation (push, pull)
• No round trips required for access checks when using token auth
• Makes implementation simple and more secure
– Clients must “prove” content is available to another repository by providing it
• Opened up namespace to allow more than two components
– No reason to have registry enforce “<user>/<image>”
– API “reversed” to make static layout easier
Repositories
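The "name + operation" model can be sketched as a simple lookup. The `grants` structure below is a hypothetical representation of what a token might carry; the actual token format is not described in these slides and is not modeled here.

```python
def is_allowed(grants: dict, name: str, operation: str) -> bool:
    """Check a (repository name, operation) pair against the granted set.

    The check uses only data already in hand, so no extra round trip
    to another service is needed.
    """
    return operation in grants.get(name, set())


# Hypothetical grants: repository name -> allowed operations.
grants = {
    "stevvooe/myapp": {"pull", "push"},
    "ubuntu": {"pull"},
}

print(is_allowed(grants, "stevvooe/myapp", "push"))
print(is_allowed(grants, "ubuntu", "push"))
print(is_allowed(grants, "redis", "pull"))
```

Since every blob lives under a named repository, the same check covers manifests and layers alike.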
Docker Registry API V2
• Shared-nothing
– “Backend” ties a cluster of registries together
– Allows scaling by adding instances
– Performance limited by backend
• Make backend faster, registry gets faster
• Pull-optimized
– Most important factor when distributing software
– May hurt certain use cases
• Resumable Pull and Push (specified but not implemented)
– Resumable pull already available with HTTP Range requests
– Two-step upload start for resumable push
– Built into the protocol for future support
• A living specification
– Meant to be used and modified
– Always backwards compatible
Overview
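Resumable pull with HTTP Range requests can be sketched like this. `resume_range_header` builds a standard open-ended `Range: bytes=N-` header; `apply_range` is a hypothetical stand-in for a server that honors it, used here so the example runs without a network.

```python
def resume_range_header(bytes_received: int) -> dict:
    """Build the header asking for everything after the bytes already held."""
    if bytes_received < 0:
        raise ValueError("bytes_received must be non-negative")
    return {"Range": f"bytes={bytes_received}-"}


def apply_range(blob: bytes, header: dict) -> bytes:
    """Simulate a server honoring an open-ended 'bytes=N-' range."""
    start = int(header["Range"][len("bytes="):-1])
    return blob[start:]


blob = b"0123456789"
partial = blob[:4]  # pretend the first download was interrupted here
rest = apply_range(blob, resume_range_header(len(partial)))
print(partial + rest == blob)
```

Because blobs are content-addressed, the client can check the digest of the reassembled bytes after resuming.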
Registry API V2 URL Layout
Methods                   URL
GET                       /v2/
GET                       /v2/<name>/tags/list
GET, PUT, DELETE          /v2/<name>/manifests/<reference>
GET                       /v2/<name>/blobs/<digest>
POST                      /v2/<name>/blobs/uploads/
GET, PUT, PATCH, DELETE   /v2/<name>/blobs/uploads/<uuid>
https://docs.docker.com/registry/spec/api/
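The layout above maps directly onto a few URL-building helpers. This is a sketch: the function names and the host `registry.example.com` are made up for illustration, and only the path shapes come from the table.

```python
BASE = "https://registry.example.com"  # hypothetical registry host


def tags_url(base: str, name: str) -> str:
    return f"{base}/v2/{name}/tags/list"


def manifest_url(base: str, name: str, reference: str) -> str:
    return f"{base}/v2/{name}/manifests/{reference}"


def blob_url(base: str, name: str, digest: str) -> str:
    return f"{base}/v2/{name}/blobs/{digest}"


def upload_start_url(base: str, name: str) -> str:
    return f"{base}/v2/{name}/blobs/uploads/"


print(manifest_url(BASE, "stevvooe/myapp", "latest"))
print(blob_url(BASE, "stevvooe/myapp", "sha256:abc"))
```

Note that the repository name comes first and the resource type second, which is the "reversed" layout mentioned on the Repositories slide.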
Docker Registry API V2
• Content addresses (digests) are primary identifier
• Unrolled image description model
• Multi-step upload
– Provides flexibility in failure modes
– Options for future alternative upload location (redirects)
• No Search API
– In V1, this API does everything
– Replacing with something better
• No explicit tagging API
– This will change: https://github.com/docker/distribution/pull/173
Differences from V1
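The multi-step upload can be modeled in memory as start, append, and complete steps. This is a sketch loosely based on the POST, PATCH, and PUT entries in the URL layout; the class `UploadSessions`, its method names, and the choice to verify the digest at completion are assumptions for illustration, not the registry's implementation.

```python
import hashlib
import uuid


class UploadSessions:
    """In-memory model of a start / append / complete upload flow."""

    def __init__(self):
        self.pending = {}  # upload id -> bytes received so far
        self.blobs = {}    # digest -> committed content

    def start(self) -> str:
        """Open a session and hand back its identifier."""
        upload_id = str(uuid.uuid4())
        self.pending[upload_id] = b""
        return upload_id

    def append(self, upload_id: str, chunk: bytes) -> int:
        """Add a chunk and return the offset a client could resume from."""
        self.pending[upload_id] += chunk
        return len(self.pending[upload_id])

    def complete(self, upload_id: str, expected_digest: str) -> bool:
        """Commit the blob only if its digest matches what the client claims."""
        data = self.pending[upload_id]
        actual = "sha256:" + hashlib.sha256(data).hexdigest()
        if actual != expected_digest:
            return False  # keep the session so the client can retry
        self.blobs[actual] = self.pending.pop(upload_id)
        return True


sessions = UploadSessions()
upload_id = sessions.start()
sessions.append(upload_id, b"hello ")
offset = sessions.append(upload_id, b"world")
wanted = "sha256:" + hashlib.sha256(b"hello world").hexdigest()
print(offset, sessions.complete(upload_id, wanted))
```

Splitting the upload into steps means a failure partway through loses only the current chunk, not the whole transfer.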
Docker Registry 2.0
Implementation
Docker Registry 2.0
• Registry 2.0 released with Docker 1.6
– Mostly a success
– https://github.com/docker/distribution
• Running the Hub
– S3 backend
• Having some trouble with round trips to S3 :(
– Decent performance with very little caching
• A lot of low hanging fruit left to tackle
Status
Docker Registry 2.0
• Full support released with Docker 1.6
– Minimal bugs
– Most problems are common to version upgrades
• Header required to declare support for 2.0 API
• Validated most concepts in 1.3, 1.4 with V2 preview
– Much faster pull performance
– You’ve probably already used it
• There are some edge cases
– push-heavy workflows
– disk I/O when verifying large images
– We are mitigating these
Does it work?
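A sketch of checking for the version header mentioned above. The header name `Docker-Distribution-API-Version` and the value `registry/2.0` reflect my understanding of the V2 API specification and are not stated in these slides, so treat them as assumptions to verify against the spec. `supports_v2` is a hypothetical helper.

```python
# Assumed header name and value; verify against the V2 API specification.
API_VERSION_HEADER = "Docker-Distribution-API-Version"
V2_VALUE = "registry/2.0"


def supports_v2(headers: dict) -> bool:
    """Return True if the response headers declare V2 API support."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    value = normalized.get(API_VERSION_HEADER.lower(), "")
    return V2_VALUE in value.split()


print(supports_v2({"docker-distribution-api-version": "registry/2.0"}))
print(supports_v2({"Content-Type": "application/json"}))
```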
Docker Registry 2.0
• Are you on Docker 1.6+?
– Yes.
• Evaluate it
• Test it
• Break it (and file bugs https://github.com/docker/distribution/issues)
• Deploy it
• Are you on Docker <1.6?
– Are you entrenched in v1?
• Perhaps, hold off
– Run dual stack v1, v2
• https://docs.docker.com/registry/deploying/#configure-nginx-with-a-v1-and-v2-registry
• Not recommended to auto-port images between v1 and v2
Should you be using it?
Docker Registry 2.0
• Internal deployments
– Use the filesystem driver — it is really fast
– Backup with rsync
• Scale storage
– Use S3 driver
• Make sure you are “close” since round trip times can have an effect
• Scale Reads
– Use round robin DNS
• Do not use this for HA
– Rsync to followers on read-only filesystem
– Add machines to taste
• https://docs.docker.com/registry/deploying/
Deploying
Docker Registry 2.0
• Feature parity with V1
– Maturity
– Building collective operational knowledge
• Hard to break some bad practices from v1
• Proxy Caching
• Catalog API
– What’s in my registry?
• Deletes
– Diverse backend support makes this hard
– https://github.com/docker/distribution/issues/461
– https://github.com/docker/distribution/issues/462
• Search
– See the goals of Distribution to see why this is interesting
Future
Docker Distribution
A project to improve packing, shipping, storing, and delivering content
Docker Distribution
• Goals
– Improve the state of image distribution in Docker
• Focus
– Security
– Reliability
– Performance
• Build a solid and secure foundation
• Unlock new distribution models
– Moving images around no longer requires a registry
– Peer to Peer for large deployments
Overview
Docker Distribution
• Clean up the docker daemon code base
– Define new APIs for working with docker content
– Increase feature velocity
– Generalize around strong base
• Current Manifest format is provisional
– Still includes v1 layer JSON
– Content-addressability + mediatypes make supporting new formats trivial
– https://github.com/docker/distribution/pull/62
• Road Map: https://github.com/docker/distribution/wiki
Future
Docker Distribution
Google Group: distribution@dockerproject.org
GitHub: https://github.com/docker/distribution
IRC on Freenode: #docker-distribution
Join us at
DockerCon
Save 10% on registration with code: containyourself
June 22-23, 2015 in
San Francisco, CA
Register now:
dockercon.com
@dockercon | dockercon.com
Q&A
THANK YOU


Editor's Notes

  1. Please ask questions in chat; I’ll address them at the end.
  2. “Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications.” Key point: one can build applications, these applications can be packed into images, and others can run those images.
  3. This is a harder definition. Moving these around is the main problem we are working on.
  4. How have we moved these bits around in the past?
  5. Doesn’t represent the runnable components. If you fail to push a layer, the ID will be assigned by that registry and represents the partial data.
  6. Image names are restricted to two components. Image IDs have minimal access control at a URL level.
  7. Don’t worry; I love Python.
  8. “Using less commonly used root term of message digest.”
  9. Doesn’t represent the runnable components. If you fail to push a layer, the ID will be assigned by that registry and represents the partial data.
  10. Sorry for the eye chart. Resumable push: more important to build into the protocol than to implement right away. Refactoring the client code in the daemon to make this easier.
  11. Image names are restricted to two components. Image IDs have minimal access control at a URL level.
  12. Come participate.