Proposal: Engine V1 REST API Transition

The transition from REST to gRPC is a big one, but (I think) it is one which we agree (right?) we want to do and which has long term benefits.

Unfortunately it is the sort of big transition for which the activation energy to get started is quite high and where sometimes you just have to decide to “rip the plaster off” and take the short term pain to get on with it.

I appreciate that there is a natural desire to prefer sitting in a bath waiting for the plaster to soak off by itself, but I don’t think this is one of those times. If we choose that path, then my prediction is that we will not make any meaningful progress towards our end goal anytime soon.

One of the worries seems to be that we may end up having to support the transitional/unstable gRPC API. If we communicate and version it appropriately (e.g. 0.x.y in semver speak) then I don’t see any reason why this would be a problem. End users who want stable APIs during this process will still be able to target the stable REST API via the proposed proxy (whose implementation will track the unstable gRPC API internally); they need never know. If they are using client libraries they might never even notice when we eventually switch them over to an underlying stable gRPC layer instead! Users and developers who are interested in and excited by the gRPC APIs will have something to play with (and become motivated to contribute to), and so long as we don’t mislead them regarding the stability of the initial unstable/transitional gRPC API, I see no reason why we would be forced into supporting it past the point at which we have fully replaced it.

But moving even to an internal/unstable gRPC API would still be useful and would lessen the friction and activation energy for people who want to factor bits out and beat them into a stable form. It is much easier to take an unstable gRPC API and iterate towards stability than it is to start from a non-gRPC unstable API and produce a stable gRPC one out of whole cloth. Further to that, our experiences with the unstable gRPC API will inform us greatly about the needs and requirements of the eventual stable gRPC API, so we are more likely to get it right (or more right) than if we just make up a gRPC interface and call it stable. In the same vein, the tooling we build to help us with the unstable gRPC API will be just as useful for the eventual stable ones.

As far as the “more work” argument goes, I agree: it is certainly going to be more work than the zero work towards our goal which I believe will happen otherwise (and I’m not trying to be flippant here).

One of the worries seems to be that we may end up having to support the transitional/unstable gRPC API

Each component will of course have to implement its own API, which will incur breaking changes during development, and that’s fine.
The worry is introducing a copy of the existing API in gRPC form, since this isn’t fixing or improving anything.
It is also my understanding that having a “the one API to bind them all” is an explicit non-goal of Moby.

The transition from REST to gRPC is a big one, but (I think) it is one which we agree (right?) we want to do and which has long term benefits.

We have to be careful here. First, yes, I think gRPC is a good thing and we should use it… however just switching to gRPC doesn’t fix anything at all. There are lots of ugly things in the Docker API that will be just as ugly in gRPC form (or in some cases not even possible).

it is certainly going to be more work than the zero work towards our goal which I believe will happen otherwise

My point is replicating the Docker API in gRPC is not work towards our goal and is likely to be actively harmful to forward progress.

To clarify, I’m majorly +1 to Daniel’s (main) proposed solution, and we need to focus on component APIs here.

however just switching to gRPC doesn’t fix anything at all

Switching to the transitional/unstable gRPC API is not supposed to; it’s supposed to ease the path towards creating good gRPC interfaces in the future, which I made several cases for in my first post (making a start reduces the activation energy and gives people a concrete place to contribute, it lets us iterate on the bad gRPC interfaces until they are good, it gives us real-world experience with gRPC in this context, etc.).

My point is replicating the Docker API in gRPC is not work towards our goal and is likely to be actively harmful to forward progress.

I think the opposite on both points.

In our case, that unstable API will be huge and full of cruft. I’d argue that it would be harder to make it stable and sane from that state than it would be to design the API from scratch.

IMHO, the best path would be to decide which component to tear out of the monolith first, start to do so while designing its API (in an unstable form) and use this as the base to iterate in the way you propose.

I think it could be a decent middle ground and would have the advantage of forcing us to work on both the API transition and the componentization of the monolith at the same time.

Completely agree on this one. We need to start working on a component, keeping the current “monolith” REST API intact, learn from it, and slowly but surely it will become clearer what to do with the V1 REST API.

I think you are missing the main point. The goal is not to transition from REST to gRPC. The goal is to fix the architecture of what was formerly known as the Docker Engine so that it’s a set of components that work together instead of a monolith.

A side-effect of that goal is that the old V1 REST API doesn’t make sense anymore. It assumes a specific set of components are always included, and its implementation would be nothing more than a proxy to “whatever API those components happen to provide”. It is probably safe to assume those components will provide a gRPC API because that seems to be the general consensus right now.

Since each component would be required to provide its own API (for inter-component communication) it seems natural that the client would also be able to use those APIs directly, instead of a centralised API.

So “the new API” is not just a reworking of what we have now. Converting the current V1 API doesn’t get us any closer to the goal of a component architecture.

This is exactly where the “more work” comes in. There are two ways to implement the HTTP proxy:

  1. generate the proxy from the gRPC definition
  2. manually write the proxy using gRPC calls

With option 1 we have gained nothing. We still can’t make changes to the gRPC API because we must retain the same types and behaviour to generate a correct HTTP API. The gRPC API is required to be just as stable as the old REST API. This doesn’t actually give us any way to iterate on the gRPC API.
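
For concreteness, option 1 would look something like the sketch below, assuming a grpc-gateway-style generator (the thread doesn’t name a tool; the pb package and the Register call are hypothetical generated artifacts, not real Moby code). The HTTP surface is emitted directly from the gRPC definition, which is exactly why the two cannot diverge:

    package main

    import (
        "context"
        "log"
        "net/http"

        "github.com/grpc-ecosystem/grpc-gateway/v2/runtime"
        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"

        pb "example.com/moby/api/gen" // hypothetical generated package
    )

    func main() {
        // The mux and the Register* function are generated straight from
        // the .proto definition, so every REST route, type, and field is
        // locked to the gRPC API: changing the gRPC side breaks the
        // generated HTTP side.
        mux := runtime.NewServeMux()
        opts := []grpc.DialOption{grpc.WithTransportCredentials(insecure.NewCredentials())}
        if err := pb.RegisterContainersHandlerFromEndpoint(context.Background(), mux, "localhost:9090", opts); err != nil {
            log.Fatal(err)
        }
        log.Fatal(http.ListenAndServe(":2375", mux))
    }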

With option 2 we’re doing what I proposed (an HTTP proxy which uses gRPC calls), but we’re also doing all the work upfront to convert the entire monolith API to gRPC, instead of doing it incrementally as we split out components. This is twice the work, and it doesn’t get us any closer to creating components. It actually diverts work away from componentization to this API transition.
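
A minimal sketch of option 2, with hypothetical types (ContainersClient, InspectRequest, and the URL shape are illustrative, not the real Moby API): because the REST-to-gRPC mapping is written by hand, the gRPC side is free to change shape while the proxy absorbs the difference and keeps the REST contract stable.

    package main

    import (
        "encoding/json"
        "net/http"
        "strings"

        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"

        pb "example.com/moby/api/containers" // hypothetical gRPC client package
    )

    type proxy struct {
        containers pb.ContainersClient
    }

    // inspect translates GET /v1/containers/{id}/json into a gRPC
    // Containers.Inspect call.
    func (p *proxy) inspect(w http.ResponseWriter, r *http.Request) {
        id := strings.TrimSuffix(strings.TrimPrefix(r.URL.Path, "/v1/containers/"), "/json")
        resp, err := p.containers.Inspect(r.Context(), &pb.InspectRequest{ContainerId: id})
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        // The REST body is assembled by hand, so renaming or restructuring
        // the gRPC types only touches this translation layer.
        json.NewEncoder(w).Encode(map[string]interface{}{
            "Id":    resp.ContainerId,
            "State": resp.State,
        })
    }

    func main() {
        conn, err := grpc.Dial("localhost:9090",
            grpc.WithTransportCredentials(insecure.NewCredentials()))
        if err != nil {
            panic(err)
        }
        p := &proxy{containers: pb.NewContainersClient(conn)}
        http.HandleFunc("/v1/containers/", p.inspect)
        http.ListenAndServe(":2375", nil)
    }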

If we believe that creating new gRPC APIs within the current monolith will help us break things into components we should definitely do so. I think this sounds like a great idea. It will allow us to start replacing some calls with gRPC calls and make incremental progress toward splitting out components. However this work is unrelated to the proposal and this thread, so we should continue that discussion in a new thread.

Maybe this is way off topic, but since you all are doing this: gate the features by API version, not the whole service. The client is very hard to use with previous versions of Docker as it stands now.

This is one of the very rare cases where I would suggest a one-time generation and then treating the generated source as first-class code going forward, i.e. generate the first pass, then just edit the result as we iterate on the gRPC APIs.

I’d expect to generate both the gRPC and the proxy from the Swagger definitions as the first pass, except I’m not sure how complete the Swagger stuff is.

Again, to what end?
It is my understanding there is not going to be a single big API, but rather clients talk to the component they need access to.
The existing HTTP API is kept for compat, but other than that you are talking to components.

Why do we want to spend time on copying a broken design to a new platform? This sounds like “we heard gRPC is cool and makes everything better, so let’s use that”.

Build a good component with a well designed interface.
Then we can deal with adapting the HTTP API to talk to the new backend component.
What is the reason to throw in another layer there?

One of the worries seems to be that we may end up having to support the transitional/unstable gRPC API

Each component will of course have to implement its own API, which will incur breaking changes during development, and that’s fine. The worry is introducing a copy of the existing API in gRPC form, since this isn’t fixing or improving anything.

I listed several ways in which it will improve things. Please address them.

It is also my understanding that having a “the one API to bind them all” is an explicit non-goal of Moby.

That’s correct. What’s your point?

The transition from REST to gRPC is a big one, but (I think) it is one which we agree (right?) we want to do and which has long term benefits.

We have to be careful here. First, yes, I think gRPC is a good thing and we should use it… however just switching to gRPC doesn’t fix anything at all. There are lots of ugly things in the Docker API that will be just as ugly in gRPC form (or in some cases not even possible).

Again, please address the actual benefits that were listed earlier in the thread.

it is certainly going to be more work than the zero work towards our goal which I believe will happen otherwise

My point is replicating the Docker API in gRPC is not work towards our goal and is likely to be actively harmful to forward progress.

In what way?

I did address these benefits above and detailed my concerns.
Clearly my concerns stem from a severe disconnect between the intentions/goals of the proposal and what I perceived it to be, so I’ll withdraw them.

Discussed this with @shykes and @dnephin on Slack and I now realize the goal here is to provide the engine itself as a (really big) “component”, which will require a gRPC interface, while the engine is being replaced by other smaller components.
:thumbsup:

(also feel free to correct me if I’ve said something incorrect above)

There was some discussion about this in Slack last week. @mlaventure suggested it would be good to update this thread with a summary of that discussion.

First, this proposal assumed that the primary goal was to split up the monolith. That is a goal, but it sounds like it’s not necessarily the primary focus. The primary focus could be to bring in new components alongside the monolith, where the monolith acts as a single large component until it can be split up.

With this alternative focus, the option of porting the entire V1 Engine API to gRPC would make the monolith look like just another component.

This raised the question of “what is a Moby component”. The current working definition is that it’s a binary that exposes a gRPC interface. Some components will be “init” components which are run outside of a container (e.g. containerd), and others are “service” components which are run as containerd containers.

So how does this proposal change given the new focus?

We need to investigate:

  1. How long it would take to port the entire engine API to gRPC (considering we cannot freeze it until it is completely ported). My estimate is that it’s still a multi-month project.
  2. Once we have a gRPC API definition, are we able to generate a Go HTTP proxy that perfectly mirrors the existing V1 REST API? Many things like hijacking for log streams, build output, and attach are difficult to represent (see the streaming sketch below).
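
For reference, the gRPC-native shape of one of those hijacked endpoints would be a streaming RPC, roughly as in this sketch (the generated types and the tailContainer helper are hypothetical, not real Moby code). The open question in point 2 is whether a generated HTTP proxy can faithfully map the V1 connection-hijack semantics onto a stream like this.

    package logs

    import (
        pb "example.com/moby/api/logs" // hypothetical generated package
    )

    type server struct {
        pb.UnimplementedLogsServer
    }

    // Logs replaces the V1 HTTP connection hijack with a server-streaming
    // RPC: lines are pushed until the container stops or the client
    // cancels the call.
    func (s *server) Logs(req *pb.LogsRequest, stream pb.Logs_LogsServer) error {
        for line := range tailContainer(req.ContainerId) {
            if err := stream.Send(&pb.LogLine{Message: line}); err != nil {
                return err // client went away
            }
        }
        return nil
    }

    // tailContainer is a stand-in for the real log source.
    func tailContainer(id string) <-chan string {
        ch := make(chan string)
        close(ch)
        return ch
    }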

If we’re able to represent the V1 REST API using a gRPC definition, then we could replace the current API with this generated code, but that still leaves open the question of how users will migrate. The immediate monolith gRPC interface would not be the final interface; it would eventually be replaced by the individual smaller component interfaces.

So the actual migration options would be:

  1. The original proposal (but delayed until after we migrate the engine API to gRPC)
  2. Migrate first from REST V1 -> monolith gRPC, then from monolith gRPC -> component gRPC

I doubt anyone will be interested in migrating twice.

Given the discussion and change of focus, I don’t think this proposal needs to change. The only question is whether the priority should be to make the monolith appear like other components first, or to split it up first. Either way, users’ options for transitioning to a new API remain about the same.

Thanks for the summary @dnephin

From it, I’d say that the next step, if we want to move along quickly, would be to do a quick PoC with the more challenging endpoints (i.e. the ones you mentioned would be difficult to represent) and see if there are any blockers there.

Anyone willing to tackle this? :innocent:

Regarding the double migration, I also think most people won’t do the intermediate migration. For those who will, we will have to make sure to advertise that it’s a temporary stopgap. In a way, it would force us not to get complacent when it comes to creating those new components and their APIs.

I have one question left unanswered though. At the moment, what would be another component if we exclude the engine?

There was some discussion about this in Slack last week. @mlaventure suggested it would be good to update this thread with a summary of that discussion.

Thanks for this! Just a couple of questions below…

With this alternative focus, the option of porting the entire V1 Engine API to gRPC would make the monolith look like just another component.

Can you elaborate on this? By “just another component”, do you mean that the monolith would be a sibling to the other (new) components that are being created as we split things? If so, I view it slightly differently. I view the new components as children of the monolith, where the monolith will shrink over time and eventually get to the point where it is nothing but the integration point for all of the components to be brought together under a single gRPC endpoint/proxy. This allows clients to talk to a single endpoint, but based on some routing info (e.g. the path of the URL) each request will be routed to the appropriate sub-component.

This also means that as things are split out, the monolith would have the mapping from the existing REST API to the gRPC API of the new component. So the monolith could support both (old/REST and new/gRPC) calls at the same time.

We need to investigate: How long it would take to port the entire engine API to gRPC (considering we cannot freeze it until it is completely ported). My estimate is that it’s still a multi-month project.

I would prefer to think of this as “how long will it take to split the monolith” rather than to phrase it as a statement about APIs. The API part of it should just happen automatically.

Once we have a gRPC API definition, are we able to generate a Go HTTP proxy that perfectly mirrors the existing V1 REST API? Many things like hijacking for log streams, build output, and attach are difficult to represent. If we’re able to represent the V1 REST API using a gRPC definition, then we could replace the current API with this generated code, but that still leaves open the question of how users will migrate. The immediate monolith gRPC interface would not be the final interface; it would eventually be replaced by the individual smaller component interfaces.

My lack of knowledge of gRPC will show here, but… why is there a migration between gRPC interfaces? It seems to me that even after we split things up we’ll still want a single endpoint that the gRPC clients will talk to, no? As an admin I’d want to only secure one endpoint instead of lots (securing, ACLs, …). And even if each component did expose its APIs to clients directly, I would think the migration would be more of a URL difference than a real API/body difference, no?

-Doug

That is a different question. My question was not about splitting components. It was addressing the alternative proposal by Solomon to port the entire V1 API to gRPC without splitting out any components.

No, I don’t think we want any centralized API router. That would require it to be aware of all the components.

It was a response to the alternative proposal where we would first port the entire V1 API to gRPC, then later split out components. The migration would be from V1 gRPC to V2 APIs provided by components.

The difference should be a lot more significant than just different paths. If all we’re doing is moving API endpoints around we gain nothing. We need to redesign the API from the ground up to fix the problems with V1.

No, it would be a sibling to “the other components that make up Docker that are not the engine”. On Linux the only other “component” is containerd (I think), but on other platforms I’m told there are other components.

It seems to me that even after we split things up we’ll still want a single endpoint that the gRPC clients will talk to, no?
No, I don’t think we want any centralized API router. That would require it to be aware of all the components.

It depends on what “aware” means :slight_smile: Hard-coded? No. Each component would just need to be registered with it at runtime. Otherwise I think you’d be asking for a maintenance nightmare, asking an admin to stand up and secure each component independently on the network/internet.

thanks
-Doug

Doug, gRPC allows multiplexing multiple services onto a single transport. So sysadmins keep the benefits of a single address (only configure and secure once), but we get the benefit of cleanly separated components without maintaining a brittle hand-crafted aggregation frontend. The aggregation is done automatically by gRPC.

After taking some time to cool off and reconsider the whole situation, here are my thoughts:

  • I am still a big believer in reaching for an all-gRPC core engine, with an optional convenience REST compat layer on top.

  • However the reality is that, until the situation of the engine and its V1 API is clarified, repositories are in their final place, and the respective roles of Engine, Moby, LinuxKit etc. are better documented and better understood by the contributor community… it’s difficult to embark on an ambitious new project like switching to an all-gRPC architecture, brand new client-generation tools, etc.

  • Therefore I feel like we are better off following Daniel’s recommendation and moving on. Let’s reach a point of clarity and stability for the current code, then we can worry about writing brand new code.

  • I also think it’s too early to talk about deprecation. Based on Doug’s comments, I suspect many people don’t consider that the V1 API needs deprecation, and before we start that debate I would prefer to close the debates already in front of us. So let’s kick that can down the road a few more months, if that’s OK with you.

  • To answer your question Daniel: I think priority #1 is to start making progress splitting up the engine. Priority #2 is to switch to an all-gRPC architecture with better tooling for client generation and service discovery.

gRPC allows you to configure multiple “gRPC services” within the same process on a single transport. I think it’s important to mention that a “gRPC service” is just a code construct. It’s not a “component”. This isn’t much different from any REST API where you can configure multiple routes at different URLs on the same transport.
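
To illustrate (a minimal sketch with hypothetical generated packages, not real Moby code): two gRPC services registered on one grpc.Server in a single process, sharing one listener, much like mounting two REST routers on one HTTP server.

    package main

    import (
        "log"
        "net"

        "google.golang.org/grpc"

        containerspb "example.com/moby/api/containers" // hypothetical
        imagespb "example.com/moby/api/images"         // hypothetical
    )

    type containersImpl struct{ containerspb.UnimplementedContainersServer }
    type imagesImpl struct{ imagespb.UnimplementedImagesServer }

    func main() {
        lis, err := net.Listen("unix", "/run/moby/engine.sock")
        if err != nil {
            log.Fatal(err)
        }
        s := grpc.NewServer()
        // Both services live in this one process; the grpc.Server routes
        // each incoming RPC by its fully-qualified service name.
        containerspb.RegisterContainersServer(s, &containersImpl{})
        imagespb.RegisterImagesServer(s, &imagesImpl{})
        log.Fatal(s.Serve(lis))
    }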

If you want to run different processes on the same transport you need another proxy process which handles that multiplexing.

Does this match your understanding? Our working definition of “component” has been “separate binary” (so a separate process). This means we would not be running multiple components on the same socket/port. But a proxy can be used to support that design.