Skip to content

Make Docker image Best Practices-Compliant #183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sr229 opened this issue Mar 10, 2019 · 14 comments
Closed

Make Docker image Best Practices-Compliant #183

sr229 opened this issue Mar 10, 2019 · 14 comments

Comments

@sr229
Copy link
Contributor

sr229 commented Mar 10, 2019

Unfortunately, this image does not adhere to Best practices as stated in #65. We should rewrite the Dockerfile to adhere to the following directives:

  • using a user account
  • Optimizing layers (minimizing layers)

I'm afraid such official image has to set an example like the rest of the Docker Library images.

@sr229 sr229 added the enhancement Some improvement that isn't a feature label Mar 10, 2019
@kylecarbs
Copy link
Member

cc: @nhooyr

@nhooyr
Copy link
Contributor

nhooyr commented Mar 10, 2019

There is already an open issue for sudo. Not sure what you mean in terms of minimizing layers, its very minimal already.

@vincentclaes
Copy link

if the dockerfile follows the best practices it can run on open shift cluster. now i cannot run it on open shift because it does not allow to be run as root. that's why a 👍 for me

@vincentclaes
Copy link

vincentclaes commented Mar 10, 2019

@nhooyr what OS are you building your dockerfile? i get error messages when building on osx high sierra

@nhooyr
Copy link
Contributor

nhooyr commented Mar 10, 2019

I'm on Mojave. What error messages do you get?

@nhooyr nhooyr added needs-decision and removed enhancement Some improvement that isn't a feature labels Mar 10, 2019
@sr229
Copy link
Contributor Author

sr229 commented Mar 11, 2019

This is a tracking issue since I'm already implementing it @nhooyr. The other issue was more on debating on the Docker image's compliancy.

@nhooyr
Copy link
Contributor

nhooyr commented Mar 11, 2019

@sr229 I appreciate and admire your initiative but please wait for consensus before implementing things. I don't want to have to reject things after you put your time into them.

Please respond to my question regarding minimizing layers.

Also #65 is the tracking issue for implementing sudo. So unless there is more to minimizing layers, this issue should be closed.

@sr229
Copy link
Contributor Author

sr229 commented Mar 11, 2019

Each RUN and COPY layer is equivalent to 1 layer. Even if we have a small enough image, having many layers does not equal to optimization.

Without minding the build layer. We can reduce the two RUNs into one RUN to be one layer. Currently we have 4 layers, which can be reduced to 3 layers which could result in faster extraction. This is the same concept we're trying to do in Theia/Gitpod @nhooyr

@nhooyr
Copy link
Contributor

nhooyr commented Mar 11, 2019

Without minding the build layer. We can reduce the two RUNs into one RUN to be one layer. Currently we have 4 layers, which can be reduced to 3 layers which could result in faster extraction. This is the same concept we're trying to do in Theia/Gitpod @nhooyr

That will make things harder to read as the two RUN instructions are unrelated. Performance is not all that matters. I also doubt this optimization will be user visible as it's very minor. It's unlikely to get approved by me.

So I'm closing this as wontfix.

@nhooyr nhooyr closed this as completed Mar 11, 2019
@nhooyr nhooyr added the wontfix label Mar 11, 2019
@sr229
Copy link
Contributor Author

sr229 commented Mar 11, 2019

I don't see why it won't be readable. You're closing the issue because point 2 does not match your specific requirements, point 1 still needs to be done to compensate for clusters where compliancy is a must (OpenShift).

This is arguably biased.

@nhooyr
Copy link
Contributor

nhooyr commented Mar 11, 2019

I don't see why it won't be readable. You're closing the issue because point 2 does not match your specific requirements

Because the two instructions are doing two different things. The first one installs runtime deps for code-server and the second one generates locales. By separating the instructions, their distinctness becomes clear. A single instruction would not get this across without comments separating them, which is janky. Furthermore, the optimization is very minor and wouldn't affect users.

point 1 still needs to be done to compensate for clusters where compliancy is a must (OpenShift)

That change is still being discussed and no decision has been made and there is already a tracking issue. #65

Furthermore, you're not blocked on it. Its trivial to extend the Dockerfile and add your own user.

This is arguably biased.

Calling me biased does not help your case. Please stick to technical arguments. I am using my best judgement to help guide the development of code-server.

@kennylevinsen
Copy link

kennylevinsen commented Mar 14, 2019

We can reduce the two RUNs into one RUN to be one layer.

Layers are meant to be a single logical change, allowing reuse on rebuild when that layer is not modified, as well as allowing reuse across different containers that do the same thing. This is not possible if unrelated actions are combined into a single layer. That is why a single Dockerfile doesn't translate to a single layer.

Minimizing layer count is good, but the sole point of layers in Docker is lost if they no longer contain only a single logical change each.

@sr229
Copy link
Contributor Author

sr229 commented Mar 15, 2019

@kennylevinsen While that is true, abusing layers just for this sole reason is why we have too many unoptimized images. Layer counts does not translate to how fast your application can pull and extract, and keeping some changes in one layer versus two is more ideal.

Please do not count layer as one logical change, Layers are counted in the idea there is substantial changes in the FS that will be overlayed. That's not how it works.

@kennylevinsen
Copy link

Layer counts does not translate to how fast your application can pull and extract

Exactly. One 100MB layer does not pull or extract faster than ten 10MB layers. Make an update, and you'll see that the one layer approach needs to download 90MB more than the ten layer approach (which also wastes storage).

and keeping some changes in one layer versus two is more ideal.

Depends on what you mean by changes. One change, sure, don't split single tasks if there is no reason to do so.

Several changes, no. Fewer layers guarantee waste, where more layers give the possibility of avoiding it.

Please do not count layer as one logical change, Layers are counted in the idea there is substantial changes in the FS that will be overlayed. That's not how it works.

That's exactly how it works, and how docker is designed. There's a reason that docker stubbornly creates a layer for each and every statement—because docker wants you to create many, tiny layers.

Attempting to combine many things into a single layer is simply against the design of docker, and optimizes an irrelevant metric (layer count), at the cost of relevant, limited and sometimes even billed metrics (storage and bandwidth consumption).

(Oh, and just to clarify: I do not consider one logical change to be installing one out of hundreds of dependencies, but rather something like "install base dependencies".)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants