Skip to content

Serialize File Content #574

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
rmorshea opened this issue Jan 10, 2022 · 6 comments
Open

Serialize File Content #574

rmorshea opened this issue Jan 10, 2022 · 6 comments
Labels
priority-1-high Should be resolved ASAP.

Comments

@rmorshea
Copy link
Collaborator

rmorshea commented Jan 10, 2022

Right now we don't serialize file content. Just metadata. Along with this we'll want to allow users to configure the max message size to help users protect against large file upload attacks.

See this comment for implementation details.

@rmorshea rmorshea added this to the 1.0 milestone Jan 11, 2022
@rmorshea rmorshea added the priority-2-moderate Should be resolved on a reasonable timeline. label Jan 13, 2022
This was referenced Feb 16, 2022
@Archmonger
Copy link
Contributor

Just had a thought.

We could possibly leave file serialization to HTTP, and handle things on the backend.

For example, adding a HTTP endpoint

# Uploads are given a unique UUID path to route to
path("_idom/upload/{uuid}")

Not really sure how we'd be able to tell the WS side of things when an upload is a complete though... Can't rely on boolean flags due to cross-process compatibility.

@Archmonger
Copy link
Contributor

Archmonger commented Apr 6, 2022

If we do go down this road of serializing via websockets, it's going to be extremely important to stream the file to disk rather than buffer it in memory. Otherwise, this would be a DDOS attack vector.

Ideally, "file stuff" should be handled by individual frameworks... but that design of that might not be technologically feasible.

@Archmonger
Copy link
Contributor

PyZMQ might be a decent option to communicate between separate ASGI processes.

@rmorshea
Copy link
Collaborator Author

The initial implementation should probably use form elements to upload files over HTTP. We can overwrite the default onSubmit behavior for forms to submit without a redirect. Unfortunately though, I don't think we can supply a built-in solution that will allow for inter-process communication - that will have to be supplied on a case-by-case basis depending on the user's deployment.

With that said, I have some thoughts on how this could be accomplished via the websocket in a more natural way, but it would require some pretty deep changes to how we send messages between the client and the server. Right now, there is no real messaging protocol - the server always sends layout updates and the client always sends layout events. We'd need to change this in order to allow the client and server to send different types of messages to each other. For example, all messages between the client and server would probably take the form:

{
  "type": str,  # some unique name
  "version": str,  # allow client and server to sort out compatibility
  ...  # any other data
}

@rmorshea rmorshea modified the milestones: 1.0, 2.0 Dec 30, 2022
@Archmonger Archmonger modified the milestones: Luxury, Essential Jan 29, 2023
@rmorshea rmorshea removed this from the Essential milestone Feb 21, 2023
@Archmonger Archmonger added priority-1-high Should be resolved ASAP. and removed priority-2-moderate Should be resolved on a reasonable timeline. labels Jun 14, 2023
@Archmonger
Copy link
Contributor

Archmonger commented Jun 17, 2023

We are likely going to develop two file upload handlers.

For small files, we stream the file directly to memory.
For large files, we stream the file to disk (buffered in memory).

  1. Client begins sending file contents to the server
  2. Server reassembles the file contents into a buffer
  3. In the background, the server empties that buffer into a temp file onto the disk, in order to not use up all the system RAM

In order to support this without having this become an attack vector, we should implement a few safeguards:

  1. MAX_BUFFER_SIZE: The maximum amount of bytes to buffer within memory before throttling responses
  2. MAX_FILE_SIZE: The maximum size of a file that can be saved to the disk
  3. UPLOAD_DIR: Directory where files get saved
  4. If the transfer ends pre-maturely, the buffer and disk file are deleted
  5. Look at Django's upload handlers and see if there are other security steps that need to be taken
    • django.core.files.uploadhandler.MemoryFileUploadHandler
    • django.core.files.uploadhandler.TemporaryFileUploadHandler

@Archmonger
Copy link
Contributor

We might want to transfer this issue to the reactpy-router repo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority-1-high Should be resolved ASAP.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants