Dispatching fromfile and tofile #490
Comments
@jakirkham thanks for the suggestion. This seems like a scope expansion that's better left alone. From https://data-apis.org/array-api/latest/purpose_and_scope.html: "The following topics are out of scope: I/O, ...". There's a huge number of data formats, no common API for them across libraries, and the semantics of data loaders are very hard to specify. There's another reason why it is fine to have I/O routines be out of scope: you should never need to call I/O routines from within libraries. It's end users that load data (with a specific array library), and then pass that array to library functions.
Yeah, for clarity, I wasn't suggesting we handle formats or encoding/decoding here. I was just suggesting we have a way to load very simple files into memory and represent them as an array of the desired type, or write the array back to disk. Zarr, in particular (given it was mentioned), uses very simple binary files in its representation of array data, so the request as-is would be useful here. CSV, JSON, and HDF5 are probably complicated enough that implementing them in the spec wouldn't make sense. That said, the simple ability to read binary data into an in-memory array could be a useful first step for decoding and encoding these formats.
Might not be high priority at this point? If a library doesn't ditch NumPy completely, they can read both binary and text files using it and then convert using What we are missing to some degree may be the HDF5 seems more like a protocol issue? Let the HDF5 library export the buffer protocol (and/or DLPack) and you already have support (this may well already be the case). Reading (simple?) CSV may be core. But it is not trivial to implement unless you are OK with relying on the builtin
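To make the "read with NumPy, then convert" route concrete, here is a minimal sketch. It assumes the conversion step is the standard's `from_dlpack`; the comment above does not say which function was meant, so that choice is a guess. The helper name `load_binary` is hypothetical, and NumPy stands in as the target namespace only so the snippet runs without other dependencies.

```python
import os
import tempfile

import numpy as np


def load_binary(path, dtype, xp):
    """Read raw binary data with NumPy, then hand it to the namespace `xp`.

    `xp` is assumed to provide the standard's `from_dlpack`.
    """
    host = np.fromfile(path, dtype=dtype)  # plain 1-D host array
    return xp.from_dlpack(host)            # zero-copy where the target allows it


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "values.bin")
    np.arange(4, dtype=np.float32).tofile(path)

    # NumPy is used as the consumer here purely for illustration.
    x = load_binary(path, np.float32, xp=np)
```

In practice `xp` would be whatever array library the end user works with, provided it can import a DLPack capsule from host memory.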
It is worth noting
If we are to design a new API, I would tend to remove the double meaning from There are competing interests, I think. For this use-case, it would be nice to formalize a clean API somewhere that may not match NumPy 100% (even if end-user orientated maybe).
Maybe we could narrow the focus to binary files and use something like If we wanted text, that could be another API like
Agreed. Forgot to mention above
+1 I would certainly consider adding something like
Based on our discussion, for simple binary loading users may be better off implementing a file object (maybe
Based on the above discussion, we're not likely to move forward with
A common, simple case that NumPy provides is the ability to read & write a pure binary file. Even some simple file formats are little more than a header (and/or a footer) with binary array data included. Having functionality to load & save arrays generally as binary files would be quite helpful for getting users started in these cases.
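As an illustration of the case described here, the following sketch round-trips an array through a file made of a fixed-size header followed by raw binary data, using NumPy's existing `ndarray.tofile` and `numpy.fromfile`. The 16-byte header layout and the `DEMO` marker are made up for the example and do not correspond to any real format.

```python
import os
import tempfile

import numpy as np

# Hypothetical fixed-size header: a 4-byte marker padded to 16 bytes.
HEADER = b"DEMO".ljust(16, b"\0")

data = np.arange(6, dtype=np.float64)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "array.bin")

    # Write the header, then the array's raw bytes (C order, native byte order).
    with open(path, "wb") as f:
        f.write(HEADER)
        data.tofile(f)

    # Skip the header with `offset` and interpret the remainder as float64.
    loaded = np.fromfile(path, dtype=np.float64, offset=len(HEADER))

roundtrip_ok = bool(np.array_equal(loaded, data))
```

Note that the file itself stores neither dtype, shape, nor byte order, so the reader has to know them in advance or recover them from the header.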