POC of PDEP-9 (I/O plugins) #53005
Closed
Commits (9)
c0d0115  datapythonista  POC of PDEP-9 (I/O plugins)
91da43a  datapythonista  Implementing the POC with a pyarrow fallback as the connector protocol
67a69a9  datapythonista  Black+isort
2439ed9  datapythonista  Adding docstring and black
2b0e13f  datapythonista  Minor fixes to __init__.py
000ea21  datapythonista  raising if to_ method exists
59b0c3a  datapythonista  Use the dataframe interchange protocol instead
b511fe4  datapythonista  Merge remote-tracking branch 'upstream/main' into pdep9_impl
51f7588  datapythonista  Warning on conflict
check that dataframe_io_entry_point.name isn't overwriting something?
Yes, good point. Some other checks would probably be useful too. So far my goal was mostly to show what PDEP-9 could imply in terms of code, as I don't understand all the opposition to what is, in my opinion, a small change with huge benefits. So I guess an MVP implementation can help us understand what the implications of the PDEP are. But I fully agree we should raise if two installed packages use the same entrypoint name; iirc it's mentioned in the PDEP.
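To make the check discussed here concrete, below is a minimal sketch of a loader that refuses to overwrite an existing reader and raises when two plugins share a name. It is not the code from this PR: `register_io_plugins`, the `read_<name>` naming, and the assumption that each entry point loads an object with an optional `reader` attribute are all hypothetical and used only for illustration. The plugins are stand-ins shaped like `importlib.metadata.EntryPoint` (a `.name` and a `.load()` method).

```python
from types import SimpleNamespace


def register_io_plugins(namespace, plugins):
    """Attach read_<name> functions to `namespace` without overwriting anything.

    `plugins` is any iterable of objects with a `.name` attribute and a
    `.load()` method (the shape of importlib.metadata.EntryPoint).
    The `reader` attribute on the loaded object is an assumed convention.
    """
    seen_names = set()
    registered = []
    for plugin in plugins:
        # Two installed packages claiming the same entry point name.
        if plugin.name in seen_names:
            raise RuntimeError(f"duplicate I/O plugin name: {plugin.name!r}")
        seen_names.add(plugin.name)

        func_name = f"read_{plugin.name}"
        # A plugin must not shadow something that already exists.
        if hasattr(namespace, func_name):
            raise RuntimeError(
                f"I/O plugin {plugin.name!r} would overwrite {func_name!r}"
            )

        connector = plugin.load()
        reader = getattr(connector, "reader", None)
        if reader is not None:
            setattr(namespace, func_name, reader)
            registered.append(func_name)
    return registered


# Demo with stand-in objects instead of real installed packages.
demo_ns = SimpleNamespace(read_csv=lambda path: "built-in")
demo_plugin = SimpleNamespace(
    name="sas",
    load=lambda: SimpleNamespace(reader=lambda path: f"sas:{path}"),
)
register_io_plugins(demo_ns, [demo_plugin])
```

With real packages, `plugins` would come from `importlib.metadata.entry_points(group=...)` (keyword selection needs Python 3.10+), using the group name the PDEP defines. Whether a conflict should raise or only warn is a design choice; the last commit in this PR is titled "Warning on conflict".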
nice!
to be clear, my only opposition to pdep9 was in renaming the hugely established pd.read_csv (which is also the most visited page in the docs, according to the analytics you sent me)
adding an optional plugin system like this which allows third-party authors to develop readers/writers sounds like a net positive
Thanks for clarifying your position @MarcoGorelli.
I think that was great feedback. While it'd be nice to have a better I/O API IMHO, it's probably not worth the change, and in any case it can be discussed separately, as it's independent and adds noise to the discussion about plugins.
I was concerned about adding too much stuff to the already huge pandas namespaces, but on second thought, if we eventually move connectors like SAS or BigQuery to third-party projects, most users will probably end up having fewer connectors than now, not more.