DOC: pd.DataFrame(dtype) arg cannot be list, dict, Series. And None will infer wider type than necessary. #14764
Comments
Repro for the error case:

    df = pd.DataFrame(columns=['year', 'month'],
                      dtype={'year': 'int32', 'month': 'int8'},
                      index=range(10), data=-1)

I could see an argument for supporting this, consistent with #13375.
To be clear, I'm only asking for the documentation to reflect what currently happens. It's pretty confusing. We can live with fixing up the dtypes after the df is declared; it would be good to document that workaround.
This is a (partial) duplicate of #4464 (which is the actual implementation issue). No objections to clarifying the docstring.
I would like to do this, but I am not exactly sure what to do. Should I copy-paste the example given by @smcinerney into the docstring of the DataFrame class in pandas/core/frame.py?
@patniharshit Well, the idea is to create a nice example/explanation in the docstring, so you can certainly start with that.
Looks like this issue got dropped. Happy to see if I can take a stab at this now. If anyone is at the PyCon sprints, happy to pair!
OK, added an example in the docstring. Would welcome feedback about its usefulness!
…as-dev#16487) * Adding some more documentation on dataframe with regards to dtype * Making example for creating dataframe from np matrix easier
Commenting here on this issue to note that it's still a problem. Maybe it will be fixed in the next release? Thanks:
Code Sample, a copy-pastable example if possible
Problem description
The DataFrame() docs don't explicitly say that a list/dict/Series/array-like is not allowed for dtype (and if you pass one in, the error is not very friendly). The constructor also behaves differently from read_csv(dtype), which does accept a dict mapping column names to dtypes.
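To make the contrast concrete, here is a minimal sketch that passes the same per-column mapping to read_csv and to the DataFrame constructor. The CSV text and the variable names are made up for illustration, and the exact exception type and message raised by the constructor depend on the installed pandas and NumPy versions, so the sketch catches a generic Exception rather than assuming one.

```python
import io

import pandas as pd

# Hypothetical per-column dtype mapping, mirroring the repro in this issue.
per_column = {"year": "int32", "month": "int8"}

# read_csv accepts a dict of column name -> dtype.
csv_text = "year,month\n2016,11\n2017,1\n"  # illustrative data only
from_csv = pd.read_csv(io.StringIO(csv_text), dtype=per_column)
print(from_csv.dtypes)

# The DataFrame constructor only takes a single dtype; the same dict is rejected.
constructor_error = None
try:
    pd.DataFrame(-1, index=range(10), columns=["year", "month"], dtype=per_column)
except Exception as exc:  # exception type varies across versions (assumption)
    constructor_error = exc

print(type(constructor_error).__name__, constructor_error)
```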
Leaving dtype=None in the constructor will infer a wider type than necessary.
So in general you either set dtype to the single widest type you need, or leave dtype=None and then fix up the column dtypes manually after declaration by casting with astype().
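A minimal sketch of the two options described above, reusing the column names from the repro. The variable names are illustrative; the inferred default dtype is printed rather than asserted because it can vary by platform and NumPy version.

```python
import pandas as pd

# Option 1: a single dtype applied to every column.
uniform = pd.DataFrame(-1, index=range(10), columns=["year", "month"], dtype="int32")
print(uniform.dtypes)

# Option 2: let pandas infer the dtype (dtype=None), then narrow each column.
inferred = pd.DataFrame(-1, index=range(10), columns=["year", "month"])
print(inferred.dtypes)  # typically a wider integer type than needed

# astype accepts a dict of column name -> dtype and returns a new DataFrame.
fixed = inferred.astype({"year": "int32", "month": "int8"})
print(fixed.dtypes)
```

Note that astype() returns a new object, so the result has to be assigned back (or to a new name) for the narrower dtypes to take effect.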
Expected Output
Output of pd.show_versions()