I almost always set index=False when calling to_csv, and I suspect the same is true for most people.
Although changing the default to False is not possible for backward-compatibility reasons, would it be possible to make non-informative indices (like RangeIndex) ignored by default?
I don't imagine most people want a range as their first column in the output.
Feature Description
The default value of index in to_csv could be set to ignore_range, which would trigger this behaviour.
Default indices (which come back with the column name "Unnamed: 0" on load) are not only redundant and non-informative but also tend to cause bugs. The behavior is questionable in principle: why is some random column with a weird name appearing in my saved file when I did not ask for it? to_csv followed by read_csv should be an identity operation by default, so why is it not?
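To make the round-trip complaint concrete, here is a minimal sketch using a small made-up DataFrame (the column names a and b are only for illustration). It compares the columns that come back after the default to_csv call with those that come back when index=False is passed.

```python
import io

import pandas as pd

# Made-up example data; the default index here is a RangeIndex.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# Default call: the RangeIndex is written as an unlabeled first column,
# which read_csv then names "Unnamed: 0".
roundtrip_default = pd.read_csv(io.StringIO(df.to_csv()))

# With index=False, the columns survive the round trip unchanged.
roundtrip_no_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(list(roundtrip_default.columns))   # ['Unnamed: 0', 'a', 'b']
print(list(roundtrip_no_index.columns))  # ['a', 'b']
```

Passing index_col=0 to read_csv is the other way to get the original columns back, but that also has to be remembered on every load.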
For example, if the output of to_csv is consumed by some downstream function expecting a specific set of columns (such as ingestion into a SQL table), and someone forgets to pass index=False, the whole thing breaks. IMO that is far more concerning than any ambiguity around inferring the index from the row order on load when no index is specified.
Yup, I agree that the behavior is questionable.
The default behavior (without additional parameters) should produce the expected output.
Most people do not expect a new column "Unnamed: 0" in their csv files.
But they also expect indices to show up in output files if the indices contain information (e.g. anything other than a RangeIndex).
Which is why I'm proposing a special exception for RangeIndex to be excluded from csv files by default.
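As a rough sketch of what the proposed behaviour could look like from user code today: the helper below is hypothetical and not part of pandas. The name to_csv_ignore_range and the "ignore_range" sentinel are made up for illustration, and treating only an unnamed RangeIndex as non-informative is my own assumption about where to draw the line.

```python
import pandas as pd


def to_csv_ignore_range(df, path_or_buf=None, index="ignore_range", **kwargs):
    """Hypothetical wrapper around DataFrame.to_csv (not a pandas API).

    With index="ignore_range", the index is written only when it looks
    informative. An explicit True or False is passed through unchanged.
    """
    if isinstance(index, str) and index == "ignore_range":
        # Assumption: an unnamed RangeIndex carries no information.
        is_plain_range = isinstance(df.index, pd.RangeIndex) and df.index.name is None
        index = not is_plain_range
    return df.to_csv(path_or_buf, index=index, **kwargs)


# Made-up example frames: one with the default index, one with a named index.
plain = pd.DataFrame({"a": [1, 2]})
labeled = pd.DataFrame({"a": [1, 2]}, index=pd.Index(["x", "y"], name="key"))

print(to_csv_ignore_range(plain).splitlines()[0])    # a
print(to_csv_ignore_range(labeled).splitlines()[0])  # key,a
```

Explicitly passing index=True still writes the range, so existing code that relies on the current output could keep it by opting in.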
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Alternative Solutions
Leave as is.
Additional Context
Similar issues #34576 and #46583