Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
Sometimes we store Python objects (e.g. numpy.ndarray objects) in a column of a DataFrame and want to save the DataFrame as a CSV file. In this case, each array is converted to a string using the object's __str__ method, which is not a good format for later parsing. I suggest adding an option similar to the cls parameter of json.dumps, which would allow a specific type to be encoded in a custom format.
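The problem can be sketched as follows. The sample data and variable names are made up for illustration; the point is only that the default output for an array has no commas, while a list written to CSV can be parsed back with ast.literal_eval.

```python
import ast
import io

import numpy as np
import pandas as pd

# Illustrative data: one plain column and one column holding ndarray objects.
df = pd.DataFrame({"a": [1, 2], "c": [np.array([0, 1, 2]), np.array([1, 2, 3])]})

# to_csv() without a path returns the CSV text as a string.
# Each array is written through str(), giving cells like [0 1 2].
raw_csv = df.to_csv(index=False)

# Converting the arrays to lists first gives cells like "[0, 1, 2]".
list_csv = df.assign(c=df["c"].apply(lambda x: x.tolist())).to_csv(index=False)

# The list form can be read back and parsed into Python lists.
restored = pd.read_csv(io.StringIO(list_csv))["c"].apply(ast.literal_eval).tolist()
```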
Feature Description
The user can define an encoder:
import numpy

def csv_encoder(obj):
    # Encode arrays as plain lists; leave every other value unchanged.
    if isinstance(obj, numpy.ndarray):
        return obj.tolist()
    return obj
We can then pass it to the to_csv method, which would apply it to each element of the DataFrame:
df.to_csv("/tmp/file.csv", element_encoder=csv_encoder)
Internally, pandas could simply call df.applymap(element_encoder) just before saving the file.
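A minimal sketch of that behaviour, written as a standalone helper rather than as a change to pandas itself. The name to_csv_with_encoder is hypothetical, and element_encoder is the proposed parameter, not an existing to_csv argument. Because DataFrame.applymap was deprecated in pandas 2.1 in favour of DataFrame.map, the sketch picks whichever one the installed version provides.

```python
import numpy as np
import pandas as pd


def csv_encoder(obj):
    # Encode arrays as plain lists; leave every other value unchanged.
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    return obj


def to_csv_with_encoder(df, path_or_buf=None, element_encoder=None, **kwargs):
    """Hypothetical helper: apply element_encoder to every cell, then write CSV."""
    if element_encoder is not None:
        # DataFrame.map replaces applymap from pandas 2.1 onwards.
        method = "map" if hasattr(pd.DataFrame, "map") else "applymap"
        # The elementwise call returns a new DataFrame; df itself is not modified.
        df = getattr(df, method)(element_encoder)
    return df.to_csv(path_or_buf, **kwargs)


df = pd.DataFrame({"a": [1, 2], "c": [np.array([0, 1, 2]), np.array([1, 2, 3])]})
csv_text = to_csv_with_encoder(df, element_encoder=csv_encoder, index=False)
```

Applying the encoder to every cell is simple but touches columns that need no encoding, so a real implementation might restrict it to object-dtype columns.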
Alternative Solutions
Another solution is to format the column manually before each call to to_csv():
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["a", "b", "c"], "c": [np.array(range(0, 3)), np.array(range(1, 4)), np.array(range(2, 5))]})
# Convert the arrays to lists on a copy so the original DataFrame is left untouched.
formatted_df = df.copy()
formatted_df["c"] = formatted_df["c"].apply(lambda x: x.tolist())
formatted_df.to_csv("/tmp/file.csv")
Additional Context
No response