Performance of ExtensionArray display in DataFrame/Series #43020
Labels: Duplicate Report, ExtensionArray, Output-Formatting, Performance
When printing a DataFrame/Series that is backed by an ExtensionArray the values are first converted to a numpy array:
pandas/pandas/io/formats/format.py, lines 1561 to 1574 in c03ee85
This is problematic for ExtensionArrays that are very expensive to convert to NumPy arrays: an extremely large ExtensionArray is converted in full just to print ~10 values.
Can this be overcome or worked around? There is already a special case for categorical here.
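To illustrate the kind of workaround I have in mind, here is a minimal sketch that slices first and converts only the rows that would be displayed. It does not use pandas at all: `ExpensiveArray`, `to_objects`, `stats` and `preview` are hypothetical names made up for this example, and it assumes that slicing the array is cheap while conversion is the expensive step.

```python
class ExpensiveArray:
    """Hypothetical stand-in for an ExtensionArray that is costly to convert."""

    def __init__(self, values, stats=None):
        self._values = list(values)
        # Shared counter so slices report conversions back to the parent array.
        self.stats = stats if stats is not None else {"converted": 0}

    def __len__(self):
        return len(self._values)

    def __getitem__(self, key):
        # Assumption: slicing is cheap and does not trigger any conversion.
        if isinstance(key, slice):
            return ExpensiveArray(self._values[key], self.stats)
        return self._values[key]

    def to_objects(self):
        # Stand-in for the expensive conversion; counts every element converted.
        self.stats["converted"] += len(self._values)
        return [str(v) for v in self._values]


def preview(arr, max_rows=10):
    """Format at most max_rows values, converting only the displayed ones."""
    n = len(arr)
    if n <= max_rows:
        return arr.to_objects()
    head = max_rows // 2
    tail = max_rows - head
    # Slice first, then convert: only head + tail elements are ever converted.
    return arr[:head].to_objects() + ["..."] + arr[n - tail:].to_objects()


arr = ExpensiveArray(range(1_000_000))
lines = preview(arr, max_rows=10)
print(lines)
print(arr.stats["converted"])
```

With this ordering the conversion cost scales with the number of displayed rows rather than with the length of the array. Whether the formatter in format.py can be restructured this way is an open question for me.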