BUG: Fix getsizeof when using Series(obj) and taking into account GC corrections #52112

Merged
merged 21 commits into from
Mar 24, 2023
52cd4dc
BUG: Fix getsizeof when using Series(obj) and taking into account GC …
balexandermunoz Mar 22, 2023
802b442
BUG: Fix getsizeof when using Series(obj) and taking into account GC …
balexandermunoz Mar 22, 2023
7c4355f
Merge branch 'main' into getsizeof
balexandermunoz Mar 22, 2023
563ff9a
Whatsnew v2.1.0 conflict solved
balexandermunoz Mar 22, 2023
b658fe6
Merged branches
balexandermunoz Mar 22, 2023
2c4fa35
Removing unnecessary tests that depend on OS bits
balexandermunoz Mar 22, 2023
9bc02cf
Merge branch 'main' into getsizeof
balexandermunoz Mar 22, 2023
73e58f4
Merge branch 'pandas-dev:main' into getsizeof
balexandermunoz Mar 22, 2023
beaaa93
Merge branch 'main' of https://github.com/balexandermunoz/pandas into…
balexandermunoz Mar 22, 2023
99f50bf
Merge branch 'pandas-dev:main' into getsizeof
balexandermunoz Mar 22, 2023
d382fca
Merge branch 'main' of https://github.com/balexandermunoz/pandas into…
balexandermunoz Mar 22, 2023
c0e3fc5
TST: Included python data types
balexandermunoz Mar 22, 2023
b3a56a4
TST: Added more objects Series with objects for Memory test
balexandermunoz Mar 22, 2023
43ef8d2
TST: Added more objects Series with objects for Memory test
balexandermunoz Mar 22, 2023
a7c5839
CLN: Deleted unnecesary tests
balexandermunoz Mar 22, 2023
ad34c91
Merge branch 'getsizeof' of https://github.com/balexandermunoz/pandas…
balexandermunoz Mar 22, 2023
4cb7c10
Merge branch 'pandas-dev:main' into getsizeof
balexandermunoz Mar 23, 2023
d309f0e
Merge branch 'main' of https://github.com/balexandermunoz/pandas into…
balexandermunoz Mar 23, 2023
03df875
TYP: Renamed _object_series to _type_objects_series
balexandermunoz Mar 23, 2023
6fc021e
Merge branch 'getsizeof' of https://github.com/balexandermunoz/pandas…
balexandermunoz Mar 23, 2023
96f72ca
Merge branch 'main' into getsizeof
balexandermunoz Mar 24, 2023
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
@@ -257,6 +257,7 @@ Styler
 Other
 ^^^^^
 - Bug in :func:`assert_almost_equal` now throwing assertion error for two unequal sets (:issue:`51727`)
+- Bug in :meth:`Series.memory_usage` when ``deep=True`` throw an error with Series of objects and the returned value is incorrect, as it does not take into account GC corrections (:issue:`51858`)

 .. ***DO NOT USE THIS SECTION***

3 changes: 2 additions & 1 deletion pandas/_libs/lib.pyx
@@ -1,6 +1,7 @@
 from collections import abc
 from decimal import Decimal
 from enum import Enum
+from sys import getsizeof
 from typing import (
     Literal,
     _GenericAlias,
@@ -159,7 +160,7 @@ def memory_usage_of_objects(arr: object[:]) -> int64_t:

     n = len(arr)
     for i in range(n):
-        size += arr[i].__sizeof__()
+        size += getsizeof(arr[i])
     return size
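
The "GC corrections" in the title refer to a CPython detail: `sys.getsizeof` calls the object's `__sizeof__` and then adds the garbage-collector header when the object is tracked by the GC, whereas a direct `__sizeof__()` call omits that overhead. A minimal sketch in plain Python (not the Cython code from this PR; the helper names `sizeof_raw` and `sizeof_with_gc` are made up for illustration, and the exact byte counts depend on the interpreter build):

```python
import sys


def sizeof_raw(obj):
    """Old behaviour: ask the object directly; no GC header is included."""
    return obj.__sizeof__()


def sizeof_with_gc(obj):
    """New behaviour: sys.getsizeof adds the GC header for GC-tracked objects."""
    return sys.getsizeof(obj)


samples = [1, 1.5, "abc", b"abc", [], (1, 2), {}, set()]
for obj in samples:
    raw, full = sizeof_raw(obj), sizeof_with_gc(obj)
    # On CPython, full - raw is the GC overhead (0 for untracked types such as int or str)
    print(f"{type(obj).__name__:>6}: __sizeof__={raw:>4} getsizeof={full:>4} diff={full - raw}")
```

Containers such as `list` and `dict` are the ones expected to show a non-zero difference, which is why the previous `deep=True` totals could come out too low for object Series holding them.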


17 changes: 17 additions & 0 deletions pandas/_testing/__init__.py
@@ -177,6 +177,23 @@
     np.uint32,
 ]

+PYTHON_DATA_TYPES = [
+    str,
+    int,
+    float,
+    complex,
+    list,
+    tuple,
+    range,
+    dict,
+    set,
+    frozenset,
+    bool,
+    bytes,
+    bytearray,
+    memoryview,
+]
+
 ENDIAN = {"little": "<", "big": ">"}[byteorder]

 NULL_OBJECTS = [None, np.nan, pd.NaT, float("nan"), pd.NA, Decimal("NaN")]
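
The entries in `PYTHON_DATA_TYPES` are classes rather than instances, which appears to be the failing case this PR targets: calling `__sizeof__()` on a class resolves to the unbound instance method and raises `TypeError`, while `sys.getsizeof` looks the method up on the object's type and succeeds. The sketch below mirrors the loop in `memory_usage_of_objects` in pure Python under that assumption; `total_size_old` and `total_size_new` are hypothetical names, and the behaviour shown is CPython's.

```python
from sys import getsizeof

# A shortened copy of the list added to pandas/_testing/__init__.py
PYTHON_DATA_TYPES = [str, int, float, list, dict, bytes]


def total_size_old(objs):
    """Pre-PR approach: call __sizeof__ on each element directly."""
    return sum(obj.__sizeof__() for obj in objs)


def total_size_new(objs):
    """Post-PR approach: let sys.getsizeof resolve the size (plus any GC header)."""
    return sum(getsizeof(obj) for obj in objs)


try:
    total_size_old(PYTHON_DATA_TYPES)
except TypeError as exc:
    # str.__sizeof__ is an unbound method here, so calling it without an instance fails
    print("old approach failed:", exc)

print("new approach:", total_size_new(PYTHON_DATA_TYPES), "bytes")
```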
23 changes: 23 additions & 0 deletions pandas/conftest.py
@@ -760,6 +760,29 @@ def index_or_series_obj(request):
     return _index_or_series_objs[request.param].copy(deep=True)


+_typ_objects_series = {
+    f"{dtype.__name__}-series": Series(dtype) for dtype in tm.PYTHON_DATA_TYPES
+}
+
+
+_index_or_series_memory_objs = {
+    **indices_dict,
+    **_series,
+    **_narrow_series,
+    **_typ_objects_series,
+}
+
+
+@pytest.fixture(params=_index_or_series_memory_objs.keys())
+def index_or_series_memory_obj(request):
+    """
+    Fixture for tests on indexes, series, series with a narrow dtype and
+    series with empty objects type
+    copy to avoid mutation, e.g. setting .name
+    """
+    return _index_or_series_memory_objs[request.param].copy(deep=True)
+
+
 # ----------------------------------------------------------------
 # DataFrames
 # ----------------------------------------------------------------
4 changes: 2 additions & 2 deletions pandas/tests/base/test_misc.py
@@ -82,8 +82,8 @@ def test_ndarray_compat_properties(index_or_series_obj):


 @pytest.mark.skipif(PYPY, reason="not relevant for PyPy")
-def test_memory_usage(index_or_series_obj):
-    obj = index_or_series_obj
+def test_memory_usage(index_or_series_memory_obj):
+    obj = index_or_series_memory_obj
     # Clear index caches so that len(obj) == 0 report 0 memory usage
     if isinstance(obj, Series):
         is_ser = True