BUG: AttributeError: 'BooleanArray' object has no attribute 'sum' while infer types #44079
Comments
Can you post a reproducible example along with the version of pandas you're using?
I'm having the same problem but from another place, it happens when I use This is the code that I'm using:
>>> import pandas as pd
>>> df = pd.read_csv(
... 'geo20210715_pp.txt',
... names=[
... 'cod_est', 'cod_mun', 'cod_par',
... 'nombre_est', 'nombre_mun', 'nombre_par'
... ],
... index_col=['cod_est', 'cod_mun', 'cod_par'],
... dtype={
... 'cod_est': 'UInt8', 'cod_mun': 'UInt8', 'cod_par': 'UInt8',
... 'nombre_est': 'string', 'nombre_mun': 'string',
... 'nombre_par': 'string'
... },
... engine='c',
... encoding='cp1252'
... )
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "/usr/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 488, in _read
return parser.read(nrows)
File "/usr/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1047, in read
index, columns, col_dict = self._engine.read(nrows)
File "/usr/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 309, in read
index, names = self._make_index(data, alldata, names)
File "/usr/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py", line 416, in _make_index
index = self._agg_index(index)
File "/usr/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py", line 512, in _agg_index
arr, _ = self._infer_types(arr, col_na_values | col_na_fvalues)
File "/usr/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py", line 695, in _infer_types
na_count = mask.sum()
AttributeError: 'BooleanArray' object has no attribute 'sum'
And this is the CSV file (don't worry, it is public information): geo20210715_pp.txt. My workaround is this:
>>> import pandas as pd
>>> df = pd.read_csv(
... '/home/galaxyljgd/Documentos/otros/registro_electoral_2021/geo20210715_pp.txt',
... names=[
... 'cod_est', 'cod_mun', 'cod_par',
... 'nombre_est', 'nombre_mun', 'nombre_par'
... ],
... dtype={
... 'cod_est': 'UInt8', 'cod_mun': 'UInt8', 'cod_par': 'UInt8',
... 'nombre_est': 'string', 'nombre_mun': 'string',
... 'nombre_par': 'string'
... },
... engine='c',
... encoding='cp1252'
... )
>>> df.set_index(['cod_est', 'cod_mun', 'cod_par'], inplace=True)
>>> df
                           nombre_est      nombre_mun                 nombre_par
cod_est cod_mun cod_par
21      1       3                EDO. ZULIA      MP. BARALT   PQ. MANUEL GUANIPA MATOS
                4                EDO. ZULIA      MP. BARALT      PQ. MARCELINO BRICEÑO
                5                EDO. ZULIA      MP. BARALT            PQ. SAN TIMOTEO
                6                EDO. ZULIA      MP. BARALT           PQ. PUEBLO NUEVO
        2       1                EDO. ZULIA  MP. SANTA RITA  PQ. PEDRO LUCAS URRIBARRI
...                               ...             ...                        ...
6       1       11             EDO. BOLIVAR      MP. CARONI             PQ. 5 DE JULIO
99      13      4                  EMBAJADA          CANADA                  VANCOUVER
        97      1                  EMBAJADA        JORDANIA                      AMMAN
        98      1                  EMBAJADA          CHIPRE                    NICOSIA
        99      1                  EMBAJADA          SERBIA                   BELGRADO
[1394 rows x 3 columns]
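The workaround above (read first, then set the index) can be tried without downloading the file. This is only a minimal sketch: the three inline rows are copied from the output above, while the comma delimiter, the `SAMPLE` string, and the helper name `read_with_late_index` are assumptions for illustration and may not match the real `geo20210715_pp.txt` layout.

```python
import io

import pandas as pd

# Hypothetical three-row sample; the real file's delimiter and layout may differ.
SAMPLE = (
    "21,1,3,EDO. ZULIA,MP. BARALT,PQ. MANUEL GUANIPA MATOS\n"
    "21,1,4,EDO. ZULIA,MP. BARALT,PQ. MARCELINO BRICEÑO\n"
    "21,2,1,EDO. ZULIA,MP. SANTA RITA,PQ. PEDRO LUCAS URRIBARRI\n"
)

KEYS = ["cod_est", "cod_mun", "cod_par"]
NAMES = KEYS + ["nombre_est", "nombre_mun", "nombre_par"]


def read_with_late_index(buffer):
    """Read with nullable dtypes but WITHOUT index_col, then build the index."""
    dtypes = {name: "UInt8" for name in KEYS}
    dtypes.update({name: "string" for name in NAMES[3:]})
    # No index_col here, so the parser never has to build the index itself.
    df = pd.read_csv(buffer, names=NAMES, dtype=dtypes, engine="c")
    # Build the MultiIndex only after parsing has finished.
    return df.set_index(KEYS)


df = read_with_late_index(io.StringIO(SAMPLE))
print(df)
```

Since `index_col` is never passed, the `_make_index` step seen in the traceback should not be reached, which is presumably why this route avoids the error.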
Thanks @LawrenceJGD, I've updated the OP with a minimal example and confirmed this exists on master. Investigations and PRs to fix are most welcome!
take
Line 694 in the pandas/io/parsers/base_parser.py file,
caused an "AttributeError: 'BooleanArray' object has no attribute 'sum'" following some global Anaconda update. Changing the line to
na_count = mask.astype('uint8').sum()
fixed the error in my case.
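The one-line change can be tried in isolation. The sketch below is only an illustration, not the actual pandas fix: `mask` is a hand-built stand-in for the mask inside `_infer_types` (how pandas builds it is not shown in this thread), and it assumes the mask holds no missing values.

```python
import pandas as pd

# Stand-in for the NA mask: a nullable BooleanArray with two True entries.
mask = pd.array([True, False, True], dtype="boolean")

# Cast to a plain unsigned integer array first, then sum.
# This avoids relying on BooleanArray having its own .sum() method.
na_count = mask.astype("uint8").sum()

print(int(na_count))  # prints 2
```

Each `True` becomes 1 after the cast, so the sum is simply the number of masked entries.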
Edit by rhshadrach:
Minimal example