BUG: Cannot append to DataFrame with Timestamp column with non-nanosecond unit #55374
Comments
There is a bug in pandas pandas-dev/pandas#55374
I could reproduce the issue and found that the error disappears if the timestamp is higher than pandas.Timestamp.max, which equals Timestamp('2262-04-11 23:47:16.854775807'), whatever the unit ('m', 's', 'ms', ...). For example, the code below with 'us' units works fine:
I know it doesn't help much, but it may give some direction for the investigation. |
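For context on where the quoted pandas.Timestamp.max value comes from: nanosecond-resolution timestamps are stored as signed 64-bit counts of nanoseconds since the Unix epoch, so the upper bound is simply the largest int64. The sketch below re-derives that bound with the standard library only; the names INT64_MAX, EPOCH and max_ns_as_datetime are illustrative, and datetime resolves only microseconds, so the last three nanosecond digits are dropped.

```python
from datetime import datetime, timedelta

INT64_MAX = 2**63 - 1          # largest signed 64-bit integer (nanoseconds)
EPOCH = datetime(1970, 1, 1)   # Unix epoch, timezone-naive

# datetime has microsecond resolution, so floor-divide the nanosecond
# count by 1000 before adding it to the epoch.
max_ns_as_datetime = EPOCH + timedelta(microseconds=INT64_MAX // 1000)

print(max_ns_as_datetime)  # 2262-04-11 23:47:16.854775
```

Coarser units such as 'us' or 's' use the same 64-bit storage with a larger step, which is why they can represent dates beyond this bound.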
I should say that when I was digging into this error to begin with, the error occurred in the
>>> import pandas as pd
>>> df = pd.DataFrame({"time": pd.Timestamp(1513393355, unit="us"), "A":[0]})
>>> pd.concat([df,df])
                        time  A
0 1970-01-01 00:25:13.393355  0
0 1970-01-01 00:25:13.393355  0
>>> df.loc[1] = df.loc[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/indexing.py", line 885, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/indexing.py", line 1883, in _setitem_with_indexer
    self._setitem_with_indexer_missing(indexer, value)
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/indexing.py", line 2241, in _setitem_with_indexer_missing
    self.obj._mgr = self.obj._append(value)._mgr
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/frame.py", line 10227, in _append
    result = concat(
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 393, in concat
    return op.get_result()
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 680, in get_result
    new_data = concatenate_managers(
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/internals/concat.py", line 199, in concatenate_managers
    return BlockManager(tuple(blocks), axes)
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 916, in __init__
    self._verify_integrity()
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 923, in _verify_integrity
    raise_construction_error(tot_items, block.shape[1:], self.axes)
  File "/.../.venv/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 2118, in raise_construction_error
    raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
ValueError: Shape of passed values is (1, 2), indices imply (2, 2) |
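The interactive session above can be wrapped into a small script for checking a given pandas install. This is only a sketch: the helper name try_loc_append is made up for illustration, and the outcome depends on the installed pandas version (the thread reports a ValueError on affected versions and no error on main), so no expected result is asserted here.

```python
import pandas as pd

# One-row frame whose "time" column holds a Timestamp built with unit="us",
# copied from the session above.
df = pd.DataFrame({"time": pd.Timestamp(1513393355, unit="us"), "A": [0]})

# pd.concat is reported to work for this frame.
stacked = pd.concat([df, df])


def try_loc_append(frame):
    """Hypothetical helper: try to append row 0 as a new row via .loc.

    Returns "ok" on success, or the ValueError message on failure.
    """
    frame = frame.copy()  # leave the caller's frame untouched
    try:
        frame.loc[1] = frame.loc[0]
    except ValueError as exc:
        return f"ValueError: {exc}"
    return "ok"


outcome = try_loc_append(df)
print(pd.__version__, df["time"].dtype, outcome)
```

Printing the version and the column dtype next to the outcome makes it easier to compare results across environments when reporting back on the issue.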
Currently investigating the concat.py files. |
take |
Not necessarily a duplicate but pretty similar to #55067 |
I think I have a similar issue, with pandas 2.1.1 and pyarrow 13.0.0:
Let me know if I should open a new issue because it's substantially different; I know some of these have already been addressed in main (and I didn't have a chance to check main) so didn't want to spam. |
Ah, this is a bug in pyarrow==13.0.0, downgrading to 12.0.0 solves that. |
No longer raises on main, possibly closed by #53641? |
Looks to have been addressed by #53641 so closing |
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
Appending a row to a DataFrame with .loc is broken when one of the columns contains a timestamp whose unit is not nanoseconds. Attempting to do so raises a ValueError.
Expected Behavior
The new row should be appended to the DataFrame successfully. The below example works:
Installed Versions
INSTALLED VERSIONS
commit : 77bc67a
python : 3.11.1.final.0
python-bits : 64
OS : Darwin
OS-release : 22.6.0
Version : Darwin Kernel Version 22.6.0: Wed Jul 5 22:22:05 PDT 2023; root:xnu-8796.141.3~6/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0+untagged.33100.g77bc67a.dirty
numpy : 1.24.4
pytz : 2023.3
dateutil : 2.8.2
setuptools : 65.5.0
pip : 23.2.1
Cython : 0.29.33
pytest : 7.4.0
hypothesis : 6.82.5
sphinx : 6.2.1
blosc : 1.11.1
feather : None
xlsxwriter : 3.1.2
lxml.etree : 4.9.3
html5lib : 1.1
pymysql : 1.4.6
psycopg2 : 2.9.7
jinja2 : 3.1.2
IPython : 8.14.0
pandas_datareader : None
bs4 : 4.12.2
bottleneck : 1.3.7
dataframe-api-compat: None
fastparquet : 2023.7.0
fsspec : 2023.6.0
gcsfs : 2023.6.0
matplotlib : 3.7.2
numba : 0.57.1
numexpr : 2.8.5
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 12.0.1
pyreadstat : 1.2.2
python-calamine : None
pyxlsb : 1.0.10
s3fs : 2023.6.0
scipy : 1.11.2
sqlalchemy : 2.0.20
tables : None
tabulate : 0.9.0
xarray : 2023.7.0
xlrd : 2.0.1
zstandard : 0.21.0
tzdata : 2023.3
qtpy : None
pyqt5 : None