Index with dtype int32 #16404
Comments
This is possible, but it would require some non-trivial work. RangeIndex obviates the need for much of this anyhow. Furthermore, most indexing requires int64, so you end up upcasting at times. Careful performance testing would be required, so this would need a community contribution.
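To illustrate the RangeIndex point above, here is a minimal sketch (not part of the original comment). It assumes a DataFrame built without an explicit index, which pandas gives a RangeIndex by default; the row count `n` and column name `x` are hypothetical. A RangeIndex stores only start, stop and step, so its footprint does not grow with the number of rows, whereas materialising the same labels as an int64 array costs 8 bytes per label.

```python
import numpy as np
import pandas as pd

n = 1_000_000  # hypothetical row count, chosen only for illustration

# No explicit index is passed, so pandas assigns a default RangeIndex.
df = pd.DataFrame({"x": np.zeros(n, dtype=np.int8)})
range_bytes = df.index.memory_usage()

# The same labels stored explicitly as int64 need 8 bytes each.
explicit_bytes = np.arange(n, dtype=np.int64).nbytes

print(type(df.index).__name__)
print(range_bytes, explicit_bytes)
```

When the labels are simply 0..n-1, the index is therefore already far smaller than an int32 array would be, which is why the dtype question mostly matters for non-contiguous integer labels.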
By the way, I think the following should behave as expected:

    class Int32Index(pd.Int64Index):
        _default_dtype = np.int32

    i = Int32Index(np.array([...], dtype='int32'))

except that, as suggested by @jreback, unexpected upcasts may happen when doing any non-trivial operation.
I don't recommend this. At least in pandas 0.22.0, it doesn't work as expected.
It seems sort_values is missing some internal validation (the output length should equal the input length).
@stuz5000 this is totally unsupported.
Me neither ;-) But in case you want to keep experimenting, try

    class Int32Index(pd.Int64Index):
        _default_dtype = np.int32

        @property
        def asi8(self):
            return self.values

which fixes the problem you report.
Code Sample, a copy-pastable example if possible
Problem description
I want to create a DataFrame whose Index has dtype int32, but I can't.
There is a related discussion here: https://stackoverflow.com/questions/44090944/how-to-change-index-dtype-of-pandas-dataframe-to-int32
Expected Output
An Index with dtype int32, which would use 4 bytes per label instead of 8.
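The expected saving can be sketched with plain NumPy arrays. This is only the arithmetic on the underlying buffers, under the assumption that every label fits in int32; it does not show what pandas itself does with the array, and `n_rows` is a hypothetical size.

```python
import numpy as np

n_rows = 1_000_000  # hypothetical number of index labels

labels64 = np.arange(n_rows, dtype=np.int64)
# Downcasting is lossless here because every value fits in int32.
labels32 = labels64.astype(np.int32)

# Per-label size in bytes for each dtype.
print(labels64.itemsize, labels32.itemsize)  # 8 4

# Total bytes saved by storing the labels as int32.
saved = labels64.nbytes - labels32.nbytes
print(saved)  # 4000000
```

The saving is 4 bytes per label, so it only becomes significant for very long indexes.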
Output of pd.show_versions()
pandas: 0.20.1
pytest: 2.9.2
pip: 8.1.2
setuptools: 34.3.2
Cython: 0.24.1
numpy: 1.12.0
scipy: 0.18.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
feather: None
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None