Using str() in .map() on floats gives string with higher precision than before #13228
Comments
you would need to be sure that you are using exactly the same Python version (even the minor version matters) as well as the exact same NumPy version. This is not something pandas controls.
stringifying floats is a really bad idea (even for comparisons); use the pandas functions
I thought so too Jeff, but all I did was switch between "conda install pandas=0.18.1" and "conda install pandas=0.17.1", and the installer never mentioned anything but pandas. Here's pd.show_versions() for 0.18.1:
INSTALLED VERSIONS
commit: None
pandas: 0.18.1
and for 0.17.1:
INSTALLED VERSIONS
commit: None
pandas: 0.17.1
@marcomayer I do recall some changes w.r.t. float formatting so it's certainly possible. not really sure where/how though. If you'd like to investigate, that would be great.
regarding stringifying floats, I always ran into trouble because I have to find a way to get numbers into decimal.Decimal, then save them (fast) to HDF5 and get exactly the same values back when reading from HDF5 into a DF. The only way I found to make that work was to take the decimal.Decimal, make a str() out of it and write that to HDF5, then read it and convert it back to decimal.Decimal. There's probably a much saner way to go, but I haven't found it yet.
@marcomayer doesn't sound fun. you really need to use If I had to do this, and wanted to store in HDF5; here is a way (just sort of cooked this up), but should be pretty efficient. say I have a you can explode them like:
then store them as a table of integers (it's 50 wide) in a sub-node. Then you can exactly reconstruct them.
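The snippet that belonged to this comment did not survive on the page, so what follows is only a guess at the kind of approach meant: a minimal sketch, assuming each Decimal is split with `Decimal.as_tuple()` into a sign, an exponent and a fixed-width row of digits. The helper names `explode` and `implode` and the width of 50 (taken from the "50 wide" remark) are assumptions for illustration, not code from the thread.

```python
from decimal import Decimal

WIDTH = 50  # assumed fixed number of digit columns, after the "50 wide" remark


def explode(d, width=WIDTH):
    """Split a finite Decimal into a flat row of ints: [sign, exponent, digits...]."""
    sign, digits, exponent = d.as_tuple()
    if not isinstance(exponent, int):
        # as_tuple() uses 'n', 'N' or 'F' as exponent for NaN and Infinity
        raise ValueError("NaN and Infinity are not supported in this sketch")
    if len(digits) > width:
        raise ValueError("too many digits for the chosen width")
    # Left-pad with zeros so every row has the same length.
    padded = (0,) * (width - len(digits)) + digits
    return [sign, exponent, *padded]


def implode(row):
    """Rebuild the Decimal from a row produced by explode()."""
    sign, exponent, *digits = row
    # Drop the zero padding but keep at least one digit (needed for zero itself).
    i = 0
    while i < len(digits) - 1 and digits[i] == 0:
        i += 1
    return Decimal((sign, tuple(digits[i:]), exponent))


row = explode(Decimal("-0.0078125"))
restored = implode(row)
```

Each row is plain integers, so a list of rows can go into an integer table. If the values come back as NumPy integers they would probably need converting with `int()` before being passed to `implode`.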
Thanks Jeff, I'll definitely give that a try; maybe it'll even be faster than the str() conversions I do at the moment. Yes, having decimal.Decimal as a native format in pandas, HDF5 etc. would be amazing. And to be honest, I do wonder why it isn't, as I can't imagine being the only one in the world of finance with that need. Sure, for most use cases there are ways around decimal.Decimal, as hardly anyone really needs that precision, but using just floats quickly causes other pains, for example when dealing with futures that tick in 0.0078125 and you have to round all the time.
but
it comes down to having to round each time you add or subtract a few ticks and then compare for greater or lower than, for example, where that 1 at the end would make a difference, or having to round each time you place an order since the exchange won't accept it if it has more decimals than the instrument.
but maybe I'm just a bit too paranoid and at some point simply decided to take the way that looked safe to me ;)
another possibility is to store 2 columns: the actual value as a float and the rounding unit. another way is to turn it into an int64 and store the significance.
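The integer idea can be sketched roughly as follows, assuming prices are kept as a whole number of ticks next to the tick size. The helpers `to_ticks` and `from_ticks` and the sample price are made up for illustration; only the 0.0078125 tick size comes from the thread.

```python
from decimal import Decimal


def to_ticks(price, tick):
    """Return price as a whole number of ticks; raise if it is off the tick grid."""
    q = price / tick
    if q != q.to_integral_value():
        raise ValueError(f"{price} is not a multiple of {tick}")
    return int(q)


def from_ticks(n, tick):
    """Turn a tick count back into a price."""
    return n * tick


TICK = Decimal("0.0078125")  # tick size mentioned in the thread (1/128)

n = to_ticks(Decimal("101.5078125"), TICK)  # hypothetical price
print(n)  # 12993

# Moving by a few ticks and comparing is plain integer arithmetic, no rounding.
higher = n + 3
print(higher > n, from_ticks(higher, TICK))
```

With this layout the tick counts fit in an int64 column and the tick size is stored once per instrument, so a price only becomes a decimal again at the edges, for example when placing an order.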
that's a good idea, if I find some time I'll give it a try! Would indeed be wonderful to get rid of decimal.Decimal, which as a non-native type is a real PITA ;)
This looks to be fixed on master. Could use a test:
Code Sample, a copy-pastable example if possible
Expected Output
I'd expect the same output as in 0.17.x and before.
I do this a lot to convert floats to decimal.Decimal with .map(lambda x: D(str(x))), which is slightly faster than using .astype(str).map(D).
This also messed up many of my unit tests where I convert DFs to string dicts. It's only thanks to those that I found this at all.
I checked the change docs but couldn't find anything that points to why this should have changed.
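To illustrate why the string form matters for this conversion, here is a small sketch in plain Python (no pandas). The thread does not establish which formatting 0.18.1 actually applied, so the 17-significant-digit format below is only an assumption used to show the kind of difference a higher-precision string makes.

```python
from decimal import Decimal as D

x = 0.1

short = str(x)               # shortest string that round-trips to the same float
longer = format(x, ".17g")   # assumed higher-precision rendering, 17 significant digits

print(short)    # 0.1
print(longer)   # 0.10000000000000001

# Both strings parse back to the same float ...
print(float(short) == float(longer))

# ... but they give different Decimals, which is what breaks D(str(x)).
print(D(short) == D(longer))

# D(x) without going through a string keeps the exact binary value instead.
print(D(x) == D("0.1"))
```

So the result of D(str(x)) depends entirely on how the float was rendered, which is why a formatting change upstream shows up in the Decimals and in any tests that compare string output.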
output of pd.show_versions()