Skip to content

Pypy fixes #15080

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Pypy fixes #15080

wants to merge 4 commits into from

Conversation

mattip
Copy link
Contributor

@mattip mattip commented Jan 7, 2017

benign changes visa-vis CPython needed to build pandas for PyPy 2.7

@codecov-io
Copy link

codecov-io commented Jan 7, 2017

Current coverage is 84.77% (diff: 100%)

Merging #15080 into master will increase coverage by <.01%

@@             master     #15080   diff @@
==========================================
  Files           145        145          
  Lines         51212      51220     +8   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43413      43421     +8   
  Misses         7799       7799          
  Partials          0          0          

Powered by Codecov. Last update 42cdb34...b38d7be

@mattip
Copy link
Contributor Author

mattip commented Jan 8, 2017

commit 2e3c256 fixes a bug where substitution of PyString_GET_SIZE with PyUnicode_GET_SIZE always happened, on both python 2 and 3. It just turns out that the c-macros PyUnicode_GET_SIZE and PyString_GET_SIZE are equivalent on CPython, so the error compiled and ran as advertised. The issue manifested itself on PyPy which is more particular. [EDIT upon review and updated pull request this became changeset 91522c6, thanks for the corrections] [EDIT the suggested fix didn't work, revert to working version}

@@ -26,9 +26,9 @@ from cpython cimport (PyDict_New, PyDict_GetItem, PyDict_SetItem,
PyBytes_GET_SIZE,
PyUnicode_GET_SIZE)

try:
IF PY_MAJOR_VERSION == 2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this more typical? (and already defined)

#if PY_VERSION_HEX >= 0x03000000

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is more typical for C code, but converting a pyx file to C happens during a cython compiliation, and the only variables available are the ones appearing in cython_compile_time_env AFAICT, which is why I needed to define the variable in setup.py.

It is not surprising that this is not well known, the only reference I could find on it was a stack overflow question. I tried all different ways to rewrite this code to keep the PyString_GET_SIZE on python2 and make it PyUnicode_GET_SIZE on python3, this was the least intrusive

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chris-b1 that did the trick, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chris-b1 I could not get that to work since this is needed at compile time, so I needed the upper-case IF <env_var>, not a runtime if <value> :( I reverted to the original fix

Copy link
Contributor

@jreback jreback Jan 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this still seems very odd, the why can't you do

#if PY_VERSION_HEX >= 0x03000000
    from cpython cimport PyUnicode_GET_SIZE as PyString_GET_SIZE
#else
    from cpython cimport PyString_GET_SIZE
#endif

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that. It does not work for me. Running python3 setup.py cython results in a lib.c with PyString_GET_SIZE in the translated max_len_string_array, where if it worked that would become PyUnicode_GET_SIZE

@@ -149,6 +149,11 @@ def build_extensions(self):

with open(outfile, "w") as f:
f.write(pyxcontent)
# XXX used only in lib.pyx, assumes old_build_ext
if not self.cython_compile_time_env:
self.cython_compile_time_env = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

@@ -427,7 +427,11 @@ int Object_npyObjectAddKey(void *prv, JSOBJ obj, JSOBJ name, JSOBJ value) {

// only fill label array once, assumes all column labels are the same
// for 2-dimensional arrays.
#ifdef PYPY_VERSION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you are defining this on external compile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this is defined when compiling c-extensions with the PyPy headers

@jreback jreback added the Compat pandas objects compatability with Numpy or Python functions label Jan 9, 2017
@mattip
Copy link
Contributor Author

mattip commented Jan 9, 2017

updated pull request as per reviews and suggestions

@jreback jreback added this to the 0.20.0 milestone Jan 9, 2017
jreback pushed a commit to jreback/pandas that referenced this pull request Jan 9, 2017
BUG: bring implementation into compliance with cython documentation

BUG: fix improper use of conditional compilation

closes pandas-dev#15080
@mattip
Copy link
Contributor Author

mattip commented Jan 10, 2017

An alternative to the PyString_GET_SIZE vs. PyUnicode_GET_SIZE would be to just use Py_SIZE instead on obth python2 and python3, which is less clear but works. Any opinions?

@jreback
Copy link
Contributor

jreback commented Jan 10, 2017

@mattip FYI I have #15101 to more generally fix the dealloc issue.

@mattip
Copy link
Contributor Author

mattip commented Jan 10, 2017

@jreback cool, #15101 makes my changes redundant. So in light of the unsatisfactory solution to the PyString_GET_SIZE vs PyUInicode_GET_SIZE problem, and #15101, I will close this pull request, and resubmit only the change to pandas/src/ujson/python/JSONtoObj.c

@mattip mattip closed this Jan 10, 2017
@mattip mattip deleted the pypy-fixes branch January 10, 2017 20:32
@mattip
Copy link
Contributor Author

mattip commented Jan 10, 2017

Sorry for the noise

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants