Skip to content

Clean Up C Warnings #31935

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 17, 2020
Merged

Clean Up C Warnings #31935

merged 7 commits into from
Feb 17, 2020

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Feb 12, 2020

Getting closer to no warnings (at least with clang). Changes in this PR get rid of the following warnings:

warning: pandas/_libs/window/aggregations.pyx:60:4: Buffer unpacking not optimized away.
warning: pandas/_libs/window/aggregations.pyx:60:4: Buffer unpacking not optimized away.
warning: pandas/_libs/window/aggregations.pyx:60:36: Buffer unpacking not optimized away.
warning: pandas/_libs/window/aggregations.pyx:60:36: Buffer unpacking not optimized away.

pandas/_libs/hashing.c:25100:20: warning: unused function '__pyx_memview_get_object' [-Wunused-function]
  static PyObject *__pyx_memview_get_object(const char *itemp) {
                   ^
pandas/_libs/hashing.c:25105:12: warning: unused function '__pyx_memview_set_object' [-Wunused-function]
static int __pyx_memview_set_object(const char *itemp, PyObject *obj) {

pandas/_libs/algos.c:81410:3: warning: code will never be executed [-Wunreachable-code]
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_arr.rcbuffer->pybuffer);
  ^~~~~~~~~~~~~~~~~~~~~~~
pandas/_libs/algos.c:83045:3: warning: code will never be executed [-Wunreachable-code]
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_arr.rcbuffer->pybuffer);
  ^~~~~~~~~~~~~~~~~~~~~~~
pandas/_libs/algos.c:83951:3: warning: code will never be executed [-Wunreachable-code]
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_arr.rcbuffer->pybuffer);
  ^~~~~~~~~~~~~~~~~~~~~~~
pandas/_libs/algos.c:84857:3: warning: code will never be executed [-Wunreachable-code]
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_arr.rcbuffer->pybuffer);
  ^~~~~~~~~~~~~~~~~~~~~~~
pandas/_libs/algos.c:85034:3: warning: code will never be executed [-Wunreachable-code]
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_arr.rcbuffer->pybuffer);
  ^~~~~~~~~~~~~~~~~~~~~~~
pandas/_libs/algos.c:85940:3: warning: code will never be executed [-Wunreachable-code]
  __Pyx_SafeReleaseBuffer(&__pyx_pybuffernd_arr.rcbuffer->pybuffer);
  ^~~~~~~~~~~~~~~~~~~~~~~

pandas/_libs/tslibs/conversion.c:33113:20: warning: unused function '__pyx_memview_get_object' [-Wunused-function]
  static PyObject *__pyx_memview_get_object(const char *itemp) {
                   ^
pandas/_libs/tslibs/conversion.c:33118:12: warning: unused function '__pyx_memview_set_object' [-Wunused-function]
static int __pyx_memview_set_object(const char *itemp, PyObject *obj) {

Only a handful left after this

@WillAyd WillAyd added Build Library building on various platforms Clean labels Feb 12, 2020
@@ -15,7 +15,7 @@ DEF dROUNDS = 4


@cython.boundscheck(False)
def hash_object_array(object[:] arr, object key, object encoding='utf8'):
def hash_object_array(ndarray[object] arr, object key, object encoding='utf8'):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the impression that memory views used in tandem with object will throw warnings cc @jbrockmendel

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense. this would also raise ATM if we were to pass a read-only ndarray[object], so a worthwhile change

@@ -1173,8 +1174,8 @@ ctypedef fused out_t:

@cython.boundscheck(False)
@cython.wraparound(False)
def diff_2d(ndarray[diff_t, ndim=2] arr,
ndarray[out_t, ndim=2] out,
def diff_2d(diff_t[:, :] arr,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm tried but seems to crash the compiler. Traceback below

[1/1] Cythonizing pandas/_libs/algos.pyx

Error compiling Cython file:
------------------------------------------------------------
...
    float64_t


@cython.boundscheck(False)
@cython.wraparound(False)
def diff_2d(const diff_t[:, :] arr,
                        ^
------------------------------------------------------------

pandas/_libs/algos.pyx:1178:25: Compiler crash in AnalyseDeclarationsTransform

File 'ModuleNode.py', line 124, in analyse_declarations: ModuleNode(algos.pyx:1:0,
    full_module_name = 'pandas._libs.algos')
File 'Nodes.py', line 431, in analyse_declarations: StatListNode(algos.pyx:1:0)
File 'Nodes.py', line 431, in analyse_declarations: StatListNode(algos.pyx:1178:0)
File 'Nodes.py', line 375, in analyse_declarations: CompilerDirectivesNode(algos.pyx:1178:0)
File 'Nodes.py', line 431, in analyse_declarations: StatListNode(algos.pyx:1178:0)
File 'Nodes.py', line 2887, in analyse_declarations: DefNode(algos.pyx:1178:0,
    modifiers = [...]/0,
    name = 'diff_2d',
    num_required_args = 4,
    outer_attrs = [...]/2,
    py_wrapper_required = True,
    reqd_kw_flags_cname = '0')
File 'Nodes.py', line 2925, in analyse_argument_types: DefNode(algos.pyx:1178:0,
    modifiers = [...]/0,
    name = 'diff_2d',
    num_required_args = 4,
    outer_attrs = [...]/2,
    py_wrapper_required = True,
    reqd_kw_flags_cname = '0')
File 'Nodes.py', line 1093, in analyse: MemoryViewSliceTypeNode(algos.pyx:1178:25,
    name = 'memoryview')

Compiler crash traceback from this point on:
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Compiler/Nodes.py", line 1093, in analyse
    self.type = PyrexTypes.MemoryViewSliceType(base_type, axes_specs)
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Compiler/PyrexTypes.py", line 625, in __init__
    self.dtype_name = Buffer.mangle_dtype_name(self.dtype)
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Compiler/Buffer.py", line 631, in mangle_dtype_name
    return prefix + dtype.specialization_name()
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Compiler/PyrexTypes.py", line 57, in specialization_name
    common_subs = (self.empty_declaration_code()
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Compiler/PyrexTypes.py", line 51, in empty_declaration_code
    self._empty_declaration = self.declaration_code('')
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Compiler/PyrexTypes.py", line 1582, in declaration_code
    return self.const_base_type.declaration_code("const %s" % entity_code, for_display, dll_linkage, pyrex)
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Compiler/PyrexTypes.py", line 1649, in declaration_code
    raise Exception("This may never happen, please report a bug")
Exception: This may never happen, please report a bug
Traceback (most recent call last):
  File "setup.py", line 784, in <module>
    setup_package()
  File "setup.py", line 754, in setup_package
    ext_modules=maybe_cythonize(extensions, compiler_directives=directives),
  File "setup.py", line 527, in maybe_cythonize
    return cythonize(extensions, *args, **kwargs)
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Build/Dependencies.py", line 1096, in cythonize
    cythonize_one(*args)
  File "/Users/williamayd/miniconda3/envs/pandas-dev/lib/python3.7/site-packages/Cython/Build/Dependencies.py", line 1219, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: pandas/_libs/algos.pyx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, possibly a cython issue

@@ -1179,7 +1180,7 @@ def diff_2d(diff_t[:, :] arr,
Py_ssize_t periods, int axis):
cdef:
Py_ssize_t i, j, sx, sy, start, stop
bint f_contig = arr.flags.f_contiguous
bint f_contig = PyBuffer_IsContiguous(arr, 'F')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat. probably not worth the effort on our end, but itd be nice if cython automatically made this optimization

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh didn't look too much into the subsequent code; think could be cleaned up?

@mroeschke
Copy link
Member

If you'd like to remove the warning: pandas/_libs/window/aggregations.pyx:60:4: Buffer unpacking not optimized away. warnings, this worked locally for me (clang as well)

diff --git a/pandas/_libs/window/aggregations.pyx b/pandas/_libs/window/aggregations.pyx
index f67581859..b9283b7ee 100644
--- a/pandas/_libs/window/aggregations.pyx
+++ b/pandas/_libs/window/aggregations.pyx
@@ -56,9 +56,7 @@ cdef:
 cdef inline int int_max(int a, int b): return a if a >= b else b
 cdef inline int int_min(int a, int b): return a if a <= b else b

-cdef inline bint is_monotonic_start_end_bounds(
-    ndarray[int64_t, ndim=1] start, ndarray[int64_t, ndim=1] end
-):
+cdef inline bint is_monotonic_start_end_bounds(int64_t[:] start, int64_t[:] end):
     return is_monotonic(start, False)[0] and is_monotonic(end, False)[0]

@WillAyd
Copy link
Member Author

WillAyd commented Feb 12, 2020

Nice - I think that's a better solution

@mroeschke
Copy link
Member

Darn. Sorry, looks like my suggestion didn't work @WillAyd.

I only compiled + tested the tests in pandas/tests/window. Guess the way you had it before works.

@WillAyd
Copy link
Member Author

WillAyd commented Feb 12, 2020

I think the failure is from something else. Giving it a closer look

@jreback jreback added this to the 1.1 milestone Feb 15, 2020
@@ -56,7 +56,7 @@ cdef:
cdef inline int int_max(int a, int b): return a if a >= b else b
cdef inline int int_min(int a, int b): return a if a <= b else b

cdef inline bint is_monotonic_start_end_bounds(
cdef bint is_monotonic_start_end_bounds(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, why can't this also be in-line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the cause of the warning: pandas/_libs/window/aggregations.pyx:60:4: Buffer unpacking not optimized away. warning. @mroeschke suggested a potential inline approach that got rid of the warning and used memory views but caused test failures. Could still adjust a few things to make that work but I think without a larger effort this is the best option for now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a highly optimized path

if u can show this doesn’t have any perf regressions then ok

@WillAyd
Copy link
Member Author

WillAyd commented Feb 16, 2020

@jreback here are the ASV results after this change. Mixed bag but generally my interpretation would be generally better performance

.       before           after         ratio
     [5dd27ed7]       [644055bb]
     <master>         <memview-warnings>
+     3.50±0.02ms       4.21±0.9ms     1.21  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 1, 'linear')
+     3.10±0.01ms       3.71±0.3ms     1.20  rolling.Quantile.time_quantile('Series', 1000, 'int', 1, 'nearest')
+      2.70±0.2ms       3.22±0.3ms     1.19  rolling.VariableWindowMethods.time_rolling('DataFrame', '50s', 'float', 'count')
+     1.81±0.05ms      2.13±0.09ms     1.18  rolling.EWMMethods.time_ewm('DataFrame', 1000, 'int', 'mean')
+      3.27±0.1ms       3.80±0.3ms     1.16  rolling.Quantile.time_quantile('Series', 1000, 'int', 0, 'nearest')
+         164±4ms         189±10ms     1.15  rolling.Apply.time_rolling('Series', 3, 'float', <function Apply.<lambda>>, False)
+      2.76±0.1ms       3.13±0.3ms     1.14  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 0, 'nearest')
+      52.6±0.4ms         58.3±8ms     1.11  rolling.Quantile.time_quantile('Series', 10, 'int', 0.5, 'linear')
-         245±9ms        222±0.5ms     0.91  rolling.Engine.time_rolling_apply('DataFrame', 'int', <function sum>, 'numba')
-     3.09±0.07ms      2.81±0.02ms     0.91  rolling.Methods.time_rolling('Series', 1000, 'int', 'skew')
-      2.85±0.9ms      2.58±0.02ms     0.90  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 1, 'higher')
-      2.94±0.1ms      2.65±0.04ms     0.90  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'count')
-      3.26±0.2ms      2.91±0.08ms     0.89  rolling.ExpandingMethods.time_expanding('Series', 'float', 'skew')
-        4.01±2ms      3.58±0.07ms     0.89  rolling.Apply.time_rolling('Series', 300, 'float', <function Apply.<lambda>>, True)
-      4.57±0.1ms      4.02±0.03ms     0.88  rolling.ExpandingMethods.time_expanding('Series', 'float', 'count')
-      5.44±0.3ms      4.74±0.04ms     0.87  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'float', 'min')
-      2.53±0.1ms      2.20±0.06ms     0.87  rolling.ExpandingMethods.time_expanding('Series', 'int', 'sum')
-      2.13±0.3ms      1.85±0.01ms     0.87  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'mean')
-      2.59±0.2ms      2.20±0.02ms     0.85  rolling.Methods.time_rolling('Series', 1000, 'float', 'mean')
-        21.8±2ms       18.3±0.1ms     0.84  rolling.Pairwise.time_pairwise(None, 'corr', True)
-      3.57±0.2ms       2.93±0.1ms     0.82  rolling.Apply.time_rolling('Series', 300, 'float', <function sum>, True)
-      3.38±0.4ms       2.77±0.1ms     0.82  rolling.VariableWindowMethods.time_rolling('Series', '50s', 'float', 'count')
-        4.41±1ms      3.56±0.09ms     0.81  rolling.Quantile.time_quantile('Series', 10, 'float', 1, 'lower')
-      4.20±0.1ms      3.39±0.01ms     0.81  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0, 'lower')
-      4.78±0.9ms       3.81±0.2ms     0.80  rolling.Methods.time_rolling('Series', 1000, 'float', 'max')
-      5.22±0.2ms       4.10±0.2ms     0.79  rolling.ExpandingMethods.time_expanding('Series', 'int', 'count')
-     3.71±0.04ms      2.90±0.02ms     0.78  rolling.VariableWindowMethods.time_rolling('Series', '1d', 'float', 'sum')
-      5.54±0.5ms       4.32±0.4ms     0.78  rolling.Quantile.time_quantile('Series', 10, 'float', 0, 'nearest')
-      3.67±0.7ms      2.84±0.07ms     0.77  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'float', 'sum')
-      3.89±0.2ms      2.90±0.05ms     0.75  rolling.ExpandingMethods.time_expanding('Series', 'float', 'std')
-      4.75±0.6ms       3.43±0.2ms     0.72  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'min')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

@jbrockmendel
Copy link
Member

@mroeschke of the asv results Will posted, are any of those benchmarks more important than others? i.e. are the few that are slower deal-breakers?

@mroeschke
Copy link
Member

None that are concerning IMO. And the Numba engine benchmarks improved which I'm happy about :)

@jbrockmendel
Copy link
Member

LGTM

@jreback
Copy link
Contributor

jreback commented Feb 17, 2020

thanks @WillAyd

@jreback jreback merged commit c81b0ba into pandas-dev:master Feb 17, 2020
roberthdevries pushed a commit to roberthdevries/pandas that referenced this pull request Mar 2, 2020
@WillAyd WillAyd deleted the memview-warnings branch April 12, 2023 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms Clean
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants