File ~/tmp/.venv/lib/python3.10/site-packages/dask/dataframe/core.py:312, in Scalar.__bool__(self)
47
-
    311 def __bool__(self):
48
-
--> 312     raise TypeError(
49
-
    313         f"Trying to convert {self} to a boolean value. Because Dask objects are "
50
-
    314         "lazily evaluated, they cannot be converted to a boolean value or used "
51
-
    315         "in boolean conditions like if statements. Try calling .compute() to "
52
-
    316         "force computation prior to converting to a boolean value or using in "
53
-
    317         "a conditional statement."
54
-
    318     )
41
+
[...]
55
42
56
43
TypeError: Trying to convert dd.Scalar<gt-bbc3..., dtype=bool> to a boolean value. Because Dask objects are lazily evaluated, they cannot be converted to a boolean value or used in boolean conditions like if statements. Try calling .compute() to force computation prior to converting to a boolean value or using in a conditional statement.
57
44
```
58
45
59
-
Exactly which methods require computation may vary across implementations. Some may
60
-
implicitly do it for users under-the-hood for certain methods, whereas others require
61
-
the user to explicitly trigger it.
62
-
63
-
Therefore, the Dataframe API has a `Dataframe.maybe_evaluate` method. This is to be
64
-
interpreted as a hint, rather than as a directive - the implementation itself may decide
65
-
whether to force execution at this step, or whether to defer it to later.
66
-
67
-
Operations which require `DataFrame.may_execute` to have been called at some prior
68
-
point are:
69
-
- `DataFrame.to_array`
70
-
- `DataFrame.shape`
71
-
- `Column.to_array`
72
-
- calling `bool`, `int`, or `float` on a scalar
73
-
74
-
Therefore, the Standard-compliant way to write the code above is:
46
+
The Dataframe API has a `DataFrame.maybe_evaluate` method for addressing the above. We can use it to rewrite the code above
47
+
as follows:
75
48
```python
76
49
df: DataFrame
77
50
df = df.may_execute()
@@ -82,6 +55,20 @@ for column_name in df.column_names:
82
55
return features
83
56
```
84
57
58
+
Note that `maybe_evaluate` is to be interpreted as a hint, rather than as a directive -
59
+
the implementation itself may decide
60
+
whether to force execution at this step, or whether to defer it to later.
61
+
For example, a dataframe which can convert to a lazy array could decide to ignore
62
+
`maybe_evaluate` when evaluating `DataFrame.to_array` but to respect it when evaluating
63
+
`float(Column.std())`.
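
The hint-versus-directive behaviour described above can be illustrated with a toy stand-in. This is only a minimal sketch: `ToyColumn` and `ToyScalar` are hypothetical classes invented for illustration, the "lazy array" is modelled as a plain callable, and none of this is taken from a real implementation of the Standard.

```python
import statistics
from typing import Callable, List


class ToyScalar:
    """Hypothetical lazy scalar: holds a thunk instead of a value."""

    def __init__(self, thunk: Callable[[], float], hinted: bool) -> None:
        self._thunk = thunk
        self._hinted = hinted

    def __float__(self) -> float:
        # This toy backend respects the hint here: without it, refuse to compute.
        if not self._hinted:
            raise RuntimeError("call maybe_evaluate() before float()")
        return float(self._thunk())


class ToyColumn:
    """Hypothetical lazy column; not a real dataframe library class."""

    def __init__(self, values: List[float], hinted: bool = False) -> None:
        self._values = list(values)
        self._hinted = hinted

    def maybe_evaluate(self) -> "ToyColumn":
        # Only records the hint; nothing is computed at this point.
        return ToyColumn(self._values, hinted=True)

    def to_array(self) -> Callable[[], List[float]]:
        # The hint is ignored: the "lazy array" (a thunk) is returned either way.
        return lambda: list(self._values)

    def std(self) -> ToyScalar:
        return ToyScalar(lambda: statistics.pstdev(self._values), self._hinted)


col = ToyColumn([1.0, 2.0, 3.0, 4.0])
lazy_array = col.to_array()                # works without the hint
value = float(col.maybe_evaluate().std())  # works only after the hint
```

In this sketch, `float(col.std())` without a prior `maybe_evaluate()` raises, while `to_array()` stays lazy regardless of whether the hint was given.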
64
+
65
+
Operations which require `DataFrame.may_execute` to have been called at some prior
66
+
point are:
67
+
- `DataFrame.to_array`
68
+
- `DataFrame.shape`
69
+
- `Column.to_array`
70
+
- calling `bool`, `int`, or `float` on a scalar
71
+
85
72
Note how `DataFrame.may_execute` is called only once, and as late as possible.
86
73
Conversely, the "wrong" way to execute the above would be: