Commit 438ff01

zhengruifeng authored and ragnarok56 committed
[SPARK-44889][PYTHON][CONNECT] Fix docstring of monotonically_increasing_id
### What changes were proposed in this pull request?
Fix the docstring of `monotonically_increasing_id`.

### Why are the changes needed?
1. Use `from pyspark.sql import functions as F` to avoid an implicit wildcard import.
2. Use DataFrame APIs instead of RDD APIs, so the docstring can be reused in Connect.

After this fix, all docstrings are reused between vanilla PySpark and the Spark Connect Python Client.

### Does this PR introduce _any_ user-facing change?
Yes.

### How was this patch tested?
CI.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#42582 from zhengruifeng/fix_monotonically_increasing_id_docstring.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
1 parent 5a0a7c0 commit 438ff01

File tree

2 files changed: +16 −6 lines changed


python/pyspark/sql/connect/functions.py

-3
@@ -3901,9 +3901,6 @@ def _test() -> None:
     globs = pyspark.sql.connect.functions.__dict__.copy()
 
-    # Spark Connect does not support Spark Context but the test depends on that.
-    del pyspark.sql.connect.functions.monotonically_increasing_id.__doc__
-
     globs["spark"] = (
         PySparkSession.builder.appName("sql.connect.functions tests")
         .remote("local[4]")

python/pyspark/sql/functions.py

+16 −3
@@ -4312,9 +4312,22 @@ def monotonically_increasing_id() -> Column:
43124312
43134313
Examples
43144314
--------
4315-
>>> df0 = sc.parallelize(range(2), 2).mapPartitions(lambda x: [(1,), (2,), (3,)]).toDF(['col1'])
4316-
>>> df0.select(monotonically_increasing_id().alias('id')).collect()
4317-
[Row(id=0), Row(id=1), Row(id=2), Row(id=8589934592), Row(id=8589934593), Row(id=8589934594)]
4315+
>>> from pyspark.sql import functions as F
4316+
>>> spark.range(0, 10, 1, 2).select(F.monotonically_increasing_id()).show()
4317+
+-----------------------------+
4318+
|monotonically_increasing_id()|
4319+
+-----------------------------+
4320+
| 0|
4321+
| 1|
4322+
| 2|
4323+
| 3|
4324+
| 4|
4325+
| 8589934592|
4326+
| 8589934593|
4327+
| 8589934594|
4328+
| 8589934595|
4329+
| 8589934596|
4330+
+-----------------------------+
43184331
"""
43194332
return _invoke_function("monotonically_increasing_id")
43204333
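The jump from 4 to 8589934592 in the new docstring example comes from how the IDs are encoded: per Spark's documentation, the partition index goes in the upper 31 bits and a per-partition record counter in the lower 33 bits, so the first row of partition 1 gets ID 2**33 = 8589934592. A minimal pure-Python sketch of that encoding (an illustration only, not Spark's actual implementation; `simulated_ids` is a hypothetical helper assuming rows are split evenly across partitions):

```python
def simulated_ids(num_rows: int, num_partitions: int) -> list[int]:
    """Simulate monotonically_increasing_id() for evenly partitioned rows.

    Each ID is (partition_index << 33) + record_offset_within_partition,
    matching the documented bit layout of the Spark function.
    """
    rows_per_partition = num_rows // num_partitions
    ids = []
    for partition in range(num_partitions):
        for offset in range(rows_per_partition):
            ids.append((partition << 33) + offset)
    return ids

# Mirrors spark.range(0, 10, 1, 2) from the docstring: 10 rows, 2 partitions.
print(simulated_ids(10, 2))
# → [0, 1, 2, 3, 4, 8589934592, 8589934593, 8589934594, 8589934595, 8589934596]
```

The IDs are guaranteed unique and monotonically increasing within each partition, but not consecutive across partitions, which is exactly what the table in the docstring shows.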
