Commit 2220aa4

itholic authored and ragnarok56 committed
[SPARK-44841][FOLLOWUP] Add migration guide for the behavior change
### What changes were proposed in this pull request?

This PR is a follow-up to apache#42525.

### Why are the changes needed?

To fill in the missing migration guide entry for the behavior change.

### Does this PR introduce _any_ user-facing change?

No, this is a migration guide update.

### How was this patch tested?

The existing CI should pass.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#42578 from itholic/SPARK-44841-followup.

Authored-by: itholic <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
1 parent ebcb3de commit 2220aa4

1 file changed

+1
-0
lines changed

python/docs/source/migration_guide/pyspark_upgrade.rst

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ Upgrading from PySpark 3.5 to 4.0
* In Spark 4.0, the various datetime attributes of ``DatetimeIndex`` (``day``, ``month``, ``year`` etc.) are now ``int32`` instead of ``int64`` from pandas API on Spark.
* In Spark 4.0, the ``sort_columns`` parameter of ``DataFrame.plot`` and ``Series.plot`` has been removed from pandas API on Spark.
* In Spark 4.0, the default value of ``regex`` parameter for ``Series.str.replace`` has been changed from ``True`` to ``False`` from pandas API on Spark. Additionally, a single character ``pat`` with ``regex=True`` is now treated as a regular expression instead of a string literal.
+ * In Spark 4.0, the result of ``value_counts`` is named ``'count'`` for all objects (or ``'proportion'`` if ``normalize=True`` was passed) in pandas API on Spark, and the index is named after the original object.


Upgrading from PySpark 3.3 to 3.4
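
The ``Series.str.replace`` entry in the hunk above distinguishes a literal match (``regex=False``, the new default) from a regular-expression match (``regex=True``). The sketch below is only a plain-Python analogy using the standard library, not pandas API on Spark itself; the helper names ``replace_literal`` and ``replace_regex`` are hypothetical and exist purely to show why a single-character ``pat`` such as ``"."`` can give very different results under the two modes.

```python
import re


def replace_literal(text: str, pat: str, repl: str) -> str:
    """Analogue of regex=False: `pat` is matched as a plain string."""
    return text.replace(pat, repl)


def replace_regex(text: str, pat: str, repl: str) -> str:
    """Analogue of regex=True: `pat` is compiled as a regular expression."""
    return re.sub(pat, repl, text)


sample = "a.b.c"

# Literal mode: only the two dot characters are replaced.
literal_result = replace_literal(sample, ".", "-")

# Regex mode: "." matches any character, so every character is replaced.
regex_result = replace_regex(sample, ".", "-")

print(literal_result)
print(regex_result)
```

Code that relied on the old default is probably safest when it passes ``regex`` explicitly, so the behavior does not depend on the Spark version in use.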
