[SPARK-44841][FOLLOWUP] Add migration guide for the behavior change

itholic · ragnarok56 · commit 2220aa49b838 · 2024-03-02T15:14:07.000-05:00
### What changes were proposed in this pull request?

This PR is a follow-up to apache#42525.

### Why are the changes needed?

To add the missing migration guide entry for the behavior change.

### Does this PR introduce _any_ user-facing change?

No, this is a migration guide update.

### How was this patch tested?

The existing CI should pass.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#42578 from itholic/SPARK-44841-followup.

Authored-by: itholic <haejoon.lee@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
diff --git a/python/docs/source/migration_guide/pyspark_upgrade.rst b/python/docs/source/migration_guide/pyspark_upgrade.rst
@@ -37,6 +37,7 @@ Upgrading from PySpark 3.5 to 4.0
 * In Spark 4.0, the various datetime attributes of ``DatetimeIndex`` (``day``, ``month``, ``year`` etc.) are now ``int32`` instead of ``int64`` from pandas API on Spark.
 * In Spark 4.0, ``sort_columns`` parameter from ``DataFrame.plot`` and ``Series.plot`` has been removed from pandas API on Spark.
 * In Spark 4.0, the default value of ``regex`` parameter for ``Series.str.replace`` has been changed from ``True`` to ``False`` from pandas API on Spark. Additionally, a single character ``pat`` with ``regex=True`` is now treated as a regular expression instead of a string literal.
+* In Spark 4.0, the name of the result returned by ``value_counts`` for all objects is set to ``'count'`` (or ``'proportion'`` if ``normalize=True`` was passed) from pandas API on Spark, and the index is named after the original object.
 
 
 Upgrading from PySpark 3.3 to 3.4
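
Note on the ``value_counts`` entry added above: the naming rule can be illustrated with a small pure-Python sketch. This is not pyspark code. ``CountsResult`` and ``value_counts_sketch`` are hypothetical names introduced only for illustration, and the ``spark4=False`` branch assumes the pre-4.0 behavior matched older pandas (result named after the original object, index unnamed), which the guide entry itself does not state.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, Optional


@dataclass
class CountsResult:
    """Hypothetical stand-in for the Series returned by value_counts."""
    name: Optional[str]        # name of the resulting Series
    index_name: Optional[str]  # name of the resulting index
    values: Dict[Hashable, float]


def value_counts_sketch(
    data: Iterable[Hashable],
    original_name: Optional[str],
    normalize: bool = False,
    spark4: bool = True,
) -> CountsResult:
    """Model only the naming rule described in the migration guide entry."""
    counts = Counter(data)
    total = sum(counts.values())
    values = {k: (v / total if normalize else v) for k, v in counts.items()}

    if spark4:
        # Spark 4.0: fixed result name, index takes the original object's name.
        result_name = "proportion" if normalize else "count"
        return CountsResult(result_name, original_name, values)

    # Assumed pre-4.0 behavior: result keeps the original name, index is unnamed.
    return CountsResult(original_name, None, values)


new = value_counts_sketch(["a", "b", "a"], original_name="x")
old = value_counts_sketch(["a", "b", "a"], original_name="x", spark4=False)
print(new.name, new.index_name)  # count x
print(old.name, old.index_name)  # x None
```

Code that relied on the result carrying the original object's name (for example, when resetting the index or joining on the result) would likely need to refer to ``'count'`` or ``'proportion'`` after upgrading.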