### What changes were proposed in this pull request?
This PR proposes to remove the `squeeze` parameter from `read_csv` to follow the behavior of the latest pandas. See pandas-dev/pandas#40413 and pandas-dev/pandas#43427 for details.
This PR also enables more tests for pandas 2.0.0 and above.
### Why are the changes needed?
To follow the behavior of the latest pandas and to increase test coverage.
### Does this PR introduce _any_ user-facing change?
`squeeze` will no longer be available in `read_csv`. Otherwise, the change is test-only.
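
For callers that relied on `squeeze=True`, a possible migration is sketched below. It uses plain pandas, whose behavior this PR follows, and replaces the removed keyword with a call to `DataFrame.squeeze("columns")` on the result. The CSV content and the helper name `read_single_column` are hypothetical, and the sketch assumes the same pattern carries over to `ps.read_csv` in pandas API on Spark.

```python
import io

import pandas as pd

# Hypothetical single-column CSV, used only for illustration.
CSV_TEXT = "value\n1\n2\n3\n"


def read_single_column(text: str) -> pd.Series:
    """Read a one-column CSV and return it as a Series.

    Stands in for the removed ``read_csv(..., squeeze=True)`` idiom by
    squeezing the resulting DataFrame along the column axis.
    """
    df = pd.read_csv(io.StringIO(text))
    # Only the column axis is squeezed, so a one-row file still yields a Series.
    return df.squeeze("columns")


series = read_single_column(CSV_TEXT)
print(type(series).__name__, series.tolist())
```

Squeezing only along `"columns"` keeps the return type predictable: a file with a single row and a single column still produces a one-element Series and not a scalar.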
### How was this patch tested?
Enabling & updating the existing tests.
Closes #42551 from itholic/pandas_remaining_tests.
Authored-by: itholic <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
python/docs/source/migration_guide/pyspark_upgrade.rst
@@ -38,6 +38,7 @@ Upgrading from PySpark 3.5 to 4.0
* In Spark 4.0, ``sort_columns`` parameter from ``DataFrame.plot`` and ``Series.plot`` has been removed from pandas API on Spark.
* In Spark 4.0, the default value of ``regex`` parameter for ``Series.str.replace`` has been changed from ``True`` to ``False`` from pandas API on Spark. Additionally, a single character ``pat`` with ``regex=True`` is now treated as a regular expression instead of a string literal.
* In Spark 4.0, the resulting name from ``value_counts`` for all objects is set to ``'count'`` (or ``'proportion'`` if ``normalize=True`` was passed) from pandas API on Spark, and the index will be named after the original object.
+* In Spark 4.0, ``squeeze`` parameter from ``ps.read_csv`` and ``ps.read_excel`` has been removed from pandas API on Spark.