STY: Enable ruff ambiguous unicode character #54330
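The RUF001/RUF002/RUF003 rules this PR enables flag Unicode characters in strings, docstrings, and comments that render almost identically to common ASCII characters but are different code points. A minimal stdlib sketch of the look-alike pairs that show up in this diff (the pair list is illustrative, not ruff's full confusables table):

```python
import unicodedata

# (ambiguous character, ASCII look-alike) pairs seen in this PR's diff
pairs = [
    ("\u2019", "'"),  # RIGHT SINGLE QUOTATION MARK vs. APOSTROPHE
    ("\u00d7", "x"),  # MULTIPLICATION SIGN vs. LATIN SMALL LETTER X
    ("\uff13", "3"),  # FULLWIDTH DIGIT THREE vs. DIGIT THREE
]

for fancy, plain in pairs:
    # Visually similar to a reader, but distinct code points to the linter.
    print(f"{unicodedata.name(fancy)} != {unicodedata.name(plain)}")
    assert fancy != plain
```

Where a look-alike is intentional (test fixtures, the `×` in HTML dimension output), the PR suppresses the rule per line with `# noqa: RUF001`/`RUF003` rather than re-ignoring it globally.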


Merged
merged 9 commits on Aug 1, 2023
Changes from 6 commits
6 changes: 3 additions & 3 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
@@ -7134,7 +7134,7 @@ def value_counts(
ascending : bool, default False
Sort in ascending order.
dropna : bool, default True
Dont include counts of rows that contain NA values.
Don't include counts of rows that contain NA values.

.. versionadded:: 1.3.0

@@ -9971,7 +9971,7 @@ def map(
func : callable
Python function, returns a single value from a single value.
na_action : {None, 'ignore'}, default None
If ignore, propagate NaN values, without passing them to func.
If 'ignore', propagate NaN values, without passing them to func.
**kwargs
Additional keyword arguments to pass as keywords arguments to
`func`.
@@ -10057,7 +10057,7 @@ def applymap(
func : callable
Python function, returns a single value from a single value.
na_action : {None, 'ignore'}, default None
If ignore, propagate NaN values, without passing them to func.
If 'ignore', propagate NaN values, without passing them to func.
**kwargs
Additional keyword arguments to pass as keywords arguments to
`func`.
4 changes: 2 additions & 2 deletions pandas/core/generic.py
@@ -5633,7 +5633,7 @@ def filter(
Keep labels from axis for which "like in label == True".
regex : str (regular expression)
Keep labels from axis for which re.search(regex, label) == True.
axis : {0 or index, 1 or columns, None}, default None
axis : {0 or 'index', 1 or 'columns', None}, default None
The axis to filter on, expressed either as an index (int)
or axis name (str). By default this is the info axis, 'columns' for
DataFrame. For `Series` this parameter is unused and defaults to `None`.
@@ -5922,7 +5922,7 @@ def sample(

np.random.Generator objects now accepted

axis : {0 or index, 1 or columns, None}, default None
axis : {0 or 'index', 1 or 'columns', None}, default None
Axis to sample. Accepts axis number or name. Default is stat axis
for given data type. For `Series` this parameter is unused and defaults to `None`.
ignore_index : bool, default False
2 changes: 1 addition & 1 deletion pandas/core/groupby/generic.py
@@ -2322,7 +2322,7 @@ def value_counts(
ascending : bool, default False
Sort in ascending order.
dropna : bool, default True
Dont include counts of rows that contain NA values.
Don't include counts of rows that contain NA values.

Returns
-------
2 changes: 1 addition & 1 deletion pandas/core/indexes/datetimes.py
@@ -177,7 +177,7 @@ class DatetimeIndex(DatetimeTimedeltaMixin):
yearfirst : bool, default False
If True parse dates in `data` with the year first order.
dtype : numpy.dtype or DatetimeTZDtype or str, default None
Note that the only NumPy dtype allowed is datetime64[ns].
Note that the only NumPy dtype allowed is `datetime64[ns]`.
copy : bool, default False
Make a copy of input ndarray.
name : label, default None
2 changes: 1 addition & 1 deletion pandas/core/reshape/merge.py
@@ -2336,7 +2336,7 @@ def _factorize_keys(
sort : bool, defaults to True
If True, the encoding is done such that the unique elements in the
keys are sorted.
how : {left’, ‘right’, ‘outer’, ‘inner}, default inner
how : {'left', 'right', 'outer', 'inner'}, default 'inner'
Type of merge.

Returns
2 changes: 1 addition & 1 deletion pandas/core/tools/datetimes.py
@@ -916,7 +916,7 @@ def to_datetime(
- **DataFrame/dict-like** are converted to :class:`Series` with
:class:`datetime64` dtype. For each row a datetime is created from assembling
the various dataframe columns. Column keys can be common abbreviations
like [year’, ‘month’, ‘day’, ‘minute’, ‘second’, ‘ms’, ‘us’, ‘ns’]) or
like ['year', 'month', 'day', 'minute', 'second', 'ms', 'us', 'ns']) or
plurals of the same.

The following causes are responsible for :class:`datetime.datetime` objects
4 changes: 2 additions & 2 deletions pandas/io/excel/_base.py
@@ -1268,14 +1268,14 @@ def __init__(
@property
def date_format(self) -> str:
"""
Format string for dates written into Excel files (e.g. YYYY-MM-DD).
Format string for dates written into Excel files (e.g. 'YYYY-MM-DD').
"""
return self._date_format

@property
def datetime_format(self) -> str:
"""
Format string for dates written into Excel files (e.g. YYYY-MM-DD).
Format string for dates written into Excel files (e.g. 'YYYY-MM-DD').
"""
return self._datetime_format

2 changes: 1 addition & 1 deletion pandas/io/formats/html.py
@@ -94,7 +94,7 @@ def render(self) -> list[str]:
self._write_table()

if self.should_show_dimensions:
by = chr(215) # ×
by = chr(215) # × # noqa: RUF003
self.write(
f"<p>{len(self.frame)} rows {by} {len(self.frame.columns)} columns</p>"
)
2 changes: 1 addition & 1 deletion pandas/io/formats/style.py
@@ -3489,7 +3489,7 @@ def highlight_quantile(
Left bound, in [0, q_right), for the target quantile range.
q_right : float, default 1
Right bound, in (q_left, 1], for the target quantile range.
interpolation : {linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest}
interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}
Argument passed to ``Series.quantile`` or ``DataFrame.quantile`` for
quantile estimation.
inclusive : {'both', 'neither', 'left', 'right'}
6 changes: 3 additions & 3 deletions pandas/io/sql.py
@@ -441,7 +441,7 @@ def read_sql_query(
rows to include in each chunk.
dtype : Type name or dict of columns
Data type for data or columns. E.g. np.float64 or
{‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64}.
{'a': np.float64, 'b': np.int32, 'c': 'Int64'}.

.. versionadded:: 1.3.0
dtype_backend : {'numpy_nullable', 'pyarrow'}, default 'numpy_nullable'
@@ -597,7 +597,7 @@ def read_sql(
.. versionadded:: 2.0
dtype : Type name or dict of columns
Data type for data or columns. E.g. np.float64 or
{‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64}.
{'a': np.float64, 'b': np.int32, 'c': 'Int64'}.
The argument is ignored if a table is passed instead of a query.

.. versionadded:: 2.0.0
@@ -1759,7 +1759,7 @@ def read_query(
of rows to include in each chunk.
dtype : Type name or dict of columns
Data type for data or columns. E.g. np.float64 or
{‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64}
{'a': np.float64, 'b': np.int32, 'c': 'Int64'}

.. versionadded:: 1.3.0

2 changes: 1 addition & 1 deletion pandas/io/stata.py
@@ -3768,7 +3768,7 @@ def _validate_variable_name(self, name: str) -> str:
and c != "_"
)
or 128 <= ord(c) < 192
or c in {"×", "÷"}
or c in {"×", "÷"} # noqa: RUF001
):
name = name.replace(c, "_")

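The hunk above sits inside the Stata writer's variable-name validation loop, which maps disallowed characters (including the ambiguous `×` and `÷` that now need a noqa) to underscores. A simplified, self-contained sketch of that rule — the exact allowed set in pandas also special-cases bytes 128–191, so treat this as an illustration, not the real `_validate_variable_name`:

```python
def sanitize_stata_name(name: str) -> str:
    """Replace characters not valid in a Stata variable name with '_'.

    Simplified rule: keep ASCII alphanumerics and underscores,
    replace everything else (spaces, ×, ÷, accented letters, ...).
    """
    out = []
    for c in name:
        if (c.isalnum() and c.isascii()) or c == "_":
            out.append(c)
        else:
            out.append("_")
    return "".join(out)

print(sanitize_stata_name("price×qty"))  # -> price_qty
```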
2 changes: 1 addition & 1 deletion pandas/tests/extension/test_arrow.py
@@ -2104,7 +2104,7 @@ def test_str_slice_replace(start, stop, repl, exp):
["!|,", "isalnum", False],
["aaa", "isalpha", True],
["!!!", "isalpha", False],
["٠", "isdecimal", True],
["٠", "isdecimal", True], # noqa: RUF001
["~!", "isdecimal", False],
["2", "isdigit", True],
["~", "isdigit", False],
2 changes: 1 addition & 1 deletion pandas/tests/frame/methods/test_to_csv.py
@@ -957,7 +957,7 @@ def test_to_csv_path_is_none(self, float_frame):
(DataFrame([["abc", "def", "ghi"]], columns=["X", "Y", "Z"]), "ascii"),
(DataFrame(5 * [[123, "你好", "世界"]], columns=["X", "Y", "Z"]), "gb2312"),
(
DataFrame(5 * [[123, "Γειά σου", "Κόσμε"]], columns=["X", "Y", "Z"]),
Member: noqa here instead?

Member Author: Good call. Fixed

DataFrame(5 * [[123, "Γειά oou", "Κόσμε"]], columns=["X", "Y", "Z"]),
"cp737",
),
],
2 changes: 1 addition & 1 deletion pandas/tests/groupby/test_groupby.py
@@ -1402,7 +1402,7 @@ def test_groupby_dtype_inference_empty():


def test_groupby_unit64_float_conversion():
#  GH: 30859 groupby converts unit64 to floats sometimes
# GH: 30859 groupby converts unit64 to floats sometimes
df = DataFrame({"first": [1], "second": [1], "value": [16148277970000000000]})
result = df.groupby(["first", "second"])["value"].max()
expected = Series(
2 changes: 1 addition & 1 deletion pandas/tests/io/json/test_json_table_schema.py
@@ -251,7 +251,7 @@ def test_read_json_from_to_json_results(self):
"recommender_id": {"row_0": 3},
"recommender_name_jp": {"row_0": "浦田"},
"recommender_name_en": {"row_0": "Urata"},
"name_jp": {"row_0": "博多人形（松尾吉将まつお よしまさ）"},
"name_jp": {"row_0": "博多人形(松尾吉将まつお よしまさ)"},
"name_en": {"row_0": "Hakata Dolls Matsuo"},
}
)
4 changes: 2 additions & 2 deletions pandas/tests/io/parser/test_encoding.py
@@ -223,12 +223,12 @@ def test_encoding_named_temp_file(all_parsers):
def test_parse_encoded_special_characters(encoding):
# GH16218 Verify parsing of data with encoded special characters
# Data contains a Unicode 'FULLWIDTH COLON' (U+FF1A) at position (0,"a")
data = "a\tb\n：foo\t0\nbar\t1\nbaz\t2"
data = "a\tb\n：foo\t0\nbar\t1\nbaz\t2" # noqa: RUF001
encoded_data = BytesIO(data.encode(encoding))
result = read_csv(encoded_data, delimiter="\t", encoding=encoding)

expected = DataFrame(
data=[["：foo", 0], ["bar", 1], ["baz", 2]],
data=[["：foo", 0], ["bar", 1], ["baz", 2]], # noqa: RUF001
columns=["a", "b"],
)
tm.assert_frame_equal(result, expected)
3 changes: 2 additions & 1 deletion pandas/tests/io/parser/test_read_fwf.py
@@ -190,7 +190,8 @@ def test_read_csv_compat():


def test_bytes_io_input():
result = read_fwf(BytesIO("שלום\nשלום".encode()), widths=[2, 2], encoding="utf8")
data = BytesIO("שלום\nשלום".encode())
result = read_fwf(data, widths=[2, 2], encoding="utf8")
expected = DataFrame([["של", "ום"]], columns=["של", "ום"])
tm.assert_frame_equal(result, expected)

6 changes: 3 additions & 3 deletions pandas/tests/io/test_clipboard.py
@@ -62,9 +62,9 @@ def df(request):
data_type = request.param

if data_type == "delims":
return DataFrame({"a": ['"a,\t"b|c', "d\tef´"], "b": ["hi'j", "k''lm"]})
return DataFrame({"a": ['"a,\t"b|c', "d\tef`"], "b": ["hi'j", "k''lm"]})
elif data_type == "utf8":
return DataFrame({"a": ["µasd", "Ωœ∑´"], "b": ["øπ∆˚¬", "œ∑´®"]})
return DataFrame({"a": ["µasd", "Ωœ∑`"], "b": ["øπ∆˚¬", "œ∑`®"]})
elif data_type == "utf16":
return DataFrame(
{"a": ["\U0001f44d\U0001f44d", "\U0001f44d\U0001f44d"], "b": ["abc", "def"]}
@@ -402,7 +402,7 @@ def test_round_trip_valid_encodings(self, enc, df):
self.check_round_trip_frame(df, encoding=enc)

@pytest.mark.single_cpu
@pytest.mark.parametrize("data", ["\U0001f44d...", "Ωœ∑´...", "abcd..."])
@pytest.mark.parametrize("data", ["\U0001f44d...", "Ωœ∑`...", "abcd..."])
@pytest.mark.xfail(
(os.environ.get("DISPLAY") is None and not is_platform_mac())
or is_ci_environment(),
2 changes: 1 addition & 1 deletion pandas/tests/io/test_stata.py
@@ -286,7 +286,7 @@ def test_read_dta18(self, datapath):
["Cat", "Bogota", "Bogotá", 1, 1.0, "option b Ünicode", 1.0],
["Dog", "Boston", "Uzunköprü", np.nan, np.nan, np.nan, np.nan],
["Plane", "Rome", "Tromsø", 0, 0.0, "option a", 0.0],
["Potato", "Tokyo", "Elâzığ", -4, 4.0, 4, 4],
["Potato", "Tokyo", "Elâzığ", -4, 4.0, 4, 4], # noqa: RUF001
["", "", "", 0, 0.3332999, "option a", 1 / 3.0],
],
columns=[
2 changes: 1 addition & 1 deletion pandas/tests/series/methods/test_to_csv.py
@@ -122,7 +122,7 @@ def test_to_csv_path_is_none(self):
# GH 21241, 21118
(Series(["abc", "def", "ghi"], name="X"), "ascii"),
(Series(["123", "你好", "世界"], name="中文"), "gb2312"),
(Series(["123", "Γειά σου", "Κόσμε"], name="Ελληνικά"), "cp737"),
Member: Ditto

Member Author: Yeah good call. Fixed

(Series(["123", "Γειά oou", "Κόσμε"], name="Ελληνικά"), "cp737"),
],
)
def test_to_csv_compression(self, s, encoding, compression):
14 changes: 7 additions & 7 deletions pandas/tests/strings/test_strings.py
@@ -226,8 +226,8 @@ def test_isnumeric_unicode(method, expected, any_string_dtype):
# 0x00bc: ¼ VULGAR FRACTION ONE QUARTER
# 0x2605: ★ not number
# 0x1378: ፸ ETHIOPIC NUMBER SEVENTY
# 0xFF13: ３ Em 3
ser = Series(["A", "3", "¼", "★", "፸", "３", "four"], dtype=any_string_dtype)
# 0xFF13: ３ Em 3 # noqa: RUF003
ser = Series(["A", "3", "¼", "★", "፸", "3", "four"], dtype=any_string_dtype)
Member: Seems like the comment above is indicating this one shouldn't be changed?

Member: Same remark for the rest of the changes in this file.

Member Author: Good catch. Added noqas here

expected_dtype = "bool" if any_string_dtype == "object" else "boolean"
expected = Series(expected, dtype=expected_dtype)
result = getattr(ser.str, method)()
@@ -246,7 +246,7 @@ def test_isnumeric_unicode(method, expected, any_string_dtype):
],
)
def test_isnumeric_unicode_missing(method, expected, any_string_dtype):
values = ["A", np.nan, "¼", "★", np.nan, "３", "four"]
values = ["A", np.nan, "¼", "★", np.nan, "3", "four"]
ser = Series(values, dtype=any_string_dtype)
expected_dtype = "object" if any_string_dtype == "object" else "boolean"
expected = Series(expected, dtype=expected_dtype)
@@ -564,12 +564,12 @@ def test_decode_errors_kwarg():
"form, expected",
[
("NFKC", ["ABC", "ABC", "123", np.nan, "アイエ"]),
("NFC", ["ABC", "ＡＢＣ", "123", np.nan, "ｱｲｴ"]),
("NFC", ["ABC", "ABC", "123", np.nan, "アイエ"]),
],
)
def test_normalize(form, expected, any_string_dtype):
ser = Series(
["ABC", "ＡＢＣ", "123", np.nan, "ｱｲｴ"],
["ABC", "ABC", "123", np.nan, "アイエ"],
index=["a", "b", "c", "d", "e"],
dtype=any_string_dtype,
)
@@ -580,7 +580,7 @@ def test_normalize(form, expected, any_string_dtype):

def test_normalize_bad_arg_raises(any_string_dtype):
ser = Series(
["ABC", "ＡＢＣ", "123", np.nan, "ｱｲｴ"],
["ABC", "ABC", "123", np.nan, "アイエ"],
index=["a", "b", "c", "d", "e"],
dtype=any_string_dtype,
)
@@ -589,7 +589,7 @@ def test_normalize_bad_arg_raises(any_string_dtype):


def test_normalize_index():
idx = Index(["ＡＢＣ", "１２３", "ｱｲｴ"])
idx = Index(["ABC", "123", "アイエ"])
expected = Index(["ABC", "123", "アイエ"])
result = idx.str.normalize("NFKC")
tm.assert_index_equal(result, expected)
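The behavior these normalization tests depend on can be checked directly with the stdlib, independent of pandas: NFKC folds compatibility characters such as fullwidth Latin letters and halfwidth katakana into their canonical forms, while NFC performs only canonical composition and leaves them untouched — which is why the fixtures must keep the ambiguous fullwidth/halfwidth characters:

```python
import unicodedata

fullwidth = "ＡＢＣ"       # FULLWIDTH LATIN CAPITAL LETTERS A, B, C
halfwidth_kana = "ｱｲｴ"    # HALFWIDTH KATAKANA LETTERS A, I, E

# NFKC maps compatibility characters to their canonical equivalents.
print(unicodedata.normalize("NFKC", fullwidth))       # -> ABC
print(unicodedata.normalize("NFKC", halfwidth_kana))  # -> アイエ

# NFC does not touch compatibility characters.
assert unicodedata.normalize("NFC", fullwidth) == fullwidth
assert unicodedata.normalize("NFC", halfwidth_kana) == halfwidth_kana
```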
2 changes: 1 addition & 1 deletion pandas/tseries/holiday.py
@@ -570,7 +570,7 @@ def merge(self, other, inplace: bool = False):
offset=DateOffset(weekday=MO(3)),
)
USPresidentsDay = Holiday(
"Washingtons Birthday", month=2, day=1, offset=DateOffset(weekday=MO(3))
"Washington's Birthday", month=2, day=1, offset=DateOffset(weekday=MO(3))
)
GoodFriday = Holiday("Good Friday", month=1, day=1, offset=[Easter(), Day(-2)])

6 changes: 0 additions & 6 deletions pyproject.toml
@@ -326,12 +326,6 @@ ignore = [
"PLR0124",
# Consider `elif` instead of `else` then `if` to remove indentation level
"PLR5501",
# ambiguous-unicode-character-string
"RUF001",
# ambiguous-unicode-character-docstring
"RUF002",
# ambiguous-unicode-character-comment
"RUF003",
# collection-literal-concatenation
"RUF005",
# pairwise-over-zipped (>=PY310 only)
4 changes: 2 additions & 2 deletions web/pandas/about/governance.md
@@ -128,7 +128,7 @@ In particular, the Core Team may:
and merging pull requests.
- Make decisions about the Services that are run by The Project and manage
those Services for the benefit of the Project and Community.
- Make decisions when regular community discussion doesnt produce consensus
- Make decisions when regular community discussion doesn't produce consensus
on an issue in a reasonable time frame.

### Core Team membership
@@ -157,7 +157,7 @@ they will be considered for removal from the Core Team. Before removal,
inactive Member will be approached by the BDFL to see if they plan on returning
to active participation. If not they will be removed immediately upon a Core
Team vote. If they plan on returning to active participation soon, they will be
given a grace period of one year. If they dont return to active participation
given a grace period of one year. If they don't return to active participation
within that time period they will be removed by vote of the Core Team without
further grace period. All former Core Team members can be considered for
membership again at any time in the future, like any other Project Contributor.
2 changes: 1 addition & 1 deletion web/pandas/community/coc.md
@@ -21,7 +21,7 @@ Examples of unacceptable behavior by participants include:
* Other unethical or unprofessional conduct

Furthermore, we encourage inclusive behavior - for example,
please dont say “hey guys!” but “hey everyone!”.
please don't say “hey guys!” but “hey everyone!”.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
Expand Down