Skip to content

Introduce temporary named expressions for match subjects #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

arvi18
Copy link

@arvi18 arvi18 commented Apr 23, 2025

Fixes python#18440. Fixes python#17230.

python#16503 did this specifically for CallExpr, but that isn't the only kind of such statements. I propose to expand this for more general expressions and believe that a blacklist is more reasonable here: we do not want to introduce a temporary name only for certain expressions that either are already named or can be used to infer contained variables (inline tuple/list/dict/set literals).

Writing logic to generate a name for every other kind of expression would be quite cumbersome - I circumvent this by using a simple counter to generate unique names on demand.

Summary by CodeRabbit

  • New Features
    • Improved pattern matching support for complex expressions, including function calls, attribute access, operations, and dictionary item access, with more accurate type narrowing and detection of unreachable cases.
  • Bug Fixes
    • Updated handling of match statements to ensure consistent naming and type inference for complex match subjects.
  • Tests
    • Added new test cases for pattern matching with various expression types and unreachable case detection.
    • Renamed an existing test case to reflect that it now passes.

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

Whoa, I expected this to produce at least a couple of unused ignore diagnostics, but apparently pattern matching isn't very popular in wild...

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

Diff from mypy_primer, showing the effect of this PR on open source code:

schemathesis (https://github.com/schemathesis/schemathesis)
-   File "/tmp/mypy_primer/old_mypy/venv/bin/mypy", line 8, in <module>
+   File "/tmp/mypy_primer/new_mypy/venv/bin/mypy", line 8, in <module>
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/__main__.py", line 15, in console_entry
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/__main__.py", line 15, in console_entry
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 119, in main
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 119, in main
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 203, in run_build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 203, in run_build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 191, in build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 191, in build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 267, in _build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 267, in _build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 2947, in dispatch
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 2947, in dispatch
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 976, in write_deps_cache
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 976, in write_deps_cache

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

Meh, the mypyc failure is unrelated and didn't happen during first run, likely a glitch.

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

Diff from mypy_primer, showing the effect of this PR on open source code:

schemathesis (https://github.com/schemathesis/schemathesis)
-   File "/tmp/mypy_primer/old_mypy/venv/bin/mypy", line 8, in <module>
+   File "/tmp/mypy_primer/new_mypy/venv/bin/mypy", line 8, in <module>
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/__main__.py", line 15, in console_entry
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/__main__.py", line 15, in console_entry
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 119, in main
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 119, in main
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 203, in run_build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/main.py", line 203, in run_build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 191, in build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 191, in build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 267, in _build
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 267, in _build
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 2947, in dispatch
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 2947, in dispatch
-   File "/tmp/mypy_primer/old_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 976, in write_deps_cache
+   File "/tmp/mypy_primer/new_mypy/venv/lib/python3.12/site-packages/mypy/build.py", line 976, in write_deps_cache

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Copy link

coderabbitai bot commented Apr 23, 2025

Warning

Rate limit exceeded

@visz11 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 25 minutes and 7 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 380cb8d and ded98cb.

📒 Files selected for processing (2)
  • mypy/checker.py (7 hunks)
  • test-data/unit/check-python310.test (2 hunks)

Walkthrough

The changes introduce a new helper method in the TypeChecker class to consistently generate a named expression for complex or unnamed match statement subjects. This improves type inference and narrowing in match statements by ensuring that all complex subjects are assigned a unique dummy variable if they are not already named or literal expressions. The logic for handling match statements is refactored to use this helper, and a mechanism for generating unique dummy names is added. Additionally, the test suite is expanded with new cases to verify pattern matching behavior with various expressions and types, and a previously expected-failing test is now expected to pass.

Changes

File(s) Change Summary
mypy/checker.py Added _make_named_statement_for_match helper to TypeChecker for generating named expressions for match subjects; introduced _unique_id attribute and new_unique_dummy_name method for unique dummy names; updated visit_match_stmt to use new helper; improved type map propagation for match statements; imported and recognized ComplexExpr as a literal expression.
test-data/unit/check-python310.test Renamed [case testMatchCapturePatternFromAsyncFunctionReturningUnion-xfail] to remove xfail marker; added new test cases for match statements with function calls, attributes, operations, dictionary item access, and literal values to verify type narrowing and exhaustiveness.

Sequence Diagram(s)

sequenceDiagram
    participant UserCode as User Code
    participant TypeChecker
    participant AST as AST Nodes

    UserCode->>TypeChecker: Defines a match statement with a complex subject
    TypeChecker->>AST: Inspects match subject expression
    TypeChecker->>TypeChecker: Calls _make_named_statement_for_match
    alt Subject is named/literal/ComplexExpr
        TypeChecker->>TypeChecker: Use subject as-is
    else
        TypeChecker->>TypeChecker: Generate unique dummy name
        TypeChecker->>AST: Create NameExpr for dummy
    end
    TypeChecker->>TypeChecker: Use named subject for type narrowing
    TypeChecker->>UserCode: Performs type checking and narrowing in match cases
Loading

Poem

In the warren of types, a match did appear,
With subjects complex, once fuzzy, now clear.
A helper hops in, naming each case,
Ensuring type narrowing keeps a swift pace.
Tests sprout like carrots, all neat in a row—
The checker’s new wisdom is starting to show!
🥕🐇


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

This interferes with recently added narrowing by item/attribute (aka discriminated union, testMatchNarrowingUnionClassViaAttribute). So a simple solution like this won't fly. This information for index/attr should be stored in some other way to support narrowing of attr/item and the object itself at the same time. This may still work for other, less informative nodes (await?)

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

To support narrowing the subject and its "dependencies" simultaneously, I propose to just push the narrowed type of subject to typemap as the type of its original form, allowing for any inference on it.

@arvi18
Copy link
Author

arvi18 commented Apr 23, 2025

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 380cb8d and ded98cb.

📒 Files selected for processing (2)
  • mypy/checker.py (7 hunks)
  • test-data/unit/check-python310.test (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
mypy/checker.py (1)
mypy/nodes.py (23)
  • ComplexExpr (1783-1797)
  • MatchStmt (1640-1667)
  • Expression (201-207)
  • NameExpr (1875-1895)
  • AssignmentExpr (2074-2087)
  • IntExpr (1696-1710)
  • StrExpr (1719-1733)
  • BytesExpr (1740-1763)
  • FloatExpr (1766-1780)
  • EllipsisExpr (1800-1806)
  • name (234-235)
  • name (353-354)
  • name (532-533)
  • name (571-577)
  • name (809-810)
  • name (907-908)
  • name (1068-1069)
  • name (2287-2288)
  • name (2572-2573)
  • name (3205-3207)
  • name (3752-3753)
  • name (3864-3865)
  • Var (973-1118)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Type check our own code (py39-ubuntu)
  • GitHub Check: Type check our own code (py39-windows-64)
  • GitHub Check: Formatting and code style with Black + ruff
  • GitHub Check: mypyc runtime tests with py39-macos
  • GitHub Check: Test suite with py313-ubuntu, mypyc-compiled
  • GitHub Check: Test suite with py311-ubuntu, mypyc-compiled
  • GitHub Check: Test suite with py310-ubuntu
  • GitHub Check: Test suite with py312-ubuntu, mypyc-compiled
  • GitHub Check: Test suite with py39-windows-64
  • GitHub Check: Test suite with py39-ubuntu, mypyc-compiled
  • GitHub Check: Test mypyc suite with 32-bit Python
  • GitHub Check: Run mypy_primer (3)
  • GitHub Check: Run mypy_primer (4)
  • GitHub Check: Run mypy_primer (1)
  • GitHub Check: Run mypy_primer (2)
  • GitHub Check: Run mypy_primer (0)
🔇 Additional comments (7)
mypy/checker.py (1)

70-70: Import looks good.

Adding ComplexExpr to the import list is necessary for the new helper and is correctly placed alongside the other node classes.

test-data/unit/check-python310.test (6)

1305-1319: Test case now expected to pass rather than fail

This change removes the -xfail marker from the testMatchCapturePatternFromAsyncFunctionReturningUnion test case, indicating that it's now expected to pass. This is consistent with the PR's goal of improving match subject handling for complex expressions like the result of an async function call.


2589-2601: Good test coverage for matching on function call results

This new test case verifies that pattern matching on function call expressions works correctly, with proper type narrowing and unreachable path detection. The test confirms that the type checker can accurately track the types for each match case and detect when a case is unreachable.


2602-2615: Good test coverage for matching on attribute access

This test case validates pattern matching on attribute access expressions. The test ensures proper type narrowing when matching on a class attribute that has a union type, and correctly identifies unreachable code paths.


2616-2643: Comprehensive tests for matching on unary and binary operations

These test cases verify pattern matching functionality for unary operations (like negation) and binary operations (addition and comparison). They test both literal matching and type-based matching scenarios, confirming that the type checker correctly narrows types and identifies unreachable branches.


2644-2659: Good test for matching on dictionary item access

This test case ensures that pattern matching works correctly when the subject is a dictionary item access expression. It properly tests type narrowing for a union typed value and identifies unreachable code paths.


2660-2671: Good pathological case test for literal value matching

This test verifies a corner case where matching on a literal value works correctly. It ensures the type checker properly handles matching a literal value against itself and identifies subsequent unreachable branches.

Comment on lines +354 to +356
# A helper state to produce unique temporary names on demand.
_unique_id: int

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

_unique_id must be added to __slots__ to avoid AttributeError.

TypeChecker uses __slots__ to save memory. Writing self._unique_id = 0 in __init__ will raise at runtime unless the new attribute is appended to the __slots__ tuple defined near the top of the class.

Suggested follow‑up (illustrative – adjust exact location):

 class TypeChecker(TraverserVisitor):
-    __slots__ = (
-        "...",
-        # existing slots
+    __slots__ = (
+        "...",
+        "_unique_id",   # ← add here
     )

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +5506 to +5535
def _make_named_statement_for_match(self, s: MatchStmt) -> Expression:
"""Construct a fake NameExpr for inference if a match clause is complex."""
subject = s.subject
expressions_to_preserve = (
# Already named - we should infer type of it as given
NameExpr,
AssignmentExpr,
# Primitive literals - their type is known, no need to name them
IntExpr,
StrExpr,
BytesExpr,
FloatExpr,
ComplexExpr,
EllipsisExpr,
)
if isinstance(subject, expressions_to_preserve):
return subject
elif s.subject_dummy is not None:
return s.subject_dummy
else:
# Create a dummy subject expression to handle cases where a match statement's subject
# is not a literal value. This lets us correctly narrow types and check exhaustivity
# This is hack!
name = self.new_unique_dummy_name("match")
v = Var(name)
named_subject = NameExpr(name)
named_subject.node = v
s.subject_dummy = named_subject
return named_subject

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Preserve source‑location info when synthesising a dummy subject.

NameExpr() defaults to line/column = -1, which degrades error positioning and can break linters relying on accurate locations.
Copy the location from the original subject so diagnostics continue to point to the correct place.

-            named_subject = NameExpr(name)
+            named_subject = NameExpr(name)
+            # Keep error messages anchored to the original expression
+            named_subject.set_line(subject.line)
+            named_subject.set_column(subject.column)

(If set_column is unavailable, use copy_location(named_subject, subject) helper that is used elsewhere in mypy.)

Optionally, also populate v.line = subject.line for completeness.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _make_named_statement_for_match(self, s: MatchStmt) -> Expression:
"""Construct a fake NameExpr for inference if a match clause is complex."""
subject = s.subject
expressions_to_preserve = (
# Already named - we should infer type of it as given
NameExpr,
AssignmentExpr,
# Primitive literals - their type is known, no need to name them
IntExpr,
StrExpr,
BytesExpr,
FloatExpr,
ComplexExpr,
EllipsisExpr,
)
if isinstance(subject, expressions_to_preserve):
return subject
elif s.subject_dummy is not None:
return s.subject_dummy
else:
# Create a dummy subject expression to handle cases where a match statement's subject
# is not a literal value. This lets us correctly narrow types and check exhaustivity
# This is hack!
name = self.new_unique_dummy_name("match")
v = Var(name)
named_subject = NameExpr(name)
named_subject.node = v
s.subject_dummy = named_subject
return named_subject
def _make_named_statement_for_match(self, s: MatchStmt) -> Expression:
"""Construct a fake NameExpr for inference if a match clause is complex."""
subject = s.subject
expressions_to_preserve = (
# Already named - we should infer type of it as given
NameExpr,
AssignmentExpr,
# Primitive literals - their type is known, no need to name them
IntExpr,
StrExpr,
BytesExpr,
FloatExpr,
ComplexExpr,
EllipsisExpr,
)
if isinstance(subject, expressions_to_preserve):
return subject
elif s.subject_dummy is not None:
return s.subject_dummy
else:
# Create a dummy subject expression to handle cases where a match statement's subject
# is not a literal value. This lets us correctly narrow types and check exhaustivity
# This is hack!
name = self.new_unique_dummy_name("match")
v = Var(name)
named_subject = NameExpr(name)
# Keep error messages anchored to the original expression
named_subject.set_line(subject.line)
named_subject.set_column(subject.column)
named_subject.node = v
s.subject_dummy = named_subject
return named_subject

@visz11
Copy link
Collaborator

visz11 commented Apr 23, 2025

@coderabbitai full review

Copy link

coderabbitai bot commented Apr 23, 2025

✅ Actions performed

Full review triggered.

Copy link

Diff from mypy_primer, showing the effect of this PR on open source code:

vision (https://github.com/pytorch/vision)
- torchvision/utils.py:270: error: Unused "type: ignore" comment  [unused-ignore]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Type narrowing broken with dict items match narrowing fails when the operand is an await expression
3 participants