ENH: Add Optional Schema Definitions to Enable IDE Autocompletion #61304

Closed
1 of 3 tasks
YoniChechik opened this issue Apr 17, 2025 · 1 comment
Labels
Enhancement Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@YoniChechik

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Pandas is widely used in data-heavy workflows, and in many cases, the structure of a DataFrame is known in advance — especially when loading from sources like CSVs, databases, or APIs.

However, pandas DataFrames are fully dynamic, so IDEs and static type checkers cannot infer their structure. This limits productivity, especially in large codebases, because column names don’t autocomplete.

We’re not asking for runtime schema enforcement or data validation — we’re already familiar with Pandera and similar tools. What’s missing is a mechanism for IDEs and static tools (like Pylance and MyPy) to recognize DataFrame schemas for better code intelligence.

Feature Description

Introduce an optional way to define column names and types for a DataFrame that tools like VS Code + Pylance can use for autocompletion and type hints.

Example syntax (suggested API):

import pandas as pd
from pandas.typing import Schema  # hypothetical

class OrderSchema(Schema):
    OrderID: int
    CustomerName: str
    OrderDate: str
    Product: str
    Quantity: int
    Price: float
    Country: str

df: pd.DataFrame[OrderSchema] = pd.read_csv("orders.csv")

# IDE should support:
df.Country           # autocomplete & type: str

This would behave similarly to how TypedDict or Pydantic models enable structure-aware development, but focused on DataFrame-level constructs.
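For comparison, here is a minimal sketch of the TypedDict behaviour referred to above. `OrderRow` is a hypothetical name used only for illustration and is not part of pandas; it mirrors a few of the `OrderSchema` fields at the level of a single row.

```python
from typing import TypedDict, get_type_hints


class OrderRow(TypedDict):
    # Hypothetical row type mirroring a few OrderSchema fields.
    OrderID: int
    Country: str
    Price: float


row: OrderRow = {"OrderID": 1, "Country": "IL", "Price": 9.99}

# Static tools autocomplete the keys and know this value is a str.
country = row["Country"]

# At runtime the value is an ordinary dict; nothing is validated.
is_plain_dict = type(row) is dict

# The declared field types remain available as metadata for tooling.
hints = get_type_hints(OrderRow)
```

The request is for the same kind of static-only structure, but describing the columns of a DataFrame rather than the keys of a dict.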

It does not need to affect runtime at all — just serve as a static hint for tooling.
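As a rough illustration of "static hint only", one workaround that seems possible today is an annotation-only subclass combined with `typing.cast`. This is a sketch under assumptions, not the proposed API: `OrderFrame` and the inline CSV are made up for the example, and the column annotations are plain `pd.Series` rather than element-typed series.

```python
import io
from typing import cast

import pandas as pd


class OrderFrame(pd.DataFrame):
    # Hypothetical annotation-only subclass: these lines create no
    # attributes at runtime, they only inform IDEs and type checkers.
    Country: pd.Series
    Quantity: pd.Series


csv_text = "OrderID,Country,Quantity\n1,IL,2\n2,US,5\n"

# cast() is a no-op at runtime; it only changes what static tools see.
df = cast(OrderFrame, pd.read_csv(io.StringIO(csv_text)))

# IDEs can now offer df.Country and df.Quantity as completions.
countries = list(df.Country)

# The object is still an ordinary DataFrame; no validation happened.
is_plain_frame = type(df) is pd.DataFrame
```

The drawback is that nothing ties the annotations to the file being read, which is part of why a first-class schema hint is being requested.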

Alternative Solutions

No

Additional Context

No response

@YoniChechik YoniChechik added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 17, 2025
@mroeschke
Member

Thanks for the suggestion. I think this feature request is more appropriate for pandas-stubs, so I would suggest opening an issue in that repo: https://github.com/pandas-dev/pandas-stubs
