
ENH: Add Optional Schema Definitions to Enable IDE Autocompletion #1190

Open
YoniChechik opened this issue Apr 17, 2025 · 2 comments

@YoniChechik

Originally from: pandas-dev/pandas#61304 (comment)

Problem Description

Pandas is widely used in data-heavy workflows, and in many cases, the structure of a DataFrame is known in advance — especially when loading from sources like CSVs, databases, or APIs.

However, pandas DataFrames are fully dynamic, so IDEs and static type checkers cannot infer their structure. This limits productivity, especially in large codebases, because column names don’t autocomplete.

We’re not asking for runtime schema enforcement or data validation — we’re already familiar with Pandera and similar tools. What’s missing is a mechanism for IDEs and static tools (like Pylance and MyPy) to recognize DataFrame schemas for better code intelligence.

Feature Description

Introduce an optional way to define column names and types for a DataFrame that tools like VS Code + Pylance can use for autocompletion and type hints.

Example syntax (suggested API):

import pandas as pd
from pandas.typing import Schema  # hypothetical

class OrderSchema(Schema):
    OrderID: int
    CustomerName: str
    OrderDate: str
    Product: str
    Quantity: int
    Price: float
    Country: str

df: pd.DataFrame[OrderSchema] = pd.read_csv("orders.csv")

# IDE should support:
df.Country           # autocomplete & type: str

This would behave similarly to how TypedDict or Pydantic models enable structure-aware development, but focused on DataFrame-level constructs.
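For comparison, here is a minimal sketch of the TypedDict behaviour referred to above, at the level of a single row rather than a whole DataFrame. The name `OrderRow` and the sample values are illustrative assumptions, not part of any pandas API.

```python
from typing import TypedDict, get_type_hints


class OrderRow(TypedDict):  # hypothetical row-level schema
    OrderID: int
    CustomerName: str
    Country: str


# Static tools know the keys and value types, so editors can complete
# "Country" and a type checker treats row["Country"] as str.
row: OrderRow = {"OrderID": 1, "CustomerName": "Ada", "Country": "IL"}
country = row["Country"]

# At runtime the value is still a plain dict; nothing is validated.
print(type(row) is dict)

# The declared schema stays available to tooling through the annotations.
print(get_type_hints(OrderRow))
```

The request in this issue is for the same kind of static-only information, attached to DataFrame columns instead of dict keys.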

It does not need to affect runtime at all — just serve as a static hint for tooling.
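Until something like this exists, one possible workaround is an annotation-only subclass that only the type checker sees, combined with `typing.cast`. This is a rough sketch under assumptions: `OrderFrame` and `as_orders` are hypothetical names, the `pd.Series[...]` annotations assume the generic `Series` from pandas-stubs, and how well autocompletion works depends on the editor and stub version.

```python
from typing import TYPE_CHECKING, cast

import pandas as pd

if TYPE_CHECKING:
    # Visible to static tools only; this class never exists at runtime.
    class OrderFrame(pd.DataFrame):  # hypothetical name
        OrderID: "pd.Series[int]"
        Country: "pd.Series[str]"
        Quantity: "pd.Series[int]"
        Price: "pd.Series[float]"


def as_orders(df: pd.DataFrame) -> "OrderFrame":
    """Relabel df for the type checker; the same object is returned unchanged."""
    return cast("OrderFrame", df)


raw = pd.DataFrame(
    {
        "OrderID": [1, 2],
        "Country": ["IL", "FR"],
        "Quantity": [3, 1],
        "Price": [9.5, 20.0],
    }
)
orders = as_orders(raw)

# Attribute access still goes through pandas' normal column lookup.
print(orders.Country.tolist())
```

Nothing here checks that the columns actually exist or have the declared dtypes; the cast is a promise to the type checker, which matches the "static hint only" goal but moves the burden of correctness to the caller.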

@Dr-Irv
Collaborator

Dr-Irv commented Apr 17, 2025

This is a nice idea, but I think it would be tough to implement in the stubs. Open to a PR that makes it work without messing up people who don't use a Generic version of DataFrame.

@loicdiridollou
Contributor

Related to #295
