Skip to content

ENH: get_rename_function: add support for __getitem__ override #39921

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
philippefutureboy opened this issue Feb 19, 2021 · 5 comments
Closed
Labels
Duplicate Report Duplicate issue or pull request Enhancement

Comments

@philippefutureboy
Copy link

philippefutureboy commented Feb 19, 2021

Is your feature request related to a problem?

I would like to be able to use a subclass of UserDict with custom __getitem__ implementation as a mapper for my dataframe columns. My use case is that I want to have a mapper with default remapping rules when the key is not present.
Unfortunately, as the implementation checks for membership of the key before actually calling __getitem__, I am only able to do the behaviour by wrapping the UserDict with a lambda.

Current implementation of get_rename_function in pandas:

def get_rename_function(mapper):
    """
    Returns a function that will map names/labels, dependent if mapper
    is a dict, Series or just a function.
    """
    if isinstance(mapper, (abc.Mapping, ABCSeries)):
        def f(x):
            if x in mapper:
                return mapper[x]
            else:
                return x

    else:
        f = mapper

    return f

My proposition is the following:

def get_rename_function(mapper):
    """
    Returns a function that will map names/labels, dependent if mapper
    is a dict, Series or just a function.
    """
    if isinstance(mapper, (abc.Mapping, ABCSeries)):
        def f(k):
            return mapper.get(k, k)
    else:
        f = mapper

    return f

which functionally should result in the same behaviour.
The advantage for me is that __getitem__ is called and I can process k to return my default mapping.
I don't know however if this would have a significant performance impact. If so, alternatively I'd propose the following:

def get_rename_function(mapper):
    """
    Returns a function that will map names/labels, dependent if mapper
    is a dict, Series or just a function.
    """
    if not callable(mapper) and isinstance(mapper, (abc.Mapping, ABCSeries)):
        def f(x):
            if x in mapper:
                return mapper[x]
            else:
                return x
    else:
        f = mapper

    return f

Then I could implement the __call__ method. The downside however is that this could be a breaking change for some people that pass a subclass of a Mapping as well.

What do you think?

@philippefutureboy philippefutureboy added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Feb 19, 2021
@jreback
Copy link
Contributor

jreback commented Feb 20, 2021

we have an errors kwarg that would do what u want

this proposal is a breaking change

@philippefutureboy
Copy link
Author

Hey @jreback !

Thanks for your answer.

Which errors kwarg do you refer to? My use case is specifically with DataFrame.rename.

I think I understand why this is a breaking change. It's because we can't assume that calling getitem is non breaking cause we don't know what is the implementation of the client right?

@jbrockmendel
Copy link
Member

I'm not 100% clear on the desired end-result. Is this similar to #11723?

@philippefutureboy
Copy link
Author

philippefutureboy commented Jun 13, 2021

Well long story short I wanted to skip the call to __contains__ and use .get directly instead.

And in terms of the need, yes! That's exactly what I want 😄🙌

@mroeschke
Copy link
Member

Since #11723 seems like the aligned feature request, going to close this issue and encourage further discussion and support to happen in that issue

@mroeschke mroeschke added Duplicate Report Duplicate issue or pull request and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Duplicate Report Duplicate issue or pull request Enhancement
Projects
None yet
Development

No branches or pull requests

4 participants