-
Notifications
You must be signed in to change notification settings - Fork 418
Feature generator #126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature generator #126
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution!
- Check my inline comments.
- Can you write tested documentation about this feature in README.rst? This way people will know about this feature. It's important to emphasize the difference in features treatment from the basic case as outlined in [Question] Apply the same transformation to several features #113.
Thanks!
tests/test_features_generator.py
Outdated
}) | ||
|
||
|
||
@pytest.mark.parametrize('columns', [['colA', 'colB', 'colC']]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why using pytest's parametrize
if we only try with one parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess just kind of habit :) Will move into function.
sklearn_pandas/features_generator.py
Outdated
transformer class and init parameters, or None. | ||
|
||
If list of classes is provided, then each of them is | ||
instantiated with default arguments: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd write "with default arguments. Example:". Otherwise it is a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
tests/test_features_generator.py
Outdated
feature_dict = dict(feature_defs) | ||
assert columns == sorted(feature_dict) | ||
|
||
expected = {'value': 1, 'name': 'class'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add comment # default init arguments for MockClass
for clarification.
tests/test_features_generator.py
Outdated
assert len(feature_defs) == len(columns) | ||
|
||
feature_dict = dict(feature_defs) | ||
assert columns == sorted(feature_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write assert sorted(columns) == sorted(feature_dict.keys())
? I didn't know that sorted(dict)
returns its keys sorted and it's not evident.
sklearn_pandas/features_generator.py
Outdated
params = definition.copy() | ||
klass = params.pop('class') | ||
feature_transformers.append(klass(**params)) | ||
elif isinstance(definition, type): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you should substitute the elif
with a simple else
since we want all other cases to be instantiated even if the class is not a new-style one. See:
In [6]: class A:
...: pass
...:
In [7]: isinstance(A, type)
Out[7]: False
In [8]: class B(object):
...: pass
...:
In [9]: isinstance(B, type)
Out[9]: True
However we would like to instantiate A
in this case anyhow. Also, the passed object can be a function that returns an instantiated class as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I see, didn't know about it. I guess the reason is I mostly worked with Python v3.x with its default new-style classes. Will fix that.
sklearn_pandas/features_generator.py
Outdated
|
||
If list of dictionaries is provided, then each of them should | ||
have a 'class' key with transformer class. All other keys are | ||
passed into 'class' value constructor: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Say Example:
to be clear that what we are showing is an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
Awesome, thanks! |
Ok, here is an attempt to fix #113. It adds a functions that generates features definition and a couple of tests.
@dukebody Do you think that it is like something you've proposed?