from mailing list "[pystatsmodels] read_csv + text file with one column"
Yes. Suppose that function f returns a boolean value given a column
value. To filter you would do:
df[df[column_to_filter_by].map(f)]
Oftentimes filtering uses NumPy vector operations like
df[df[column] == val]
but sometimes you have an element-wise Python function you want to apply.
You said you are a new Python programmer so I can understand lambdas
and regular expressions looking weird =) Lambda is just an alternative
to doing something like:
def condition(x):
    return x.startswith('A')
df[df[column].map(condition)]
You could also do:
df[[condition(x) for x in df[column]]]
The map method is just a faster way of doing the list comprehension
(and returns an ndarray with the right data type).
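As a minimal sketch of the filtering idioms described above, the following puts them side by side. The frame, its column names, and its values are made up for illustration and do not come from the thread. Note that in current pandas releases `Series.map` returns a Series rather than an ndarray, which works equally well as a boolean mask.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({"name": ["Alice", "Bob", "Anna", "Carl"],
                   "score": [1, 2, 3, 4]})

def condition(x):
    # Element-wise predicate: True for strings starting with 'A'.
    return x.startswith("A")

# Filter with a named function applied element-wise via map.
by_map = df[df["name"].map(condition)]

# Same filter written with a lambda instead of a named function.
by_lambda = df[df["name"].map(lambda x: x.startswith("A"))]

# Same filter using a list comprehension to build the boolean mask.
by_listcomp = df[[condition(x) for x in df["name"]]]

# Vectorized comparison, suitable when no Python function is needed.
by_vector = df[df["score"] == 3]
```

All three function-based variants should select the same rows; the vectorized form is usually preferable when the condition can be expressed as a comparison.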
The select method applies a function to the *axis labels*, directly
from the docstring:
In [1]: DataFrame.select?
Type:           instancemethod
Base Class:     <type 'instancemethod'>
String Form:    <unbound method DataFrame.select>
Namespace:      Interactive
File:           /home/wesm/code/pandas/pandas/core/generic.py
Definition:     DataFrame.select(self, crit, axis=0)
Docstring:
    Return data corresponding to axis labels matching criteria

    Parameters
    ----------
    crit : function
        To be called on each index (label). Should return True or False
    axis : int

    Returns
    -------
    selection : type of caller
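To illustrate what filtering on axis labels looks like, here is a small sketch that mimics the documented behavior with `.loc` and a boolean mask built from the labels. `DataFrame.select` is, to my knowledge, no longer available in recent pandas releases, so `select_by_label` below is a hypothetical stand-in, not the library's implementation, and the data is invented.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]},
                  index=["apple", "banana", "avocado"])

def crit(label):
    # Called on each axis label; returns True or False.
    return label.startswith("a")

def select_by_label(frame, crit, axis=0):
    # Hypothetical helper: keep rows (axis=0) or columns (axis=1)
    # whose label satisfies crit.
    labels = frame.index if axis == 0 else frame.columns
    mask = [bool(crit(label)) for label in labels]
    return frame.loc[mask] if axis == 0 else frame.loc[:, mask]

rows = select_by_label(df, crit)                         # filter on row labels
cols = select_by_label(df, lambda c: c == "y", axis=1)   # filter on column labels
```

The key difference from `map` on a column is that the function here sees the index or column labels, never the cell values.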
hope this helps,
Wes
First off, thanks for pandas and your recent talk at the NYC Python meetup. On the topic of filtering, NumPy vector filters are awesome for numeric data, but I have found myself reaching for the following idioms when dealing with alpha-numeric columns:
Update: in the thread referenced, I see you've already thought about similar methods (match()) and that handling exceptions due to NA values, among other details, is the tricky bit. On the surface, throwing a type exception when an NA is encountered seems acceptable because it feels consistent with other Python idioms (for better or worse); e.g., ', '.join(x) throws an exception when x contains a non-string element.
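A short sketch of the NA problem mentioned above, using invented data: an element-wise string predicate fails when it reaches a missing value, much as `', '.join` fails on a non-string element. The guarded predicate `starts_with_a` is a hypothetical workaround that treats NA as "no match"; it is one possible policy, not something proposed in the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical column containing a missing value.
df = pd.DataFrame({"name": ["Alice", np.nan, "Anna", "Bob"]})

# A naive predicate raises when it reaches the NaN, which is a float.
try:
    df["name"].map(lambda x: x.startswith("A"))
    map_raised = False
except (AttributeError, TypeError):
    map_raised = True

# The analogous plain-Python behavior: join rejects non-string elements.
try:
    ", ".join(["a", 1])
    join_raised = False
except TypeError:
    join_raised = True

def starts_with_a(x):
    # Guarded predicate: non-strings (including NaN) count as no match.
    return isinstance(x, str) and x.startswith("A")

safe = df[df["name"].map(starts_with_a)]
```

Whether NA should raise or silently fail to match is a design choice; the guarded version trades strictness for convenience.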