pandas.DataFrame.replace raises UnboundLocalError on mixed data types #11698

Closed
vladu opened this issue Nov 24, 2015 · 3 comments
Labels: Bug, Indexing, Internals, Missing-data

@vladu
Contributor

vladu commented Nov 24, 2015

This is in pandas 0.17.1. Suppose we have a frame with mixed data types and want to replace a string value with something else. If the frame has only string columns, or maybe string and int columns, this works fine. However, if the frame has string and datetime columns, replace raises an UnboundLocalError. Example:

>>> pandas.DataFrame([('-', pandas.to_datetime('20150101')), ('a', pandas.to_datetime('20150102')), ('b', pandas.to_datetime('20150103'))], columns=['a', 'b']).replace('-', numpy.nan)
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-11-bce7dc78fc44> in <module>()
----> 1 pandas.DataFrame([('-', pandas.to_datetime('20150101')), ('a', pandas.to_datetime('20150102')), ('b', pandas.to_datetime('20150103'))], columns=['a', 'b']).replace('-', numpy.nan)

/.../pandas/core/generic.pyc in replace(self, to_replace, value, inplace, limit, regex, method, axis)
   3108                 elif not com.is_list_like(value):  # NA -> 0
   3109                     new_data = self._data.replace(to_replace=to_replace, value=value,
-> 3110                                                   inplace=inplace, regex=regex)
   3111                 else:
   3112                     msg = ('Invalid "to_replace" type: '

/.../pandas/core/internals.pyc in replace(self, **kwargs)
   2868
   2869     def replace(self, **kwargs):
-> 2870         return self.apply('replace', **kwargs)
   2871
   2872     def replace_list(self, src_list, dest_list, inplace=False, regex=False, mgr=None):

/.../pandas/core/internals.pyc in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   2821
   2822             kwargs['mgr'] = self
-> 2823             applied = getattr(b, f)(**kwargs)
   2824             result_blocks = _extend_blocks(applied, result_blocks)
   2825

/.../pandas/core/internals.pyc in replace(self, to_replace, value, inplace, filter, regex, convert, mgr)
    605
    606             # we can't process the value, but nothing to do
--> 607             if not mask.any():
    608                 return self if inplace else self.copy()
    609

UnboundLocalError: local variable 'mask' referenced before assignment
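
For readers unfamiliar with this error class, here is a minimal pure-Python sketch of how an except handler can end up reading a name that was never bound. It is an illustration only, not the pandas source: `coerce`, `replace_buggy`, and `replace_fixed` are hypothetical names, and the assumption that `mask` is assigned only inside the `try` is inferred from the traceback, not confirmed by it.

```python
def coerce(values, to_replace):
    # Hypothetical stand-in for a coercion step that rejects some inputs.
    if not all(isinstance(v, str) for v in values):
        raise TypeError("cannot coerce to_replace for these values")
    return values, to_replace


def replace_buggy(values, to_replace, value):
    try:
        values, to_replace = coerce(values, to_replace)
        # mask is bound only if coerce() succeeds.
        mask = [v == to_replace for v in values]
    except TypeError:
        # If coerce() raised, mask was never assigned, so reading it here
        # raises UnboundLocalError instead of taking the "nothing to do" path.
        if not any(mask):
            return list(values)
        raise
    return [value if m else v for v, m in zip(values, mask)]


def replace_fixed(values, to_replace, value):
    # Bind mask before the try so the handler can always read it.
    mask = [False] * len(values)
    try:
        values, to_replace = coerce(values, to_replace)
        mask = [v == to_replace for v in values]
    except TypeError:
        if not any(mask):
            return list(values)
        raise
    return [value if m else v for v, m in zip(values, mask)]
```

With string-only input both versions behave the same; only input that makes `coerce` raise exposes the unbound name, which matches the report that string-only frames work.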

And the requisite show_versions:

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-348.2.1.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.7
pip: 7.1.2
setuptools: 18.4
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 4.0.0
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.1.0
numexpr: 2.4.4
matplotlib: 1.5.0
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
Jinja2: None
@vladu
Contributor Author

vladu commented Nov 24, 2015

Note, used to work as expected in 0.16.2.

@jreback
Contributor

jreback commented Nov 25, 2015

hmm, looks like an untested code path.

want to do a pull-request?
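
A regression test for this code path could look roughly like the sketch below. It assumes a pandas version in which the bug is already fixed; the test name is hypothetical and is not taken from the actual pull request. The frame is the one from the report, built from a dict instead of a list of tuples.

```python
import numpy as np
import pandas as pd


def test_replace_string_in_frame_with_datetime_column():
    # Hypothetical test name; same data as the example in the report.
    df = pd.DataFrame(
        {
            "a": ["-", "a", "b"],
            "b": pd.to_datetime(["20150101", "20150102", "20150103"]),
        }
    )

    # On 0.17.1 this call raised UnboundLocalError.
    result = df.replace("-", np.nan)

    # Only the matching string is replaced; the datetime column is untouched.
    assert result["a"].isna().tolist() == [True, False, False]
    assert result["b"].equals(df["b"])


test_replace_string_in_frame_with_datetime_column()
```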

@jreback added the Bug, Indexing, Missing-data, Difficulty Novice, and Internals labels on Nov 25, 2015
@jreback added this to the 0.18.0 milestone on Nov 25, 2015
@jreback
Contributor

jreback commented Dec 2, 2015

closed by #11715
