FIX/ENH: attempt soft conversion of object series before raising a TypeError when plotting #3912

Merged
merged 4 commits into from
Jun 16, 2013
14 changes: 8 additions & 6 deletions RELEASE.rst
@@ -77,8 +77,10 @@ pandas 0.11.1
dependencies offered for Linux) (GH3837_).
- Plotting functions now raise a ``TypeError`` before trying to plot anything
if the associated objects have a dtype of ``object`` (GH1818_,
GH3572_). This happens before any drawing takes place which eliminates any
spurious plots from showing up.
GH3572_, GH3911_, GH3912_), but they will try to convert object arrays to
numeric arrays if possible so that you can still plot, for example, an
object array with floats. This happens before any drawing takes place which
eliminates any spurious plots from showing up.
- Added Faq section on repr display options, to help users customize their setup.
- ``where`` operations that result in block splitting are much faster (GH3733_)
- Series and DataFrame hist methods now take a ``figsize`` argument (GH3834_)
@@ -341,13 +343,13 @@ pandas 0.11.1
.. _GH3834: https://github.com/pydata/pandas/issues/3834
.. _GH3873: https://github.com/pydata/pandas/issues/3873
.. _GH3877: https://github.com/pydata/pandas/issues/3877
.. _GH3659: https://github.com/pydata/pandas/issues/3659
.. _GH3679: https://github.com/pydata/pandas/issues/3679
.. _GH3880: https://github.com/pydata/pandas/issues/3880
<<<<<<< HEAD
.. _GH3911: https://github.com/pydata/pandas/issues/3911
=======
.. _GH3907: https://github.com/pydata/pandas/issues/3907
>>>>>>> 7b5933247b80174de4ba571e95a1add809dd9d09

.. _GH3911: https://github.com/pydata/pandas/issues/3911
.. _GH3912: https://github.com/pydata/pandas/issues/3912

pandas 0.11.0
=============
10 changes: 7 additions & 3 deletions doc/source/v0.11.1.txt
@@ -300,9 +300,11 @@ Bug Fixes
~~~~~~~~~

- Plotting functions now raise a ``TypeError`` before trying to plot anything
if the associated objects have a ``dtype`` of ``object`` (GH1818_,
GH3572_). This happens before any drawing takes place which eliminates any
spurious plots from showing up.
if the associated objects have a dtype of ``object`` (GH1818_,
GH3572_, GH3911_, GH3912_), but they will try to convert object arrays to
numeric arrays if possible so that you can still plot, for example, an
object array with floats. This happens before any drawing takes place which
eliminates any spurious plots from showing up.

- ``fillna`` methods now raise a ``TypeError`` if the ``value`` parameter is
a list or tuple.
@@ -416,3 +418,5 @@ on GitHub for a complete list.
.. _GH3659: https://github.com/pydata/pandas/issues/3659
.. _GH3679: https://github.com/pydata/pandas/issues/3679
.. _GH3907: https://github.com/pydata/pandas/issues/3907
.. _GH3911: https://github.com/pydata/pandas/issues/3911
.. _GH3912: https://github.com/pydata/pandas/issues/3912
34 changes: 20 additions & 14 deletions pandas/io/common.py
@@ -9,6 +9,10 @@
_VALID_URLS.discard('')


class PerformanceWarning(Warning):
pass


def _is_url(url):
"""Check to see if a URL has a valid protocol.
@@ -26,27 +30,29 @@ def _is_url(url):
except:
return False


def _is_s3_url(url):
""" Check for an s3 url """
"""Check for an s3 url"""
try:
return urlparse.urlparse(url).scheme == 's3'
except:
return False


def get_filepath_or_buffer(filepath_or_buffer, encoding=None):
""" if the filepath_or_buffer is a url, translate and return the buffer
passthru otherwise
Parameters
----------
filepath_or_buffer : a url, filepath, or buffer
encoding : the encoding to use to decode py3 bytes, default is 'utf-8'
Returns
-------
a filepath_or_buffer, the encoding
"""
"""
If the filepath_or_buffer is a url, translate and return the buffer
passthru otherwise.
Parameters
----------
filepath_or_buffer : a url, filepath, or buffer
encoding : the encoding to use to decode py3 bytes, default is 'utf-8'
Returns
-------
a filepath_or_buffer, the encoding
"""

if _is_url(filepath_or_buffer):
from urllib2 import urlopen
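
Reviewer note: `_is_s3_url` in this file relies on the Python 2 `urlparse` module and a bare `except`. A minimal sketch of the same scheme check on Python 3, assuming `urllib.parse` as the replacement; the name `is_s3_url` and the narrowed exception list are choices made for this illustration, not part of the PR:

```python
from urllib.parse import urlparse


def is_s3_url(url):
    """Return True if ``url`` parses with the ``s3`` scheme."""
    try:
        return urlparse(url).scheme == 's3'
    except (AttributeError, TypeError, ValueError):
        # Inputs that are not strings (for example an int or an open
        # buffer) cannot be parsed as URLs, so they are not s3 URLs.
        return False


print(is_s3_url('s3://bucket/key.csv'))
print(is_s3_url('http://example.com/data.csv'))
```

Catching specific exceptions keeps `KeyboardInterrupt` and similar signals from being swallowed, which the bare `except` in the diff would do.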
47 changes: 30 additions & 17 deletions pandas/io/pytables.py
@@ -12,23 +12,22 @@
import warnings

import numpy as np
from pandas import (
Series, TimeSeries, DataFrame, Panel, Panel4D, Index,
MultiIndex, Int64Index, Timestamp
)
from pandas import (Series, TimeSeries, DataFrame, Panel, Panel4D, Index,
MultiIndex, Int64Index, Timestamp)
from pandas.sparse.api import SparseSeries, SparseDataFrame, SparsePanel
from pandas.sparse.array import BlockIndex, IntIndex
from pandas.tseries.api import PeriodIndex, DatetimeIndex
from pandas.core.common import adjoin, isnull, is_list_like
from pandas.core.algorithms import match, unique, factorize
from pandas.core.common import adjoin, is_list_like
from pandas.core.algorithms import match, unique
from pandas.core.categorical import Categorical
from pandas.core.common import _asarray_tuplesafe, _try_sort
from pandas.core.common import _asarray_tuplesafe
from pandas.core.internals import BlockManager, make_block
from pandas.core.reshape import block2d_to_blocknd, factor_indexer
from pandas.core.index import Int64Index, _ensure_index
from pandas.core.index import _ensure_index
import pandas.core.common as com
from pandas.tools.merge import concat
from pandas.util import py3compat
from pandas.io.common import PerformanceWarning

import pandas.lib as lib
import pandas.algos as algos
@@ -42,32 +41,46 @@
# PY3 encoding if we don't specify
_default_encoding = 'UTF-8'


def _ensure_decoded(s):
""" if we have bytes, decode them to unicode """
if isinstance(s, np.bytes_):
s = s.decode('UTF-8')
return s


def _ensure_encoding(encoding):
# set the encoding if we need
if encoding is None:
if py3compat.PY3:
encoding = _default_encoding
return encoding

class IncompatibilityWarning(Warning): pass

class IncompatibilityWarning(Warning):
pass


incompatibility_doc = """
where criteria is being ignored as this version [%s] is too old (or not-defined),
read the file in and write it out to a new file to upgrade (with the copy_to method)
where criteria is being ignored as this version [%s] is too old (or
not-defined), read the file in and write it out to a new file to upgrade (with
the copy_to method)
"""
class AttributeConflictWarning(Warning): pass


class AttributeConflictWarning(Warning):
pass


attribute_conflict_doc = """
the [%s] attribute of the existing index is [%s] which conflicts with the new [%s],
resetting the attribute to None
the [%s] attribute of the existing index is [%s] which conflicts with the new
[%s], resetting the attribute to None
"""
class PerformanceWarning(Warning): pass


performance_doc = """
your performance may suffer as PyTables will pickle object types that it cannot map
directly to c-types [inferred_type->%s,key->%s] [items->%s]
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->%s,key->%s] [items->%s]
"""

# map object types
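
Reviewer note: the `*_doc` strings in this hunk are `%s` templates paired with `Warning` subclasses. A rough sketch of how such a template and the relocated `PerformanceWarning` class can be combined, using only the standard library; `warn_pickled_block` and its arguments are hypothetical names for illustration, and the real call site in `pytables.py` is not shown in this diff:

```python
import warnings


class PerformanceWarning(Warning):
    pass


performance_doc = (
    "your performance may suffer as PyTables will pickle object types that "
    "it cannot map directly to c-types [inferred_type->%s,key->%s] [items->%s]"
)


def warn_pickled_block(inferred_type, key, items):
    """Hypothetical helper: fill in the template and emit the warning."""
    message = performance_doc % (inferred_type, key, items)
    # stacklevel=2 attributes the warning to the caller of this helper
    warnings.warn(message, PerformanceWarning, stacklevel=2)


# Record warnings instead of printing them so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_pickled_block("mixed", "/df", ["a", "b"])

print(caught[0].category.__name__)
```

Keeping the class in `pandas.io.common` lets other modules, such as the plotting tests, import it without pulling in the PyTables code.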
12 changes: 10 additions & 2 deletions pandas/tests/test_graphics.py
@@ -10,6 +10,7 @@
from pandas.util.testing import ensure_clean
from pandas.core.config import set_option


import numpy as np

from numpy.testing import assert_array_equal
@@ -189,15 +190,22 @@ def test_bootstrap_plot(self):
from pandas.tools.plotting import bootstrap_plot
_check_plot_works(bootstrap_plot, self.ts, size=10)

@slow
def test_all_invalid_plot_data(self):
def test_invalid_plot_data(self):
s = Series(list('abcd'))
kinds = 'line', 'bar', 'barh', 'kde', 'density'

for kind in kinds:
self.assertRaises(TypeError, s.plot, kind=kind)

@slow
def test_valid_object_plot(self):
from pandas.io.common import PerformanceWarning
s = Series(range(10), dtype=object)
kinds = 'line', 'bar', 'barh', 'kde', 'density'

for kind in kinds:
_check_plot_works(s.plot, kind=kind)

def test_partially_invalid_plot_data(self):
s = Series(['a', 'b', 1.0, 2])
kinds = 'line', 'bar', 'barh', 'kde', 'density'
22 changes: 14 additions & 8 deletions pandas/tools/plotting.py
@@ -878,15 +878,20 @@ def _get_layout(self):

    def _compute_plot_data(self):
        try:
            # might be a frame
            # might be an ndframe
            numeric_data = self.data._get_numeric_data()
        except AttributeError:
            # a series, but no object dtypes allowed!
            if self.data.dtype == np.object_:
                raise TypeError('invalid dtype for plotting, please cast to a '
                                'numeric dtype explicitly if you want to plot')

        except AttributeError:  # TODO: rm in 0.12 (series-inherit-ndframe)
            numeric_data = self.data
            orig_dtype = numeric_data.dtype

            # possible object array of numeric data
            if orig_dtype == np.object_:
                numeric_data = numeric_data.convert_objects()  # soft convert

            # still an object dtype so we can't plot it
            if numeric_data.dtype == np.object_:
                raise TypeError('Series has object dtype and cannot be'
                                ' converted: no numeric data to plot')

        try:
            is_empty = numeric_data.empty
@@ -895,7 +900,8 @@ def _compute_plot_data(self):

        # no empty frames or series allowed
        if is_empty:
            raise TypeError('No numeric data to plot')
            raise TypeError('Empty {0!r}: no numeric data to '
                            'plot'.format(numeric_data.__class__.__name__))

        self.data = numeric_data
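
Reviewer note: the order of checks in the new `_compute_plot_data` is soft conversion first, then the object-dtype check, then the empty check. A dependency-free sketch of that order, assuming a plain list as a stand-in for an object-dtype Series; `compute_plot_data` and `_is_number` are hypothetical names, and the conversion rule is a simplification of what `convert_objects()` does in pandas:

```python
import numbers


def _is_number(value):
    # bool subclasses int; excluding it is an assumption of this sketch
    return isinstance(value, numbers.Real) and not isinstance(value, bool)


def compute_plot_data(values):
    """Return floats ready to plot, or raise TypeError.

    Hypothetical stand-in for ``_compute_plot_data``: a plain list plays
    the role of an object-dtype Series.
    """
    data = list(values)

    # "soft" conversion: succeeds only if every item is already numeric
    if all(_is_number(v) for v in data):
        data = [float(v) for v in data]
    else:
        # still "object" after the attempt, so it cannot be plotted
        raise TypeError('data has object dtype and cannot be converted: '
                        'no numeric data to plot')

    # no empty input allowed, mirroring the second check in the diff
    if not data:
        raise TypeError('Empty {0!r}: no numeric data to '
                        'plot'.format(type(data).__name__))
    return data


print(compute_plot_data([1, 2.5, 3]))
```

As in `test_partially_invalid_plot_data`, mixed input such as `['a', 'b', 1.0, 2]` is rejected as a whole; the sketch does not drop the non-numeric items.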

48 changes: 47 additions & 1 deletion pandas/util/testing.py
@@ -7,6 +7,7 @@
import string
import sys
import tempfile
import warnings

from contextlib import contextmanager # contextlib is available since 2.5

@@ -39,7 +40,7 @@

def rands(n):
choices = string.ascii_letters + string.digits
return ''.join([random.choice(choices) for _ in xrange(n)])
return ''.join(random.choice(choices) for _ in xrange(n))


def randu(n):
@@ -746,3 +747,48 @@ def stdin_encoding(encoding=None):
sys.stdin = SimpleMock(sys.stdin, "encoding", encoding)
yield
sys.stdin = _stdin


@contextmanager
def assert_produces_warning(expected_warning=Warning, filter_level="always"):
    """
    Context manager for running code that expects to raise (or not raise)
    warnings. Checks that code raises the expected warning and only the
    expected warning. Pass ``False`` or ``None`` to check that it does *not*
    raise a warning. Defaults to ``exception.Warning``, baseclass of all
    Warnings. (basically a wrapper around ``warnings.catch_warnings``).

    >>> import warnings
    >>> with assert_produces_warning():
    ...     warnings.warn(UserWarning())
    ...
    >>> with assert_produces_warning(False):
    ...     warnings.warn(RuntimeWarning())
    ...
    Traceback (most recent call last):
        ...
    AssertionError: Caused unexpected warning(s): ['RuntimeWarning'].
    >>> with assert_produces_warning(UserWarning):
    ...     warnings.warn(RuntimeWarning())
    Traceback (most recent call last):
        ...
    AssertionError: Did not see expected warning of class 'UserWarning'.

    ..warn:: This is *not* thread-safe.
    """
    with warnings.catch_warnings(record=True) as w:
        saw_warning = False
        warnings.simplefilter(filter_level)
        yield w
        extra_warnings = []
        for actual_warning in w:
            if (expected_warning and issubclass(actual_warning.category,
                                                expected_warning)):
                saw_warning = True
            else:
                extra_warnings.append(actual_warning.category.__name__)
        if expected_warning:
            assert saw_warning, ("Did not see expected warning of class %r."
                                 % expected_warning.__name__)
        assert not extra_warnings, ("Caused unexpected warning(s): %r."
                                    % extra_warnings)
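
Reviewer note: a short usage sketch for `assert_produces_warning`. The helper is restated here without its docstring so the snippet runs on its own; `slow_path` and the local `PerformanceWarning` class are hypothetical stand-ins for code under test:

```python
import warnings
from contextlib import contextmanager


@contextmanager
def assert_produces_warning(expected_warning=Warning, filter_level="always"):
    # Same logic as the helper added in this diff, docstring omitted.
    with warnings.catch_warnings(record=True) as w:
        saw_warning = False
        warnings.simplefilter(filter_level)
        yield w
        extra_warnings = []
        for actual_warning in w:
            if (expected_warning and
                    issubclass(actual_warning.category, expected_warning)):
                saw_warning = True
            else:
                extra_warnings.append(actual_warning.category.__name__)
        if expected_warning:
            assert saw_warning, ("Did not see expected warning of class %r."
                                 % expected_warning.__name__)
        assert not extra_warnings, ("Caused unexpected warning(s): %r."
                                    % extra_warnings)


class PerformanceWarning(Warning):
    pass


def slow_path():
    """Hypothetical function under test that warns and returns a value."""
    warnings.warn("falling back to pickling", PerformanceWarning)
    return 42


# Passes: only the expected category is raised inside the block.
with assert_produces_warning(PerformanceWarning) as caught:
    result = slow_path()

# Fails with AssertionError: a warning appears where none is allowed.
try:
    with assert_produces_warning(False):
        slow_path()
except AssertionError as exc:
    failure = str(exc)

print(result, failure)
```

Because the checks run after the `yield`, the assertion fires when the `with` block exits, not at the line that issued the warning.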