Commit 92ac063

CI: Update deps, docs
1 parent 81690b5 commit 92ac063

10 files changed, +40 -21 lines changed

ci/requirements-2.7-64.run

+1 -1
@@ -11,7 +11,7 @@ sqlalchemy
 lxml=3.2.1
 scipy
 xlsxwriter
-boto
+s3fs
 bottleneck
 html5lib
 beautiful-soup

ci/requirements-2.7.run

+1 -1
@@ -11,7 +11,7 @@ sqlalchemy=0.9.6
 lxml=3.2.1
 scipy
 xlsxwriter=0.4.6
-boto=2.36.0
+s3fs
 bottleneck
 psycopg2=2.5.2
 patsy

ci/requirements-2.7_SLOW.run

+1 -1
@@ -13,7 +13,7 @@ numexpr
 pytables
 sqlalchemy
 lxml
-boto
+s3fs
 bottleneck
 psycopg2
 pymysql

ci/requirements-3.5.run

+1 -1
@@ -17,7 +17,7 @@ sqlalchemy
 pymysql
 psycopg2
 xarray
-boto
+s3fs
 
 # incompat with conda ATM
 # beautiful-soup

ci/requirements-3.5_OSX.run

+1 -1
@@ -12,7 +12,7 @@ matplotlib
 jinja2
 bottleneck
 xarray
-boto
+s3fs
 
 # incompat with conda ATM
 # beautiful-soup

doc/source/install.rst

+1 -1
@@ -262,7 +262,7 @@ Optional Dependencies
 * `XlsxWriter <https://pypi.python.org/pypi/XlsxWriter>`__: Alternative Excel writer
 
 * `Jinja2 <http://jinja.pocoo.org/>`__: Template engine for conditional HTML formatting.
-* `boto <https://pypi.python.org/pypi/boto>`__: necessary for Amazon S3 access.
+* `s3fs <http://s3fs.readthedocs.io/>`__: necessary for Amazon S3 access (s3fs >= 0.0.7).
 * `blosc <https://pypi.python.org/pypi/blosc>`__: for msgpack compression using ``blosc``
 * One of `PyQt4
   <http://www.riverbankcomputing.com/software/pyqt/download>`__, `PySide

doc/source/io.rst

+17
@@ -1487,6 +1487,23 @@ options include:
 Specifying any of the above options will produce a ``ParserWarning`` unless the
 python engine is selected explicitly using ``engine='python'``.
 
+Reading remote files
+''''''''''''''''''''
+
+You can pass in a URL to a CSV file:
+
+.. code-block:: python
+
+   df = pd.read_csv('https://download.bls.gov/pub/time.series/cu/cu.item',
+                    sep='\t')
+
+S3 URLs are handled as well:
+
+.. code-block:: python
+
+   df = pd.read_csv('s3://pandas-test/tips.csv')
+
+
 Writing out Data
 ''''''''''''''''
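
The S3 example in the documentation hunk above passes a URL of the form `s3://bucket/key`. As a rough illustration of what such a URL contains, here is a minimal sketch that splits it into a bucket and a key. The helper name `split_s3_url` is hypothetical and not part of pandas or s3fs; the commit itself relies on its own `_strip_schema` helper, whose body is not shown in this diff.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/key' into (bucket, key).

    Hypothetical helper for illustration only; not a pandas or s3fs API.
    """
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError("not an S3 URL: %r" % (url,))
    # netloc holds the bucket name; the path is the object key
    # with a leading slash that we drop.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_url("s3://pandas-test/tips.csv")
print(bucket, key)
```

Using the standard library parser rather than string slicing keeps the scheme check and the bucket/key split in one place.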

doc/source/whatsnew/v0.20.0.txt

+10 -10
@@ -108,8 +108,6 @@ Other enhancements
 
 
 
-.. _whatsnew_0200.api:
-
 .. _whatsnew_0200.api_breaking:
 
 Backwards incompatible API changes
@@ -183,24 +181,26 @@ Map on Index types now return other Index types
 
    s.map(lambda x: x.hour)
 
+.. _whatsnew_0200.s3:
+
+S3 File Handling
+^^^^^^^^^^^^^^^^
+
+pandas now uses `s3fs <http://s3fs.readthedocs.io/>`_ for handling S3 connections. This shouldn't break
+any code. However, since s3fs is not a required dependency, you will need to install it separately (like boto
+in prior versions of pandas) (:issue:`11915`).
+
+.. _whatsnew_0200.api:
 
 - ``CParserError`` has been renamed to ``ParserError`` in ``pd.read_csv`` and will be removed in the future (:issue:`12665`)
 - ``SparseArray.cumsum()`` and ``SparseSeries.cumsum()`` will now always return ``SparseArray`` and ``SparseSeries`` respectively (:issue:`12855`)
-- ``CParserError`` has been renamed to ``ParserError`` in ``pd.read_csv`` and will be removed in the future (:issue:`12665`)
 
-.. _whatsnew_2000.api.s3
 
-S3 File Handling
-^^^^^^^^^^^^^^^^
 
-pandas now uses `s3fs <http://s3fs.readthedocs.io/>`_ for handling S3 connections. This shouldn't break
-any code, but since s3fs is not a required dependency, you will need to install it separately (like boto
-in prior versions of pandas).
 
 Other API Changes
 ^^^^^^^^^^^^^^^^^
 
-
 .. _whatsnew_0200.deprecations:
 
 Deprecations

pandas/io/s3.py

+6 -4
@@ -2,6 +2,7 @@
 from pandas import compat
 try:
     import s3fs
+    from botocore.exceptions import NoCredentialsError
 except:
     raise ImportError("The s3fs library is required to handle s3 files")
 
@@ -19,15 +20,16 @@ def _strip_schema(url):
 
 def get_filepath_or_buffer(filepath_or_buffer, encoding=None,
                            compression=None):
-
-    # Assuming AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_S3_HOST
-    # are environment variables
     fs = s3fs.S3FileSystem(anon=False)
     try:
         filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer))
-    except OSError:
+    except (OSError, NoCredentialsError):
         # boto3 has troubles when trying to access a public file
         # when credentialed...
+        # An OSError is raised if you have credentials, but they
+        # aren't valid for that bucket.
+        # A NoCredentialsError is raised if you don't have creds
+        # for that bucket.
         fs = s3fs.S3FileSystem(anon=True)
         filepath_or_buffer = fs.open(_strip_schema(filepath_or_buffer))
     return filepath_or_buffer, None, compression
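
The hunk above tries a credentialed open first and falls back to anonymous access when that fails. The pattern can be sketched in isolation as follows. This is only an illustration under stated assumptions: `open_with_anonymous_fallback`, `make_fs`, and `FakeFS` are hypothetical names, and `FakeFS` stands in for `s3fs.S3FileSystem` so the sketch runs without network access or credentials.

```python
def open_with_anonymous_fallback(path, make_fs, retry_on=(OSError,)):
    """Try a credentialed open first; retry anonymously on failure.

    make_fs is a hypothetical factory that accepts anon=True/False and
    returns an object with an open(path) method.
    """
    try:
        return make_fs(anon=False).open(path)
    except retry_on:
        # Credentials are missing or not valid for this bucket; a public
        # bucket can still be read with an anonymous connection.
        return make_fs(anon=True).open(path)


class FakeFS:
    """Stand-in filesystem (assumption): only anonymous access succeeds."""

    def __init__(self, anon):
        self.anon = anon

    def open(self, path):
        if not self.anon:
            raise OSError("credentials rejected for " + path)
        return "anonymous handle for " + path


print(open_with_anonymous_fallback("pandas-test/tips.csv", FakeFS))
```

In the commit, the retried exceptions are `OSError` and botocore's `NoCredentialsError`; the `retry_on` parameter here is just a way to express that tuple without importing botocore.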

pandas/util/print_versions.py

+1 -1
@@ -94,7 +94,7 @@ def show_versions(as_json=False):
         ("pymysql", lambda mod: mod.__version__),
         ("psycopg2", lambda mod: mod.__version__),
         ("jinja2", lambda mod: mod.__version__),
-        ("boto", lambda mod: mod.__version__),
+        ("s3fs", lambda mod: mod.__version__),
         ("pandas_datareader", lambda mod: mod.__version__)
     ]
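
The list edited above pairs each optional module name with a function that extracts its version. How such a list might be consumed is not shown in this diff; the following is a minimal sketch, assuming each entry is imported by name and reported as `None` when unavailable. `collect_versions` is a hypothetical name, not the pandas implementation.

```python
import importlib


def collect_versions(deps):
    """Return {module_name: version or None} for optional dependencies.

    deps is a list of (module_name, version_getter) pairs, mirroring the
    shape of the list in the diff. Hypothetical helper for illustration.
    """
    versions = {}
    for modname, get_version in deps:
        try:
            mod = importlib.import_module(modname)
            versions[modname] = get_version(mod)
        except Exception:
            # Module not installed, or it exposes no usable version.
            versions[modname] = None
    return versions


deps = [
    # Standard-library module used so the sketch runs anywhere.
    ("platform", lambda mod: mod.python_version()),
    # Deliberately missing module to show the None fallback.
    ("no_such_module_xyz", lambda mod: mod.__version__),
]
print(collect_versions(deps))
```

Keeping the version getter as a per-module callable lets modules that expose their version differently share the same loop.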
