Commit e13739a
DOC: edits in io.rst
1 parent 6b04681

1 file changed: doc/source/io.rst (+43, -57 lines)
@@ -2185,13 +2185,13 @@ argument to ``to_excel`` and to ``ExcelWriter``. The built-in engines are:

    df.to_excel('path_to_file.xlsx', sheet_name='Sheet1')

+.. _io.excel_writing_buffer:
+
 Writing Excel Files to Memory
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 .. versionadded:: 0.17

-.. _io.excel_writing_buffer
-
 Pandas supports writing Excel files to buffer-like objects such as ``StringIO`` or
 ``BytesIO`` using :class:`~pandas.io.excel.ExcelWriter`.

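The buffer-writing pattern this hunk documents can be sketched without an Excel engine installed, using ``to_csv`` with ``StringIO``; the Excel case is the same idea with ``BytesIO`` and ``ExcelWriter``:

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# Write into an in-memory text buffer instead of a file on disk.
buf = StringIO()
df.to_csv(buf, index=False)

contents = buf.getvalue()
print(contents.splitlines()[0])  # header row: 'A,B'
```

The buffer's contents can then be sent anywhere a file's bytes would go (an HTTP response, a zip archive) without touching disk.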
@@ -2412,7 +2412,7 @@ for some advanced strategies

 .. warning::

-As of version 0.17.0, ``HDFStore`` will not drop rows that have all missing values by default. Previously, if all values (except the index) were missing, ``HDFStore`` would not write those rows to disk.
+   As of version 0.17.0, ``HDFStore`` will not drop rows that have all missing values by default. Previously, if all values (except the index) were missing, ``HDFStore`` would not write those rows to disk.

 .. ipython:: python
    :suppress:
@@ -2511,7 +2511,7 @@ similar to how ``read_csv`` and ``to_csv`` work. (new in 0.11.0)
    os.remove('store_tl.h5')


-As of version 0.17.0, HDFStore will no longer drop rows that are all missing by default. This behavior can be enabled by setting ``dropna=True``.
+As of version 0.17.0, HDFStore will no longer drop rows that are all missing by default. This behavior can be enabled by setting ``dropna=True``.

 .. ipython:: python
    :suppress:
@@ -2520,16 +2520,16 @@ As of version 0.17.0, HDFStore will no longer drop rows that are all missing by

 .. ipython:: python

-   df_with_missing = pd.DataFrame({'col1':[0, np.nan, 2],
+   df_with_missing = pd.DataFrame({'col1':[0, np.nan, 2],
                                    'col2':[1, np.nan, np.nan]})
    df_with_missing

-   df_with_missing.to_hdf('file.h5', 'df_with_missing',
+   df_with_missing.to_hdf('file.h5', 'df_with_missing',
                           format = 'table', mode='w')
-
+
    pd.read_hdf('file.h5', 'df_with_missing')

-   df_with_missing.to_hdf('file.h5', 'df_with_missing',
+   df_with_missing.to_hdf('file.h5', 'df_with_missing',
                           format = 'table', mode='w', dropna=True)
    pd.read_hdf('file.h5', 'df_with_missing')
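What ``dropna=True`` filters out can be sketched without PyTables: per the text above, it drops rows whose values are all missing, which matches plain ``DataFrame.dropna(how='all')``:

```python
import numpy as np
import pandas as pd

df_with_missing = pd.DataFrame({'col1': [0, np.nan, 2],
                                'col2': [1, np.nan, np.nan]})

# dropna=True in to_hdf skips rows that are entirely missing;
# dropna(how='all') applies the same filter in memory.
written = df_with_missing.dropna(how='all')

print(len(df_with_missing), len(written))  # 3 2
```

Only the middle row (NaN in both columns) is dropped; the last row survives because ``col1`` has a value.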
@@ -2547,16 +2547,16 @@ This is also true for the major axis of a ``Panel``:
              [[np.nan, np.nan, np.nan], [np.nan,5,6]],
              [[np.nan, np.nan, np.nan],[np.nan,3,np.nan]]]

-   panel_with_major_axis_all_missing = Panel(matrix,
+   panel_with_major_axis_all_missing = Panel(matrix,
                items=['Item1', 'Item2','Item3'],
                major_axis=[1,2],
                minor_axis=['A', 'B', 'C'])

    panel_with_major_axis_all_missing

    panel_with_major_axis_all_missing.to_hdf('file.h5', 'panel',
-                                            dropna = True,
-                                            format='table',
+                                            dropna = True,
+                                            format='table',
                                             mode='w')
    reloaded = read_hdf('file.h5', 'panel')
    reloaded
@@ -3224,8 +3224,7 @@ Notes & Caveats
   ``PyTables`` only supports concurrent reads (via threading or
   processes). If you need reading and writing *at the same time*, you
   need to serialize these operations in a single thread in a single
-  process. You will corrupt your data otherwise. See the issue
-  (:`2397`) for more information.
+  process. You will corrupt your data otherwise. See the (:issue:`2397`) for more information.
 - If you use locks to manage write access between multiple processes, you
   may want to use :py:func:`~os.fsync` before releasing write locks. For
   convenience you can use ``store.flush(fsync=True)`` to do this for you.
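The fsync advice in this hunk is ordinary file-API behavior; here is a dependency-free sketch of the flush-then-fsync pattern on a plain file (``store.flush(fsync=True)`` wraps the same idea for an ``HDFStore`` file):

```python
import os
import tempfile

# Write, flush Python's user-space buffer, then ask the OS to push
# the bytes to stable storage before any lock would be released.
fd, path = tempfile.mkstemp()
try:
    with open(path, 'w') as f:
        f.write('payload')
        f.flush()             # drain Python's buffer to the OS
        os.fsync(f.fileno())  # force the OS to write to disk
    with open(path) as f:
        data = f.read()
finally:
    os.close(fd)
    os.remove(path)

print(data)  # 'payload'
```

Without the fsync, another process acquiring the lock could still see a file whose bytes are only in the page cache on some systems.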
@@ -3256,34 +3255,19 @@ DataTypes
 ``HDFStore`` will map an object dtype to the ``PyTables`` underlying
 dtype. This means the following types are known to work:

-- floating : ``float64, float32, float16`` *(using* ``np.nan`` *to
-  represent invalid values)*
-- integer : ``int64, int32, int8, uint64, uint32, uint8``
-- bool
-- datetime64[ns] *(using* ``NaT`` *to represent invalid values)*
-- object : ``strings`` *(using* ``np.nan`` *to represent invalid
-  values)*
-
-Currently, ``unicode`` and ``datetime`` columns (represented with a
-dtype of ``object``), **WILL FAIL**. In addition, even though a column
-may look like a ``datetime64[ns]``, if it contains ``np.nan``, this
-**WILL FAIL**. You can try to convert datetimelike columns to proper
-``datetime64[ns]`` columns, that possibly contain ``NaT`` to represent
-invalid values. (Some of these issues have been addressed and these
-conversion may not be necessary in future versions of pandas)
-
-.. ipython:: python
-
-   import datetime
-   df = DataFrame(dict(datelike=Series([datetime.datetime(2001, 1, 1),
-                                        datetime.datetime(2001, 1, 2), np.nan])))
-   df
-   df.dtypes
-
-   # to convert
-   df['datelike'] = Series(df['datelike'].values, dtype='M8[ns]')
-   df
-   df.dtypes
+====================================================== =========================
+Type                                                   Represents missing values
+====================================================== =========================
+floating : ``float64, float32, float16``               ``np.nan``
+integer : ``int64, int32, int8, uint64,uint32, uint8``
+boolean
+``datetime64[ns]``                                     ``NaT``
+``timedelta64[ns]``                                    ``NaT``
+categorical : see the section below
+object : ``strings``                                   ``np.nan``
+====================================================== =========================
+
+``unicode`` columns are not supported, and **WILL FAIL**.

 .. _io.hdf5-categorical:

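The missing-value sentinels in the new table can be illustrated with plain pandas, no PyTables required; a small sketch:

```python
import numpy as np
import pandas as pd

# Float columns use np.nan as the missing-value sentinel,
# datetime64[ns] columns use NaT.
floats = pd.Series([1.5, np.nan])
dates = pd.Series(pd.to_datetime(['2001-01-01', None]))

print(floats.dtype, dates.dtype)  # float64 datetime64[ns]
print(dates.isna().tolist())      # [False, True]
```

Keeping the dtype intact while flagging missing entries is exactly what lets ``HDFStore`` map each column to a fixed ``PyTables`` storage type.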
@@ -3813,22 +3797,22 @@ connecting to.

 .. code-block:: python

-   from sqlalchemy import create_engine
+   from sqlalchemy import create_engine

-   engine = create_engine('postgresql://scott:tiger@localhost:5432/mydatabase')
+   engine = create_engine('postgresql://scott:tiger@localhost:5432/mydatabase')

-   engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')
+   engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')

-   engine = create_engine('oracle://scott:[email protected]:1521/sidname')
+   engine = create_engine('oracle://scott:[email protected]:1521/sidname')

-   engine = create_engine('mssql+pyodbc://mydsn')
+   engine = create_engine('mssql+pyodbc://mydsn')

-   # sqlite://<nohostname>/<path>
-   # where <path> is relative:
-   engine = create_engine('sqlite:///foo.db')
+   # sqlite://<nohostname>/<path>
+   # where <path> is relative:
+   engine = create_engine('sqlite:///foo.db')

-   # or absolute, starting with a slash:
-   engine = create_engine('sqlite:////absolute/path/to/foo.db')
+   # or absolute, starting with a slash:
+   engine = create_engine('sqlite:////absolute/path/to/foo.db')

 For more information see the examples the SQLAlchemy `documentation <http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html>`__

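As a runnable counterpart that needs no extra database driver, pandas also accepts a plain ``sqlite3`` DBAPI connection directly (SQLite only; the table name ``test`` here is made up for illustration):

```python
import sqlite3

import pandas as pd

# In-memory SQLite database; for other backends a SQLAlchemy
# engine (as in the URLs above) is required instead.
conn = sqlite3.connect(':memory:')

df = pd.DataFrame({'A': [1.0, 2.0], 'B': ['x', 'y']})
df.to_sql('test', conn, index=False)

roundtrip = pd.read_sql('SELECT * FROM test', conn)
print(roundtrip.shape)  # (2, 2)
```

This round-trips the frame through SQL without any file on disk, which is handy for quick checks of column mapping.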
@@ -3939,8 +3923,8 @@ will produce the dictionary representation of the schema.

 .. code-block:: python

-   df = pandas.DataFrame({'A': [1.0]})
-   gbq.generate_bq_schema(df, default_type='STRING')
+   df = pandas.DataFrame({'A': [1.0]})
+   gbq.generate_bq_schema(df, default_type='STRING')

 .. warning::

@@ -4145,14 +4129,16 @@ This is an informal comparison of various IO methods, using pandas 0.13.1.

 .. code-block:: python

-   In [3]: df = DataFrame(randn(1000000,2),columns=list('AB'))
+   In [1]: df = DataFrame(randn(1000000,2),columns=list('AB'))
+
+   In [2]: df.info()
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 1000000 entries, 0 to 999999
    Data columns (total 2 columns):
-   A    1000000  non-null values
-   B    1000000  non-null values
+   A    1000000 non-null float64
+   B    1000000 non-null float64
    dtypes: float64(2)
-
+   memory usage: 22.9 MB

 Writing

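The new ``memory usage`` line can be sanity-checked with ``DataFrame.memory_usage``; exact totals vary with the index type and pandas version, so this sketch only checks the order of magnitude:

```python
import numpy as np
import pandas as pd

# Two float64 columns of 1,000,000 rows: ~16 MB of data,
# plus whatever the index itself occupies.
df = pd.DataFrame(np.random.randn(1_000_000, 2), columns=list('AB'))

total_bytes = df.memory_usage(index=True).sum()
print(f"{total_bytes / 1e6:.1f} MB")
```

A ``RangeIndex`` adds almost nothing, while a materialized ``Int64Index`` (as in the snippet above) adds another 8 MB, which is roughly the gap between 16 MB and the 22.9 MB reported.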
0 commit comments