=============
Release Notes
=============
This is the list of changes to pandas between each release. For full details,
see the commit logs at http://github.com/pydata/pandas
What is it
----------
pandas is a Python package providing fast, flexible, and expressive data
structures designed to make working with “relational” or “labeled” data both
easy and intuitive. It aims to be the fundamental high-level building block for
doing practical, real world data analysis in Python. Additionally, it has the
broader goal of becoming the most powerful and flexible open source data
analysis / manipulation tool available in any language.
Where to get it
---------------
* Source code: http://github.com/pydata/pandas
* Binary installers on PyPI: http://pypi.python.org/pypi/pandas
* Documentation: http://pandas.pydata.org
pandas 0.11.1
=============
**Release date:** not-yet-released
**New features**
- ``pd.read_html()`` can now parse HTML strings, files or urls and
returns a list of ``DataFrame`` s courtesy of @cpcloud. (GH3477_, GH3605_,
GH3606_)
- Support for reading Amazon S3 files. (GH3504_)
- Added module for reading and writing Stata files: pandas.io.stata (GH1512_)
includes ``to_stata`` DataFrame method, and a ``read_stata`` top-level reader
- Added support for writing in ``to_csv`` and reading in ``read_csv``,
multi-index columns. The ``header`` option in ``read_csv`` now accepts a
list of the rows from which to read the index. Added the option,
``tupleize_cols`` to provide compatibility for the pre 0.11.1 behavior of
writing and reading multi-index columns via a list of tuples. The default in
0.11.1 is to write lists of tuples and *not* interpret list of tuples as a
multi-index column.
Note: The default value will change in 0.12 to make the default *to* write and
read multi-index columns in the new format. (GH3571_, GH1651_, GH3141_)
- Add iterator to ``Series.str`` (GH3638_)
- ``pd.set_option()`` now allows N option, value pairs (GH3667_).
- Added keyword parameters for different types of scatter_matrix subplots
- A ``filter`` method on grouped Series or DataFrames returns a subset of
the original (GH3680_, GH919_)
- Access to historical Google Finance data in pandas.io.data (GH3814_)
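The group ``filter`` method listed above can be sketched as follows. This is only an illustration: the frame contents and the column names ``key`` and ``value`` are hypothetical, and the snippet assumes a pandas version that provides ``GroupBy.filter``.

```python
import pandas as pd

# Hypothetical data: three groups of different sizes.
df = pd.DataFrame({"key": ["a", "a", "b", "c", "c", "c"],
                   "value": [1, 2, 3, 4, 5, 6]})

# Keep only the rows that belong to groups with at least two members.
# The result is a subset of the original rows, not an aggregate.
filtered = df.groupby("key").filter(lambda g: len(g) >= 2)

print(filtered)
```

The callable receives each group and returns a single boolean; rows of groups for which it returns ``False`` are dropped, and the surviving rows keep their original index labels.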
**Improvements to existing features**
- Fixed various issues with internal pprinting code; the repr() for various objects
including Timestamp and Index now produces valid python code strings and
can be used to recreate the object (GH3038_, GH3379_, GH3251_, GH3460_)
- ``convert_objects`` now accepts a ``copy`` parameter (defaults to ``True``)
- ``HDFStore``
- will retain index attributes (freq,tz,name) on recreation (GH3499_)
- will warn with a ``AttributeConflictWarning`` if you are attempting to append
an index with a different frequency than the existing, or attempting
to append an index with a different name than the existing
- support datelike columns with a timezone as data_columns (GH2852_)
- table writing performance improvements.
- support python3 (via ``PyTables 3.0.0``) (GH3750_)
- Add modulo operator to Series, DataFrame
- Add ``date`` method to DatetimeIndex
- Simplified the API and added a describe method to Categorical
- ``melt`` now accepts the optional parameters ``var_name`` and ``value_name``
to specify custom column names of the returned DataFrame (GH3649_),
thanks @hoechenberger
- clipboard functions use pyperclip (no dependencies on Windows, alternative
dependencies offered for Linux) (GH3837_).
- Plotting functions now raise a ``TypeError`` before trying to plot anything
if the associated objects have a dtype of ``object`` (GH1818_,
GH3572_, GH3911_, GH3912_), but they will try to convert object arrays to
numeric arrays if possible so that you can still plot, for example, an
object array with floats. This happens before any drawing takes place, which
eliminates any spurious plots from showing up.
- Added Faq section on repr display options, to help users customize their setup.
- ``where`` operations that result in block splitting are much faster (GH3733_)
- Series and DataFrame hist methods now take a ``figsize`` argument (GH3834_)
- DatetimeIndexes no longer try to convert mixed-integer indexes during join
operations (GH3877_)
- Add ``unit`` keyword to ``Timestamp`` and ``to_datetime`` to enable passing of
integers or floats that are in an epoch unit of ``s, ms, us, ns``
(e.g. unix timestamps or epoch ``s``, with fractional seconds allowed) (GH3540_)
- DataFrame corr method (spearman) is now cythonized.
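Two of the improvements above, the ``unit`` keyword and the ``var_name`` / ``value_name`` parameters of ``melt``, might be used roughly as below. The epoch value and the column names (``id``, ``height``, ``weight``, ``measure``, ``reading``) are made up for illustration, and the sketch assumes a pandas version that supports both features.

```python
import pandas as pd

# An integer epoch value in seconds, converted with the ``unit`` keyword.
ts_seconds = pd.to_datetime(1370000000, unit="s")

# The same instant plus half a second, passed as a float number of seconds.
ts_fractional = pd.Timestamp(1370000000.5, unit="s")

# melt with custom names for the variable and value columns.
wide = pd.DataFrame({"id": [1, 2],
                     "height": [1.5, 1.8],
                     "weight": [60.0, 75.0]})
long = pd.melt(wide, id_vars=["id"], var_name="measure", value_name="reading")

print(ts_seconds, ts_fractional)
print(long)
```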
**API Changes**
- ``HDFStore``
- When removing an object, ``remove(key)`` raises
``KeyError`` if the key is not a valid store object.
- raise a ``TypeError`` on passing ``where`` or ``columns``
to select with a Storer; these are invalid parameters at this time
- can now specify an ``encoding`` option to ``append/put``
to enable alternate encodings (GH3750_)
- enable support for ``iterator/chunksize`` with ``read_hdf``
- The repr() for (Multi)Index now obeys display.max_seq_items rather
than numpy threshold print options. (GH3426_, GH3466_)
- Added mangle_dupe_cols option to read_table/csv, allowing users
to control legacy behaviour re dupe cols (A, A.1, A.2 vs A, A ) (GH3468_)
Note: The default value will change in 0.12 to the "no mangle" behaviour.
If your code relies on the current behaviour, explicitly specify mangle_dupe_cols=True
in your calls.
- Do not allow astypes on ``datetime64[ns]`` except to ``object``, and
``timedelta64[ns]`` to ``object/int`` (GH3425_)
- The behavior of ``datetime64`` dtypes has changed with respect to certain
so-called reduction operations (GH3726_). The following operations now
raise a ``TypeError`` when performed on a ``Series`` and return an *empty*
``Series`` when performed on a ``DataFrame`` similar to performing these
operations on, for example, a ``DataFrame`` of ``slice`` objects:
- sum, prod, mean, std, var, skew, kurt, corr, and cov
- Do not allow datetimelike/timedeltalike creation except with valid types
(e.g. cannot pass ``datetime64[ms]``) (GH3423_)
- Add ``squeeze`` keyword to ``groupby`` to allow reduction from
DataFrame -> Series if groups are unique. Regression from 0.10.1,
partial revert on (GH2893_) with (GH3596_)
- Raise on ``iloc`` when boolean indexing with a label based indexer mask
e.g. a boolean Series, even with integer labels, will raise. Since ``iloc``
is purely positional based, the labels on the Series are not alignable (GH3631_)
- The ``raise_on_error`` option to plotting methods is obviated by GH3572_,
so it is removed. Plots now always raise when data cannot be plotted or the
object being plotted has a dtype of ``object``.
- ``DataFrame.interpolate()`` is now deprecated. Please use
``DataFrame.fillna()`` and ``DataFrame.replace()`` instead (GH3582_,
GH3675_, GH3676_).
- the ``method`` and ``axis`` arguments of ``DataFrame.replace()`` are
deprecated
- ``DataFrame.replace`` 's ``infer_types`` parameter is removed and now
performs conversion by default. (GH3907_)
- Deprecated display.height; display.width is now only a formatting option and
does not control triggering of summary, similar to < 0.11.0.
- Add the keyword ``allow_duplicates`` to ``DataFrame.insert`` to allow a duplicate column
to be inserted if ``True``, default is ``False`` (same as prior to 0.11.1) (GH3679_)
- io API changes
- added ``pandas.io.api`` for i/o imports
- moved ``Excel`` support to ``pandas.io.excel``
- added top-level ``pd.read_sql`` and ``to_sql`` DataFrame methods
- moved ``clipboard`` support to ``pandas.io.clipboard``
- replace top-level and instance methods ``save`` and ``load`` with top-level ``read_pickle`` and
``to_pickle`` instance method, ``save`` and ``load`` will give deprecation warning.
- Implement ``__nonzero__`` for ``NDFrame`` objects (GH3691_, GH3696_)
- ``as_matrix`` with mixed signed and unsigned dtypes will result in 2 x the lcd of the unsigned
as an int, maxing with ``int64``, to avoid precision issues (GH3733_)
- ``na_values`` in a list provided to ``read_csv/read_excel`` will match string and numeric versions
e.g. ``na_values=['99']`` will match 99 whether the column ends up being int, float, or string (GH3611_)
- The ``flavor`` argument of ``read_html`` now defaults to ``None``, and parsing
falls back on ``bs4`` + ``html5lib`` when lxml fails to parse. A list of parsers
to try until success is also valid
- more consistency in the to_datetime return types (given string/array of string inputs) (GH3888_)
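The ``allow_duplicates`` keyword of ``DataFrame.insert`` and the ``to_pickle`` / ``read_pickle`` pair can be sketched as follows. The frame, the temporary file name, and the assumption that the rejected insert surfaces as a ``ValueError`` (true of the pandas versions this sketch was written against, not stated in these notes) are all illustrative.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# By default, inserting a column whose name already exists is rejected.
try:
    df.insert(0, "a", [10, 20])
    raised = False
except ValueError:  # assumed exception type; may differ in very old versions
    raised = True

# With allow_duplicates=True the duplicate column is accepted.
df.insert(0, "a", [10, 20], allow_duplicates=True)

# Round-trip through a pickle file using the replacements for save/load.
path = os.path.join(tempfile.mkdtemp(), "frame.pkl")
df.to_pickle(path)
restored = pd.read_pickle(path)

print(raised, list(df.columns), restored.equals(df))
```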
**Bug Fixes**
- Fixed an esoteric excel reading bug, xlrd>= 0.9.0 now required for excel
support. Should provide python3 support (for reading) which has been
lacking. (GH3164_)
- Allow unioning of date ranges sharing a timezone (GH3491_)
- Fix to_csv issue when having a large number of rows and ``NaT`` in some
columns (GH3437_)
- ``.loc`` was not raising when passed an integer list (GH3449_)
- Unordered time series selection was misbehaving when using label slicing (GH3448_)
- Fix sorting in a frame with a list of columns which contains datetime64[ns] dtypes (GH3461_)
- DataFrames fetched via FRED now handle '.' as a NaN. (GH3469_)
- Fix regression in a DataFrame apply with axis=1, objects were not being converted back
to base dtypes correctly (GH3480_)
- Fix issue when storing uint dtypes in an HDFStore. (GH3493_)
- Non-unique index support clarified (GH3468_)
- Addressed handling of dupe columns in df.to_csv new and old (GH3454_, GH3457_)
- Fix failure when assigning a new index to a DataFrame with a duplicate index (GH3468_)
- Fix construction of a DataFrame with a duplicate index
- ref_locs support to allow duplicative indices across dtypes,
allows iget support to always find the index (even across dtypes) (GH2194_)
- applymap on a DataFrame with a non-unique index now works
(removed warning) (GH2786_), and fix (GH3230_)
- Fix to_csv to handle non-unique columns (GH3495_)
- Duplicate indexes with getitem will return items in the correct order (GH3455_, GH3457_)
and handle missing elements like unique indices (GH3561_)
- Duplicate indexes with an empty DataFrame.from_records will return a correct frame (GH3562_)
- Concat producing non-unique columns when duplicates are across dtypes is fixed (GH3602_)
- Non-unique indexing with a slice via ``loc`` and friends fixed (GH3659_)
- Allow insert/delete to non-unique columns (GH3679_)
- Extend ``reindex`` to correctly deal with non-unique indices (GH3679_)
- ``DataFrame.itertuples()`` now works with frames with duplicate column
names (GH3873_)
- Fixed bug in groupby with empty series referencing a variable before assignment. (GH3510_)
- Fixed bug in mixed-frame assignment with aligned series (GH3492_)
- Fixed bug where selecting month/quarter/year from a series would not select the time element
on the last day (GH3546_)
- Fixed a couple of MultiIndex rendering bugs in df.to_html() (GH3547_, GH3553_)
- Properly convert np.datetime64 objects in a Series (GH3416_)
- Raise a ``TypeError`` on invalid datetime/timedelta operations
e.g. add datetimes, multiply timedelta x datetime
- Fix ``.diff`` on datelike and timedelta operations (GH3100_)
- ``combine_first`` not returning the same dtype in cases where it can (GH3552_)
- Fixed bug with ``Panel.transpose`` argument aliases (GH3556_)
- Fixed platform bug in ``PeriodIndex.take`` (GH3579_)
- Fixed bug in incorrect conversion of datetime64[ns] in ``combine_first`` (GH3593_)
- Fixed bug in reset_index with ``NaN`` in a multi-index (GH3586_)
- ``fillna`` methods now raise a ``TypeError`` when the ``value`` parameter
is a ``list`` or ``tuple``.
- Fixed bug where a time-series was being selected in preference to an actual column name
in a frame (GH3594_)
- Fix modulo and integer division on Series,DataFrames to act similarly to ``float`` dtypes to return
``np.nan`` or ``np.inf`` as appropriate (GH3590_)
- Fix incorrect dtype on groupby with ``as_index=False`` (GH3610_)
- Fix ``read_csv/read_excel`` to correctly encode identical na_values, e.g. ``na_values=[-999.0,-999]``
was failing (GH3611_)
- Disable HTML output in qtconsole again. (GH3657_)
- Reworked the new repr display logic, which users found confusing. (GH3663_)
- Fix indexing issue in ndim >= 3 with ``iloc`` (GH3617_)
- Correctly parse date columns with embedded (nan/NaT) into datetime64[ns] dtype in ``read_csv``
when ``parse_dates`` is specified (GH3062_)
- Fix not consolidating before to_csv (GH3624_)
- Fix alignment issue when setitem in a DataFrame with a piece of a DataFrame (GH3626_) or
a mixed DataFrame and a Series (GH3668_)
- Fix plotting of unordered DatetimeIndex (GH3601_)
- ``sql.write_frame`` failing when writing a single column to sqlite (GH3628_),
thanks to @stonebig
- Fix pivoting with ``nan`` in the index (GH3558_)
- Fix running of bs4 tests when it is not installed (GH3605_)
- Fix parsing of html table (GH3606_)
- ``read_html()`` now only allows a single backend: ``html5lib`` (GH3616_)
- ``convert_objects`` with ``convert_dates='coerce'`` was parsing some single-letter strings
into today's date
- ``DataFrame.from_records`` did not accept empty recarrays (GH3682_)
- ``DataFrame.to_csv`` will succeed with the deprecated option ``nanRep``, @tdsmith
- ``DataFrame.to_html`` and ``DataFrame.to_latex`` now accept a path for
their first argument (GH3702_)
- Fix file tokenization error with \r delimiter and quoted fields (GH3453_)
- Groupby transform with item-by-item not upcasting correctly (GH3740_)
- Incorrectly read a HDFStore multi-index Frame with a column specification (GH3748_)
- ``read_html`` now correctly skips tests (GH3741_)
- PandasObjects raise TypeError when trying to hash (GH3882_)
- Fix incorrect arguments passed to concat that are not list-like (e.g. concat(df1,df2)) (GH3481_)
- Correctly parse when passed the ``dtype=str`` (or other variable-len string dtypes)
in ``read_csv`` (GH3795_)
- Fix index name not propagating when using ``loc/ix`` (GH3880_)
- Fix groupby not converting dtypes when applying a custom function that
returns a DataFrame (GH3911_)
- Fixed a bug where ``DataFrame.replace`` with a compiled regular expression
in the ``to_replace`` argument wasn't working (GH3907_)
- Fixed ``__truediv__`` in Python 2.7 with ``numexpr`` installed to actually do true division when dividing
two integer arrays with at least 10000 cells total (GH3764_)
- Indexing with a string with seconds resolution not selecting from a time index (GH3925_)
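Two of the fixes above change observable results and can be checked with a short sketch: integer division and modulo by zero on integer data, and ``fillna`` rejecting a list. The values are hypothetical, and the snippet assumes a pandas version that includes these fixes.

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, -1, 0])

# Integer division by zero yields +/-inf (NaN for 0 // 0) instead of raising.
floordiv = ints // 0

# Modulo by zero yields NaN.
modulo = ints % 0

# fillna rejects a list (or tuple) passed as ``value``.
try:
    pd.Series([1.0, np.nan]).fillna([0.0])
    fillna_raised = False
except TypeError:
    fillna_raised = True

print(floordiv.tolist(), modulo.tolist(), fillna_raised)
```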
.. _GH3164: https://github.com/pydata/pandas/issues/3164
.. _GH2786: https://github.com/pydata/pandas/issues/2786
.. _GH2194: https://github.com/pydata/pandas/issues/2194
.. _GH3230: https://github.com/pydata/pandas/issues/3230
.. _GH3425: https://github.com/pydata/pandas/issues/3425
.. _GH3416: https://github.com/pydata/pandas/issues/3416
.. _GH3423: https://github.com/pydata/pandas/issues/3423
.. _GH3251: https://github.com/pydata/pandas/issues/3251
.. _GH3379: https://github.com/pydata/pandas/issues/3379
.. _GH3480: https://github.com/pydata/pandas/issues/3480
.. _GH3481: https://github.com/pydata/pandas/issues/3481
.. _GH2852: https://github.com/pydata/pandas/issues/2852
.. _GH3100: https://github.com/pydata/pandas/issues/3100
.. _GH3454: https://github.com/pydata/pandas/issues/3454
.. _GH3457: https://github.com/pydata/pandas/issues/3457
.. _GH3491: https://github.com/pydata/pandas/issues/3491
.. _GH3426: https://github.com/pydata/pandas/issues/3426
.. _GH3466: https://github.com/pydata/pandas/issues/3466
.. _GH3038: https://github.com/pydata/pandas/issues/3038
.. _GH3510: https://github.com/pydata/pandas/issues/3510
.. _GH3547: https://github.com/pydata/pandas/issues/3547
.. _GH3553: https://github.com/pydata/pandas/issues/3553
.. _GH3437: https://github.com/pydata/pandas/issues/3437
.. _GH3468: https://github.com/pydata/pandas/issues/3468
.. _GH3453: https://github.com/pydata/pandas/issues/3453
.. _GH3455: https://github.com/pydata/pandas/issues/3455
.. _GH3477: https://github.com/pydata/pandas/issues/3477
.. _GH3460: https://github.com/pydata/pandas/issues/3460
.. _GH3461: https://github.com/pydata/pandas/issues/3461
.. _GH3546: https://github.com/pydata/pandas/issues/3546
.. _GH3448: https://github.com/pydata/pandas/issues/3448
.. _GH3499: https://github.com/pydata/pandas/issues/3499
.. _GH3495: https://github.com/pydata/pandas/issues/3495
.. _GH3492: https://github.com/pydata/pandas/issues/3492
.. _GH3540: https://github.com/pydata/pandas/issues/3540
.. _GH3552: https://github.com/pydata/pandas/issues/3552
.. _GH3562: https://github.com/pydata/pandas/issues/3562
.. _GH3586: https://github.com/pydata/pandas/issues/3586
.. _GH3561: https://github.com/pydata/pandas/issues/3561
.. _GH3493: https://github.com/pydata/pandas/issues/3493
.. _GH3579: https://github.com/pydata/pandas/issues/3579
.. _GH3593: https://github.com/pydata/pandas/issues/3593
.. _GH3556: https://github.com/pydata/pandas/issues/3556
.. _GH3594: https://github.com/pydata/pandas/issues/3594
.. _GH3590: https://github.com/pydata/pandas/issues/3590
.. _GH3610: https://github.com/pydata/pandas/issues/3610
.. _GH3596: https://github.com/pydata/pandas/issues/3596
.. _GH3617: https://github.com/pydata/pandas/issues/3617
.. _GH3435: https://github.com/pydata/pandas/issues/3435
.. _GH3611: https://github.com/pydata/pandas/issues/3611
.. _GH3558: https://github.com/pydata/pandas/issues/3558
.. _GH3062: https://github.com/pydata/pandas/issues/3062
.. _GH3624: https://github.com/pydata/pandas/issues/3624
.. _GH3626: https://github.com/pydata/pandas/issues/3626
.. _GH3601: https://github.com/pydata/pandas/issues/3601
.. _GH3631: https://github.com/pydata/pandas/issues/3631
.. _GH3602: https://github.com/pydata/pandas/issues/3602
.. _GH1512: https://github.com/pydata/pandas/issues/1512
.. _GH3571: https://github.com/pydata/pandas/issues/3571
.. _GH1651: https://github.com/pydata/pandas/issues/1651
.. _GH3141: https://github.com/pydata/pandas/issues/3141
.. _GH3628: https://github.com/pydata/pandas/issues/3628
.. _GH3638: https://github.com/pydata/pandas/issues/3638
.. _GH3668: https://github.com/pydata/pandas/issues/3668
.. _GH3605: https://github.com/pydata/pandas/issues/3605
.. _GH3606: https://github.com/pydata/pandas/issues/3606
.. _GH3659: https://github.com/pydata/pandas/issues/3659
.. _GH3649: https://github.com/pydata/pandas/issues/3649
.. _GH3679: https://github.com/pydata/pandas/issues/3679
.. _GH3616: https://github.com/pydata/pandas/issues/3616
.. _GH1818: https://github.com/pydata/pandas/issues/1818
.. _GH3572: https://github.com/pydata/pandas/issues/3572
.. _GH3582: https://github.com/pydata/pandas/issues/3582
.. _GH3676: https://github.com/pydata/pandas/issues/3676
.. _GH3675: https://github.com/pydata/pandas/issues/3675
.. _GH3682: https://github.com/pydata/pandas/issues/3682
.. _GH3702: https://github.com/pydata/pandas/issues/3702
.. _GH3691: https://github.com/pydata/pandas/issues/3691
.. _GH3696: https://github.com/pydata/pandas/issues/3696
.. _GH3667: https://github.com/pydata/pandas/issues/3667
.. _GH3733: https://github.com/pydata/pandas/issues/3733
.. _GH3740: https://github.com/pydata/pandas/issues/3740
.. _GH3748: https://github.com/pydata/pandas/issues/3748
.. _GH3741: https://github.com/pydata/pandas/issues/3741
.. _GH3750: https://github.com/pydata/pandas/issues/3750
.. _GH3726: https://github.com/pydata/pandas/issues/3726
.. _GH3795: https://github.com/pydata/pandas/issues/3795
.. _GH3814: https://github.com/pydata/pandas/issues/3814
.. _GH3834: https://github.com/pydata/pandas/issues/3834
.. _GH3873: https://github.com/pydata/pandas/issues/3873
.. _GH3877: https://github.com/pydata/pandas/issues/3877
.. _GH3880: https://github.com/pydata/pandas/issues/3880
.. _GH3911: https://github.com/pydata/pandas/issues/3911
.. _GH3907: https://github.com/pydata/pandas/issues/3907
.. _GH3912: https://github.com/pydata/pandas/issues/3912
.. _GH3764: https://github.com/pydata/pandas/issues/3764
.. _GH3888: https://github.com/pydata/pandas/issues/3888
.. _GH3925: https://github.com/pydata/pandas/issues/3925
pandas 0.11.0
=============
**Release date:** 2013-04-22
**New features**
- New documentation section, ``10 Minutes to Pandas``
- New documentation section, ``Cookbook``
- Allow mixed dtypes (e.g. ``float32/float64/int32/int16/int8``) to coexist in
DataFrames and propagate in operations
- Add function to pandas.io.data for retrieving stock index components from
Yahoo! finance (GH2795_)
- Support slicing with time objects (GH2681_)
- Added ``.iloc`` attribute, to support strict integer based indexing,
analogous to ``.ix`` (GH2922_)
- Added ``.loc`` attribute, to support strict label based indexing, analogous
to ``.ix`` (GH3053_)
- Added ``.iat`` attribute, to support fast scalar access via integers
(replaces ``iget_value/iset_value``)
- Added ``.at`` attribute, to support fast scalar access via labels (replaces
``get_value/set_value``)
- Moved functionality from ``irow,icol,iget_value/iset_value`` to ``.iloc`` indexer
(via ``_ixs`` methods in each object)
- Added support for expression evaluation using the ``numexpr`` library
- Added ``convert=boolean`` to ``take`` routines to translate negative
indices to positive, defaults to True
- Added to_series() method to indices, to facilitate the creation of indexers
(GH3275_)
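The four indexers introduced above can be compared side by side. This is a small sketch on a hypothetical frame (the labels ``x``, ``y``, ``z`` and the columns ``A``, ``B`` are made up), assuming a pandas version that provides ``.loc``, ``.iloc``, ``.at`` and ``.iat``.

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [1.5, 2.5, 3.5]},
                  index=["x", "y", "z"])

by_label = df.loc["y", "A"]      # label-based selection
by_position = df.iloc[1, 0]      # integer-position selection
fast_label = df.at["z", "B"]     # fast scalar access by label
fast_position = df.iat[2, 1]     # fast scalar access by position

# Slices differ: .loc includes the end label, .iloc excludes the end position.
loc_slice = df.loc["x":"y"]
iloc_slice = df.iloc[0:1]

print(by_label, by_position, fast_label, fast_position)
print(len(loc_slice), len(iloc_slice))
```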
**Improvements to existing features**
- Improved performance of df.to_csv() by up to 10x in some cases. (GH3059_)
- added ``blocks`` attribute to DataFrames, to return a dict of dtypes to
homogeneously dtyped DataFrames
- added keyword ``convert_numeric`` to ``convert_objects()`` to try to
convert object dtypes to numeric types (default is False)
- ``convert_dates`` in ``convert_objects`` can now be ``coerce`` which will
return a datetime64[ns] dtype with non-convertibles set as ``NaT``; will
preserve an all-nan object (e.g. strings), default is True (to perform
soft-conversion)
- Series print output now includes the dtype by default
- Optimize internal reindexing routines (GH2819_, GH2867_)
- ``describe_option()`` now reports the default and current value of options.
- Add ``format`` option to ``pandas.to_datetime`` with faster conversion of
strings that can be parsed with datetime.strptime
- Add ``axes`` property to ``Series`` for compatibility
- Add ``xs`` function to ``Series`` for compatibility
- Allow setitem in a frame where only mixed numerics are present (e.g. int
and float), (GH3037_)
- ``HDFStore``
- Provide dotted attribute access to ``get`` from stores
(e.g. store.df == store['df'])
- New keywords ``iterator=boolean``, and ``chunksize=number_in_a_chunk``
are provided to support iteration on ``select`` and
``select_as_multiple`` (GH3076_)
- support ``read_hdf/to_hdf`` API similar to ``read_csv/to_csv`` (GH3222_)
- Add ``squeeze`` method to possibly remove length 1 dimensions from an
object.

.. ipython:: python

   p = Panel(randn(3,4,4),items=['ItemA','ItemB','ItemC'],
             major_axis=date_range('20010102',periods=4),
             minor_axis=['A','B','C','D'])
   p
   p.reindex(items=['ItemA']).squeeze()
   p.reindex(items=['ItemA'],minor=['B']).squeeze()

- Improvement to Yahoo API access in ``pd.io.data.Options`` (GH2758_)
- added option `display.max_seq_items` to control the number of
elements printed per sequence when pprinting it. (GH2979_)
- added option `display.chop_threshold` to control display of small numerical
values. (GH2739_)
- added option `display.max_info_rows` to prevent verbose_info from being
calculated for frames above 1M rows (configurable). (GH2807_, GH2918_)
- value_counts() now accepts a "normalize" argument, for normalized
histograms. (GH2710_).
- DataFrame.from_records now accepts not only dicts but any instance of
the collections.Mapping ABC.
- Allow selection semantics via a string with a datelike index to work in both
Series and DataFrames (GH3070_)

.. ipython:: python

   idx = date_range("2001-10-1", periods=5, freq='M')
   ts = Series(np.random.rand(len(idx)),index=idx)
   ts['2001']
   df = DataFrame(dict(A = ts))
   df['2001']

- added option `display.mpl_style` providing a sleeker visual style
for plots. Based on https://gist.github.com/huyng/816622 (GH3075_).
- Improved performance across several core functions by taking memory
ordering of arrays into account. Courtesy of @stephenwlin (GH3130_)
- Improved performance of groupby transform method (GH2121_)
- Handle "ragged" CSV files missing trailing delimiters in rows with missing
fields when also providing explicit list of column names (so the parser
knows how many columns to expect in the result) (GH2981_)
- On a mixed DataFrame, allow setting with indexers with ndarray/DataFrame
on rhs (GH3216_)
- Treat boolean values as integers (values 1 and 0) for numeric
operations. (GH2641_)
- Add ``time`` method to DatetimeIndex (GH3180_)
- Return NA when using Series.str[...] for values that are not long enough
(GH3223_)
- Display cursor coordinate information in time-series plots (GH1670_)
- to_html() now accepts an optional "escape" argument to control reserved
HTML character escaping (enabled by default) and escapes ``&``, in addition
to ``<`` and ``>``. (GH2919_)
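A few of the smaller improvements above (``normalize`` in ``value_counts``, NA from out-of-range ``Series.str[...]`` access, and HTML escaping in ``to_html``) can be sketched together. The data is hypothetical and the snippet assumes a pandas version that includes these features.

```python
import pandas as pd

# Relative frequencies instead of raw counts.
freq = pd.Series(["a", "b", "a", "a"]).value_counts(normalize=True)

# Indexing past the end of a short string gives a missing value, not an error.
second_char = pd.Series(["abc", "a"]).str[1]

# Reserved HTML characters are escaped unless escape=False is passed.
html = pd.DataFrame({"col": ["a < b & c"]}).to_html()

print(freq)
print(second_char)
```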
**API Changes**
- Do not automatically upcast numeric specified dtypes to ``int64`` or
``float64`` (GH622_ and GH797_)
- DataFrame construction of lists and scalars, with no dtype present, will
result in casting to ``int64`` or ``float64``, regardless of platform.
This is not an apparent change in the API, but noting it.
- Guarantee that ``convert_objects()`` for Series/DataFrame always returns a
copy
- groupby operations will respect dtypes for numeric float operations
(float32/float64); other types will be operated on, and will try to cast
back to the input dtype (e.g. if an int is passed, as long as the output
doesn't have nans, then an int will be returned)
- backfill/pad/take/diff/ohlc will now support ``float32/int16/int8``
operations
- Block types will upcast as needed in where/masking operations (GH2793_)
- Series now automatically will try to set the correct dtype based on passed
datetimelike objects (datetime/Timestamp)
- timedelta64 are returned in appropriate cases (e.g. Series - Series,
when both are datetime64)
- mixed datetimes and objects (GH2751_) in a constructor will be cast
correctly
- astype on datetimes to object are now handled (as well as NaT
conversions to np.nan)
- all timedelta like objects will be correctly assigned to ``timedelta64``
with mixed ``NaN`` and/or ``NaT`` allowed
- arguments to DataFrame.clip were inconsistent to numpy and Series clipping
(GH2747_)
- util.testing.assert_frame_equal now checks the column and index names (GH2964_)
- Constructors will now return a more informative ValueError on failures
when invalid shapes are passed
- Don't suppress TypeError in GroupBy.agg (GH3238_)
- Methods return None when inplace=True (GH1893_)
- ``HDFStore``
- added the method ``select_column`` to select a single column from a table as a Series.
- deprecated the ``unique`` method, can be replicated by ``select_column(key,column).unique()``
- ``min_itemsize`` parameter will now automatically create data_columns for passed keys
- Downcast on pivot if possible (GH3283_), adds argument ``downcast`` to ``fillna``
- Introduced options `display.height/width` for explicitly specifying terminal
height/width in characters. Deprecated display.line_width, now replaced by display.width.
These defaults are in effect for scripts as well, so unless disabled, previously
very wide output will now be output as "expand_repr" style wrapped output.
- Various defaults for options (including display.max_rows) have been revised,
after a brief survey concluded they were wrong for everyone. Now at w=80,h=60.
- HTML repr output in IPython qtconsole is once again controlled by the option
`display.notebook_repr_html`, and on by default.
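
The datetime-related items above can be sketched as follows. This is a minimal
illustration, not code from the release: the variable names are hypothetical,
and dtypes are checked through NumPy's ``kind`` code because the exact time
unit may differ between pandas versions.

.. code-block:: python

    from datetime import datetime

    import pandas as pd

    # Plain datetime objects and Timestamps are both inferred as datetime64.
    start = pd.Series([datetime(2013, 1, 1), datetime(2013, 1, 2)])
    end = pd.Series([pd.Timestamp("2013-01-03"), pd.Timestamp("2013-01-05")])

    # Subtracting two datetime64 Series yields a timedelta64 Series.
    elapsed = end - start

    print(start.dtype.kind)    # NumPy kind code of the inferred dtype
    print(elapsed.dtype.kind)  # NumPy kind code of the difference
    print(elapsed.tolist())
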
**Bug Fixes**
- Fix seg fault on empty data frame when fillna with ``pad`` or ``backfill``
(GH2778_)
- Single element ndarrays of datetimelike objects are handled
(e.g. np.array(datetime(2001,1,1,0,0))), w/o dtype being passed
- 0-dim ndarrays with a passed dtype are handled correctly
(e.g. np.array(0.,dtype='float32'))
- Fix some boolean indexing inconsistencies in Series.__getitem__/__setitem__
(GH2776_)
- Fix issues with DataFrame and Series constructor with integers that
overflow ``int64`` and some mixed typed type lists (GH2845_)
- ``HDFStore``
- Fix weird PyTables error when using too many selectors in a where
also correctly filter on any number of values in a Term expression
(so not using numexpr filtering, but isin filtering)
- Internally, change all variables to be private-like (now have leading
underscore)
- Fixes for query parsing to correctly interpret boolean and != (GH2849_, GH2973_)
- Fixes for pathological case on SparseSeries with 0-len array and
compression (GH2931_)
- Fixes bug with writing rows if part of a block was all-nan (GH3012_)
- Exceptions are now ValueError or TypeError as needed
- A table will now raise if min_itemsize contains fields which are not queryables
- Bug showing up in applymap where some object type columns are converted (GH2909_)
had an incorrect default in convert_objects
- TimeDeltas
- Series ops with a Timestamp on the rhs were throwing an exception (GH2898_)
added tests for Series ops with datetimes,timedeltas,Timestamps, and datelike
Series on both lhs and rhs
- Fixed subtle timedelta64 inference issue on py3 & numpy 1.7.0 (GH3094_)
- Fixed some formatting issues on timedelta when negative
- Support null checking on timedelta64, representing (and formatting) with NaT
- Support setitem with np.nan value, converts to NaT
- Support min/max ops in a DataFrame (abs not working, nor do we error on non-supported ops)
- Support idxmin/idxmax/abs/max/min in a Series (GH2989_, GH2982_)
- Bug on in-place putmasking on an ``integer`` series that needs to be converted to
``float`` (GH2746_)
- Bug in argsort of ``datetime64[ns]`` Series with ``NaT`` (GH2967_)
- Bug in value_counts of ``datetime64[ns]`` Series (GH3002_)
- Fixed printing of ``NaT`` in an index
- Bug in idxmin/idxmax of ``datetime64[ns]`` Series with ``NaT`` (GH2982_)
- Bug in ``icol, take`` with negative indices was producing incorrect return
values (see GH2922_, GH2892_), also check for out-of-bounds indices (GH3029_)
- Bug in DataFrame column insertion when the column creation fails, existing frame is left in
an irrecoverable state (GH3010_)
- Bug in DataFrame update, combine_first where non-specified values could cause
dtype changes (GH3016_, GH3041_)
- Bug in groupby with first/last where dtypes could change (GH3041_, GH2763_)
- Formatting of an index that has ``nan`` was inconsistent or wrong (would fill from
other values), (GH2850_)
- Unstack of a frame with no nans would always cause dtype upcasting (GH2929_)
- Fix scalar datetime.datetime parsing bug in read_csv (GH3071_)
- Fixed slow printing of large DataFrames, due to inefficient dtype
reporting (GH2807_)
- Fixed a segfault when using a function as grouper in groupby (GH3035_)
- Fix pretty-printing of infinite data structures (closes GH2978_)
- Fixed exception when plotting timeseries bearing a timezone (closes GH2877_)
- str.contains ignored na argument (GH2806_)
- Substitute warning for segfault when grouping with categorical grouper
of mismatched length (GH3011_)
- Fix exception in SparseSeries.density (GH2083_)
- Fix upsampling bug with closed='left' and daily to daily data (GH3020_)
- Fixed missing tick bars on scatter_matrix plot (GH3063_)
- Fixed bug in Timestamp(d,tz=foo) when d is date() rather than datetime() (GH2993_)
- series.plot(kind='bar') now respects pylab color scheme (GH3115_)
- Fixed bug in reshape if not passed correct input, now raises TypeError (GH2719_)
- Fixed a bug where Series ctor did not respect ordering if OrderedDict passed in (GH3282_)
- Fix NameError issue on RESO_US (GH2787_)
- Allow selection in an *unordered* timeseries to work similarly
to an *ordered* timeseries (GH2437_).
- Fix ``.xs`` when called with ``axis=1`` and a level parameter (GH2903_)
- Timestamp now supports the class method fromordinal similar to datetimes (GH3042_)
- Fix issue with indexing a series with a boolean key and specifying a 1-len list on the rhs (GH2745_)
or a list on the rhs (GH3235_)
- Fixed bug in groupby apply when the kernel generates a list of arrays of unequal length (GH1738_)
- fixed handling of rolling_corr with center=True which could produce corr>1 (GH3155_)
- Fixed issues where indices can be passed as 'index/column' in addition to 0/1 for the axis parameter
- PeriodIndex.tolist now boxes to Period (GH3178_)
- PeriodIndex.get_loc KeyError now reports Period instead of ordinal (GH3179_)
- df.to_records bug when handling MultiIndex (GH3189_)
- Fix Series.__getitem__ segfault when index less than -length (GH3168_)
- Fix bug when using Timestamp as a date parser (GH2932_)
- Fix bug creating date range from Timestamp with time zone and passing same
time zone (GH2926_)
- Add comparison operators to Period object (GH2781_)
- Fix bug when concatenating two Series into a DataFrame when they have the
same name (GH2797_)
- Fix automatic color cycling when plotting consecutive timeseries
without color arguments (GH2816_)
- fixed bug in the pickling of PeriodIndex (GH2891_)
- Upcast/split blocks when needed in a mixed DataFrame when setitem
with an indexer (GH3216_)
- Invoking df.applymap on a dataframe with dupe cols now raises a ValueError (GH2786_)
- Apply with invalid returned indices raises the correct exception (GH2808_)
- Fixed a bug in plotting log-scale bar plots (GH3247_)
- df.plot() grid on/off now obeys the mpl default style, just like
series.plot(). (GH3233_)
- Fixed a bug in the legend of plotting.andrews_curves() (GH3278_)
- Produce a series on apply if we only generate a singular series and have
a simple index (GH2893_)
- Fix Python ascii file parsing when integer falls outside of floating point
spacing (GH3258_)
- fixed pretty printing of sets (GH3294_)
- Panel() and Panel.from_dict() now respect ordering when given an OrderedDict (GH3303_)
- DataFrame where with a datetimelike incorrectly selecting (GH3311_)
- Ensure index casts work even in Int64Index
- Fix set_index segfault when passing MultiIndex (GH3308_)
- Ensure pickles created in py2 can be read in py3
- Insert ellipsis in MultiIndex summary repr (GH3348_)
- Groupby will handle mutation among an input group's columns (and fallback
to non-fast apply) (GH3380_)
- Eliminated unicode errors on FreeBSD when using MPL GTK backend (GH3360_)
- Period.strftime should return unicode strings always (GH3363_)
- Respect passed read_* chunksize in get_chunk function (GH3406_)
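
A few of the fixes above are easy to check interactively. The following is a
minimal sketch, assuming a reasonably recent pandas; the sample values and
variable names are hypothetical and chosen only for illustration.

.. code-block:: python

    import datetime
    from collections import OrderedDict

    import pandas as pd

    # Timestamp.fromordinal mirrors datetime.date.fromordinal.
    ordinal = datetime.date(2013, 4, 22).toordinal()
    ts = pd.Timestamp.fromordinal(ordinal)
    print(ts == pd.Timestamp("2013-04-22"))

    # Period objects of the same frequency support comparison operators.
    jan = pd.Period("2013-01", freq="M")
    apr = pd.Period("2013-04", freq="M")
    print(jan < apr)

    # A Series built from an OrderedDict keeps the insertion order of its keys.
    ordered = pd.Series(OrderedDict([("b", 1), ("a", 2), ("c", 3)]))
    print(list(ordered.index))
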
.. _GH3294: https://github.com/pydata/pandas/issues/3294
.. _GH622: https://github.com/pydata/pandas/issues/622
.. _GH3348: https://github.com/pydata/pandas/issues/3348
.. _GH797: https://github.com/pydata/pandas/issues/797
.. _GH1893: https://github.com/pydata/pandas/issues/1893
.. _GH1978: https://github.com/pydata/pandas/issues/1978
.. _GH3360: https://github.com/pydata/pandas/issues/3360
.. _GH3363: https://github.com/pydata/pandas/issues/3363
.. _GH2758: https://github.com/pydata/pandas/issues/2758
.. _GH3275: https://github.com/pydata/pandas/issues/3275
.. _GH2121: https://github.com/pydata/pandas/issues/2121
.. _GH3247: https://github.com/pydata/pandas/issues/3247
.. _GH2809: https://github.com/pydata/pandas/issues/2809
.. _GH2810: https://github.com/pydata/pandas/issues/2810
.. _GH2837: https://github.com/pydata/pandas/issues/2837
.. _GH2898: https://github.com/pydata/pandas/issues/2898
.. _GH3233: https://github.com/pydata/pandas/issues/3233
.. _GH3035: https://github.com/pydata/pandas/issues/3035
.. _GH3020: https://github.com/pydata/pandas/issues/3020
.. _GH2978: https://github.com/pydata/pandas/issues/2978
.. _GH2877: https://github.com/pydata/pandas/issues/2877
.. _GH2739: https://github.com/pydata/pandas/issues/2739
.. _GH2710: https://github.com/pydata/pandas/issues/2710
.. _GH2806: https://github.com/pydata/pandas/issues/2806
.. _GH2807: https://github.com/pydata/pandas/issues/2807
.. _GH3278: https://github.com/pydata/pandas/issues/3278
.. _GH2891: https://github.com/pydata/pandas/issues/2891
.. _GH2918: https://github.com/pydata/pandas/issues/2918
.. _GH3011: https://github.com/pydata/pandas/issues/3011
.. _GH2745: https://github.com/pydata/pandas/issues/2745
.. _GH1670: https://github.com/pydata/pandas/issues/1670
.. _GH2681: https://github.com/pydata/pandas/issues/2681
.. _GH2719: https://github.com/pydata/pandas/issues/2719
.. _GH2746: https://github.com/pydata/pandas/issues/2746
.. _GH2747: https://github.com/pydata/pandas/issues/2747
.. _GH2751: https://github.com/pydata/pandas/issues/2751
.. _GH2763: https://github.com/pydata/pandas/issues/2763
.. _GH2776: https://github.com/pydata/pandas/issues/2776
.. _GH2778: https://github.com/pydata/pandas/issues/2778
.. _GH2781: https://github.com/pydata/pandas/issues/2781
.. _GH2786: https://github.com/pydata/pandas/issues/2786
.. _GH2787: https://github.com/pydata/pandas/issues/2787
.. _GH3282: https://github.com/pydata/pandas/issues/3282
.. _GH2437: https://github.com/pydata/pandas/issues/2437
.. _GH2753: https://github.com/pydata/pandas/issues/2753
.. _GH2793: https://github.com/pydata/pandas/issues/2793
.. _GH2795: https://github.com/pydata/pandas/issues/2795
.. _GH2797: https://github.com/pydata/pandas/issues/2797
.. _GH2819: https://github.com/pydata/pandas/issues/2819
.. _GH2845: https://github.com/pydata/pandas/issues/2845
.. _GH2867: https://github.com/pydata/pandas/issues/2867
.. _GH2803: https://github.com/pydata/pandas/issues/2803
.. _GH2808: https://github.com/pydata/pandas/issues/2808
.. _GH2849: https://github.com/pydata/pandas/issues/2849
.. _GH2850: https://github.com/pydata/pandas/issues/2850
.. _GH2892: https://github.com/pydata/pandas/issues/2892
.. _GH2893: https://github.com/pydata/pandas/issues/2893
.. _GH2902: https://github.com/pydata/pandas/issues/2902
.. _GH2903: https://github.com/pydata/pandas/issues/2903
.. _GH2909: https://github.com/pydata/pandas/issues/2909
.. _GH2922: https://github.com/pydata/pandas/issues/2922
.. _GH2926: https://github.com/pydata/pandas/issues/2926
.. _GH2929: https://github.com/pydata/pandas/issues/2929
.. _GH2931: https://github.com/pydata/pandas/issues/2931
.. _GH2932: https://github.com/pydata/pandas/issues/2932
.. _GH2973: https://github.com/pydata/pandas/issues/2973
.. _GH2967: https://github.com/pydata/pandas/issues/2967
.. _GH2981: https://github.com/pydata/pandas/issues/2981
.. _GH2982: https://github.com/pydata/pandas/issues/2982
.. _GH2989: https://github.com/pydata/pandas/issues/2989
.. _GH2993: https://github.com/pydata/pandas/issues/2993
.. _GH3002: https://github.com/pydata/pandas/issues/3002
.. _GH3155: https://github.com/pydata/pandas/issues/3155
.. _GH3010: https://github.com/pydata/pandas/issues/3010
.. _GH1738: https://github.com/pydata/pandas/issues/1738
.. _GH3012: https://github.com/pydata/pandas/issues/3012
.. _GH3029: https://github.com/pydata/pandas/issues/3029
.. _GH3037: https://github.com/pydata/pandas/issues/3037
.. _GH3041: https://github.com/pydata/pandas/issues/3041
.. _GH3042: https://github.com/pydata/pandas/issues/3042
.. _GH3053: https://github.com/pydata/pandas/issues/3053
.. _GH3070: https://github.com/pydata/pandas/issues/3070
.. _GH3076: https://github.com/pydata/pandas/issues/3076
.. _GH3063: https://github.com/pydata/pandas/issues/3063
.. _GH3059: https://github.com/pydata/pandas/issues/3059
.. _GH3115: https://github.com/pydata/pandas/issues/3115
.. _GH3075: https://github.com/pydata/pandas/issues/3075
.. _GH3094: https://github.com/pydata/pandas/issues/3094
.. _GH3130: https://github.com/pydata/pandas/issues/3130
.. _GH3168: https://github.com/pydata/pandas/issues/3168
.. _GH3178: https://github.com/pydata/pandas/issues/3178
.. _GH3179: https://github.com/pydata/pandas/issues/3179
.. _GH3189: https://github.com/pydata/pandas/issues/3189
.. _GH2816: https://github.com/pydata/pandas/issues/2816
.. _GH3216: https://github.com/pydata/pandas/issues/3216
.. _GH3222: https://github.com/pydata/pandas/issues/3222
.. _GH2641: https://github.com/pydata/pandas/issues/2641
.. _GH3223: https://github.com/pydata/pandas/issues/3223
.. _GH3238: https://github.com/pydata/pandas/issues/3238
.. _GH3258: https://github.com/pydata/pandas/issues/3258
.. _GH3283: https://github.com/pydata/pandas/issues/3283
.. _GH2919: https://github.com/pydata/pandas/issues/2919
.. _GH3308: https://github.com/pydata/pandas/issues/3308
.. _GH3311: https://github.com/pydata/pandas/issues/3311
.. _GH3380: https://github.com/pydata/pandas/issues/3380
.. _GH3406: https://github.com/pydata/pandas/issues/3406
pandas 0.10.1
=============
**Release date:** 2013-01-22
**New features**
- Add data interface to World Bank WDI pandas.io.wb (GH2592_)
**API Changes**
- Restored inplace=True behavior returning self (same object) with
deprecation warning until 0.11 (GH1893_)
- ``HDFStore``
- refactored HDFStore to deal with non-table stores as objects, will allow future enhancements
- removed keyword ``compression`` from ``put`` (replaced by keyword
``complib`` to be consistent across library)
- warn `PerformanceWarning` if you are attempting to store types that will be pickled by PyTables
**Improvements to existing features**
- ``HDFStore``
- enables storing of multi-index dataframes (closes GH1277_)
- support data column indexing and selection, via ``data_columns`` keyword
in append
- support write chunking to reduce memory footprint, via ``chunksize``
keyword to append
- support automagic indexing via ``index`` keyword to append
- support ``expectedrows`` keyword in append to inform ``PyTables`` about
the expected tablesize
- support ``start`` and ``stop`` keywords in select to limit the row
selection space
- added ``get_store`` context manager to automatically import with pandas
- added column filtering via ``columns`` keyword in select
- added methods append_to_multiple/select_as_multiple/select_as_coordinates
to do multiple-table append/selection
- added support for datetime64 in columns
- added method ``unique`` to select the unique values in an indexable or
data column
- added method ``copy`` to copy an existing store (and possibly upgrade)
- show the shape of the data on disk for non-table stores when printing the
store
- added ability to read PyTables flavor tables (allows compatibility to
other HDF5 systems)
- Add ``logx`` option to DataFrame/Series.plot (GH2327_, GH2565_)
- Support reading gzipped data from file-like object
- ``pivot_table`` aggfunc can be anything used in GroupBy.aggregate (GH2643_)
- Implement DataFrame merges in case where set cardinalities might overflow
64-bit integer (GH2690_)
- Raise exception in C file parser if integer dtype specified and have NA
values. (GH2631_)
- Attempt to parse ISO8601 format dates when parse_dates=True in read_csv for
major performance boost in such cases (GH2698_)
- Add methods ``neg`` and ``inv`` to Series
- Implement ``kind`` option in ``ExcelFile`` to indicate whether it's an XLS
or XLSX file (GH2613_)
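
As an illustration of the ``pivot_table`` change above, the sketch below passes
a user-defined callable as ``aggfunc``. It is an example under stated
assumptions, not code from the release: ``spread`` is a hypothetical helper,
and the grouping keyword is written as ``index``, the spelling used by recent
pandas (older releases may spell it differently).

.. code-block:: python

    import pandas as pd

    df = pd.DataFrame(
        {
            "key": ["a", "a", "b", "b"],
            "val": [1.0, 3.0, 5.0, 9.0],
        }
    )


    def spread(values):
        """Hypothetical aggregator: range of the values in one group."""
        return values.max() - values.min()


    # Any callable accepted by GroupBy.aggregate can be used as aggfunc.
    table = pd.pivot_table(df, index="key", values="val", aggfunc=spread)
    print(table)
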
**Bug fixes**
- Fix read_csv/read_table multithreading issues (GH2608_)
- ``HDFStore``
- correctly handle ``nan`` elements in string columns; serialize via the
``nan_rep`` keyword to append
- raise correctly on non-implemented column types (unicode/date)
- handle correctly ``Term`` passed types (e.g. ``index<1000``, when index
is ``Int64``), (closes GH512_)
- handle Timestamp correctly in data_columns (closes GH2637_)
- contains correctly matches on non-natural names
- correctly store ``float32`` dtypes in tables (if no other float types are in
the same table)
- Fix DataFrame.info bug with UTF8-encoded columns. (GH2576_)
- Fix DatetimeIndex handling of FixedOffset tz (GH2604_)
- More robust detection of being in IPython session for wide DataFrame
console formatting (GH2585_)
- Fix platform issues with ``file:///`` in unit test (GH2564_)
- Fix bug and possible segfault when grouping by hierarchical level that
contains NA values (GH2616_)
- Ensure that MultiIndex tuples can be constructed with NAs (GH2616_)
- Fix int64 overflow issue when unstacking MultiIndex with many levels
(GH2616_)
- Exclude non-numeric data from DataFrame.quantile by default (GH2625_)
- Fix a Cython C int64 boxing issue causing read_csv to return incorrect
results (GH2599_)
- Fix groupby summing performance issue on boolean data (GH2692_)
- Don't bork Series containing datetime64 values with to_datetime (GH2699_)
- Fix DataFrame.from_records corner case when passed columns, index column,
but empty record list (GH2633_)
- Fix C parser-tokenizer bug with trailing fields. (GH2668_)
- Don't exclude non-numeric data from GroupBy.max/min (GH2700_)
- Don't lose time zone when calling DatetimeIndex.drop (GH2621_)
- Fix setitem on a Series with a boolean key and a non-scalar as value
(GH2686_)
- Box datetime64 values in Series.apply/map (GH2627_, GH2689_)
- Upconvert datetime + datetime64 values when concatenating frames (GH2624_)
- Raise a more helpful error message in merge operations when one DataFrame
has duplicate columns (GH2649_)
- Fix partial date parsing issue occurring only when code is run at EOM
(GH2618_)
- Prevent MemoryError when using counting sort in sortlevel with
high-cardinality MultiIndex objects (GH2684_)
- Fix Period resampling bug when all values fall into a single bin (GH2070_)
- Fix buggy interaction with usecols argument in read_csv when there is an
implicit first index column (GH2654_)
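
The boolean-key ``setitem`` fix listed above (GH2686_) concerns assignments of
the following shape. This is a minimal sketch with hypothetical sample data; it
assumes the right-hand side supplies exactly one value per ``True`` entry in
the mask.

.. code-block:: python

    import pandas as pd

    s = pd.Series([1, 2, 3, 4])
    mask = s > 2

    # Assign a non-scalar value through a boolean key; the list length
    # matches the number of True entries in the mask.
    s[mask] = [30, 40]
    print(s.tolist())
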
.. _GH512: https://github.com/pydata/pandas/issues/512
.. _GH1277: https://github.com/pydata/pandas/issues/1277
.. _GH2070: https://github.com/pydata/pandas/issues/2070
.. _GH2327: https://github.com/pydata/pandas/issues/2327
.. _GH2565: https://github.com/pydata/pandas/issues/2565
.. _GH2585: https://github.com/pydata/pandas/issues/2585
.. _GH2599: https://github.com/pydata/pandas/issues/2599
.. _GH2604: https://github.com/pydata/pandas/issues/2604
.. _GH2576: https://github.com/pydata/pandas/issues/2576
.. _GH2608: https://github.com/pydata/pandas/issues/2608
.. _GH2613: https://github.com/pydata/pandas/issues/2613
.. _GH2616: https://github.com/pydata/pandas/issues/2616
.. _GH2621: https://github.com/pydata/pandas/issues/2621
.. _GH2624: https://github.com/pydata/pandas/issues/2624
.. _GH2625: https://github.com/pydata/pandas/issues/2625
.. _GH2627: https://github.com/pydata/pandas/issues/2627
.. _GH2631: https://github.com/pydata/pandas/issues/2631
.. _GH2633: https://github.com/pydata/pandas/issues/2633
.. _GH2637: https://github.com/pydata/pandas/issues/2637
.. _GH2643: https://github.com/pydata/pandas/issues/2643
.. _GH2649: https://github.com/pydata/pandas/issues/2649
.. _GH2654: https://github.com/pydata/pandas/issues/2654
.. _GH2668: https://github.com/pydata/pandas/issues/2668
.. _GH2684: https://github.com/pydata/pandas/issues/2684
.. _GH2689: https://github.com/pydata/pandas/issues/2689
.. _GH2690: https://github.com/pydata/pandas/issues/2690
.. _GH2692: https://github.com/pydata/pandas/issues/2692
.. _GH2698: https://github.com/pydata/pandas/issues/2698
.. _GH2699: https://github.com/pydata/pandas/issues/2699
.. _GH2700: https://github.com/pydata/pandas/issues/2700
.. _GH2686: https://github.com/pydata/pandas/issues/2686
.. _GH2618: https://github.com/pydata/pandas/issues/2618
.. _GH2592: https://github.com/pydata/pandas/issues/2592
.. _GH2564: https://github.com/pydata/pandas/issues/2564
pandas 0.10.0
=============
**Release date:** 2012-12-17
**New features**
- Brand new high-performance delimited file parsing engine written in C and
Cython. 50% or better performance in many standard use cases with a
fraction as much memory usage. (GH407_, GH821_)
- Many new file parser (read_csv, read_table) features:
- Support for on-the-fly gzip or bz2 decompression (`compression` option)
- Ability to get back numpy.recarray instead of DataFrame
(`as_recarray=True`)
- `dtype` option: explicit column dtypes
- `usecols` option: specify list of columns to be read from a file. Good
for reading very wide files with many irrelevant columns (GH1216_, GH926_, GH2465_)
- Enhanced unicode decoding support via `encoding` option
- `skipinitialspace` dialect option
- Can specify strings to be recognized as True (`true_values`) or False
(`false_values`)
- High-performance `delim_whitespace` option for whitespace-delimited
files; a preferred alternative to the '\s+' regular expression delimiter
- Option to skip "bad" lines (wrong number of fields) that would otherwise
have caused an error in the past (`error_bad_lines` and `warn_bad_lines`
options)
- Substantially improved performance in the parsing of integers with
thousands markers and lines with comments
- Easy parsing of European (and other) decimal formats (`decimal` option) (GH584_, GH2466_)
- Custom line terminators (e.g. lineterminator='~') (GH2457_)
- Handling of no trailing commas in CSV files (GH2333_)
- Ability to handle fractional seconds in date_converters (GH2209_)
- read_csv allow scalar arg to na_values (GH1944_)
- Explicit column dtype specification in read_* functions (GH1858_)
- Easier CSV dialect specification (GH1743_)
- Improve parser performance when handling special characters (GH1204_)
- Google Analytics API integration with easy oauth2 workflow (GH2283_)
- Add error handling to Series.str.encode/decode (GH2276_)
- Add ``where`` and ``mask`` to Series (GH2337_)
- Grouped histogram via `by` keyword in Series/DataFrame.hist (GH2186_)
- Support optional ``min_periods`` keyword in ``corr`` and ``cov``
for both Series and DataFrame (GH2002_)
- Add ``duplicated`` and ``drop_duplicates`` functions to Series (GH1923_)
- Add docs for ``HDFStore table`` format
- 'density' property in `SparseSeries` (GH2384_)
- Add ``ffill`` and ``bfill`` convenience functions for forward- and
backfilling time series data (GH2284_)
- New option configuration system and functions `set_option`, `get_option`,
`describe_option`, and `reset_option`. Deprecate `set_printoptions` and
`reset_printoptions` (GH2393_).
You can also access options as attributes via ``pandas.options.X``
- Wide DataFrames can be viewed more easily in the console with new
`expand_frame_repr` and `line_width` configuration options. This is on by
default now (GH2436_)
- Scikits.timeseries-like moving window functions via ``rolling_window`` (GH1270_)
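
Several of the parser options listed above can be combined in a single call.
The sketch below is illustrative only: the semicolon-delimited sample data and
column names are hypothetical, and it assumes a pandas version in which
``read_csv`` accepts all of these keywords.

.. code-block:: python

    import io

    import pandas as pd

    # Hypothetical European-style data: ";" delimiter and "," decimal mark.
    raw = (
        "name;price;notes;active\n"
        "widget;1,50;first;yes\n"
        "gadget;2,25;second;no\n"
    )

    df = pd.read_csv(
        io.StringIO(raw),
        sep=";",
        decimal=",",                           # parse "1,50" as 1.5
        usecols=["name", "price", "active"],   # skip the "notes" column
        dtype={"name": str},                   # explicit column dtype
        true_values=["yes"],                   # strings recognized as True
        false_values=["no"],                   # strings recognized as False
    )
    print(df)

Restricting ``usecols`` to the columns actually needed is the part most likely
to matter for very wide files, since unused columns are never materialized in
the resulting frame.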
**Experimental Features**
- Add support for Panel4D, a named 4 Dimensional structure
- Add support for ndpanel factory functions, to create custom,
domain-specific N-Dimensional containers
**API Changes**
- The default binning/labeling behavior for ``resample`` has been changed to
`closed='left', label='left'` for daily and lower frequencies. This had
been a large source of confusion for users. See "what's new" page for more
on this. (GH2410_)
- Methods with ``inplace`` option now return None instead of the calling
(modified) object (GH1893_)
- The special case DataFrame - TimeSeries doing column-by-column broadcasting
has been deprecated. Users should explicitly do e.g. df.sub(ts, axis=0)
instead. This is a legacy hack and can lead to subtle bugs.
- inf/-inf are no longer considered as NA by isnull/notnull. To be clear, this
is legacy cruft from early pandas. This behavior can be globally re-enabled
using the new option ``mode.use_inf_as_null`` (GH2050_, GH1919_)
- ``pandas.merge`` will now default to ``sort=False``. For many use cases
sorting the join keys is not necessary, and doing it by default is wasteful
- Specify ``header=0`` explicitly to replace existing column names in file in
read_* functions.
- Default column names for header-less parsed files (yielded by read_csv,
etc.) are now the integers 0, 1, .... A new argument `prefix` has been
added; to get the v0.9.x behavior specify ``prefix='X'`` (GH2034_). This API
change was made to make the default column names more consistent with the
DataFrame constructor's default column names when none are specified.
- DataFrame selection using a boolean frame now preserves input shape