@@ -22,6 +22,224 @@ Where to get it
* Binary installers on PyPI: http://pypi.python.org/pypi/pandas
* Documentation: http://pandas.sourceforge.net

+ pandas 0.7.0
+ ============
+
+ **Release date:** NOT YET RELEASED
+
+ **New features / modules**
+
+ - New ``merge`` function for efficiently performing the full gamut of
+   database / relational-algebra operations. Refactored existing join methods
+   to use the new infrastructure, resulting in substantial performance gains
+   (GH #220, #249, #267)
+ - New ``concat`` function for concatenating DataFrame or Panel objects along
+   an axis. Can form union or intersection of the other axes. Improves
+   performance of ``DataFrame.append`` (#468, #479, #273)
+ - Handle differently-indexed output values in ``DataFrame.apply`` (GH #498)
+ - Can pass list of dicts (e.g., a list of shallow JSON objects) to DataFrame
+   constructor (GH #526)
+ - Add ``reorder_levels`` method to Series and DataFrame (PR #534)
+ - Add dict-like ``get`` function to DataFrame and Panel (PR #521)
+ - ``DataFrame.iterrows`` method for efficiently iterating through the rows of
+   a DataFrame
+ - Added ``DataFrame.to_panel`` with code adapted from ``LongPanel.to_long``
+ - ``reindex_axis`` method added to DataFrame
+ - Add ``level`` option to binary arithmetic functions on ``DataFrame`` and
+   ``Series``
+ - Add ``level`` option to the ``reindex`` and ``align`` methods on Series and
+   DataFrame for broadcasting values across a level (GH #542, PR #552, others)
+ - Add attribute-based item access to ``Panel`` and add IPython completion (PR
+   #554)
+ - Add ``logy`` option to ``Series.plot`` for log-scaling on the Y axis
+ - Add ``index``, ``header``, and ``justify`` options to
+   ``DataFrame.to_string``. Add option to (GH #570, GH #571)
+ - Can pass multiple DataFrames to ``DataFrame.join`` to join on index (GH #115)
+ - Can pass multiple Panels to ``Panel.join`` (GH #115)
+ - Can pass multiple DataFrames to ``DataFrame.append`` to concatenate (stack)
+   and multiple Series to ``Series.append`` too
+ - Added ``justify`` argument to ``DataFrame.to_string`` to allow different
+   alignment of column headers
+ - Add ``sort`` option to GroupBy to allow disabling sorting of the group keys
+   for potential speedups (GH #595)
+ - Can pass MaskedArray to Series constructor (PR #563)
+ - Add Panel item access via attributes and IPython completion (GH #554)
+ - Implement ``DataFrame.lookup``, fancy-indexing analogue for retrieving
+   values given a sequence of row and column labels (GH #338)
+ - Add ``verbose`` option to ``read_csv`` and ``read_table`` to show number of
+   NA values inserted in non-numeric columns (GH #614)
+ - Can pass a list of dicts or Series to ``DataFrame.append`` to concatenate
+   multiple rows (GH #464)
+ - Add ``level`` argument to ``DataFrame.xs`` for selecting data from other
+   MultiIndex levels. Can take one or more levels with potentially a tuple of
+   keys for flexible retrieval of data (GH #371, GH #629)
+ - New ``crosstab`` function for easily computing frequency tables (GH #170)
+
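As a rough illustration of the new ``merge``, ``concat``, and ``crosstab`` functions listed above, here is a minimal sketch. It assumes a recent pandas release (the exact 0.7.0 signatures may differ), and the frames, column names, and values are made up for the example.

```python
import pandas as pd

# Hypothetical example data; not taken from the release notes.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [10, 20, 30]})

# Database-style joins on a shared column.
inner = pd.merge(left, right, on="key", how="inner")  # keys present in both
outer = pd.merge(left, right, on="key", how="outer")  # union of keys

# Stack two frames along the row axis.
stacked = pd.concat([left, left], axis=0, ignore_index=True)

# Frequency table of two categorical variables.
groups = pd.Series(["u", "u", "v"], name="group")
flags = pd.Series(["p", "q", "p"], name="flag")
counts = pd.crosstab(groups, flags)
```

The ``how`` argument selects the join type, much like the join keyword in SQL.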
+ **API Changes**
+
+ - Label-indexing with integer indexes now raises KeyError if a label is not
+   found instead of falling back on location-based indexing
+ - Label-based slicing via ``ix`` or ``[]`` on Series will now only work if
+   exact matches for the labels are found or if the index is monotonic (for
+   range selections)
+ - Label-based slicing and sequences of labels can be passed to ``[]`` on a
+   Series for both getting and setting (GH #86)
+ - ``[]`` operator (``__getitem__`` and ``__setitem__``) will raise KeyError
+   with integer indexes when a key is not contained in the index. The prior
+   behavior would fall back on position-based indexing if a key was not found
+   in the index, which would lead to subtle bugs. This is now consistent with
+   the behavior of ``.ix`` on DataFrame and friends (GH #328)
+ - Rename ``DataFrame.delevel`` to ``DataFrame.reset_index`` and add
+   deprecation warning
+ - ``Series.sort`` (an in-place operation) called on a Series which is a view on
+   a larger array (e.g. a column in a DataFrame) will raise an Exception to
+   prevent accidentally modifying the data source (GH #316)
+ - Refactor to remove deprecated ``LongPanel`` class (PR #552)
+ - Deprecated ``Panel.to_long``, renamed to ``to_frame``
+ - Deprecated ``colSpace`` argument in ``DataFrame.to_string``, renamed to
+   ``col_space``
+ - Rename ``precision`` to ``accuracy`` in engineering float formatter (GH
+   #395)
+
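The integer-index change described above can be sketched as follows. This is an illustration only, assuming a recent pandas release; the Series and its labels are hypothetical, and ``.iloc`` is the modern way to request positional access (it is not mentioned in these notes).

```python
import pandas as pd

# A Series whose integer labels do not start at 0 (hypothetical data).
s = pd.Series([10.0, 20.0, 30.0], index=[5, 6, 7])

# Label lookup works for labels that exist in the index.
by_label = s[5]

# A missing integer label raises KeyError instead of silently
# falling back to position-based lookup.
try:
    s[0]
    fell_back = True
except KeyError:
    fell_back = False

# Positional access has to be requested explicitly.
by_position = s.iloc[0]
```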
+ **Improvements to existing features**
+
+ - Better error message in DataFrame constructor when passed column labels
+   don't match data (GH #497)
+ - Substantially improve performance of multi-GroupBy aggregation when a
+   Python function is passed, reuse ndarray object in Cython (GH #496)
+ - Can store objects indexed by tuples and floats in HDFStore (GH #492)
+ - Don't print length by default in Series.to_string, add ``length`` option (GH
+   #489)
+ - Improve Cython code for multi-groupby to aggregate without having to sort
+   the data (GH #93)
+ - Improve MultiIndex reindexing speed by storing tuples in the MultiIndex,
+   test for backwards unpickling compatibility
+ - Improve column reindexing performance by using specialized Cython take
+   function
+ - Further performance tweaking of Series.__getitem__ for standard use cases
+ - Avoid Index dict creation in some cases (e.g. when getting slices),
+   regression from prior versions
+ - Friendlier error message in setup.py if NumPy not installed
+ - Use common set of NA-handling operations (sum, mean, etc.) in Panel class
+   also (GH #536)
+ - Default name assignment when calling ``reset_index`` on DataFrame with a
+   regular (non-hierarchical) index (GH #476)
+ - Use Cythonized groupers when possible in Series/DataFrame stat ops with
+   ``level`` parameter passed (GH #545)
+ - Ported skiplist data structure to C to speed up ``rolling_median`` by about
+   5-10x in most typical use cases (GH #374)
+ - Some performance enhancements in constructing a Panel from a dict of
+   DataFrame objects
+ - Made ``Index._get_duplicates`` a public method by removing the underscore
+ - Prettier printing of floats, and column spacing fix (GH #395, GH #571)
+ - Add ``bold_rows`` option to DataFrame.to_html (GH #586)
+ - Improve the performance of ``DataFrame.sort_index`` by up to 5x or more
+   when sorting by multiple columns
+ - Substantially improve performance of DataFrame and Series constructors when
+   passed a nested dict or dict, respectively (GH #540, GH #621)
+ - Modified setup.py so that pip / setuptools will install dependencies (GH
+   #507, various pull requests)
+ - Unstack called on DataFrame with non-MultiIndex will return Series (GH
+   #477)
+ - Improve DataFrame.to_string and console formatting to be more consistent in
+   the number of displayed digits (GH #395)
+ - Use bottleneck if available for performing NaN-friendly statistical
+   operations that it implements (GH #91)
+ - Can pass a list of functions to aggregate with groupby on a DataFrame,
+   yielding an aggregated result with hierarchical columns (GH #166)
+ - Monkey-patch context to traceback in ``DataFrame.apply`` to indicate which
+   row/column the function application failed on (GH #614)
+ - Improved ability of read_table and read_clipboard to parse
+   console-formatted DataFrames (can read the row of index names, etc.)
+
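To make the "list of functions" aggregation entry above concrete, here is a small sketch. It assumes a recent pandas release, where the functions can be given by name as strings; the data and column names are invented for the example.

```python
import pandas as pd

# Hypothetical data: two numeric columns grouped by "key".
df = pd.DataFrame({
    "key": ["b", "a", "b", "a"],
    "v": [1.0, 2.0, 3.0, 4.0],
    "w": [10.0, 20.0, 30.0, 40.0],
})

# Passing a list of functions yields hierarchical columns:
# the outer level is the original column, the inner level the function.
agg = df.groupby("key").agg(["sum", "mean"])

# Individual results are addressed with (column, function) tuples.
v_sum_a = agg.loc["a", ("v", "sum")]
w_mean_b = agg.loc["b", ("w", "mean")]
```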
+ **Bug fixes**
+
+ - Raise exception in out-of-bounds indexing of Series instead of
+   seg-faulting, regression from earlier releases (GH #495)
+ - Fix error when joining DataFrames of different dtypes within the same
+   typeclass (e.g. float32 and float64) (GH #486)
+ - Fix bug in Series.min/Series.max on objects like datetime.datetime (GH
+   #487)
+ - Preserve index names in Index.union (GH #501)
+ - Fix bug in Index joining causing subclass information (like DateRange type)
+   to be lost in some cases (GH #500)
+ - Accept empty list as input to DataFrame constructor, regression from 0.6.0
+   (GH #491)
+ - Can output DataFrame and Series with ndarray objects in a dtype=object
+   array (GH #490)
+ - Return empty string from Series.to_string when called on empty Series (GH
+   #488)
+ - Fix exception passing empty list to DataFrame.from_records
+ - Fix Index.format bug (excluding name field) with datetimes with time info
+ - Fix scalar value access in Series to always return NumPy scalars,
+   regression from prior versions (GH #510)
+ - Handle rows skipped at beginning of file in read_* functions (GH #505)
+ - Handle improper dtype casting in ``set_value`` methods
+ - Unary '-' / __neg__ operator on DataFrame was returning integer values
+ - Unbox 0-dim ndarrays from certain operators like all, any in Series
+ - Fix handling of missing columns (was combine_first-specific) in
+   DataFrame.combine for general case (GH #529)
+ - Fix type inference logic with boolean lists and arrays in DataFrame indexing
+ - Use centered sum of squares in R-square computation if entity_effects=True
+   in panel regression
+ - Handle all NA case in Series.{corr, cov}, was raising exception (GH #548)
+ - Fix aggregation by multiple levels with the ``level`` argument to DataFrame
+   and Series stat methods (GH #545)
+ - Fix Cython bug when converter passed to read_csv produced a numeric array
+   (buffer dtype mismatch when passed to Cython type inference function) (GH
+   #546)
+ - Fix exception when setting scalar value using .ix on a DataFrame with a
+   MultiIndex (GH #551)
+ - Fix outer join between two DateRanges with different offsets that returned
+   an invalid DateRange
+ - Cleanup DataFrame.from_records failure where index argument is an integer
+ - Fix DataFrame.from_records failure when passed a dictionary
+ - Fix NA handling in {Series, DataFrame}.rank with non-floating point dtypes
+ - Fix bug related to integer type-checking in .ix-based indexing
+ - Handle non-string index name passed to DataFrame.from_records
+ - DataFrame.insert caused the columns name(s) field to be discarded (GH #527)
+ - Fix DataFrame.to_string to remove extra column white space (GH #571)
+ - Format floats to default to same number of digits (GH #395)
+ - Added decorator to copy docstring from one function to another (GH #449)
+ - Fix error in monotonic many-to-one left joins
+ - Fix __eq__ comparison between DateOffsets with different relativedelta
+   keywords passed
+ - Fix exception caused by parser converter returning strings (GH #583)
+ - Fix MultiIndex formatting bug with integer names (GH #601)
+ - Fix bug in handling of non-numeric aggregates in Series.groupby (GH #612)
+ - Fix TypeError with tuple subclasses (e.g. namedtuple) in
+   DataFrame.from_records (GH #611)
+ - Catch misreported console size when running IPython within Emacs
+ - Fix minor bug in pivot table margins, loss of index names and length-1
+   'All' tuple in row labels
+
+ Thanks
+ ------
+ - Craig Austin
+ - Marius Cobzarenco
+ - Mario Gamboa-Cavazos
+ - Arthur Gerigk
+ - Yaroslav Halchenko
+ - Jeff Hammerbacher
+ - Matt Harrison
+ - Andreas Hilboll
+ - Luc Kesters
+ - Adam Klein
+ - Gregg Lind
+ - Solomon Negusse
+ - Wouter Overmeire
+ - Christian Prinoth
+ - Sam Reckoner
+ - Craig Reeson
+ - Jan Schulz
+ - Ted Square
+ - Graham Taylor
+ - Chris Uga
+ - Dieter Vandenbussche
+ - Texas P.
+ - Pinxing Ye
+
pandas 0.6.1
============
@@ -85,6 +303,7 @@ pandas 0.6.1
- MultiIndex.get_level_values can take the level name
- More helpful error message when DataFrame.plot fails on one of the columns
(GH #478)
+ - Improve performance of DataFrame.{index, columns} attribute lookup

**Bug fixes**