neurodebian
diff --git a/README.rst b/README.rst (+2)
diff --git a/RELEASE.rst b/RELEASE.rst (+140 -1)
diff --git a/TODO.rst b/TODO.rst (+7)
diff --git a/bench/zoo_bench.R b/bench/zoo_bench.R (+25)
diff --git a/bench/zoo_bench.py b/bench/zoo_bench.py (+40)
diff --git a/pandas/core/api.py b/pandas/core/api.py (+1 -1)
@@ -69,6 +69,8 @@ Dependencies
 Optional dependencies
 ~~~~~~~~~~~~~~~~~~~~~
 
+  * `Cython <http://www.cython.org>`__: Only necessary to build development
+    version
   * `SciPy <http://www.scipy.org>`__: miscellaneous statistical functions
   * `PyTables <http://www.pytables.org>`__: necessary for HDF5-based storage
   * `matplotlib <http://matplotlib.sourceforge.net/>`__: for plotting
 
@@ -5,11 +5,146 @@ Release Notes
 This is the list of changes to pandas between each release. For full details,
 see the commit logs at http://github.com/wesm/pandas
 
+pandas 0.4.3
+============
+
+**Release date:** not yet released
+
+This is largely a bugfix release from 0.4.2 but also includes a handful of new
+and enhanced features. Also, pandas can now be installed and used on Python 3
+(thanks Thomas Kluyver!).
+
+**New features / modules**
+
+  - Python 3 support using 2to3 (PR #200, Thomas Kluyver)
+  - Add `name` attribute to `Series` along with relevant logic and tests. The
+    name now prints as part of `Series.__repr__`
+  - Add `name` attribute to standard Index so that stacking / unstacking does
+    not discard names and so that indexed DataFrame objects can be reliably
+    round-tripped to flat files, pickle, HDF5, etc.
+  - Add `isnull` and `notnull` as instance methods on Series (PR #209, GH #203)
+
+**Improvements to existing features**
+
+  - Skip xlrd-related unit tests if not installed
+  - `Index.append` and `MultiIndex.append` can accept a list of Index objects to
+    concatenate together
+  - Altered binary operations on differently-indexed SparseSeries objects to use
+    the integer-based (dense) alignment logic which is faster with a larger
+    number of blocks (GH #205)
+  - Refactored `Series.__repr__` to be cleaner and more consistent
+
+**API Changes**
+
+  - `Series.describe` and `DataFrame.describe` now report the 25% and 75%
+    quartiles instead of the 10% and 90% deciles. The other outputs have not
+    changed
+  - `Series.toString` has been de-camelCased to `to_string`; calling the old
+    name prints a deprecation warning
+
+**Bug fixes**
+
+  - Fix broken interaction between `Index` and `Int64Index` when calling
+    intersection. Implement `Int64Index.intersection`
+  - Fix `MultiIndex.sortlevel` discarding the level names (GH #202)
+  - Fix bugs in groupby, join, and append due to improper concatenation of
+    `MultiIndex` objects (GH #201)
+  - Fix regression from 0.4.1 in which `isnull` and `notnull` ceased to work
+    on other kinds of Python scalar objects like `datetime.datetime`
+  - Raise more helpful exception when attempting to write empty DataFrame or
+    LongPanel to `HDFStore` (GH #204)
+  - Use stdlib csv module to properly escape strings with commas in
+    `DataFrame.to_csv` (PR #206, Thomas Kluyver)
+  - Fix Python ndarray access in Cython code for sparse blocked index integrity
+    check
+  - Fix bug writing Series to CSV in Python 3 (PR #209)
+  - Miscellaneous Python 3 bugfixes
+
+Thanks
+------
+
+  - Thomas Kluyver
+  - rsamson
+
+pandas 0.4.2
+============
+
+**Release date:** 10/3/2011
+
+This is a performance optimization release with several bug fixes. The new
+Int64Index, the new merging / joining Cython code, and the related Python
+infrastructure are the main additions.
+
+**New features / modules**
+
+  - Added fast `Int64Index` type with specialized join, union, and
+    intersection. This should significantly improve performance for
+    int64-based time series (e.g. using NumPy's datetime64 one day) and speed
+    up operations on DataFrame objects storing record array-like data.
+  - Refactored `Index` classes to have a `join` method and associated data
+    alignment routines throughout the codebase to be able to leverage optimized
+    joining / merging routines.
+  - Added `Series.align` method for aligning two series with choice of join
+    method
+  - Wrote faster Cython data alignment / merging routines resulting in
+    substantial speed increases
+  - Added `is_monotonic` property to `Index` classes with associated Cython
+    code to evaluate the monotonicity of the `Index` values
+  - Add method `get_level_values` to `MultiIndex`
+  - Implemented shallow copy of `BlockManager` object in `DataFrame` internals
+
+**Improvements to existing features**
+
+  - Improved performance of `isnull` and `notnull`, a regression from v0.3.0
+    (GH #187)
+  - Wrote templating / code generation script to auto-generate Cython code for
+    various functions which need to be available for the 4 major data types
+    used in pandas (float64, bool, object, int64)
+  - Refactored code related to `DataFrame.join` so that intermediate aligned
+    copies of the data in each `DataFrame` argument do not need to be
+    created. Substantial performance increases result (GH #176)
+  - Substantially improved performance of generic `Index.intersection` and
+    `Index.union`
+  - Improved performance of `DateRange.union` with overlapping ranges and
+    non-cacheable offsets (like Minute). Implemented analogous fast
+    `DateRange.intersection` for overlapping ranges.
+  - Implemented `BlockManager.take` resulting in significantly faster `take`
+    performance on mixed-type `DataFrame` objects (GH #104)
+  - Improved performance of `Series.sort_index`
+  - Significant groupby performance enhancement: removed unnecessary integrity
+    checks in DataFrame internals that were slowing down slicing operations to
+    retrieve groups
+  - Added informative Exception when passing dict to DataFrame groupby
+    aggregation with axis != 0
+
+**API Changes**
+
+None
+
+**Bug fixes**
+
+  - Fixed minor unhandled exception in Cython code implementing fast groupby
+    aggregation operations
+  - Fixed bug in unstacking code manifesting with more than 3 hierarchical
+    levels
+  - Throw exception when step specified in label-based slice (GH #185)
+  - Fix isnull to correctly work with np.float32. Fix upstream bug described in
+    GH #182
+  - Finish implementation of as_index=False in groupby for DataFrame
+    aggregation (GH #181)
+  - Raise SkipTest for pre-epoch HDFStore failure. Real fix will be sorted out
+    via datetime64 dtype
+
+Thanks
+------
+
+- Uri Laserson
+- Scott Sinclair
 
 pandas 0.4.1
 ============
 
-**Release date:** Not yet released
+**Release date:** 9/25/2011
 
 This is primarily a bug fix release but includes some new features and
 improvements
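
The `Series.align` entry in the 0.4.2 notes above can be illustrated with a minimal sketch. It is written against a current pandas release rather than 0.4.2, so the exact signature at the time of this change is an assumption, and the series and labels are made up for illustration.

```python
import pandas as pd

# Two series whose indexes only partly overlap (illustrative data).
left = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
right = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# join="inner" keeps only the labels present in both indexes.
inner_left, inner_right = left.align(right, join="inner")

# join="outer" keeps the union of labels and fills the gaps with NaN.
outer_left, outer_right = left.align(right, join="outer")

print(list(inner_left.index))
print(list(outer_left.index))
# isnull as an instance method, as listed in the 0.4.3 notes.
print(int(outer_left.isnull().sum()))
```

Both returned objects share the same index, so element-wise arithmetic on them needs no further alignment.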
@@ -42,6 +177,10 @@ improvements
   - Optimized `_ensure_index` function resulting in performance savings in
     type-checking Index objects
 
+**API Changes**
+
+None
+
 **Bug fixes**
 
   - Fixed DataFrame constructor bug causing downstream problems (e.g. .copy()
 
@@ -0,0 +1,7 @@
+- SparseSeries name integration + tests
+- Refactor Series.repr
+- .name pickling / unpickling / HDFStore handling
+- Is there a way to write hierarchical columns to csv?
+- Possible to blow away existing name when creating MultiIndex?
+- prettytable output with index names
+- Add load/save functions to top level pandas namespace
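
The ".name pickling" item in the TODO list above could be checked with a round-trip like the following. This is only a sketch, assuming a current pandas release where both `Series.name` and `Index.name` exist; the data and the names `price` and `ticker` are hypothetical.

```python
import pickle

import pandas as pd

# A named Series on a named Index (illustrative values).
prices = pd.Series(
    [10.5, 20.25],
    index=pd.Index(["AAA", "BBB"], name="ticker"),
    name="price",
)

# Round-trip through pickle and confirm that neither name is discarded.
restored = pickle.loads(pickle.dumps(prices))

print(restored.name, restored.index.name)
```

A similar check could be written for HDF5 or flat-file storage, which the TODO item also mentions.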
@@ -0,0 +1,25 @@
+library(zoo)
+library(xts)
+
+indices = rep(NA, 100000)
+for (i in 1:100000)
+  indices[i] <- paste(sample(letters, 10), collapse="")
+
+timings <- numeric()
+
+## x <- zoo(rnorm(100000), indices)
+## y <- zoo(rnorm(90000), indices[sample(1:100000, 90000)])
+
+## indices <- as.POSIXct(1:100000)
+
+indices <- as.POSIXct(Sys.Date()) + 1:1000000
+
+x <- xts(rnorm(1000000), indices)
+y <- xts(rnorm(900000), indices[sample(1:1000000, 900000)])
+
+for (i in 1:10) {
+  gc()
+  timings[i] = system.time(x + y)[3]
+}
+
+mean(timings)
@@ -0,0 +1,40 @@
+from pandas import *
+from pandas.util.testing import rands
+
+from la import larry
+
+n = 100000
+indices = Index([rands(10) for _ in xrange(n)])
+
+def sample(values, k):
+    from random import shuffle
+    sampler = np.arange(len(values))
+    shuffle(sampler)
+    return values.take(sampler[:k])
+
+subsample_size = 90000
+
+# x = Series(np.random.randn(100000), indices)
+# y = Series(np.random.randn(subsample_size),
+#            index=sample(indices, subsample_size))
+
+
+# lx = larry(np.random.randn(100000), [list(indices)])
+# ly = larry(np.random.randn(subsample_size), [list(y.index)])
+
+stamps = np.random.randint(1000000000, 1000000000000, 2000000)
+
+idx1 = np.sort(sample(stamps, 1000000))
+idx2 = np.sort(sample(stamps, 1000000))
+
+ts1 = Series(np.random.randn(1000000), idx1)
+ts2 = Series(np.random.randn(1000000), idx2)
+
+# Benchmark 1: Two 1-million length time series (int64-based index) with
+# randomly chosen timestamps
+
+# Benchmark 2: Join two 5-variate time series DataFrames (outer and inner join)
+
+df1 = DataFrame(np.random.randn(1000000, 5), idx1, columns=range(5))
+df2 = DataFrame(np.random.randn(1000000, 5), idx2, columns=range(5, 10))
+
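
The Python script above builds the two series for Benchmark 1 but does not time anything, unlike the R script, which averages ten runs of `x + y`. A minimal sketch of a comparable timing loop follows. It assumes Python 3 with current NumPy and pandas, uses a smaller default size so it runs quickly, and draws unique integer stamps so that alignment is unambiguous; the function name `time_add` and its parameters are hypothetical.

```python
import timeit

import numpy as np
import pandas as pd


def time_add(n=100_000, repeats=10, seed=0):
    """Return the mean wall time of ts1 + ts2 and the two series."""
    rng = np.random.default_rng(seed)
    # Pool of unique int64 "timestamps", twice the series length, so the
    # two samples overlap only partially.
    pool = np.arange(2 * n, dtype=np.int64)
    idx1 = np.sort(rng.choice(pool, size=n, replace=False))
    idx2 = np.sort(rng.choice(pool, size=n, replace=False))
    ts1 = pd.Series(rng.standard_normal(n), index=idx1)
    ts2 = pd.Series(rng.standard_normal(n), index=idx2)
    # One addition per measurement, repeated like the R loop.
    timings = timeit.repeat(lambda: ts1 + ts2, number=1, repeat=repeats)
    return sum(timings) / len(timings), ts1, ts2


mean_seconds, ts1, ts2 = time_add()
print(f"mean time for ts1 + ts2: {mean_seconds:.6f} s")
```

Passing `n=1_000_000` would match the series length used in the script above.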
@@ -6,7 +6,7 @@
 import pandas.core.datetools as datetools
 
 from pandas.core.common import isnull, notnull, set_printoptions
-from pandas.core.index import Index, Factor, MultiIndex
+from pandas.core.index import Index, Int64Index, Factor, MultiIndex
 from pandas.core.daterange import DateRange
 from pandas.core.series import Series, TimeSeries
 from pandas.core.frame import DataFrame