| 6 | 6 | |
| 7 | 7 | ## Bug Fixes |
| 8 | 8 | |
| 9 | | -# 0.18.0 |
| 10 | | - |
| 11 | | -Please see https://github.com/rapidsai/cudf/releases/tag/branch-0.18-latest for the latest changes to this development branch. |
| 9 | +# cuDF 0.18.0 (24 Feb 2021) |
| 10 | + |
| 11 | +## Breaking Changes 🚨 |
| 12 | + |
| 13 | +- Default `groupby` to `sort=False` (#7180) @isVoid |
| 14 | +- Add libcudf API for parsing of ORC statistics (#7136) @vuule |
| 15 | +- Replace ORC writer api with class (#7099) @rgsl888prabhu |
| 16 | +- Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec |
| 17 | +- Replace parquet writer api with class (#7058) @rgsl888prabhu |
| 18 | +- Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt |
| 19 | +- Fix default parameter values of `write_csv` and `write_parquet` (#6967) @vuule |
| 20 | +- Align `Series.groupby` API to match Pandas (#6964) @kkraus14 |
| 21 | +- Share `factorize` implementation with Index and cudf module (#6885) @brandon-b-miller |
| 22 | + |
| 23 | +## Bug Fixes 🐛 |
| 24 | + |
| 25 | +- Remove incorrect std::move call on return variable (#7319) @davidwendt |
| 26 | +- Fix failing CI ORC test (#7313) @vuule |
| 27 | +- Disallow constructing frames from a ColumnAccessor (#7298) @shwina |
| 28 | +- Fix Java cuFile tests (#7296) @rongou |
| 29 | +- Fix style issues related to NumPy (#7279) @shwina |
| 30 | +- Fix bug when `iloc` slice terminates at before-the-zero position (#7277) @isVoid |
| 31 | +- Fix copying dtype metadata after calling libcudf functions (#7271) @shwina |
| 32 | +- Move lists utility function definition out of header (#7266) @mythrocks |
| 33 | +- Throw if bool column would cause incorrect result when writing to ORC (#7261) @vuule |
| 34 | +- Use `uvector` in `replace_nulls`; Fix `sort_helper::grouped_value` doc (#7256) @isVoid |
| 35 | +- Remove floating point types from cudf::sort fast-path (#7250) @davidwendt |
| 36 | +- Disallow picking output columns from nested columns. (#7248) @devavret |
| 37 | +- Fix `loc` for Series with a MultiIndex (#7243) @shwina |
| 38 | +- Fix Arrow column test leaks (#7241) @tgravescs |
| 39 | +- Fix test column vector leak (#7238) @kuhushukla |
| 40 | +- Fix some bugs in java scalar support for decimal (#7237) @revans2 |
| 41 | +- Improve `assert_eq` handling of scalar (#7220) @isVoid |
| 42 | +- Fix missing null_count() comparison in test framework and related failures (#7219) @nvdbaranec |
| 43 | +- Remove floating point types from radix sort fast-path (#7215) @davidwendt |
| 44 | +- Fixing parquet benchmarks (#7214) @rgsl888prabhu |
| 45 | +- Handle various parameter combinations in `replace` API (#7207) @galipremsagar |
| 46 | +- Export mock aws credentials for s3 tests (#7176) @ayushdg |
| 47 | +- Add `MultiIndex.rename` API (#7172) @isVoid |
| 48 | +- Fix importing list & struct types in `from_arrow` (#7162) @galipremsagar |
| 49 | +- Fixing parquet precision writing failing if scale is equal to precision (#7146) @hyperbolic2346 |
| 50 | +- Update s3 tests to use moto_server (#7144) @ayushdg |
| 51 | +- Fix JIT cache multi-process test flakiness in slow drives (#7142) @devavret |
| 52 | +- Fix compilation errors in libcudf (#7138) @galipremsagar |
| 53 | +- Fix compilation failure caused by `-Wall` addition. (#7134) @codereport |
| 54 | +- Add informative error message for `sep` in CSV writer (#7095) @galipremsagar |
| 55 | +- Add JIT cache per compute capability (#7090) @devavret |
| 56 | +- Implement `__hash__` method for ListDtype (#7081) @galipremsagar |
| 57 | +- Only upload packages that were built (#7077) @raydouglass |
| 58 | +- Fix comparisons between Series and cudf.NA (#7072) @brandon-b-miller |
| 59 | +- Handle `nan` values correctly in `Series.one_hot_encoding` (#7059) @galipremsagar |
| 60 | +- Add `unstack()` support for non-multiindexed dataframes (#7054) @isVoid |
| 61 | +- Fix `read_orc` for decimal type (#7034) @rgsl888prabhu |
| 62 | +- Fix backward compatibility of loading a 0.16 pkl file (#7033) @galipremsagar |
| 63 | +- Decimal casts in JNI became a NOOP (#7032) @revans2 |
| 64 | +- Restore usual instance/subclass checking to cudf.DateOffset (#7029) @shwina |
| 65 | +- Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt |
| 66 | +- Fix to_csv delimiter handling of timestamp format (#7023) @davidwendt |
| 67 | +- Pin librdkafka to gcc 7 compatible version (#7021) @raydouglass |
| 68 | +- Fix `fillna` & `dropna` to also consider `np.nan` as a missing value (#7019) @galipremsagar |
| 69 | +- Fix round operator's HALF_EVEN computation for negative integers (#7014) @nartal1 |
| 70 | +- Skip Thrust sort patch if already applied (#7009) @harrism |
| 71 | +- Fix `cudf::hash_partition` for `decimal32` and `decimal64` (#7006) @codereport |
| 72 | +- Fix Thrust unroll patch command (#7002) @harrism |
| 73 | +- Fix loc behaviour when key of incorrect type is used (#6993) @shwina |
| 74 | +- Fix int to datetime conversion in csv_read (#6991) @kaatish |
| 75 | +- fix excluding cufile tests by default (#6988) @rongou |
| 76 | +- Fix java cufile tests when cufile is not installed (#6987) @revans2 |
| 77 | +- Make `cudf::round` for `fixed_point` when `scale = -decimal_places` a no-op (#6975) @codereport |
| 78 | +- Fix type comparison for java (#6970) @revans2 |
| 79 | +- Fix default parameter values of `write_csv` and `write_parquet` (#6967) @vuule |
| 80 | +- Align `Series.groupby` API to match Pandas (#6964) @kkraus14 |
| 81 | +- Fix timestamp parsing in ORC reader for timezones without transitions (#6959) @vuule |
| 82 | +- Fix typo in numerical.py (#6957) @rgsl888prabhu |
| 83 | +- `fixed_point_value` double-shifts in `fixed_point` construction (#6950) @codereport |
| 84 | +- fix libcu++ include path for jni (#6948) @rongou |
| 85 | +- Fix groupby agg/apply behaviour when no key columns are provided (#6945) @shwina |
| 86 | +- Avoid inserting null elements into join hash table when nulls are treated as unequal (#6943) @hyperbolic2346 |
| 87 | +- Fix cudf::merge gtest for dictionary columns (#6942) @davidwendt |
| 88 | +- Pass numeric scalars of the same dtype through numeric binops (#6938) @brandon-b-miller |
| 89 | +- Fix N/A detection for empty fields in CSV reader (#6922) @vuule |
| 90 | +- Fix rmm_mode=managed parameter for gtests (#6912) @davidwendt |
| 91 | +- Fix nullmask offset handling in parquet and orc writer (#6889) @kaatish |
| 92 | +- Correct the sampling range when sampling with replacement (#6884) @ChrisJar |
| 93 | +- Handle nested string columns with no children in contiguous_split. (#6864) @nvdbaranec |
| 94 | +- Fix `columns` & `index` handling in dataframe constructor (#6838) @galipremsagar |
| 95 | + |
| 96 | +## Documentation 📖 |
| 97 | + |
| 98 | +- Update readme (#7318) @shwina |
| 99 | +- Fix typo in cudf.core.column.string.extract docs (#7253) @adelevie |
| 100 | +- Update doxyfile project number (#7161) @davidwendt |
| 101 | +- Update 10 minutes to cuDF and CuPy with new APIs (#7158) @ChrisJar |
| 102 | +- Cross link RMM & libcudf Doxygen docs (#7149) @ajschmidt8 |
| 103 | +- Add documentation for support dtypes in all IO formats (#7139) @galipremsagar |
| 104 | +- Add groupby docs (#7100) @shwina |
| 105 | +- Update cudf python docstrings with new null representation (`<NA>`) (#7050) @galipremsagar |
| 106 | +- Make Doxygen comments formatting consistent (#7041) @vuule |
| 107 | +- Add docs for working with missing data (#7010) @galipremsagar |
| 108 | +- Remove warning in from_dlpack and to_dlpack methods (#7001) @miguelusque |
| 109 | +- libcudf Developer Guide (#6977) @harrism |
| 110 | +- Add JNI wrapper for the cuFile API (GDS) (#6940) @rongou |
| 111 | + |
| 112 | +## New Features 🚀 |
| 113 | + |
| 114 | +- Support `numeric_only` field for `rank()` (#7213) @isVoid |
| 115 | +- Add support for `cudf::binary_operation` `TRUE_DIV` for `decimal32` and `decimal64` (#7198) @codereport |
| 116 | +- Implement COLLECT rolling window aggregation (#7189) @mythrocks |
| 117 | +- Add support for array-like inputs in `cudf.get_dummies` (#7181) @galipremsagar |
| 118 | +- Default `groupby` to `sort=False` (#7180) @isVoid |
| 119 | +- Add libcudf lists column count_elements API (#7173) @davidwendt |
| 120 | +- Implement `cudf::group_by` (sort) for `decimal32` and `decimal64` (#7169) @codereport |
| 121 | +- Add encoding and compression argument to CSV writer (#7168) @VibhuJawa |
| 122 | +- `cudf::rolling_window` `SUM` support for `decimal32` and `decimal64` (#7147) @codereport |
| 123 | +- Adding support for explode to cuDF (#7140) @hyperbolic2346 |
| 124 | +- Add libcudf API for parsing of ORC statistics (#7136) @vuule |
| 125 | +- update GDS/cuFile location for 0.9 release (#7131) @rongou |
| 126 | +- Add Segmented sort (#7122) @karthikeyann |
| 127 | +- Add `cudf::binary_operation` `NULL_MIN`, `NULL_MAX` & `NULL_EQUALS` for `decimal32` and `decimal64` (#7119) @codereport |
| 128 | +- Add `scale` and `value` methods to `fixed_point` (#7109) @codereport |
| 129 | +- Replace ORC writer api with class (#7099) @rgsl888prabhu |
| 130 | +- Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec |
| 131 | +- Improve `digitize` API (#7071) @isVoid |
| 132 | +- Add List types support in data generator (#7064) @galipremsagar |
| 133 | +- `cudf::scan` support for `decimal32` and `decimal64` (#7063) @codereport |
| 134 | +- `cudf::rolling` `ROW_NUMBER` support for `decimal32` and `decimal64` (#7061) @codereport |
| 135 | +- Replace parquet writer api with class (#7058) @rgsl888prabhu |
| 136 | +- Support contains() on lists of primitives (#7039) @mythrocks |
| 137 | +- Implement `cudf::rolling` for `decimal32` and `decimal64` (#7037) @codereport |
| 138 | +- Add `ffill` and `bfill` to string columns (#7036) @isVoid |
| 139 | +- Enable round in cudf for DataFrame and Series (#7022) @ChrisJar |
| 140 | +- Extend `replace_nulls_policy` to `string` and `dictionary` type (#7004) @isVoid |
| 141 | +- Add segmented_gather(list_column, gather_list) (#7003) @karthikeyann |
| 142 | +- Add `method` field to `fillna` for fixed width columns (#6998) @isVoid |
| 143 | +- Manual merge of branch 0.17 into branch 0.18 (#6995) @shwina |
| 144 | +- Implement `cudf::reduce` for `decimal32` and `decimal64` (part 2) (#6980) @codereport |
| 145 | +- Add Ufunc alias look up for appropriate numpy ufunc dispatching (#6973) @VibhuJawa |
| 146 | +- Add pytest-xdist to dev environment.yml (#6958) @galipremsagar |
| 147 | +- Add `Index.set_names` api (#6929) @galipremsagar |
| 148 | +- Add `replace_null` API with `replace_policy` parameter, `fixed_width` column support (#6907) @isVoid |
| 149 | +- Share `factorize` implementation with Index and cudf module (#6885) @brandon-b-miller |
| 150 | +- Implement update() function (#6883) @skirui-source |
| 151 | +- Add groupby idxmin, idxmax aggregation (#6856) @karthikeyann |
| 152 | +- Implement `cudf::reduce` for `decimal32` and `decimal64` (part 1) (#6814) @codereport |
| 153 | +- Implement cudf.DateOffset for months (#6775) @brandon-b-miller |
| 154 | +- Add Python DecimalColumn (#6715) @shwina |
| 155 | +- Add dictionary support to libcudf groupby functions (#6585) @davidwendt |
| 156 | + |
| 157 | +## Improvements 🛠️ |
| 158 | + |
| 159 | +- Update stale GHA with exemptions & new labels (#7395) @mike-wendt |
| 160 | +- Add GHA to mark issues/prs as stale/rotten (#7388) @Ethyling |
| 161 | +- Unpin from numpy < 1.20 (#7335) @shwina |
| 162 | +- Prepare Changelog for Automation (#7309) @galipremsagar |
| 163 | +- Prepare Changelog for Automation (#7272) @ajschmidt8 |
| 164 | +- Add JNI support for converting Arrow buffers to CUDF ColumnVectors (#7222) @tgravescs |
| 165 | +- Add coverage for `skiprows` and `num_rows` in parquet reader fuzz testing (#7216) @galipremsagar |
| 166 | +- Define and implement more behavior for merging on categorical variables (#7209) @brandon-b-miller |
| 167 | +- Add CudfSeriesGroupBy to optimize dask_cudf groupby-mean (#7194) @rjzamora |
| 168 | +- Add dictionary column support to rolling_window (#7186) @davidwendt |
| 169 | +- Modify the semantics of `end` pointers in cuIO to match standard library (#7179) @vuule |
| 170 | +- Adding unit tests for `fixed_point` with extremely large `scale`s (#7178) @codereport |
| 171 | +- Fast path single column sort (#7167) @davidwendt |
| 172 | +- Fix -Werror=sign-compare errors in device code (#7164) @trxcllnt |
| 173 | +- Refactor cudf::string_view host and device code (#7159) @davidwendt |
| 174 | +- Enable logic for GPU auto-detection in cudfjni (#7155) @gerashegalov |
| 175 | +- Java bindings for Fixed-point type support for Parquet (#7153) @razajafri |
| 176 | +- Add Java interface for the new API 'explode' (#7151) @firestarman |
| 177 | +- Replace offsets with iterators in cuIO utilities and CSV parser (#7150) @vuule |
| 178 | +- Add gbenchmarks for reduction aggregations any() and all() (#7129) @davidwendt |
| 179 | +- Update JNI for contiguous_split packed results (#7127) @jlowe |
| 180 | +- Add JNI and Java bindings for list_contains (#7125) @kuhushukla |
| 181 | +- Add Java unit tests for window aggregate 'collect' (#7121) @firestarman |
| 182 | +- verify window operations on decimal with java tests (#7120) @sperlingxx |
| 183 | +- Adds in JNI support for creating a list column from existing columns (#7112) @revans2 |
| 184 | +- Build libcudf with -Wall (#7105) @trxcllnt |
| 185 | +- Add column_device_view pointers to EncColumnDesc (#7097) @kaatish |
| 186 | +- Add `pyorc` to dev environment (#7085) @galipremsagar |
| 187 | +- JNI support for creating struct column from existing columns and fixed bug in struct with no children (#7084) @revans2 |
| 188 | +- Fastpath single strings column in cudf::sort (#7075) @davidwendt |
| 189 | +- Upgrade nvcomp to 1.2.1 (#7069) @rongou |
| 190 | +- Refactor ORC `ProtobufReader` to make it more extendable (#7055) @vuule |
| 191 | +- Add Java tests for decimal casts (#7051) @sperlingxx |
| 192 | +- Auto-label PRs based on their content (#7044) @jolorunyomi |
| 193 | +- Create sort gbenchmark for strings column (#7040) @davidwendt |
| 194 | +- Refactor io memory fetches to use hostdevice_vector methods (#7035) @ChrisJar |
| 195 | +- Spark Murmur3 hash functionality (#7024) @rwlee |
| 196 | +- Fix libcudf strings logic where size_type is used to access INT32 column data (#7020) @davidwendt |
| 197 | +- Adding decimal writing support to parquet (#7017) @hyperbolic2346 |
| 198 | +- Add compression="infer" as default for dask_cudf.read_csv (#7013) @rjzamora |
| 199 | +- Correct ORC docstring; other minor cuIO improvements (#7012) @vuule |
| 200 | +- Reduce number of hostdevice_vector allocations in parquet reader (#7005) @devavret |
| 201 | +- Check output size overflow on strings gather (#6997) @davidwendt |
| 202 | +- Improve representation of `MultiIndex` (#6992) @galipremsagar |
| 203 | +- Disable some pragma unroll statements in thrust sort.h (#6982) @davidwendt |
| 204 | +- Minor `cudf::round` internal refactoring (#6976) @codereport |
| 205 | +- Add Java bindings for URL conversion (#6972) @jlowe |
| 206 | +- Enable strict_decimal_types in parquet reading (#6969) @sperlingxx |
| 207 | +- Add in basic support to JNI for logical_cast (#6954) @revans2 |
| 208 | +- Remove duplicate file array_tests.cpp (#6953) @karthikeyann |
| 209 | +- Add null mask `fixed_point_column_wrapper` constructors (#6951) @codereport |
| 210 | +- Update Java bindings version to 0.18-SNAPSHOT (#6949) @jlowe |
| 211 | +- Use simplified `rmm::exec_policy` (#6939) @harrism |
| 212 | +- Add null count test for apply_boolean_mask (#6903) @harrism |
| 213 | +- Implement DataFrame.quantile for datetime and timedelta data types (#6902) @ChrisJar |
| 214 | +- Remove **kwargs from string/categorical methods (#6750) @shwina |
| 215 | +- Refactor rolling.cu to reduce compile time (#6512) @mythrocks |
| 216 | +- Add static type checking via Mypy (#6381) @shwina |
| 217 | +- Update to official libcu++ on Github (#6275) @trxcllnt |
| 12 | 218 | |
| 13 | 219 | # cuDF 0.17.0 (10 Dec 2020) |
| 14 | 220 | |