Commit a0589c6

Merge pull request #7463 from ajschmidt8/branch-0.19-merge-0.18
[skip-ci] Update 0.18 changelog entry
2 parents 9ae85ae + 4aceef4 commit a0589c6

1 file changed: +209 -3 lines changed

CHANGELOG.md

Lines changed: 209 additions & 3 deletions
@@ -6,9 +6,215 @@

 ## Bug Fixes

-# 0.18.0
-
-Please see https://github.com/rapidsai/cudf/releases/tag/branch-0.18-latest for the latest changes to this development branch.
+# cuDF 0.18.0 (24 Feb 2021)
+
+## Breaking Changes 🚨
+
+- Default `groupby` to `sort=False` (#7180) @isVoid
+- Add libcudf API for parsing of ORC statistics (#7136) @vuule
+- Replace ORC writer api with class (#7099) @rgsl888prabhu
+- Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
+- Replace parquet writer api with class (#7058) @rgsl888prabhu
+- Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
+- Fix default parameter values of `write_csv` and `write_parquet` (#6967) @vuule
+- Align `Series.groupby` API to match Pandas (#6964) @kkraus14
+- Share `factorize` implementation with Index and cudf module (#6885) @brandon-b-miller
+
+## Bug Fixes 🐛
+
+- Remove incorrect std::move call on return variable (#7319) @davidwendt
+- Fix failing CI ORC test (#7313) @vuule
+- Disallow constructing frames from a ColumnAccessor (#7298) @shwina
+- fix java cuFile tests (#7296) @rongou
+- Fix style issues related to NumPy (#7279) @shwina
+- Fix bug when `iloc` slice terminates at before-the-zero position (#7277) @isVoid
+- Fix copying dtype metadata after calling libcudf functions (#7271) @shwina
+- Move lists utility function definition out of header (#7266) @mythrocks
+- Throw if bool column would cause incorrect result when writing to ORC (#7261) @vuule
+- Use `uvector` in `replace_nulls`; Fix `sort_helper::grouped_value` doc (#7256) @isVoid
+- Remove floating point types from cudf::sort fast-path (#7250) @davidwendt
+- Disallow picking output columns from nested columns. (#7248) @devavret
+- Fix `loc` for Series with a MultiIndex (#7243) @shwina
+- Fix Arrow column test leaks (#7241) @tgravescs
+- Fix test column vector leak (#7238) @kuhushukla
+- Fix some bugs in java scalar support for decimal (#7237) @revans2
+- Improve `assert_eq` handling of scalar (#7220) @isVoid
+- Fix missing null_count() comparison in test framework and related failures (#7219) @nvdbaranec
+- Remove floating point types from radix sort fast-path (#7215) @davidwendt
+- Fixing parquet benchmarks (#7214) @rgsl888prabhu
+- Handle various parameter combinations in `replace` API (#7207) @galipremsagar
+- Export mock aws credentials for s3 tests (#7176) @ayushdg
+- Add `MultiIndex.rename` API (#7172) @isVoid
+- Fix importing list & struct types in `from_arrow` (#7162) @galipremsagar
+- Fixing parquet precision writing failing if scale is equal to precision (#7146) @hyperbolic2346
+- Update s3 tests to use moto_server (#7144) @ayushdg
+- Fix JIT cache multi-process test flakiness in slow drives (#7142) @devavret
+- Fix compilation errors in libcudf (#7138) @galipremsagar
+- Fix compilation failure caused by `-Wall` addition. (#7134) @codereport
+- Add informative error message for `sep` in CSV writer (#7095) @galipremsagar
+- Add JIT cache per compute capability (#7090) @devavret
+- Implement `__hash__` method for ListDtype (#7081) @galipremsagar
+- Only upload packages that were built (#7077) @raydouglass
+- Fix comparisons between Series and cudf.NA (#7072) @brandon-b-miller
+- Handle `nan` values correctly in `Series.one_hot_encoding` (#7059) @galipremsagar
+- Add `unstack()` support for non-multiindexed dataframes (#7054) @isVoid
+- Fix `read_orc` for decimal type (#7034) @rgsl888prabhu
+- Fix backward compatibility of loading a 0.16 pkl file (#7033) @galipremsagar
+- Decimal casts in JNI became a NOOP (#7032) @revans2
+- Restore usual instance/subclass checking to cudf.DateOffset (#7029) @shwina
+- Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
+- Fix to_csv delimiter handling of timestamp format (#7023) @davidwendt
+- Pin librdkakfa to gcc 7 compatible version (#7021) @raydouglass
+- Fix `fillna` & `dropna` to also consider `np.nan` as a missing value (#7019) @galipremsagar
+- Fix round operator's HALF_EVEN computation for negative integers (#7014) @nartal1
+- Skip Thrust sort patch if already applied (#7009) @harrism
+- Fix `cudf::hash_partition` for `decimal32` and `decimal64` (#7006) @codereport
+- Fix Thrust unroll patch command (#7002) @harrism
+- Fix loc behaviour when key of incorrect type is used (#6993) @shwina
+- Fix int to datetime conversion in csv_read (#6991) @kaatish
+- fix excluding cufile tests by default (#6988) @rongou
+- Fix java cufile tests when cufile is not installed (#6987) @revans2
+- Make `cudf::round` for `fixed_point` when `scale = -decimal_places` a no-op (#6975) @codereport
+- Fix type comparison for java (#6970) @revans2
+- Fix default parameter values of `write_csv` and `write_parquet` (#6967) @vuule
+- Align `Series.groupby` API to match Pandas (#6964) @kkraus14
+- Fix timestamp parsing in ORC reader for timezones without transitions (#6959) @vuule
+- Fix typo in numerical.py (#6957) @rgsl888prabhu
+- `fixed_point_value` double-shifts in `fixed_point` construction (#6950) @codereport
+- fix libcu++ include path for jni (#6948) @rongou
+- Fix groupby agg/apply behaviour when no key columns are provided (#6945) @shwina
+- Avoid inserting null elements into join hash table when nulls are treated as unequal (#6943) @hyperbolic2346
+- Fix cudf::merge gtest for dictionary columns (#6942) @davidwendt
+- Pass numeric scalars of the same dtype through numeric binops (#6938) @brandon-b-miller
+- Fix N/A detection for empty fields in CSV reader (#6922) @vuule
+- Fix rmm_mode=managed parameter for gtests (#6912) @davidwendt
+- Fix nullmask offset handling in parquet and orc writer (#6889) @kaatish
+- Correct the sampling range when sampling with replacement (#6884) @ChrisJar
+- Handle nested string columns with no children in contiguous_split. (#6864) @nvdbaranec
+- Fix `columns` & `index` handling in dataframe constructor (#6838) @galipremsagar
+
+## Documentation 📖
+
+- Update readme (#7318) @shwina
+- Fix typo in cudf.core.column.string.extract docs (#7253) @adelevie
+- Update doxyfile project number (#7161) @davidwendt
+- Update 10 minutes to cuDF and CuPy with new APIs (#7158) @ChrisJar
+- Cross link RMM & libcudf Doxygen docs (#7149) @ajschmidt8
+- Add documentation for support dtypes in all IO formats (#7139) @galipremsagar
+- Add groupby docs (#7100) @shwina
+- Update cudf python docstrings with new null representation (`<NA>`) (#7050) @galipremsagar
+- Make Doxygen comments formatting consistent (#7041) @vuule
+- Add docs for working with missing data (#7010) @galipremsagar
+- Remove warning in from_dlpack and to_dlpack methods (#7001) @miguelusque
+- libcudf Developer Guide (#6977) @harrism
+- Add JNI wrapper for the cuFile API (GDS) (#6940) @rongou
+
+## New Features 🚀
+
+- Support `numeric_only` field for `rank()` (#7213) @isVoid
+- Add support for `cudf::binary_operation` `TRUE_DIV` for `decimal32` and `decimal64` (#7198) @codereport
+- Implement COLLECT rolling window aggregation (#7189) @mythrocks
+- Add support for array-like inputs in `cudf.get_dummies` (#7181) @galipremsagar
+- Default `groupby` to `sort=False` (#7180) @isVoid
+- Add libcudf lists column count_elements API (#7173) @davidwendt
+- Implement `cudf::group_by` (sort) for `decimal32` and `decimal64` (#7169) @codereport
+- Add encoding and compression argument to CSV writer (#7168) @VibhuJawa
+- `cudf::rolling_window` `SUM` support for `decimal32` and `decimal64` (#7147) @codereport
+- Adding support for explode to cuDF (#7140) @hyperbolic2346
+- Add libcudf API for parsing of ORC statistics (#7136) @vuule
+- update GDS/cuFile location for 0.9 release (#7131) @rongou
+- Add Segmented sort (#7122) @karthikeyann
+- Add `cudf::binary_operation` `NULL_MIN`, `NULL_MAX` & `NULL_EQUALS` for `decimal32` and `decimal64` (#7119) @codereport
+- Add `scale` and `value` methods to `fixed_point` (#7109) @codereport
+- Replace ORC writer api with class (#7099) @rgsl888prabhu
+- Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
+- Improve `digitize` API (#7071) @isVoid
+- Add List types support in data generator (#7064) @galipremsagar
+- `cudf::scan` support for `decimal32` and `decimal64` (#7063) @codereport
+- `cudf::rolling` `ROW_NUMBER` support for `decimal32` and `decimal64` (#7061) @codereport
+- Replace parquet writer api with class (#7058) @rgsl888prabhu
+- Support contains() on lists of primitives (#7039) @mythrocks
+- Implement `cudf::rolling` for `decimal32` and `decimal64` (#7037) @codereport
+- Add `ffill` and `bfill` to string columns (#7036) @isVoid
+- Enable round in cudf for DataFrame and Series (#7022) @ChrisJar
+- Extend `replace_nulls_policy` to `string` and `dictionary` type (#7004) @isVoid
+- Add segmented_gather(list_column, gather_list) (#7003) @karthikeyann
+- Add `method` field to `fillna` for fixed width columns (#6998) @isVoid
+- Manual merge of branch 0.17 into branch 0.18 (#6995) @shwina
+- Implement `cudf::reduce` for `decimal32` and `decimal64` (part 2) (#6980) @codereport
+- Add Ufunc alias look up for appropriate numpy ufunc dispatching (#6973) @VibhuJawa
+- Add pytest-xdist to dev environment.yml (#6958) @galipremsagar
+- Add `Index.set_names` api (#6929) @galipremsagar
+- Add `replace_null` API with `replace_policy` parameter, `fixed_width` column support (#6907) @isVoid
+- Share `factorize` implementation with Index and cudf module (#6885) @brandon-b-miller
+- Implement update() function (#6883) @skirui-source
+- Add groupby idxmin, idxmax aggregation (#6856) @karthikeyann
+- Implement `cudf::reduce` for `decimal32` and `decimal64` (part 1) (#6814) @codereport
+- Implement cudf.DateOffset for months (#6775) @brandon-b-miller
+- Add Python DecimalColumn (#6715) @shwina
+- Add dictionary support to libcudf groupby functions (#6585) @davidwendt
+
+## Improvements 🛠️
+
+- Update stale GHA with exemptions & new labels (#7395) @mike-wendt
+- Add GHA to mark issues/prs as stale/rotten (#7388) @Ethyling
+- Unpin from numpy < 1.20 (#7335) @shwina
+- Prepare Changelog for Automation (#7309) @galipremsagar
+- Prepare Changelog for Automation (#7272) @ajschmidt8
+- Add JNI support for converting Arrow buffers to CUDF ColumnVectors (#7222) @tgravescs
+- Add coverage for `skiprows` and `num_rows` in parquet reader fuzz testing (#7216) @galipremsagar
+- Define and implement more behavior for merging on categorical variables (#7209) @brandon-b-miller
+- Add CudfSeriesGroupBy to optimize dask_cudf groupby-mean (#7194) @rjzamora
+- Add dictionary column support to rolling_window (#7186) @davidwendt
+- Modify the semantics of `end` pointers in cuIO to match standard library (#7179) @vuule
+- Adding unit tests for `fixed_point` with extremely large `scale`s (#7178) @codereport
+- Fast path single column sort (#7167) @davidwendt
+- Fix -Werror=sign-compare errors in device code (#7164) @trxcllnt
+- Refactor cudf::string_view host and device code (#7159) @davidwendt
+- Enable logic for GPU auto-detection in cudfjni (#7155) @gerashegalov
+- Java bindings for Fixed-point type support for Parquet (#7153) @razajafri
+- Add Java interface for the new API 'explode' (#7151) @firestarman
+- Replace offsets with iterators in cuIO utilities and CSV parser (#7150) @vuule
+- Add gbenchmarks for reduction aggregations any() and all() (#7129) @davidwendt
+- Update JNI for contiguous_split packed results (#7127) @jlowe
+- Add JNI and Java bindings for list_contains (#7125) @kuhushukla
+- Add Java unit tests for window aggregate 'collect' (#7121) @firestarman
+- verify window operations on decimal with java tests (#7120) @sperlingxx
+- Adds in JNI support for creating an list column from existing columns (#7112) @revans2
+- Build libcudf with -Wall (#7105) @trxcllnt
+- Add column_device_view pointers to EncColumnDesc (#7097) @kaatish
+- Add `pyorc` to dev environment (#7085) @galipremsagar
+- JNI support for creating struct column from existing columns and fixed bug in struct with no children (#7084) @revans2
+- Fastpath single strings column in cudf::sort (#7075) @davidwendt
+- Upgrade nvcomp to 1.2.1 (#7069) @rongou
+- Refactor ORC `ProtobufReader` to make it more extendable (#7055) @vuule
+- Add Java tests for decimal casts (#7051) @sperlingxx
+- Auto-label PRs based on their content (#7044) @jolorunyomi
+- Create sort gbenchmark for strings column (#7040) @davidwendt
+- Refactor io memory fetches to use hostdevice_vector methods (#7035) @ChrisJar
+- Spark Murmur3 hash functionality (#7024) @rwlee
+- Fix libcudf strings logic where size_type is used to access INT32 column data (#7020) @davidwendt
+- Adding decimal writing support to parquet (#7017) @hyperbolic2346
+- Add compression="infer" as default for dask_cudf.read_csv (#7013) @rjzamora
+- Correct ORC docstring; other minor cuIO improvements (#7012) @vuule
+- Reduce number of hostdevice_vector allocations in parquet reader (#7005) @devavret
+- Check output size overflow on strings gather (#6997) @davidwendt
+- Improve representation of `MultiIndex` (#6992) @galipremsagar
+- Disable some pragma unroll statements in thrust sort.h (#6982) @davidwendt
+- Minor `cudf::round` internal refactoring (#6976) @codereport
+- Add Java bindings for URL conversion (#6972) @jlowe
+- Enable strict_decimal_types in parquet reading (#6969) @sperlingxx
+- Add in basic support to JNI for logical_cast (#6954) @revans2
+- Remove duplicate file array_tests.cpp (#6953) @karthikeyann
+- Add null mask `fixed_point_column_wrapper` constructors (#6951) @codereport
+- Update Java bindings version to 0.18-SNAPSHOT (#6949) @jlowe
+- Use simplified `rmm::exec_policy` (#6939) @harrism
+- Add null count test for apply_boolean_mask (#6903) @harrism
+- Implement DataFrame.quantile for datetime and timedelta data types (#6902) @ChrisJar
+- Remove **kwargs from string/categorical methods (#6750) @shwina
+- Refactor rolling.cu to reduce compile time (#6512) @mythrocks
+- Add static type checking via Mypy (#6381) @shwina
+- Update to official libcu++ on Github (#6275) @trxcllnt

 # cuDF 0.17.0 (10 Dec 2020)
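Note on the first breaking change in the 0.18.0 entry, "Default `groupby` to `sort=False` (#7180)": the sketch below illustrates, in plain Python, what the distinction between sorted and unsorted group keys means for calling code. It is not cuDF's implementation. `group_sum` is a hypothetical helper, and its choice to return unsorted keys in first-appearance order is an assumption made for illustration only; the changelog does not state what order cuDF produces when `sort=False`.

```python
def group_sum(keys, values, sort=False):
    """Sum values per key.

    Hypothetical helper for illustration; not a cuDF API.
    With sort=False the keys come back in first-appearance order
    (an assumption of this sketch). With sort=True they are sorted.
    """
    totals = {}
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + value
    items = list(totals.items())
    if sort:
        # Sorting is an extra pass over the grouped result,
        # which is why a library may prefer to skip it by default.
        items.sort()
    return items


keys = ["b", "a", "b", "c", "a"]
values = [1, 2, 3, 4, 5]

unsorted_result = group_sum(keys, values)            # new-style default
sorted_result = group_sum(keys, values, sort=True)   # explicit opt-in
```

The practical reading for callers: code that relied on grouped output arriving in key order would need to request sorting explicitly once the default changes.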