Commit 36bba1f

add DataFrame.insert_columns
1 parent cafa8fd commit 36bba1f

1 file changed, +46 -0 lines changed

spec/API_specification/dataframe_api/dataframe_object.py

@@ -201,6 +201,52 @@ def insert_column(self, loc: int, column: Column[Any]) -> DataFrame:
         """
         ...
 
+    def insert_columns(self, locs_and_columns: Sequence[tuple[int, Column[Any]]]) -> DataFrame:
+        """
+        Insert columns into DataFrame at specified locations.
+
+        Like `insert_column`, but can insert multiple (independent) columns.
+        Some implementations may be able to make use of parallelism in this
+        case. For example instead of:
+
+        .. code-block::
+
+            new_column = df.get_column_by_name('a') + 1
+            df = df.insert_column(0, new_column.rename('a_plus_1'))
+            new_column = df.get_column_by_name('b') + 1
+            df = df.insert_column(1, new_column.rename('b_plus_1'))
+
+        it would be better to write
+
+        .. code-block::
+
+            new_column_0 = df.get_column_by_name('a') + 1
+            new_column_1 = df.get_column_by_name('b') + 1
+            df = df.insert_columns(
+                [
+                    (0, new_column_0.rename('a_plus_1')),
+                    (1, new_column_1.rename('b_plus_1')),
+                ]
+            )
+
+        so that insertion can happen in parallel for some implementations.
+
+        Parameters
+        ----------
+        locs_and_columns : Sequence[Tuple[int, Column]]
+            Sequence of tuples of the kind (location, column).
+            Must be independent of each other.
+            Locations and column names must be unique.
+            Column names may not already be present in the
+            dataframe - use `DataFrame.rename` to rename them
+            beforehand if necessary.
+
+        Returns
+        -------
+        DataFrame
+        """
+        ...
+
     def drop_column(self, label: str) -> DataFrame:
         """
         Drop the specified column.
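
The constraints listed in the `insert_columns` docstring (unique locations, unique names, no clash with existing column names) can be illustrated with a small toy model. This is only a sketch and not part of the commit: `ToyColumn` and `ToyFrame` are hypothetical stand-ins for the spec's `Column` and `DataFrame`. The docstring does not say how locations are resolved when several columns are inserted at once, so the sketch assumes insertions are applied in ascending location order, which reproduces the result of the sequential `insert_column` example.

```python
from __future__ import annotations

from typing import Any, Sequence


class ToyColumn:
    """Hypothetical stand-in for the spec's Column: a name plus values."""

    def __init__(self, name: str, values: list[Any]) -> None:
        self.name = name
        self.values = values


class ToyFrame:
    """Hypothetical stand-in for the spec's DataFrame: an ordered list of columns."""

    def __init__(self, columns: Sequence[ToyColumn]) -> None:
        self.columns = list(columns)

    def column_names(self) -> list[str]:
        return [col.name for col in self.columns]

    def insert_columns(
        self, locs_and_columns: Sequence[tuple[int, ToyColumn]]
    ) -> ToyFrame:
        locs = [loc for loc, _ in locs_and_columns]
        names = [col.name for _, col in locs_and_columns]

        # Constraints taken from the docstring: unique locations and names,
        # and no new name may already exist in the dataframe.
        if len(set(locs)) != len(locs):
            raise ValueError("locations must be unique")
        if len(set(names)) != len(names):
            raise ValueError("new column names must be unique")
        clashes = set(names) & set(self.column_names())
        if clashes:
            raise ValueError(f"columns already present: {sorted(clashes)}")

        # Assumption (not stated in the spec): apply insertions in ascending
        # location order, so each location is a position in the result.
        new_columns = list(self.columns)
        for loc, col in sorted(locs_and_columns, key=lambda pair: pair[0]):
            new_columns.insert(loc, col)
        return ToyFrame(new_columns)


df = ToyFrame([ToyColumn("a", [1, 2]), ToyColumn("b", [3, 4])])
a_plus_1 = ToyColumn("a_plus_1", [v + 1 for v in df.columns[0].values])
b_plus_1 = ToyColumn("b_plus_1", [v + 1 for v in df.columns[1].values])

result = df.insert_columns([(0, a_plus_1), (1, b_plus_1)])
print(result.column_names())  # ['a_plus_1', 'b_plus_1', 'a', 'b']
```

Because both new columns are computed from the original `df` before any insertion happens, neither depends on the other. That independence is what the docstring says an implementation could exploit for parallelism.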
