assigning column values to empty dataframe with loc #10017

Closed
paolini opened this issue Apr 29, 2015 · 9 comments
Labels
Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), Usage Question

Comments

@paolini

paolini commented Apr 29, 2015

I'm trying to add a column to an existing dataframe. The dataframe can be empty (0 rows) but I want the column to be added anyway.

import pandas as pd
import numpy as np

df = pd.DataFrame(index=np.arange(4))
df.loc[:,'col'] = 42 # this works fine!

df = pd.DataFrame(index=np.arange(0)) # empty dataframe
df.loc[:,'col'] = 42 # ERROR: ValueError: cannot set a frame with no defined index and a scalar
df['col'] = 42 # this works fine but seems deprecated

pd.__version__ is '0.15.2'
@jreback
Contributor

jreback commented Apr 29, 2015

By setting the index to a 0-length array, you are actually setting the index.

This is de-facto the same thing as

df = DataFrame()
df.loc[:,'col']

This kind of setting is simply not allowed because you don't have an index length; you can do it in the constructor directly if you really want.

In [9]: df = DataFrame(index=range(5))

In [10]: df
Out[10]: 
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4]

In [11]: df.loc[:,'foo'] = 42

In [12]: df
Out[12]: 
   foo
0   42
1   42
2   42
3   42
4   42

These semantics are not usually what you would want anyhow; use .append instead.

e.g.

In [18]: DataFrame().append(Series(42,index=[1,2],name='col'))
Out[18]: 
      1   2
col  42  42
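For readers on current pandas: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the call above no longer runs there. A minimal sketch of the same result with `pd.concat` follows; the variable names are illustrative and not from the thread.

```python
import pandas as pd

# The Series from the example above: value 42 at labels 1 and 2, named 'col'.
row = pd.Series(42, index=[1, 2], name="col")

# Turn the Series into a one-row frame (its name becomes the row label),
# then concatenate a list of such frames instead of calling .append.
frames = [row.to_frame().T]
result = pd.concat(frames)

print(result)
```

Collecting rows in a list and calling `pd.concat` once is generally preferred over appending row by row.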

@jreback closed this as completed Apr 29, 2015
@jreback added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Usage Question labels Apr 29, 2015
@paolini
Author

paolini commented Apr 29, 2015

How do I add a column to a possibly empty, already existing dataframe?

@jreback
Contributor

jreback commented Apr 29, 2015

In [1]: df = DataFrame()

In [2]: df['foo'] = 1

In [3]: df
Out[3]: 
Empty DataFrame
Columns: [foo]
Index: []
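The same assignment can be wrapped in a small helper that behaves identically for empty and non-empty frames. This is only a sketch: `add_constant_column` is a hypothetical name, not a pandas function.

```python
import numpy as np
import pandas as pd


def add_constant_column(df, name, value):
    """Return a copy of df with a new column filled with a scalar.

    Hypothetical helper: it relies only on df[name] = value, which adds
    the column even when the frame has zero rows.
    """
    out = df.copy()
    out[name] = value
    return out


# Zero-row frame: the column is created, but there are no rows to fill.
empty = add_constant_column(pd.DataFrame(index=np.arange(0)), "col", 42)

# Four-row frame: the scalar is broadcast to every row.
filled = add_constant_column(pd.DataFrame(index=np.arange(4)), "col", 42)

print(list(empty.columns), len(empty))
print(filled)
```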

@paolini
Author

paolini commented Apr 29, 2015

I thought that df['foo'] = 1 was deprecated because, under some conditions that I am not able to reproduce in a test, I got this warning: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead"

@jreback
Contributor

jreback commented Apr 29, 2015

@paolini certainly not. You are getting the warning because you are making changes to something that is a view of another object. Check your code, and see the documentation section on that warning. If it is still showing an error, then post your code.
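One common way this warning arises is assigning a column on a filtered subset of another frame. The sketch below is an assumption about the kind of code involved (the original code was never posted); it takes an explicit copy so that the assignment targets an independent object.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4]})

# Filtering returns a new object derived from df. Taking an explicit
# .copy() makes it independent, so adding a column to it is unambiguous.
subset = df[df["a"] > 2].copy()
subset["foo"] = 1

print(subset)
print(df.columns.tolist())  # the original frame is left unchanged
```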

@paolini
Author

paolini commented Apr 29, 2015

Ok, thanks for your support!

@AwasthiMaddy

@jreback

In [1]: df = DataFrame()

In [2]: df['foo'] = 1

In [3]: df
Out[3]: 
Empty DataFrame
Columns: [foo]
Index: []

Shouldn't the df here have the value 1 in column foo?
That is what one expects from df['foo'] = 1; instead, it adds the column but does not insert the value.
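A scalar assignment fills one value per existing row, so a frame with zero rows gets the column but nothing to fill. The sketch below contrasts that with two ways of ending up with an actual row; the variable names are illustrative.

```python
import pandas as pd

# No rows: the scalar has nothing to be broadcast over.
no_rows = pd.DataFrame()
no_rows["foo"] = 1

# One row defined up front: the scalar fills that row.
one_row = pd.DataFrame(index=[0])
one_row["foo"] = 1

# Assigning a list to an empty frame: the list supplies the rows.
from_list = pd.DataFrame()
from_list["foo"] = [1]

print(len(no_rows), len(one_row), len(from_list))
```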

@Shreya12sahay

frequent_itemsets = apriori(data,min_support=0.6)
frequent_itemsets
When I run this in Python 3.6, I get this output:
Empty DataFrame
Columns: [support, itemsets]
Index: []

My code displays the correct output up to this point:

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori
df=pd.read_csv("E:/gro.csv")
print(df)
citrus fruit semi-finished bread margarine margarine.1 ... Unnamed: 28 Unnamed: 29 Unnamed: 30 Unnamed: 31
0 tropical fruit yogurt coffee NaN ... NaN NaN NaN NaN
1 whole milk NaN NaN NaN ... NaN NaN NaN NaN
2 pip fruit yogurt cream cheese meat spreads ... NaN NaN NaN NaN
3 other vegetables whole milk condensed milk long life bakery product ... NaN NaN NaN NaN
4 whole milk butter yogurt rice ... NaN NaN NaN NaN
5 rolls/buns NaN NaN NaN ... NaN NaN NaN NaN
6 other vegetables UHT-milk rolls/buns bottled beer ... NaN NaN NaN NaN
7 pot plants NaN NaN NaN ... NaN NaN NaN NaN
8 whole milk cereals NaN NaN ... NaN NaN NaN NaN
9 tropical fruit other vegetables white bread bottled water ... NaN

te = TransactionEncoder()
te_ary = te.fit(df).transform(df)
data = pd.DataFrame(te_ary, columns=te.columns_)
print(data)
- . 0 1 2 3 4 5 6 7 ... e f g h i m n r s t u
0 True False False False False False False False False False False ... False True False False True False False True True True True
1 True True False False False False False False False False False ... True True False True True True True True True False False
2 False False False False False False False False False False False ... True False True False True True True True False False False
3 False False True False True False False False False False False ... True False True False True True True True False False False
4 True False False False False False False True False False False ... True False False False False True True False False False False
5 True False False False False False False False True False False ... True False False False False True True False False False False
6 True False False False False False False False False True False ... True False False False False True True False False False False
7 True False False False False False False False False False True ... True False False False False True True False False False False
8 True False False False False False False False False False False ... True False False False False True True False False False False
9 True False False False False False False False False False False ... True False False False False True True False False False False

but I do not get the expected output when running this:

frequent_itemsets = apriori(data, min_support=0.6, use_colnames=True)
frequent_itemsets

The code above does not display the data; instead it displays an empty DataFrame, even though the input DataFrame is not empty:
Empty DataFrame
Columns: [support, itemsets]
Index: []

Please help me
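A possible cause, judging from the single-character column names in the encoded output: iterating over a DataFrame yields its column labels rather than its rows, while mlxtend's `TransactionEncoder` is documented to take a list of lists of items. The sketch below uses only pandas and a small made-up frame (not the poster's `gro.csv`) to show how rows could be turned into transactions; whether this resolves the empty `apriori` result is an assumption.

```python
import numpy as np
import pandas as pd

# Small stand-in for a wide CSV: one transaction per row, padded with NaN.
df = pd.DataFrame(
    [
        ["whole milk", "yogurt", np.nan],
        ["whole milk", np.nan, np.nan],
        ["pip fruit", "yogurt", "cream cheese"],
    ]
)

# Iterating a DataFrame gives column labels, not rows.
labels = list(df)

# Build one list of items per row, dropping the NaN padding.
transactions = [row.dropna().tolist() for _, row in df.iterrows()]

print(labels)
print(transactions)
```

A list like `transactions` could then be passed to the encoder in place of the DataFrame. If the first line of the file is itself a transaction rather than a header, reading it with `header=None` would keep it as data.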

@rchauhan1

I want to add N/A values to empty columns. Please help.
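The request is brief, so the following is only one reading of it: adding a new column whose entries are all missing. The frame and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# Assigning a scalar NaN broadcasts it to every existing row,
# producing a column that holds only missing values.
df["b"] = np.nan

print(df)
```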

5 participants