API: integrate with openpyxl 2.0-2.1 changes #8342
Comments
Ha, we just had this discussion, in #8340 apparently! So whatever fix is needed, prob should change, say, the 3.3 build to use the last good version as well (2.0.4?), though the 2.6 build uses 2.0.3 (to make sure that WE maintain compat). |
it looks like a really minor change - maybe for now just switch the import |
@jtratner prob, just need to do it in a try: except: because we have to keep the existing one around. It's really annoying when people change the APIs around like that |
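A minimal sketch of the try/except import fallback mentioned above. The helper name `import_first_available` is hypothetical, and the candidate module paths in the usage line are stand-ins from the standard library, not the actual openpyxl locations:

```python
import importlib


def import_first_available(candidates):
    """Return the first attribute that can be imported from `candidates`.

    `candidates` is a list of (module_name, attribute_name) pairs, ordered
    from the preferred (new) location to the legacy one.
    """
    for module_name, attribute_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attribute_name)
        except (ImportError, AttributeError):
            # This location does not exist in the installed version; try the next.
            continue
    raise ImportError("none of the candidate locations could be imported")


# Stand-in usage: the first location does not exist, so the second is used.
dumps = import_first_available([
    ("module_that_does_not_exist", "dumps"),
    ("json", "dumps"),
])
```

Keeping the old location as the last candidate means the existing import stays around for older releases.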
hmm, might be something else, as 2.0.5 seems to have been working for the last few days (though it's possible that what was called 2.0.5 changed, which is a no-no) |
It is version 2.1.0 that is released, not 2.0.5 I think: http://openpyxl.readthedocs.org/en/latest/changes.html |
I just pushed a potential fix. If it works, let's just stick with the minor |
@jorisvandenbossche right |
Grr. The OpenPyxl devs are making me regret getting roped into this. It's not clear that using the deprecated
>>> from openpyxl.styles import Style
>>> from openpyxl.styles.numbers import NumberFormat
>>> nf = NumberFormat(format_code='0.00')
openpyxl.styles.numbers:1: UserWarning: Call to deprecated function or class NumberFormat (Number formats are strings. Use module functions).
>>> s = Style(number_format=nf)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "openpyxl/styles/__init__.py", line 45, in __init__
    self.number_format = number_format
  File "openpyxl/styles/hashable.py", line 54, in __setattr__
    return object.__setattr__(self, *args, **kwargs)
  File "openpyxl/styles/numbers.py", line 194, in __set__
    super(NumberFormatDescriptor, self).__set__(instance, value)
  File "openpyxl/descriptors/__init__.py", line 32, in __set__
    raise TypeError('expected ' + str(self.expected_type))
TypeError: expected <type 'basestring'>
It looks like the main alternatives for us are to A) build in separate compatibility for the two flavors of number formats, or B) declare 2.x.y prior to 2.1.0 unsupported, and treat " Thoughts? |
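For option A, here is a rough sketch of how the two flavors could be told apart by version. It assumes, based only on the traceback above, that 2.1.0 and later expect a plain string and earlier 2.x releases expect a `NumberFormat` object. The function names are hypothetical and this is not how pandas actually resolved it:

```python
import re


def parse_version(version):
    """Turn '2.1.0' into (2, 1, 0), padding short versions with zeros."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def number_format_kind(openpyxl_version):
    """Return 'string' or 'object' (assumed cutoff: openpyxl 2.1.0)."""
    if parse_version(openpyxl_version) >= (2, 1, 0):
        return "string"
    return "object"
```

A writer could call `number_format_kind(openpyxl.__version__)` once and branch on the result when building styles.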
can u ping their mailing list / issues? if they come back with "this is the API and it's stable", then fix up if u can; otherwise maybe just revert and wait for stability |
Yeah, I'll see what I can find out. |
I've opened the hailing frequencies. |
perfect - of course this problem of crazy API changes happened in v2 so not surprised |
@neirbowj what do you think is best to do in the short run? want to do a quick for one of these? |
I'll be in transit most of the day without access to the Internet (because I'll be damned if I ever knowingly buy GoGo Inflight Internet again). If this doesn't answer the immediate need, the next least worst option is probably to skip tests for openpyxl2, then next to revert. I'll be back online this evening. |
@neirbowj ok I merged in your changes (needed a couple of more for compat). Will leave this issue open for you to come back at some point and validate / fix for the continued openpyxl API changes. thanks! |
Guys, I believe IO modules are essential for Pandas usability (besides Numpy ;-) ). I think Pandas is a great package, and for production usage it is worth investing more in reliability (that also means modules). |
Hi, I am trying to create an Excel file (showing the differences between two Excel files). I am getting this error: TypeError: expected <class 'openpyxl.styles.fonts.Font'>.
Any solutions please? |
@stancikcom and @Anon3 |
@jreback The version is pandas: 0.15.2. Would you suggest I upgrade it to 0.16.0? |
@Anon3 well pls report |
Hi all, after the upgrade, pandas works fine with the new version of openpyxl.
INSTALLED VERSIONS
commit: None
pandas: 0.16.0 |
@jreback this is what I used:
pandas: 0.16.0 |
@jreback and @stancikcom even with 0.16.0 i get this error: TypeError: expected <class 'openpyxl.styles.fonts.Font'> |
@Anon3 try to share a snippet of the code that is giving you trouble. We can try to replicate / debug it. Cheers. (ms) |
@stancikcom : here is the code that I found (and I am using for official purpose) to compare two excels:
import pandas as pd
# Define the diff function to show the changes in each field
def report_diff(x):
# We want to be able to easily tell which rows have changes
def has_change(row):
# Read in both excel files
df1 = pd.read_excel('Excel1.xlsx', 'Sheet2', na_values=['NA'])
# Make sure we order by account number so the comparisons work
df1.sort(columns="Col1")
# Create a panel of the two dataframes
diff_panel = pd.Panel(dict(df1=df1,df2=df2))
#Apply the diff function
# Flag all the changes
diff_output['has_change'] = diff_output.apply(has_change, axis=1)
#Save the changes to excel but only include the columns we care about |
@Anon3 I have tried your code on my data sets (of course I have different data than you) and I can confirm that the pure IO from/to Excel worked without any problem in my Python environment. However, I think the issues are in your code logic. Currently it won't work if the data sets are of different sizes, it won't work if the If I were you I would refactor the logic as follows: data_set = { Account : (Col1,Col2,Col3,...Coln), ... } Consider to address this to |
@stancikcom thanks for your valuable comments. I am a newbie .. never used hasattr() function. |
@Anon3 with #python and #pandas you are on the right track... :-) pd.merge(df1,df2,on='account',how='left') |
@stancikcom, this is the additional piece of code: PS: I am sure there is something wrong in the comparison. I'll try to use one of these: a.empty, a.bool(), a.item(), a.any() or a.all() |
How do I compare the left and right columns from the merged dataframe? I was trying to find some reference on Stack Overflow; I didn't find one for the merge, but found a different way to compare two data frames. Here is the piece of code which works fine:
# Read in both excel files
df1 = pd.read_excel('Excel1.xlsx', 'Sheet2', na_values=['NA'])
# Make sure we order by account number so the comparisons work
df1.sort(columns="Col1")
ne_stacked = (df1 != df2).stack()
changed = ne_stacked[ne_stacked]
changed.index.names = ['id', 'col']
difference_locations = np.where(df1 != df2)
changed_from = df1.values[difference_locations]
changed_to = df2.values[difference_locations]
print(pd.DataFrame({'from': changed_from, 'to': changed_to}, index=changed.index)) |
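On the merge question itself, one possible approach is to give the two sides distinct suffixes and compare the paired columns. This is only a sketch: the column names (`account`, `name`, `balance`), the `_old`/`_new` suffixes and the sample data are made up for illustration and are not taken from the spreadsheets discussed here.

```python
import pandas as pd

# Made-up stand-ins for the two sheets; 'account' is the assumed key column.
df1 = pd.DataFrame({"account": [1, 2, 3],
                    "name": ["a", "b", "c"],
                    "balance": [10, 20, 30]})
df2 = pd.DataFrame({"account": [1, 2, 3],
                    "name": ["a", "B", "c"],
                    "balance": [10, 20, 35]})

# Suffixes keep the left and right versions of each column apart.
merged = pd.merge(df1, df2, on="account", how="left",
                  suffixes=("_old", "_new"))

rows = []
for column in [c for c in df1.columns if c != "account"]:
    old = merged[column + "_old"]
    new = merged[column + "_new"]
    differs = old != new  # note: NaN != NaN is True, so missing values count as changes
    for account, before, after in zip(merged.loc[differs, "account"],
                                      old[differs], new[differs]):
        rows.append({"account": account, "column": column,
                     "from": before, "to": after})

changes = pd.DataFrame(rows, columns=["account", "column", "from", "to"])
print(changes)
```

Unlike an element-wise `df1 != df2`, this does not require both frames to have the same shape or row order, since rows are matched on the key column.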
closed in favor of #10125 |
Builds are failing on Travis due to some issue with Excel (weirdly passed on merge but then started failing a few commits later). Currently investigating. C.f. https://travis-ci.org/pydata/pandas/jobs/35873580
@neirbowj - any idea why these tests stopped passing?
It's all some variation on the following: