TODO: more pprint improvements #3426
Comments
@y-p I can take this if you want.
Please.
import numpy as np
import pandas as pd
some_data = np.ma.array(np.random.rand(100, 100), mask=np.random.rand(100, 100) > 0.2)
df = pd.DataFrame(dict(example=[some_data] * 100))
print(df)
Is there some scope for calling
I'm still pretty new to pandas, so maybe I've missed something. (For reference, I'm doing neuroscience: I have two or three "levels" of analysis, the topmost of which, i.e. the most meta-level, I would like to be doing in pandas, but I would very much like to store some of the lower-level stuff in a DataFrame alongside the meta-stuff.)
What you are doing is extremely inefficient. Pandas (and numpy) are generally best used to hold a single scalar in each cell, one that can be represented by a base type (e.g. a float). Try this
If you need multiple levels, simply add a multi-index. Changing a printing routine to handle this use case is not likely to happen, as it would increase the code complexity (which is already pretty crazy).
I think I explained that rather poorly. Imagine my 100 arrays are a list of images of random shapes, with each image having a multi-index tuple of (person, day, hour). I compute a bunch of metrics for each image and put the results as columns in my dataframe, but I also put the images themselves in a column. I can then do my meta-analysis on groups of the raw images and/or on the scalar-valued columns... I need to be able to do both. I have already submitted a pull request which makes it easyish to render numpy arrays as images in base64 data within html img tags, but the default display mechanism is still this slow pprint call.
Well, if you really, really want to do this, I would simply wrap an object around the array and give it a custom printing method. Then you can do whatever you want. Pandas tries to do the right thing by printing nested things, but in this case you are putting something which pandas can render there.
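A minimal sketch of the wrapper suggestion above, assuming a recent pandas on Python 3. The class name `ArrayBlob` and its summary format are hypothetical and not part of pandas. Because the wrapper is not a sequence, pandas should fall back to its string form when printing the cell instead of walking the array contents.

```python
import numpy as np
import pandas as pd


class ArrayBlob:
    """Hypothetical wrapper: holds an array but prints as a short summary."""

    def __init__(self, array):
        self.array = array

    def __repr__(self):
        # Only shape and dtype are shown, so printing never touches the data.
        return f"<ArrayBlob shape={self.array.shape} dtype={self.array.dtype}>"


rng = np.random.default_rng(0)
some_data = np.ma.array(rng.random((100, 100)), mask=rng.random((100, 100)) > 0.2)

# One wrapped array per cell, as in the original example.
df = pd.DataFrame({"example": [ArrayBlob(some_data) for _ in range(100)]})

text = repr(df.head())
print(text)

# The underlying array is still reachable for the lower-level analysis.
first_image = df["example"].iloc[0].array
```

The trade-off raised in the thread still applies: every access to the raw data goes through `.array`, which is less convenient than storing the arrays directly.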
Yes, that had occurred to me, but it's not as convenient, and I'd hoped that this kind of blob-like usage was mainstream enough to merit a line or two in the right place!
This is not the right way to harness the power of pandas. You are almost certainly better off keeping your images as numpy arrays or whatever (or frames) and simply having a reference to them (e.g. a string) in a particular column, or using the object solution. You are trying to shoehorn a frame onto what you need.
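A rough sketch of the reference-column layout described above, combined with the (person, day, hour) multi-index mentioned earlier in the thread. The `images` dict, the `img_000`-style keys, and the `mean_intensity` metric are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Images of random shapes live outside the frame, keyed by a string id
# (hypothetical naming scheme).
images = {
    f"img_{i:03d}": rng.random((int(rng.integers(5, 20)), int(rng.integers(5, 20))))
    for i in range(6)
}

index = pd.MultiIndex.from_product(
    [["alice", "bob"], [1], [9, 10, 11]], names=["person", "day", "hour"]
)

# The frame holds only the string reference and scalar metrics.
df = pd.DataFrame({"image_id": list(images)}, index=index)
df["mean_intensity"] = [images[key].mean() for key in df["image_id"]]

# Scalar columns group as usual.
per_person = df.groupby(level="person")["mean_intensity"].mean()

# Raw images for a group are looked up through the reference column.
bob_images = [images[key] for key in df.xs("bob", level="person")["image_id"]]
```

Printing `df` here only ever formats short strings and floats, so the slow pprint path for nested arrays is never hit.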
Shoehorning maybe, but the result is actually quite good - I'd recommend it to anyone else doing a similar kind of analysis.
Closed as ambiguous.
TODO
printed via pprint_thing).
- [ ] options.display.max_seq_items should have a default value != None #3391, #5120, #5629 via #5753
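
For context on the item above, here is a small sketch of how `display.max_seq_items` limits the printing of a sequence stored in a cell. It assumes a recent pandas; the exact truncation marker may differ between versions.

```python
import pandas as pd

# A single cell holding a long list.
df = pd.DataFrame({"example": [list(range(1000))]})

# Limit how many items of a nested sequence are formatted.
with pd.option_context("display.max_seq_items", 5):
    short = repr(df)

print(short)
```

With the option set to `None`, the formatter would have to walk the whole sequence, which is why a finite default matters for large nested values.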