Add support for a pandasrc #4907
Comments
+1 for a matplotlibish format |
both ipython and python itself have existing startup file mechanisms in place, |
@y-p lets you distribute a project with settings options in the directory to make it work the way you expect and allow you to set something for one project that you aren't doing for others. You can't unilaterally overwrite or append to people's python startup files. This is a way to get granular formatting without needing to alter those things or even know where they are. And this doesn't prevent you from overriding those settings in a startup file. |
I see, so it's not about an addition to a user's dotfiles but about making a project self-contained. |
@y-p requires imports or setting environment variables. It's also a pretty minor addition - what's below plus a bit of searching for the path to the config file:

    import re

    def from_object(obj):
        # Accept a mapping, or any iterable of keys that supports item lookup.
        if hasattr(obj, 'items'):
            for k, v in obj.items():
                set_option(k, v)
        else:
            for k in obj:
                set_option(k, obj[k])

    def from_file(path_or_buf):
        from pandas.core.common import _get_handle
        # Split each line on the first ':' or '=' with optional whitespace.
        option_splitter = re.compile(r'\s*[:=]\s*').split
        f = _get_handle(path_or_buf)
        errors = []
        for i, line in enumerate(f):
            # allow for comments
            line = line.split('#')[0].strip()
            if line:
                try:
                    split = option_splitter(line)
                    if len(split) == 2:
                        option, value = split
                        set_option(option, value)
                    else:
                        raise ValueError("Malformed option")
                except (KeyError, ValueError) as e:
                    # Collect errors by line index instead of stopping at the first one.
                    errors.append("%d: %s" % (i, e))
        print(errors)

|
and clearly better errors, etc. |
Plus, this separates config options from actual code, which is a net gain. |
Personally, I'd rather set some config file than have a bunch of calls to |
Nope, initialization code does not require either envars or imports. I don't follow what @cpcloud means by "keeping track of" at all. configuration is re verbosity of |
I don't see how that's true re imports.
Lucky you. I have, and it's not fun.
I just meant that I'd rather have a single config file per project than having to copy paste And that's about all I'll say on that. |
okay, that's fine. I'd like to add In other words, allows you to do this (which is nearly equivalent to what the pandasrc would do in syntax, etc):

    config.from_object({
        'io.excel.xlsx.writer': 'openpyxl',
        'display.max_rows': 80
    })

and this

    config.options['io.excel.xlsx.writer'] = 'openpyxl'
    config.options['display.max_rows'] = 80

Is that okay? feels cleaner to me than a series of function calls. |
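A minimal sketch of the two styles proposed in the comment above, to make the comparison concrete. Everything here is an assumption for illustration: `Options` and its `from_object` method are hypothetical names, a plain dict stands in for the pandas option registry, and the starting values are placeholders rather than real pandas defaults.

```python
# Hypothetical sketch: `Options` and `from_object` are illustrative names,
# not pandas API. A plain dict stands in for the pandas option registry.
class Options:
    """Dict-style wrapper around a registry of known option names."""

    def __init__(self, defaults):
        self._values = dict(defaults)

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        # Reject unknown names so that typos fail loudly.
        if key not in self._values:
            raise KeyError("No such option: %r" % key)
        self._values[key] = value

    def from_object(self, obj):
        """Set several options at once from a mapping or an iterable of pairs."""
        pairs = obj.items() if hasattr(obj, "items") else obj
        for key, value in pairs:
            self[key] = value


# Placeholder starting values, chosen only for the example.
options = Options({"io.excel.xlsx.writer": "xlsxwriter", "display.max_rows": 60})

# Bulk style: one call with a dict.
options.from_object({"io.excel.xlsx.writer": "openpyxl", "display.max_rows": 80})

# Item style: one assignment per option.
options["display.max_rows"] = 100
```

Both styles end up in the same `__setitem__`, so validation only has to live in one place.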
looks ok by me....only request I have is to make setting the docs for the option easier (not sure how as you generally need/want a multi-line, so end up creating a variable to hold it.....) |
Yeah, I like that too. The python logging module supports a re supporting set/getitem, You can already get/set values directly, e.g. |
Didn't realize it supports setattr. On Sun, Sep 22, 2013 at 7:09 AM, y-p wrote:
|
here's a perfect case for this, #2612 |
push to 0.14? |
Sure, but are we even doing this anymore? On Wed, Oct 2, 2013 at 5:36 PM, jreback wrote:
|
can certainly close? didn't you have a use case for it? |
it would be useful to me, but @y-p's point that it's not necessary seems reasonable too. |
ok move to someday or 0.14 for revisiting |
whichever. On Wed, Oct 2, 2013 at 7:15 PM, jreback wrote:
|
@jtratner did you move this to 0.13? |
I've put this for 0.13, because the feedback from |
shouldn't take too long to do either. |
and how is this different than just doing |
it's not, it just means you can turn it off for old scripts. There have been one or two issues posted on pandas about this as well - e.g. #5597 |
users who want to turn off the warnings almost certainly have an ipython startup script already. I just think a pandasrc is pretty duplicative IMHO. The point of the warning is mostly for new users in any event. |
okay, pushed to someday again |
The python/ipython startup files are less known than we might think, and there's |
I'd like to revive this issue. A |
I'd like to have a configuration file for pandas. I hate having to always place:

    pd.set_option('display.latex.repr', True)
    pd.set_option('display.latex.longtable', True)

at the start of my notebook, especially since I always have to look it up. Using an ipython startup file is not a solution since I don't want pandas to always import (I also don't want to hide the import. I just want to hide the settings that I use to export to pdf). |
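I am looking into implementing this feature. I've reviewed the mentioned guidelines for implementing it based on matplotlibrc. It searches a few spots for the rc file that mostly make sense, but with a few exceptions. Steps to find the config file:
1. Check the local directory: Seems like a good idea for a project to override settings.
2. Check an env variable (MATPLOTLIBRC) and, if it exists, look for a file at that path, $MATPLOTLIBRC/matplotlibrc: Seems redundant to look for a file at that path. Why not just have the env variable point to the file? This would allow you to have several RC files in the same directory and just change your env to get different behavior.
3. Look at the user's home dir and find an rc file there: No changes to this, makes sense.
4. Check an install file location that will be overwritten every time the package is installed: This seems like a bad idea. I think an example RC file in this location makes sense, something people can copy and modify for themselves, and something that has comments documenting all the options. But making this a file that has to be parsed for every user that imports pandas seems excessive. Furthermore, if developers want to change some defaults of various options, change that in the code, not by a global config file. So I am planning to drop this step.
Please let me know if I am missing anything in this analysis. Another question that comes to mind is whether we should search for the first RC file and stop, or whether this should be a bottom-up stack approach. Basically, would a user like to have global settings in their home dir for all projects and then have the ability for a local project to override some settings without replacing the global settings? Or would this get confusing? I'd implement that by parsing every RC we find (3,2,1) and then calling pd.set_option for each setting in the files. That way, if the same setting was set in two files, the last file parsed would be the setting we run with. I'd like to hear others' thoughts on this. |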
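The bottom-up stacking idea in the comment above can be sketched roughly as follows. This is not an existing pandas feature: the file name `.pandasrc`, the environment variable `PANDASRC`, and all three helper functions are hypothetical. The sketch assumes the environment variable points straight at a file and that files are parsed lowest precedence first (home, then environment variable, then local directory), so later files override earlier ones.

```python
import os

RC_NAME = ".pandasrc"  # hypothetical file name
RC_ENV = "PANDASRC"    # hypothetical environment variable


def candidate_paths(cwd=None, environ=None, home=None):
    """Return rc file candidates, highest precedence first."""
    cwd = os.getcwd() if cwd is None else cwd
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    paths = [os.path.join(cwd, RC_NAME)]
    if environ.get(RC_ENV):
        paths.append(environ[RC_ENV])  # assumed to point straight at a file
    paths.append(os.path.join(home, RC_NAME))
    return paths


def parse_rc(path):
    """Read 'key: value' lines into a dict of strings, skipping '#' comments."""
    settings = {}
    with open(path) as f:
        for line in f:
            line = line.split("#")[0].strip()
            if line:
                key, _, value = line.partition(":")
                settings[key.strip()] = value.strip()
    return settings


def load_layered(paths, parse):
    """Merge settings from every existing file; earlier paths win."""
    merged = {}
    # Walk from lowest to highest precedence so later updates override.
    for path in reversed(paths):
        if os.path.isfile(path):
            merged.update(parse(path))
    return merged
```

Each key in the merged dict could then be passed to `pd.set_option`. Merging into a dict first means each option is set once, with the winning value.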
I've been thinking about this again in the context of extension arrays.
I suspect we'll want config options to enable things like integer-na by
default, apache arrow memory by default, etc. It'd
be nice if we had a more robust config system before those land.
There's a lot of prior art with designing configuration systems. It'd be
good to collect the strengths and weaknesses of those
before we go off and do our own.
…On Wed, Mar 13, 2019 at 1:35 PM Ben Payne ***@***.***> wrote:
I am looking into implementing this feature. I've reviewed the
mentioned guidelines for implementing it based on matplotlibrc. It searches a
few spots for the rc file that mostly make sense, but with a few exceptions.
Steps to find the config file:
1. Check the local directory: Seems like a good idea for a project to
override settings.
2. Check an env variable (MATPLOTLIBRC) and, if it exists, look for a
file at that path, $MATPLOTLIBRC/matplotlibrc: Seems redundant to look for a
file at that path. Why not just have the env variable point to the file?
This would allow you to have several RC files in the same directory and
just change your env to get different behavior.
3. Look at the user's home dir and find an rc file there: No changes to
this, makes sense.
4. Check an install file location that will be overwritten every time
the package is installed: This seems like a bad idea. I think an
example RC file in this location makes sense, something people can copy and
modify for themselves, and something that has comments documenting all the
options. But making this a file that has to be parsed for every user that imports
pandas seems excessive. Furthermore, if developers want to change some
defaults of various options, change that in the code, not by a global
config file. So I am planning to drop this step.
Please let me know if I am missing anything in this analysis.
Another question that comes to mind is whether we should
search for the first RC file and stop, or whether this should be a bottom-up stack
approach. Basically, would a user like to have global settings in their home
dir for all projects and then have the ability for a local project to
override some settings without replacing the global settings? Or would this
get confusing? I'd implement that by parsing every RC we find (3,2,1) and
then calling pd.set_option for each setting in the files. That way, if the
same setting was set in two files, the last file parsed would be the
setting we run with. I'd like to hear others' thoughts on this.
|
I agree that it would be nice to leverage something existing. The config file format proposed is not very standard (key: value). It is simple, but that could be a weakness or a strength down the road. As for parsing this with standard packages, python configparser won't work, shlex could work, but a simple line by line split on ":" would probably be the easiest way to parse this. Today the option system is designed to only handle a single value for any option. So the "key -> value" paradigm is good for our use. Unless you are envisioning that this will change soon? I'm a fan of using JSON for config files. Parsing is easy, it's very flexible down the road and easy to understand no matter how experienced you are with it. Is there some specific "prior art" you had in mind that I could look at to evaluate for our uses? |
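To make the trade-off above concrete, here is a rough sketch that parses the same settings from a "key: value" text and from JSON. The helper names `parse_value` and `parse_key_value` are hypothetical, and coercing values with `ast.literal_eval` is one possible choice for turning text into ints and bools, not something the thread settled on.

```python
import ast
import json


def parse_value(text):
    """Turn '80' into 80 and 'True' into True; leave anything else as a string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def parse_key_value(text):
    """Parse 'key: value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.split("#")[0].strip()
        if not line:
            continue
        # Split on the first ':' only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep or not key.strip() or not value.strip():
            raise ValueError("line %d: malformed option" % lineno)
        settings[key.strip()] = parse_value(value.strip())
    return settings


rc_text = """
# display settings
display.max_rows: 80
display.latex.repr: True
io.excel.xlsx.writer: openpyxl
"""

json_text = (
    '{"display.max_rows": 80, "display.latex.repr": true, '
    '"io.excel.xlsx.writer": "openpyxl"}'
)

from_rc = parse_key_value(rc_text)
from_json = json.loads(json_text)
```

The line-based format allows comments and needs no quoting, at the cost of a hand-written parser and type coercion. JSON gets typed values for free but has no comment syntax.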
On Wed, Mar 13, 2019 at 2:44 PM Ben Payne ***@***.***> wrote:
I agree that it would be nice to leverage something existing. The config
file format proposed is not very standard (key: value). It is simple, but
that could be a weakness or a strength down the road. As for parsing this
with standard packages, python configparser won't work, shlex could work,
but a simple line by line split on ":" would probably be the easiest way to
parse this. Today the option system is designed to only handle a single
value for any option. So the "key -> value" paradigm is good for our use.
Unless you are envisioning that this will change soon?
I'm a fan of using JSON for config files. Parsing is easy, it's very
flexible down the road and easy to understand no matter how experienced you
are with it.
I'm slightly against JSON for anything that's supposed to be edited by hand
(as I suspect this config would be).
Is there some specific "prior art" you had in mind that I could look at to
evaluate for our uses?
Any Python library with a config system, as each has likely implemented
their own :) Django, IPython, Flask, Dask
|
The Jupyter Project developed Traitlets, which handles their configuration system. Their config files are regular python files, which is super awesome because I can use 1 config file that adapts nicely to multiple different systems that I manage. One of the problems that I noticed with IPython startup files is that the startup file ran on any IPython environment that I used, including ones that didn't have pandas installed. Something like a Pandas startup file that only loads on importing Pandas could work. |
I've created a tool to dump the currently supported options in the file format proposed by @jtratner. See the attachment for this. So looking over the options that exist today, most are bool, int or strings. However one is a callback function (display.float_format). To actually support this in a config file, the file would probably have to be python, like Jupyter. That certainly creates a powerful system. But the fact that someone could put any code in that file could create some interesting side effects. The concern I have about something like Traitlets is that it overlaps with code that has already been created in pandas to register options, provide callbacks when changed and a framework for validators to be supported. It will also require reworking code around each of the nearly 50 options that are supported today. However it would take these options from a centralized place and put them in the code that uses them. Always good to eliminate centralization in code bases... When starting this I was envisioning leaving that infrastructure in place and simply building a layer on top that reads settings from a config file and invokes the current, relatively robust, system for setting options. The fact is the feature could be as simple as adding code at startup to look for a config file (in python formatting) and load that file. The file itself could be a series of set_option calls. This is probably 10 lines of code and has minimal overhead, especially if no file is found. I've worked with Django and Flask before. They both use python files for configuration. I'm not sure under the covers if they are doing something centralized like we are today or more like Traitlets is. IPython uses Traitlets from what I've read. Dask I briefly looked into and learned that the file format proposed in this issue (key: value) is called YAML and there is a project that supports this format (PyYAML). So it seems the design decisions are coming down to:
If the consensus is to go with a decentralized approach like Traitlets then this will be a much bigger change to the codebase. If that is the case we might want to separate this into two issues: building out the config file as the issue describes, and then a larger task to rework the storage of options in a decentralized manner. |
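The "look for a Python config file at startup and run it" idea from the comment above could look roughly like this. It is only a sketch under assumptions: `load_python_config` is a hypothetical helper, the config file is expected to call a `set_option(key, value)` function that the loader injects, and nothing here is existing pandas behaviour.

```python
import os
import runpy


def load_python_config(path, set_option):
    """Execute a Python config file that calls set_option(key, value).

    Returns True if a file was found and run, False otherwise.
    """
    if not os.path.isfile(path):
        return False  # nothing to do, so the cost is one file-system check
    # The file runs as ordinary Python, so it can execute arbitrary code.
    runpy.run_path(path, init_globals={"set_option": set_option})
    return True
```

Passing `pd.set_option` as the `set_option` argument would route every call in the file through the existing option system, so callbacks and validators keep working. Because the file is real Python, a callable such as a `display.float_format` formatter can be set directly, which a plain key: value file cannot express.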
Discussed this on today's dev call and the consensus was mostly negative. @MarcoGorelli had some points about inevitable feature creep into people wanting overrides on the command line and in pyproject.toml files. If a champion steps up to implement+maintain this we can reconsider, but for now I'm closing as no action. |
Just using the existing configuration framework, but with a file format like matplotlib uses... See how they do it here: http://matplotlib.org/users/customizing.html.
Plus, we can document all the config options in a single file.
related #2452, #3046