Add interactive terminal to pandas website #46682
Comments
Hey @datapythonista, I took a look at NumPy's website. They integrate their interactive shell by using an iframe to access the JupyterLite interactive shell. I tried to import pandas using their shell, but I ran into some errors. Implementing the same shell as NumPy's is relatively simple; however, I'm not sure it will perform the same for pandas. I'm interested in this issue and I'd like to hear your feedback on the above.
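To make the iframe approach concrete, here is a minimal sketch that builds the embed markup in Python. The base URL is a placeholder for wherever a JupyterLite REPL would be deployed, and the `kernel`, `toolbar` and `code` query parameters are assumed to behave as in the JupyterLite REPL app, so check them against the version actually deployed.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical deployment URL, not an existing pandas endpoint.
REPL_BASE_URL = "https://example.invalid/lite/repl/index.html"


def build_repl_iframe(code_lines, base_url=REPL_BASE_URL, height=400):
    """Return an <iframe> tag that opens a REPL pre-filled with code_lines."""
    params = [("kernel", "python"), ("toolbar", "1")]
    # One `code` parameter per line (assumed REPL behaviour).
    params += [("code", line) for line in code_lines]
    src = f"{base_url}?{urlencode(params)}"
    # Escape the URL so that `&` and quotes are valid inside the attribute.
    return f'<iframe src="{escape(src)}" width="100%" height="{height}"></iframe>'


iframe_html = build_repl_iframe(["import pandas as pd"])
print(iframe_html)
```

The generated tag could then be pasted into a page such as the Getting started page; nothing here runs pandas itself, it only prepares the embed.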
Thanks for having a look @bennaaym, good findings. I think there are two separate problems.
For the first set of errors, I think we may need to generate our own environment, with what we need. I'm not sure about having access to the Internet. If that's not an option, I guess we could have a small csv file in the environment, or just build data with the constructor as we do in most examples. @jtpio do you have thoughts on this?
Hi! Right. But it's possible to use the

    import pandas as pd
    from js import fetch

    URL = "https://gist.github.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv"

    # Top-level await works in the JupyterLite (Pyodide) kernel
    res = await fetch(URL)
    text = await res.text()

    # Save the downloaded text to the in-browser file system
    filename = 'data.csv'
    with open(filename, 'w') as f:
        f.write(text)

    data = pd.read_csv(filename, sep=';')
    data

Sample from jupyterlite/jupyterlite#119 (comment)
Which gives the following:
Agree it's not as natural as passing the URL to pandas directly, but it allows pulling data from other places.
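If the fetched text is already in memory, the temporary file may not be needed. The sketch below parses CSV text with `io.StringIO`; the helper name and the sample string are made up for illustration, and the string stands in for whatever `await res.text()` would return in the browser.

```python
import io

import pandas as pd


def frame_from_csv_text(text, **read_csv_kwargs):
    """Parse CSV text that is already in memory into a DataFrame."""
    return pd.read_csv(io.StringIO(text), **read_csv_kwargs)


# Stand-in for the string returned by `await res.text()`; made-up values.
sample_text = "name,value\na,1\nb,2\n"
data = frame_from_csv_text(sample_text)
print(data)
```

This still differs from passing a URL straight to `pd.read_csv`, so it does not remove the need for a workaround; it only shortens it.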
Also for the pandas docs, we could probably use Since the getting started page seems to be located here in this repo? https://github.com/pandas-dev/pandas/blob/main/doc/source/getting_started/index.rst
Thanks @jtpio, that's very useful to know. There are two getting started pages, one in the website and one in the docs. For now we'll add this to the website, which is not Sphinx but Markdown: https://github.com/pandas-dev/pandas/blob/main/web/pandas/getting_started.md
Using fetch sounds good, but we can't use it directly, since what we want is to let users play a bit with a very simple pandas example. So we need to have the import and some very simple code to give them some data, like the But probably better to start with something easier. What I'd do for now is to create a simple dataframe with the constructor, so users can play a bit with it. And once that's working, we can consider improvements. What do you think @bennaaym?
@datapythonista sounds good. So something like the below example is the target for now, right?
Yes, I'd start with something like that. Maybe a dataframe with a few more columns and samples, so it's still simple but users can do a groupby, or use a date function. But that's the idea.
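As an illustration of that idea, here is a minimal sketch of a dataframe built with the constructor that is still small but supports a groupby and a date function. The column names and values are invented for the example and are not the ones used on the pandas website.

```python
import pandas as pd

# Made-up sample data, small enough to read at a glance.
df = pd.DataFrame(
    {
        "city": ["Paris", "Paris", "Tokyo", "Tokyo", "Lima", "Lima"],
        "date": pd.to_datetime(
            [
                "2022-01-15",
                "2022-02-15",
                "2022-01-15",
                "2022-02-15",
                "2022-01-15",
                "2022-02-15",
            ]
        ),
        "temperature": [5.0, 7.0, 6.0, 8.0, 23.0, 24.0],
        "rainy": [True, False, False, True, False, False],
    }
)

# A groupby users could try: mean temperature per city.
mean_by_city = df.groupby("city")["temperature"].mean()

# A date function users could try: month name from the datetime column.
df["month"] = df["date"].dt.month_name()

print(mean_by_city)
print(df)
```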
@datapythonista FWIW, I have tried to collect various workarounds for loading files etc here on the off-chance they might suggest a way to you of simplifying
Thanks @psychemedia, that looks good. The problem here is that we don't want to read a csv, but to show users how to do it in pandas. So the code must be the one we want users to learn, the regular pandas one with no change at all. The code in the example will be just the original failing one, and what we need to change, if anything, are the packaged files that JupyterLite is using. One option is monkeypatching the downloading of files. Or, if it makes sense (not sure it does), updating pandas so it can use different libraries to download files from the Internet (not sure if there is any other use case where this would be useful).
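One way to picture the monkeypatching option is the sketch below, which swaps `urllib.request.urlopen` for a function that serves bundled bytes. The URL, the mapping and the function name are hypothetical. Whether pandas readers route URL downloads through `urllib.request.urlopen` in a given version, and whether such a patch behaves well inside Pyodide, are assumptions that would need to be verified.

```python
import io
import urllib.error
import urllib.request
from unittest import mock

# Hypothetical mapping from URLs to files shipped with the deployment.
BUNDLED_FILES = {
    "https://example.invalid/data.csv": b"name,value\na,1\nb,2\n",
}


def bundled_urlopen(url, *args, **kwargs):
    """Stand-in for urllib.request.urlopen that never touches the network."""
    # urlopen accepts either a string or a urllib.request.Request.
    key = url.full_url if isinstance(url, urllib.request.Request) else url
    if key not in BUNDLED_FILES:
        raise urllib.error.URLError(f"{key} is not bundled")
    return io.BytesIO(BUNDLED_FILES[key])


# The patch is active only inside this block and is undone afterwards.
with mock.patch("urllib.request.urlopen", bundled_urlopen):
    with urllib.request.urlopen("https://example.invalid/data.csv") as response:
        body = response.read()

print(body.decode())
```

The user-facing code stays unchanged in this approach, which is the property asked for above; the cost is that the environment silently behaves differently from a regular Python installation.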
@datapythonista Does pandas work out of the can to read and write CSV files in an arbitrary
I assume it doesn't. We can also add the csv file to the JupyterLite distribution, so we don't need to download from the Internet. Maybe that makes more sense if downloading it is too much trouble.
@bennaaym @datapythonista this looks exciting, let me know if you need any help setting that up!
pandas uses fsspec for all of its I/O operations other than local and http[s]. I wonder if we could implement an fsspec backend that uses pyodide APIs to do the IO? As a prototype it might use a That said, I think that the fsspec APIs that pandas calls might use some threads, which weren't supported by pyodide last I looked. cc @martindurant from fsspec.
I was testing this a bit further, and looks like

    from js import fetch

    resp = await fetch('https://<third-party-domain>')
    content = await resp.text()
    content

Raises this exception:
The example works fine when the URL is in the same domain as the page. I'd be +1 to make pandas IO functions work with
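The behaviour described here is consistent with the browser's same-origin policy: a `fetch` to another origin only succeeds if that server opts in through CORS headers. As a rough sketch of what "same domain as the page" means, the helper below compares the scheme, host and port of two URLs. The function names and example URLs are made up, and this is not the check the browser itself runs.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Fall back to the scheme's default port when none is given.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, (parts.hostname or "").lower(), port


def is_same_origin(page_url, resource_url):
    """True when both URLs share scheme, host and port."""
    return origin(page_url) == origin(resource_url)


page = "https://example.invalid/getting_started.html"
same = is_same_origin(page, "https://example.invalid:443/data/example.csv")
other = is_same_origin(page, "https://third-party.invalid/example.csv")
print(same, other)
```

Hosting the example file next to the page, as suggested elsewhere in this thread, keeps the request same-origin and avoids depending on a third party's CORS configuration.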
Hi, can I contribute to this issue if there's no problem? Also, could you please guide me on this issue, as I'm new here?
I note that the JupyterLite PR jupyterlite/jupyterlite#655 means that The simplest way I've found to download files is to use something like the following, although this does require the use of the

    import pyodide

    # URL is the address of the file to download
    with open("test.txt", "w") as f:
        f.write(pyodide.open_url(URL).read())
Yes, this will make it possible to use Will make a new release when this PR gets merged, and will let you know here.
FYI a new release of See this example notebook for more information: https://github.com/jupyterlite/jupyterlite/blob/main/examples/pyolite/virtual_drive.ipynb
Linking to the repo / JupyterLite deployment used for the SymPy website (https://www.sympy.org/en/shell.html) for reference: https://github.com/sympy/live. They enable some optimizations to only deploy the REPL app which could be relevant for There could be a similar repo in the
take
I called the code, but in the first call
Thanks @hamedgibago for working on this. With the latest release of JupyterLite it shouldn't be necessary to manually fetch the file anymore (see #46682 (comment)). Probably the simplest would be to create a custom JupyterLite deployment as detailed in https://jupyterlite.readthedocs.io/en/latest/quickstart/deploy.html. And then add the example csv file to the contents.
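As a rough sketch of the "add the example csv file to the contents" step, the script below writes a small CSV into a contents directory using only the standard library. The directory name, file name and data are hypothetical, and the build command mentioned in the comment is an assumption to confirm against the deployment guide linked above.

```python
import csv
import tempfile
from pathlib import Path


def write_example_csv(contents_dir, filename="example.csv"):
    """Create a small CSV file inside contents_dir and return its path."""
    contents_dir = Path(contents_dir)
    contents_dir.mkdir(parents=True, exist_ok=True)
    path = contents_dir / filename
    rows = [("name", "value"), ("a", 1), ("b", 2)]  # made-up sample data
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


# A temporary directory keeps the sketch self-contained. In a real setup this
# would be the directory handed to the site build (assumed to be something
# like `jupyter lite build --contents <dir>`; check the deployment guide).
with tempfile.TemporaryDirectory() as tmp:
    csv_path = write_example_csv(Path(tmp) / "contents")
    csv_text = csv_path.read_text()

print(csv_text)
```

With the file shipped as part of the site contents, the example shown to users would read it by a local path rather than a URL.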
Just opened #47428 to get this started.
We discussed in the past making pandas examples in the documentation runnable. The original idea was to use Binder for it, which requires a decent amount of hosting, besides setting things up on our end.
There is now a new alternative based on WebAssembly, JupyterLite. The idea is that there is no backend to run the code; instead, a WebAssembly Python interpreter in the client's browser executes the code.
NumPy is already using this on their home page. The terminal takes a few seconds to load, but after that it seems to work just fine, and it already has pandas installed in the environment, so

    import pandas

works.
It would be nice to get the same for pandas. I'd start by adding a new section to our Getting started page for the interactive terminal, see how it works, and when we've got this working fine, we can consider adding it to the home page, making examples runnable...