DOC: Add sphinx spelling extension #21109
Conversation
Thanks! Just gave a quick skim.
doc/source/10min.rst (outdated)

@@ -645,7 +645,7 @@ the quarter end:
     ts.index = (prng.asfreq('M', 'e') + 1).asfreq('H', 's') + 9
     ts.head()

-Categoricals
+Categorical
 ------------
Let's keep this, to not break links.
doc/source/advanced.rst (outdated)

@@ -387,7 +387,7 @@ Furthermore you can *set* the values using the following methods.
     df2.loc(axis=0)[:, :, ['C1', 'C3']] = -10
     df2

-You can use a right-hand-side of an alignable object as well.
+You can use a right-hand-side of an align object as well.
I think alignable is OK.
doc/source/advanced.rst (outdated)

@@ -593,7 +593,7 @@ Take Methods

 Similar to NumPy ndarrays, pandas Index, Series, and DataFrame also provides
 the ``take`` method that retrieves elements along a given axis at the given
-indices. The given indices must be either a list or an ndarray of integer
+indexes. The given indexes must be either a list or an ndarray of integer
Prefer indices, since that's the argument name to ``take``.
doc/source/advanced.rst (outdated)

@@ -711,7 +711,7 @@ order is ``cab``).

     df2.sort_index()

-Groupby operations on the index will preserve the index nature as well.
+Group by operations on the index will preserve the index nature as well.
Can probably add groupby to the wordlist.
doc/source/cookbook.rst (outdated)

@@ -723,7 +723,7 @@ Rolling Apply to multiple columns where function returns a Scalar (Volume Weight
     s = pd.concat([ (pd.Series(vwap(df.iloc[i:i+window]), index=[df.index[i+window]])) for i in range(len(df)-window) ]);
     s.round(2)

 Timeseries
Add timeseries to the wordlist? Do you have a rough measure of how long this takes to run on the docs?
Codecov Report

@@ Coverage Diff @@
##           master   #21109   +/- ##
=====================================
Coverage     91.85%   91.85%
=====================================
Files           153      153
Lines         49546    49546
=====================================
Hits          45509    45509
Misses         4037     4037

Continue to review full report at Codecov.
I apologise for taking so long to reply, but this took me longer than I expected. I've added all the words to the wordlist (this file is now huge) and corrected some things. I will need your help regarding some of the changes I have made:
The spelling extension is funny in that it divides words by special characters. I've also given it some thought, and I think we could fail the build by raising an exception if there are any words in this output.txt file.
Finally, the time it takes to spellcheck: if every single file needs to be generated, it takes about 2-3 minutes to generate and read the files and about 1 minute to spellcheck and output misspelt words. If the files are already generated, it runs in 1 minute.
Force-pushed 1217397 to fc9a960 (compare)
I haven't gotten all the way through yet, but I see some consistent themes that could use updating and have labeled them as such.
To answer your other question, stick with just DOC updates for the time being. If you wanted to go back and change variable names, that should be a separate change, if done at all.
doc/source/10min.rst (outdated)

@@ -663,7 +663,7 @@ Convert the raw grades to a categorical data type.
     df["grade"]

 Rename the categories to more meaningful names (assigning to
-``Series.cat.categories`` is inplace!).
+``Series.cat.categories`` is in place!).
I think we should be making an exception for inplace
doc/source/advanced.rst (outdated)

@@ -182,7 +182,7 @@ For example:
     df[['foo','qux']].columns  # sliced

 This is done to avoid a recomputation of the levels in order to make slicing
-highly performant. If you want to see only the used levels, you can use the
+highly efficient. If you want to see only the used levels, you can use the
I'd still stick with performant - there's some overlap but still not quite perfect synonyms
doc/source/advanced.rst (outdated)

@@ -559,7 +559,7 @@ return a copy of the data rather than a view:

 .. _advanced.unsorted:

-Furthermore if you try to index something that is not fully lexsorted, this can raise:
+Furthermore if you try to index something that is not fully lex-sorted, this can raise:
I've never seen "lex-sorted" before, so I'd stick with the original spelling here
doc/source/advanced.rst (outdated)

@@ -593,7 +593,7 @@ Take Methods

 Similar to NumPy ndarrays, pandas Index, Series, and DataFrame also provides
 the ``take`` method that retrieves elements along a given axis at the given
-indices. The given indices must be either a list or an ndarray of integer
+indexes. The given indices must be either a list or an ndarray of integer
Why did you change from indices to indexes?
doc/source/cookbook.rst (outdated)

@@ -1029,7 +1029,7 @@ Skip row between header and data
     01.01.1990 05:00;21;11;12;13
     """

-Option 1: pass rows explicitly to skiprows
+Option 1: pass rows explicitly to skip rows
Have to be careful when changing headers - I think you now need another double quote on the line below for proper rendering
doc/source/io.rst (outdated)

@@ -1812,7 +1812,7 @@ Writing to a file, with a date index and a date column:
     dfj2.to_json('test.json')
     open('test.json').read()

-Fallback Behavior
+Fall-back Behavior
Same header comment - if you update these you need to change the line below to reflect new length
doc/source/io.rst (outdated)

@@ -2237,16 +2237,16 @@ A few notes on the generated table schema:
     name is ``values``
   + For ``DataFrames``, the stringified version of the column name is used
   + For ``Index`` (not ``MultiIndex``), ``index.name`` is used, with a
-    fallback to ``index`` if that is None.
+    fall-back to ``index`` if that is None.
I believe fallback is a word
doc/source/io.rst (outdated)

-the preserveration of metadata such as dtypes and index names in a
-round-trippable manner.
+the preservation of metadata such as dtypes and index names in a
+round-trip manner.
Hmm would still stick with trippable here
doc/source/io.rst (outdated)

@@ -3467,7 +3467,7 @@ Fixed Format
 ''''''''''''

 The examples above show storing using ``put``, which write the HDF5 to ``PyTables`` in a fixed array format, called
-the ``fixed`` format. These types of stores are **not** appendable once written (though you can simply
+the ``fixed`` format. These types of stores are **not** able to be appended once written (though you can simply
Appendable would read easier
doc/source/io.rst (outdated)

@@ -3820,7 +3820,7 @@ indexed dimension as the ``where``.

 .. note::

-   Indexes are automagically created on the indexables
+   Indexes are auto-magically created on the indexables
Stick with automagically
Thanks for the feedback @WillAyd, some of these changes were made yesterday, and after working on it today I got a better understanding of a few terms. I also forgot to double-check the previous commit, my apologies. I will revert the changes from efficient back to performant. I'll try to do all the changes before submitting another commit, so I'll wait a bit 😃👍
Hello @FabioRosado! Thanks for updating the PR. Cheers! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on June 02, 2018 at 13:56 UTC
Force-pushed c2383e1 to a2fe8b9 (compare)
I got some time to work on this PR and here is the updated version:
As for the MultiIndex, I changed two terms, but as I searched for multi-index it seems that this term appears a fair number of times, so I'm not sure if I should change ALL of them from multi-index to MultiIndex or just leave it as it is. On the documentation about the spell check, should I explain better about the output.txt file and how to find the word that is being marked as misspelt? Finally, should I raise an exception if any errors were found after running the spellcheck command? I think the best way to do this would be opening the output.txt file, checking if it's empty, and if not raising an exception. Not sure if this is the best way to approach this problem though 😃
Yes, ideally. Do you know how long the spellchecker takes to run over our docs? We're close to our time limit on Travis for the doc build.
Okay, I'll add that bit of code when I get home later on. The spell checker takes 1 minute to run, but if the docs weren't generated yet they are generated first, so it takes up to 6 minutes to run.
The code will now raise an exception if there are misspelt words. I was unsure which exception to raise, so I went with SyntaxError with a message that perhaps will be enough to let people know why the exception was raised 😃
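A minimal sketch of the check described here, assuming the spellchecker has already written its report to ``output.txt`` (the function name and path handling are illustrative, not the actual make.py code):

```python
import os


def check_spelling_output(output_path):
    """Raise if the spellchecker's report file contains any words.

    An empty (or missing) report means the docs passed the spellcheck;
    a non-empty one fails the build with the misspelt words listed.
    """
    if os.path.exists(output_path) and os.path.getsize(output_path) > 0:
        with open(output_path) as f:
            misspelt = f.read()
        raise SyntaxError("Found misspelt words in the docs:\n" + misspelt)
```

SyntaxError mirrors the choice described above; a plain RuntimeError would carry the same message without implying a Python syntax problem.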
doc/source/categorical.rst (outdated)

@@ -370,7 +370,7 @@ Renaming categories is done by assigning new values to the

 .. note::

-   Be aware that assigning new categories is an inplace operation, while most other operations
+   Be aware that assigning new categories is an in place operation, while most other operations
Still think inplace as a term is fine, especially since it aligns with the keyword used for the concept throughout pandas
doc/source/comparison_with_sql.rst (outdated)

@@ -228,9 +228,9 @@ Grouping by more than one column is done by passing a list of columns to the
 JOIN
 ----
 JOINs can be performed with :meth:`~pandas.DataFrame.join` or :meth:`~pandas.merge`. By default,
-:meth:`~pandas.DataFrame.join` will join the DataFrames on their indices. Each method has
+:meth:`~pandas.DataFrame.join` will join the DataFrames on their indexes. Each method has
Sorry if I missed it before, but any reason we changed these? ``indices`` seemed preferable to me.
I saw both terms in the documentation, and since the spelling extension marked indices as wrong I changed a lot of them; that's the reason. I can revert these changes 😃 👍
The spellcheck will take a few minutes to run (between 1 to 6 minutes). Sphinx will alert you
with warnings and misspelt words - these misspelt words will be added to a file called
``output.txt`` and you can find it on your local directory ``pandas/doc/build/spelling/``.
Not saying this is a bad thing but any reason you chose to output to a text file instead of to STDOUT? All of the other checks I can think of off the top of my head would write to the latter
Unfortunately, I didn't choose this approach. The spelling library is coded this way (I checked the source code); it will always output a text file with all the misspelt words. To be honest, I would much rather work with STDOUT and avoid calling open on a file to check if it was empty.
Running the spell check is easy. Just navigate to your local ``pandas/doc/`` directory and run::

    python make.py spellcheck
Could we alternately add a rule to the Makefile so that ``make spellcheck`` works?
I didn't think about this, my apologies. I'll add the spellcheck to the Makefile 👍
doc/source/spelling_wordlist.txt
Outdated
Hoese | ||
Stansby | ||
Kamau | ||
Niederhut |
Not saying we need to do this here, but I wonder if it's cleaner to have a separate contributors file containing their names that could be pulled in here as well as into the docs.
I had to dig into the source code of the Sphinx spelling extension to see if this was possible. Apparently, it is possible to read from additional files, but the documentation didn't explain how to do this.
I will try a few things and see if it works (which it should) 😄
doc/source/whatsnew/v0.15.2.txt (outdated)

  - Regression in ``Timestamp`` does not parse 'Z' zone designator for UTC (:issue:`8771`)
  - Bug in `StataWriter` the produces writes strings with 244 characters irrespective of actual size (:issue:`8969`)
  - Fixed ValueError raised by cummin/cummax when datetime64 Series contains NaT. (:issue:`8965`)
- - Bug in Datareader returns object dtype if there are missing values (:issue:`8980`)
+ - Bug in Data reader returns object dtype if there are missing values (:issue:`8980`)
This was a class that could just be placed in backticks like ``StataWriter`` above. Also capitalize the R, so ``DataReader``.
doc/source/whatsnew/v0.18.0.txt (outdated)

@@ -414,7 +414,7 @@ New Behavior:
     df.loc[ix, 'b'] = df.loc[ix, 'b']
     df.dtypes

-When a DataFrame's integer slice is partially updated with a new slice of floats that could potentially be downcasted to integer without losing precision, the dtype of the slice will be set to float instead of integer.
+When a DataFrame's integer slice is partially updated with a new slice of floats that could potentially be down casted to integer without losing precision, the dtype of the slice will be set to float instead of integer.
down-casted?
I am going on holidays for the next 4-5 days so I won't be able to work on these issues. But when I'm back home on Saturday I will update the PR with all the requested changes 👍
Force-pushed d6dc480 to 18dc19c (compare)
Minor comments but otherwise lgtm. If you want to create a make rule for this, that could be a logical follow-up (separate PR).
doc/source/release.rst (outdated)

@@ -2582,7 +2582,7 @@ Improvements to existing features
     docstrings for Panel flex methods. (:issue:`5336`)
  - ``NDFrame.drop()``, ``NDFrame.dropna()``, and ``.drop_duplicates()`` all
    accept ``inplace`` as a keyword argument; however, this only means that the
-   wrapper is updated inplace, a copy is still made internally.
+   wrapper is updated in place, a copy is still made internally.
Same here
doc/source/release.rst (outdated)

@@ -2625,7 +2625,7 @@ API Changes
     :issue:`5744`, :issue:`5756`)
  - Default export for ``to_clipboard`` is now csv with a sep of `\t` for
    compat (:issue:`3368`)
- - ``at`` now will enlarge the object inplace (and return the same)
+ - ``at`` now will enlarge the object in place (and return the same)
Here too
Oops, I thought I had reverted all the terms but I missed those two, sorry about that.
lgtm thanks for tackling these updates!
@@ -23,3 +23,4 @@ doc:
 	cd doc; \
 	python make.py clean; \
 	python make.py html
+	python make.py spellcheck
Hmm well I was thinking with would be a separate rule but the way you've done it doc is fine and may even be preferable. Just calling out here in case another reviewer has a differing opinion
is this added to the build?
Seabold
Carrucciu
Hoyer
Pascoe
These look like the names of our contributors? Can we simply generate this?
Something like: ``git log --format='%an' | sort | uniq``
(I guess we would have to split on whitespace as well.)
The spelling extension checks every single file for text/docstrings, so the names of the contributors needed to be whitelisted in order to be ignored by the spellchecker; the name of every contributor is added to the build.
How would you advise generating this? The reason I'm asking is that the spelling extension uses path.join on anything that you pass as a wordlist (one or two files), so we get an exception when trying to use, say, a list of names.
I suppose we could run the command and write it to a file on every run? Perhaps using something like a tempfile to get all those names and use that file on every run?
Also, I have encountered a few issues when decoding and splitting every name - the spelling extension makes some weird word divisions at times. When I try to decode and split the list, the name Yan Facai will be split like this: ['颜发才(Yan', 'Facai)'].
Since the wordlist expects one line per word, this name will always be marked as wrong due to how the split divided it. I suppose using regex could help with this matter but I'm not sure.
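A hypothetical sketch of the splitting discussed here (the helper name and regex are illustrative, not part of this PR): splitting author names on parentheses as well as whitespace avoids fragments like ``'颜发才(Yan'``.

```python
import re


def names_to_wordlist(authors):
    """Turn full author names into one whitelist token per line.

    Splitting on whitespace alone leaves fragments such as '颜发才(Yan',
    so parentheses are treated as separators too and empty pieces dropped.
    """
    tokens = []
    for name in authors:
        tokens.extend(t for t in re.split(r"[\s()]+", name) if t)
    return tokens


# '颜发才(Yan Facai)' becomes three clean tokens: 颜发才, Yan, Facai
```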
We already have https://github.com/pandas-dev/pandas/blob/master/scripts/announce.py for getting a list of contributors. We could refactor that to update the wordlist of contributors when it's run.
I see. But the names need to be added to a wordlist file. It makes sense that perhaps the contributor names should be in a separate file like WillAyd suggested. Also, the extension needs very specific ways to whitelist terms - one term per line. As I pointed out, the problem with scraping contributors' names is that the names will need to be split on both newlines and spaces so every name is a whitelisted word. Because the extension divides words by special characters, it could potentially cause trouble with names such as the one from Yan Facai (example: 颜发才(Yan Facai) ), so the name needs to be added with both the special characters and normal characters - you can see this on the wordlist. So I'm unsure what to do now in regards to jreback's suggestions. If anyone else could give a hand so I can solve this issue and get the PR merged, I'll be happy to fix/modify anything 😃 👍
Agreed. I would
I've been meaning to change the docs around so that 4 is done anyway. I'll hopefully get to it by the end of the week.
@TomAugspurger that would be nice but I'm of the opinion that should be a separate PR, if only because that will introduce some code changes that exceed what we would normally allow in a DOC change. OK to merge this and have that as a follow up?
I'm fine with doing it as a followup, as long as the doc build is currently passing. I imagine this will be hard to keep up to date with master.
I am ok with only merging the actual spelling changes. The machinery needs to be done in a maintenance-aware manner (no hard coding), so if this were split up then ok.
What hard coding are you referring to, the contributors list? It'll have to always be "hard coded" in the sense that it's a static text file. IIUC, in the meantime if we have a new contributor name, the doc build will fail, and they'll need to manually add their name to the contributors list?
Yes, my point is that it needs to be auto-generated. If you want to make this a manual action that is ok too, but it should not be part of the build (is it? my question was never answered).
It's not currently part of the doc build on Travis, so for now it's the ability to run spellcheck + fixing a bunch of typos. Opened #21354 for the followup.
Thanks @FabioRosado! If you're interested in the followup changes to the build processes feel free to claim it on #21354. |
Had a quick look through the diff, and some adverbs seem to have fallen through the cracks. Also "closedness".
Furthermore, there's an inconsistency with "MultiIndex" vs "multi-index" (and potentially vs. "multi-value").
@@ -781,7 +781,7 @@ Enhancements
     noon, January 1, 4713 BC. Because nanoseconds are used to define the time
     in pandas the actual range of dates that you can use is 1678 AD to 2262 AD. (:issue:`4041`)
  - ``DataFrame.to_stata`` will now check data for compatibility with Stata data types
-   and will upcast when needed. When it is not possible to losslessly upcast, a warning
+   and will upcast when needed. When it is not possible to lossless upcast, a warning
This change is wrong - needs to be an adverb.
@@ -370,7 +370,7 @@ Updated PyTables Support
     df1.get_dtype_counts()

  - performance improvements on table writing
- - support for arbitrarily indexed dimensions
+ - support for arbitrary indexed dimensions
This should stay an adverb.
@@ -2356,7 +2356,7 @@ Read a URL and match a table that contains specific text:

 Specify a header row (by default ``<th>`` or ``<td>`` elements located within a
 ``<thead>`` are used to form the column index, if multiple rows are contained within
-``<thead>`` then a multiindex is created); if specified, the header row is taken
+``<thead>`` then a multi-index is created); if specified, the header row is taken
All the above changes were to MultiIndex - here multi-index?
@@ -133,7 +133,7 @@ groupby operations on the index will preserve the index nature as well
     reindexing operations, will return a resulting index based on the type of the passed
     indexer, meaning that passing a list will return a plain-old-``Index``; indexing with
     a ``Categorical`` will return a ``CategoricalIndex``, indexed according to the categories
-    of the PASSED ``Categorical`` dtype. This allows one to arbitrarly index these even with
+    of the PASSED ``Categorical`` dtype. This allows one to arbitrary index these even with
adverb
@@ -462,7 +462,7 @@ Selecting via a scalar value that is contained *in* the intervals.
 Other Enhancements
 ^^^^^^^^^^^^^^^^^^

- - ``DataFrame.rolling()`` now accepts the parameter ``closed='right'|'left'|'both'|'neither'`` to choose the rolling window-endpoint closedness. See the :ref:`documentation <stats.rolling_window.endpoints>` (:issue:`13965`)
+ - ``DataFrame.rolling()`` now accepts the parameter ``closed='right'|'left'|'both'|'neither'`` to choose the rolling window-endpoint closed. See the :ref:`documentation <stats.rolling_window.endpoints>` (:issue:`13965`)
"Closedness" might not be the prettiest word, but "to choose the [...] closed" is definitely wrong
doc/source/release.rst

@@ -4115,7 +4115,7 @@ Bug Fixes
     datetime64 when calling DataFrame.apply. (:issue:`2374`)
  - Raise exception when calling to_panel on non uniquely-indexed frame (:issue:`2441`)
  - Improved detection of console encoding on IPython zmq frontends (:issue:`2458`)
- - Preserve time zone when .append-ing two time series (:issue:`2260`)
+ - Preserve time zone when .appending two time series (:issue:`2260`)
The hyphen was surely intentional...
@h-vetinari this is merged so your comments are probably going to be lost.
Feel free to make a new PR with the fixes.
Even though this PR was already merged I'm happy to fix these issues. Also apologies for replying here :)
MultiIndex is what we want
* Add sphinx spelling extension, fix typos and add wordlist
This is my first time contributing to a project this big so I hope everything is okay
I am still working on fixing bugs and polishing a few things but here is what I have done so far:
I've re-used the _sphinx_build method to run the spelling command; to do this I had to update the _sphinx_build method to include 'spelling' as a kind. I'd like to know if this is the best approach or if I should just replicate the code.
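Roughly, the reuse described above can be pictured like this (a simplified sketch; the real DocBuilder in pandas' doc/make.py has considerably more machinery, and these method bodies are illustrative):

```python
class DocBuilder:
    """Sketch of one sphinx-build helper shared by several builders."""

    KINDS = ("html", "latex", "spelling")

    def _sphinx_build(self, kind):
        # Build the sphinx-build command line for the requested builder.
        if kind not in self.KINDS:
            raise ValueError("kind must be one of {}".format(self.KINDS))
        return ["sphinx-build", "-b", kind, "source", "build/" + kind]

    def html(self):
        return self._sphinx_build("html")

    def spellcheck(self):
        # Same helper as html(), just a different Sphinx builder kind.
        return self._sphinx_build("spelling")
```

The validation in `_sphinx_build` is what makes adding 'spelling' as a kind a one-line change rather than duplicated build logic.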
I have added a few configuration options in order to get better results when using the spellcheck: the first was to use the wordlist text file, the second was to ignore known PyPI package names, and finally to show suggestions for a misspelt word.
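Those three options correspond to sphinxcontrib-spelling settings along these lines in ``conf.py`` (option names per that extension's documentation; the values shown are illustrative):

```python
# conf.py additions for the spelling builder
extensions = ['sphinxcontrib.spelling']

# Whitelist file with one allowed term per line
spelling_word_list_filename = 'spelling_wordlist.txt'

# Ignore names of packages published on PyPI
spelling_ignore_pypi_package_names = True

# Print suggested corrections next to each misspelt word
spelling_show_suggestions = True
```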
At the moment the spellcheck will run if we type the command:
python make.py spellcheck
and the spellcheck method is called. I haven't yet figured out how to fail the build when there are exceptions, as issue 21079 suggested.