ENH/DOC: wide_to_long performance and docstring clarification #14779
Conversation
you can simply push to this PR as you update, FYI
new = df[id_vars].set_index(i).join(mstubs)

try:
    new.index.set_levels(new.index.levels[-1].astype(int), level=-1,
when / why does this raise? can you provide a comment
This is just the same int conversion attempt done in the original code here, since this "time" column may contain strings that cannot necessarily be converted to integers. Like the original, the index is set to [i, j], which is why this operation is done on the index at the end. I will add a comment, sorry for the confusion.
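To make the conversion attempt concrete, here is a minimal sketch of the idea on a hypothetical long frame (the frame and values are invented; only the try/except mirrors the code under review):

```python
import pandas as pd

# Hypothetical long frame whose last index level holds string "time" suffixes
idx = pd.MultiIndex.from_product([[0, 1], ["2010", "2011"]],
                                 names=["id", "year"])
new = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0]}, index=idx)

# Try to convert the "time" level to int; leave it alone when the
# suffixes are non-numeric strings (e.g. "one", "two")
try:
    new.index = new.index.set_levels(
        new.index.levels[-1].astype(int), level=-1)
except ValueError:
    pass
```

If the level holds "one"/"two" instead, the `astype(int)` raises and the index is left untouched.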
pls add an asv for this, see: http://pandas.pydata.org/pandas-docs/stable/contributing.html#running-the-performance-test-suite
asv added:
Current coverage is 85.27% (diff: 96.15%)

@@            master   #14779   diff @@
==========================================
  Files          144      144
  Lines        50981    50989     +8
  Methods          0        0
  Messages         0        0
  Branches         0        0
==========================================
+ Hits         43470    43481    +11
+ Misses        7511     7508     -3
  Partials         0        0
new = df[id_vars].set_index(i).join(mstubs)

# The index of the new dataframe is [i, j]; if the j column is a time
# variable, try to convert this to integer.
not sure I understand what you are doing. can you show the index before / after
Jeff, here is an example:
In [8]: N = 3
...: df = pd.DataFrame({ 'A 2010': np.random.rand(N),
...: 'A 2011': np.random.rand(N),
...: 'B 2010': np.random.rand(N),
...: 'B 2011': np.random.rand(N),
...: 'X' : np.random.randint(N, size=N),
...: })
...: df['id'] = df.index
...: df
...:
Out[8]:
A 2010 A 2011 B 2010 B 2011 X id
0 0.731823 0.790627 0.236080 0.727762 1 0
1 0.820396 0.474342 0.614218 0.363226 0 1
2 0.463291 0.210859 0.332595 0.061011 0 2
Before the try/except:
In [9]: before = pd.wide_to_long(df, ['A', 'B'], i='id', j='year')
...: before.index
...:
Out[9]:
MultiIndex(levels=[[0, 1, 2], [u' 2010', u' 2011']],
labels=[[0, 1, 2, 0, 1, 2], [0, 0, 0, 1, 1, 1]],
names=[u'id', u'year'])
After:
In [10]: after = pd.wide_to_long(df, ['A', 'B'], i='id', j='year')
...: after.index
...:
Out[10]:
MultiIndex(levels=[[0, 1, 2], [2010, 2011]],
labels=[[0, 1, 2, 0, 1, 2], [0, 0, 0, 1, 1, 1]],
names=[u'id', u'year'])
which is the same as on the master branch
In [11]: master = pd.wide_to_long(df, ['A', 'B'], i='id', j='year')
...: master.index
...:
Out[11]:
MultiIndex(levels=[[0, 1, 2], [2010, 2011]],
labels=[[0, 1, 2, 0, 1, 2], [0, 0, 0, 1, 1, 1]],
names=[u'id', u'year'])
Why the original author did the try before converting to int:
In [13]: df2 = pd.DataFrame({ 'A one': np.random.rand(N),
...: 'A two': np.random.rand(N),
...: 'B one': np.random.rand(N),
...: 'B two': np.random.rand(N),
...: 'X' : np.random.randint(N, size=N),
...: })
...: df2
...:
Out[13]:
A one A two B one B two X
0 0.315281 0.684260 0.397193 0.531613 1
1 0.156044 0.749942 0.923540 0.383348 0
2 0.577983 0.507933 0.226466 0.937341 0
In long format:
In [15]: df2['id'] = df2.index
...: pd.wide_to_long(df2, ['A', 'B'], i='id', j='year')
...:
Out[15]:
X A B
id year
0 one 1 0.315281 0.397193
1 one 0 0.156044 0.923540
2 one 0 0.577983 0.226466
0 two 1 0.684260 0.531613
1 two 0 0.749942 0.383348
2 two 0 0.507933 0.937341
I don't like the auto coercing of the strings -> ints. This is not very idiomatic and unexpected. I would leave the columns as strings.
Fixed, but regarding a character that separates the stub name from the variable part:
In [7]: df = pd.DataFrame({ 'A.2010': np.random.rand(N),
...: 'A.2011': np.random.rand(N),
...: 'B.2010': np.random.rand(N),
...: 'B.2011': np.random.rand(N),
...: 'X' : np.random.randint(N, size=N),
...: })
...:
...: df
...:
Out[7]:
A.2010 A.2011 B.2010 B.2011 X
0 0.873404 0.467946 0.569808 0.358077 1
1 0.780154 0.554582 0.668437 0.810530 1
2 0.884003 0.555784 0.246305 0.038423 2
In [8]: df['id'] = df.index
...: pd.wide_to_long(df, ['A.', 'B.'], i='id', j='year')
...:
Out[8]:
X A. B.
id year
0 2010 1 0.873404 0.569808
1 2010 1 0.780154 0.668437
2 2010 2 0.884003 0.246305
0 2011 1 0.467946 0.358077
1 2011 1 0.554582 0.810530
2 2011 2 0.555784 0.038423
A user might expect the new separating character (`.`) to be stripped, like `reshape` in R does.
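For reference, the `sep` argument that eventually landed does exactly this stripping. A quick sketch against the current pandas API (toy data; the seed and values are invented):

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'A.2010': np.random.rand(3), 'A.2011': np.random.rand(3),
                   'B.2010': np.random.rand(3), 'B.2011': np.random.rand(3)})
df['id'] = df.index

# With sep='.', the separator is consumed and the stub names stay clean:
# columns come out as 'A' and 'B', not 'A.' and 'B.'
long_df = pd.wide_to_long(df, ['A', 'B'], i='id', j='year', sep='.')
```

Three ids and two years per stub give a 6-row frame with one column per stub.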
needs a whatsnew entry (0.20.)
you could add an argument to specify the split (or make it take a regex). Yes, it should get stripped.
@@ -88,6 +88,7 @@ Removal of prior version deprecations/changes

Performance Improvements
~~~~~~~~~~~~~~~~~~~~~~~~

- Improved performance of ``wide_to_long`` (:issue:`14779`)
pd.wide_to_long()
@@ -875,7 +875,7 @@ def lreshape(data, groups, dropna=True, label=None):
    return DataFrame(mdata, columns=id_cols + pivot_cols)


-def wide_to_long(df, stubnames, i, j):
+def wide_to_long(df, stubnames, i, j, sep=""):
maybe make it sep='\s+', i.e. whitespace?
hmm strange, `rstrip` doesn't seem to recognise that?
In [13]: 'A (quarterly) '.rstrip('\s+')
Out[13]: 'A (quarterly) '
In [14]: 'A (quarterly) '.rstrip(" ")
Out[14]: 'A (quarterly)'
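The surprise here is that `str.rstrip` takes a *set of characters*, not a regex, so `'\s+'` means "strip any trailing backslash, `s`, or `+`". An end-anchored regex with `re.sub` does what was intended:

```python
import re

s = 'A (quarterly) '

# rstrip('\s+') strips the characters {'\\', 's', '+'}, not whitespace,
# so the trailing space survives
unchanged = s.rstrip('\\s+')

# An end-anchored regex removes trailing whitespace as intended
stripped = re.sub(r'\s+$', '', s)
```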
@@ -890,8 +890,9 @@ def wide_to_long(df, stubnames, i, j):
     The name of the id variable.
 j : str
     The name of the subobservation variable.
-stubend : str
-    Regex to match for the end of the stubs.
+sep : str, optional
specify what the default is
exp_frame = exp_frame.set_index(['id', 'year'])[["X", "A", "B"]]
long_frame = wide_to_long(df, ['A', 'B'], 'id', 'year')
tm.assert_frame_equal(long_frame, exp_frame)
can you add some tests with sep (and maybe some that have an invalid sep)?
What input were you thinking about? If a nonsense separator is passed, nothing is stripped:
In [15]: df = pd.DataFrame({'A.2010': np.random.rand(3),
...: 'A.2011': np.random.rand(3),
...: 'B.2010': np.random.rand(3),
...: 'B.2011': np.random.rand(3),
...: 'X' : np.random.randint(3, size=3)})
...: df['id'] = df.index
...: pd.wide_to_long(df, ['A.', 'B.'], i='id', j='year', sep="nope")
...:
Out[15]:
X A. B.
id year
0 2010 2 0.330193 0.728615
1 2010 0 0.710791 0.601923
2 2010 1 0.066218 0.618455
0 2011 2 0.597949 0.324131
1 2011 0 0.024911 0.968051
2 2011 1 0.310596 0.866798
In [16]: pd.wide_to_long(df, ['A.', 'B.'], i='id', j='year', sep=",,")
Out[16]:
X A. B.
id year
0 2010 2 0.330193 0.728615
1 2010 0 0.710791 0.601923
2 2010 1 0.066218 0.618455
0 2011 2 0.597949 0.324131
1 2011 0 0.024911 0.968051
2 2011 1 0.310596 0.866798
further, not sure if http://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-by-melt
that's a regex
Sorry, I didn't get that? And
@Nuffe I was referring to you are splitting on the
is probably reasonable
To keep things simple I propose we break the API of

I.e. the user passes the names of the time-varying columns as a list based on

In [3]: df = pd.DataFrame({"A 1970" : {0 : "a", 1 : "b", 2 : "c"},
...: "A 1980" : {0 : "d", 1 : "e", 2 : "f"},
...: "B 1970" : {0 : 2.5, 1 : 1.2, 2 : .7},
...: "B 1980" : {0 : 3.2, 1 : 1.3, 2 : .1},
...: "X" : dict(zip(range(3), np.random.randn(3)))
...: })
...: df['id'] = df.index
...: df
...:
Out[3]:
A 1970 A 1980 B 1970 B 1980 X id
0 a d 2.5 3.2 0.136953 0
1 b e 1.2 1.3 -1.238109 1
2 c f 0.7 0.1 1.249809 2

In [4]: varying = ['A 1970', 'A 1980', 'B 1970', 'B 1980']
...: pd.wide_to_long(df, varying, i='id', j='year', sep=' ')
...:
Out[4]:
X A B
id year
0 1970 0.136953 a 2.5
1 1970 -1.238109 b 1.2
2 1970 1.249809 c 0.7
0 1980 0.136953 d 3.2
1 1980 -1.238109 e 1.3
2 1980 1.249809 f 0.1

The user can easily construct the

If the existing columns do not adhere to the above specification, they need to be changed to a suitable format first. A doc example can show how this can be easily done with a regex with a backreference. What do you think?
the varying should be a list of tuples, but otherwise looks ok. Can this be backward compat?
I didn't understand the first comment: And I do not think this can be made backward compat because the

The old doc example where there is no single-character separator, e.g.

df.columns.str.replace('([A-B])', '\\1.')
Index([u'A.1970', u'A.1980', u'B.1970', u'B.1980', u'X', u'id'], dtype='object')

then calling

I do not know if this really is considered too unwieldy...? R's

The original function author seemed to have tried to mimic

So if we should preserve the original function author's intention, where the user only supplies stubnames as in Stata, we need to impose some strict assumptions on the column names passed, like the only kind of column names we can have (that are varying) are of the type

Perhaps it is just better to make this implicit assumption explicit and keep its "Stata like" interface? And make it robust to this specification (

(sorry for the messiness here, but I ended up spending some time familiarizing myself more with
So here is an attempt to make the original interface more robust. These two examples fail on the master branch, but should produce the correct result, which is:

In [12]: df = pd.DataFrame({
...: 'A11': ['a11', 'a22', 'a33'],
...: 'A12': ['a21', 'a22', 'a23'],
...: 'B11': ['b11', 'b12', 'b13'],
...: 'B12': ['b21', 'b22', 'b23'],
...: 'BB11': [1, 2, 3],
...: 'BB12': [4, 5, 6],
...: 'BBBX' : [91, 92, 93],
...: 'BBBZ' : [91, 92, 93]
...: })
...: df['id'] = df.index
...: df
...:
Out[12]:
A11 A12 B11 B12 BB11 BB12 BBBX BBBZ id
0 a11 a21 b11 b21 1 4 91 91 0
1 a22 a22 b12 b22 2 5 92 92 1
2 a33 a23 b13 b23 3 6 93 93 2
In [13]: pd.wide_to_long(df, ['A', 'B', 'BB'], i='id', j='year')
Out[13]:
BBBX BBBZ A B BB
id year
0 11 91 91 a11 b11 1
1 11 92 92 a22 b12 2
2 11 93 93 a33 b13 3
0 12 91 91 a21 b21 4
1 12 92 92 a22 b22 5
2 12 93 93 a23 b23 6

In [14]: df = pd.DataFrame({
...: 'A(quarterly)2011': ['a11', 'a22', 'a33'],
...: 'A(quarterly)2012': ['a21', 'a22', 'a23'],
...: 'B(quarterly)2011': ['b11', 'b12', 'b13'],
...: 'B(quarterly)2012': ['b21', 'b22', 'b23'],
...: 'BB(quarterly)2011': [1, 2, 3],
...: 'BB(quarterly)2012': [4, 5, 6],
...: 'BBBX' : [91, 92, 93],
...: 'BBBZ' : [91, 92, 93]
...: })
...: df['id'] = df.index
...: df
...:
Out[14]:
A(quarterly)2011 A(quarterly)2012 B(quarterly)2011 B(quarterly)2012 \
0 a11 a21 b11 b21
1 a22 a22 b12 b22
2 a33 a23 b13 b23
BB(quarterly)2011 BB(quarterly)2012 BBBX BBBZ id
0 1 4 91 91 0
1 2 5 92 92 1
2 3 6 93 93 2
In [15]: pd.wide_to_long(df, ['A(quarterly)', 'B(quarterly)', 'BB(quarterly)'], i='id', j='year')
Out[15]:
BBBX BBBZ A(quarterly) B(quarterly) BB(quarterly)
id year
0 2011 91 91 a11 b11 1
1 2011 92 92 a22 b12 2
2 2011 93 93 a33 b13 3
0 2012 91 91 a21 b21 4
1 2012 92 92 a22 b22 5
2 2012 93 93 a23 b23 6

The first one fails because the regex confuses the same substrings in the

Assuming a

In [16]: df = pd.DataFrame({
...: 'A11': ['a11', 'a22', 'a33'],
...: 'A12': ['a21', 'a22', 'a23'],
...: 'B11': ['b11', 'b12', 'b13'],
...: 'B12': ['b21', 'b22', 'b23'],
...: 'BB11': [1, 2, 3],
...: 'BB12': [4, 5, 6],
...: 'Acat' : [91, 92, 93],
...: 'BBBZ' : [91, 92, 93]
...: })
...: df['id'] = df.index
...: df
...:
Out[16]:
A11 A12 Acat B11 B12 BB11 BB12 BBBZ id
0 a11 a21 91 b11 b21 1 4 91 0
1 a22 a22 92 b12 b22 2 5 92 1
2 a33 a23 93 b13 b23 3 6 93 2

raises a

While the following works:

df = pd.DataFrame({
...: 'A-11': ['a11', 'a22', 'a33'],
...: 'A-12': ['a21', 'a22', 'a23'],
...: 'B-11': ['b11', 'b12', 'b13'],
...: 'B-12': ['b21', 'b22', 'b23'],
...: 'BB-11': [1, 2, 3],
...: 'BB-12': [4, 5, 6],
...: 'Acat' : [91, 92, 93],
...: 'BBBZ' : [91, 92, 93]
...: })
...: df['id'] = df.index
...: df
...:
Out[18]:
A-11 A-12 Acat B-11 B-12 BB-11 BB-12 BBBZ id
0 a11 a21 91 b11 b21 1 4 91 0
1 a22 a22 92 b12 b22 2 5 92 1
2 a33 a23 93 b13 b23 3 6 93 2
In [19]: pd.wide_to_long(df, ['A', 'B', 'BB'], i='id', j='year', sep='-')
Out[19]:
Acat BBBZ A B BB
id year
0 11 91 91 a11 b11 1
1 11 92 92 a22 b12 2
2 11 93 93 a33 b13 3
0 12 91 91 a21 b21 4
1 12 92 92 a22 b22 5
2 12 93 93 a23 b23 6
I have maintained the user-friendly (and evidently Stata-inspired) interface (and stated what structure this function assumes on the column names), and tried to fix mistakes that arise with various "pathological" input, for example if the
Notes
-----
All extra variables are treated as extra id variables. This simply uses
`pandas.melt` under the hood, but is hard-coded to "do the right thing"
in a typical case.
"""
# For robustness, escape every user input string we use in a regex
import re
can be imported at the top of the file
# For ex. AA2011, AA2012, AAkitten have inconsistent postfix
for k, vars in enumerate(value_vars):
    stripped = map(lambda x: x.replace(stubs[k], ""), vars)
    is_digit = [s.isdigit() for s in stripped]
you have tests for this?
Considering the comment below on not using a regex to find the id_vars: perhaps just formulate a consistency check, use warnings, and warn the user if, for example, an inferred value_var has different types?

For example: at the end, check if the new data frame's 'j' index contains both ints and strings and warn about this. If the stubnames supplied are ['AA2011', 'AA2012'] and df contains a column named Acat, then the new dataframe's j column will have levels 2011, 2012, cat. And likewise, if stubnames contains ['CatOne', 'CatTwo'] and df has a column named Cat3000, the new j index will have levels One, Two, 3000.

The only way to disambiguate the first case is to take an optional stubendtype parameter denoting that the stubends are numbers. The second case is not possible to disambiguate (tried in Stata).
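The proposed consistency check could look roughly like this (a sketch under my own naming; `check_suffix_types` and its input are hypothetical, not code from the PR):

```python
import warnings

def check_suffix_types(suffixes):
    """Warn when inferred suffixes mix numeric and non-numeric values,
    e.g. ['2011', '2012', 'cat'] inferred from columns AA2011, AA2012, Acat."""
    numeric = [s.isdigit() for s in suffixes]
    if any(numeric) and not all(numeric):
        warnings.warn("the j index contains a mix of numeric and "
                      "non-numeric suffixes; check your stubnames")
        return False
    return True
```

Calling it with `['2011', '2012', 'cat']` triggers the warning; an all-numeric or all-string list passes silently.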
# two resulting value_vars lists
if len(value_vars_flattened + id_vars) != len(df.columns):
    value_vars_augmented = map(lambda x: get_var_names(
        df, "^{0}".format(re.escape(x))), stubnames)
this looks fragile. I would just raise here
Or instead of doing a search for the id_vars in the first place, would it not be simpler just to do:

id_vars = set(df.columns.tolist()).difference(value_vars_flattened)

? (then do some consistency checks)
# This regex is needed to avoid multiple "greedy" matches with stubs
# that have overlapping substrings
# (for example A2011, A2012 are separate from AA2011, AA2012)
value_vars = list(map(lambda x: get_var_names(
ideally you would just look for a match of a letter followed by a non-letter (or vice versa), I think that is more robust.
But in the case of string stems, the three groups here will not be captured: Aone, Atwo, Bone, Btwo, BBone, BBtwo. A negative lookahead ^B(?!B) could be more robust? I.e. the regex would be "^{0}(?!{1})".format(re.escape(x), x[-1]). That one would capture the three groups here and ignore for example BBBrating.
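A quick check of the negative-lookahead idea (the `get_var_names` helper here is my own stand-in for the PR's helper of the same name):

```python
import re

cols = ['Aone', 'Atwo', 'Bone', 'Btwo', 'BBone', 'BBtwo', 'BBBrating']

def get_var_names(columns, pattern):
    # Return the columns whose start matches the given pattern
    return [c for c in columns if re.match(pattern, c)]

# '^B(?!B)' matches the B-stub columns only: the lookahead rejects a
# second 'B', so BBone/BBtwo/BBBrating are skipped
b_only = get_var_names(cols, r'^B(?!B)')

# whereas a plain '^B' greedily matches every B-prefixed column
b_greedy = get_var_names(cols, r'^B')
```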
I found another "Stata like" use case (I am going to add the option of supplying a list of 'id' variables; it will require another short rewrite since I have to move from using
Sometimes AppVeyor/Travis fails with unrelated tests (like
can you rebase, problem with AppVeyor which I just fixed
in the wide format, to be stripped from the names in the long format.
For example, if your column names are A-suffix1, A-suffix2, you
can strip the hyphen by specifying `sep`='-'
numeric_suffix : bool, default True
I would rather call this suffix='\d+', IOW use a regex to match this, no?
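This is the shape the parameter eventually took: current `wide_to_long` accepts `sep` and a regex `suffix` (default `'\d+'`). A sketch of how stub, sep, and suffix compose into a column-matching regex (the composed pattern here is illustrative, not the exact pandas implementation):

```python
import re

stub, sep, suffix = 'A', '.', r'\d+'

# The stub and sep are user strings, so they are escaped; the suffix is
# a regex and is used verbatim
pattern = '^{0}{1}{2}$'.format(re.escape(stub), re.escape(sep), suffix)

cols = ['A.2010', 'A.2011', 'A.cat', 'AA.2010']
matched = [c for c in cols if re.match(pattern, c)]
```

`A.cat` fails the numeric suffix and `AA.2010` fails the anchored stub, so only the two year columns match.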
Yes, that makes more sense
Going from long back to wide just takes some creative use of `unstack`

>>> w = l.reset_index().set_index(['famid', 'birth', 'age']).unstack()
>>> w.columns = [name + suffix for name, suffix in wide.columns.tolist()]
use this:
In [28]: Index(w.columns).str.join('')
Out[28]: Index(['ht1', 'ht2'], dtype='object')
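The same flattening of two-level columns can also be written with `map`, which sidesteps the `.str` accessor entirely (toy tuples below are invented, mirroring the `('ht', '1')`-style columns that `unstack` produces):

```python
import pandas as pd

# Columns like those produced by unstack(): ('ht', '1'), ('ht', '2')
cols = pd.MultiIndex.from_tuples([('ht', '1'), ('ht', '2')])

# Join each (name, suffix) tuple into a single flat label
flat = cols.map(''.join)
```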
if any(map(lambda s: s in df.columns.tolist(), stubnames)):
    raise ValueError("stubname can't be identical to a column name")

if not isinstance(stubnames, list):
we usually use is_list_like, IOW you can pass a non-string iterable. Can you update the doc-string as well?
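The point of `is_list_like` over `isinstance(..., list)` is that any non-string iterable qualifies; a quick illustration (import path per current pandas):

```python
from pandas.api.types import is_list_like

# Lists and tuples count as list-like; a plain string deliberately does not,
# which is exactly the distinction stubnames validation needs
checks = [is_list_like(['A', 'B']),
          is_list_like(('A', 'B')),
          is_list_like('AB')]
```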
@@ -716,6 +716,204 @@ def test_stubs(self):

        self.assertEqual(stubs, ['inc', 'edu'])

    def test_separating_character(self):
        np.random.seed(123)
can you add this issue number as a comment
lgtm. some minor comments.
Use is_list_like
Add GH ticket #
@jreback another minor issue: sphinx doesn't print the
You can indeed do that, but normally then the escaping should not be needed
had to escape them)
I didn't follow the full discussion above, but there was some talk about backwards compatibility. What is the conclusion on that? Is the last version back compat or are there changes in behaviour?
def setup(self):
    vars = 'ABCD'
    nyrs = 20
Can you fix up the indentation here?
idobs = dict(zip(range(nidvars), np.random.rand(nidvars, N)))

self.df = pd.concat([pd.DataFrame(idobs), pd.DataFrame(yearobs)],
                    axis=1)
I think you can also do something like DataFrame(np.random.randn(N, nidvars + len(yrvars)), columns=list(range(nidvars)) + yrvars) to make it a bit simpler
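The suggested simplification, sketched with small sizes (the variable names follow the benchmark's `setup()`; the sizes here are toy values, not the benchmark's):

```python
import numpy as np
import pandas as pd

nidvars = 3
N = 4
yrvars = ['A2010', 'A2011', 'B2010', 'B2011']

# One DataFrame call instead of building two frames and concatenating:
# integer-labelled id columns followed by the year columns
df = pd.DataFrame(np.random.randn(N, nidvars + len(yrvars)),
                  columns=list(range(nidvars)) + yrvars)
```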
@jorisvandenbossche Yes, this version is back compat. The PR got a bit lengthy because I did more than I anticipated (it was originally a simple PR for a quick speed improvement, but I discovered afterwards that there were several use cases the original function couldn't handle).
thanks @Nuffe very nice PR, and you were very responsive! if you want to tackle other issues would be much appreciated!
closes pandas-dev#14778

Please see #14778 for details. This makes wide_to_long a bit faster (avoid slow regex search on long columns by first converting to Categorical, avoid melting all dataframes with all the id variables, and wait with trying to convert the "time" variable to int until last), and clears up the docstring.

Author: nuffe <[email protected]>

Closes pandas-dev#14779 from nuffe/wide2longfix and squashes the following commits:

df1edf8 [nuffe] asv_bench: fix indentation and simplify
dc13064 [nuffe] Set docstring to raw literal to allow backslashes to be printed (still had to escape them)
295d1e6 [nuffe] Use pd.Index in doc example
1c49291 [nuffe] Can of course get rid of negative lookahead now that suffix is a regex
54c5920 [nuffe] Specify the suffix with a regex
5747a25 [nuffe] ENH/DOC: wide_to_long performance and functionality improvements (pandas-dev#14779)
git diff upstream/master | flake8 --diff
Please see #14778 for details.
I make wide_to_long a bit faster (avoid slow regex search on long columns by first converting to Categorical, avoid melting all dataframes with all the id variables, and wait with trying to convert the "time" variable to int until last), and clear up the docstring.