Add nanosecond attribute to DateTime for compatibility with pandas #246

Closed
Dmitrii-I opened this issue Jun 29, 2018 · 7 comments

Comments

@Dmitrii-I

Constructing a DataFrame from pendulum DateTime fails because of missing nanosecond attribute:

import pandas
import pendulum
pandas.DataFrame({'time': [pendulum.now()]})

...
  File "pandas/_libs/tslibs/conversion.pyx", line 178, in pandas._libs.tslibs.conversion.datetime_to_datetime64
  File "pandas/_libs/tslibs/conversion.pyx", line 387, in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject
AttributeError: 'DateTime' object has no attribute 'nanosecond'

However, using naive() it works:

pandas.DataFrame({'time': [pendulum.now().naive()]})
Out[5]: 
                     time
0  2018-06-29T13:30:30.540966

Would it be possible to add the nanosecond attribute to the DateTime class, even if it always returns 0 for now?
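
For illustration, the request amounts to something like the sketch below. It uses a hypothetical stdlib `datetime` subclass, `DateTimeWithNanosecond`, as a stand-in for pendulum's DateTime. It is not pendulum's code, and whether this alone would satisfy the pandas conversion is an assumption that is not verified here.

```python
from datetime import datetime, timezone


class DateTimeWithNanosecond(datetime):
    """Hypothetical stand-in for a datetime subclass such as pendulum's DateTime."""

    @property
    def nanosecond(self):
        # No sub-microsecond precision is stored, so always report 0.
        return 0


dt = DateTimeWithNanosecond(2018, 6, 29, 13, 30, 30, 540966, tzinfo=timezone.utc)
print(dt.nanosecond)   # the attribute lookup now succeeds
print(dt.microsecond)  # existing microsecond precision is untouched
```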

@sdispater
Collaborator

I don't see why the DateTime class should have a nanosecond property. This is an issue with pandas since both pendulum.now() and pendulum.now().naive() are DateTime instances.

@jwkvam

jwkvam commented Jul 10, 2018

It seems that pandas tries to convert the timezone-aware datetime to its own timestamp representation, but that conversion isn't triggered with the naive datetime. I don't think anyone is at fault here, but as a user of both pendulum and pandas, this makes it difficult to use them together.

Examples:

In [23]: pd.Series(datetime.now())[0]
Out[23]: Timestamp('2018-07-10 13:44:37.867100')  # converted to pandas timestamp
In [24]: pd.Series(pendulum.now().naive())[0]
Out[24]: DateTime(2018, 7, 10, 13, 44, 39, 851374)  # pandas dtype is object
In [25]: pd.Series(pendulum.now())
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject()
AttributeError: 'DateTime' object has no attribute 'nanosecond'

@Dmitrii-I
Author

OK, let's close then.

@jwkvam

jwkvam commented Jul 27, 2018

@Dmitrii-I Sorry, I didn't necessarily mean you should close it. I was just trying to point out that even with naive DateTimes pandas doesn't parse them into pandas datetime objects.

It would be nice if they were interchangeable.

Likely related: pandas-dev/pandas#15986

@Dmitrii-I
Author

@jwkvam

I can re-open it, but I got the impression from @sdispater that this is not a pendulum issue.

@liquidgenius

liquidgenius commented Sep 30, 2018

Issue Example

import pendulum
import pandas as pd

start = pendulum.datetime(2018, 4, 17)
end = pendulum.datetime(2018, 4, 25)

period = pendulum.period(start, end)

period_days = [dt for dt in period.range('days')]
period_days = pd.Series(period_days)

Console Output

python3.6:
>     922                 from pandas import DatetimeIndex
>     923 
> --> 924                 values, tz = conversion.datetime_to_datetime64(v)
>     925                 return DatetimeIndex(values).tz_localize(
>     926                     'UTC').tz_convert(tz=tz)
> 
> pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()
> pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject()
> AttributeError: 'DateTime' object has no attribute 'nanosecond'

It appears as though pandas converts the datetime to a datetime64 behind the scenes, and validation of the datetime is throwing an exception.

Solution

You can manually reformat the date without the nanoseconds so that it passes pandas validation:

import pendulum
import pandas as pd

def pendulum_to_pandas(pend_dt):
    """Converts a Pendulum datetime to a Pandas-friendly string

    Parameters
    ----------
    pend_dt (pendulum.DateTime): Any Pendulum datetime. Pendulum datetimes lack the
        nanosecond attribute that Pandas looks up during conversion.

    Returns
    -------
    results (str): A Pandas-friendly timestamp string with second precision
        and a UTC offset.
    """

    # Format as a string; the format has no %f, so sub-second precision is dropped
    results = pend_dt.strftime("%Y-%m-%d %H:%M:%S %z")
    
    return results

Usage

# Create a pendulum datetime for the current point in time
now = pendulum.now()

# Convert to pandas friendly datetime
pandas_friendly_datetime = pendulum_to_pandas(now)

Console Output

> 2018-04-17 00:00:00 +0000
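
Note that the helper returns a string rather than a datetime, and its format has no `%f` directive, so microseconds are dropped as well. A stdlib-only sketch of that behaviour (the sample timestamp and the names `FORMAT`, `original`, `text`, and `parsed` are made up for illustration):

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S %z"  # same format string as pendulum_to_pandas

original = datetime(2018, 6, 29, 13, 30, 30, 540966, tzinfo=timezone.utc)
text = original.strftime(FORMAT)          # a str, not a datetime
parsed = datetime.strptime(text, FORMAT)  # parse it back to compare

print(type(text).__name__)
print(text)
print(parsed.microsecond)  # sub-second precision did not survive the round trip
```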

Alternatively with a list comprehension

start = pendulum.datetime(2018, 4, 17)
end = pendulum.datetime(2018, 4, 25)

period = pendulum.period(start, end)

period_days = [pendulum_to_pandas(dt) for dt in period.range('days')]
period_days = pd.Series(period_days)

Console Output

> 0    2018-04-17 00:00:00 +0000
> 1    2018-04-18 00:00:00 +0000
> 2    2018-04-19 00:00:00 +0000
> 3    2018-04-20 00:00:00 +0000
> 4    2018-04-21 00:00:00 +0000
> 5    2018-04-22 00:00:00 +0000
> 6    2018-04-23 00:00:00 +0000
> 7    2018-04-24 00:00:00 +0000
> 8    2018-04-25 00:00:00 +0000
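
If the sub-second part matters, one possible alternative to the string round trip is to rebuild an exact stdlib `datetime` from the subclass instance. The following is a minimal sketch under assumptions: `to_stdlib_datetime` and `FancyDateTime` are hypothetical names, `FancyDateTime` only stands in for a subclass such as pendulum's DateTime, and the result has not been tested against pandas here.

```python
from datetime import datetime, timezone


class FancyDateTime(datetime):
    """Hypothetical datetime subclass, standing in for e.g. pendulum's DateTime."""


def to_stdlib_datetime(dt):
    """Rebuild an exact stdlib datetime, keeping microseconds and tzinfo."""
    return datetime(
        dt.year, dt.month, dt.day,
        dt.hour, dt.minute, dt.second, dt.microsecond,
        tzinfo=dt.tzinfo,
        fold=dt.fold,
    )


value = FancyDateTime(2018, 4, 17, 8, 15, 30, 123456, tzinfo=timezone.utc)
plain = to_stdlib_datetime(value)

print(type(plain) is datetime)  # exact stdlib type, no longer the subclass
print(plain == value)           # still the same point in time
```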

That said, it would be nice to have both tools work happily together. @jwkvam @Dmitrii-I

@liquidgenius

liquidgenius commented Nov 8, 2018

    import datetime
    import pendulum

    def pendulum_datetimes(data, pandas_friendly=True):
        """
        Converts all standard library datetimes in a Dictionary to Pendulum datetime instances.
        Default is to return a Pandas friendly formatted string instead.
        
        Parameters
        ----------
        data (Dictionary): Dictionary to be parsed for Pendulum datetime conversion.
        pandas_friendly (Boolean): Ensure pandas compatibility by excluding nanoseconds,
            defaults to True.

        Returns
        -------
        data (Dictionary): Updated record with Pendulum (and possibly Pandas friendly)
            datetimes.

        """
        
        # Iterate key, value pairs
        for key, value in data.items():
        
            # Check every field for datetimes
            if type(value) is datetime.datetime:

                # Convert datetimes to Pendulum datetimes
                pendulum_datetime = pendulum.instance(value)
                
                # Ensure Pandas compatibility (drops sub-second precision, see here: http://bit.ly/2RH9hwk )
                if pandas_friendly:
                    data[key] = pendulum_datetime.strftime("%Y-%m-%d %H:%M:%S %z")
                else:
                    data[key] = pendulum_datetime
    
        return data
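
One thing to be aware of: `type(value) is datetime.datetime` matches only exact stdlib datetimes, so values that are already instances of a subclass are skipped, whereas `isinstance` would match them too. A stdlib-only sketch with a hypothetical subclass, `SubclassedDateTime`, used purely for illustration:

```python
import datetime


class SubclassedDateTime(datetime.datetime):
    """Hypothetical datetime subclass, standing in for e.g. pendulum's DateTime."""


plain = datetime.datetime(2018, 11, 8, 12, 0, 0)
fancy = SubclassedDateTime(2018, 11, 8, 12, 0, 0)

exact_plain = type(plain) is datetime.datetime            # exact type check
exact_fancy = type(fancy) is datetime.datetime            # subclass is not an exact match
isinstance_fancy = isinstance(fancy, datetime.datetime)   # subclass passes isinstance

print(exact_plain, exact_fancy, isinstance_fancy)
```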
