form_blocks vs make_block inconsistency #19179


Closed
jbrockmendel opened this issue Jan 11, 2018 · 1 comment · Fixed by #19189
Labels: Clean, Internals (Related to non-user accessible pandas implementation)
Milestone: 0.23.0

Comments

@jbrockmendel
Member

Taking a cue from #19174 to revisit some logic in core.internals. form_blocks and make_block have some very similar logic. The question here is: are the discrepancies between them intentional?

Taking some liberties to make it more obvious how the logic is shared, the current code looks like:

def make_block(values, placement, klass=None, ndim=None, dtype=None,
               fastpath=False):
    [...]
        dtype = dtype or values.dtype
        vtype = dtype.type

        if isinstance(values, SparseArray):
            block_type = 'sparse'
        elif issubclass(vtype, np.floating):
            block_type = 'float'
        elif (issubclass(vtype, np.integer) and
              issubclass(vtype, np.timedelta64)):
            block_type = 'timedelta'
        elif (issubclass(vtype, np.integer) and
              not issubclass(vtype, np.datetime64)):
            block_type = 'int'
        elif dtype == np.bool_:
            block_type = 'bool'
        elif issubclass(vtype, np.datetime64):
            assert not hasattr(values, 'tz')
            block_type = 'datetime'
        elif is_datetimetz(values):
            block_type = 'datetime_tz'
        elif issubclass(vtype, np.complexfloating):
            block_type = 'complex'
        elif is_categorical(values):
            block_type = 'cat'
        else:
            block_type = 'object'
[...]

def form_blocks(arrays, names, axes):
    [...]
        if is_sparse(v):
            block_type = 'sparse'
        elif issubclass(vtype, np.floating):
            block_type = 'float'
        elif issubclass(vtype, np.complexfloating):
            block_type = 'complex'
        elif issubclass(vtype, np.datetime64):
            assert not is_datetimetz(v)
            block_type = 'datetime'
        elif is_datetimetz(v):
            block_type = 'datetime_tz'
        elif issubclass(vtype, np.integer):
            block_type = 'int'
        elif dtype == np.bool_:
            block_type = 'bool'
        elif is_categorical(v):
            block_type = 'cat'
        else:
            block_type = 'object'

[...]

The two main differences here are 1) is_sparse encompasses slightly more than isinstance(values, SparseArray) and 2) the timedelta case is missing from form_blocks. Anyone know why?
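
To make the second difference concrete, here is a minimal sketch, not the actual pandas code. It assumes only NumPy and drops the pandas-specific checks (sparse, datetimetz, categorical); the function names `classify_like_make_block` and `classify_like_form_blocks` are hypothetical. Since `np.timedelta64` is a subclass of `np.integer`, a chain without a timedelta branch sends timedelta64 data to the `'int'` case:

```python
import numpy as np


def classify_like_make_block(dtype):
    """Hypothetical helper mirroring the branch order quoted for make_block."""
    vtype = dtype.type
    if issubclass(vtype, np.floating):
        return 'float'
    elif (issubclass(vtype, np.integer) and
          issubclass(vtype, np.timedelta64)):
        # timedelta64 is checked before the generic integer branch
        return 'timedelta'
    elif (issubclass(vtype, np.integer) and
          not issubclass(vtype, np.datetime64)):
        return 'int'
    elif dtype == np.bool_:
        return 'bool'
    elif issubclass(vtype, np.datetime64):
        return 'datetime'
    elif issubclass(vtype, np.complexfloating):
        return 'complex'
    return 'object'


def classify_like_form_blocks(dtype):
    """Hypothetical helper mirroring the branch order quoted for form_blocks."""
    vtype = dtype.type
    if issubclass(vtype, np.floating):
        return 'float'
    elif issubclass(vtype, np.complexfloating):
        return 'complex'
    elif issubclass(vtype, np.datetime64):
        return 'datetime'
    elif issubclass(vtype, np.integer):
        # no timedelta branch, so timedelta64 lands here
        return 'int'
    elif dtype == np.bool_:
        return 'bool'
    return 'object'


if __name__ == '__main__':
    for name in ['float64', 'int64', 'bool', 'M8[ns]', 'm8[ns]',
                 'complex128', 'object']:
        dt = np.dtype(name)
        print(name, classify_like_make_block(dt),
              classify_like_form_blocks(dt))
```

Under these simplifications the two chains agree on every dtype in the loop except `m8[ns]`, where the first returns `'timedelta'` and the second returns `'int'`.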

@jreback
Contributor

jreback commented Jan 11, 2018

The make_block logic is more correct. You could remove form_blocks in favor of this, I think.

@jreback added the Internals and Clean labels Jan 11, 2018
@jreback added this to the 0.23.0 milestone Jan 12, 2018