[Fix #1091] Index.document() does not set index on Document class #1099

safwanrahman · 2018-12-30T21:07:01Z

This will fix #1091

In the previous versions, the Index.document() method added the Index to the Document class. But it was removed in 69f5ca2 in order to implement the inheritance document.

This patch will add Index to the Document class moreover also set None to Document._index property if Index Innesclass is not in set.

@honzakral r?

honzakral · 2018-12-30T22:08:11Z

Thank you for working on this, I was hoping to get to it next week but this is much better.

I think the ideal solution here would be to detect somehow that the Index instance stored in ._index was just the default one and have .document() replace it only if so. A check for name == '*' might do it or maybe tag it with some attr? What do you think?

I would like to avoid the situation where _index is ever None as that would lead to too many exception in both our and our users' code.

Does that make sense?

Thank you again for picking this up!

safwanrahman · 2018-12-30T22:16:42Z

Thanks a lot @honzakral for taking it in the vacation time.
I think, if the user does not set a index, we should not set that implicitly. Either they should set it in the inner Index class or they should set it with the Index.document.

The get_index method also validates it. If we set name=='*' or something like that, it should also raise error when anyone want to run some query or anything.

From my prespective, I think we should handle the document inheritance like django handle model inheritance. We can add abstract=True in the Meta inner class. By doing that, we can avoid having a user declaring a document class without any index.

What do you think?

I would like to avoid the situation where _index is ever None as that would lead to too many exception in both our and our users' code.

Thats true. But I think we need to fix this from our end as well as from user end.

honzakral · 2018-12-30T22:32:13Z

Thank you for the discussion, very useful!

The get_index method also validates it. If we set name=='*' or something like that, it should also raise error when anyone want to run some query or anything.

There is no problem with running a query on *, it works by querying on everything which seems to be what the user asked for. Many times, in simple use cases, people would only have essentially a single Document type in their cluster and then just using * is perfectly fine assuming they create their index explicitly.

You can always have a Document class with a wildcard index name as that is a very common pattern (time- or project- based indices for example) so we have to be able to deal with that situation anyway and it adds no extra complexity to be able to handle * properly.

I am also a huge fan of the simplicity of defining a Document class and then using it on any index, even if the index is not managed by elasticsearch-dsl, for example I often do it for existing indices just to get the deserialization for datetimes or add a custom method to my results, it seems like specifying class Index should not be required there either.

In other cases I like to just be able to access the _index property to get access to the index level operations without having to check for None.

Overall I really think that having _index always be an instance of Index is a good thing.

That leaves us with several options:

Use some detection (name == '*' or explicit tagging with cls._index.is_default) and only override it in that case
give up on Index.document overriding the ._index of a Document and just have it copy the Document mappings onto itself
always have Index.document override the ._index of a Document to self
a hybrid where we introduce Index.document(override=True) to switch between options 2 and 3

I am leaning towards 1 but am certainly open to any of these.

What do you think?

safwanrahman · 2018-12-30T22:51:21Z

I understand your point.
But if you want to query on all the indexes, you can just create a document class with Index innerclass having name=*.
Like

class AllDocument(Document):
    class Index:
        name = '*'

Do you think having it implicit is better than explicit?

If you dont want user to add the wildcard index name explicitly, I also think going with option 1 is better idea. But also having the option 4 makes sense as it keeps the method flexible and more explicit!

safwanrahman · 2018-12-30T23:12:50Z

Overall I really think that having _index always be an instance of Index is a good thing.

I also agree with this. In that case we can have a index with name None?
what do you think?
Index(name=None)

safwanrahman · 2018-12-30T23:45:14Z

@honzakral I think having Index(name=None) will be better if user does not set anything.
If we set Index(name=None), it will set _all while making the query.

By doing this, we can differenciate in the Index.document() method as well as the user experience will not break!

What do you think?

safwanrahman · 2019-03-05T19:37:54Z

@honzakral possible to have a review again?

lost-person · 2019-04-22T14:24:47Z

Hi! @safwanrahman I met the similar problem when i use Index.document like this:

test_index1 = Index('test_index1')
test_index2 = Index('test_index2')

ascii_fold = analyzer(
    'ascii_fold',
    # we don't want to split O'Brian or Toulouse-Lautrec
    tokenizer='whitespace',
    filter=[
        'lowercase',
        token_filter('ascii_fold', 'asciifolding')
    ]
)

@test_index1.document
@test_index2.document
class testDoc(Document):
      user_name = Completion(analyzer = ascii_fold)
          
      def save(self, ** kwargs):
             self.update_time = int(time.time() * 1000)
             kwargs['index'] = 'test_index1'
             return super().save(** kwargs)

......

if __name__ == '__main__':
      testDoc.init()

Could you tell me how to resolve it? Thanks! And my elasticsearch-dsl version is 6.3.1

honzakral · 2019-04-25T23:02:40Z

Sorry for the delays, finally pushed as 29863b3

Thank you for working on this, much appreciated!

safwanrahman changed the title ~~[Fix #1091] document method of Index does not set index on Document class~~ [Fix #1091] Index.document() does not set index on Document class Dec 30, 2018

[Fix elastic#1091] Index.document() does not set index on Document class

c47e35a

safwanrahman force-pushed the document_method branch from c8e405f to c47e35a Compare December 30, 2018 21:09

safwanrahman mentioned this pull request Dec 30, 2018

[Fix #115] Support elasticsearch-dsl > 6.1.0 django-es/django-elasticsearch-dsl#136

Merged

safwanrahman added 4 commits December 31, 2018 03:23

more fixup

778d7f5

adding index in document

43412cf

fixing more tests

d0a0dee

fixing tests

ca2ad8e

safwanrahman added 2 commits December 31, 2018 05:49

set index name to None

55ed31a

more update

7d71f61

felixedel mentioned this pull request Feb 27, 2019

Feature: add a prefix to elastic index bmwcarit/zubbi#37

Closed

honzakral closed this Apr 25, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Fix #1091] Index.document() does not set index on Document class #1099

[Fix #1091] Index.document() does not set index on Document class #1099

safwanrahman commented Dec 30, 2018

honzakral commented Dec 30, 2018

safwanrahman commented Dec 30, 2018 •

edited

Loading

honzakral commented Dec 30, 2018

safwanrahman commented Dec 30, 2018

safwanrahman commented Dec 30, 2018

safwanrahman commented Dec 30, 2018

safwanrahman commented Mar 5, 2019

lost-person commented Apr 22, 2019

honzakral commented Apr 25, 2019

[Fix #1091] Index.document() does not set index on Document class #1099

[Fix #1091] Index.document() does not set index on Document class #1099

Conversation

safwanrahman commented Dec 30, 2018

honzakral commented Dec 30, 2018

safwanrahman commented Dec 30, 2018 • edited Loading

honzakral commented Dec 30, 2018

safwanrahman commented Dec 30, 2018

safwanrahman commented Dec 30, 2018

safwanrahman commented Dec 30, 2018

safwanrahman commented Mar 5, 2019

lost-person commented Apr 22, 2019

honzakral commented Apr 25, 2019

safwanrahman commented Dec 30, 2018 •

edited

Loading