Skip to content

Updating index do not respect get_queryset method #111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
safwanrahman opened this issue Jun 12, 2018 · 7 comments · Fixed by #301
Closed

Updating index do not respect get_queryset method #111

safwanrahman opened this issue Jun 12, 2018 · 7 comments · Fixed by #301
Labels

Comments

@safwanrahman
Copy link
Collaborator

While populating ES index, it calls get_queryset from the DocType. So if I overwrite the get_queryset method and filter the queryset to index only specific kind of object, it works perfectly only when I run from the management command (rebuild or populate)

But, while new objects get created it does not respect that queryset. So though I add a filter in the get_queryset method, all objects get indexed.

Below is the example

# models.py
class Foo(model.Model):
    name = models.CharField(max_length=100)
    age = models.IntegerField()

# documents.py
@foo_index.doc_type
class FooDocument(DocType):
    class Meta:
         model = Foo
         fields = ('name', 'age')

     def get_queryset(self):
         queryset = super().get_queryset()
         return queryset.filter(age__gt=10)

So while running ./manage.py search_index --rebuild, it indexes only the object that have age more than 10. But If we add another Foo object with age=9, it also get indexed.

@karpie28
Copy link

karpie28 commented Aug 9, 2019

Hey, do you have any workaround for that?

///Edit:

I'm using django-elasticsearch-dsl-drf and overwriting "get_queryset" method in ViewSet allows adding always used filters.

class UserViewSet(BaseDocumentViewSet):

def get_queryset(self):
  queryset = self.search.query("match", is_staff=False)
  queryset.model = self.document._doc_type.model
  return queryset

@bashu
Copy link

bashu commented Aug 19, 2019

@karpie28 in a DocType:

@INDEX.doc_type
class Document(es.DocType):
    ...
    def update(self, thing, refresh=None, action='index', **kwargs):
        if isinstance(thing, models.Model) and not thing.is_public:
            if action == "index":
                action = "delete"
        return super(Document, self).update(thing, refresh, action, **kwargs)

    def get_queryset(self):
        return super(Document, self).get_queryset().exclude(is_public=False)

@csdenboer
Copy link

@karpie28 in a DocType:

@INDEX.doc_type
class Document(es.DocType):
    ...
    def update(self, thing, refresh=None, action='index', **kwargs):
        if isinstance(thing, models.Model) and not thing.is_public:
            if action == "index":
                action = "delete"
        return super(Document, self).update(thing, refresh, action, **kwargs)

    def get_queryset(self):
        return super(Document, self).get_queryset().exclude(is_public=False)

This may result in 404 when saving a non-public instance that does not exist in the index.

@bashu
Copy link

bashu commented Sep 12, 2019

@csdenboer true, instead of:

    ....
    def update(self, thing, refresh=None, action='index', **kwargs):
        if isinstance(thing, models.Model) and not thing.is_public:
            if action == "index":
                action = "delete"
        return super(Document, self).update(thing, refresh, action, **kwargs)

try

    ....
    def update(self, thing, refresh=None, action="index", **kwargs):
        if isinstance(thing, models.Model) and action == "index":
            if not thing.is_public:
                return super(Document, self).update(thing, refresh, action="delete", raise_on_error=False, **kwargs)
        return super(Document, self).update(thing, refresh, action, **kwargs)

@csdenboer
Copy link

Yeah or

   def update(self, thing, refresh=None, action='index', **kwargs):
        if isinstance(thing, models.Model) and not thing.is_public and action == "index":
            action = "delete"
            kwargs = {**kwargs, 'raise_on_error': False}
        return super(Document, self).update(thing, refresh, action, **kwargs)```

@kos08001
Copy link

@csdenboer does that imply ignore_signals = True ?

safwanrahman added a commit to safwanrahman/django-elasticsearch-dsl that referenced this issue Oct 31, 2020
safwanrahman added a commit that referenced this issue Apr 17, 2021
* [Fix #111] Adding API for filtering object while indexing

* Update documents.py

* updating docstring
@QuentinJouet
Copy link

Fix #301 is cool but doesn't handle the deletion of indexed objects that should not be anymore after save.

I had to stay with this :

   def update(self, thing, refresh=None, action='index', **kwargs):
       if isinstance(thing, models.Model) and action == "index" and not self.get_queryset().filter(pk=thing.pk).exists():
           action = "delete"
           kwargs = {**kwargs, 'raise_on_error': False}
       return super().update(thing, refresh, action, **kwargs)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants