-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Add modeling for intersphinx data #5289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 23 commits
83c6f40
db5fbd8
c2cd164
41b2857
806d3c2
0f82c26
639b219
67d909a
537c66a
647ae3f
5b7d599
1dfabe8
08296ae
1e087fb
12bdf04
8e015c9
444e739
1672421
462545e
f53e589
98aae46
c934046
96fa45a
d8d75d5
09b3c80
0da9283
1bc31d0
97851c4
021e12d
ab7a30c
fa72754
4ea9121
ec71993
18877c1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
"""Domain Admin classes""" | ||
from django.contrib import admin | ||
from .models import DomainData | ||
|
||
|
||
class DomainDataAdmin(admin.ModelAdmin): | ||
list_filter = ('type', 'project') | ||
raw_id_fields = ('project', 'version') | ||
search_fields = ('doc_name', 'name') | ||
|
||
|
||
admin.site.register(DomainData, DomainDataAdmin) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
"""Domain API classes""" | ||
|
||
from rest_framework import serializers | ||
|
||
from readthedocs.restapi.views.model_views import UserSelectViewSet | ||
from .models import DomainData | ||
|
||
|
||
class DomainDataSerializer(serializers.ModelSerializer): | ||
project = serializers.SlugRelatedField(slug_field='slug', read_only=True) | ||
version = serializers.SlugRelatedField(slug_field='slug', read_only=True) | ||
|
||
class Meta: | ||
model = DomainData | ||
fields = ('project', 'version', 'name', 'display_name', 'doc_type', 'doc_url') | ||
|
||
|
||
class DomainDataAdminSerializer(DomainDataSerializer): | ||
|
||
class Meta(DomainDataSerializer.Meta): | ||
fields = '__all__' | ||
|
||
|
||
class DomainDataAPIView(UserSelectViewSet): # pylint: disable=too-many-ancestors | ||
model = DomainData | ||
serializer_class = DomainDataSerializer | ||
admin_serializer_class = DomainDataAdminSerializer | ||
filter_fields = ('project__slug', 'version__slug', 'domain', 'type', 'doc_name', 'name') |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
""" | ||
Sphinx Domain modeling. | ||
|
||
http://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html | ||
""" | ||
|
||
from django.db import models | ||
from django.utils.encoding import python_2_unicode_compatible | ||
from django.utils.translation import ugettext_lazy as _ | ||
|
||
from readthedocs.builds.models import Version | ||
from readthedocs.core.resolver import resolve | ||
from readthedocs.projects.models import Project | ||
from readthedocs.projects.querysets import RelatedProjectQuerySet | ||
|
||
|
||
@python_2_unicode_compatible | ||
stsewd marked this conversation as resolved.
Show resolved
Hide resolved
|
||
class DomainData(models.Model): | ||
|
||
""" | ||
Information from a project about it's Sphinx domains. | ||
|
||
This captures data about API objects that exist in that codebase. | ||
""" | ||
|
||
project = models.ForeignKey( | ||
Project, | ||
related_name='domain_data', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here, we will have |
||
) | ||
version = models.ForeignKey(Version, verbose_name=_('Version'), | ||
related_name='domain_data') | ||
modified_date = models.DateTimeField(_('Publication date'), auto_now=True) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could start using this standard #4864 |
||
commit = models.CharField(_('Commit'), max_length=255) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we really need this? I think we already have this from the version 🤔 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, a file can be removed from a specific commit, and we need to track this to delete it. |
||
|
||
domain = models.CharField( | ||
_('Domain'), | ||
max_length=255, | ||
) | ||
name = models.CharField( | ||
_('Name'), | ||
max_length=255, | ||
) | ||
display_name = models.CharField( | ||
_('Display Name'), | ||
max_length=255, | ||
) | ||
type = models.CharField( | ||
_('Type'), | ||
max_length=255, | ||
) | ||
doc_name = models.CharField( | ||
_('Doc Name'), | ||
max_length=255, | ||
) | ||
anchor = models.CharField( | ||
_('Anchor'), | ||
max_length=255, | ||
) | ||
objects = RelatedProjectQuerySet.as_manager() | ||
|
||
def __str__(self): | ||
return f''' | ||
DomainData [{self.project.slug}:{self.version.slug}] | ||
[{self.domain}:{self.type}] {self.name} -> {self.doc_name}#{self.anchor} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should probably be wrapped. |
||
''' | ||
|
||
@property | ||
def doc_type(self): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure if this is the correct name, I wasn't able to find something that could be used instead. I mean, this is just the full object not, the type of document. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is more like a "role name" or "directive name" in the Sphinx language as I can read from their docs. Considering that this is used as Maybe "role_name" is a better name. |
||
return f'{self.domain}:{self.type}' | ||
|
||
@property | ||
def doc_url(self): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd name it as |
||
path = self.doc_name | ||
if self.anchor: | ||
path += f'#{self.anchor}' | ||
full_url = resolve( | ||
project=self.project, version_slug=self.version.slug, filename=path) | ||
return full_url |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,6 +13,7 @@ | |
import json | ||
import logging | ||
import os | ||
import sys | ||
import shutil | ||
import socket | ||
from collections import Counter, defaultdict | ||
|
@@ -25,6 +26,8 @@ | |
from django.utils import timezone | ||
from django.utils.translation import ugettext_lazy as _ | ||
from slumber.exceptions import HttpClientError | ||
from sphinx.ext import intersphinx | ||
|
||
|
||
from readthedocs.builds.constants import ( | ||
BUILD_STATE_BUILDING, | ||
|
@@ -58,6 +61,7 @@ | |
) | ||
from readthedocs.doc_builder.loader import get_builder_class | ||
from readthedocs.doc_builder.python_environments import Conda, Virtualenv | ||
from readthedocs.domaindata.models import DomainData | ||
from readthedocs.projects.models import APIProject | ||
from readthedocs.restapi.client import api as api_v2 | ||
from readthedocs.vcs_support import utils as vcs_support_utils | ||
|
@@ -1136,6 +1140,7 @@ def fileify(version_pk, commit): | |
), | ||
) | ||
_manage_imported_files(version, path, commit) | ||
_update_intersphinx_data(version, path, commit) | ||
else: | ||
log.info( | ||
LOG_TEMPLATE.format( | ||
|
@@ -1146,6 +1151,56 @@ def fileify(version_pk, commit): | |
) | ||
|
||
|
||
def _update_intersphinx_data(version, path, commit): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not to run this function in a celery tasks instead of calling syncly? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is already run inside of a celery task. |
||
""" | ||
Update intersphinx data for this version | ||
|
||
:param version: Version instance | ||
:param path: Path to search | ||
:param commit: Commit that updated path | ||
""" | ||
object_file = os.path.join(path, 'objects.inv') | ||
|
||
class MockConfig: | ||
intersphinx_timeout = None # type: int | ||
tls_verify = False | ||
|
||
class MockApp: | ||
srcdir = '' | ||
config = MockConfig() | ||
|
||
def warn(self, msg): | ||
# type: (unicode) -> None | ||
stsewd marked this conversation as resolved.
Show resolved
Hide resolved
|
||
print(msg, file=sys.stderr) | ||
|
||
invdata = intersphinx.fetch_inventory(MockApp(), '', object_file) | ||
for key in sorted(invdata or {}): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why these two loops need to be |
||
domain, _type = key.split(':') | ||
for name, einfo in sorted(invdata[key].items()): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't it clearer to use for key, value in invdata.items():
domain, _type = key.split(':')
for name, einfo in value: |
||
url = einfo[2] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd add a small comment with previous to this line, with the expected structure of this
or... a link to the docs. I found debugging this complicated later. |
||
if '#' in url: | ||
doc_name, anchor = url.split('#') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could use urlparse https://docs.python.org/3/library/urllib.parse.html, but the logic here is very simple, so not really sure. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea, feels unnecessary. It's also not a full URL, so it might be confused by it. |
||
else: | ||
doc_name, anchor = url, '' | ||
display_name = einfo[3] | ||
obj, _ = DomainData.objects.get_or_create( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this will fail the first time because |
||
project=version.project, | ||
version=version, | ||
domain=domain, | ||
name=name, | ||
display_name=display_name, | ||
type=_type, | ||
doc_name=doc_name, | ||
anchor=anchor, | ||
) | ||
if obj.commit != commit: | ||
obj.commit = commit | ||
obj.save() | ||
DomainData.objects.filter(project=version.project, | ||
version=version | ||
).exclude(commit=commit).delete() | ||
|
||
|
||
def _manage_imported_files(version, path, commit): | ||
""" | ||
Update imported files for version. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DomainData
name conflicts with ourDomain
model to me.Considering that we are talking about Sphinx Domains here, I'd name all of these ones (including the Django App) as
SphinxDomain
. I think it will be better to get the meaning immediately when re reading about this.