-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Make it easier to spot compile-time graphs #13762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I think in addition to this that there may be something wrong with the current implementation of What I would have expected to see in the linked example would be something like this: router.ex
- test.ex (compile)
- router.ex (runtime)
- some_live_view.ex (runtime)
- gettext.ex(compile) but it actually just stops here router.ex
- test.ex (compile) |
Yes, we have to stop there because, as you said, it would be an infinite loop otherwise. We are trying to print a graph as lines of texts, limitations are expected. What we could though is this:
So at least it becomes clearer. I will look into this as well. |
Even better if we can show it as (compile cycle!) but no promises. |
👍 perfect. And I'd absolutely add something like |
It is hard to avoid all cycles and runtime ones are harmless. I am also seeing there are benefits in keeping Perhaps what we really need to do is to add a function called |
Yeah, I was assuming |
cc @marcandre who may also have ideas here. |
FWIW I think that there could possibly be a way to make this visible in the existing tasks that could be minimally invasive. Specifically, if we're stopping because we've already "seen" a given node, we could continue one time if this time it's a compile time dependency whereas last time it was a runtime dependency (or was the module we started at). Then cycles would be displayed in I haven't looked at the implementation, so perhaps what I'm saying is incorrect, but I would suggest that, if possible, making
In that example, it doesn't show |
I'm not sure I understand what is so particular with cycles. In my codebases, I have a single criteria to avoid all transitive dependencies which is no "compiled-connected". It gets slightly more complicated than this as there are usually a few really basic files refer to themselves compile-time, but to nothing else so the actual 100% fool-proof way is no compiled-connected dependency when removing that core group, and no (runtime) dependencies from that core group, i.e.:
I believe that you can make the above work if and only if a codebase will not recompile needlessly files (unless one touches Personally, I would rather improve Are there cases where the above is not the correct approach but looking at cycles is? |
You are correct, the root cause is still a compile-connected dependency somewhere. However, imagine you have 30 compile-connected dependencies in a project. The ones that will give you the biggest benefit in addressing are the ones that also form a cycle, because if you have a compile dependency in a cycle, it means that all runtime dependencies of any element in the graph now becomes a compile-time dependency. |
Thinking more about it, here's why I think it's a mistake to focus on cycles instead of compile-connected dependencies. Imagine a code base in good order. It has a file Now Anton modifies This is a compile-connected dependency, and there is no (new) cycle. You might say "But Marc-André, I don't really care if only A month later, Bob adds a runtime dependency in Another month passes, and Christian wants to modify The issue is that Christian did nothing wrong and has no context on what needs fixing. It might make a lot of sense for I believe that transitive dependencies, as "innocent" as they might be at first, can become issues later on and are better fixed as soon as introduced, and although cycles make them much worse, they are not necessary. |
I was writing at the same time...
I'm not convinced there's any other good way forward than to sever the 30 compile-connected dependencies, but that's correct. Maybe an option that would list the compile-connected dependencies in descending order of the magnitude of the connected part would be helpful. Cycles would be automatically first. |
Yes, that's what I am thinking. Compile-connected in cycles get priority 99, the other ones get priority 95. |
One could imagine a compile-time dependency refering to a small cycle of 3 files (which wouldn't be too bad), and a huge tree of 300 files (i.e. cycle-free), and the priority should be on severing the reference to the big tree, no? That's why I think simply counting how many files are in the connected component(s) of the compile-time dependency is a better metric. |
Not to be a nuisance, but I do want to make it clear that my original issue is that there was no good way to spot a compile-connected dependency that comes about from a cycle. I think that is even more important to fix. There are hidden compile-connected dependencies in the example app I linked from the other issue.
That output isn't helpful, as it's showing a normal compile time dependency that I was aware of. It doesn't indicate to me at all that I'm all for general purpose tooling to help people with these kinds of problems, but to me this is the more pressing issue. |
@zachdaniel that is correct, I believe this is true if there's a cycle or not. |
@josevalim Should I make a PR to show both the compile-time dependency and connections (just one level down)? |
🤔 How am I supposed to read that in that case? Is it just saying "this module you have a compile time dependency on depends on something else at runtime"? How would I go about using that information to figure out what the cause is? What I would expect Basically the equivalent of first running |
@zachdaniel what you are asking is not possible given the visual representation of a terminal. You are asking for us to show this:
while we show this:
The first thing is to realize that both representations are the same. Because there is a cycle, you can append the
Then the dependencies have already been placed and we wouldn't nest either way. Unfortunately I don't think this is feasible, so I propose we drop this subthread. :) |
@zachdaniel if you plot any of the three representations above in a piece of paper, you would see they are the same representation and the dependency into the graph would become obvious. It is just hard to do in a terminal. It is also worth pointing you two are asking different questions. @zachdaniel is asking "why is gettext causing router.ex to recompile". @marcandre is asking "which files I need to improve to reduce compilation woes". @marcandre is pointing out that, by asking his question, you will eventually solve the gettext issue, even if you can't dissect it. I think there is wisdom in @marcandre's argument. For any considerably large project, trying to understand why it happens is very hard, because there are so many nodes. |
🤔 I'm not necessarily trying to ask for any specific representation. I proposed some ideas, yes. All I'm really trying to say is that this:
is the full output of that command from the test application I linked, and It doesn't display any runtime dependencies that are causing the issue. I would be totally fine if the output wasn't displayed nested but instead on top of each other as you displayed, but I don't currently get either. The current output is very confusing, as it is not display even a single source/destination that are connected via a transitive dependency. It is only displaying one compile time dependency. I understand if we can't necessarily display the nested information, or if some other tact must be taken, but I'm looking for a specific solution to that user experience issue.
I agree that folks should in general be asking "what can I do to make my compile times better", but as a framework author I have to understand the specifics of how the macros we are shipping to users work. I wasn't investigating this for a single application, I was trying to solve a problem that happens for all users of a particular macro being used in Phoenix routers was causing compile time issues. |
That's because it is not the runtime dependency that is causing the issue. It is the |
If your question is the user experience, look at compile-connected. If you want to understand why the compile connected is there, use |
If I didn't have any runtime dependencies, nothing would have shown up, right? So it is the combination of the two that makes a transitive compile time dependency. I understand now that But when I say As it stands right now, I'll leave it with that, perhaps what I'm saying doesn't make sense, and in that case I'm sorry for wasting your time 🙇 But from the standpoint of a user of EDIT: Like, if we were just talking about it, we'd say " |
Keep in mind that the whole point of the |
And --label compile-connected does not really behave any different than --label compile, they are all filters of the relationships. |
You are 100% correct. I have pushed a commit where the
We of course want to fix the top one. :) Thanks everyone for the convo. Elixir got better tonight because of it. |
And here is a commit in Livebook which fixes the one above: It mostly extracted all runtime dependencies into a single module which does not depend on anything else. |
@ zachdaniel Your understanding is correct, and your remarks make me think that it might be best to show both the compile dependency and the "connection" dependencies. Sometimes you want to remove the compile-time dependency, but often it's the "connection" dependencies that should be removed. @josevalim would you be against showing them? Nice improvement with stats @josevalim 🩷 The count is only the number of direct dependency, though, right? Wouldn't a better metric be the number of direct and indirect dependencies? |
@marcandre i am not against showing them but it may slightly change the meaning of the label option. We have to try. About stats, for incoming, it is only the direct ones, right? If I depend on A which depends on B at compile time, I am not recompiled if A changes. for outgoing, the whole graph matters more. But I am not sure how to put those in stats and if computing them won’t be too expensive in large projects. |
I'm working on quite a big project (around 2000 files in our (I'll name the parts so it's easier for you to answer to a specific part) 1. Cycles may not be the problem
Having to recompile a file when any of your project file changes arises in other cases. The doc talks a lot about the following structure : This is fixed by following the rule explained in 2. Hard to understand This is the second time I focused on reducing the number of files recompiled when basically any of our files changed. The first time, I didn't even understand how transitive compile dependencies were the problem. I may have overlooked some part of the documentation, but anyway, I remember trying to remove all compile dependencies, not the compile-connected ones. That worked, but wasn't really time-effective. Anyway, like I said above, I think the changes to xref doc will help greatly. I shared the new documentation to my team, and I'm waiting to see if it helps them understand how dependencies between files work, and how transitive compile dependencies are a problem. 3. Hard to debug I created a script that tries to evaluate and guide the developer to the source of an issue, basically looking for dependencies in modules that are used as macros. The output looks something like the following:
Under "Dependencies:" is the output of It's not perfect, sometimes the "suspicious lines" won't be the cause of the problem (see "5. Dependencies"), and I do not have feedback from my team on this tool yet, as it has only been added this week, but I feel like a tool like this could be a great addition to mix. It could probably be much more precise by adding it directly in mix 🙂 4. Not straightforward to prevent For the record, the script also as a concept of "safe list" (aka, files we want to ignore through the 5. Dependencies I doubt there's much to be done about it, but well, that's part of what I had to fix 😄 I think that's a reason more to add more information in "Meta-programming anti-patterns", as library writers should be aware of such caveats. And, that's it ! Sorry for the long speech, still I hope this can help improve |
I think this is a great idea. Would you like to take a stab at it? |
Sure ! I'll try to write this in the afternoon (CET) 👌 |
I hate to reap this, but the context of this thread is entirely relevant. What would really, truly help me, would be to know which files are actually compiling when I hit save. Is it possible to type a command with xref that will tell me exactly which files will recompile when I edit a specific file? As it is, all I see is All I'm left with are doubts and guesses. |
You can save the file and pass the —verbose flag to mix compile to see all compiled files. |
You can then pass the file you changed as the sink (others are compiled because it changed) and the one you are interested as the source. The CLI prints duplicated entries because it is still printing a graph. So if there are many entries pointing to the same place, they appear again. |
This is incredibly enlightening. I can see now where a lot of this comes from. I really wish that I'd known about this years ago. A major one is plugs. Every plug depends on the router at compile time because plug uses a macro. And though none have any strange code in them, any runtime reference to business logic from within any plug, which feels pretty unavoidable, (get the api key rate limit, pattern match a schema) chains all the way down into the business logic causing any of those business logic files to cause the router to recompile. Another thing that is hitting us, is some devs write code like
This is merely used to convert :user to User in code. While this could have been written as a function, my devs keep trying to move things into attributes as an optimization and if any of the things in the attribute happen to be actual schema atoms, this creates compile time dependencies everywhere. Okay well thank you for the tip. I think that will help a lot. edit: What's really weird about this to me, and what is tripping me up is that if I have any reference to an atom that happens to be a module within the same program, it creates a runtime reference. Like if I go.
What is the rationale for creating these dependencies for merely referencing an atom that happens to be a module somewhere, even if you don't use a function or pattern match or anything else? |
IIRC correctly these do not generate compile time dependencies if the attributes are used inside a function only. Run
That's because you could pass them to a function and that function could call something in it. |
Wow, I double-checked because I also thought these generated compile-time dependencies, and of course you are correct, very nice. The output that
On that subject, there are ways to cheat, e.g. |
Edit: I'm not able to replicate these conclusions on the latest Elixir compiler and so I think I need to verify that these issues haven't been fixed already or that there isn't some extra complication in my code leading to this extra recomp.
That's true, they are just runtime. But if there is a chain of runtime deps that back up into a compile time dep (as my plugs all do), then any of those runtime deps cause a compile time dep on my router. So even modules that looked completely clean to me, no calls to external functions, actually had dependencies in them to many other modules, purely because they used sensibly named atoms as guards to its own local functions. This lead to a spider web of secret dependencies that caused extra compilation.
Like if I use |
As discussed in phoenixframework/phoenix#5893, if you have a compile-time dependency in a cycle, it makes it so any runtime dependency in a file becomes a compile-time dependency. Therefore, it is extremely important that we make these cases easier to spot.
My suggestion is to introduce
mix xref cycles
, bringing it at the top level, and allow cycles to be classified by compile / export / runtime.The text was updated successfully, but these errors were encountered: