-
-
Notifications
You must be signed in to change notification settings - Fork 7k
Add library dependency detection #2792
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In theory, this should properly deal with #include inside of #ifdef. But only in a library. Sketch preprocessing is not changed. This only applies to detecting dependencies among library. It should now be possible to do something like this inside a sensor library using I2C
Of course, this hasn't been throughly tested. I know some of the people who've asked me to implement this feature have wanted this. Now's the time to give the test build a try and see if it works for such complex cases. |
Code looks good to me overall. However, I really think that hardcoding the preproccesor / gcc options in the java code is the wrong way to go, these should be defined by platform.txt. Having said that, I guess this approach is perfectly fine for initial testing, it should just be generalized before merging this to master (but odds are that this was your plan already) At one point I thought this code could be fooled by a sketch like: foo.h:
bar.cpp:
so it would see and process the bar.h include, before the path to foo.h However, in your code you are cleverly always processing the first Perhaps a comment explaining this wouldn't hurt, since readers of the Finally, having two different ways of checking dependencies (one for |
Regarding the latter point, my previous pullrequest (#2174) contained a bunch of code cleanups to allow the sketch and libraries to use the same dependency resolution code. I think some of it was included in, or deprecated by the IDE refactor that happened a while ago, but perhaps some of it would still apply. I'll try to have a look at that code and see if I can combine some of that with this PR. |
Wow, Matthijs, I'm really impressed how deeply you've analyzed the code! Yes, comments explaining some of the subtle points would be good. That weird regex was actually one of the final parts that held up completing this little project for a few hours. So much effort for an 8 character string! You're absolutely right about the "Adding library..." print producing duplicate output. For simple cases like SD requiring SPI, it's just 1 extra line. But it could grow to quite a few extra lines printed in a complex case where a long chain of libraries each depend upon each other. Duplicate printing could be suppressed by adding a List or ArrayList to remember which libraries have already printed and avoid printing them again. This technique might also be useful for sketch preprocessing. I certainly have thought about that. I would like to respectfully disagree, however, on the best path to take. Incremental improvement in relatively small and achievable steps is almost always be better software development strategy than expanding scope and features. I believe the best course is to first apply this to library dependency, then later consider how (and whether) to reuse it for sketch preprocessing. If we expand the scope of this work and delay merging it, we risk it suffering the same fate as your earlier work in #2174, and numerous other attempts that have been made over the years. Remember, this has been on the issue tracker since 2010, before Arduino was using GitHub. |
I asked myself this question several times while writing the code. I believe the answer is really another question: can we print a better error than the compiler will later report when it can't find the missing header. |
You can probably fix this by moving these prints up to outside of the loop, since any libraries added inside the loop are already printed once exactly.
I guess that's reasonable. I'm hesitant to have a release where there is inconsistency with how dependency detection works between libraries and sketches, but I guess having something that is slightly inconsistent in corner cases is better than having nothing at all :-)
I'm not so sure if the compiler gives a proper error message right now, due to warnings being disabled. Even if that is not true or fixed, I think the compiler might give error message for multiple missing includes. Since we only tried to locate the first one and gave up entirely when that didn't work, I think the compiler might show error message for subsequent missing header files that could actually have been found if we only tried. I'm afraid that might confuse users. This is something to actually try, though, but I don't have time for that right now. |
However, it does seem this pull request and #2729 are on a collision course! Maybe this is a good time to think about whether that approach or a "gcc -M" approach is best for Arduino in the long term? |
I'm not sure if these two collide: #2729 does preprocessing to find and forward-declare functions, this PR processes to find (missing) include files. I really think the -M approach is the best way forward - it allows reusing the existing preprocessor, instead of trying to emulate one (using regexes, ctags or whatever). Ideally, you'd also do this for function declarations, but that's probably not feasible, hence #2729 tries to fix this in a best-effort way. AFAICS, these two approaches solve different problems and can and should just coexist. |
This and #2729 absolutely do conflict. It completely replaces that part of Compiler.java with another abstraction layer called "chainRings". |
I'm reading the code now. It seems the code moved to preproc/IncludesToIncludeFolders.java. I have mixed feelings about this coding style. It seems every change to Arduino is adding more and more unnecessary abstraction. Really, what's the point to adding a PreprocessorChainRing class, which merely is used to call a bunch of functions? Why can't those just be ordinary classes with ordinary functions? The only "benefit" seems to be encapsulating all the results into a "context" list, which might be nice if you love abstraction layers, but it really only reduces the readability of Compiler.java by hiding which of the called functions actually accesses the 2 lists "includeFolders" and "importedLibraries". It may look pretty, but instead of ordinary code it becomes a list of actions with data flow obscured from easy view. You have to dig through many more files to figure out which of those things impacts any particular piece of the final output state used by the rest of Compiler.java. It even breaks simple searching for known variable names, by adding a layer that names them with strings. Maybe Frederico has some larger plans for this abstraction layer, but as it's used now, I can't see how it helps, and it certainly adds yet another unnecessary layer of complexity. Ok, I'm done ranting now.... So much work has already happened on the coanctags branch that it seems impossible to imagine it's not going to be merged. It's a massively invasive change, in terms of lines modified that will break most other pending pull requests, including a lot of unnecessary white space editing. Had I looked at this yesterday, I probably would have waited to make this library dependency improvement. |
@PaulStoffregen I just tested this PR with the USB Host shield library which depends on the SPI library and it works fine :) |
By encapsulating single responsibilities into single files they become much easier to test. Indeed every single file that implements PreprocessorChainRing has a corresponding test class which requires a minimum amount of setup code. It could have been done in a million other ways of course, but implementing the Command pattern like so make it easier to force a contribution to keep concerns separated. As for the merging, don't worry. The preprocessor PR is lacking a fundamental feature that will make it break every sketch more complex than a Blink, so it's not going to be merged very soon. On the contrary, yours is receiving positive feedback and it's likely to be merged sooner. |
On the plus side, it looks like coan is only called from preproc/IncludesFinder.java. Seems like it ought to be pretty straightforward to later switch from "coan" to "gcc -M", if there's any reason to do so. |
@ffissore - Ok, I get the need for unit tests. But please, I hope you'll reconsider the approach of "context". Packing all the data away into a list of named objects makes the Arduino source code dramatically more difficult to read and understand. You can't see in Compiler.java which objects are doing things with which data, like any ordinary coding style would make easily apparent. By using this sort of abstraction, you may be closing the door to future contributors by making the learning curve much too steep. |
Ok, what if I use a java bean instead, with getters and setters? This will make it easier to search for "callers of |
Right, of course. I previously meant that these two changes can conceptually co-exist, I hadn't looked at wether the actual implmentations would clash (which they clearly do). |
Like you said, there's a million ways to do something. If you're willing to do more on this, please try to imagine a moderately competent (but far from expert - a typical open source would-be contributor) Java programmer will perceive when the read Compiler.java. In the coanctags branch, first there's a list of object instances that look like function calls. None return data, or take inputs references that appear to be their outputs. Then generic "context" is created and they're all called. Afterwards, 2 items are taken out of the context list. From an abstraction point of view, it's beautiful. From a code readability point of view, Compiler.java, one of the most important files in Arduino's source, becomes much less readable. Ordinary, highly readable code defines the variables first. Ordinary code passes variables into functions, or creates objects, usually putting data into them via their constructor or by functions shortly after they're created. Readable code, read top to bottom, shows both the order actions AND gives a strong sense of what data is being manipulated, as it's passed from function to function. The point is you can see what's happening by simply reading ONE higher-level file's code, like Compiler.java. I know you're juggling a lot of priorities. I understand you probably don't have extra time to rework stuff like this. But if you can, I hope you'll consider how all these many abstraction layers are creating a steeper and steeper learning curve for potential new contributors. Unreadable code that doesn't show data flow without digging through many other files and understanding numerous abstraction classes is a sure way to close to door to most future contributors. |
What Paul said about readability and keeping foremost the priorities of functionality and maintainability. Hard to maintain code will wither and die, and/or become a real burden for the few who can understand it. I'm sure that's not the intent! |
I've added a couple more commits, for the comments Matthijs suggested and the duplicate printing issue. |
I believe I have a test case that fails (assuming I understand how the new code should work). What's the best way to get you the files / instructions? Email? Is there a way to attach binaries to one of these comments? (Nope: "Unfortunately, we don't support that file type. Try again with a PNG, GIF, or JPG.") |
Huh. Well. Maybe not a "failure". If there is no CPP file with a library, headers from other libraries are not automatically found. But, I believe, it was established that a library with no CPP (or C) file is out of bounds. |
You could create a repository here on github and commit the files. Or you could email them directly to me, paul at pjrc dot com. I'll give them a try and investigate. |
I will email files and instructions but, at this point, you looking at it seems like a waste of your time. It really is a matter of has-CPP versus does-not-have-CPP. |
It's supposed to work the .c and even .S files. I'll be happy to take a look when I have the files. Better to discover any problems or unusual cases now, rather than after it's in the hands of many thousands of end users! |
Yup. It works if there is a C or a CPP file. If there are only H files it fails. (Email sent.) |
@Coding-Badly, if there are only .h files, you will be including them from your sketch. This doesn't currently pull in any recursive dependencies, but once we also apply this preprocessor-based approach to sketch dependencies, I thin that case will be covered as well (when processing the sketch, which includes the library's .h file, any other missing include paths should show up there). |
@Coding-Badly, I'm looking at your files now. Even though you've got 2 libraries containing only header files, where one includes the other's headers, from a code dependency point of view this isn't actually a case where one library depends on the other. The sketch depends on both these libraries, even though it includes the header from only one of them. The "ProcessorTest" library can't depend on anything, since it contains no files that are actually cause the Compiler.java to invoke gcc. The sketch causes the actual run of gcc which depends on header files from both libraries. We casually talk about one library or the sketch depending upon another file or library. But sketches and libraries and even files are not the thing that really depends on other things, despite #include syntax. The runs of "gcc -c" are what actually depends on files. Speaking casually but accurately, a library depends on something else if any of the gcc instances compiling its code need any of those files. When/if we apply this "gcc -M" approach to sketches, I'm confident it will properly handle this case. When "gcc -M" is run on the .cpp file that's made from your .ino file, it'll follow the #include "ProcessorTest.h" into all those headers and discover ProcessorTest_m328_16.h includes PaulTest.h. It will produce a list of all the headers needed, for the set of #ifdef matching your code actually does. Then the lookups on importToLibraryTable will resolve both of those headers and put the 2 libraries onto the "importedLibraries" list. |
As a quick test, I ran the "gcc -M" commands on the .cpp file produced from the .ino. Here's the first iteration command, before anything is known:
On the first line, you can see it detected "ProcessorTest.h" is needed. This is one of the cases where gcc puts more than 1 file on the same line, delimited by spaces, which is why I had to create that weird regex on line 1033 in Compiler.java. Here's the 2nd iteration command, with ProcessorTest added to the include paths:
On this 2nd iteration, it properly detected that "PaulTest.h" is now required. Because there's a library with that header name, it'll be on Arduino's importToLibraryTable list. Also notice how "gcc -M" did not list the other header files from ProcessorTest which aren't actually used when Arduino Uno is selected. Here's the third iteration command, with PaulTest added to the include path:
Now "gcc -M" is able to find every file. This is our indication the searching is over and we've built a complete list of include paths and a complete list of libraries the sketch depends upon. |
Whether or not to apply this approach to sketches is a question for Cristian & Federico (and maybe Massimo?) Should "gcc -M" replace the work already done with "coan"? Before anyone does any work in this direction, I think we really need to hear from them. I'd like to continue moving forward with this on sketches after this is merged for libraries. Issue #236 has been unresolved for nearly 5 years, so the last thing I want to do is expand the scope of this work and add delay for closing #236. I believe it does make sense to apply "gcc -M" to discovering the sketch dependencies, but it also makes sense to develop software as a sequence of incremental improvements with achievable milestones. |
Agreed with @PaulStoffregen on all points :-) |
I tried to compile my testcase with PR-2792-BUILD-231-windows.zip, and it does not compile the sketch which compiles fine with Arduino 1.6.0. No output is given, neither in the DOS (cmd) window nor in the IDE's window. It's not really a helpful comment I guess... |
@KurTell - Any chance you could post that sketch, or email it directly to me? (paul at pjrc dot dom) |
👍 Hope this gets merged soon. |
I have a test case, i write two lib, the first may use or may not use the second lib. |
Fixed by arduino-builder, since version 1.0.0-beta8, available with hourly builds http://www.arduino.cc/en/Main/Software#hourly |
Not fixed with arduino-builder version 1.0.0-beta9 :( |
Tracking better implementation at arduino/arduino-builder#12 |
As recently discussed on the mailing list:
https://groups.google.com/a/arduino.cc/forum/#!topic/developers/b9Ckocu3ptw
Solves issue #236