Skip to content

Implement coverage reports for Starlark #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

arvi18
Copy link

@arvi18 arvi18 commented Apr 16, 2025

If the new flag --starlark_coverage_report is set to a non-empty value, the Starlark coverage recorder will be enabled for all .bzl files under the output base executed during the Blaze command. At the end of the command, the collected coverage data will be emitted to the file path given as the flag argument in LCOV format, containing function, line, and branch coverage information.

Summary by CodeRabbit

  • New Features
    • Added support for collecting and reporting code coverage for Starlark files, outputting results in LCOV format.
    • Introduced a new command-line option to generate Starlark coverage reports for executed files during a build.
  • Bug Fixes
    • Improved handling of parameter offsets to avoid potential errors with missing identifiers.
  • Refactor
    • Centralized and streamlined logic for printing comprehension clauses in code output.
  • Chores
    • Added a new failure code for coverage report dumping errors.

fmeum added 5 commits May 31, 2022 11:57
A `ConditionalExpression`, differing from an `IfStatement`, always has
an `else` case.
The generated coverage report includes function, line, and branch
coverage information in LCOV format.
If the new flag `--starlark_coverage_report` is set to a non-empty
value, the Starlark coverage recorder will be enabled for all `.bzl`
files under the output base executed during the Blaze command. At the
end of the command, the collected coverage data will be emitted to the
file path given as the flag argument in LCOV format.
@arvi18
Copy link
Author

arvi18 commented Apr 16, 2025

CC @Wyverald

I generated reports for some rulesets and found the results to be relatively accurate, but this would definitely benefit from more testing (it doesn't come with test cases yet) and polish. I'm also not sure whether it works for repository rules and/or functions executed during the execution phase, e.g. map_each callbacks. I wanted to open a PR to get early feedback and will happily fix bugs and supply tests if this is generally viewed favorably.

@arvi18
Copy link
Author

arvi18 commented Apr 16, 2025

Thanks for the PR, Fabian! I'll leave it to Jon & Sasha to review this. (NB: Jon is OOO until 9 June.)

Also cc @adonovan

@arvi18
Copy link
Author

arvi18 commented Apr 16, 2025

I fixed some bugs and refactored the branch logic, but will not make further changes.

@arvi18
Copy link
Author

arvi18 commented Apr 16, 2025

Thanks for sending this, but a change this big needs a design discussion before code. What problem are we trying to solve? What design do you propose? How can we make it simpler?

Sorry, I certainly didn't intend to skip proper process here. I wrote this code out of a concrete need to better understand the edge cases of rulesets I maintain and to improve my understanding of the Java Starlark implementation. I just thought about sharing it hoping that it could be useful for others. I am completely open to rewriting this from scratch or scrapping it entirely, depending on what the outcome of the design discussion is.

My initial questions are:

  • Does this make the interpreter slower? A lot of work has gone into making it as fast as it is (although there is still room for improvement). Benchmarking is time consuming because the JVM is noisy---execution time may vary by 5% or more---making it hard to find regressions down at the 1-2% level.

The way the hooks are currently implemented, I won't be able to dismiss this concern with confidence. As you said, benchmarking doesn't necessarily catch all regressions on such a low level.

I do think that there is a way to implement the coverage hooks without any overhead in the case where coverage isn't being collected:

  1. Make the static CoverageRecorder instance final and populate it based on e.g. a system property starlark.lcov_report_file being set to a non-empty value.
  2. Make --starlark_coverage_profile a startup option that sets this system property for the Bazel server JVM.
  3. Add logic that dumps the report on server shutdown.

The static final field should be inlined by the JIT - if we want to be certain, we could also make it a simple boolean that triggers a coverage slow path if true. I could then double-check that the code emitted by the JIT doesn't change with the PR merged.

Do you like this approach better?

  • Do we need all this new machinery? In principle, a coverage tool could be implemented as a debugger that simply steps through the whole program one instruction at a time recording the current pc. We already have a primitive debugger. Can its hooks be used to record coverage, with minimal changes?

After a first read of the code, I'm inclined to say this would work for statement coverage, but not for more fine-grained notions of coverage.

  • Do we need branch coverage? I notice the tool records control jumps such as break statements and the short-circuit in if true || true {, and information about individual parameters. Is this necessary? Statement coverage is good enough for most purposes and is less invasive.

I agree that statement coverage would already be quite useful. However, given the general conciseness of Starlark (especially with list/dict comprehensions and conditional expressions), a single statement can encode rather complex logic. With the rulesets I have tried this on, statement coverage has almost always been nearing 100%, but branch coverage revealed some interesting uncovered cases (e.g., handling of workspace vs. external repository, checks for missing/superfluous files). While branch coverage information for breaks may not be as useful, I found it better to cover all branches rather than just some of them.

Assuming we find a way to ensure that coverage support introduces no overhead at all (as possibly described above), would you still prefer to remove some or all of the branch coverage hooks?.

@arvi18
Copy link
Author

arvi18 commented Apr 16, 2025

@adonovan Would you like me to prototype the static final approach and take a look at the JIT output? I think that this has a fair chance of alleviating the performance worries entirely.

@arvi18
Copy link
Author

arvi18 commented Apr 16, 2025

Finally taking a look at this, as I promised Fabian back at BazelCon.

It looks like there's quite a bit more code to this PR than I originally thought. I'm interested in supporting more tooling for Starlark in the new year, coverage included, but in light of

  • the added code complexity (who will maintain all this extra logic?),
  • concerns of performance regression, and
  • uncertainty about the extent of the benefit of this feature to users, and the relative benefits of expression granularity vs statement granularity,

it's clear that a quick acceptance of this PR is off the table. I'd love to have a more involved discussion centered around the benefit vs maintainability tradeoff in Q1. I'll keep the PR open for now.

@arvi18
Copy link
Author

arvi18 commented Apr 16, 2025

@brandjon @tetromino Friendly ping on this.

I could trim this down to just line coverage and ensure that there is no runtime overhead if coverage is disabled.

@gandalf-repo
Copy link

@coderabbitai full review

Copy link

coderabbitai bot commented Apr 17, 2025

✅ Actions performed

Full review triggered.

Copy link

coderabbitai bot commented Apr 17, 2025

Walkthrough

This change introduces comprehensive Starlark code coverage instrumentation and reporting support. It adds a new command-line option to request LCOV-format coverage reports for Starlark files, and integrates coverage collection into the Blaze command execution flow. The Starlark evaluation engine is instrumented to record coverage at the AST node level, including branches and function entries. New infrastructure for coverage recording, AST traversal, and LCOV report generation is implemented. The system is thread-safe and supports filtering by filename. An explicit failure code is added for coverage report dumping errors, and minor related refactoring and bug fixes are included.

Changes

File(s) Change Summary
src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java Integrated Starlark coverage collection and report dumping into the Blaze command execution flow, including error handling for report file writing.
src/main/java/com/google/devtools/build/lib/runtime/CommonCommandOptions.java Added --starlark_coverage_report command-line option for specifying the output path of the Starlark LCOV report.
src/main/java/net/starlark/java/eval/Eval.java Instrumented the evaluation engine with coverage recording hooks for statements, expressions, parameters, and control flow branches.
src/main/java/net/starlark/java/eval/Starlark.java Added static methods to start coverage collection and to dump collected coverage data.
src/main/java/net/starlark/java/syntax/BUILD Included new source files CoverageRecorder.java and CoverageVisitor.java in the build configuration.
src/main/java/net/starlark/java/syntax/CoverageRecorder.java, src/main/java/net/starlark/java/syntax/CoverageVisitor.java Introduced a coverage recording system with AST traversal, branch/function/line tracking, and LCOV report generation.
src/main/java/net/starlark/java/syntax/NodePrinter.java Refactored comprehension clause printing into a dedicated method for clarity and reuse.
src/main/java/net/starlark/java/syntax/NodeVisitor.java Removed a null check before visiting the else-case in conditional expressions.
src/main/java/net/starlark/java/syntax/Parameter.java Fixed getEndOffset() in the Star subclass to handle null identifiers safely.
src/main/java/net/starlark/java/syntax/Program.java Registered new Program instances with the coverage recorder upon compilation.
src/main/protobuf/failure_details.proto Added a new failure code for Starlark coverage report dump failures.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant BlazeCommandDispatcher
    participant Starlark
    participant CoverageRecorder
    participant FileSystem

    User->>BlazeCommandDispatcher: Run command with --starlark_coverage_report=path
    BlazeCommandDispatcher->>Starlark: startCoverageCollection(filenameMatcher)
    Starlark->>CoverageRecorder: startCoverageCollection(filenameMatcher)
    BlazeCommandDispatcher->>...: Execute command (Starlark files evaluated)
    ...->>CoverageRecorder: recordCoverage(node) (during evaluation)
    BlazeCommandDispatcher->>FileSystem: Open coverage report file for writing
    alt File open fails
        BlazeCommandDispatcher->>User: Print error, exit with STARLARK_COVERAGE_REPORT_DUMP_FAILURE
    else File open succeeds
        BlazeCommandDispatcher->>Starlark: dumpCoverage(PrintWriter)
        Starlark->>CoverageRecorder: dump(PrintWriter)
        CoverageRecorder->>FileSystem: Write LCOV report
        BlazeCommandDispatcher->>User: Command completes
    end
Loading

Poem

In the warren of code, a new trail appears,
Starlark’s coverage counted—let’s give three cheers!
With branches and lines, and functions in tow,
Reports sprout in LCOV, for all to know.
If errors arise, a bunny hops back,
But with these new tools, we’re right on track!
🐇✨

Tip

⚡💬 Agentic Chat (Pro Plan, General Availability)
  • We're introducing multi-step agentic chat in review comments and issue comments, within and outside of PR's. This feature enhances review and issue discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments and add commits to existing pull requests.
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (9)
src/main/java/net/starlark/java/syntax/Program.java (2)

88-90: Consider also registering programs compiled from expressions

The current implementation only registers programs created via compileFile with the coverage recorder, but not those created via compileExpr. This could lead to incomplete coverage data when expressions are executed.

Consider whether expression coverage should also be tracked by adding similar registration code to the compileExpr method.


88-90: Add error handling for the registration call

The register call to the CoverageRecorder has no error handling. If registration fails for any reason, it could potentially cause compilation to fail or leave the coverage recorder in an inconsistent state.

Consider adding a try-catch block to handle potential exceptions from the registration process.

src/main/java/com/google/devtools/build/lib/runtime/CommonCommandOptions.java (1)

296-303: The command-line option is well-defined

The option follows the established pattern in this file and provides clear documentation of its purpose. It's properly categorized under logging with the appropriate effect tag.

One suggestion would be to enhance the help text with more details about the LCOV format, the types of coverage collected (functions, lines, branches), and any performance implications of enabling coverage.

src/main/java/net/starlark/java/eval/Starlark.java (1)

978-984: Consider adding a method to check if coverage is enabled

To allow conditional behavior based on whether coverage is enabled, consider adding a method to check the coverage state. This would help optimize code paths that only need to execute when coverage is being collected.

+/**
+ * Returns whether Starlark coverage collection is currently enabled.
+ *
+ * @return true if coverage collection is active, false otherwise
+ */
+public static boolean isCoverageEnabled() {
+  return CoverageRecorder.getInstance().isActive();
+}
src/main/java/net/starlark/java/eval/Eval.java (1)

493-494: Hot‑path overhead: guard coverage when disabled

eval is invoked for every expression; unconditionally calling
CoverageRecorder.getInstance().recordCoverage(expr) adds at least one
virtual method dispatch per node even when coverage is turned off.
If the recorder implements a cheap singleton noop this may be fine, but you can eliminate the call entirely:

if (CoverageRecorder.isActive()) {
  CoverageRecorder.getInstance().recordCoverage(expr);
}

Worth considering if micro‑benchmarks show a noticeable slowdown.

src/main/java/net/starlark/java/syntax/CoverageVisitor.java (1)

85-92: String concatenation cost in deeply‑nested functions

enterFunction rebuilds the full a > b > c name by streaming and joining the
entire stack on every function entry.
In files with many nested lambdas this is O(depth²) and allocates a new
String each time.

An incremental StringBuilder (push/pop) or caching the previously built
prefix would avoid repeated traversals and allocations.

Not critical, but easy to optimise later.

src/main/java/net/starlark/java/syntax/CoverageRecorder.java (3)

158-167: dump should not close the PrintWriter it did not open

Closing the writer here prevents callers from writing additional data (e.g. merging multiple recorders) or from handling the resource lifecycle themselves (try‑with‑resources, error handling, etc.).
Leave responsibility to the caller:

-    registeredPrograms.stream()
-        .sorted(Comparator.comparing(Program::getFilename))
-        .forEachOrdered(program -> {
-          CoverageNodeVisitor visitor = new CoverageNodeVisitor(program);
-          visitor.visitAll(program.getResolvedFunction().getBody());
-          visitor.dump(out);
-        });
-    out.close();
+    registeredPrograms.stream()
+        .sorted(Comparator.comparing(Program::getFilename))
+        .forEachOrdered(program -> {
+          CoverageNodeVisitor visitor = new CoverageNodeVisitor(program);
+          visitor.visitAll(program.getResolvedFunction().getBody());
+          visitor.dump(out);
+        });
   }

115-136: Avoid creating a new CoverageVisitor for every top‑level statement

register instantiates a fresh anonymous CoverageVisitor for each top‑level statement, causing:

  1. Redundant allocations (can be thousands in large files).
  2. Missed opportunities to share state across statements (e.g., sets/maps).

Refactor to create one visitor per program and call visitor.visitAll(program.getResolvedFunction().getBody()) once, mirroring the approach used later in dump.


93-103: Potential memory leak: Node keys retain full AST

counts / virtualJumpCounts store strong references from long‑lived maps to every Node, which in turn references the entire subtree and the containing Program.
If the JVM server process lives for many Bazel commands, this can keep all compiled ASTs alive even after dumping coverage.

Consider:

  • Using a WeakHashMap<Node, LongAdder> (synchronised wrapper) so entries are cleared once the AST is unreachable, or
  • Storing a lightweight key (e.g., (filename,line,column)).
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0c98a2 and b603de0.

📒 Files selected for processing (12)
  • src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java (3 hunks)
  • src/main/java/com/google/devtools/build/lib/runtime/CommonCommandOptions.java (1 hunks)
  • src/main/java/net/starlark/java/eval/Eval.java (17 hunks)
  • src/main/java/net/starlark/java/eval/Starlark.java (2 hunks)
  • src/main/java/net/starlark/java/syntax/BUILD (1 hunks)
  • src/main/java/net/starlark/java/syntax/CoverageRecorder.java (1 hunks)
  • src/main/java/net/starlark/java/syntax/CoverageVisitor.java (1 hunks)
  • src/main/java/net/starlark/java/syntax/NodePrinter.java (3 hunks)
  • src/main/java/net/starlark/java/syntax/NodeVisitor.java (1 hunks)
  • src/main/java/net/starlark/java/syntax/Parameter.java (1 hunks)
  • src/main/java/net/starlark/java/syntax/Program.java (1 hunks)
  • src/main/protobuf/failure_details.proto (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java (1)
src/main/java/net/starlark/java/eval/Starlark.java (1)
  • Starlark (55-985)
src/main/java/net/starlark/java/syntax/CoverageRecorder.java (2)
src/main/java/net/starlark/java/syntax/Resolver.java (1)
  • Function (119-277)
src/main/java/net/starlark/java/syntax/Program.java (1)
  • Program (24-105)
🔇 Additional comments (9)
src/main/java/net/starlark/java/syntax/NodeVisitor.java (1)

177-177:

✅ Verification successful

Verify the safety of removing the null check

The null check before visiting the else-case node has been removed, making this visit unconditional. Other similar methods in this file (like visitBlock, visit for defaultValue, Result, etc.) still retain their null checks. This change could potentially cause NullPointerExceptions if getElseCase() returns null and visit() doesn't handle null inputs.

This change is likely related to the new coverage tracking implementation. Please verify that either:

  1. ConditionalExpression.getElseCase() is guaranteed to never return null, or
  2. The CoverageVisitor implementation handles null nodes appropriately.

🏁 Script executed:

#!/bin/bash
# Check if ConditionalExpression.getElseCase() can return null
grep -A 10 "public .* getElseCase" src/main/java/net/starlark/java/syntax/ConditionalExpression.java

# Look for null checks in the CoverageVisitor class
grep -A 3 "visit.*ConditionalExpression" src/main/java/net/starlark/java/syntax/CoverageVisitor.java

Length of output: 660


🏁 Script executed:

#!/bin/bash
echo "=== visitBranch definition in CoverageVisitor.java ==="
grep -R "visitBranch" -n src/main/java/net/starlark/java/syntax/CoverageVisitor.java -A20

echo
echo "=== Beginning of NodeVisitor.java (to inspect visit(Node)) ==="
sed -n '1,200p' src/main/java/net/starlark/java/syntax/NodeVisitor.java

Length of output: 9410


Null check removal is safe

ConditionalExpression.getElseCase() is set via the constructor and, per the language grammar, always carries a non‑null expression; it can never be null. The unguarded call in NodeVisitor.visit(ConditionalExpression) therefore cannot NPE, and CoverageVisitor.visitBranch already allows a nullable successor. No further changes are required.

src/main/java/net/starlark/java/syntax/BUILD (1)

24-25: LGTM! New coverage implementation files added

The addition of CoverageRecorder.java and CoverageVisitor.java to the build file supports the new Starlark coverage reporting feature. These files will implement the necessary infrastructure for AST traversal with coverage tracking and LCOV-compatible output generation.

src/main/protobuf/failure_details.proto (1)

536-536: LGTM! Appropriate error code for coverage report failures

Adding a specific failure code for Starlark coverage report dumping failures is appropriate. The exit code 36 is consistent with other file-related errors in the system like SERVER_FILE_WRITE_FAILURE and other I/O operation failures.

src/main/java/net/starlark/java/syntax/Parameter.java (1)

119-121: LGTM! Defensive null check prevents potential NPE

This change adds proper null handling for Star parameters with null identifiers. Since the constructor accepts a nullable identifier, this defensive programming approach prevents NullPointerExceptions when calculating the end offset of a Star parameter without an identifier.

src/main/java/net/starlark/java/syntax/NodePrinter.java (3)

44-46: Good improvement for handling Comprehension.Clause nodes

Adding explicit handling for Comprehension.Clause nodes in the printNode method improves the completeness of this method and ensures all node types can be properly processed.


283-283: Cleaner code through refactoring

Changing the inline conditional logic to call the new printClause method improves readability and maintainability.


479-491: Well-structured helper method for printing clauses

The newly added printClause method properly handles both For and If clauses with the correct syntax. The refactoring centralizes this logic in a single place, making it easier to maintain and extend if needed.

src/main/java/com/google/devtools/build/lib/runtime/BlazeCommandDispatcher.java (1)

375-381: Ensure coverage collection is reset between commands

Starlark.startCoverageCollection is invoked every time a command is executed when the flag is set, but there is no corresponding call that resets or guards against a previous invocation still being active.
If two consecutive commands run in the same server process, coverage from the first may pollute the second (or the second call might be a no‑op, depending on startCoverageCollection’s implementation).

Please verify that:

  1. startCoverageCollection idempotently resets any prior state, or
  2. Coverage is explicitly cleared after dumpCoverage, or
  3. We guard with a boolean to only start once per server lifetime.

If none of the above holds, users will get confusing, merged LCOV reports.

src/main/java/net/starlark/java/syntax/CoverageRecorder.java (1)

190-199:

❌ Incorrect review comment

Same NPE risk when populating BranchInfo

counts.get(positiveUniqueSuccessor) (and negativeUniqueSuccessor) can return null if the successor hasn’t reached visitCode yet. Guard the lookup just like the previous comment or defer branch materialisation until after the whole subtree has been visited.


I’ll locate how counts is declared and how it’s populated to confirm whether get(...) can return null.


🏁 Script executed:

#!/bin/bash
grep -n "private.*counts" -n src/main/java/net/starlark/java/syntax/CoverageRecorder.java
grep -n "counts *= *new" -n src/main/java/net/starlark/java/syntax/CoverageRecorder.java
grep -n "counts\\.put" -n src/main/java/net/starlark/java/syntax/CoverageRecorder.java
rg -n "counts\\.get" src/main/java/net/starlark/java/syntax/CoverageRecorder.java

Length of output: 948


🏁 Script executed:

#!/bin/bash
# Show context around counts.put to see how and when entries are added
rg -n -C5 'counts\.put' src/main/java/net/starlark/java/syntax/CoverageRecorder.java

Length of output: 362


🏁 Script executed:

#!/bin/bash
sed -n '160,240p' src/main/java/net/starlark/java/syntax/CoverageRecorder.java

Length of output: 3373


I’ll inspect CoverageVisitor to understand the traversal order of visitBranch vs. visitCode.


🏁 Script executed:

#!/bin/bash
# Locate the CoverageVisitor class and its methods
rg -n "class CoverageVisitor" src/main/java/net/starlark/java/syntax/CoverageVisitor.java
rg -n -C5 "visitBranch" src/main/java/net/starlark/java/syntax/CoverageVisitor.java
rg -n -C5 "visitCode" src/main/java/net/starlark/java/syntax/CoverageVisitor.java

Length of output: 8760


🏁 Script executed:

#!/bin/bash
sed -n '80,160p' src/main/java/net/starlark/java/syntax/CoverageRecorder.java

Length of output: 2725


No NPE risk in visitBranch – all counts are pre‑populated in register

The counts map is fully populated for every executable Node (including branch conditions and successors) by the anonymous CoverageVisitor in register(…) (its visitCode runs on every node in the AST). Likewise, virtualJumpCounts is initialized for any branch with a missing negativeUniqueSuccessor. By the time dump() runs and the CoverageNodeVisitor calls counts.get(…) or virtualJumpCounts.get(…), neither lookup can return null. You can safely remove the suggested null guards.

Likely an incorrect or invalid review comment.

Comment on lines +978 to +980
public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
CoverageRecorder.startCoverageCollection(filenameMatcher);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add documentation for startCoverageCollection method

This public API method lacks documentation explaining its purpose, when to use it, and how it integrates with the broader coverage system. Consider adding a Javadoc comment to clarify:

  • The purpose of the method
  • What the filenameMatcher parameter does exactly (what format it expects, examples)
  • When this should be called in relation to Starlark execution
  • Thread-safety guarantees
  • Any performance implications
+/**
+ * Starts collecting coverage data for Starlark code execution.
+ * 
+ * <p>Coverage will be collected for all Starlark files that match the provided filename matcher.
+ * This method should be called before any Starlark code execution for which coverage data is
+ * desired.
+ *
+ * <p>This method is thread-safe and can be called concurrently with Starlark execution.
+ *
+ * @param filenameMatcher a function that returns true for filenames that should have coverage
+ *     collected, and false for files to exclude from coverage
+ */
 public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
   CoverageRecorder.startCoverageCollection(filenameMatcher);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
CoverageRecorder.startCoverageCollection(filenameMatcher);
}
/**
* Starts collecting coverage data for Starlark code execution.
*
* <p>Coverage will be collected for all Starlark files that match the provided filename matcher.
* This method should be called before any Starlark code execution for which coverage data is
* desired.
*
* <p>This method is thread-safe and can be called concurrently with Starlark execution.
*
* @param filenameMatcher a function that returns true for filenames that should have coverage
* collected, and false for files to exclude from coverage
*/
public static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
CoverageRecorder.startCoverageCollection(filenameMatcher);
}

Comment on lines +982 to +984
public static void dumpCoverage(PrintWriter out) {
CoverageRecorder.getInstance().dump(out);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add documentation and error handling for dumpCoverage method

This public API method also lacks documentation and doesn't provide any error handling mechanism. If an error occurs while writing the coverage data, it would silently fail.

Consider enhancing this method with proper documentation and error handling:

+/**
+ * Writes the collected Starlark coverage data to the specified writer in LCOV format.
+ * 
+ * <p>This method should be called after Starlark code execution has completed to generate
+ * the coverage report. It will dump coverage for all files that matched the filter provided
+ * to {@link #startCoverageCollection}.
+ *
+ * <p>The coverage report includes function, line, and branch coverage information.
+ *
+ * @param out the writer to which the LCOV report will be written
+ * @throws RuntimeException if an error occurs while writing the coverage data
+ */
 public static void dumpCoverage(PrintWriter out) {
-  CoverageRecorder.getInstance().dump(out);
+  try {
+    CoverageRecorder.getInstance().dump(out);
+  } catch (Exception e) {
+    throw new RuntimeException("Failed to dump Starlark coverage data", e);
+  }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public static void dumpCoverage(PrintWriter out) {
CoverageRecorder.getInstance().dump(out);
}
/**
* Writes the collected Starlark coverage data to the specified writer in LCOV format.
*
* <p>This method should be called after Starlark code execution has completed to generate
* the coverage report. It will dump coverage for all files that matched the filter provided
* to {@link #startCoverageCollection}.
*
* <p>The coverage report includes function, line, and branch coverage information.
*
* @param out the writer to which the LCOV report will be written
* @throws RuntimeException if an error occurs while writing the coverage data
*/
public static void dumpCoverage(PrintWriter out) {
try {
CoverageRecorder.getInstance().dump(out);
} catch (Exception e) {
throw new RuntimeException("Failed to dump Starlark coverage data", e);
}
}

Comment on lines +613 to +624
if (!commonOptions.starlarkCoverageReport.isEmpty()) {
FileOutputStream out;
try {
out = new FileOutputStream(commonOptions.starlarkCoverageReport);
} catch (IOException ex) {
String message = "Starlark coverage recorder: " + ex.getMessage();
outErr.printErrLn(message);
return createDetailedCommandResult(
message, FailureDetails.Command.Code.STARLARK_COVERAGE_REPORT_DUMP_FAILURE);
}
Starlark.dumpCoverage(new PrintWriter(out));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Leak‑prone I/O: use try‑with‑resources and flush PrintWriter

FileOutputStream and the wrapping PrintWriter are opened but never closed.
On long‑running servers this leaks file descriptors, and on short‑lived invocations you may miss buffered bytes because PrintWriter is not flushed.

-        FileOutputStream out;
-        try {
-          out = new FileOutputStream(commonOptions.starlarkCoverageReport);
-        } catch (IOException ex) {
-          ...
-        }
-        Starlark.dumpCoverage(new PrintWriter(out));
+        try (FileOutputStream fos =
+                new FileOutputStream(commonOptions.starlarkCoverageReport);
+             PrintWriter pw = new PrintWriter(fos)) {
+          Starlark.dumpCoverage(pw);
+        } catch (IOException ex) {
+          ...
+        }

This guarantees flushing and closes both streams even if dumpCoverage throws.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!commonOptions.starlarkCoverageReport.isEmpty()) {
FileOutputStream out;
try {
out = new FileOutputStream(commonOptions.starlarkCoverageReport);
} catch (IOException ex) {
String message = "Starlark coverage recorder: " + ex.getMessage();
outErr.printErrLn(message);
return createDetailedCommandResult(
message, FailureDetails.Command.Code.STARLARK_COVERAGE_REPORT_DUMP_FAILURE);
}
Starlark.dumpCoverage(new PrintWriter(out));
}
if (!commonOptions.starlarkCoverageReport.isEmpty()) {
try (FileOutputStream fos =
new FileOutputStream(commonOptions.starlarkCoverageReport);
PrintWriter pw = new PrintWriter(fos)) {
Starlark.dumpCoverage(pw);
} catch (IOException ex) {
String message = "Starlark coverage recorder: " + ex.getMessage();
outErr.printErrLn(message);
return createDetailedCommandResult(
message, FailureDetails.Command.Code.STARLARK_COVERAGE_REPORT_DUMP_FAILURE);
}
}

Comment on lines +166 to +168
Parameter parameter = rfn.getParameters().get(i);
CoverageRecorder.getInstance().recordCoverage(parameter.getIdentifier());
Expression expr = parameter.getDefaultValue();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Null‑safety for anonymous * / ** parameters

Parameter.getIdentifier() may legally return null for the lone "*" that separates positional from keyword‑only parameters.
Calling CoverageRecorder.recordCoverage(null) will throw a NullPointerException, crashing evaluation of such functions.

-      CoverageRecorder.getInstance().recordCoverage(parameter.getIdentifier());
+      Identifier pid = parameter.getIdentifier();
+      if (pid != null) {
+        CoverageRecorder.getInstance().recordCoverage(pid);
+      }

Apply the same guard inside the later loop that visits varargs/kwargs.

Also applies to: 178-181

Comment on lines +181 to +185
protected void visitFunction(String identifier, Node defStatement, Node firstBodyStatement) {
functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(),
counts.get(firstBodyStatement).sum()));
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

counts.get(firstBodyStatement) can be null → potential NullPointerException

visitFunction is executed before visitCode has visited the first body statement (pre‑order traversal), so the LongAdder may not yet be in counts. Dereferencing sum() without a null‑check will crash during initial registration for every function.

-      functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(),
-          counts.get(firstBodyStatement).sum()));
+      LongAdder adder = counts.get(firstBodyStatement);
+      long count = adder == null ? 0 : adder.sum();
+      functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(), count));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
protected void visitFunction(String identifier, Node defStatement, Node firstBodyStatement) {
functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(),
counts.get(firstBodyStatement).sum()));
}
protected void visitFunction(String identifier, Node defStatement, Node firstBodyStatement) {
LongAdder adder = counts.get(firstBodyStatement);
long count = adder == null ? 0 : adder.sum();
functions.add(new FunctionInfo(identifier, defStatement.getStartLocation().line(), count));
}

Comment on lines +41 to +56
static CoverageRecorder getInstance() {
return CoverageRecorderHolder.INSTANCE;
}

/**
* Collect coverage for all {@link Program}s compiled after the call whose
* {@link Program#getFilename()} matches {@code filenameMatcher}.
*/
static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
CoverageRecorderHolder.INSTANCE = new LcovCoverageRecorder(filenameMatcher);
}

class CoverageRecorderHolder {

private static CoverageRecorder INSTANCE = new NoopCoverageRecorder();

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Make INSTANCE volatile (or use an AtomicReference) to avoid a data‑race

startCoverageCollection can be invoked from a different thread than recordCoverage/getInstance.
Because CoverageRecorderHolder.INSTANCE is a mutable static field without the volatile modifier, the write performed in line 50 may not become visible to threads already calling getInstance(), leading them to observe a partially‑constructed LcovCoverageRecorder or to keep using the NoopCoverageRecorder.

-  class CoverageRecorderHolder {
-    private static CoverageRecorder INSTANCE = new NoopCoverageRecorder();
+  class CoverageRecorderHolder {
+    // Volatile guarantees safe publication of the recorder instance
+    private static volatile CoverageRecorder INSTANCE = new NoopCoverageRecorder();

Alternatively, replace the field with an AtomicReference<CoverageRecorder> and perform a CAS to protect against multiple concurrent initialisations.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static CoverageRecorder getInstance() {
return CoverageRecorderHolder.INSTANCE;
}
/**
* Collect coverage for all {@link Program}s compiled after the call whose
* {@link Program#getFilename()} matches {@code filenameMatcher}.
*/
static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
CoverageRecorderHolder.INSTANCE = new LcovCoverageRecorder(filenameMatcher);
}
class CoverageRecorderHolder {
private static CoverageRecorder INSTANCE = new NoopCoverageRecorder();
static CoverageRecorder getInstance() {
return CoverageRecorderHolder.INSTANCE;
}
/**
* Collect coverage for all {@link Program}s compiled after the call whose
* {@link Program#getFilename()} matches {@code filenameMatcher}.
*/
static void startCoverageCollection(Function<String, Boolean> filenameMatcher) {
CoverageRecorderHolder.INSTANCE = new LcovCoverageRecorder(filenameMatcher);
}
class CoverageRecorderHolder {
// Volatile guarantees safe publication of the recorder instance
private static volatile CoverageRecorder INSTANCE = new NoopCoverageRecorder();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants