Skip to content

Support customization of FormulaEvaluator in PoiSheet #126

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sravaniv-3cortex opened this issue Apr 23, 2024 · 1 comment · Fixed by #143
Closed

Support customization of FormulaEvaluator in PoiSheet #126

sravaniv-3cortex opened this issue Apr 23, 2024 · 1 comment · Fixed by #143

Comments

@sravaniv-3cortex
Copy link

sravaniv-3cortex commented Apr 23, 2024

Our use case of reading Excel files requires that errors in resolving the external sheet references in formula evaluations be ignored. This requires calling the org.apache.poi.ss.usermodel.FormulaEvaluator.setIgnoreMissingWorkbooks(boolean). This isn't currently possible with the current PoiSheet implementation - PoiSheet is package-private so it can't be extended to override the getRow method, which allows us to customize the FormulaEvalutor instance used.

What is the feasibility of doing one of these things?

  • Opening up PoiSheet for extension
  • Have a settings class which can be passed by the user and later applied by the PoiSheet to the FormulaEvaluator. This might result in future asks for setting classes for other things.

If nothing above works, the other option is to duplicate PoiSheet with necessary changes and then override PoiItemReader.

@mdeinum
Copy link
Collaborator

mdeinum commented Aug 7, 2024

While working on this I wonder if it would be a sensible default to set this to true by default. Or maybe even take it a step further and when we encounter a formula cell to always return the cached value. As we are reading the files the cached value should, generally speaking, be the right value. We don't inline change related cells or workbooks.

So from a performance perspective it might be even beneficial to, by default, ignore the formula(s).

@mdeinum mdeinum added this to the spring-batch-excel-0.2 milestone Aug 7, 2024
@mdeinum mdeinum self-assigned this Sep 18, 2024
mdeinum added a commit that referenced this issue Sep 18, 2024
Prior to this commit it was impossible to modify or configure the DataFormatter and FormulaEvaluator in use. With this commit we resolve this by introducing two interfaces. 1 DataFormatterCustomizer to customize/configure the the DataFormatter in use. 1 interface the FormulaEvaluatorFactory which is a factory used to create the FormulaEvaluator.

The default DataFormatterCustomizer will configure the formatter to use the cached values from cells instead of evaluating the formulas.

The default FormulaEvaluatorFactory in use will always return null instead of creating a FormulaEvaluator.

Fixes: #126
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants