|
1 |
| -# Guidelines |
| 1 | +# Project Guidelines |
2 | 2 |
|
3 | 3 | This document describes general guidelines for the Safe-DS Python Library. In the **DO**/**DON'T** examples below we either show _client code_ to describe the code users should/shouldn't have to write, or _library code_ to describe the code we, as library developers, need to write to achieve readable client code. We'll continuously update this document as we find new categories of usability issues.
|
4 | 4 |
|
@@ -56,6 +56,43 @@ Some flag parameters drastically alter the semantics of a function. This can lea
|
56 | 56 | table.drop("name", axis="columns")
|
57 | 57 | ```
|
58 | 58 |
|
| 59 | +### Return copies of objects |
| 60 | + |
| 61 | +Modifying objects in-place |
| 62 | +can lead to surprising behaviour |
| 63 | +and hard-to-find bugs. |
| 64 | +Methods shall never change |
| 65 | +the object they're called on |
| 66 | +or any of their parameters. |
| 67 | + |
| 68 | +!!! success "**DO** (library code):" |
| 69 | + |
| 70 | + ```py |
| 71 | + result = self._data.copy() |
| 72 | + result.columns = self._schema.column_names |
| 73 | + result[new_column.name] = new_column._data |
| 74 | + return Table._from_pandas_dataframe(result) |
| 75 | + ``` |
| 76 | + |
| 77 | +!!! failure "**DON'T** (library code):" |
| 78 | + |
| 79 | + ```py |
| 80 | + self._data[new_column.name] = new_column._data  # Mutates the table in place |
| 81 | + ``` |
| 82 | + |
| 83 | +The corresponding docstring should explicitly state |
| 84 | +that a method returns a copy: |
| 85 | + |
| 86 | +!!! success "**DO** (library code):" |
| 87 | + |
| 88 | + ```py |
| 89 | + """ |
| 90 | + Return a new table with the given column added as the last column. |
| 91 | + The original table is not modified. |
| 92 | + ... |
| 93 | + """ |
| 94 | + ``` |
| 95 | + |
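The copy-returning pattern described above can be sketched without pandas. This is a minimal illustration only: `SimpleTable` and its `add_column` method are hypothetical stand-ins and are not the library's `Table` class or API.

```python
from __future__ import annotations

import copy


class SimpleTable:
    """Hypothetical stand-in for a table; not the library's Table class."""

    def __init__(self, columns: dict[str, list]) -> None:
        # Deep-copy the input so later changes to the caller's data do not leak in.
        self._columns = copy.deepcopy(columns)

    @property
    def column_names(self) -> list[str]:
        return list(self._columns)

    def add_column(self, name: str, values: list) -> SimpleTable:
        """
        Return a new table with the given column added as the last column.
        The original table is not modified.
        """
        # Work on a new dict so that self._columns stays untouched.
        new_columns = dict(self._columns)
        new_columns[name] = values
        return SimpleTable(new_columns)


original = SimpleTable({"a": [1, 2]})
extended = original.add_column("b", [3, 4])

print(original.column_names)  # ['a']
print(extended.column_names)  # ['a', 'b']
```

Because the constructor copies its input, neither the receiver nor the argument passed to `add_column` is shared with the returned object.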
59 | 96 | ### Avoid uncommon abbreviations
|
60 | 97 |
|
61 | 98 | Write full words rather than abbreviations. The increased verbosity is offset by better readability, better functioning auto-completion, and a reduced need to consult the documentation when writing code. Common abbreviations like CSV or HTML are fine though, since they rarely require explanation.
|
@@ -205,27 +242,6 @@ The user should not have to deal with exceptions that are defined in the wrapper
|
205 | 242 | return pd.read_csv(path) # May raise a pd.ParserError
|
206 | 243 | ```
|
207 | 244 |
|
208 |
| -### Sort entries in `__all__` lists alphabetically |
209 |
| -The entries in the `__all__` list in `__init__.py` files should be sorted alphabetically. This helps reduce the likelihood of merge conflicts when new entries are introduced on different branches. |
210 |
| -!!! success "**DO** (library code):" |
211 |
| - |
212 |
| - ```py |
213 |
| - __all__ = [ |
214 |
| - "ColumnSizeError", |
215 |
| - "DuplicateColumnNameError", |
216 |
| - "MissingValuesColumnError" |
217 |
| - ] |
218 |
| - ``` |
219 |
| -!!! failure "**DON'T** (library code):" |
220 |
| - |
221 |
| - ```py |
222 |
| - __all__ = [ |
223 |
| - "MissingValuesColumnError", |
224 |
| - "ColumnSizeError", |
225 |
| - "DuplicateColumnNameError" |
226 |
| - ] |
227 |
| - ``` |
228 |
| - |
229 | 245 | ### Group API elements by task
|
230 | 246 |
|
231 | 247 | Packages should correspond to a specific task like classification or imputation. This eases discovery and makes it easy to switch between different solutions for the same task.
|
@@ -370,7 +386,114 @@ Examples
|
370 | 386 |
|
371 | 387 | ## Tests
|
372 | 388 |
|
373 |
| -If a function contains more code than just the getting or setting of a value, automated tests should be added to the [`tests`][tests-folder] folder. The file structure in the tests folder should mirror the file structure of the [`src`][src-folder] folder. |
| 389 | +We aim for 100% line coverage, |
| 390 | +so automated tests should be added |
| 391 | +for any new function. |
| 392 | + |
| 393 | +### File structure |
| 394 | + |
| 395 | +Tests belong in the [`tests`][tests-folder] folder. |
| 396 | +The file structure in the tests folder |
| 397 | +should mirror the file structure |
| 398 | +of the [`src`][src-folder] folder. |
| 399 | + |
| 400 | +### Naming |
| 401 | + |
| 402 | +Names of test functions |
| 403 | +shall start with `test_should_` |
| 404 | +followed by a description |
| 405 | +of the expected behaviour, |
| 406 | +e.g. `test_should_add_column`. |
| 407 | + |
| 408 | +!!! success "**DO** (library code):" |
| 409 | + |
| 410 | + ```py |
| 411 | + def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None: |
| 412 | + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): |
| 413 | + ... |
| 414 | + ``` |
| 415 | + |
| 416 | +!!! failure "**DON'T** (library code):" |
| 417 | + |
| 418 | + ```py |
| 419 | + def test_value_error(self, number_of_trees) -> None: |
| 420 | + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): |
| 421 | + ... |
| 422 | + ``` |
| 423 | + |
| 424 | +### Parametrization |
| 425 | + |
| 426 | +Tests should be parametrized |
| 427 | +using `@pytest.mark.parametrize`, |
| 428 | +even if there is only a single test case. |
| 429 | +This makes it easier |
| 430 | +to add new test cases in the future. |
| 431 | +Test cases should be given |
| 432 | +descriptive IDs. |
| 433 | + |
| 434 | +!!! success "**DO** (library code):" |
| 435 | + |
| 436 | + ```py |
| 437 | + @pytest.mark.parametrize("number_of_trees", [0, -1], ids=["zero", "negative"]) |
| 438 | + def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None: |
| 439 | + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): |
| 440 | + RandomForest(number_of_trees=number_of_trees) |
| 441 | + ``` |
| 442 | + |
| 443 | +!!! failure "**DON'T** (library code):" |
| 444 | + |
| 445 | + ```py |
| 446 | + def test_should_raise_if_less_than_0(self) -> None: |
| 447 | + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): |
| 448 | + RandomForest(number_of_trees=-1) |
| 449 | + |
| 450 | + def test_should_raise_if_equal_to_0(self) -> None: |
| 451 | + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): |
| 452 | + RandomForest(number_of_trees=0) |
| 453 | + ``` |
| 454 | + |
| 455 | +## Code style |
| 456 | + |
| 457 | +### Consistency |
| 458 | + |
| 459 | +If there is more than one way |
| 460 | +to solve a particular task, |
| 461 | +check how it has been solved |
| 462 | +at other places in the codebase |
| 463 | +and stick to that solution. |
| 464 | + |
| 465 | +### Sort exported classes in `__init__.py` |
| 466 | + |
| 467 | +Classes defined in a package |
| 468 | +that other modules should be able to import |
| 469 | +must be included |
| 470 | +in a list named `__all__` |
| 471 | +in the package's `__init__.py` file. |
| 472 | +This list should be sorted alphabetically, |
| 473 | +to reduce the likelihood of merge conflicts |
| 474 | +when adding new classes to it. |
| 475 | + |
| 476 | +!!! success "**DO** (library code):" |
| 477 | + |
| 478 | + ```py |
| 479 | + __all__ = [ |
| 480 | + "Column", |
| 481 | + "Row", |
| 482 | + "Table", |
| 483 | + "TaggedTable", |
| 484 | + ] |
| 485 | + ``` |
| 486 | + |
| 487 | +!!! failure "**DON'T** (library code):" |
| 488 | + |
| 489 | + ```py |
| 490 | + __all__ = [ |
| 491 | + "Table", |
| 492 | + "TaggedTable", |
| 493 | + "Column", |
| 494 | + "Row", |
| 495 | + ] |
| 496 | + ``` |
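One way to keep such lists sorted is an automated check. The following is a rough sketch, assuming plain case-sensitive string comparison is the intended ordering; `unsorted_exports` is a hypothetical helper and not part of the library.

```python
from __future__ import annotations


def unsorted_exports(exported_names: list[str]) -> list[tuple[str, str]]:
    """Return adjacent pairs of names that are out of alphabetical order."""
    return [
        (first, second)
        for first, second in zip(exported_names, exported_names[1:])
        if first > second  # Case-sensitive comparison (an assumption of this sketch)
    ]


sorted_names = ["Column", "Row", "Table", "TaggedTable"]
shuffled_names = ["Table", "TaggedTable", "Column", "Row"]

print(unsorted_exports(sorted_names))    # []
print(unsorted_exports(shuffled_names))  # [('TaggedTable', 'Column')]
```

A test could apply this helper to a package's `__all__` and assert that the result is empty.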
374 | 497 |
|
375 | 498 | [src-folder]: https://github.com/Safe-DS/Stdlib/tree/main/src
|
376 | 499 | [tests-folder]: https://github.com/Safe-DS/Stdlib/tree/main/tests
|
|