Skip to content

Commit 2c54ec0

Browse files
zzrillars-reimann
andauthored
docs: Better documentation for developers (#427)
Closes #375. ### Summary of Changes * Improved developer documentation about tests * Added guidelines about copying objects (rather than modifying them in-place) * Added code-style guidelines * Added code review guidelines --------- Co-authored-by: Lars Reimann <[email protected]>
1 parent bc73a6c commit 2c54ec0

File tree

3 files changed

+220
-24
lines changed

3 files changed

+220
-24
lines changed

docs/development/code_review.md

+72
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
# Code review
2+
3+
This document describes
4+
general guidelines for developers
5+
when reviewing a pull request (PR).
6+
7+
## Verify that the PR solves the addressed issue
8+
9+
### Understand the issue
10+
11+
* Read the issue that was addressed by the PR and make sure you understand it
12+
* Read any discussions that occurred on the issue page
13+
14+
### Check the PR
15+
16+
* Check the PR for __completeness__: Was every point from the issue covered?
17+
* Check for __correctness__: Were the problems described in the issue solved in the intended way?
18+
* (For PRs that introduce new features:) Check the design - does it comply with our [project guidelines][guidelines-general]?
19+
* Check for potential bugs - edge cases, exceptions that may be raised in functions that were called etc.
20+
* Check the code style: Is it readable? Do variables have sensible names? Is it consistent to existing code? Is there a better / more elegant way to do certain things (e.g. [f-string](https://docs.python.org/3/tutorial/inputoutput.html#tut-f-strings) instead of manual string concatenation)?
21+
* Check any warnings reported by your IDE
22+
23+
## Check the tests
24+
25+
* Run the tests locally
26+
* Check if there are any warnings that are not caught by the test suite
27+
* Make sure the test cases are complete - do they cover all edge cases (e.g. empty tables)?
28+
* Make sure they comply to our [project guidelines][guidelines-tests] - are the tests parametrized and do all testcases have descriptive IDs?
29+
30+
## Verify that the PR does not break existing code
31+
32+
This is largely covered by the automated tests.
33+
However, you should always:
34+
35+
* Make sure all tests actually ran through (pytest, linter, code coverage)
36+
* Make sure that the branch is up-to-date with the `main` branch, so you actually test the behaviour that will result once the feature branch is merged
37+
38+
## Check the PR format
39+
40+
* Check that the PR title starts with a fitting [type](https://github.com/Safe-DS/.github/blob/main/.github/CONTRIBUTING.md#types) (e.g. `feat`, `fix`, `docs`, ...)
41+
* Check that the changes introduced by the PR are documented in the PR message
42+
43+
## Requesting changes
44+
45+
If you found any issues with the reviewed PR,
46+
navigate to the `Files changed` tab in the PR page
47+
and click on the respective line
48+
to add your comments.
49+
50+
For more details on specific review features,
51+
check out the [documentation on GitHub][github-review].
52+
53+
## Finishing the review
54+
55+
When done, finish your review
56+
by navigating to the `Files changed` tab
57+
and clicking the green `Review changes` button
58+
on the top right.
59+
60+
If you found no issues,
61+
select the `Approve` option
62+
to approve the changes made by the PR.
63+
If you like, you can add a "LGTM" meme
64+
from [lgtm.party](https://lgtm.party/),
65+
preferably one with a cat image.
66+
67+
If you found any problems,
68+
select `Request Changes` instead.
69+
70+
[guidelines-general]: project_guidelines.md
71+
[guidelines-tests]: project_guidelines.md#tests
72+
[github-review]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests

docs/development/guidelines.md renamed to docs/development/project_guidelines.md

+146-23
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Guidelines
1+
# Project Guidelines
22

33
This document describes general guidelines for the Safe-DS Python Library. In the **DO**/**DON'T** examples below we either show _client code_ to describe the code users should/shouldn't have to write, or _library code_ to describe the code we, as library developers, need to write to achieve readable client code. We'll continuously update this document as we find new categories of usability issues.
44

@@ -56,6 +56,43 @@ Some flag parameters drastically alter the semantics of a function. This can lea
5656
table.drop("name", axis="columns")
5757
```
5858

59+
### Return copies of objects
60+
61+
Modifying objects in-place
62+
can lead to surprising behaviour
63+
and hard-to-find bugs.
64+
Methods shall never change
65+
the object they're called on
66+
or any of their parameters.
67+
68+
!!! success "**DO** (library code):"
69+
70+
```py
71+
result = self._data.copy()
72+
result.columns = self._schema.column_names
73+
result[new_column.name] = new_column._data
74+
return Table._from_pandas_dataframe(result)
75+
```
76+
77+
!!! failure "**DON'T** (library code):"
78+
79+
```py
80+
self._data.add(new_column, axis='column')
81+
```
82+
83+
The corresponding docstring should explicitly state
84+
that a method returns a copy:
85+
86+
!!! success "**DO** (library code):"
87+
88+
```py
89+
"""
90+
Return a new table with the given column added as the last column.
91+
The original table is not modified.
92+
...
93+
"""
94+
```
95+
5996
### Avoid uncommon abbreviations
6097

6198
Write full words rather than abbreviations. The increased verbosity is offset by better readability, better functioning auto-completion, and a reduced need to consult the documentation when writing code. Common abbreviations like CSV or HTML are fine though, since they rarely require explanation.
@@ -205,27 +242,6 @@ The user should not have to deal with exceptions that are defined in the wrapper
205242
return pd.read_csv(path) # May raise a pd.ParserError
206243
```
207244

208-
### Sort entries in `__all__` lists alphabetically
209-
The entries in the `__all__` list in `__init__.py` files should be sorted alphabetically. This helps reduce the likelihood of merge conflicts when new entries are introduced on different branches.
210-
!!! success "**DO** (library code):"
211-
212-
```py
213-
__all__ = [
214-
"ColumnSizeError",
215-
"DuplicateColumnNameError",
216-
"MissingValuesColumnError"
217-
]
218-
```
219-
!!! failure "**DON'T** (library code):"
220-
221-
```py
222-
__all__ = [
223-
"MissingValuesColumnError",
224-
"ColumnSizeError",
225-
"DuplicateColumnNameError"
226-
]
227-
```
228-
229245
### Group API elements by task
230246

231247
Packages should correspond to a specific task like classification or imputation. This eases discovery and makes it easy to switch between different solutions for the same task.
@@ -370,7 +386,114 @@ Examples
370386

371387
## Tests
372388

373-
If a function contains more code than just the getting or setting of a value, automated test should be added to the [`tests`][tests-folder] folder. The file structure in the tests folder should mirror the file structure of the [`src`][src-folder] folder.
389+
We aim for 100% line coverage,
390+
so automated tests should be added
391+
for any new function.
392+
393+
### File structure
394+
395+
Tests belong in the [`tests`][tests-folder] folder.
396+
The file structure in the tests folder
397+
should mirror the file structure
398+
of the [`src`][src-folder] folder.
399+
400+
### Naming
401+
402+
Names of test functions
403+
shall start with `test_should_`
404+
followed by a description
405+
of the expected behaviour,
406+
e.g. `test_should_add_column`.
407+
408+
!!! success "**DO** (library code):"
409+
410+
```py
411+
def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None:
412+
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
413+
...
414+
```
415+
416+
!!! failure "**DON'T** (library code):"
417+
418+
```py
419+
def test_value_error(self, number_of_trees) -> None:
420+
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
421+
...
422+
```
423+
424+
### Parametrization
425+
426+
Tests should be parametrized
427+
using `@pytest.mark.parametrize`,
428+
even if there is only a single test case.
429+
This makes it easier
430+
to add new test cases in the future.
431+
Test cases should be given
432+
descriptive IDs.
433+
434+
!!! success "**DO** (library code):"
435+
436+
```py
437+
@pytest.mark.parametrize("number_of_trees", [0, -1], ids=["zero", "negative"])
438+
def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None:
439+
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
440+
RandomForest(number_of_trees=number_of_trees)
441+
```
442+
443+
!!! failure "**DON'T** (library code):"
444+
445+
```py
446+
def test_should_raise_if_less_than_0(self, number_of_trees) -> None:
447+
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
448+
RandomForest(number_of_trees=-1)
449+
450+
def test_should_raise_if_equal_to_0(self, number_of_trees) -> None:
451+
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
452+
RandomForest(number_of_trees=0)
453+
```
454+
455+
## Code style
456+
457+
### Consistency
458+
459+
If there is more than one way
460+
to solve a particular task,
461+
check how it has been solved
462+
at other places in the codebase
463+
and stick to that solution.
464+
465+
### Sort exported classes in `__init__.py`
466+
467+
Classes defined in a module
468+
that other classes shall be able to import
469+
must be defined
470+
in a list named `__all__`
471+
in the module's `__init__.py` file.
472+
This list should be sorted alphabetically,
473+
to reduce the likelihood of merge conflicts
474+
when adding new classes to it.
475+
476+
!!! success "**DO** (library code):"
477+
478+
```py
479+
__all__ = [
480+
"Column",
481+
"Row",
482+
"Table",
483+
"TaggedTable",
484+
]
485+
```
486+
487+
!!! failure "**DON'T** (library code):"
488+
489+
```py
490+
__all__ = [
491+
"Table",
492+
"TaggedTable",
493+
"Column",
494+
"Row",
495+
]
496+
```
374497

375498
[src-folder]: https://github.com/Safe-DS/Stdlib/tree/main/src
376499
[tests-folder]: https://github.com/Safe-DS/Stdlib/tree/main/tests

mkdocs.yml

+2-1
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,9 @@ nav:
1515
- Glossary: glossary.md
1616
- Development:
1717
- Environment: development/environment.md
18-
- Guidelines: development/guidelines.md
18+
- Project Guidelines: development/project_guidelines.md
1919
- Contributing: https://github.com/Safe-DS/Stdlib/contribute
20+
- Code Review Guide: development/code_review.md
2021

2122
# Configuration of MkDocs & Material for MkDocs --------------------------------
2223

0 commit comments

Comments
 (0)