|
| 1 | +Using Existing Artifacts |
| 2 | +======================== |
| 3 | + |
| 4 | +Once connected to an artifact store (either an individual or a shared one), we can list its existing artifacts: |
| 5 | + |
| 6 | +.. code:: python |
| 7 | +
|
| 8 | + lineapy.artifact_store() |
| 9 | +
|
| 10 | +which prints a list like the following: |
| 11 | + |
| 12 | +.. code:: none |
| 13 | +
|
| 14 | + iris_preprocessed:0 created on 2022-09-29 01:22:39.612871 |
| 15 | + iris_preprocessed:1 created on 2022-09-29 01:22:41.336159 |
| 16 | + iris_preprocessed:2 created on 2022-09-29 01:22:43.511112 |
| 17 | + iris_model:0 created on 2022-09-29 01:22:45.381132 |
| 18 | + iris_model:1 created on 2022-09-29 01:22:46.786414 |
| 19 | + iris_model:2 created on 2022-09-29 01:22:47.990517 |
| 20 | + iris_model:3 created on 2022-09-29 01:22:49.366484 |
| 21 | + toy_artifact:0 created on 2022-09-29 01:22:50.189060 |
| 22 | + toy_artifact:1 created on 2022-09-29 01:22:50.676276 |
| 23 | + toy_artifact:2 created on 2022-09-29 01:22:51.084704 |
| 24 | +
|
| 25 | +Each line contains three pieces of information about an existing artifact: its name, version, and time of creation. |
| 26 | +For example, the artifact named ``iris_model`` has four versions (0 through 3), each created at a different time. |
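
The listing format described above can also be processed programmatically. The following is a minimal sketch, assuming every line follows the ``name:version created on timestamp`` pattern seen in the sample output; ``parse_listing`` and ``latest_versions`` are hypothetical helper names and are not part of LineaPy.

```python
from datetime import datetime

# A few lines copied from the sample listing above.
LISTING = """\
iris_preprocessed:2 created on 2022-09-29 01:22:43.511112
iris_model:0 created on 2022-09-29 01:22:45.381132
iris_model:3 created on 2022-09-29 01:22:49.366484
toy_artifact:1 created on 2022-09-29 01:22:50.676276
"""


def parse_listing(text):
    """Split each listing line into a (name, version, created_at) tuple."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # "iris_model:3 created on 2022-..." -> "iris_model:3", "2022-..."
        ident, _, stamp = line.partition(" created on ")
        # Split on the last colon so names containing ":" would still work.
        name, _, version = ident.rpartition(":")
        created_at = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        entries.append((name, int(version), created_at))
    return entries


def latest_versions(entries):
    """Map each artifact name to its highest version number."""
    latest = {}
    for name, version, _ in entries:
        latest[name] = max(version, latest.get(name, version))
    return latest


entries = parse_listing(LISTING)
print(latest_versions(entries))
```

This only illustrates the structure of each line; it parses printed text and does not query the artifact store itself.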
| 27 | + |
| 28 | +Now, say we are interested in reusing the first version (version 0) of this artifact. We can retrieve it as follows: |
| 29 | + |
| 30 | +.. code:: python |
| 31 | +
|
| 32 | + model_artifact = lineapy.get("iris_model", version=0) |
| 33 | +
|
| 34 | +Note that what has been retrieved and stored in ``model_artifact`` is not the model itself but the model *artifact*, |
| 35 | +which contains more than the model, e.g., the code that was used to generate it. Hence, to reuse the model, |
| 36 | +we need to extract the artifact's value: |
| 37 | + |
| 38 | +.. code:: python |
| 39 | +
|
| 40 | + model = model_artifact.get_value() |
| 41 | +
|
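
To make the distinction between an artifact and its value concrete, here is a toy sketch. ``ToyArtifact`` is a hypothetical stand-in written for illustration only; it is not LineaPy's implementation and mirrors only the two method names used in this section.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToyArtifact:
    """Hypothetical stand-in: bundles a value with the code that produced it."""

    name: str
    version: int
    value: Any
    code: str

    def get_value(self):
        # The stored object itself (e.g., a fitted model or a DataFrame).
        return self.value

    def get_code(self):
        # The source code that generated the stored object.
        return self.code


toy = ToyArtifact(
    name="toy_artifact",
    version=0,
    value=[1, 2, 3],
    code="x = [1, 2, 3]",
)

print(toy.get_value())  # the value, usable directly
print(toy.get_code())   # the code, useful for understanding the value
```

The wrapper and the wrapped value are different objects, which is why the value has to be extracted before it can be used.
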
| 42 | +However, we may not know how to reuse this model, since we may not remember its context, such as the details of its |
| 43 | +inputs (or may never have known it, if the artifact was created by someone else). Thankfully, the artifact also stores |
| 44 | +the code that was used to generate its value, so we can inspect it: |
| 45 | + |
| 46 | +.. code:: python |
| 47 | +
|
| 48 | + print(model_artifact.get_code()) |
| 49 | +
|
| 50 | +which prints: |
| 51 | + |
| 52 | +.. code:: none |
| 53 | +
|
| 54 | + import lineapy |
| 55 | + from sklearn.linear_model import LinearRegression |
| 56 | +
|
| 57 | + art_df_processed = lineapy.get("iris_preprocessed", version=2) |
| 58 | + df_processed = art_df_processed.get_value() |
| 59 | + mod = LinearRegression() |
| 60 | + mod.fit( |
| 61 | + X=df_processed[["petal.width", "d_versicolor", "d_virginica"]], |
| 62 | + y=df_processed["sepal.width"], |
| 63 | + ) |
| 64 | +
|
| 65 | +With this, we now know the source and shape of the data that was used to train this model, |
| 66 | +which enables us to adapt and reuse the model in our context. Specifically, we can check out the |
| 67 | +training data by loading the corresponding artifact, like so: |
| 68 | + |
| 69 | +.. code:: python |
| 70 | +
|
| 71 | + art_df_processed = lineapy.get("iris_preprocessed", version=2) |
| 72 | + df_processed = art_df_processed.get_value() |
| 73 | + print(df_processed) |
| 74 | +
|
| 75 | +Inspecting the values in the data gives us a more concrete understanding of the model and its task, |
| 76 | +which lets us make new predictions, like so: |
| 77 | + |
| 78 | +.. code:: python |
| 79 | +
|
| 80 | + import pandas as pd |
| 81 | +
|
| 82 | + # Create data to make predictions on |
| 83 | + df = pd.DataFrame({ |
| 84 | + "petal.width": [1.3, 5.2, 0.3, 1.5, 4.9], |
| 85 | + "d_versicolor": [1, 0, 0, 1, 0], |
| 86 | + "d_virginica": [0, 1, 0, 0, 1], |
| 87 | + }) |
| 88 | +
|
| 89 | + # Make new predictions |
| 90 | + df["sepal.width.pred"] = model.predict(df) |
| 91 | +
|
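
Because the model was fit on three specific columns, new data should provide the same columns in the same order. The sketch below shows one way to check this before predicting; ``check_feature_columns`` and ``TRAINING_FEATURES`` are hypothetical names introduced here, with the column list taken from the training code printed earlier.

```python
# Feature columns used in the training code retrieved from the artifact.
TRAINING_FEATURES = ["petal.width", "d_versicolor", "d_virginica"]


def check_feature_columns(expected, actual):
    """Raise ValueError unless `actual` has exactly the `expected` columns, in order."""
    expected, actual = list(expected), list(actual)
    missing = [col for col in expected if col not in actual]
    extra = [col for col in actual if col not in expected]
    if missing or extra:
        raise ValueError(
            f"missing columns: {missing}; unexpected columns: {extra}"
        )
    if actual != expected:
        raise ValueError(
            f"columns are out of order: expected {expected}, got {actual}"
        )


# Passes silently: same names, same order.
check_feature_columns(
    TRAINING_FEATURES, ["petal.width", "d_versicolor", "d_virginica"]
)
```

With a pandas DataFrame, the second argument could be ``df.columns``. Note that once a prediction column such as ``sepal.width.pred`` has been added to ``df``, the check would fail, so the feature columns should be selected explicitly before predicting again.
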
| 92 | +This walkthrough illustrates the benefit of LineaPy's unified storage framework: |
| 93 | +by encapsulating the value, the code, and other metadata, LineaPy's artifact store |
| 94 | +lets the user explore the history of, and relations among, different works, |
| 95 | +making them more reusable. |