DOC: Adding ArcticDB to the ecosystem.md page (pandas-dev#59830)
* adding ArcticDB to the ecosystem.md page
* Update web/pandas/community/ecosystem.md
Co-authored-by: Matthew Roeschke <[email protected]>
* making pandas lower case
---------
Co-authored-by: Matthew Roeschke <[email protected]>
ArcticDB is a serverless DataFrame database engine designed for the Python Data Science ecosystem. ArcticDB enables you to store, retrieve, and process pandas DataFrames at scale. It is a storage engine designed for object storage and also supports local-disk storage using LMDB. ArcticDB requires zero additional infrastructure beyond a running Python environment and access to object storage and can be installed in seconds. Please find full documentation [here](https://docs.arcticdb.io/latest/).

#### ArcticDB Terminology

ArcticDB is structured to provide a scalable and efficient way to manage and retrieve DataFrames, organized into several key components:

- `Object Store` Collections of libraries. Used to separate logical environments from each other. Analogous to a database server.
- `Library` Contains multiple symbols which are grouped in a certain way (different users, markets, etc). Analogous to a database.
- `Symbol` Atomic unit of data storage. Identified by a string name. Data stored under a symbol strongly resembles a pandas DataFrame. Analogous to tables.
- `Version` Every modifying action (write, append, update) performed on a symbol creates a new version of that object.
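The hierarchy above can be pictured with plain Python containers. The sketch below is a toy model only: `ToyObjectStore` and its `write`/`read` methods are hypothetical names invented for illustration, and they are neither ArcticDB's API nor a description of how it stores data. It illustrates one point from the list, namely that each modifying action on a symbol produces a new version number.

```python
class ToyObjectStore:
    """Toy stand-in: object store -> libraries -> symbols -> versions."""

    def __init__(self):
        # {library name: {symbol name: [data for version 0, version 1, ...]}}
        self._libraries = {}

    def write(self, library, symbol, data):
        """Store data as a new version of the symbol and return its version number."""
        versions = self._libraries.setdefault(library, {}).setdefault(symbol, [])
        versions.append(data)
        return len(versions) - 1

    def read(self, library, symbol, version=None):
        """Return the latest version by default, or a specific earlier one."""
        versions = self._libraries[library][symbol]
        return versions[-1] if version is None else versions[version]


store = ToyObjectStore()
v0 = store.write("markets", "prices", [1, 2, 3])
v1 = store.write("markets", "prices", [1, 2, 3, 4])
print(v0, v1)  # prints "0 1"
print(store.read("markets", "prices", version=0))  # prints "[1, 2, 3]"
```

In this toy model the library name (`"markets"`) plays the role of a database and the symbol name (`"prices"`) the role of a table, mirroring the analogies in the list.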

#### Installation

To install, simply run:

```console
pip install arcticdb
```

To get started, we can import ArcticDB and instantiate it:

```python
import arcticdb as adb
import numpy as np
import pandas as pd
# this will set up the storage using the local file system
arctic = adb.Arctic("lmdb://arcticdb_test")
```

> **Note:** ArcticDB supports any S3 API compatible storage, including AWS. ArcticDB also supports Azure Blob storage.
> ArcticDB also supports LMDB for local/file based storage - to use LMDB, pass an LMDB path as the URI: `adb.Arctic('lmdb://path/to/desired/database')`.

#### Library Setup

ArcticDB is geared towards storing many (potentially millions of) tables. Individual tables (DataFrames) are called symbols and are stored in collections called libraries. A single library can store many symbols. Libraries must be initialized before use.

Now that we have a library set up, we can start reading and writing data. ArcticDB has a set of simple functions for DataFrame storage. Let's write a DataFrame to storage.

```python
df = pd.DataFrame(
    {
        "a": list("abc"),
        "b": list(range(1, 4)),
        "c": np.arange(3, 6).astype("u1"),
        "d": np.arange(4.0, 7.0, dtype="float64"),
        "e": [True, False, True],
        "f": pd.date_range("20130101", periods=3)
    }
)

df
df.dtypes
```

Write to ArcticDB.

```python
write_record = lib.write("test", df)
```
> **Note:** When writing pandas DataFrames, ArcticDB supports the following index types:
> The "row" concept in `head`/`tail` refers to the row number ('iloc'), not the value in the `pandas.Index` ('loc').
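The distinction between row number and index value can be seen in pandas alone. This sketch uses only pandas, with a made-up DataFrame whose integer index does not start at zero; it does not call ArcticDB, and it assumes that `head`/`tail` follow the positional behaviour described in the note.

```python
import pandas as pd

# Integer index whose labels differ from the row positions 0, 1, 2.
df = pd.DataFrame(
    {"value": ["a", "b", "c"]},
    index=pd.Index([10, 20, 30], dtype="int64"),
)

# Positional selection ('iloc'): the first two rows, whatever their labels.
by_position = df.iloc[:2]

# Label-based selection ('loc'): rows whose index label is at most 2.
by_label = df.loc[:2]

print(len(by_position), len(by_label))  # prints "2 0"
```

Under that reading, asking for the first two rows corresponds to `by_position` (labels 10 and 20), not to `by_label`, which is empty here because no index label is 2 or smaller.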

#### Reading Data from ArcticDB

Read the data back from storage:

```python
read_record = lib.read("test")
read_record.data
# dtypes of the data read back; compare with df.dtypes from before the write
read_record.data.dtypes
```
ArcticDB also supports appending, updating, and querying data from storage into a pandas DataFrame. Please find more information [here](https://docs.arcticdb.io/latest/api/query_builder/).