* Feel free to pick up an issue from the bug tracker: https://github.com/manahl/arctic/issues or add an issue in general and assign it to yourself so we don't duplicate the work on the same issue.
4
+
5
+
* Local installation
6
+
* Clone the repo locally
7
+
* Create a virtualenv, e.g. `virtualenv .venv -p python3`
8
+
* Activate the virtualenv, e.g. `source .venv/bin/activate`
9
+
* Run `python setup.py install` to install dependencies in your virtualenv.
10
+
* Arctic should now be ready to use locally; you can test it by importing it in your Python interpreter.
11
+
12
+
* After you have made changes, you can run tests with `python setup.py test`. You can also do something like: `python setup.py test -a tests/integration/<test_name>` to run a specific test.
13
+
14
+
* Run pycodestyle locally to make sure it passes the coding style checks.
docs/faq.md
+21-8
@@ -10,19 +10,28 @@ other data types and optional versioning.
10
10
11
11
Arctic can query millions of rows per second per client, achieves ~10x compression on network bandwidth,
12
12
~10x compression on disk, and scales to hundreds of millions of rows per second per
13
-
[MongoDB](https://www.mongodb.org/) instance.
13
+
[MongoDB](https://www.mongodb.org/) instance.
14
+
15
+
Other benefits are:
16
+
* Serializes a number of data types (e.g. pandas DataFrames, NumPy arrays, and Python objects via pickling), so you don't have to handle different data types manually.
17
+
* Uses LZ4 compression by default on the client side, which gives big savings on network bandwidth and disk space.
18
+
* Allows you to version different stages of an object and snapshot the state (in some ways similar to git), so you can experiment freely and then revert to the snapshot. [VersionStore only]
19
+
* Does the chunking (breaking a DataFrame into smaller parts) for you.
20
+
* Adds a concept of users and per-user libraries, which can build on Mongo's auth.
21
+
* Has different types of Stores, each with its own benefits. E.g. VersionStore allows you to version and snapshot objects, TickStore is for storage and highly efficient retrieval of streaming data, and ChunkStore allows you to chunk data and efficiently retrieve ranges of chunks. If nothing suits you, feel free to use vanilla Mongo commands with BSONStore.
22
+
* Restricts data access to Mongo and thus prevents ad hoc queries on unindexed / unsharded collections.
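
To make the chunking bullet above more concrete, here is a minimal sketch of the general idea using only the standard library. It is not Arctic's implementation or API: the helper names `chunk_by_month` and `read_range`, the monthly chunk size, and the sample rows are all assumptions made for illustration.

```python
from datetime import date
from itertools import groupby


def chunk_by_month(rows):
    """Group (date, value) rows into chunks keyed by (year, month)."""
    ordered = sorted(rows, key=lambda row: row[0])
    return {
        key: list(group)
        for key, group in groupby(ordered, key=lambda row: (row[0].year, row[0].month))
    }


def read_range(chunks, start, end):
    """Return rows with start <= date <= end, skipping chunks that cannot overlap."""
    wanted = []
    for key, rows in sorted(chunks.items()):
        if key < (start.year, start.month) or key > (end.year, end.month):
            continue  # whole chunk lies outside the requested range
        wanted.extend(row for row in rows if start <= row[0] <= end)
    return wanted


# Hypothetical sample data, purely for illustration.
rows = [
    (date(2018, 1, 30), 1.0),
    (date(2018, 1, 31), 2.0),
    (date(2018, 2, 1), 3.0),
    (date(2018, 3, 5), 4.0),
]
chunks = chunk_by_month(rows)
february_onwards = read_range(chunks, date(2018, 2, 1), date(2018, 3, 31))
```

Because each chunk is addressed by its key, a range read only has to touch the chunks whose keys can overlap the requested range, which is the property the bullet describes.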
23
+
14
24
15
25
## Differences between VersionStore and TickStore?
16
26
17
-
tickstore is for constant streams of data, version store is for working with data
18
-
(i.e. playing around with it). It keeps versions so you can 'undo' changes and keep
19
-
track of updates.
27
+
TickStore is for tick-style data, generally arriving via streaming; VersionStore is for working interactively with data. VersionStore keeps versions so you can 'undo' changes and keep track of updates.
20
28
21
29
## Which Store should I use?
22
30
23
-
* VersionStore: when ..
24
-
* ChunkStore: when ..
25
-
* TickStore: when ..
31
+
* VersionStore: This is the default Store type. It gives you the ability to version and snapshot your objects while handling serialization, compression, etc. alongside. This is useful because you can experiment with your data and revert to an older state if needed.
32
+
* ChunkStore: Use ChunkStore when you don't care about versioning and want to store DataFrames in user-defined chunks with fast reads.
33
+
* TickStore: Use TickStore when you are storing constant tick data (e.g. buy / sell info from exchanges). This generally plays well with Kafka / other message brokers.
34
+
* BSONStore: Use BSONStore for raw Mongo operations via Arctic. It can be used for storing ad hoc data.
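
The version-and-snapshot behaviour described for VersionStore can be sketched with a small in-memory model. This is only an illustration of the concept: the `VersionedStore` class, its method names, and the sample data are hypothetical and are not Arctic's API.

```python
import copy


class VersionedStore:
    """Hypothetical in-memory stand-in for versioned storage; not Arctic's API."""

    def __init__(self):
        self._versions = {}   # symbol -> list of stored objects (index = version - 1)
        self._snapshots = {}  # snapshot name -> {symbol: version number}

    def write(self, symbol, data):
        """Store a new version of `symbol` and return its version number."""
        history = self._versions.setdefault(symbol, [])
        history.append(copy.deepcopy(data))
        return len(history)

    def read(self, symbol, as_of=None):
        """Read the latest version, a version number, or a named snapshot."""
        history = self._versions[symbol]
        if as_of is None:
            version = len(history)
        elif isinstance(as_of, str):
            version = self._snapshots[as_of][symbol]
        else:
            version = as_of
        return copy.deepcopy(history[version - 1])

    def snapshot(self, name):
        """Record the current version of every symbol under `name`."""
        self._snapshots[name] = {s: len(h) for s, h in self._versions.items()}

    def restore(self, symbol, as_of):
        """Revert by writing the old state as a new version (history is kept)."""
        return self.write(symbol, self.read(symbol, as_of=as_of))


store = VersionedStore()
first = store.write("prices", [1, 2, 3])
store.snapshot("before-experiment")
second = store.write("prices", [1, 2, 3, 99])   # an experiment we want to undo
third = store.restore("prices", "before-experiment")
latest = store.read("prices")
```

In this sketch a revert is recorded as a new version rather than by deleting history, so the experimental state stays readable by its version number.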
26
35
27
36
## Why Mongo?
28
37
@@ -32,4 +41,8 @@ chose Mongo as the backend for Arctic.
32
41
## I'm running Mongo in XXXX setup - what performance should I expect?
33
42
We're constantly asked what the expected performance of Arctic is/should be for given configurations and Mongo cluster setups. It's hard to know for sure given the enormous number of ways Mongo, networks, machines, workstations, etc. can be configured. MongoDB performance tuning is outside the scope of this library, but countless tutorials and examples are available via a quick search of the Internet.
34
43
35
-
... Work in Progress.
44
+
45
+
## Thread safety
46
+
47
+
VersionStore is thread safe, and interrupted operations should never corrupt the data, because the data segments are written first and the pointers to them are written afterwards. In some cases, though, an interrupted write can leave unreferenced data behind.
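
The "segments first, pointer last" ordering can be illustrated with a small in-memory model. This is a sketch of the general pattern under stated assumptions, not VersionStore's code: the two dictionaries stand in for Mongo collections, and `write`, `read`, `orphaned_segments`, and the `fail_before_pointer` flag are hypothetical names used to simulate an interruption.

```python
import uuid

segments = {}   # segment id -> bytes; stands in for the data collection
pointers = {}   # symbol -> list of segment ids; stands in for the version pointer


def write(symbol, payload, segment_size=4, fail_before_pointer=False):
    """Write all segments first, then publish them with one pointer update."""
    ids = []
    for start in range(0, len(payload), segment_size):
        segment_id = uuid.uuid4().hex
        segments[segment_id] = payload[start:start + segment_size]
        ids.append(segment_id)
    if fail_before_pointer:
        # Simulated interruption: segments exist but were never published.
        raise RuntimeError("interrupted before the pointer was written")
    pointers[symbol] = ids  # readers see either the old or the new list


def read(symbol):
    """Reassemble the payload from the segments the pointer references."""
    return b"".join(segments[i] for i in pointers[symbol])


def orphaned_segments():
    """Segments no pointer references, i.e. the leftover data from failed writes."""
    referenced = {i for ids in pointers.values() for i in ids}
    return set(segments) - referenced


write("prices", b"version-one")
try:
    write("prices", b"version-two!", fail_before_pointer=True)
except RuntimeError:
    pass

current = read("prices")
leaked = orphaned_segments()
```

After the interrupted write, readers still see the previous version intact, while the segments of the failed write remain in storage without a pointer to them.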
docs/index.md
+17-5
@@ -2,6 +2,22 @@
2
2
3
3
Arctic is a timeseries / dataframe database that sits atop MongoDB. Arctic supports serialization of a number of datatypes for storage in the mongo document model.
4
4
5
+
## Why use Arctic?
6
+
7
+
Some of the reasons to use Arctic are:
8
+
9
+
* Serializes a number of data types (e.g. pandas DataFrames, NumPy arrays, and Python objects via pickling), so you don't have to handle different data types manually.
10
+
* Uses LZ4 compression by default on the client side, which gives big savings on network bandwidth and disk space.
11
+
* Allows you to version different stages of an object and snapshot the state (in some ways similar to git), so you can experiment freely and then revert to the snapshot. [VersionStore only]
12
+
* Does the chunking (breaking a DataFrame into smaller parts) for you.
13
+
* Adds a concept of users and per-user libraries, which can build on Mongo's auth.
14
+
* Has different types of Stores, each with its own benefits. E.g. VersionStore allows you to version and snapshot objects, TickStore is for storage and highly efficient retrieval of streaming data, and ChunkStore allows you to chunk data and efficiently retrieve ranges of chunks. If nothing suits you, feel free to use vanilla Mongo commands with BSONStore.
15
+
* Restricts data access to Mongo and thus prevents ad hoc queries on unindexed / unsharded collections.
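
As a rough illustration of why compressing on the client side saves both bandwidth and disk space, the sketch below compresses a packed array of floats before it would be sent anywhere. It uses the standard library's `zlib` as a stand-in for LZ4 (which is what the list above says Arctic uses), and the sample values are made up; the ratio it produces says nothing about Arctic's real-world figures.

```python
import struct
import zlib

# Hypothetical, repetitive price-like values; real data will compress differently.
values = [100.0 + (i % 10) * 0.25 for i in range(10_000)]

# Serialize to raw little-endian doubles, as a client might before sending.
raw = struct.pack(f"<{len(values)}d", *values)

# Compress on the client: fewer bytes cross the network and land on disk.
compressed = zlib.compress(raw)

# The reader decompresses and unpacks to get the original values back.
restored = list(struct.unpack(f"<{len(values)}d", zlib.decompress(compressed)))

ratio = len(raw) / len(compressed)
```

The same bytes are stored and transferred, so a single client-side compression step reduces both costs at once.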
16
+
17
+
Head over to the FAQs and James's presentation given below for more details.
18
+
19
+
## Basic Operations
20
+
5
21
Arctic provides a [wrapper](../arctic/arctic.py) for handling connections to Mongo. The `Arctic` class is what actually connects to Arctic.
6
22
7
23
```
@@ -58,11 +74,7 @@ Other basic methods:
58
74
59
75
* `library.list_symbols()`
60
76
- Does what you might expect - lists all the symbols in the given library
- Arctic internally sets quotas on libraries so they do not consume too much space. You can check and set quotas with these two methods. Note these operate on the `Arctic` object, not on libraries