forked from bellwether/minerva
added readme
parent
0f1018c7d5
commit
9aae669ca2
2 changed files with 46 additions and 0 deletions
44
README.md
Normal file
@@ -0,0 +1,44 @@
# Minerva
Minerva is the Roman equivalent of Athena, and Athena is AWS's query service
that stores its results in S3.

To ease programmatic access to Athena and offer blocking access (so that your
code waits for the result), I wrote `minerva`.

The results are returned as pyarrow datasets (with parquet files as the
underlying structure).

# Basic Usage
```
import access as a

# Connect using the "hay" AWS profile; results are written to this S3 location
athena = a.Athena("hay", "s3://haystac-pmo-athena/")

# Start the query (non-blocking)
query = athena.query('select * from "trajectories"."kitware" limit 10')

# Wait for the query to finish and fetch the results (blocking)
data = query.results()
print(data.head(10))
```

First, a connection to Athena is made. The first argument is the name of an AWS
profile in `~/.aws/credentials`. The second argument is the S3 location where
the results will be stored.

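For reference, a profile in `~/.aws/credentials` is an INI-style section such as `[hay]`. The sketch below uses only the standard library to check that a profile exists before its name is passed to `Athena`. The helper `profile_exists` is hypothetical and is not part of `minerva`.

```python
import configparser
from pathlib import Path


def profile_exists(profile, credentials_path=None):
    """Return True if `profile` is a section in an AWS credentials file.

    Hypothetical helper for illustration; not part of minerva.
    """
    if credentials_path is None:
        path = Path.home() / ".aws" / "credentials"
    else:
        path = Path(credentials_path)
    # Disable interpolation so '%' characters in secrets are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file simply yields no sections
    return parser.has_section(profile)
```

Calling `profile_exists("hay")` before connecting gives an early, readable failure instead of an error from AWS later on.
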
In the second substantive line, a SQL query is submitted. This is
**non-blocking**: the query runs on AWS while your code is free to do other
work.

In the third line, the results are requested. This is **blocking**, so the code
will wait here (checking with AWS every 5 seconds) until the results are ready.
Then, the results are downloaded to `/tmp/` and lazily interpreted as parquet
files in the form of a `pyarrow.dataset.Dataset`.

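The blocking wait described above can be sketched as a simple polling loop. This is an illustration under assumptions, not `minerva`'s actual implementation: `wait_for_query` and its `get_state` argument are hypothetical names, and the status lookup is passed in as a function so the sketch runs without AWS. The state names are the ones Athena reports for a query execution.

```python
import time

# Athena's terminal query states; QUEUED and RUNNING mean "keep waiting".
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(get_state, interval=5.0, sleep=time.sleep):
    """Block until `get_state()` reports a terminal state, then return it.

    `get_state` is a caller-supplied function (hypothetical here). With boto3
    it could wrap client.get_query_execution(QueryExecutionId=...) and read
    ["QueryExecution"]["Status"]["State"] from the response.
    """
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        sleep(interval)  # wait before asking AWS again


# Usage with a stand-in for AWS: two pending polls, then success.
states = iter(["QUEUED", "RUNNING", "SUCCEEDED"])
final = wait_for_query(lambda: next(states), interval=0)
print(final)
```

Returning the final state (rather than only returning on success) lets the caller decide how to handle a failed or cancelled query.
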
# Returning Scalar Values
In SQL, a scalar result such as `count(*)` is returned in an unnamed column,
and Athena doesn't like that. Thus, you have to give the column a name with
`as`.

```
data = athena.query('select count(*) as my_col from "trajectories"."kitware"').results()
print(data.head(1))
```

# TODO
* parallelize the downloading of files
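
One possible shape for this TODO item, as a sketch only: since downloads are I/O-bound, a thread pool from the standard library is a natural fit. The names `download_all` and `fake_download` are hypothetical, and the per-file download is passed in as a function so the example runs without AWS.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(keys, download_one, max_workers=8):
    """Run `download_one(key)` for every key on a thread pool.

    Results come back in the same order as `keys`. Any exception raised
    by a download is re-raised here when the results are collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(download_one, keys))


# Stand-in downloader; real code might call boto3's download_file instead.
def fake_download(key):
    return "/tmp/" + key.rsplit("/", 1)[-1]


paths = download_all(["results/a.parquet", "results/b.parquet"], fake_download)
print(paths)
```
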