update readme

This commit is contained in:
Lachlan Kermode
2020-06-18 11:04:30 +02:00
parent f43f0f322c
commit e7718f18c7
3 changed files with 43 additions and 62 deletions

View File

@@ -36,45 +36,6 @@ Querying data directly from spreadsheets is brittle, as it relies on the mainten
To see how to get a local instance of datasheet server running in practice, see [this wiki](https://github.com/forensic-architecture/timemap/wiki/running-timemap-and-datasheet-server-locally) explaining how to use it to feed data from a Google Sheet to a local instance of [Timemap](https://github.com/forensic-architecture/timemap).
### Design Concepts
The codebase currently only supports Google Sheets as a source, and a REST-like format as a query language. It is designed, however, with extensibility in mind.
**Sources**
A source represents a sheet-like collection of data, such as a Google Sheet. A source has one or more **tabs**, each of which contains a 2-dimensional grids of cells. Each cell contains a body of text (a string).
**Resources**
The data from sources are made available as resources, which are structured blocks of data that are granularly accessible through a query language. Resources are the outfacing aspect of Datasheet Server, and represent the only kind of data that can be queried by applications. Each resource is configured with one or more **query languages**. (Currently only a REST-like query language supported.)
**Blueprints**
Blueprints are a data structure that represent the way that infromation from **sources** are to be turned into **resources**. For each tab in a source, there is a corresponding Blueprint. Blueprints are created through a [blueprinter function](/src/blueprinters) invoked on the raw data from a source tab.
Blueprints are JSON objects. There have two forms:
1. _desaturated_ -- describes the resources and query languages available on data from a source tab.
2. _saturated_ -- both describes resources available on data from a source etab, and contains that data.
A desaturated Blueprint can be saturated by retrieving its data from the server's **model layer**, which stores tab data from sources.
A JSON catalogue of the available blueprints (desaturated) in a server is available at `/api/blueprints`.
## [Configuration](#configuration)
Copy the example environment file:
```
cp .env.example .env
```
Inside this file, you will need to modify at least the `SERVICE_ACCOUNT_EMAIL` and `SERVICE_ACCOUNT_PRIVATE_KEY` fields. These fields refer to the credentials of a [Google service account](https://cloud.google.com/iam/docs/understanding-service-accounts). Google requires that developers create these when attempting to access their services programmatically, so that they can attribute requests to users. Service accounts also contain identity information, which means that asset owners can allow certain service accounts access to certain sheets, just as one might differentially grant certain users access to certain cloud assets.
Once you have [created a service account](https://support.google.com/a/answer/7378726?hl=en), create and download an [API key for that account](https://cloud.google.com/iam/docs/creating-managing-service-account-keys). The JSON file for the API key that you download when you create it contains both a service account private key, and an email associated with the service account: add these respectively in the strings in .env for `SERVICE_ACCOUNT_PRIVATE_KEY` and `SERVICE_ACCOUNT_EMAIL`.
The last thing to do is to grant the service account access to the sheet that
datasheet-server is pulling from. You can add a service account to a sheet as
you would any other Google user: just enter the email address associated. (Note
that this step is not necessary if you are accessing a publicly available
sheet.)
Other configuration options, such as the port at which the server will expose
resources, are also modifiable from the .env file.
#### [Sources](#sources)
Sources are specified in [src/config.js](https://github.com/forensic-architecture/datasheet-server/blob/develop/src/config.js). Datasheet server currently only supports Google Sheets as a source.
@@ -86,31 +47,10 @@ Each Google Sheet being used as a as source requires a corresponding object in `
| Option | Description | Type |
| ------ | ----------- | ---- |
| name | Used to refer to data served from this source | string |
| id | The ID of the sheet in Google. (You can find it in the address bar when the Sheet is open in a browser. It is the string that follows 'spreadsheets/d/'). | string |
| path/id | The path to the XLSX sheet on your local, or the ID of the sheet in Google. (You can find it in the address bar when the Sheet is open in a browser. It is the string that follows 'spreadsheets/d/'). | string |
| tabs | An object that maps each tab in the source to one or more Blueprinters. All of the Blueprinters in the [blueprinters folder](/lib/blueprinters) are available through a single import as at the top of [example.config.js](/src/example.config.js). <br> To correctly associate a Blueprinter, the object key needs to be _the tab name with all lowercase letters, and spaces replaced by a '-'_. For example, if the tab name in Google Sheets is 'Info About SHEEP', the object key should be 'info-about-sheep'. <br> The value should be the Blueprinter function that you want to use for the data in that tab. If you require more than one endpoint for a single tab, you can support multiple blueprinters by making the item an array. See the example of a configuration object below. <br>TODO: no Blueprinter is used by default. | object |
###### Example Configuration Object
```js
import BP from './lib/blueprinters'
export default {
googleSheets: {
email: 'project-name@reliable-baptist-23338.iam.gserviceaccount.com',
privateKey: 'SOME_PRIVATE_KEY',
sheets: [
{
name: 'example',
id: '1s-vfBR8Uy-B-TLO_C5Ozw4z-L0E3hdP8ohMV761ouRI',
tabs: {
'objects': BP.rows,
'fruit': [BP.columns, BP.ids],
}
},
]
}
}
```
See src/config.js for an example configuration sheet.
## [Quickstart](#quickstart)

22
docs/design-concepts.md Normal file
View File

@@ -0,0 +1,22 @@
### Design Concepts
The codebase currently only supports Google Sheets as a source, and a REST-like format as a query language. It is designed, however, with extensibility in mind.
**Sources**
A source represents a sheet-like collection of data, such as a Google Sheet. A source has one or more **tabs**, each of which contains a 2-dimensional grids of cells. Each cell contains a body of text (a string).
**Resources**
The data from sources are made available as resources, which are structured blocks of data that are granularly accessible through a query language. Resources are the outfacing aspect of Datasheet Server, and represent the only kind of data that can be queried by applications. Each resource is configured with one or more **query languages**. (Currently only a REST-like query language supported.)
**Blueprints**
Blueprints are a data structure that represent the way that infromation from **sources** are to be turned into **resources**. For each tab in a source, there is a corresponding Blueprint. Blueprints are created through a [blueprinter function](/src/blueprinters) invoked on the raw data from a source tab.
Blueprints are JSON objects. There have two forms:
1. _desaturated_ -- describes the resources and query languages available on data from a source tab.
2. _saturated_ -- both describes resources available on data from a source etab, and contains that data.
A desaturated Blueprint can be saturated by retrieving its data from the server's **model layer**, which stores tab data from sources.
A JSON catalogue of the available blueprints (desaturated) in a server is available at `/api/blueprints`.

19
docs/gsheet-config.md Normal file
View File

@@ -0,0 +1,19 @@
## [Configuration](#configuration)
Copy the example environment file:
```
cp .env.example .env
```
Inside this file, you will need to modify at least the `SERVICE_ACCOUNT_EMAIL` and `SERVICE_ACCOUNT_PRIVATE_KEY` fields. These fields refer to the credentials of a [Google service account](https://cloud.google.com/iam/docs/understanding-service-accounts). Google requires that developers create these when attempting to access their services programmatically, so that they can attribute requests to users. Service accounts also contain identity information, which means that asset owners can allow certain service accounts access to certain sheets, just as one might differentially grant certain users access to certain cloud assets.
Once you have [created a service account](https://support.google.com/a/answer/7378726?hl=en), create and download an [API key for that account](https://cloud.google.com/iam/docs/creating-managing-service-account-keys). The JSON file for the API key that you download when you create it contains both a service account private key, and an email associated with the service account: add these respectively in the strings in .env for `SERVICE_ACCOUNT_PRIVATE_KEY` and `SERVICE_ACCOUNT_EMAIL`.
The last thing to do is to grant the service account access to the sheet that
datasheet-server is pulling from. You can add a service account to a sheet as
you would any other Google user: just enter the email address associated. (Note
that this step is not necessary if you are accessing a publicly available
sheet.)
Other configuration options, such as the port at which the server will expose
resources, are also modifiable from the .env file.