Commit Graph

89 Commits

Author SHA1 Message Date
msramalho
3ab5477e6c removing tmp log 2023-10-25 15:01:51 +01:00
msramalho
5e0024c726 temp changes 2023-10-25 14:59:25 +01:00
msramalho
7ed54c18d7 fixing sql non-null constraint 2023-10-25 14:51:41 +01:00
Miguel Sozinho Ramalho
099c854a91 Merge pull request #17 from bellingcat/archive 2023-10-17 16:12:38 +01:00
msramalho
e3c128c4fd adds access control to new endpoint 2023-10-17 16:08:35 +01:00
Lilia Kai
d8bb637532 Add db task endpoint 2023-10-16 14:53:08 +02:00
Miguel Sozinho Ramalho
d99ddea9a9 Merge pull request #13 from bellingcat/get_status 2023-09-22 10:30:29 +01:00
msramalho
f017dbe1f2 quick fix author_id 2023-09-20 13:52:14 +01:00
msramalho
c6cd027e13 allows search to happen with API_TOKEN 2023-09-20 11:30:57 +01:00
Lilia Kai
f20dd05928 Refactor get_status and create_archive_task error handling
Raise exceptions instead of returning error messages from the worker in
create_archive_task. This ensures consistency in how the errors are
presented on the task result: the Exception will be the result instead
of *maybe* being wrapped in an object like {error: Exception}.

This lets us simplify error handling in get_status so we have only one
try/except block where the error can be returned to the client.
2023-09-20 11:43:55 +02:00
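To illustrate the error-handling change described in f20dd05928, here is a minimal sketch. It does not reproduce the repository's code: `run_task` is a hypothetical stand-in for the task queue, and `archive_returning_error`, `archive_raising`, and this simplified `get_status` are illustrative names only. It shows why raising gives one consistent failure shape, while returning an error object leaves the task looking successful.

```python
def run_task(fn, *args):
    """Hypothetical stand-in for a task queue: records the return value or the raised exception."""
    try:
        return {"state": "SUCCESS", "result": fn(*args)}
    except Exception as e:
        return {"state": "FAILURE", "result": e}


def archive_returning_error(url):
    # Old style (assumed): the error is wrapped in a normal return value.
    if not url.startswith("http"):
        return {"error": ValueError("invalid url")}
    return {"url": url}


def archive_raising(url):
    # New style: the exception itself becomes the task result.
    if not url.startswith("http"):
        raise ValueError("invalid url")
    return {"url": url}


def get_status(task):
    # A single try/except is enough once failures are always exceptions.
    try:
        if isinstance(task["result"], Exception):
            raise task["result"]
        return {"status": task["state"], "result": task["result"]}
    except Exception as e:
        return {"status": "FAILURE", "error": str(e)}
```

With the old style, `run_task(archive_returning_error, "ftp://x")` reports `SUCCESS` even though the work failed, so the caller would need a second check for an `error` key.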
Lilia Kai
00201770ba Create archive task returns dict instead of string
This will save the task result in redis as a json object instead of a
json-encoded string. This makes for a nicer response from get_status and
prevents the client having to parse a json string to work with the
result.
2023-09-20 11:43:55 +02:00
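The difference described in 00201770ba can be sketched as follows, assuming the result backend JSON-encodes whatever the task returns. The `result` payload is made up for illustration and is not the project's actual schema.

```python
import json

# Hypothetical task result.
result = {"url": "https://example.com", "status": "archived"}

# Task returns a pre-encoded string: the backend encodes that string again.
stored_from_string = json.dumps(json.dumps(result))

# Task returns a dict: the backend encodes it once.
stored_from_dict = json.dumps(result)

# What a client sees after decoding the stored value once.
client_view_string = json.loads(stored_from_string)  # still a str, needs a second json.loads
client_view_dict = json.loads(stored_from_dict)      # already a dict

assert isinstance(client_view_string, str)
assert client_view_dict["status"] == "archived"
```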
msramalho
f7160aad91 updating auto-archiver dependency 2023-09-20 10:24:24 +01:00
Lilia Kai
1b39f2c291 Rename variables in get_status
There are no logic changes in this commit, just renamed variables so
that fewer things are called "result" which seemed confusing.

Instead of result.result = task_result.result,
we can say response.result = task.result
2023-09-20 11:01:00 +02:00
Miguel Sozinho Ramalho
ce4cd6c59c Merge pull request #11 from bellingcat/wacz-dood 2023-09-20 09:17:46 +01:00
msramalho
72c9374639 updates aa to latest version 2023-09-20 09:13:29 +01:00
msramalho
ceb5c9764d updates aa to latest version 2023-09-15 20:20:32 +01:00
Lilia Kai
9aea7d561b Fix value for BROWSERTRIX_HOME_HOST 2023-09-15 10:33:28 +02:00
msramalho
8ace0ff1bd introduces makefile 2023-09-15 01:05:25 +01:00
msramalho
9443d73d20 simplifies docker compose 2023-09-15 01:05:01 +01:00
msramalho
fc01ba1194 updates auto-archiver dependency 2023-09-15 01:04:52 +01:00
msramalho
bbab27ff3c simplifies dev compose and removes bad redis command - was no longer working since other containers try to use credentials 2023-09-15 01:04:37 +01:00
Lilia Kai
67e97d9d5f Rename the shared volume to crawls 2023-09-14 12:46:20 +02:00
Lilia Kai
65bb479218 New environment variables 2023-09-14 12:46:03 +02:00
Lilia Kai
8e4801f3d3 Run browsertrix in docker on the host
Install docker in the container

Add a named volume called `browsertrix`

Mount the named volume in the worker at /crawls

Expose the host docker socket

Override the environment variable from auto-archiver's Dockerfile so
that it will call docker.

This will require setting new configs in orchestration.yaml:

 wacz_archiver_enricher:
  browsertrix_home: auto-archiver-api_browsertrix
  wacz_collections: /crawls
2023-09-12 20:37:25 +02:00
Miguel Sozinho Ramalho
b26bac6ba4 Merge pull request #10 from bellingcat/dev-compose 2023-09-06 16:26:20 +01:00
Lilia Kai
43144330a7 Move dev configs to their own file 2023-09-06 15:01:46 +02:00
Lilia Kai
91762f58b7 Add option to serve local archive files
Set an environment variable in the docker compose file, then reference
that variable in main.py to mount the local archive so that the links
generated by auto-archiver will work correctly. Fixes #8
2023-09-05 16:10:37 +02:00
Lilia Kai
9b622d1393 Update src/.example.env
Removes some configs that are no longer used and adds some that are.
2023-09-04 19:29:49 +02:00
Lilia Kai
3b46554aa1 Fix get_user_first_group for user with no groups
If the email is defined in user-groups.yaml but has no groups, groups is
assigned None and len(groups) throws an exception.

Intuitively, one would expect groups to default to [] rather than None
because [] is passed as the second argument to Dictionary.get, but this
default only applies if the key is not found in the dictionary. In this
case the key is defined but has a value of None.
2023-08-31 20:56:48 +02:00
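The `get` behaviour described in 3b46554aa1 can be reproduced with a small sketch. The `user_groups` data and both helper functions are hypothetical and are not taken from the repository; the `or []` coalescing is one possible fix, not necessarily the one the commit applied.

```python
# Hypothetical data mirroring a user-groups.yaml entry whose key exists but has no groups.
user_groups = {"alice@example.com": None, "bob@example.com": ["editors"]}


def first_group_naive(email):
    # The [] default is ignored for alice: her key exists, so get() returns None.
    groups = user_groups.get(email, [])
    return groups[0] if len(groups) else None  # len(None) raises TypeError


def first_group_safe(email):
    # "or []" covers both a missing key and a key whose value is None.
    groups = user_groups.get(email) or []
    return groups[0] if groups else None
```

Calling `first_group_naive("alice@example.com")` raises `TypeError`, while `first_group_safe` returns `None` for the same input.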
msramalho
ce1599b160 wacz working in docker 2023-08-24 17:44:37 +01:00
msramalho
03164b9ede version updates 2023-08-18 21:33:45 +01:00
msramalho
75b42c0f33 filter by date archived before/after 2023-08-18 16:15:06 +01:00
msramalho
36387de435 fixing volume location 2023-08-07 15:09:32 +01:00
msramalho
55dc977bfa removing duplicate env var from dockerfile 2023-07-28 16:53:34 +01:00
msramalho
4741638c33 wacz working within docker 2023-07-28 16:01:45 +01:00
msramalho
c1d76fae81 missing reqs 2023-07-28 14:46:41 +01:00
msramalho
ee2db3f950 archiver-api updates 2023-07-28 13:55:35 +01:00
msramalho
6b9f5149e8 ensuring email is lowercase 2023-07-24 16:23:38 +01:00
msramalho
8c6ff8cb91 version bump 2023-07-11 15:44:13 +01:00
msramalho
344cc8d2bd fix: group permissions 2023-07-11 15:42:44 +01:00
msramalho
409eb07b44 fix: update aa version 2023-07-11 12:32:37 +01:00
msramalho
fafe821432 pulling twitter scraper fix 2023-07-02 19:00:24 +02:00
msramalho
707b19b4fa feat: email domain-level access 2023-06-27 14:50:13 +01:00
msramalho
dd70b7a908 updates auto-archiver version 2023-06-26 18:59:05 +01:00
Logan Williams
46c487be5d Restore sheet_service endpoint 2023-06-06 18:44:45 +00:00
msramalho
88be84127a adds /metrics 2023-05-26 13:15:36 +01:00
msramalho
7248e36309 method rename 2023-05-26 11:17:16 +01:00
msramalho
ced06e3a45 fixes submit parsing 2023-05-25 13:35:59 +01:00
msramalho
66e81d48eb feat: process thumbnails and insert missing users 2023-05-25 11:50:38 +01:00
msramalho
d79fe0a27c feat: submit-url endpoint ready 2023-05-24 19:05:44 +01:00