Commit Graph

159 Commits

Author SHA1 Message Date
msramalho
c058bfd067 fixes unique constraint issues for archives containing the same url in archive_urls 2023-12-20 18:38:28 +00:00
msramalho
cff6f713bd aa dep update 2023-12-20 14:28:24 +00:00
Miguel Sozinho Ramalho
cdfe7429f0 Merge pull request #34 from bellingcat/loopback
Bind the web app to the loopback interface
2023-12-18 10:42:20 +00:00
msramalho
23beab0eb8 logging correct emails in sheet_service endpoint 2023-12-17 23:55:26 +00:00
Lilia Kai
f4f378082c Bind the web app to the loopback interface
Making it only accessible over SSL via the nginx proxy.
2023-12-14 16:54:41 +01:00
msramalho
3599ab2c19 dependency updates 2023-12-13 19:06:02 +00:00
msramalho
496a3651e5 detecting already inserted entries 2023-12-13 14:59:51 +00:00
msramalho
74f93ef856 catch cached inserts 2023-12-13 14:28:28 +00:00
msramalho
50417481f4 dep updated 2023-12-13 14:16:15 +00:00
msramalho
7dd0503d90 slight /metrics improvement 2023-12-13 13:46:53 +00:00
msramalho
48272cc8e9 dependencies update 2023-12-13 13:46:41 +00:00
msramalho
b92b8e3f8a auto-archiver dep update 2023-12-13 11:51:23 +00:00
msramalho
0e8864c68e updates auto-archiver 2023-12-13 10:38:38 +00:00
Miguel Sozinho Ramalho
1b7e6602db Merge pull request #33 from bellingcat/allow-query-before-archive 2023-12-13 10:30:36 +00:00
msramalho
99acfb113f most recent first 2023-12-12 22:43:31 +00:00
msramalho
3d4d7979a5 fixes data leak 2023-12-12 22:24:36 +00:00
msramalho
bb4ac31c12 version updated 2023-12-12 19:17:24 +00:00
Miguel Sozinho Ramalho
b0332ca438 Merge pull request #32 from bellingcat/remove-static-file 2023-12-12 19:14:58 +00:00
msramalho
6874d123eb adds logic to test if archive is needed, if specified by the user 2023-12-12 19:14:10 +00:00
Lilia Kai
76c99af48b Remove static file endpoint 2023-12-11 13:43:44 +01:00
msramalho
3ab5477e6c removing tmp log 2023-10-25 15:01:51 +01:00
msramalho
5e0024c726 temp changes 2023-10-25 14:59:25 +01:00
msramalho
7ed54c18d7 fixing sql non-null constraint 2023-10-25 14:51:41 +01:00
Miguel Sozinho Ramalho
099c854a91 Merge pull request #17 from bellingcat/archive 2023-10-17 16:12:38 +01:00
msramalho
e3c128c4fd adds access control to new endpoint 2023-10-17 16:08:35 +01:00
Lilia Kai
d8bb637532 Add db task endpoint 2023-10-16 14:53:08 +02:00
Miguel Sozinho Ramalho
d99ddea9a9 Merge pull request #13 from bellingcat/get_status 2023-09-22 10:30:29 +01:00
msramalho
f017dbe1f2 quick fix author_id 2023-09-20 13:52:14 +01:00
msramalho
c6cd027e13 allows search to happen with API_TOKEN 2023-09-20 11:30:57 +01:00
Lilia Kai
f20dd05928 Refactor get_status and create_archive_task error handling
Raise exceptions instead of returning error messages from the worker in
create_archive_task. This ensures consistency in how the errors are
presented on the task result: the Exception will be the result instead
of *maybe* being wrapped in an object like {error: Exception}.

This lets us simplify error handling in get_status so we have only one
try/except block where the error can be returned to the client.
2023-09-20 11:43:55 +02:00
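The idea in the commit above can be sketched in plain Python. This is a minimal illustration, not the repository's code: `run_task` is a hypothetical stand-in for the task queue, and the bodies of `create_archive_task` and `get_status` are invented for the example. It only shows why raising from the worker lets the status endpoint use a single try/except.

```python
def run_task(fn, *args):
    """Hypothetical task runner: the stored result is either the return
    value or the raised exception itself, never an {error: ...} wrapper."""
    try:
        return {"state": "SUCCESS", "result": fn(*args)}
    except Exception as exc:
        return {"state": "FAILURE", "result": exc}


def create_archive_task(url):
    # Raise instead of returning an error message (illustrative check only).
    if not url.startswith("http"):
        raise ValueError(f"invalid url: {url}")
    return {"url": url, "archived": True}


def get_status(task):
    # One try/except: any failure reaches the client through the same path.
    try:
        if isinstance(task["result"], Exception):
            raise task["result"]
        return {"status": task["state"], "result": task["result"]}
    except Exception as exc:
        return {"status": "FAILURE", "error": str(exc)}


ok = get_status(run_task(create_archive_task, "https://example.com"))
failed = get_status(run_task(create_archive_task, "not-a-url"))
```

Because the worker never returns an error object as a normal value, the caller does not have to inspect the shape of a successful result to find out whether it is really a failure.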
Lilia Kai
00201770ba Create archive task returns dict instead of string
This will save the task result in redis as a json object instead of a
json-encoded string. This makes for a nicer response from get_status and
prevents the client having to parse a json string to work with the
result.
2023-09-20 11:43:55 +02:00
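The difference described in the commit above can be seen with the standard `json` module alone. This is a sketch under assumptions: the `task_result` payload and the `respond` helper are hypothetical and do not come from the repository.

```python
import json

# Hypothetical task result, for illustration only.
task_result = {"status": "ok", "url": "https://example.com"}


def respond(result):
    """Stand-in for serializing a status response that embeds the task result."""
    return json.dumps({"result": result})


# Before: the task returned a JSON-encoded string, so it gets encoded twice.
as_string = respond(json.dumps(task_result))
# After: the task returns a dict, so the response nests a real JSON object.
as_dict = respond(task_result)

# The client must parse twice in the first case and only once in the second.
parsed_string = json.loads(as_string)["result"]
parsed_dict = json.loads(as_dict)["result"]
```

With the string variant, `parsed_string` is still text and needs a second `json.loads` before any field can be read; with the dict variant, `parsed_dict["status"]` works directly.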
msramalho
f7160aad91 updating auto-archiver dependency 2023-09-20 10:24:24 +01:00
Lilia Kai
1b39f2c291 Rename variables in get_status
There are no logic changes in this commit, just renamed variables so
that fewer things are called "result" which seemed confusing.

Instead of result.result = task_result.result,
we can say response.result = task.result
2023-09-20 11:01:00 +02:00
Miguel Sozinho Ramalho
ce4cd6c59c Merge pull request #11 from bellingcat/wacz-dood 2023-09-20 09:17:46 +01:00
msramalho
72c9374639 updates aa to latest version 2023-09-20 09:13:29 +01:00
msramalho
ceb5c9764d updates aa to latest version 2023-09-15 20:20:32 +01:00
Lilia Kai
9aea7d561b Fix value for BROWSERTRIX_HOME_HOST 2023-09-15 10:33:28 +02:00
msramalho
8ace0ff1bd introduces makefile 2023-09-15 01:05:25 +01:00
msramalho
9443d73d20 simplifies docker compose 2023-09-15 01:05:01 +01:00
msramalho
fc01ba1194 updates auto-archiver dependency 2023-09-15 01:04:52 +01:00
msramalho
bbab27ff3c simplifies dev compose and removes bad redis command - was no longer working since other containers try to use credentials 2023-09-15 01:04:37 +01:00
Lilia Kai
67e97d9d5f Rename the shared volume to crawls 2023-09-14 12:46:20 +02:00
Lilia Kai
65bb479218 New environment variables 2023-09-14 12:46:03 +02:00
Lilia Kai
8e4801f3d3 Run browsertrix in docker on the host
Install docker in the container

Add a named volume called `browsertrix`

Mount the named volume in the worker at /crawls

Expose the host docker socket

Override the environment variable from auto-archiver's Dockerfile so
that it will call docker.

This will require setting new configs in orchestration.yaml:

 wacz_archiver_enricher:
  browsertrix_home: auto-archiver-api_browsertrix
  wacz_collections: /crawls
2023-09-12 20:37:25 +02:00
Miguel Sozinho Ramalho
b26bac6ba4 Merge pull request #10 from bellingcat/dev-compose 2023-09-06 16:26:20 +01:00
Lilia Kai
43144330a7 Move dev configs to their own file 2023-09-06 15:01:46 +02:00
Lilia Kai
91762f58b7 Add option to serve local archive files
Set an environment variable in the docker compose file, then reference
that variable in main.py to mount the local archive so that the links
generated by auto-archiver will work correctly. Fixes #8
2023-09-05 16:10:37 +02:00
Lilia Kai
9b622d1393 Update src/.example.env
Removes some configs that are no longer used and adds some that are.
2023-09-04 19:29:49 +02:00
Lilia Kai
3b46554aa1 Fix get_user_first_group for user with no groups
If the email is defined in user-groups.yaml but has no groups, groups is
assigned None and len(groups) throws an exception.

Intuitively, one would expect groups to default to [] rather than None
because [] is passed as the second argument to dict.get, but this
default only applies if the key is not found in the dictionary. In this
case the key is defined but has a value of None.
2023-08-31 20:56:48 +02:00
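The pitfall described in the commit above is easy to reproduce. The sketch below is illustrative only: the `user_groups` mapping and both helper functions are hypothetical, and the actual fix in the repository may be written differently.

```python
# Hypothetical data mirroring a user-groups.yaml entry that lists no groups:
# the key exists, but its value is None.
user_groups = {"alice@example.com": None, "bob@example.com": ["team-a"]}


def first_group_buggy(email):
    # The [] default is used only when the key is missing, not when it maps to None.
    groups = user_groups.get(email, [])
    return groups[0] if len(groups) else None  # len(None) raises TypeError


def first_group_fixed(email):
    # `or []` also covers a key that is present with a None value.
    groups = user_groups.get(email) or []
    return groups[0] if groups else None


try:
    first_group_buggy("alice@example.com")
    buggy_raised = False
except TypeError:
    buggy_raised = True
```

Here `first_group_fixed` returns `None` both for a user with no groups and for an unknown email, while the buggy variant fails only in the first case, which is what makes the bug easy to miss.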
msramalho
ce1599b160 wacz working in docker 2023-08-24 17:44:37 +01:00