mirror of
https://github.com/bellingcat/gesara-entity-viz.git
synced 2026-06-07 19:18:32 +03:00
changed bbox of network, added variable in json data to set label threshold, changed to British spelling convention
README.md (14 changes)
@@ -1,22 +1,22 @@
-# GESARA Named Entity Network Visualization
+# GESARA Named Entity Network Visualisation
 
-This project generates an [interactive visualization](https://bellingcat.github.io/gesara-entity-viz/) of [named entities](https://spacy.io/usage/linguistic-features#named-entities) in English-language posts archived in a database of Telegram channels that have posted about the GESARA conspiracy theory.
+This project generates an [interactive visualisation](https://bellingcat.github.io/gesara-entity-viz/) of [named entities](https://spacy.io/usage/linguistic-features#named-entities) in English-language posts archived in a database of Telegram channels that have posted about the GESARA conspiracy theory.
 
-This visualization was developed by Bellingcat based on an excellent [Sigma.js demo](https://github.com/jacomyal/sigma.js/tree/main/demo), and uses [react-sigma-v2](https://github.com/sim51/react-sigma-v2) to interface sigma.js with React.
+This visualisation was developed by Bellingcat based on an excellent [Sigma.js demo](https://github.com/jacomyal/sigma.js/tree/main/demo), and uses [react-sigma-v2](https://github.com/sim51/react-sigma-v2) to interface sigma.js with React.
 
-You can view the live visualization [here](https://bellingcat.github.io/gesara-entity-viz/). With GitHub pages configured, after making changes to the `main` branch, you need th run the command `npm run deploy` for the latest changes to be reflected in the live visualization.
+You can view the live visualisation [here](https://bellingcat.github.io/gesara-entity-viz/). With GitHub pages configured, after making changes to the `main` branch, you need to run the command `npm run deploy` for the latest changes to be reflected in the live visualisation.
 
 ## Python Scripts
 
-In the `scripts/` subdirectory, you can run Python scripts that were used to generate the network and visualization:
+In the `scripts/` subdirectory, you can run Python scripts that were used to generate the network and visualisation:
 
 ### `generate_network.py`
 
 Extracts the data from a PostgreSQL database, cleans the entity data, generates a NetworkX graph, prunes the edges using the [Marginal Likelihood Filter](https://github.com/naviddianati/GraphPruning), and exports the pruned graph.
 
-### `generate_visualization.py`
+### `generate_visualisation.py`
 
-After visualizing the network using [Gephi](https://gephi.org/) (using the Force Atlas 2 algorithm, with the "LinLog mode" and "Prevent Overlap" options enabled, and exporting as the file `entity_network_layout.graphml`), this script converts the node, edge, and cluster data into a format readable by this sigma.js project.
+After visualising the network using [Gephi](https://gephi.org/) (using the Force Atlas 2 algorithm, with the "LinLog mode" and "Prevent Overlap" options enabled, and exporting as the file `entity_network_layout.graphml`), this script converts the node, edge, and cluster data into a format readable by this sigma.js project.
 
 ## NPM Scripts
 
File diff suppressed because one or more lines are too long
@@ -12,7 +12,7 @@ COLORS = colorcet.glasbey_dark
 
 OUTPUT_JSON = "../public/dataset_entities.json"
 
-NODE_SCALING = 0.5
+NODE_SCALING = 0.35
 # GraphML file generated by Gephi
 INPUT_GRAPHML = "data/entity_network_layout.graphml"
 CLUSTERS = [
@@ -46,7 +46,9 @@ CLUSTERS = [
 {"key": "37", "clusterLabel": "Payment platforms"},
 {"key": "42", "clusterLabel": "Vote audit"},
 ]
-BOUNDING_BOX = {"x": [-300, 400], "y": [-600, 150]}
+BOUNDING_BOX = {"x": [100, 200], "y": [-370,-50]}
+
+LABEL_THRESHOLD = 15
 
 
 if __name__ == "__main__":
@@ -84,6 +86,7 @@ if __name__ == "__main__":
 ]
 + [{"key": "100", "clusterLabel": "Other", "color": "#999999"}],
 "bbox": BOUNDING_BOX,
+'labelThreshold': LABEL_THRESHOLD
 }
 
 with open(OUTPUT_JSON, "w") as f:
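The hunks above add two top-level keys, `bbox` and `labelThreshold`, to the JSON written to `OUTPUT_JSON`. A minimal sketch of that export step follows, assuming the payload is a plain dict. `build_dataset` and the sample node, edge, and cluster values are hypothetical; the real script derives them from the GraphML file.

```python
import json

# Constants as introduced in the diff above.
BOUNDING_BOX = {"x": [100, 200], "y": [-370, -50]}
LABEL_THRESHOLD = 15


def build_dataset(nodes, edges, clusters):
    """Assemble the JSON payload (hypothetical helper, not part of the commit)."""
    return {
        "nodes": nodes,
        "edges": edges,
        # The catch-all "Other" cluster is appended, as in the diff.
        "clusters": clusters
        + [{"key": "100", "clusterLabel": "Other", "color": "#999999"}],
        "bbox": BOUNDING_BOX,
        "labelThreshold": LABEL_THRESHOLD,
    }


if __name__ == "__main__":
    # Illustrative values only; the field names inside each node are assumptions.
    dataset = build_dataset(
        nodes=[{"key": "a", "label": "Entity A"}, {"key": "b", "label": "Entity B"}],
        edges=[["a", "b"]],
        clusters=[{"key": "37", "clusterLabel": "Payment platforms"}],
    )
    print(json.dumps(dataset, indent=2))
```

Keeping both values in the dataset means the front end can be retuned by regenerating the JSON, without touching the TypeScript sources.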
@@ -19,7 +19,8 @@ export interface Dataset {
 nodes: NodeData[];
 edges: [string, string][];
 clusters: Cluster[];
-bbox: {'x': Extent, 'y': Extent}
+bbox: {'x': Extent, 'y': Extent},
+labelThreshold: number
 }
 
 export interface FiltersState {
@@ -14,18 +14,17 @@ const DescriptionPanel: FC = () => {
 }
 >
 <p>
-This visualisation represents a <i>network</i> of{" "}
+This interactive visualisation represents a <i>network</i> of{" "}
 <a target="_blank" rel="noreferrer" href="https://spacy.io/usage/linguistic-features#named-entities">
 named entities
 </a> in English-language posts archived in a database of Telegram channels that have posted about the GESARA conspiracy theory. Each{" "}
-<i>node</i> represents an entity, <i>edges</i> between nodes indicate that one or more posts contain both entities
-.
+<i>node</i> represents an entity, <i>edges</i> between nodes indicate that one or more posts contain both entities.
 </p>
 <p>
-This kind of visualization shows the ecosystem of the people, organizations, and ideas these conspiracy Telegram channels talk about, as well as the connections between them.
+This kind of visualisation shows the ecosystem of the people, organisations, and ideas these conspiracy Telegram channels talk about, as well as the connections between them.
 </p>
 <p>
-Some social media channels were identified by researchers from{" "}
+Some Telegram channels were identified by researchers from{" "}
 <a target="_blank" rel="noreferrer" href="https://www.bellingcat.com/">
 Bellingcat
 </a>{" "}and{" "}
@@ -59,14 +58,13 @@ const DescriptionPanel: FC = () => {
 .
 </p>
 <p>
-The network was visualized using{" "}
+The network was visualised using{" "}
 <a target="_blank" rel="noreferrer" href="https://gephi.org/">
 Gephi
-</a>. Node sizes are related to the number of channels the entity was posted about in the database.
-Nodes are colored based a{" "}
+</a>. The radius of each node is proportional to the number of channels in the database whose posts mention the entity. Nodes are coloured based on a{" "}
 <a target="_blank" rel="noreferrer" href="https://arxiv.org/abs/0803.0476">
 community detection algorithm
-</a>.
+</a>. Edges are weighted by the number of posts that mention both entities.
 For visualisation purposes, edges were pruned using the{" "}
 <a target="_blank" rel="noreferrer" href="https://github.com/naviddianati/GraphPruning">
 Marginal Likelihood Filter
@@ -53,7 +53,7 @@ const Root: FC = () => {
 defaultNodeType: "image",
 labelDensity: 0.07,
 labelGridCellSize: 60,
-labelRenderedSizeThreshold: 10,
+labelRenderedSizeThreshold: dataset.labelThreshold,
 labelFont: "Lato, sans-serif",
 zIndex: true,
 }}
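With this change the label cutoff comes from the dataset rather than a hard-coded `10`. As a rough illustration of what the setting controls, the sketch below assumes sigma.js draws a label only when the node's on-screen size is at least `labelRenderedSizeThreshold`, and it ignores the label grid and density settings. `visible_labels` and the sample sizes are hypothetical and not part of the project.

```python
def visible_labels(rendered_sizes, threshold):
    """Return the keys of nodes whose on-screen size reaches the threshold.

    Simplified model of the cutoff only; assumes labels are shown when
    size >= threshold and ignores sigma's label grid and density logic.
    """
    return [key for key, size in rendered_sizes.items() if size >= threshold]


if __name__ == "__main__":
    # Hypothetical rendered sizes in pixels.
    sizes = {"small": 8.0, "medium": 12.0, "large": 20.0}
    print(visible_labels(sizes, 10))  # previous hard-coded value -> ['medium', 'large']
    print(visible_labels(sizes, 15))  # LABEL_THRESHOLD from this commit -> ['large']
```

Under this model, raising the threshold from 10 to 15 hides labels on mid-sized nodes until the user zooms in.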