sugartrail/notebooks/004_connection_check.ipynb
2023-01-31 13:37:20 +00:00

{
"cells": [
{
"cell_type": "markdown",
"id": "b7641405",
"metadata": {},
"source": [
"*In this tutorial we will investigate two separate companies and check whether they are connected.*"
]
},
{
"cell_type": "markdown",
"id": "e39bd44d",
"metadata": {},
"source": [
"Sometimes we want to know whether two companies are connected. We can check this by building a network for each company and comparing the two networks for common officers, addresses or companies.\n",
"\n",
"Let's test this approach with two example companies, Zahawi & Zahawi Ltd (07285998) and Gorgeous Services Limited (05714521):"
]
},
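{
"cell_type": "markdown",
"id": "c4a91e07",
"metadata": {},
"source": [
"As a rough sketch of what comparing two networks means, assume each network has been reduced to a plain set of node IDs (company numbers, officer IDs, addresses). The IDs below are made-up placeholders rather than sugartrail output, and `shared_nodes` is a hypothetical helper, not part of the library:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d2f80b6",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative only: placeholder IDs, not real sugartrail output.\n",
"network_a = {'company-A', 'officer-1', 'officer-2', 'address-X'}\n",
"network_b = {'company-B', 'officer-2', 'officer-3', 'address-Y'}\n",
"\n",
"def shared_nodes(first, second):\n",
"    # Return the IDs present in both networks, sorted for stable display.\n",
"    return sorted(first & second)\n",
"\n",
"shared_nodes(network_a, network_b)"
]
},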
{
"cell_type": "code",
"execution_count": 8,
"id": "53435932",
"metadata": {},
"outputs": [],
"source": [
"import sugartrail\n",
"import pandas as pd\n",
"# Enter your Companies House API key between the quotes.\n",
"sugartrail.api.basic_auth.username = \"\""
]
},
{
"cell_type": "markdown",
"id": "489a4141",
"metadata": {},
"source": [
"Create a network for Zahawi & Zahawi, with some limits to reduce the number of potentially irrelevant connections:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "300cecde",
"metadata": {},
"outputs": [],
"source": [
"zahawi_connections = sugartrail.base.Network(company_id='07285998')\n",
"zahawi_connections.hop.officer_appointments_maxsize = 20\n",
"zahawi_connections.hop.officers_at_address_maxsize = 20\n",
"zahawi_connections.hop.companies_at_address_maxsize = 20"
]
},
{
"cell_type": "markdown",
"id": "bf8ddb84",
"metadata": {},
"source": [
"Create a second network for Gorgeous Services:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "9480e020",
"metadata": {},
"outputs": [],
"source": [
"gorgeous_connections = sugartrail.base.Network(company_id='05714521')\n",
"gorgeous_connections.hop.officer_appointments_maxsize = 20\n",
"gorgeous_connections.hop.officers_at_address_maxsize = 20\n",
"gorgeous_connections.hop.companies_at_address_maxsize = 20"
]
},
{
"cell_type": "markdown",
"id": "fd678b28",
"metadata": {},
"source": [
"We can now pass both networks to the `find_network_connections` function, which returns any connections found between them. It accepts the two networks and an optional `max_depth` value (default 5) that sets the maximum depth to which both networks are built. `find_network_connections` expands each network hop by hop and stops as soon as connections are found or `max_depth` is reached."
]
},
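{
"cell_type": "markdown",
"id": "7e3b1c9a",
"metadata": {},
"source": [
"The stopping behaviour described above can be sketched with a toy graph. This is not sugartrail's implementation: `toy_graph`, `expand` and `find_overlap` are hypothetical names used only for illustration, and the graph is a hand-written adjacency dictionary rather than Companies House data."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f6d04e8",
"metadata": {},
"outputs": [],
"source": [
"# Toy adjacency map: each node lists the nodes one hop away.\n",
"toy_graph = {\n",
"    'company-A': ['officer-1'],\n",
"    'officer-1': ['company-A', 'company-C'],\n",
"    'company-C': ['officer-1', 'officer-2'],\n",
"    'officer-2': ['company-C', 'company-B'],\n",
"    'company-B': ['officer-2'],\n",
"}\n",
"\n",
"def expand(graph, seen, frontier):\n",
"    # Grow a network by one hop and return the newly discovered nodes.\n",
"    new_nodes = set()\n",
"    for node in frontier:\n",
"        for neighbour in graph.get(node, []):\n",
"            if neighbour not in seen:\n",
"                new_nodes.add(neighbour)\n",
"    seen |= new_nodes\n",
"    return new_nodes\n",
"\n",
"def find_overlap(graph, start_a, start_b, max_depth=5):\n",
"    # Expand both networks hop by hop; stop at the first overlap or at max_depth.\n",
"    seen_a, seen_b = {start_a}, {start_b}\n",
"    frontier_a, frontier_b = {start_a}, {start_b}\n",
"    for _ in range(max_depth):\n",
"        frontier_a = expand(graph, seen_a, frontier_a)\n",
"        frontier_b = expand(graph, seen_b, frontier_b)\n",
"        overlap = seen_a & seen_b\n",
"        if overlap:\n",
"            return sorted(overlap)\n",
"    return []\n",
"\n",
"find_overlap(toy_graph, 'company-A', 'company-B')"
]
},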
{
"cell_type": "code",
"execution_count": 4,
"id": "b4036e3d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1/5 hops completed.\n",
"2/5 hops completed.\n",
"3/5 hops completed.\n",
"Found connection(s)!\n"
]
}
],
"source": [
"connections = sugartrail.processing.find_network_connections(zahawi_connections, gorgeous_connections)"
]
},
{
"cell_type": "markdown",
"id": "bac64a8e",
"metadata": {},
"source": [
"It looks like a connection was found. The long string of characters tells us that it's an officer ID:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "be034584",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['g8BmvnpH8blqT87i93sgJeowx7I']"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"connections"
]
},
{
"cell_type": "markdown",
"id": "6cd89faa",
"metadata": {},
"source": [
"We can now trace the path from Zahawi & Zahawi to this connection:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "9544095a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>title</th>\n",
" <th>depth</th>\n",
" <th>node_type</th>\n",
" <th>id</th>\n",
" <th>link_type</th>\n",
" <th>link</th>\n",
" <th>node_index</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>ZAHAWI &amp; ZAHAWI LTD</td>\n",
" <td>0</td>\n",
" <td>Company</td>\n",
" <td>07285998</td>\n",
" <td></td>\n",
" <td></td>\n",
" <td>a</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Nadhim ZAHAWI</td>\n",
" <td>1</td>\n",
" <td>Person</td>\n",
" <td>tKup8kXPh3-jx_5Bs-BkF5XCyPM</td>\n",
" <td>Officer</td>\n",
" <td>a</td>\n",
" <td>b</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>YOUGOV PLC</td>\n",
" <td>2</td>\n",
" <td>Company</td>\n",
" <td>03607311</td>\n",
" <td>Appointment</td>\n",
" <td>b</td>\n",
" <td>c</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Benjamin William ELLIOT</td>\n",
" <td>3</td>\n",
" <td>Person</td>\n",
" <td>g8BmvnpH8blqT87i93sgJeowx7I</td>\n",
" <td>Officer</td>\n",
" <td>c</td>\n",
" <td>d</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" title depth node_type id \\\n",
"0 ZAHAWI & ZAHAWI LTD 0 Company 07285998 \n",
"1 Nadhim ZAHAWI 1 Person tKup8kXPh3-jx_5Bs-BkF5XCyPM \n",
"2 YOUGOV PLC 2 Company 03607311 \n",
"3 Benjamin William ELLIOT 3 Person g8BmvnpH8blqT87i93sgJeowx7I \n",
"\n",
" link_type link node_index \n",
"0 a \n",
"1 Officer a b \n",
"2 Appointment b c \n",
"3 Officer c d "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.DataFrame(zahawi_connections.find_path('g8BmvnpH8blqT87i93sgJeowx7I'))"
]
},
{
"cell_type": "markdown",
"id": "613910a7",
"metadata": {},
"source": [
"... and the path from Gorgeous Services to the connection:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "f810b714",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>title</th>\n",
" <th>depth</th>\n",
" <th>node_type</th>\n",
" <th>id</th>\n",
" <th>link_type</th>\n",
" <th>link</th>\n",
" <th>node_index</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>GORGEOUS SERVICES LIMITED</td>\n",
" <td>0</td>\n",
" <td>Company</td>\n",
" <td>05714521</td>\n",
" <td></td>\n",
" <td></td>\n",
" <td>a</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Benjamin William ELLIOT</td>\n",
" <td>1</td>\n",
" <td>Person</td>\n",
" <td>g8BmvnpH8blqT87i93sgJeowx7I</td>\n",
" <td>Officer</td>\n",
" <td>a</td>\n",
" <td>b</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" title depth node_type id \\\n",
"0 GORGEOUS SERVICES LIMITED 0 Company 05714521 \n",
"1 Benjamin William ELLIOT 1 Person g8BmvnpH8blqT87i93sgJeowx7I \n",
"\n",
" link_type link node_index \n",
"0 a \n",
"1 Officer a b "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.DataFrame(gorgeous_connections.find_path('g8BmvnpH8blqT87i93sgJeowx7I'))"
]
},
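{
"cell_type": "markdown",
"id": "9b40d7f3",
"metadata": {},
"source": [
"Because both paths end at the same officer, they can be joined into a single chain. The sketch below hard-codes the `title` values from the two tables above; `join_paths` is a hypothetical helper, not a sugartrail function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e81a5c26",
"metadata": {},
"outputs": [],
"source": [
"# Titles copied from the two path tables above.\n",
"zahawi_path = ['ZAHAWI & ZAHAWI LTD', 'Nadhim ZAHAWI', 'YOUGOV PLC', 'Benjamin William ELLIOT']\n",
"gorgeous_path = ['GORGEOUS SERVICES LIMITED', 'Benjamin William ELLIOT']\n",
"\n",
"def join_paths(path_a, path_b):\n",
"    # Both paths must end at the shared node; reverse the second and drop the duplicate.\n",
"    if path_a[-1] != path_b[-1]:\n",
"        raise ValueError('paths do not end at the same node')\n",
"    return path_a + path_b[::-1][1:]\n",
"\n",
"' -> '.join(join_paths(zahawi_path, gorgeous_path))"
]
},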
{
"cell_type": "markdown",
"id": "3e6ffa85",
"metadata": {},
"source": [
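"Reading both paths together shows how Zahawi & Zahawi connects to Gorgeous Services: Nadhim Zahawi is an officer of Zahawi & Zahawi and has an appointment at YouGov plc, where Benjamin William Elliot, an officer of Gorgeous Services, is also listed as an officer."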
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}