Which Countries Are More Suicidal Than The Others?

Ubajaka CJ
4 min read · Jun 7, 2019
A young man about to commit suicide. © istock.

I stumbled across a dataset on Kaggle that gives an overview of suicide rates by country from 1985 to 2016. It is an interesting dataset to play around with, especially with Plotly as the visualization tool. It is also an intriguing one, because it invites questions: why do people wish to commit suicide, what are the triggers, which countries are more suicidal than others, and among which generations is it most common? Are the causes generational, economic, religious, or political?

Anyway, I downloaded the dataset. It is a really rich one, from the United Nations Development Program (UNDP), containing a lot of complex features, spanning a large time frame and cutting across several generations. That sort of thing.

In this post, to get a bird’s-eye view of the dataset, I sum the suicide rates within each country, across sex and generation, for each year from 1985 to 2016, and compare the totals across countries visually on a global map. In another post, I’ll delve deeper into the dataset to gather more insights about the problem of suicide. So, let’s get cracking.

Importing all the necessary modules.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.tools as tls
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

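I read the dataset, collected the unique countries and years, and made a data frame of the countries.

suicide_data = pd.read_csv('master.csv')
countries = np.unique(suicide_data['country'])
# The loops further down iterate over every year present in the data
years = np.unique(suicide_data['year'])
countries_df = pd.DataFrame({'Country Name': countries})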

Then, I summed the suicide rates of each country for each year and arranged the totals with one row per country and one column per year, converting the result into a data frame. So, basically, we are making a new dataset from the original for our brief analysis.

tuple_ = []
for country in countries:
    country_data = suicide_data[suicide_data['country'] == country]
    country_name = list(country_data.country.unique())[0]
    for year in years:
        # Sum the rate over every sex and age group for this country and year
        suicide = country_data[country_data['year'] == year]['suicides/100k pop'].sum()
        tuple_.append((year, country, suicide))
suicide_data_new = pd.DataFrame(tuple_, columns=['year', 'country', 'suicide_rate'])
countries_df = pd.DataFrame(data={'country': countries})
for year in years:
    # Rows come out in the same country order, so joining on the index lines them up
    data_year = suicide_data_new[suicide_data_new['year'] == year].reset_index(drop=True)
    s_rate_year = pd.DataFrame(data={str(year): data_year['suicide_rate']})
    countries_df = countries_df.join(s_rate_year)
countries_df
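
The two loops above can also be expressed as a single pivot. This is a minimal sketch rather than the code used for this post: it assumes the same column names as master.csv (country, year, suicides/100k pop), and the tiny frame below is made-up illustrative data, not values from the dataset. The names toy and wide are hypothetical.

```python
import pandas as pd

# Made-up rows for illustration only; not values from the Kaggle dataset.
toy = pd.DataFrame({
    'country': ['A', 'A', 'A', 'B', 'B'],
    'year': [1985, 1985, 2016, 1985, 1985],
    'suicides/100k pop': [1.0, 2.0, 4.0, 0.5, 0.25],
})

# One row per country, one column per year, summing over sex and age groups.
# fill_value=0 mirrors the loop version, where a year with no rows sums to 0.
wide = toy.pivot_table(index='country', columns='year',
                       values='suicides/100k pop', aggfunc='sum',
                       fill_value=0)

# Match the layout of countries_df: string year columns, 'country' as a column.
wide.columns = [str(year) for year in wide.columns]
wide = wide.reset_index()
```

Swapping toy for suicide_data should give a table shaped like countries_df, with the caveat that a 0 can mean "no rows for that year" rather than a genuinely low rate.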

Plotting visuals for 1985.

# plotting 1985 visuals
metricscale = [[0, 'rgb(102,194,165)'], [0.05, 'rgb(102,194,164)'],
               [0.15, 'rgb(171,221,164)'], [0.2, 'rgb(230,245,152)'],
               [0.25, 'rgb(255,255,191)'], [0.35, 'rgb(254,224,139)'],
               [0.45, 'rgb(253,174,97)'], [0.55, 'rgb(213,62,79)'], [1.0, 'rgb(158,1,66)']]
data = [dict(
    type = 'choropleth',
    autocolorscale = False,
    colorscale = metricscale,
    showscale = True,
    locations = countries_df['country'].values,
    z = countries_df['1985'].values,
    locationmode = 'country names',
    text = countries_df['country'].values,
    marker = dict(
        line = dict(color='rgb(0, 0, 0)', width=1)),
    colorbar = dict(autotick=True, tickprefix='',
                    title = '# Suicide\nRate')
    # marker = go.ch
)
]
layout = dict(
    title = 'World Map of Suicide Rate in the Year 1985',
    geo = dict(
        showframe = False,
        showocean = True,
        oceancolor = 'rgb(0,255,255)',
        # type = 'equirectangular'
        projection = dict(
            type = 'orthographic',
            rotation = dict(
                lon = 60,
                lat = 10)
        ),
        lonaxis = dict(
            showgrid = False,
            gridcolor = 'rgb(102, 102, 102)'
        ),
        lataxis = dict(
            showgrid = False,
            gridcolor = 'rgb(102, 102, 102)'
        )
    ),
)
fig = dict(data=data, layout=layout)
py.plot(fig, validate=False, filename='worldmap1985')

And for 2016.

# plotting 2016 visuals
metricscale = [[0, 'rgb(102,194,165)'], [0.05, 'rgb(102,194,164)'],
               [0.15, 'rgb(171,221,164)'], [0.2, 'rgb(230,245,152)'],
               [0.25, 'rgb(255,255,191)'], [0.35, 'rgb(254,224,139)'],
               [0.45, 'rgb(253,174,97)'], [0.55, 'rgb(213,62,79)'], [1.0, 'rgb(158,1,66)']]
data = [dict(
    type = 'choropleth',
    autocolorscale = False,
    colorscale = metricscale,
    showscale = True,
    locations = countries_df['country'].values,
    z = countries_df['2016'].values,
    locationmode = 'country names',
    text = countries_df['country'].values,
    marker = dict(
        line = dict(color='rgb(0, 0, 0)', width=1)),
    colorbar = dict(autotick=True, tickprefix='',
                    title = '# Suicide\nRate')
    # marker = go.ch
)
]
layout = dict(
    title = 'World Map of Suicide Rate in the Year 2016',
    geo = dict(
        showframe = False,
        showocean = True,
        oceancolor = 'rgb(0,255,255)',
        # type = 'equirectangular'
        projection = dict(
            type = 'orthographic',
            rotation = dict(
                lon = 60,
                lat = 10)
        ),
        lonaxis = dict(
            showgrid = False,
            gridcolor = 'rgb(102, 102, 102)'
        ),
        lataxis = dict(
            showgrid = False,
            gridcolor = 'rgb(102, 102, 102)'
        )
    ),
)
fig = dict(data=data, layout=layout)
py.plot(fig, validate=False, filename='worldmap2016')
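
Since the 1985 and 2016 plots differ only in the year, the figure dictionary could be built by a small helper. The sketch below is one way to do it, not the code from the repo: make_choropleth is a hypothetical name, it leaves out the colorbar tick options for brevity, and it only assumes that the table has a 'country' column and a column named after the year as a string.

```python
def make_choropleth(table, year, colorscale):
    """Build a figure dict for one year.

    `table` can be anything indexable by column name (a DataFrame or a
    plain dict of lists) with a 'country' column and a str(year) column.
    """
    names = list(table['country'])
    data = [dict(
        type='choropleth',
        autocolorscale=False,
        colorscale=colorscale,
        showscale=True,
        locations=names,
        z=list(table[str(year)]),
        locationmode='country names',
        text=names,
        marker=dict(line=dict(color='rgb(0, 0, 0)', width=1)),
        colorbar=dict(title='# Suicide\nRate'),
    )]
    layout = dict(
        title='World Map of Suicide Rate in the Year {}'.format(year),
        geo=dict(
            showframe=False,
            showocean=True,
            oceancolor='rgb(0,255,255)',
            projection=dict(type='orthographic',
                            rotation=dict(lon=60, lat=10)),
            lonaxis=dict(showgrid=False, gridcolor='rgb(102, 102, 102)'),
            lataxis=dict(showgrid=False, gridcolor='rgb(102, 102, 102)'),
        ),
    )
    return dict(data=data, layout=layout)

# Intended use with the objects defined earlier in this post:
# py.plot(make_choropleth(countries_df, 1985, metricscale),
#         validate=False, filename='worldmap1985')
```

With a helper like this, plotting any other year between 1985 and 2016 is a one-line call.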

From the data visualization, Western Europe, especially France, Austria, and Serbia, was more suicidal than the rest of the world in 1985, followed by North America (the US and Canada). By 2016, this rate had dropped across Western Europe and America. Eastern Europe, especially Russia, and Asia maintained a fairly even, low suicide rate in both 1985 and 2016. One of the questions we need to answer is why the first-world countries of Western Europe and America were so suicidal in the 1980s. Does it have to do with the economy, given that they are first-world countries? We’ll find out as we delve more deeply into the dataset.
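
A comparison like this can also be checked numerically instead of by eye. The sketch below ranks countries by their 1985 total and computes the change to 2016. It assumes a table shaped like countries_df (a 'country' column plus one string-named column per year); the numbers are made up for illustration and are not values from the dataset, and toy_df and top_1985 are hypothetical names.

```python
import pandas as pd

# Made-up numbers for illustration only; not values from the dataset.
toy_df = pd.DataFrame({
    'country': ['A', 'B', 'C'],
    '1985': [300.0, 120.0, 80.0],
    '2016': [150.0, 130.0, 0.0],
})

# Countries ranked by summed rate in 1985, highest first.
top_1985 = toy_df.sort_values('1985', ascending=False)['country'].tolist()

# Change between the two years; negative means the summed rate dropped.
toy_df['change'] = toy_df['2016'] - toy_df['1985']
```

One caveat when reading such a table: because the totals are built with a sum, a country with no rows for a given year shows up as 0, so a drop to 0 may reflect missing data rather than a real decline.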

The code for this data analysis can be found in my GitHub repo.
