Part 4 - Plotting Using Seaborn - Heatmap, Lollipop Plot, Scatter Plot

23 Aug 2019
python, visualisation

Introduction and Data preparation

Please see the following links for data preparation and the previous posts in this series -

For Data Preparation - Part 0 - Plotting Using Seaborn - Data Preparation
For Part 1 - Part 1 - Plotting Using Seaborn - Violin, Box and Line Plot
For Part 2 - Part 2 - Plotting Using Seaborn - Distribution Plot, Facet Grid
For Part 3 - Part 3 - Plotting Using Seaborn - Donut

Heatmap showing average percentage score across each test by track

test_scores_TestName2 = test_scores.groupby(['Test Name','Track'])[['Score','maximum_score']].mean().reset_index().sort_values(by=['maximum_score','Score'])
test_scores_TestName2['Percent'] = test_scores_TestName2['Score']/test_scores_TestName2['maximum_score']
# pivot() takes keyword-only arguments in pandas 2.0 and later
df_heatmap = test_scores_TestName2.pivot(index='Test Name', columns='Track', values='Percent')
fig, axes = plt.subplots(figsize=(9,9))
f = sns.heatmap(df_heatmap, annot=True,cmap ="Blues")
f.set_xlabel(xlabel = '',fontsize=20)
f.set_ylabel(ylabel = '',fontsize=20)
f.set_yticklabels(labels = list(df_heatmap.index.values), fontsize=12, rotation = 360)
f.set_xticklabels(labels = ['Engineering', 'QA', 'Support'], fontsize=12, rotation =360)
fig.suptitle('Heatmap showing average percentage score across each test by track', 
             fontsize=20, x = 0.5, y = 0.94)
#plb.savefig('Heat_Track',dpi=100,bbox_inches='tight')  
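
The reshaping step that feeds the heatmap can be tried on its own with a small made-up table. The sketch below is only an illustration: toy_scores, percent_matrix and toy_heatmap are hypothetical names and the numbers are invented, while the column names follow the code above.

```python
import pandas as pd

# Hypothetical toy data; the real test_scores comes from the Part 0 post.
toy_scores = pd.DataFrame({
    "Test Name": ["Test A", "Test A", "Test A", "Test B", "Test B", "Test B"],
    "Track": ["Engineering", "Support", "Engineering",
              "Engineering", "Support", "Support"],
    "Score": [8, 6, 10, 30, 20, 40],
    "maximum_score": [10, 10, 10, 50, 50, 50],
})


def percent_matrix(scores):
    """Average Score / maximum_score per (Test Name, Track), one column per track."""
    means = (
        scores.groupby(["Test Name", "Track"])[["Score", "maximum_score"]]
        .mean()
        .reset_index()
    )
    means["Percent"] = means["Score"] / means["maximum_score"]
    # Rows become tests and columns become tracks, the layout sns.heatmap expects.
    return means.pivot(index="Test Name", columns="Track", values="Percent")


toy_heatmap = percent_matrix(toy_scores)
print(toy_heatmap)
```

Passing toy_heatmap to sns.heatmap(..., annot=True) should then draw one cell per test and track, as in the plot above.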

Part 3 - Plotting Using Seaborn - Donut

23 Aug 2019
python, visualisation

Introduction and Data preparation

Please see the following links for data preparation and the previous posts in this series -

For Data Preparation - Part 0 - Plotting Using Seaborn - Data Preparation
For Part 1 - Part 1 - Plotting Using Seaborn - Violin, Box and Line Plot
For Part 2 - Part 2 - Plotting Using Seaborn - Distribution Plot, Facet Grid

Basic preliminaries for Donut Chart

company_headcount = pd.melt(test_participant, id_vars=['Designation'], 
                            value_vars=['Engineering', 'Quality Assurance', 'Support']) \
                        .rename(columns={"variable": "Track", "value": "Headcount"})

for_donuts = test_scores.groupby(['Track','Designation'])[['Participant identifier']].nunique().reset_index()

participant_matrix = pd.merge(company_headcount,for_donuts,how = 'left',left_on = ['Track', 'Designation'],
                              right_on= ['Track', 'Designation'])

participant_matrix_eng = participant_matrix[participant_matrix['Track'] == 'Engineering']
participant_matrix_sup = participant_matrix[participant_matrix['Track'] == 'Support']
participant_matrix_qa = participant_matrix[participant_matrix['Track'] == 'Quality Assurance']
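
To see what the melt and the left merge do, here is a minimal sketch on invented numbers. All toy_ names and values are hypothetical; only the column names are taken from the code above. The fillna(0) at the end is an assumption about how missing groups might be handled and is not part of the original post.

```python
import pandas as pd

# Hypothetical headcount table in wide form: one column per track.
toy_participant = pd.DataFrame({
    "Designation": ["Junior", "Senior"],
    "Engineering": [10, 5],
    "Quality Assurance": [4, 2],
    "Support": [6, 3],
})

# Wide to long: one row per (Designation, Track) pair.
toy_headcount = (
    pd.melt(toy_participant, id_vars=["Designation"],
            value_vars=["Engineering", "Quality Assurance", "Support"])
    .rename(columns={"variable": "Track", "value": "Headcount"})
)

# Hypothetical count of distinct participants per group; some groups are absent.
toy_taken = pd.DataFrame({
    "Track": ["Engineering", "Engineering", "Support"],
    "Designation": ["Junior", "Senior", "Junior"],
    "Participant identifier": [7, 5, 3],
})

# The left merge keeps every headcount row; groups with no participants get NaN.
toy_matrix = pd.merge(toy_headcount, toy_taken, how="left",
                      on=["Track", "Designation"])

# Assumption: treat groups with no participants as zero.
toy_matrix["Participant identifier"] = toy_matrix["Participant identifier"].fillna(0)
print(toy_matrix)
```

Each per-track subset, such as participant_matrix_eng above, is then a plain boolean filter on the Track column.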

Part 2 - Plotting Using Seaborn - Distribution Plot, Facet Grid

23 Aug 2019
python, visualisation

Introduction and Data preparation

Please see the following links for data preparation and the previous posts in this series -

For Data Preparation - Part 0 - Plotting Using Seaborn - Data Preparation
For Part 1 - Part 1 - Plotting Using Seaborn - Violin, Box and Line Plot

Subset of data based on complexity

test_scores_easy = test_scores[test_scores['Complexity']=='Easy']
test_scores_medium = test_scores[test_scores['Complexity']=='Medium']
test_scores_hard = test_scores[test_scores['Complexity']=='Difficult']

Distribution of score percentage across track in test with easy complexity

sns.set(style="whitegrid")
g = sns.FacetGrid(test_scores_easy, col='Track', row='Test Name', height = 4, aspect =2.5)
# sns.distplot is deprecated since seaborn 0.11; histplot draws the same plain histogram
g.map(sns.histplot, "Percent")
g.set_titles(size =20)
g.set_xlabels(size = 25)
g.set_ylabels(size = 25, label = "Participants")
g.set_yticklabels(fontsize =25)
g.set_xticklabels(fontsize =25, labels = [0,0,20,40,60,80,100])
g.fig.suptitle('Distribution of score percentage across track in test with easy complexity', fontsize=40, x = 0.5, y = 1.05)
#plb.savefig('Distribution_easy.png',dpi=50,bbox_inches='tight')