Part 1 - Plotting Using Seaborn - Violin, Box and Line Plot

21 Aug 2019
python, visualisation

Introduction and Data preparation

To follow along, please work through the earlier post on data preparation first -

For Data Preparation - Part 0 - Plotting Using Seaborn - Data Preparation

Violin Plot showing distribution of score for each track by complexity

# One violin panel per complexity level; cut=0 keeps each violin within the observed score range.
# jitter is a strip plot option and has no meaning for violins, so it is not passed here.
plot1 = sns.catplot(y="Track",
                    x="Score",
                    data=test_scores,
                    height=9,
                    aspect=0.5,
                    kind="violin",
                    col="Complexity",
                    cut=0,
                    col_order=["Easy", "Medium", "Difficult"],
                    palette=sns.color_palette(["#a05195", "#d45087", "#f95d6a"])) \
                .set_ylabels(fontsize=25) \
                .set_xlabels(fontsize=25, label="Score") \
                .set_xticklabels(fontsize=20) \
                .set_yticklabels(fontsize=20) \
                .set_titles(size=20)

plot1.fig.subplots_adjust(top=0.8)
plot1.fig.suptitle('Violin Plot showing distribution of score for each track by complexity',size = 30)
#plot1.savefig("Violin Plot -1.png",dpi=100,bbox_inches='tight')
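As a numeric companion to the plot, the same grouping can be summarised with pandas. This is only a sketch under assumptions: it assumes test_scores has the Track, Complexity and Score columns used in the catplot call, and it runs on a made-up toy_scores frame (the "Sales" track and all numbers are hypothetical) rather than the real data.

```python
import pandas as pd

# Made-up scores for illustration only; "Sales" and every number here are hypothetical.
toy_scores = pd.DataFrame({
    "Track": ["Engineering"] * 3 + ["Sales"] * 3,
    "Complexity": ["Easy"] * 6,
    "Score": [10, 20, 30, 12, 18, 27],
})

# Min, median and max of Score for each Complexity/Track pair,
# i.e. one row per violin in the faceted plot.
summary = (toy_scores
           .groupby(["Complexity", "Track"])["Score"]
           .agg(["min", "median", "max"]))
print(summary)
```

Because cut=0 stops each violin at the smallest and largest observed score, the min and max columns correspond to the two ends of each violin.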

Part 0 - Plotting Using Seaborn - Data Preparation

20 Aug 2019
python, visualisation

Import Preliminaries and datasets

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.pylab as plb
import warnings
warnings.filterwarnings('ignore')

test_scores = pd.read_csv("Data/Test scores.csv", parse_dates=['Test taken date'])
test_master = pd.read_csv("Data/Test master.csv")
test_participant = pd.read_csv("Data/Audience summary.csv")

We have three datasets, namely -

Test Scores Dataset

This contains the score of each participant in every test they took.

test_scores.head()

	Participant identifier	Test Name	Test taken date	Track	Designation	Score
0	37MCTM	If conditional	2018-11-23	Engineering	Lead	18
1	37MCTM	Determiners and Quantifiers	2018-11-23	Engineering	Lead	28
2	37MCTM	Modals	2018-11-23	Engineering	Lead	22
3	37MCTM	Tenses	2018-11-13	Engineering	Lead	12
4	37MCTM	Pronouns	2018-11-13	Engineering	Lead	15
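A quick sanity check on this frame could look like the sketch below. So that it runs without the CSV files, it rebuilds the five rows printed above in memory; the names sample and per_date are introduced here for illustration and are not part of the original post.

```python
import pandas as pd

# The five rows shown by test_scores.head(), rebuilt in memory.
sample = pd.DataFrame({
    "Participant identifier": ["37MCTM"] * 5,
    "Test Name": ["If conditional", "Determiners and Quantifiers",
                  "Modals", "Tenses", "Pronouns"],
    "Test taken date": ["2018-11-23", "2018-11-23", "2018-11-23",
                        "2018-11-13", "2018-11-13"],
    "Track": ["Engineering"] * 5,
    "Designation": ["Lead"] * 5,
    "Score": [18, 28, 22, 12, 15],
})

# read_csv(..., parse_dates=[...]) does this conversion at load time.
sample["Test taken date"] = pd.to_datetime(sample["Test taken date"])

# Number of tests and mean score per test date.
per_date = sample.groupby("Test taken date")["Score"].agg(["count", "mean"])

print(sample.dtypes)
print(per_date)
```

Checking dtypes right after loading confirms that the date column was parsed as a datetime and that Score is numeric before any plotting is attempted.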

Nearest Neighbors using L2 and L1 Distance

20 Jul 2019
python, machine learning

Preliminaries
Distance Metrics
- L2 Norm
- L1 Norm
Nearest Neighbor
- Using L2 Distance
- Using L1 Distance
Predictions
Errors
Confusion Matrix
- Using Pandas
- From Scratch
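The steps in this outline can be sketched in plain Python as follows. This is a minimal illustration, not the post's own code: the function names and the tiny two-point training set are made up here, and the data is chosen so that the L2 and L1 distances disagree on one query.

```python
import math
from collections import Counter

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def l1_distance(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_neighbor_predict(train_x, train_y, query, distance):
    """Return the label of the training point closest to query."""
    best = min(range(len(train_x)), key=lambda i: distance(train_x[i], query))
    return train_y[best]

def error_rate(actual, predicted):
    """Fraction of predictions that differ from the actual labels."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

def confusion_counts(actual, predicted):
    """Confusion matrix from scratch: counts keyed by (actual, predicted)."""
    return Counter(zip(actual, predicted))

# Hypothetical toy data, for illustration only.
train_x = [(3.0, 3.0), (0.0, 5.0)]
train_y = ["a", "b"]
test_x = [(0.0, 0.0), (1.0, 5.0), (4.0, 2.0)]
test_y = ["a", "b", "a"]

preds_l2 = [nearest_neighbor_predict(train_x, train_y, q, l2_distance) for q in test_x]
preds_l1 = [nearest_neighbor_predict(train_x, train_y, q, l1_distance) for q in test_x]

print("L2 predictions:", preds_l2, "error:", error_rate(test_y, preds_l2))
print("L1 predictions:", preds_l1, "error:", error_rate(test_y, preds_l1))
print("L1 confusion counts:", dict(confusion_counts(test_y, preds_l1)))
```

For the query (0, 0), the point (3, 3) is closer under L2 (about 4.24 versus 5) while (0, 5) is closer under L1 (5 versus 6), so the two metrics predict different labels. With pandas available, pd.crosstab(actual, predicted) produces the same counts as a table.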