Please refer to the following links on data preparation and previous posts to follow along:
plot1 = sns.catplot(y="Track",
                    x="Score",
                    data=test_scores,
                    height=9,
                    aspect=0.5,
                    kind="violin",
                    col="Complexity",
                    cut=0,  # keep each violin within the range of the observed scores
                    col_order=["Easy", "Medium", "Difficult"],
                    palette=sns.color_palette(["#a05195", "#d45087", "#f95d6a"])) \
    .set_ylabels(fontsize=25) \
    .set_xlabels(fontsize=25, label="Score") \
    .set_xticklabels(fontsize=20) \
    .set_yticklabels(fontsize=20) \
    .set_titles(size=20)
# Leave room above the facets for the overall title
plot1.fig.subplots_adjust(top=0.8)
plot1.fig.suptitle('Violin Plot showing distribution of score for each track by complexity', size=30)
# plot1.savefig("Violin Plot -1.png", dpi=100, bbox_inches='tight')
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.pylab as plb
import warnings
warnings.filterwarnings('ignore')
test_scores = pd.read_csv("Data/Test scores.csv", parse_dates=['Test taken date'])
test_master = pd.read_csv("Data/Test master.csv")
test_participant = pd.read_csv("Data/Audience summary.csv")
We have three datasets, namely test_scores, test_master and test_participant.
test_scores contains the score of each participant in every test they took.
test_scores.head()
| | Participant identifier | Test Name | Test taken date | Track | Designation | Score |
|---|---|---|---|---|---|---|
| 0 | 37MCTM | If conditional | 2018-11-23 | Engineering | Lead | 18 |
| 1 | 37MCTM | Determiners and Quantifiers | 2018-11-23 | Engineering | Lead | 28 |
| 2 | 37MCTM | Modals | 2018-11-23 | Engineering | Lead | 22 |
| 3 | 37MCTM | Tenses | 2018-11-13 | Engineering | Lead | 12 |
| 4 | 37MCTM | Pronouns | 2018-11-13 | Engineering | Lead | 15 |
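
If you do not have the CSV files at hand, the five rows shown above are enough to try things out. The sketch below re-types those rows as in-memory CSV text and reads them with the same parse_dates argument used when loading the real file. The names sample_csv, sample_scores, date_is_datetime and mean_by_date are made up for this illustration and are not part of the original notebook.

```python
import io

import pandas as pd

# The five rows printed by test_scores.head(), re-typed as CSV text so the
# example runs without the original "Data/Test scores.csv" file.
sample_csv = io.StringIO(
    "Participant identifier,Test Name,Test taken date,Track,Designation,Score\n"
    "37MCTM,If conditional,2018-11-23,Engineering,Lead,18\n"
    "37MCTM,Determiners and Quantifiers,2018-11-23,Engineering,Lead,28\n"
    "37MCTM,Modals,2018-11-23,Engineering,Lead,22\n"
    "37MCTM,Tenses,2018-11-13,Engineering,Lead,12\n"
    "37MCTM,Pronouns,2018-11-13,Engineering,Lead,15\n"
)

# Same parse_dates argument as the read_csv call on the real file.
sample_scores = pd.read_csv(sample_csv, parse_dates=["Test taken date"])

# With parse_dates the date column is read as datetime64, not as plain strings.
date_is_datetime = pd.api.types.is_datetime64_any_dtype(
    sample_scores["Test taken date"]
)

# Mean score per test date for this one participant.
mean_by_date = sample_scores.groupby("Test taken date")["Score"].mean()

print(sample_scores.dtypes)
print(mean_by_date)
```

Having the date column as a real datetime is what lets you group, sort or resample by test date later without converting it again.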