<div dir="ltr"><font face="arial, sans-serif">Hi,</font><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">I wanted to share the plot I was trying to show this morning.<br><br>Here's a description of the plot, which is attached below.<br><br clear="all"></font><div><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">This plot has a lot of information in it. Let's break it
down.<br>
<br>
Important note: I generated 1866 knockout plots over the summer.</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"> Full
dataset means all natural plots + knockout plots.</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"> Natural
plots are the plots that I did not generate.</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"> Each
trial group on the x-axis had five trials.</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">The y-axis represents the accuracy when the model was tested
on just the knockout plots I created over the summer via a different training
script.<br>
The x-axis:<br>
Example label 1: train: 60 validation: 40
knockout in train: 0<br>
<br>
This means that 60% of the natural dataset was used for training, 40% was set
aside for validation, and there were 0 knockout plots in the training set.</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">Example label 2: train: 95 validation: 5 knockout in train:
20</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">This means that 95% of the FULL DATASET was used for
training, which includes 20% of all knockout plots. <br>
5% of the full dataset was set aside for validation.</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">Example label 3: Hydra original</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">This means that I did not do anything different. I ran the training
script without altering how many knockout or natural plots go into which split. Hydra original was trained with knockouts, since it was given the full dataset.</font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"> </font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"><b>Interpretation:</b> In this plot, we see no particular trend. This is because some plots in the natural dataset look like the knockout
plots I produced. This was also suggested in the meeting this morning.</font></p><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"> <br>Why the big lower error bars?<br>
This comes down to how the models are trained. The training script
looks up the location of each plot in the database. The locations are then put
in a pandas DataFrame and shuffled before being split into training and
validation sets. Sometimes the shuffle
is simply unlucky for the model: very few, or none, of the knockout or knockout-like plots
end up in the training set. I tested all these models on bad plots only. When a plot was not predicted as bad, it was almost always predicted as cosmic. There were
only a handful of times when plots were predicted as led, good, or no data.<br>
With these unlucky shuffles, the model confuses cosmic and bad.</font></p>
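<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">To illustrate the shuffle effect, here is a minimal sketch, not the actual training script. The function name <code>shuffle_and_split</code>, the column names <code>path</code> and <code>is_knockout_like</code>, and the toy numbers (100 plots, 5 of them knockout-like) are all made up for illustration. It shuffles a pandas DataFrame with different seeds and counts how many knockout-like plots land in the training split each time.</font></p>

```python
import pandas as pd


def shuffle_and_split(df, train_frac, seed):
    """Shuffle the rows, then cut them into a training and a validation part."""
    # sample(frac=1.0) returns every row in a random order
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_train = int(round(train_frac * len(shuffled)))
    return shuffled.iloc[:n_train], shuffled.iloc[n_train:]


# Hypothetical stand-in for the table of plot locations read from the database.
# Here 5 of the 100 "natural" plots happen to look like knockout plots.
plots = pd.DataFrame({
    "path": [f"plot_{i}.png" for i in range(100)],
    "is_knockout_like": [i < 5 for i in range(100)],
})

# Same data, different shuffles: the number of knockout-like plots that
# the model gets to see during training changes from seed to seed.
for seed in range(5):
    train, val = shuffle_and_split(plots, train_frac=0.60, seed=seed)
    n_seen = int(train["is_knockout_like"].sum())
    print(f"seed {seed}: {n_seen} knockout-like plots in training")
```

<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">With so few knockout-like plots, the count in the training split can swing a lot between shuffles, which is one plausible reason a single trial in a group can fall well below the mean.</font></p>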
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">How are the error bars calculated?</font></p><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">The lower error is the mean accuracy of the trials in the trial group minus the minimum accuracy of the
trials in that group. The upper error is the maximum accuracy of the trials in the trial group minus the mean accuracy of
the trials in that group.</font></p><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">Best,</font></p><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif">Manav</font></p></div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><p class="MsoNormal" style="margin-bottom:0cm;color:rgb(34,34,34)"><font face="arial, sans-serif"><br></font></p><p class="MsoNormal" style="margin-bottom:0cm;color:rgb(34,34,34)"><br></p></div></div></div></div>