Research Article: Alsibai & Heydari 87
An appropriate dataset is vital for the proper functioning of any deep learning framework. Thus, the publicly available PCOS ultrasound image dataset on Kaggle (12) is used. The same dataset is referred to as dataset A in the paper of Hosain AK et al. (11) highlighted previously, and it is referred to as dataset A in this paper as well. Screenshots of the website providing this data are shown in Figure 1 and Figure 2.
Figure 1: Screenshot of the publicly available PCOS dataset on Kaggle consisting of `infected` and `notinfected` ovarian
ultrasound images referred to as Dataset A
Figure 2: Statistics of Dataset A show that it has been downloaded 301 times out of 2394 views. This means that roughly 1 in
every 8 views (about 12.6%) leads to a download of this dataset for use in research/projects
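The download-to-view ratio quoted in the Figure 2 caption can be checked with a few lines of Python. This is only a minimal sketch: the two counts are the ones reported in the caption, and the variable names are illustrative rather than taken from the paper.

```python
# Counts as reported in the Figure 2 caption.
downloads = 301
views = 2394

# Fraction of views that resulted in a download.
download_rate = downloads / views

# The same ratio expressed as "one download per N views".
views_per_download = views / downloads

print(f"download rate: {download_rate:.1%}")                      # download rate: 12.6%
print(f"about 1 download per {views_per_download:.1f} views")     # about 1 download per 8.0 views
```

Note that the Kaggle counters record views, not unique viewers, so the ratio is per view rather than per person.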
The dataset consists of 3856 ultrasound images divided into 2 classes labeled `infected` and `notinfected`. The latter depicts healthy ovaries and the former indicates the presence of PCOS. These images are partitioned into train and test sets: 1932 images belong to the test set and the remaining 1924 belong to the train set. However, the same images seem to be
SJSI – 2023: VOLUME 1-1