As figure 9 shows, the training accuracy increases steadily until it reaches 83.33%, while the test accuracy remains relatively unchanged at 62.92%. The same can be observed for the training and test loss: the training loss decreases steadily, but the test loss does not. This is a clear indication of overfitting and of the model's inability to generalize to unseen data. The poor model performance is also confirmed by the confusion matrix, where the numbers of true positives and true negatives are unsatisfactory.
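A minimal sketch of how learning curves in the style of figure 9 can be plotted is shown below. The per-epoch values are illustrative placeholders rather than the study's recorded history (only the final points, 83.33% training and 62.92% test accuracy, come from the text), and matplotlib is assumed rather than taken from the study's code.

```python
# Illustrative learning curves in the style of figure 9. The per-epoch values
# below are placeholders; only the final points (83.33% train, 62.92% test
# accuracy) are taken from the text.
import matplotlib.pyplot as plt

train_accuracy = [0.55, 0.66, 0.74, 0.80, 0.8333]  # rises steadily
test_accuracy = [0.60, 0.62, 0.63, 0.63, 0.6292]   # stays roughly flat

plt.plot(train_accuracy, label="train accuracy")
plt.plot(test_accuracy, label="test accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.savefig("learning_curves.png")  # the widening gap between the curves is the overfitting signature
```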
Table 1 exhibits the precision, recall, and F1-score for the infected data points in both datasets:

Table 1: Precision, recall, and F1-score for the infected data points in both datasets

                  Dataset A    Dataset B
   Precision (%)    99.57        55.47
   Recall (%)       100.0        36.98
   F1-score (%)     99.78        44.38
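As a sanity check, each F1-score in Table 1 is the harmonic mean of the corresponding precision and recall. The short sketch below (an illustration, not code from this study) reproduces the tabulated values.

```python
# Verify the F1-scores in Table 1 from the reported precision and recall values
# (F1 is the harmonic mean of precision and recall).
def f1_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(f"Dataset A: F1 = {f1_score(99.57, 100.0):.2f}%")  # prints 99.78%
print(f"Dataset B: F1 = {f1_score(55.47, 36.98):.2f}%")  # prints 44.38%
```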
Further inspection of dataset A, together with consultation of a professional specialist in this medical field, revealed that the dataset is highly erroneous and misleading. The `notinfected` class, which is supposed to represent healthy ovaries with no sign of PCOS, does not in fact contain images of ovaries at all. Rather, the images are ultrasound images of the uterus, which completely invalidates this dataset.
Conclusion

Two experiments were conducted on the two datasets referred to in this paper as dataset A and dataset B. Dataset A gave much better results, but it turned out that this dataset is highly erroneous and misleading. Therefore, data quality is of the utmost importance when training deep learning models, especially in the medical and health fields. The results of this study show the ability of CNN and deep learning models to detect suspicious findings in the datasets. The DenseNet201 model not performing well on dataset B could be due to a variety of reasons, such as the complexity of the ultrasound images or the relatively small number of data points available in the original dataset. To address this, the entire DenseNet201 model could be trained rather than freezing the feature extractor (sketched below), which might produce better accuracy on the test data of dataset B, since training only the classifier does not seem sufficient for this task. Also, experimenting with different learning rates and optimizers could yield more satisfactory results.

However, the accuracy and reliability of the model's predictions depend heavily on the quality of the data used to train it. If the data is flawed or biased, the model will likely produce inaccurate or unreliable results even if those results appear satisfactory. In the medical and health field, this can have serious consequences, as it can lead to incorrect diagnoses or treatment recommendations, potentially causing harm to patients. Therefore, it is essential to ensure that the data used to train these models is of the highest quality and accurately represents the population it is intended to serve. This includes ensuring that the data is free from errors, represents the target population, and has been collected using appropriate methods. Ensuring data quality is an ongoing process that requires continuous monitoring and improvement.
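The full fine-tuning suggested above could look like the following sketch. It assumes a TensorFlow/Keras transfer-learning setup with a DenseNet201 backbone; the classifier head, input size, and learning rate are illustrative assumptions rather than the configuration used in this study.

```python
# Minimal sketch of full fine-tuning (assumes TensorFlow/Keras; the head, input
# size, and learning rate are illustrative, not the exact setup of this study).
import tensorflow as tf

base = tf.keras.applications.DenseNet201(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = True  # unfreeze the feature extractor instead of training only the classifier

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: infected vs. not infected
])

# A small learning rate helps preserve the pretrained ImageNet weights when the
# whole backbone is trainable.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.summary()
```

Trying different optimizers and learning rates, as suggested above, amounts to varying the arguments passed in the final compile step of this sketch.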