Evaluating the impact of machine learning platforms on cancer classification model performance: A cross-platform comparative study

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Machine learning (ML) techniques have become pivotal in advancing predictive models for early cancer detection, addressing the growing need for improved diagnostic efficiency. However, the role of implementation platforms in influencing model performance remains underexplored, even as variations in performance on the same dataset raise questions about platform choice. This study evaluates the impact of three ML implementation tools (Scikit-learn, KNIME, and MATLAB) on the performance of four classification algorithms: Logistic Regression, Decision Tree, Random Forest, and Gradient Boosting. Using the publicly available Wisconsin Diagnostic Breast Cancer dataset, these algorithms were implemented under default configurations and compared across four key metrics: accuracy, recall, precision, and F1-score. Results revealed significant platform-dependent variations. Scikit-learn achieved consistently higher recall, particularly for Random Forest and Gradient Boosting, making it more effective at minimising false negatives, which is critical in cancer diagnosis. MATLAB demonstrated superior precision, especially for Random Forest and Gradient Boosting, indicating potential for reducing false positives. KNIME, while effective in specific contexts, underperformed in recall and precision, raising concerns in scenarios requiring high sensitivity and specificity.

    These findings underscore the importance of platform selection based on predictive task requirements, especially in healthcare, where balancing false positives and false negatives is crucial. The study provides actionable insights for selecting ML platforms to enhance diagnostic accuracy in cancer classification tasks, with source code and data fully accessible through a public GitHub repository.
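
The Scikit-learn arm of the comparison described in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' published code: the 80/20 stratified split, the fixed seed, the function name `evaluate_default_models`, and the choice of malignant as the positive class are all assumptions, since the abstract states only that default configurations were used.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# In scikit-learn's copy of the Wisconsin Diagnostic Breast Cancer data,
# label 0 is malignant and label 1 is benign. Treating malignant as the
# positive class is an assumption made here so that recall measures
# missed cancers (false negatives).
MALIGNANT = 0


def evaluate_default_models(test_size=0.2, seed=42):
    """Fit four default-configured classifiers and return their test metrics.

    The split ratio and seed are assumptions, not values from the study.
    """
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    # Default hyperparameters throughout; only random_state is set, for
    # repeatability. LogisticRegression may emit a ConvergenceWarning on
    # these unscaled features, which is left as-is to stay with defaults.
    models = {
        "Logistic Regression": LogisticRegression(),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "recall": recall_score(y_test, y_pred, pos_label=MALIGNANT),
            "precision": precision_score(y_test, y_pred, pos_label=MALIGNANT),
            "f1": f1_score(y_test, y_pred, pos_label=MALIGNANT),
        }
    return results


if __name__ == "__main__":
    for name, scores in evaluate_default_models().items():
        print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Reproducing the cross-platform comparison would additionally require feeding the same train/test partition to KNIME and MATLAB, so that any metric differences reflect platform defaults rather than different data splits.
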
    Original language: English
    Pages (from-to): 96-111
    Journal: International Journal on Advances in Life Sciences
    Volume: 16
    Issue number: 3 & 4
    Publication status: Published - 2024

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 3 - Good Health and Well-being

    Keywords

    • Cancer
    • KNIME
    • MATLAB
    • Machine learning
    • Python Scikit-learn
    • Wisconsin Diagnostic Breast Cancer
