Applying Machine Learning in Health Management Research: From Predictive Analytics to Learning Health Systems
Abstract
This chapter examines the theoretical and practical aspects of the shift from traditional statistical methods to machine learning-supported analytics in health management research. The growing volume and variety of healthcare data brought about by digitalization limit the predictive power of classical hypothesis testing and linear regression models. The chapter addresses the role of machine learning in health management through supervised, unsupervised, and reinforcement learning approaches. In particular, it highlights the advantages of Random Forest, Gradient Boosting, and Natural Language Processing techniques in applications such as forecasting emergency department crowding, analyzing readmission risk, and making sense of unstructured clinical data. It also discusses the epistemological differences between the traditional p-value approach and modern performance metrics as ways of measuring model success. The analyses show that machine learning does more than generate predictions: it lays the foundation for learning health systems, in which a continuous feedback loop of data improves clinical practice. The chapter draws attention to ethical issues such as algorithmic fairness, explainability, and data privacy, and argues for the importance of a transition to causal machine learning in health management research.
Referanslar
Agrawal, R., & Prabakaran, S. (2020). Big data in digital healthcare: lessons learnt and recommendations for general practice. Heredity, 124(4), 525-534.
Ahmed, A., Zengul, F. D., Khan, S., Hearld, K. R., Feldman, S. S., & Hall, A. G. (2024). Developing a decision model to early predict ICU admission for COVID-19 patients: A machine learning approach. Intell Based Med, 9, 100136.
Al-Nafjan, A., Aljuhani, A., Alshebel, A., Alharbi, A., & Alshehri, A. (2025). Artificial intelligence in predictive healthcare: A systematic review. J Clin Med, 14(19), 6752.
Beaulieu-Jones, B., Finlayson, S. G., Chivers, C., Chen, I., McDermott, M., & Kandola, J. (2019). Trends and focus of machine learning applications for health research. JAMA Netw Open, 2(10).
Blanco, J., Ferreras, M., & Cosido, O. (2025). Predictive modeling of hospital emergency department demand using artificial intelligence: A systematic review. Int J Med Inform, 106215.
Bowling, A. (2014). Research methods in health: investigating health and health services. McGraw-Hill Education (UK).
Breiman, L. (2001). Random forests. Mach Learn, 45(1), 5-32.
Brusco, N. K., & Watts, J. J. (2015). Empirical evidence of recall bias for primary health care visits. BMC Health Serv Res, 15(1), 381.
Chandrasekaran, R., Sadiq, T. M., & Moustakas, E. (2025). Usage trends and data sharing practices of healthcare wearable devices among US adults: cross-sectional study. J Med Internet Res, 27.
Chen, R. J., Wang, J. J., Williamson, D. F. K., Chen, T. Y., Lipkova, J., & Lu, M. Y. (2023). Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat Biomed Eng, 7(6), 719-742.
de Ruijter, U. W., Kaplan, Z. L. R., & Bramer, W. M. (2022). Prediction Models for Future High-Need High-Cost Healthcare Use: a Systematic Review. J Gen Intern Med, 37, 1763-1770.
Delen, D. (2012). Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications.
Doupe, P., Faghmous, J., & Basu, S. (2019). Machine learning for health services researchers. Value Health, 22(7), 808-815.
Dwivedi, R., Dave, D., Naik, H., Singhal, S., Omer, R., & Patel, P. (2023). Explainable AI (XAI): Core ideas, techniques, and solutions. ACM Comput Surv, 55(9), 1-33.
Eckhardt, C. M., Madjarova, S. J., Williams, R. J., Ollivier, M., Karlsson, J., Pareek, A., & Nwachukwu, B. U. (2023). Unsupervised machine learning methods and emerging applications in healthcare. Knee Surg Sports Traumatol Arthrosc, 31(2), 376-381.
Ellis, L. A., Sarkies, M., Churruca, K., Dammery, G., Meulenbroeks, I., & Smith, C. L. (2022). The science of learning health systems: scoping review of empirical research. JMIR Med Inform, 10(2).
Feretzakis, G., Sakagianni, A., Kalles, D., Loupelis, E., Tzelves, L., & Panteris, V. (2022, 2022). Exploratory clustering for emergency department patients. In: Advances in Informatics, Management and Technology in Healthcare,
Finnegan, H., & Mountford, N. (2025). 25 years of electronic health record implementation processes: scoping review. J Med Internet Res, 27.
Fridgeirsson, E. A., Williams, R., Rijnbeek, P., Suchard, M. A., & Reps, J. M. (2024). Comparing penalization methods for linear models on large observational health data. J Am Med Inform Assoc, 31(7), 1514-1521.
Friedman, J. H., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. J Stat Softw, 33, 1-22.
Gopukumar, D., Ghoshal, A., & Zhao, H. (2022). Predicting readmission charges billed by hospitals: machine learning approach. JMIR Med Inform, 10(8).
Gordon, B., Barrett, J., Fennessy, C., Cake, C., Milward, A., & Irwin, C. (2021). Development of a data utility framework to support effective health data curation. BMJ Health Care Inform, 28(1).
Grant, R. W., McCloskey, J., Hatfield, M., Uratsu, C., Ralston, J. D., Bayliss, E., & Kennedy, C. J. (2020). Use of latent class analysis and k-means clustering to identify complex patient profiles. JAMA Netw Open, 3(12).
Hajek, A. M. (2013). Breaking down clinical silos in healthcare. Front Health Serv Manag, 29(4), 45-50.
Haneuse, S. (2016). Distinguishing Selection Bias and Confounding Bias in Comparative Effectiveness Research. Med Care.
Haripriya, G., Abinaya, K., Aarthi, N., & Praveen Kumar, P. (2021). Random forest algorithms in health care sectors: a review of applications. In.
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer.
Herawati, N., Nisa, K., Setiawan, E., Nusyirwan, N., & Tiryono, T. (2018). Regularized multiple regression methods to deal with severe multicollinearity. Int J Stat Appl, 8(4), 167-172.
Hlatky, M. A., Winkelmayer, W. C., & Setoguchi, S. (2013). Epidemiologic and statistical methods for comparative effectiveness research. Heart Fail Clin, 9(1), 29-36.
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: applications to nonorthogonal problems. Technometrics, 12(1), 69-82.
Howard, K. A., Anderson, W., Podichetty, J. T., Gould, R., Boyce, D., & Dasher, P. (2025). Wrangling Real-World Data: Optimizing Clinical Research Through Factor Selection with LASSO Regression. Int J Environ Res Public Health, 22(4), 464.
Huang, K., Altosaar, J., & Ranganath, R. (1904). ClinicalBERT: Modeling clinical notes and predicting hospital readmission.
Huang, Y., Talwar, A., Chatterjee, S., & Aparasu, R. R. (2021). Application of machine learning in predicting hospital readmissions: a scoping review of the literature. BMC Med Res Methodol, 21(1), 96.
Huston, P., & Naylor, C. D. (1996). Health services research: reporting on studies using secondary data sources. CMAJ, 155(12), 1697.
Hyer, J. M., Ejaz, A., Tsilimigras, D. I., Paredes, A. Z., Mehta, R., & Pawlik, T. M. (2019). Novel Machine Learning Approach to Identify Preoperative Risk Factors Associated With Super-Utilization of Medicare Expenditure Following Surgery. JAMA Surg, 154(11), 1014-1021.
Jones, L., Barnett, A., & Vagenas, D. (2025). Linear regression reporting practices for health researchers, a cross-sectional meta-research study. PLOS ONE, 20(3).
Kan, H. J., Kharrazi, H., Chang, H. Y., Bodycombe, D., Lemke, K., & Weiner, J. P. (2019). Exploring the use of machine learning for risk adjustment: A comparison of standard and penalized linear regression models in predicting health care costs in older adults. PLOS ONE, 14(3).
Kaushal, A., Altman, R., & Langlotz, C. (2020). Geographic Distribution of US Cohorts Used to Train Deep Learning Algorithms. JAMA, 324(12), 1212-1213.
Khare, S. R., & Vedel, I. (2019). Recall bias and reduction measures: an example in primary health care service utilization. Fam Pract, 36(5), 672-676.
Kolluri, J., Kotte, V. K., Phridviraj, M. S. B., & Razia, S. (2020, 2020). Reducing overfitting problem in machine learning using novel L1/4 regularization method. In: 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI),
Krumholz, H. M., Terry, S. F., & Waldstreicher, J. (2016). Data acquisition, curation, and use for a continuously learning health system. JAMA, 316(16), 1669-1670.
Kulkarni, S., Ambekar, S. S., & Hudnurkar, M. (2021). Predicting the inpatient hospital cost using a machine learning approach. Int J Innov Sci, 13(1), 87-104.
Liu, F., & Panagiotakos, D. (2022). Real-world data: a brief review of the methods, applications, challenges and opportunities. BMC Med Res Methodol, 22(1), 287.
Lohr, K. N., & Steinwachs, D. M. (2002). Health services research: An evolving definition of the field. Health Serv Res, 37(1), 15.
Lourenço, L., Weber, L., Garcia, L., Ramos, V., & Souza, J. (2024). Machine Learning Algorithms to Estimate Propensity Scores in Health Policy Evaluation: A Scoping Review. Int J Environ Res Public Health, 21(11), 1484.
Lu, J., Sattler, A., Wang, S., Khaki, A. R., Callahan, A., & Fleming, S. (2022). Considerations in the reliability and fairness audits of predictive models for advance care planning. Front Digit Health, 4, 943768.
Mackin, S., Major, V. J., Chunara, R., & Newton-Dame, R. (2025). Identifying and mitigating algorithmic bias in the safety net. NPJ Digit Med, 8(1), 335.
Martinez, R. G., & Van Dongen, D. M. (2023). Deep learning algorithms for the early detection of breast cancer: A comparative study with traditional machine learning. Inform Med Unlocked, 41, 101317.
Mavrogiorgou, A., Kiourtis, A., Kleftakis, S., Mavrogiorgos, K., Zafeiropoulos, N., & Kyriazis, D. (2022). A catalogue of machine learning algorithms for healthcare risk predictions. Sensors, 22(22), 8615.
McGinnis, J. M., Fineberg, H. V., & Dzau, V. J. (2021). Advancing the learning health system. N Engl J Med, 385(1), 1-5.
Menze, B. H., Kelm, B. M., Masuch, R., Himmelreich, U., Bachert, P., Petrich, W., & Hamprecht, F. A. (2009). A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinformatics, 10(1), 213.
Mihaylova, B., Briggs, A., O'Hagan, A., & Thompson, S. G. (2011). Review of statistical methods for analysing healthcare resources and costs. Health Econ, 20(8), 897-916.
Miner, G. D., Miner, L. A., Goldstein, M., Nisbet, R., Walton, N., & Bolding, P. (2014). Practical predictive analytics and decisioning systems for medicine: Informatics accuracy and cost-effectiveness for healthcare administration and delivery including medical research. Academic Press.
Miotto, R., Wang, F., Wang, S., Jiang, X., & Dudley, J. T. (2018). Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform, 19(6), 1236-1246.
Morris, Z. S., Wooding, S., & Grant, J. (2011). The answer is 17 years, what is the question: understanding time lags in translational research. J R Soc Med, 104(12), 510-520.
Munro, B. H. (2005). Statistical methods for health care research. Lippincott Williams & Wilkins.
Nabrawi, E., & Alanazi, A. (2023). Fraud detection in healthcare insurance claims using machine learning. Risks, 11(9), 160.
Natekin, A., & Knoll, A. (2013). Gradient boosting machines, a tutorial. Front Neurorobot, 7, 63623.
National Academies of Sciences, E., & Medicine. (1994). Health Services Research: Opportunities for an Expanding Field of Inquiry: An Interim Statement. National Academies Press (US).
Newman-Griffis, D. R., Hurwitz, M. B., McKernan, G. P., Houtrow, A. J., & Dicianno, B. E. (2022). A roadmap to reduce information inequities in disability with digital health and natural language processing. PLOS Digit Health, 1(11).
Ngiam, K. Y., & Khor, W. (2019). Big data and machine learning algorithms for health-care delivery. Lancet Oncol, 20(5).
Nohara, Y., Matsumoto, K., Soejima, H., & Nakashima, N. (2022). Explanation of machine learning models using shapley additive explanation and application for real data in hospital. Comput Methods Programs Biomed, 214, 106584.
Norori, N., Hu, Q., Aellen, F. M., Faraci, F. D., & Tzovara, A. (2021). Addressing bias in big data and AI for health care: A call for open science. Patterns (N Y), 2(10), 100347.
Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447-453.
Olaya, B., Moneta, M. V., Caballero, F. F., Tyrovolas, S., Bayes, I., Ayuso-Mateos, J. L., & Haro, J. M. (2017). Latent class analysis of multimorbidity patterns and associated outcomes in Spanish older adults: a prospective cohort study. BMC Geriatr, 17(1), 186.
Olson, D. L., & Delen, D. (2008). Advanced data mining techniques. Springer Science & Business Media.
Oner, N., Zengul, F. D., & Agirbas, I. (2024). Evaluation of the financial distress of hospitals through machine learning: An application of AI in healthcare industry. Intell Syst Account Finance Manag, 31(4).
Orhan, F., & Kurutkan, M. N. (2025). Predicting total healthcare demand using machine learning: separate and combined analysis of predisposing, enabling, and need factors. BMC Health Serv Res, 25, 366.
Panch, T., Szolovits, P., & Atun, R. (2018). Artificial intelligence, machine learning and health systems. J Glob Health, 8(2), 020303.
Panga, N. K. R. (2021). Financial fraud detection in healthcare using machine learning and deep learning techniques. Int J Manag Res Bus Strateg, 11(3), 46-66.
Park, J. H., Jo, H. S., Lee, S. H., Oh, S. W., & Na, M. G. (2022). A reliable intelligent diagnostic assistant for nuclear power plants using explainable artificial intelligence of GRU-AE, LightGBM and SHAP. Nucl Eng Technol, 54(4), 1271-1287.
Patra, B. G., Sharma, M. M., Vekaria, V., Adekkanattu, P., Patterson, O. V., & Glicksberg, B. (2021). Extracting social determinants of health from electronic health records using natural language processing: a systematic review. J Am Med Inform Assoc, 28(12), 2716-2727.
Pavlou, M., Ambler, G., Seaman, S., De Iorio, M., & Omar, R. Z. (2016). Review and evaluation of penalised regression methods for risk prediction in low-dimensional data with few events. Stat Med, 35(7), 1159-1177.
Petch, J., Di, S., & Nelson, W. (2022). Opening the black box: the promise and limitations of explainable machine learning in cardiology. Can J Cardiol, 38(2), 204-213.
Pfohl, S. R., Foryciarz, A., & Shah, N. H. (2021). An empirical characterization of fair machine learning for clinical risk prediction. J Biomed Inform, 113, 103621.
Pirracchio, R., Petersen, M. L., & van der Laan, M. (2015). Improving propensity score estimators' robustness to model misspecification using super learner. Am J Epidemiol, 181(2), 108-119.
Pittman, P. (2010). Health Services Research in 2020: data and methods needs for the future. Health Serv Res, 45(5 Pt 2), 1431.
Porto, B. M., & Fogliatto, F. S. (2024). Enhanced forecasting of emergency department patient arrivals using feature engineering approach and machine learning. BMC Med Inform Decis Mak, 24(1), 377.
Saini, S. S. (2025). Federated Learning for Privacy-Preserving Healthcare AI Models. Int J Unified Res Dev (IJURD), 1(1), 1-5.
Salerno, S., & Li, Y. (2023). High-Dimensional Survival Analysis: Methods and Applications. Annu Rev Stat Appl, 10(1), 25-49.
Sanchez, P., Voisey, J. P., Xia, T., Watson, H. I., O’Neil, A. Q., & Tsaftaris, S. A. (2022). Causal machine learning for healthcare and precision medicine. R Soc Open Sci, 9(8), 220450.
Sanchez-Morillo, D., Fernandez-Granero, M. A., & A, L. (2015). Detecting COPD exacerbations early using daily telemonitoring of symptoms and k-means clustering: a pilot study. Med Biol Eng Comput, 53(5), 441-451.
Sarijaloo, F., Park, J., Zhong, X., & Wokhlu, A. (2021). Predicting 90 day acute heart failure readmission and death using machine learning‐supported decision analysis. Clin Cardiol, 44(2), 230-237.
Sarkies, M. N., Bowles, K. A., Skinner, E. H., Mitchell, D., Haas, R., & Ho, M. (2015). Data collection methods in health services research. Appl Clin Inform, 6(1), 96-109.
Schmier, J. K., & Halpern, M. T. (2004). Patient recall and recall bias of health state and health status. Expert Rev Pharmacoecon Outcomes Res, 4(2), 159-163.
Shafik, W., & Abubakari, M. S. (2026). Regulatory Frameworks: HIPAA, GDPR, and Compliance in Federated Learning. The Convergence of Federated Learning and Healthcare 5.0 and Beyond: A New Era of Intelligent Health Systems, 99.
Sinaga, K. P., & Yang, M. S. (2020). Unsupervised K-means clustering algorithm. IEEE Access, 8, 80716-80727.
Spasic, I., & Nenadic, G. (2020). Clinical text data in machine learning: systematic review. JMIR Med Inform, 8(3).
Strutz, S., Liang, H., Carey, K., Bashiri, F., Jani, P., & Gilbert, E. (2025). Machine Learning for Predicting Critical Events Among Hospitalized Children. JAMA Netw Open, 8(5).
Sun, M., Oliwa, T., Peek, M. E., & Tung, E. L. (2022). Negative Patient Descriptors: Documenting Racial Bias In The Electronic Health Record: Study examines racial bias in the patient descriptors used in the electronic health record. Health Aff (Millwood), 41(2), 203-211.
Susanto, A. P., Lyell, D., Widyantoro, B., Berkovsky, S., & Magrabi, F. (2023). Effects of machine learning-based clinical decision support systems on decision-making, care delivery, and patient outcomes: a scoping review. J Am Med Inform Assoc, 30(12), 2050-2063.
Talbot, D., Diop, A., Chiu, Y., Sirois, C., & Spieker, A. J. (2025). Guidelines and Best Practices for the Use of Targeted Maximum Likelihood and Machine Learning When Estimating Causal Effects of Exposures on Time-To-Event Outcomes. Stat Med, 44(6).
Talwar, A., Lopez-Olivo, M. A., Huang, Y., Ying, L., & Aparasu, R. R. (2023). Performance of advanced machine learning algorithms over logistic regression in predicting hospital readmissions: A meta-analysis. Explor Res Clin Soc Pharm, 11, 100317.
Teng, Q., Liu, Z., Song, Y., Han, K., & Lu, Y. (2022). A survey on the interpretability of deep learning in medical diagnosis. Multimed Syst, 28(6), 2335-2355.
Tennant, P. W., Murray, E. J., Arnold, K. F., Berrie, L., Fox, M. P., & Gadd, S. C. (2021). Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. Int J Epidemiol, 50(2), 620-632.
Theodorou, B., Xiao, C., & Sun, J. (2023). Synthesize Extremely High-dimensional Longitudinal Electronic Health Records via Hierarchical Autoregressive Language Model. https://doi.org/10.21203/rs.3.rs-2644725/v1
Traino, K. A., Sharkey, C. M., Perez, M. N., Bakula, D. M., Roberts, C. M., Chaney, J. M., & Mullins, L. L. (2021). Health care utilization, transition readiness, and quality of life: A latent class analysis. J Pediatr Psychol, 46(2), 197-207.
Vorisek, C. N., Lehne, M., Klopfenstein, S. A. I., Mayer, P. J., Bartschke, A., Haese, T., & Thun, S. (2022). Fast healthcare interoperability resources (FHIR) for interoperability in health research: systematic review. JMIR Med Inform, 10(7).
Vota, F., Pediconi, F., & Liscio, A. (2025). Federated learning in healthcare: Addressing AI challenges and operational realities under the GDPR. J Data Prot Priv, 7(3), 235-251.
Vranas, K. C., Jopling, J. K., Sweeney, T. E., Ramsey, M. C., Milstein, A. S., & Slatore, C. G. (2017). Identifying distinct subgroups of ICU patients: a machine learning approach. Crit Care Med, 45(10), 1607-1615.
Wang, H., Robinson, R. D., Johnson, C., Zenarosa, N. R., Jayswal, R. D., Keithley, J., & Delaney, K. A. (2014). Using the LACE index to predict hospital readmissions in congestive heart failure patients. BMC Cardiovasc Disord, 14(1), 97.
Weissman, G. E., Hubbard, R. A., Ungar, L. H., Harhay, M. O., Greene, C. S., Himes, B. E., & Halpern, S. D. (2018). Inclusion of Unstructured Clinical Text Improves Early Prediction of Death or Prolonged ICU Stay. Crit Care Med, 46(7), 1125-1132.
Weller, B. E., Bowen, N. K., & Faubert, S. J. (2020). Latent class analysis: a guide to best practice. J Black Psychol, 46(4), 287-311.
World Health, O. (2021). Ethics and governance of artificial intelligence for health: WHO guidance. World Health Organization.
Yang, J., Soltan, A. A., Eyre, D. W., Yang, Y., & Clifton, D. A. (2023). An adversarial training framework for mitigating algorithmic biases in clinical machine learning. NPJ Digit Med, 6(1), 55.
Yang, S., Varghese, P., Stephenson, E., Tu, K., & Gronsbell, J. (2023). Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc, 30(2), 367-381.
Yu, M. Y., & Son, Y. J. (2024). Machine learning-based 30-day readmission prediction models for patients with heart failure: a systematic review. Eur J Cardiovasc Nurs, 23(7), 711-719.
Zhang, Z., & Hong, Y. (2017). Development of a novel score for the prediction of hospital mortality in patients with severe sepsis: the use of electronic healthcare records with LASSO regression. Oncotarget, 8(30), 49637.
Zhou, M., Thayer, W. M., & Bridges, J. F. (2018). Using latent class analysis to model preference heterogeneity in health: a systematic review. Pharmacoeconomics, 36(2), 175-187.
Pavlou, M., Ambler, G., Seaman, S., De Iorio, M., & Omar, R. Z. (2016). Review and evaluation of penalised regression methods for risk prediction in low-dimensional data with few events. Stat Med, 35(7), 1159-1177.
Petch, J., Di, S., & Nelson, W. (2022). Opening the black box: the promise and limitations of explainable machine learning in cardiology. Can J Cardiol, 38(2), 204-213.
Pfohl, S. R., Foryciarz, A., & Shah, N. H. (2021). An empirical characterization of fair machine learning for clinical risk prediction. J Biomed Inform, 113, 103621.
Pirracchio, R., Petersen, M. L., & van der Laan, M. (2015). Improving propensity score estimators' robustness to model misspecification using super learner. Am J Epidemiol, 181(2), 108-119.
Pittman, P. (2010). Health Services Research in 2020: data and methods needs for the future. Health Serv Res, 45(5 Pt 2), 1431.
Porto, B. M., & Fogliatto, F. S. (2024). Enhanced forecasting of emergency department patient arrivals using feature engineering approach and machine learning. BMC Med Inform Decis Mak, 24(1), 377.
Saini, S. S. (2025). Federated Learning for Privacy-Preserving Healthcare AI Models. Int J Unified Res Dev (IJURD), 1(1), 1-5.
Salerno, S., & Li, Y. (2023). High-Dimensional Survival Analysis: Methods and Applications. Annu Rev Stat Appl, 10(1), 25-49.
Sanchez, P., Voisey, J. P., Xia, T., Watson, H. I., O'Neil, A. Q., & Tsaftaris, S. A. (2022). Causal machine learning for healthcare and precision medicine. R Soc Open Sci, 9(8), 220450.
Sanchez-Morillo, D., Fernandez-Granero, M. A., & Jiménez, A. L. (2015). Detecting COPD exacerbations early using daily telemonitoring of symptoms and k-means clustering: a pilot study. Med Biol Eng Comput, 53(5), 441-451.
Sarijaloo, F., Park, J., Zhong, X., & Wokhlu, A. (2021). Predicting 90 day acute heart failure readmission and death using machine learning-supported decision analysis. Clin Cardiol, 44(2), 230-237.
Sarkies, M. N., Bowles, K. A., Skinner, E. H., Mitchell, D., Haas, R., & Ho, M. (2015). Data collection methods in health services research. Appl Clin Inform, 6(1), 96-109.
Schmier, J. K., & Halpern, M. T. (2004). Patient recall and recall bias of health state and health status. Expert Rev Pharmacoecon Outcomes Res, 4(2), 159-163.
Shafik, W., & Abubakari, M. S. (2026). Regulatory Frameworks: HIPAA, GDPR, and Compliance in Federated Learning. The Convergence of Federated Learning and Healthcare 5.0 and Beyond: A New Era of Intelligent Health Systems, 99.
Sinaga, K. P., & Yang, M. S. (2020). Unsupervised K-means clustering algorithm. IEEE Access, 8, 80716-80727.
Spasic, I., & Nenadic, G. (2020). Clinical text data in machine learning: systematic review. JMIR Med Inform, 8(3).
Strutz, S., Liang, H., Carey, K., Bashiri, F., Jani, P., & Gilbert, E. (2025). Machine Learning for Predicting Critical Events Among Hospitalized Children. JAMA Netw Open, 8(5).
Sun, M., Oliwa, T., Peek, M. E., & Tung, E. L. (2022). Negative patient descriptors: Documenting racial bias in the electronic health record. Health Aff (Millwood), 41(2), 203-211.
Susanto, A. P., Lyell, D., Widyantoro, B., Berkovsky, S., & Magrabi, F. (2023). Effects of machine learning-based clinical decision support systems on decision-making, care delivery, and patient outcomes: a scoping review. J Am Med Inform Assoc, 30(12), 2050-2063.
Talbot, D., Diop, A., Chiu, Y., Sirois, C., & Spieker, A. J. (2025). Guidelines and Best Practices for the Use of Targeted Maximum Likelihood and Machine Learning When Estimating Causal Effects of Exposures on Time-To-Event Outcomes. Stat Med, 44(6).
Talwar, A., Lopez-Olivo, M. A., Huang, Y., Ying, L., & Aparasu, R. R. (2023). Performance of advanced machine learning algorithms over logistic regression in predicting hospital readmissions: A meta-analysis. Explor Res Clin Soc Pharm, 11, 100317.
Teng, Q., Liu, Z., Song, Y., Han, K., & Lu, Y. (2022). A survey on the interpretability of deep learning in medical diagnosis. Multimed Syst, 28(6), 2335-2355.
Tennant, P. W., Murray, E. J., Arnold, K. F., Berrie, L., Fox, M. P., & Gadd, S. C. (2021). Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. Int J Epidemiol, 50(2), 620-632.
Theodorou, B., Xiao, C., & Sun, J. (2023). Synthesize Extremely High-dimensional Longitudinal Electronic Health Records via Hierarchical Autoregressive Language Model. https://doi.org/10.21203/rs.3.rs-2644725/v1
Traino, K. A., Sharkey, C. M., Perez, M. N., Bakula, D. M., Roberts, C. M., Chaney, J. M., & Mullins, L. L. (2021). Health care utilization, transition readiness, and quality of life: A latent class analysis. J Pediatr Psychol, 46(2), 197-207.
Vorisek, C. N., Lehne, M., Klopfenstein, S. A. I., Mayer, P. J., Bartschke, A., Haese, T., & Thun, S. (2022). Fast healthcare interoperability resources (FHIR) for interoperability in health research: systematic review. JMIR Med Inform, 10(7).
Vota, F., Pediconi, F., & Liscio, A. (2025). Federated learning in healthcare: Addressing AI challenges and operational realities under the GDPR. J Data Prot Priv, 7(3), 235-251.
Vranas, K. C., Jopling, J. K., Sweeney, T. E., Ramsey, M. C., Milstein, A. S., & Slatore, C. G. (2017). Identifying distinct subgroups of ICU patients: a machine learning approach. Crit Care Med, 45(10), 1607-1615.
Wang, H., Robinson, R. D., Johnson, C., Zenarosa, N. R., Jayswal, R. D., Keithley, J., & Delaney, K. A. (2014). Using the LACE index to predict hospital readmissions in congestive heart failure patients. BMC Cardiovasc Disord, 14(1), 97.
Weissman, G. E., Hubbard, R. A., Ungar, L. H., Harhay, M. O., Greene, C. S., Himes, B. E., & Halpern, S. D. (2018). Inclusion of Unstructured Clinical Text Improves Early Prediction of Death or Prolonged ICU Stay. Crit Care Med, 46(7), 1125-1132.
Weller, B. E., Bowen, N. K., & Faubert, S. J. (2020). Latent class analysis: a guide to best practice. J Black Psychol, 46(4), 287-311.
World Health Organization. (2021). Ethics and governance of artificial intelligence for health: WHO guidance. World Health Organization.
Yang, J., Soltan, A. A., Eyre, D. W., Yang, Y., & Clifton, D. A. (2023). An adversarial training framework for mitigating algorithmic biases in clinical machine learning. NPJ Digit Med, 6(1), 55.
Yang, S., Varghese, P., Stephenson, E., Tu, K., & Gronsbell, J. (2023). Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc, 30(2), 367-381.
Yu, M. Y., & Son, Y. J. (2024). Machine learning-based 30-day readmission prediction models for patients with heart failure: a systematic review. Eur J Cardiovasc Nurs, 23(7), 711-719.
Zhang, Z., & Hong, Y. (2017). Development of a novel score for the prediction of hospital mortality in patients with severe sepsis: the use of electronic healthcare records with LASSO regression. Oncotarget, 8(30), 49637.
Zhou, M., Thayer, W. M., & Bridges, J. F. (2018). Using latent class analysis to model preference heterogeneity in health: a systematic review. Pharmacoeconomics, 36(2), 175-187.