Feasibility study of multi-site split learning for privacy-preserving medical systems under data imbalance constraints in COVID-19, X-ray, and cholesterol dataset

Image credit: Anna Yoo Jeong Ha


More and more people and organizations are uploading content, data, and information online, and hospitals are no exception. Hospitals are now at the forefront of multi-site medical data sharing, which promises advances in how health records are shared and patients are diagnosed. Sharing medical data is essential in modern medical research. Yet, as with all data-sharing technology, the challenge is to balance improved treatment against the protection of patients’ personal information. This paper proposes a novel split learning algorithm, termed “multi-site split learning”, which enables secure transfer of medical data between multiple hospitals without exposing the personal data contained in patient records. It also explores how varying the number of end-systems and the data-imbalance ratio affects deep learning performance. A guideline for the optimal split learning configuration, one that preserves the privacy of patient data while maintaining performance, is given empirically. We argue for the benefits of our multi-site split learning algorithm, especially its privacy-preserving properties, using CT scans of COVID-19 patients, X-ray bone scans, and cholesterol level medical data.
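
To make the idea concrete, here is a minimal sketch of how split learning across several sites might look. It is not the paper's implementation: the abstract does not specify the network, the split point, or the training schedule, so everything below is an assumption for illustration. Each hypothetical `Site` keeps its raw records and its own first layer, and sends only the layer's activations to a shared `Server`, which holds the remaining layer and returns gradients. The sketch also assumes labels are sent to the server, uses synthetic data in place of medical records, and gives the three sites deliberately unequal sizes and class ratios to mimic data imbalance.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 8  # width of the activations sent to the server (assumed value)


def make_site_data(n, pos_ratio):
    """Synthetic stand-in for one hospital's records (not real data)."""
    y = (rng.random((n, 1)) < pos_ratio).astype(float)
    x = rng.normal(size=(n, 4)) + 1.5 * y  # positive cases are shifted
    return x, y


class Site:
    """Hypothetical hospital: raw data and the first layer never leave it."""

    def __init__(self, n, pos_ratio):
        self.x, self.y = make_site_data(n, pos_ratio)
        self.w = rng.normal(scale=0.1, size=(4, HIDDEN))

    def forward(self):
        # Only these activations are transmitted, not self.x.
        self.h = np.tanh(self.x @ self.w)
        return self.h

    def backward(self, grad_h, lr):
        # Finish backpropagation locally using the gradient from the server.
        grad_pre = grad_h * (1.0 - self.h ** 2)
        self.w -= lr * self.x.T @ grad_pre


class Server:
    """Hypothetical server: holds the shared output layer."""

    def __init__(self):
        self.v = rng.normal(scale=0.1, size=(HIDDEN, 1))

    def step(self, h, y, lr):
        p = 1.0 / (1.0 + np.exp(-(h @ self.v)))
        p = np.clip(p, 1e-9, 1 - 1e-9)
        loss = float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
        grad_logits = (p - y) / len(y)
        grad_h = grad_logits @ self.v.T  # gradient sent back to the site
        self.v -= lr * h.T @ grad_logits
        return loss, grad_h


# Three sites with imbalanced sizes and class ratios (illustrative values).
sites = [Site(200, 0.5), Site(50, 0.3), Site(10, 0.5)]
server = Server()
losses = []
for epoch in range(200):
    epoch_loss = []
    for site in sites:
        h = site.forward()
        loss, grad_h = server.step(h, site.y, lr=0.2)
        site.backward(grad_h, lr=0.2)
        epoch_loss.append(loss)
    losses.append(float(np.mean(epoch_loss)))

print(f"mean loss: first round {losses[0]:.3f}, last round {losses[-1]:.3f}")
```

Changing the number of entries in `sites`, or their sizes and positive-class ratios, gives a rough feel for the kind of experiment the paper describes, though any numbers from this toy setup say nothing about the reported results.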

Scientific Reports, 12
Anna Yoo Jeong Ha
Ph.D. Student in Computer Science

My research interests include adversarial machine learning and security in AI.