News Release

Multimodal learning-based prediction for nonalcoholic fatty liver disease

Peer-Reviewed Publication

Beijing Zhongke Journal Publishing Co. Ltd.

The AI system DeepFLDDiag for NAFLD prediction.

Researchers captured volunteers' facial image data using a 3D face instrument and collected medical metadata from physical examinations, extensive questionnaires, and other sources. With the proposed data analysis and machine learning method, researchers can improve NAFLD diagnosis using facial images and high-quality data, together with targeted analysis and management.

Credit: Beijing Zhongke Journal Publishing Co. Ltd.

Chronic liver diseases are major contributors to morbidity and mortality worldwide, with liver-related ailments responsible for more than 2 million deaths each year. Nonalcoholic fatty liver disease (NAFLD), also known as metabolic-associated fatty liver disease, is one of the most common chronic diseases and metabolic complications of obesity. As obesity rapidly increases, the prevalence of NAFLD is rising globally, ranging from approximately 30% in the general population to approximately 80% in morbidly obese individuals. NAFLD, a spectrum of liver abnormalities ranging from simple fatty liver to nonalcoholic steatohepatitis (NASH), is predicted to be the most common indication for liver transplantation by 2030. It is characterized by excessive fat accumulation in the liver and is a major risk factor for the development of NASH, liver fibrosis, and cirrhosis. Early diagnosis and treatment are critical for reducing associated complications and mortality.

Physicians have employed several techniques to detect NAFLD. Liver biopsy is regarded as the gold standard, yet it is invasive and costly. Some radiological techniques and ultrasonography are effective alternatives to liver biopsy, but access to them is limited in remote areas because of the high cost of instruments and tests. Noninvasive and inexpensive NAFLD diagnostic techniques are therefore highly promising.

Previous studies have proposed several alternative methods for detecting NAFLD. Leung et al. used a machine learning model to classify NAFLD based on human serum and stool. The Fibrosis-4 (FIB-4) index, the nonalcoholic fatty liver disease fibrosis score (NFS), and neck circumference have also been studied for the diagnosis of NAFLD. However, most of these methods focus on a single type of factor, which may be due to the lack of comprehensive data. Considering multiple perspectives could make NAFLD prediction more effective.
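
For readers unfamiliar with the scoring systems mentioned above, the widely used FIB-4 definition can be written as a short calculation: age multiplied by AST, divided by the product of the platelet count and the square root of ALT. The sketch below is illustrative only; the function name, parameter names, and patient values are hypothetical and do not come from the paper.

```python
import math


def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """Return the FIB-4 index: (age * AST) / (platelets * sqrt(ALT)).

    Units assumed here: age in years, AST and ALT in U/L,
    platelet count in 10^9 per litre.
    """
    if alt_u_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_1e9_per_l * math.sqrt(alt_u_per_l)
    )


# Hypothetical values, chosen only so the arithmetic is easy to follow:
# (50 * 40) / (200 * sqrt(25)) = 2000 / 1000
score = fib4(age_years=50, ast_u_per_l=40, alt_u_per_l=25,
             platelets_1e9_per_l=200)
print(round(score, 2))  # prints 2.0
```

A score like this draws on a handful of blood-test values only, which is the kind of single-perspective input the release contrasts with multimodal prediction.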

To predict NAFLD from multiple perspectives, researchers set out to create a comprehensive clinical database that includes questionnaires, physical examinations, laboratory tests (routine blood examination, urinalysis, and so on), and imaging examinations. The face is also a convenient window into the function of the internal organs. Facial images have been used as an important diagnostic tool in both traditional Chinese medicine and Western medicine. Studies have shown that human facial features can reflect developmental syndromes, biological age, and the degree of organ aging. Many studies have also demonstrated the auxiliary value of facial images in disease diagnosis, helping clinicians make disease judgments easily and conveniently, especially in traditional Chinese medicine. Recently, deep convolutional neural networks (CNNs), among the most efficient networks in computer vision, have been widely used for image-based diagnosis of conditions such as heart disease and small-bowel disease. CNN-based deep-learning algorithms have achieved near-human-level performance in disease classification and have even surpassed humans at detecting subtle details that humans cannot observe. Facial images, which can be acquired rapidly, noninvasively, and freely, may therefore provide valuable information for the screening and prediction of NAFLD. In this study, researchers aim to build comprehensive clinical data, including facial images, physical examinations, and so on. To the researchers' knowledge, no prior study has exploited the association between facial images and NAFLD.

In the paper, published in Machine Intelligence Research, an NAFLD diagnosis system is developed to identify NAFLD from multimodal input comprising facial images and metadata. First, researchers compile a comprehensive medical dataset, FLDData, by gathering physical tests, laboratory and imaging studies, questionnaires, and facial images from a pool of volunteers. Next, researchers use a joint indicator-based data analysis to quantitatively examine and identify the clinical metadata most relevant to NAFLD within the medical dataset. Based on the selected data, researchers propose a multimodal NAFLD prediction method, DeepFLD, which incorporates both facial images and metadata. Because of the intricate nature of NAFLD, it is difficult to extract effective features directly from facial images, so DeepFLD includes a medical constraint-based auxiliary task designed to extract valid image features. Compared with indicators selected using only the Pearson correlation coefficient, the indicators selected by the researchers improve the classification accuracy of NAFLD. The proposed DeepFLD model with multimodal input outperforms models that rely solely on metadata as input, and it demonstrated satisfactory performance when applied to previously unseen data. Encouragingly, it can also achieve competitive results using only facial images as input rather than metadata.
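
To make the Pearson-only baseline mentioned above concrete, the following minimal sketch ranks candidate indicators by the absolute value of their Pearson correlation with the NAFLD label. It illustrates only that baseline: the release does not specify how the joint indicator-based analysis works, so it is not reproduced here. The indicator names and the toy values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_by_pearson(indicators, labels):
    """Sort indicator names by |correlation with the label|, strongest first."""
    scores = {name: abs(pearson(values, labels))
              for name, values in indicators.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data for six hypothetical volunteers (1 = NAFLD, 0 = no NAFLD).
labels = [0, 0, 0, 1, 1, 1]
indicators = {
    "bmi": [21, 22, 23, 28, 30, 31],
    "waist_cm": [70, 75, 72, 90, 95, 92],
    "height_cm": [170, 165, 180, 172, 168, 175],
}

ranking = rank_by_pearson(indicators, labels)
print(ranking)
```

Ranking each indicator on its own in this way cannot capture the combined effect of several indicators, which is the gap the release says the joint indicator-based analysis is meant to address.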

In summary, the main contributions of the paper are as follows:

1) A comprehensive human clinical dataset is constructed by aggregating facial images, physical examination data, laboratory test data, imaging information, and questionnaires. Researchers employed a joint indicator-based data analysis method to quantitatively examine the key medical indicators associated with NAFLD.

2) Researchers propose an NAFLD prediction method, DeepFLD, with multimodal input and medical constraints that facilitates valid feature extraction from high-dimensional facial images. As the first method to introduce facial images into NAFLD prediction, DeepFLD with multimodal input outperforms other methods that use only metadata as input and shows satisfactory performance on other unseen datasets. Furthermore, DeepFLD can achieve results competitive with metadata-based prediction using only facial images as input, providing a viable route to a more robust and simpler noninvasive diagnosis of NAFLD.

3) Researchers analyze the NAFLD prediction results by exploring the facial characteristics of people with NAFLD. Among these characteristics, dark skin color and the presence of melasma can be supported by previous medical studies.

Section 2 reviews related works on NAFLD diagnosis and multimodal learning. For NAFLD diagnosis, liver biopsy is the gold standard but invasive and costly, while radiological techniques and existing alternative methods (e.g., metabolomics-based models, scoring systems like FIB-4 and NFS) have limitations such as high cost, limited accessibility, or insufficient accuracy for large-scale screening. In terms of multimodal learning, current fusion paradigms include early, late, and deep fusion, with deep fusion integrating high-level features and raw data to leverage complementary information from different modalities, as exemplified by the TransFG method that fuses image patches and blood indicator features via a vision transformer.

In Section 3, researchers present the proposed NAFLD diagnosis system, called DeepFLDDiag, which comprises three main components: data collection, data processing, and disease prediction. First, the dataset FLDData is constructed, encompassing the facial image data and medical indicators. Second, a quantitative data analysis method is employed to examine and identify the clinical information that exhibits the strongest correlation with NAFLD in the medical datasets. Finally, researchers design an NAFLD prediction method called DeepFLD. Three points are worth noting: 1) the collected data contains very comprehensive medical information about the volunteers, namely face data and a set of 480-dimensional indicators, including physical examination data and questionnaire data; 2) during the data analysis process, researchers not only account for the influence of individual factors on the data, but also employ a joint indicator approach to uncover the combined impact of multiple factors on the outcomes; and 3) in DeepFLD, researchers design an auxiliary task based on multimodal data fusion and medical constraints to extract valid image features. Sections 3.1–3.4 describe each component in detail.
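
The idea of combining multimodal fusion with an auxiliary task can be sketched in a few lines. The example below is not the DeepFLD architecture, which the release does not detail. It assumes, purely for illustration, that image features and metadata are fused by concatenation and that the auxiliary task asks the image branch to predict measured clinical indicators, with the auxiliary error added to the classification loss. All function names, the weight `aux_weight`, and the numbers are hypothetical.

```python
import math


def fuse(image_features, metadata_features):
    """Fuse two modalities by concatenating their feature vectors."""
    return list(image_features) + list(metadata_features)


def binary_cross_entropy(prob, label):
    """Classification loss for a predicted NAFLD probability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1 - eps)  # avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def mean_squared_error(predicted, target):
    """Auxiliary loss: how far image-based indicator estimates are off."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def total_loss(nafld_prob, nafld_label,
               predicted_indicators, measured_indicators, aux_weight=0.5):
    """Main classification loss plus a weighted auxiliary term (assumed form)."""
    main = binary_cross_entropy(nafld_prob, nafld_label)
    aux = mean_squared_error(predicted_indicators, measured_indicators)
    return main + aux_weight * aux


# Hypothetical example: 3 image features fused with 2 metadata features.
fused = fuse([0.2, 0.7, 0.1], [26.5, 88.0])
loss = total_loss(nafld_prob=0.5, nafld_label=1,
                  predicted_indicators=[1.0, 2.0],
                  measured_indicators=[1.0, 4.0])
print(len(fused), round(loss, 4))
```

Under this assumed design, the auxiliary term penalizes image features that say nothing about clinically relevant measurements, which is one plausible way a medical constraint could steer feature extraction.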

For the proposed NAFLD model, several questions arise: 1) How well does the model predict NAFLD with indicators as input, and does adding images improve prediction performance? 2) Does the model still predict well when transferred to other data? 3) What are the prediction results with images alone as input? 4) Can the prediction results be explained? To answer these questions, researchers performed a series of experiments. In Section 4, researchers compare the performance of the proposed DeepFLD method with multimodal input against models with metadata as input to answer Question 1) (see Section 4.3.1). To answer Question 2), researchers transfer the trained model to an unseen dataset collected in other years (see Section 4.3.3). To answer Question 3), they conduct experiments using only facial images as input (see Section 4.3.4). Moreover, ablation studies are conducted to validate 1) the significance of the clinical metadata selected by the joint indicator-based data analysis method and 2) the effectiveness of the proposed DeepFLD method with facial image input. Finally, researchers analyze the relationship between the face and NAFLD through visualization to answer Question 4) (see Section 4.5).

Section 5 concludes the paper. The paper presents an intelligent NAFLD diagnosis system, DeepFLDDiag, with a comprehensive clinical dataset, FLDData, and a novel NAFLD classification algorithm to investigate whether facial images contribute to the prediction of NAFLD. FLDData includes faces and 480 physical examination and lifestyle indicators for participants. Through joint indicator-based data analysis, researchers determine the eight most useful indicators for NAFLD classification. With multimodal input and medical constraints, the proposed DeepFLD method can facilitate the extraction of image features from high-dimensional facial images. With multimodal input, DeepFLD achieves better performance than with metadata alone, and acceptable performance on unseen data. Encouragingly, DeepFLD can achieve competitive results using only facial images as input rather than metadata, paving the way for a more robust and less invasive approach to NAFLD diagnosis.

Despite the promising results of the DeepFLD method in diagnosing NAFLD using facial images and metadata, there are inherent limitations to consider. Currently, the model is predominantly based on 2D facial images, which may not fully capture the complexity of facial structures and underlying health conditions. Integrating three-dimensional facial data could offer significant advantages by providing spatial linkages and potentially enhancing the identification of NAFLD. Future research may explore the use of three-dimensional facial data for the direct prediction of NAFLD, with the aim of further refining and advancing the diagnostic capabilities of the system.

See the article:   

Multimodal Learning-based Prediction for Nonalcoholic Fatty Liver Disease

http://doi.org/10.1007/s11633-024-1506-4

