Publications

You can also find my articles on my Google Scholar profile.

Journal Articles


KG4NH: a comprehensive knowledge graph for question answering in dietary nutrition and human health

Published in IEEE Journal of Biomedical and Health Informatics, 2023

This study constructs a comprehensive knowledge graph on nutrition and human health by extracting triples from vast literature sources. A query-based question-answering system is developed to address three types of queries over this graph. The proposed model outperforms state-of-the-art methods in nutrition-disease relation extraction, achieving a precision of 0.92, recall of 0.81, and an F1 score of 0.86. The question-answering system attains an accuracy of 0.68 and an F1 score of 0.61. Five experiments validate the knowledge graph’s data structure, demonstrating its potential for diet recommendations, patient care, and clinical decision-making.

Recommended citation: Fu, C., Pan, X., Wu, J., Cai, J., Huang, Z., van Harmelen, F., ... & He, T. (2023). KG4NH: a comprehensive knowledge graph for question answering in dietary nutrition and human health. IEEE Journal of Biomedical and Health Informatics.
Download Paper

A Semantic Web Technology Index

Published in Scientific Reports, 2022

This paper introduces a Semantic Web (SW) technology index to standardize development and evaluate the quality of SW applications across domains like medicine, finance, and geology. While SW technology has a general architecture, it lacks concrete guidelines for assessment. The proposed index consists of 10 criteria that quantify quality as a score, with detailed explanations of each criterion. Validation is conducted through case studies, demonstrating the index’s effectiveness. The study concludes that this index provides a valuable framework for guiding and assessing SW technology development, ensuring well-designed and high-quality implementations.

Recommended citation: Lan, G., Liu, T., Wang, X., Pan, X., & Huang, Z. (2022). A semantic web technology index. Scientific Reports, 12(1), 3672.
Download Paper

Exploring the microbiota-gut-brain axis for mental disorders with knowledge graphs

Published in Journal of Artificial Intelligence for Medical Sciences, 2020

This study introduces MiKG, a knowledge graph designed to explore the microbiota-gut-brain (MGB) axis by integrating research on gut microbiota, neurotransmitters, and mental disorders. While these relationships have been widely studied, existing findings remain fragmented. MiKG systematically organizes decentralized research, enabling the identification of potential associations. It connects to biomedical ontologies such as UMLS, MeSH, KEGG, and SNOMED CT and is extendable for future integrations. MiKG provides a structured framework to investigate the influence of gut microbiota on mental health, offering a valuable tool for uncovering neurotransmitter-mediated pathways in brain-related diseases.

Recommended citation: Liu, T., Pan, X., Wang, X., Feenstra, K. A., Heringa, J., & Huang, Z. (2021). Exploring the microbiota-gut-brain axis for mental disorders with knowledge graphs. Journal of Artificial Intelligence for Medical Sciences, 1(3), 30-42.
Download Paper

Predicting the relationships between gut microbiota and mental disorders with knowledge graphs

Published in Health Information Science and Systems, 2020

This study constructs MiKG4MD, a knowledge graph that systematically organizes research on gut microbiota, neurotransmitters, and mental disorders. While most studies examine these relationships separately, MiKG4MD integrates dispersed findings into a structured knowledge base to identify and predict potential links. The graph is extendable, allowing integration with ontologies such as UMLS, MeSH, and KEGG. Performance is demonstrated using three SPARQL query test cases, showing that MiKG4MD effectively predicts gut microbiota-mental disorder relationships. This work highlights the importance of structured knowledge representation in advancing research on the gut-brain axis.

Recommended citation: Liu, T., Pan, X., Wang, X., Feenstra, K. A., Heringa, J., & Huang, Z. (2021). Predicting the relationships between gut microbiota and mental disorders with knowledge graphs. Health Information Science and Systems, 9, 1-9.
Download Paper

Conference Papers


A RAG Approach for Generating Competency Questions in Ontology Engineering

Published in 18th International Conference on Metadata and Semantics Research (MTSR2024), Athens, 19-22 November, 2024

Competency Question (CQ) formulation is essential in ontology development but often demands significant effort from domain experts. This study introduces a Retrieval-Augmented Generation (RAG) approach that leverages Large Language Models (LLMs) to automate CQ generation using scientific papers as the domain knowledge base. The research examines how varying the number of input papers and adjusting the LLM’s temperature settings affect performance. Experiments conducted with GPT-4 across two domain ontology tasks reveal that incorporating relevant domain knowledge into the RAG framework enhances CQ generation compared to zero-shot prompting. Evaluation metrics, including precision and consistency, indicate that this method effectively reduces the reliance on manual CQ crafting by domain experts.

Recommended citation: Pan, X., van Ossenbruggen, J., de Boer, V., Huang, Z. (2025). A RAG Approach for Generating Competency Questions in Ontology Engineering. In: Sfakakis, M., Garoufallou, E., Damigos, M., Salaba, A., Papatheodorou, C. (eds) Metadata and Semantic Research. MTSR 2024. Communications in Computer and Information Science, vol 2331. Springer, Cham. https://doi.org/10.1007/978-3-031-81974-2_6
Download Paper | Download Slides

Column Vocabulary Association (CVA): Semantic Interpretation of Dataless Tables

Published in 2024 Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, SemTab 2024 - Baltimore, United States, 11-15 November, 2024

This study explores the Column Vocabulary Association (CVA) task in semantic table interpretation (STI) using only metadata. It evaluates several approaches, including Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) and a SentenceBERT-based similarity method. In a zero-shot setting, LLMs perform well at lower temperatures, achieving 100% accuracy on the challenge test set. However, traditional methods outperform some LLMs when metadata and glossaries are closely related. Initial results on the full dataset suggest an accuracy of 70%, indicating potential test set discrepancies that require further analysis.

Recommended citation: Martorana, M., Pan, X., Kruit, B., Kuhn, T., & van Ossenbruggen, J. (2024). Column Vocabulary Association (CVA): Semantic Interpretation of Dataless Tables. In 2024 Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, SemTab 2024 (pp. 27-42). CEUR-WS.org.
Download Paper

Closed-Source vs Open-Source RAGs: lessons learned from the SemTab challenge

Published in Retrieval-Augmented Generation Enabled by Knowledge Graphs Workshop at ISWC2024 - Baltimore, United States, 11-15 November, 2024

This study explores the use of Retrieval-Augmented Generation (RAG) systems for classifying column headers in tabular data, as part of the SemTab24 challenge. Seven Large Language Models (LLMs), including GPT models and open-source alternatives, were evaluated with different temperature settings. GPT models used OpenAI’s RAG system, while open-source models leveraged LlamaIndex. Findings highlight challenges in vectorizing data and the role of assistant instructions. Results indicate that input data significantly impacts RAG effectiveness, suggesting the need for tailored approaches for optimal performance. This work provides insights into the growing role of RAG in enhancing LLM capabilities.

Recommended citation: Martorana, M., Pan, X., Kruit, B., Kuhn, T., & van Ossenbruggen, J. (2024). Closed-Source vs Open-Source RAGs: lessons learned from the SemTab challenge.
Download Paper

Enhancing Scholarly Paper Recommendation by Modelling Diversity of Research Interests

Published in 16th Asian Conference on Intelligent Information and Database Systems (ACIIDS2024), Ras Al Khaimah, UAE, 15-18 April, 2024

This study enhances scholarly paper recommendation by introducing a feature that captures the diversity of a researcher’s interests based on their past publications. Unlike existing weighting schemes that prioritize recent works, this approach considers content-wise relationships among papers. The feature is integrated into two weighting schemes and tested using Word2Vec text representations. Experiments on a dataset of 50 researchers show that while accuracy varies with parameters, optimal settings improve performance, as measured by NDCG@10 and P@10, compared to existing methods. This highlights the potential of diversity-aware interest modeling in scholarly recommender systems.

Recommended citation: Pan, X., Wang, S., Liu, T., van Ossenbruggen, J., de Boer, V., Huang, Z. (2024). Enhancing Scholarly Paper Recommendation by Modelling Diversity of Research Interests. In: Nguyen, N.T., et al. Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2024. Communications in Computer and Information Science, vol 2145. Springer, Singapore. https://doi.org/10.1007/978-981-97-5934-7_16
Download Paper | Download Slides