AI-Powered Affiliation Insights: LLM-Based Bibliometric Study of European Medical Informatics Conferences
This repository contains detailed bibliometric analyses of MIE Conferences to be published at MIE 2025.
In this repository, we took the MIE conference dataset and transformed the affiliation information into a structured form through a pipeline built with Langflow (Fig. 1) using the gpt-3.5-turbo-0125 model. A prompt instructed the model to return its output as JSON, which we then analyzed.
Fig. 1. The Langflow pipeline
All output is generated using the Python programming language and is available here.
This section provides a quantitative overview of the analyzed dataset, including the total number of publications, authors, and citations, the average citations per publication, and the diversity of contributing countries and institutions. It establishes the scope and scale of the research landscape covered by the bibliometric study.
| Index | Value |
|---|---|
| Total Publications | 4606 |
| Total Authors | 11308 |
| Total Citations | 6191 |
| Average Citations | 1.34 |
| Total Countries | 95 |
| Total Universities | 352 |
| Total Unclean Universities | 1553 |
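As a quick consistency check on the table above, the average is the total number of citations divided by the total number of publications. The sketch below recomputes it from the two totals; the function name is illustrative and is not taken from the repository's code.

```python
def average_citations(total_citations: int, total_publications: int) -> float:
    """Return mean citations per publication (0.0 for an empty corpus)."""
    if total_publications == 0:
        return 0.0
    return total_citations / total_publications


# Values copied from the summary table.
avg = average_citations(total_citations=6191, total_publications=4606)
print(f"{avg:.2f}")  # prints 1.34
```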
Top 10 Most Cited Publications
This part highlights the most influential articles by citation count, showcasing key contributions that have had significant impact within the field.
Annual and Cumulative Publication Trends
This visualization tracks how the number of publications has changed over time, both annually and cumulatively, revealing patterns of growth and periods of increased research activity.
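The annual and cumulative series behind such a chart can be derived from a plain list of publication years. This is a minimal sketch under that assumption; the input format and the function name are hypothetical, and years without any publication are simply absent from the result.

```python
from collections import Counter
from itertools import accumulate


def publication_trends(years):
    """Return (sorted_years, annual_counts, cumulative_counts)."""
    per_year = Counter(years)
    ordered = sorted(per_year)
    annual = [per_year[y] for y in ordered]
    cumulative = list(accumulate(annual))  # running total over the sorted years
    return ordered, annual, cumulative


# Toy input: one entry per publication (made-up years, not the MIE data).
years, annual, cumulative = publication_trends([1996, 1996, 1997, 1999, 1999, 1999])
print(years)       # [1996, 1997, 1999]
print(annual)      # [2, 1, 3]
print(cumulative)  # [2, 3, 6]
```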
Articles with No Citations vs At Least One Citation by Year
This chart compares the number of uncited articles to those with at least one citation each year, offering insight into the visibility and influence of conference outputs over time.
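One way to obtain the two series for this comparison is to bucket each article by year and by whether it has at least one citation. The sketch below assumes each record is a dictionary with `year` and `citations` keys; those key names are hypothetical and may differ from the actual dataset.

```python
from collections import defaultdict


def citation_status_by_year(records):
    """Count uncited vs. cited (>= 1 citation) articles for each year."""
    counts = defaultdict(lambda: {"uncited": 0, "cited": 0})
    for rec in records:
        status = "cited" if rec["citations"] >= 1 else "uncited"
        counts[rec["year"]][status] += 1
    # Return a plain dict ordered by year for easy plotting.
    return {year: counts[year] for year in sorted(counts)}


# Toy records with hypothetical keys; not the real dataset.
sample = [
    {"year": 2020, "citations": 0},
    {"year": 2020, "citations": 3},
    {"year": 2021, "citations": 0},
]
result = citation_status_by_year(sample)
print(result)
```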
Trends in Citation Patterns and Future Predictions
This section analyzes how citations accumulate annually and cumulatively, and may include projections to anticipate future citation trends, helping to assess the evolving impact of the field.
Top Authors by Articles and Citations
This section lists the authors with the most articles and the most citations, identifying key contributors and research leaders whose work has shaped the field's development.
This section examines the geographical distribution of research output and impact, showing which countries contribute most to publications and citations, and how their roles have evolved over time. It also visualizes collaboration patterns and research productivity through various charts and maps.
Percentage of Annual Publications by Top 10 Countries
This visualization displays the share of annual publications from the leading countries, illustrating shifts in research leadership and international engagement.
Bubble chart to visualize the top 10 countries
These bubble charts provide a comparative, visual representation of the top publishing countries, making it easy to spot dominant players and emerging contributors.
Citation per Article Index by Country
This section compares countries based on the average citations per article, highlighting differences in research impact and influence.
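A citations-per-article index of this kind can be sketched as total citations divided by article count for each country. The example assumes full counting (an article with authors from several countries is credited once to each of them) and hypothetical `countries` and `citations` keys; the repository's own counting scheme may differ.

```python
from collections import defaultdict


def citations_per_article(records):
    """Return {country: total citations / number of articles}."""
    articles = defaultdict(int)
    citations = defaultdict(int)
    for rec in records:
        # Deduplicate so several authors from one country count the paper once.
        for country in set(rec["countries"]):
            articles[country] += 1
            citations[country] += rec["citations"]
    return {c: citations[c] / articles[c] for c in articles}


# Toy records (made-up values, hypothetical keys).
sample = [
    {"countries": ["Germany", "France"], "citations": 4},
    {"countries": ["Germany"], "citations": 2},
    {"countries": ["France", "France"], "citations": 0},
]
index = citations_per_article(sample)
print(index["Germany"], index["France"])  # 3.0 2.0
```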
Heatmap of Top 10 Country Co-occurrence
The heatmap illustrates collaboration intensity among the top countries, revealing international research networks and partnerships.
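The matrix behind a co-occurrence heatmap can be built by counting, for every paper, each unordered pair of distinct countries among its affiliations. The following is a sketch of that idea with hypothetical function names and toy input, not the repository's actual implementation.

```python
from collections import Counter
from itertools import combinations


def country_cooccurrence(papers):
    """Count how many papers each unordered pair of countries shares."""
    pairs = Counter()
    for countries in papers:
        # sorted(set(...)) yields one canonical (a, b) key per pair per paper.
        for a, b in combinations(sorted(set(countries)), 2):
            pairs[(a, b)] += 1
    return pairs


def to_matrix(pairs, labels):
    """Expand pair counts into a symmetric matrix ordered by `labels`."""
    position = {name: i for i, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for (a, b), n in pairs.items():
        if a in position and b in position:
            matrix[position[a]][position[b]] = n
            matrix[position[b]][position[a]] = n
    return matrix


# Toy input: one list of affiliation countries per paper (made-up data).
papers = [["Germany", "France"], ["Germany", "France", "Norway"], ["Norway"]]
pairs = country_cooccurrence(papers)
matrix = to_matrix(pairs, ["France", "Germany", "Norway"])
print(matrix)  # [[0, 2, 1], [2, 0, 1], [1, 1, 0]]
```

The resulting nested list can be passed to any plotting library's heatmap function; restricting `labels` to the top 10 countries gives the layout described above.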
Number of Articles geomap
This map visualizes the global distribution of published articles, offering a spatial perspective on research activity.
Country Collaboration
This network analysis file and visualization show how often countries collaborate, mapping the structure of international research cooperation.
This section presents data on research output and citation impact at the institutional level, allowing comparison of the most productive and influential institutes in the field.
Here, the focus narrows to universities, showing their publication and citation metrics, as well as their collaboration networks, often visualized using network analysis tools like VOSviewer.
This section analyzes the most frequently used keywords in publications, revealing major research topics, emerging trends, and thematic evolution over time. Network visualizations further illustrate how topics are interconnected within the field.
After processing the MIE dataset with the LLM, a new structure called structural_affiliations was added to the original dataset; it contains the fields shown below. The final dataset can be found here.
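Records carrying this structure can be consumed with the standard `json` module. The sketch below tallies non-empty `country` values for a single record; the field names match the sample that follows, while the record content, the top-level layout of the file, and the helper name `countries_in` are assumptions made for illustration.

```python
import json
from collections import Counter

# A hypothetical record; only the "structural_affiliations" key and its
# field names come from the dataset description, the values are made up.
raw = '''
{
  "structural_affiliations": [
    {"country": "Norway", "institute": "", "department": "", "university": "",
     "city": "", "postalcode": "", "email": "", "Status": "", "universityf": ""},
    {"country": "", "institute": "", "department": "", "university": "",
     "city": "", "postalcode": "", "email": "", "Status": "", "universityf": ""}
  ]
}
'''


def countries_in(record):
    """Collect non-empty country values from one record's affiliations."""
    return [
        aff["country"].strip()
        for aff in record.get("structural_affiliations", [])
        if aff.get("country", "").strip()
    ]


record = json.loads(raw)
print(Counter(countries_in(record)))  # Counter({'Norway': 1})
```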
structural_affiliations fields sample:
"structural_affiliations": [
{
"country": "",
"institute": "",
"department": "",
"university": "",
"city": "",
"postalcode": "",
"email": "",
"Status": "",
"universityf": ""
}
]
If you use this article or the dataset in a scientific publication, we would appreciate a citation of the following paper:
Biblatex entry:
@article{bitaraf-2025,
author = {Bitaraf, Ehsan and Jafarpour, Maryam},
journal = {Studies in health technology and informatics},
month = {5},
title = {{AI-Powered Affiliation Insights: LLM-Based Bibliometric Study of European Medical Informatics Conferences}},
year = {2025},
doi = {10.3233/shti250474},
url = {https://doi.org/10.3233/shti250474},
}
Please see our contributing guidelines for more details on how to get involved.
This repository is available under the CC0-1.0 license.




















