publications
For the most up-to-date list, please see my Google Scholar profile. This page is generated by jekyll-scholar and updated occasionally.
2026
- Beyond Composite Indices: Comprehensive Social Determinants Improve Heart Failure Readmission Prediction. Journal of the American Heart Association, 2026
- Subtypes of newly diagnosed type 2 diabetes and risk of complications: analysis of electronic health records in the USA. Diabetologia, 2026
2025
- Predicting Atrial Fibrillation Ablation Outcomes: Machine Learning Model Development and Validation Using a Large Administrative Claims Database. JMIR Cardio, 2025
- Using large language models to address the bottleneck of georeferencing natural history collections. Nature Plants, 2025
- Beyond Random Splitting: Evaluating the Impact of Data Partitioning Strategies on Ventilator-Associated Pneumonia Prediction Using EHRs. In AMIA Annual Symposium Proceedings, 2025
- AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025
- Impact of skin tone and cupping on erythema and thermal imaging measurements. Scientific Reports, 2025
- Retrieval-augmented GUI Agents with Generative Guidelines. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
- Does Domain-Specific Retrieval Augmented Generation Help LLMs Answer Consumer Health Questions? In Machine Learning for Healthcare Conference, 2025
- Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration. In Second Conference on Language Modeling, 2025
- Impact of skin tone, environmental, and technical factors on thermal imaging. PLOS ONE, 2025
- TransFed: cross-domain feature alignment for semi-supervised federated transfer learning. Machine Learning, 2025
- The next stage of biodiversity informatics: community-driven synthesis and integration of biodiversity databases. BioScience, 2025
- A review on knowledge graphs for healthcare: Resources, applications, and promises. Journal of Biomedical Informatics, 2025
- A Large Language Model Analysis of Global Inequities in Precision Medicine Research on Diabetes. Annals of Epidemiology, 2025
- LLMs as Medical Safety Judges: Evaluating Alignment with Human Annotation in Patient-Facing QA. In Proceedings of the 24th Workshop on Biomedical Language Processing, 2025
- SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2025
- Unraveling Complex Temporal Patterns in EHRs via Robust Irregular Tensor Factorization. AMIA Summits on Translational Science Proceedings, 2025
- Evaluating Safety of Large Language Models for Patient-facing Medical Question Answering. In Proceedings of the 4th Machine Learning for Health Symposium, 2025
- LLMs-based Few-Shot Disease Predictions using EHR: A Novel Approach Combining Predictive Agent Reasoning and Critical Agent Instruction. In AMIA Annual Symposium Proceedings, 2025
2024
- EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
- BMRetriever: Tuning large language models as better biomedical text retrievers. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
- HypMix: Hyperbolic Representation Learning for Graphs with Mixed Hierarchical and Non-hierarchical Structures. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024
- TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024
- LLMSYN: Generating Synthetic Electronic Health Records Without Patient-Level Data. In Proceedings of the 9th Machine Learning for Healthcare Conference, 2024
- Knowledge-infused prompting: Assessing and advancing clinical text data generation with large language models. In Findings of the Association for Computational Linguistics: ACL 2024, 2024
- EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records. In ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024
- PromptLink: Leveraging Large Language Models for Cross-Source Biomedical Concept Linking. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
- From Basic to Extra Features: Hypergraph Transformer Pretrain-then-Finetuning for Balanced Clinical Predictions on EHR. In Proceedings of the fifth Conference on Health, Inference, and Learning, 27–28 Jun 2024
- A Flexible Generative Model for Heterogeneous Tabular EHR with Missing Modality. In The Twelfth International Conference on Learning Representations, 2024
- EMBA: Entity Matching using Multi-Task Learning of BERT with Attention-over-Attention. In Proceedings of the 27th International Conference on Extending Database Technology, 2024
2023
- PGB: A PubMed Graph Benchmark for Heterogeneous Network Representation Learning. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023
- An AdaBoost-based algorithm to detect hospital-acquired pressure injury in the presence of conflicting annotations. Computers in Biology and Medicine, 2023
- CONSchema: Schema matching with semantics and constraints. In European Conference on Advances in Databases and Information Systems, 2023
- Evaluating Natural Language Processing Packages for Predicting Hospital-Acquired Pressure Injuries From Clinical Notes. CIN: Computers, Informatics, Nursing, 2023
- A survey on knowledge graphs for healthcare: Resources, application progress, and promise. In ICML 3rd Workshop on Interpretable Machine Learning in Healthcare, 2023
- Weakly-supervised scientific document classification via retrieval-augmented multi-stage training. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
- Hypergraph Transformers for EHR-based Clinical Predictions. AMIA Summits on Translational Science Proceedings, 2023
2021
- CATAN: Chart-aware temporal attention network for adverse outcome prediction. In IEEE International Conference on Healthcare Informatics, 2021
- Examination of the accuracy of coding pressure injury amount, site, and stage in MIMIC-III. Applied Clinical Informatics, 2021
- Profiles of intra-day glucose in Type 2 Diabetes and their association with complications: An analysis of continuous glucose monitoring data. Diabetes Technology and Therapeutics, 2021
- Comparing the documented pressure injury in MIMIC-III: An "UpSet" visualization. American Nursing Informatics Association Annual Conference, 2021
- Cross-modal memory fusion network for multimodal sequential learning with missing values. In 43rd European Conference on Information Retrieval, 2021
- Privacy-preserving sequential pattern mining in distributed EHRs for predicting cardiovascular disease. In AMIA Informatics Summit, 2021
2020
- Pressure ulcer injury in unstructured clinical notes: Detection and interpretation. In AMIA Annual Symposium, Mar 2020
- Accelerated SGD for tensor decomposition of sparse count data. In ICDM Workshop on High Dimensional Data Mining, Mar 2020
- MMiDaS-AE: Multi-modal missing data aware stacked autoencoder for biomedical abstract screening. In Proceedings of the ACM Conference on Health, Inference, and Learning, Mar 2020
- Spatio-temporal tensor sketching via adaptive sampling. In Machine Learning and Knowledge Discovery in Databases, Mar 2020
- You sound like you watch action movies: Towards predicting movie preferences from conversational interactions. In WSDM Workshop on Conversational Systems for E-Commerce Recommendations and Search, Mar 2020
- Domain-guided task decomposition with self-training for detecting personal events in social media. In Proceedings of The Web Conference, Mar 2020
2019
- Privacy-preserving tensor factorization for collaborative health data analysis. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Nov 2019
- CP tensor decomposition with cannot-link intermode constraints. In SIAM International Conference on Data Mining, May 2019
- FuzzyGap: Sequential pattern mining for predicting chronic heart failure in clinical pathways. In AMIA Informatics Summit, Mar 2019
2018
- Phenotyping through semi-supervised tensor factorization (PSST). In AMIA Annual Symposium, Mar 2018
- Phenotype Instance Verification and Evaluation Tool (PIVET): A Scaled Phenotype Evidence Generation Framework Using Web-Based Medical Literature. Journal of Medical Internet Research, Mar 2018
- PIVETed-Granite: Computational phenotypes through constrained tensor factorization. In KDD Workshop on Machine Learning for Medicine and Healthcare, Mar 2018 (Best Student Paper)
2016
- Automated verification of phenotypes using PubMed. In Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, Mar 2016
- Phenotyping using structured collective matrix factorization of multi-source EHR data. arXiv:1609.04466 [stat.AP], Mar 2016
2015
- Uncovering medication usage patterns of patients with chronic fatigue syndrome via nonnegative tensor factorization. In AMIA Joint Summits on Translational Science, Mar 2015
2014
- Septic shock prediction for patients with missing data. ACM Transactions on Management Information Systems, Mar 2014
- Limestone: High-throughput candidate phenotype generation via tensor factorization. Journal of Biomedical Informatics, Mar 2014
2013
- DYNACARE: Dynamic cardiac arrest risk estimation. In International Conference on Artificial Intelligence and Statistics, Mar 2013
- Multivariate temporal symptomatic characterization of cardiac arrest. In 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Mar 2013
2012
- An imputation-enhanced algorithm for ICU mortality prediction. In Computing in Cardiology, Mar 2012
2005
- Using context-aware computing to reduce the perceived burden of interruptions from mobile devices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Mar 2005