What powers Livi

Data, knowledge & benchmarks

Every Livi agent is personalized with real-world data, grounded in trusted clinical and nutrition knowledge, and validated against rigorous, peer-reviewed benchmarks. Here's the full ecosystem.

Real-world data brings personalization into every conversation — streamed in live through our data partner.

★ Featured data partner

Powered by Centralive

A no-code, closed-loop research platform from UC Irvine's Institute for Future Health that unifies wearables and biosignals, context-aware ecological momentary assessments (EMAs) and patient-reported outcomes (PROs), just-in-time interventions, and passive life-logging, then streams them into Livi in real time.

Wearables & biosignals · Context-aware EMAs · PROs & self-reported outcomes · Just-in-time interventions · Life logging
Source: Centralive (IAI Systems) · UC Irvine Institute for Future Health

Trusted knowledge bases & graphs ground every answer in verified clinical, pharmacological and nutrition science.

Clinical terminologies & ontologies (4)

UMLS
Terminology
The NIH/NLM Unified Medical Language System — a metathesaurus that unifies 200+ biomedical vocabularies (including SNOMED CT, ICD, RxNorm and MeSH) under common concept identifiers.
Normalizes patient and clinician language to standardized clinical concepts so reasoning stays consistent and interoperable.
Bodenreider O. Nucleic Acids Research, 2004.
SNOMED CT
Terminology
A comprehensive, multilingual clinical terminology of ~350,000 concepts with defined relationships — the most widely used clinical reference terminology worldwide.
Encodes diagnoses, findings and procedures precisely for structured clinical reasoning.
SNOMED International.
RxNorm
Drug terminology
A normalized naming system for clinical drugs from the U.S. National Library of Medicine, linking brand and generic names across pharmacy systems.
Resolves medication names to a single standard for accurate lookups and interaction checks.
Nelson SJ, et al. JAMIA, 2011.
ICD-10/11
Classification
The World Health Organization's International Classification of Diseases (ICD-10 and ICD-11) — the global standard for coding diseases and health conditions.
Maps conditions to internationally recognized diagnostic codes.
World Health Organization (ICD-10 & ICD-11).
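To illustrate the normalization step these terminologies support, here is a minimal Python sketch. The synonym table, the `CONCEPT-...` identifiers and the `normalize_term` function are hypothetical placeholders for illustration only: they are not real UMLS, SNOMED CT, RxNorm or ICD codes, and this is not Livi's implementation. A production system would query the licensed terminology releases or services instead of a hard-coded table.

```python
import re
from typing import Optional

# Hypothetical synonym table: every surface form maps to one made-up concept ID.
# Real identifiers would come from UMLS, SNOMED CT, RxNorm or ICD releases.
SYNONYMS = {
    "heart attack": "CONCEPT-0001",
    "myocardial infarction": "CONCEPT-0001",
    "mi": "CONCEPT-0001",
    "high blood pressure": "CONCEPT-0002",
    "hypertension": "CONCEPT-0002",
}


def normalize_term(text: str) -> Optional[str]:
    """Return the placeholder concept ID for a phrase, or None if unknown."""
    # Lowercase, trim, and collapse runs of whitespace before the lookup.
    key = re.sub(r"\s+", " ", text.strip().lower())
    return SYNONYMS.get(key)
```

With this table, a patient's "heart attack" and a clinician's "myocardial infarction" resolve to the same placeholder concept, which is the property that keeps downstream reasoning consistent.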

Drugs & pharmacology (1)

DrugBank
Drugs
A richly annotated database pairing detailed drug data with comprehensive drug-target, pharmacology and drug–drug interaction information.
Lets agents surface medication details and flag potential interactions.
Wishart DS, et al. Nucleic Acids Research.

Food & nutrition (4)

Nutritionix
Nutrition
One of the largest verified nutrition databases, covering packaged foods, common foods and restaurant menu items with calories and macronutrients.
Powers dietary guidance, meal logging and nutrition Q&A.
Nutritionix — commercial nutrition database & API.
USDA FoodData Central
Nutrition
The USDA's integrated food and nutrient database, providing authoritative nutrient profiles for thousands of foods.
Supplies trusted nutrient values for dietary analysis and recommendations.
U.S. Department of Agriculture, Agricultural Research Service.
FoodKG
Knowledge graph
A unified food knowledge graph linking recipes, ingredients, nutrition and food ontologies to support food recommendation and question answering.
Enables food recommendations, ingredient substitutions and diet-aware answers.
Haussmann S, et al. ISWC, 2019.
FoodOn
Ontology
A harmonized food ontology describing foods and their sources, processing and properties to standardize food data across systems.
Gives food and diet reasoning a consistent semantic backbone.
Dooley DM, et al. npj Science of Food, 2018.

Biomedical knowledge graphs (2)

PrimeKG
Knowledge graph
A multimodal precision-medicine knowledge graph integrating 20 resources to describe 17,080 diseases with 4M+ relationships across genes, drugs, phenotypes and more.
Grounds reasoning about diseases, drugs and their biological relationships.
Chandak P, Huang K, Zitnik M. Scientific Data, 2023.
Hetionet
Knowledge graph
An integrative biomedical 'hetnet' of 47,000+ nodes and 2.2M+ relationships across genes, diseases, drugs, pathways and anatomy, built for drug-repurposing analysis.
Supports reasoning over connections between drugs, genes and diseases.
Himmelstein DS, et al. eLife, 2017.

Rigorous benchmarks validate agents before they reach a patient, including our own peer-reviewed evaluations.

Medical knowledge & QA (5)

MedQA (USMLE)
Medical QA
An open-domain medical QA benchmark built from US Medical Licensing Exam–style questions, testing clinical knowledge and reasoning.
Checks an agent's core medical knowledge.
Jin D, et al. Applied Sciences, 2021.
MedMCQA
Medical QA
A large-scale (194k) multiple-choice benchmark from Indian medical entrance exams (AIIMS & NEET PG), spanning 21 subjects and 2,400 topics.
Stress-tests breadth of medical knowledge across specialties.
Pal A, Umapathi LK, Sankarasubbu M. CHIL, 2022.
PubMedQA
Literature QA
A biomedical QA benchmark of research questions answered yes / no / maybe using the corresponding PubMed abstracts.
Tests evidence-grounded reasoning over the literature.
Jin Q, et al. EMNLP-IJCNLP, 2019.
MMLU — clinical topics
Knowledge
The clinical and biomedical subsets of the Massive Multitask Language Understanding benchmark (clinical knowledge, medical genetics, anatomy, professional medicine).
Measures general medical knowledge alongside reasoning.
Hendrycks D, et al. ICLR, 2021.
BioASQ
Biomedical QA
A long-running challenge for large-scale biomedical semantic indexing and question answering over the biomedical literature.
Benchmarks biomedical retrieval and answer quality.
Tsatsaronis G, et al. BMC Bioinformatics, 2015.
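Closed-form benchmarks such as the multiple-choice and yes / no / maybe sets above are typically scored by exact-match accuracy. The sketch below shows that calculation in Python. The `accuracy` function and the sample labels are illustrative assumptions, not the official scoring scripts of any benchmark listed here.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of items where the predicted label matches the gold label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty benchmark")
    # Compare labels case-insensitively and ignore surrounding whitespace.
    correct = sum(
        p.strip().lower() == g.strip().lower()
        for p, g in zip(predictions, gold)
    )
    return correct / len(gold)


# Made-up example: three of four yes/no/maybe answers match the key.
score = accuracy(["yes", "no", "maybe", "yes"], ["yes", "no", "no", "yes"])
print(score)  # 0.75
```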

Holistic & conversational evaluation (5)

MultiMedQA
Holistic suite
A composite benchmark combining several medical QA datasets (MedQA, MedMCQA, PubMedQA, MMLU clinical and consumer questions) with a human-evaluation framework for factuality, harm and bias.
Provides a holistic view of medical accuracy and safety.
Singhal K, et al. Nature, 2023.
OpenAI HealthBench
Conversational
An open-source benchmark of 5,000 realistic, multi-turn health conversations graded with rubrics authored by 262 physicians across 60 countries.
Measures agent accuracy, safety and communication against a physician-defined standard.
Arora RK, et al. OpenAI, 2025.
Foundation Metrics
Ours · Metrics
A unified set of evaluation metrics for healthcare conversational AI — spanning accuracy, trustworthiness, empathy and user-centered quality — developed by our team.
Defines the dimensions Livi scores agents on.
Abbasian M, et al. npj Digital Medicine, 2024.
Livi · openCHA
Ours · Clinical
Our clinical-agent benchmarks built on the open-source openCHA framework, including Type 2 Diabetes management.
On T2D management, openCHA scored 92.1% accuracy versus 51.8% for GPT-4. openCHA is the validated baseline Livi builds on.
Abbasian M, et al. JAMIA Open, 2025.
RD Exam (Nutrition)
Ours · Nutrition eval
A nutrition-domain benchmark that evaluates leading LLMs on 1,050 Registered Dietitian (RD) licensing-exam questions, measuring accuracy and consistency across prompt-engineering and knowledge-retrieval techniques.
Validates nutrition and diet agents for accuracy and consistency before they reach users.
Azimi I, et al. Scientific Reports, 2025.
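Accuracy and consistency measure different things: a model can answer most questions correctly in any single run yet change its answers between runs. The Python sketch below separates the two. It assumes a simple definition of consistency (the share of questions answered identically in every repeated run), which is an illustrative choice and not necessarily the statistic used in the published evaluations. The function names and sample answers are hypothetical.

```python
from typing import List, Sequence


def per_run_accuracy(runs: Sequence[Sequence[str]], gold: Sequence[str]) -> List[float]:
    """Accuracy of each run: share of questions whose answer matches the key."""
    if not gold:
        raise ValueError("gold answer key is empty")
    scores = []
    for run in runs:
        if len(run) != len(gold):
            raise ValueError("each run must answer every question")
        scores.append(sum(a == g for a, g in zip(run, gold)) / len(gold))
    return scores


def consistency(runs: Sequence[Sequence[str]]) -> float:
    """Share of questions that received the same answer in every run."""
    if not runs or not runs[0]:
        raise ValueError("need at least one non-empty run")
    # Regroup the answers so each tuple holds one question's answers across runs.
    per_question = list(zip(*runs))
    stable = sum(len(set(answers)) == 1 for answers in per_question)
    return stable / len(per_question)


# Made-up example: three repeated runs over a three-question exam.
runs = [["A", "B", "C"], ["A", "B", "D"], ["A", "C", "D"]]
gold = ["A", "B", "D"]
print(per_run_accuracy(runs, gold))
print(consistency(runs))
```

In this made-up example only the first question gets the same answer in all three runs, so consistency is one in three even though the second run is fully correct.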