Xplore
Evaluate

Know exactly how accurate your research agents are.

Research agents retrieve, synthesize, and cite. Evaluation measures source quality, citation accuracy, synthesis completeness, and whether conclusions follow from evidence — on your real tasks.

Trust the sources behind every answer.

Are retrieved sources authoritative, current, and relevant? Each source is scored against ground-truth sets from your domain. You see which sources the agent chose and which it missed.
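As a rough illustration of scoring against a ground-truth set, the sketch below treats both the agent's retrieved sources and the ground-truth set as sets of identifiers and reports which sources were chosen and which were missed. The helper name `score_sources`, the source IDs, and the choice of precision and recall as metrics are assumptions made for this example, not the product's documented method.

```python
def score_sources(retrieved, ground_truth):
    """Compare retrieved source IDs with a ground-truth set (hypothetical helper)."""
    retrieved, ground_truth = set(retrieved), set(ground_truth)
    hits = retrieved & ground_truth
    return {
        # Share of retrieved sources that appear in the ground-truth set.
        "precision": len(hits) / len(retrieved) if retrieved else 0.0,
        # Share of ground-truth sources the agent actually retrieved.
        "recall": len(hits) / len(ground_truth) if ground_truth else 0.0,
        "chosen": sorted(retrieved),
        "missed": sorted(ground_truth - retrieved),
    }


# Illustrative IDs only, not real data.
report = score_sources(
    retrieved=["trial-a", "review-b", "blog-x"],
    ground_truth=["trial-a", "review-b", "guideline-c", "label-d"],
)
print(report["missed"])
```

Keeping the missed sources as an explicit list, rather than only a recall number, is what lets a reviewer see the gaps directly.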

Eval · pharma-research-agent
Source quality 0.91
Citation accuracy 0.88
Synthesis 0.74
Reasoning 0.82
Contradiction handling 0.69
Weighted 0.81
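
The weighted figure aggregates the five axis scores. The page does not state the aggregation or the weights, so the sketch below assumes a weighted arithmetic mean with equal weights. Under that assumption the scores above round to 0.81. The function `weighted_score` and the weight values are hypothetical.

```python
def weighted_score(scores, weights):
    """Weighted arithmetic mean of per-axis scores (assumed aggregation)."""
    total_weight = sum(weights[axis] for axis in scores)
    return sum(scores[axis] * weights[axis] for axis in scores) / total_weight


scores = {
    "source_quality": 0.91,
    "citation_accuracy": 0.88,
    "synthesis": 0.74,
    "reasoning": 0.82,
    "contradiction_handling": 0.69,
}
# Equal weights are an assumption; the actual weights are not given in the source.
weights = {axis: 1.0 for axis in scores}

overall = round(weighted_score(scores, weights), 2)
print(overall)
```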

Every factual claim traced to a source.

Citation evaluators check that every claim in the output has a traceable source, flagging hallucinated facts and unsupported conclusions. Attribution completeness and correctness are scored independently.
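
To make the two citation scores concrete, here is a minimal sketch that assumes completeness is the share of claims carrying a citation and correctness is the share of cited claims whose source actually supports them. The `Claim` type, the `supported` flag, and the sample claims are invented for illustration; how support is judged in practice is not described on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    text: str
    cited_source: Optional[str]  # None means the claim has no citation
    supported: bool = False      # assumed to be set by a reviewer or checker


def citation_scores(claims):
    """Return (completeness, correctness) as two independent ratios."""
    cited = [c for c in claims if c.cited_source is not None]
    completeness = len(cited) / len(claims) if claims else 0.0
    correctness = sum(c.supported for c in cited) / len(cited) if cited else 0.0
    return completeness, correctness


# Made-up claims, for illustration only.
claims = [
    Claim("Drug A reduced events in the trial.", "trial-a", supported=True),
    Claim("Drug A is approved for adults.", "label-d", supported=False),
    Claim("Drug A outsells Drug B.", None),
    Claim("The trial enrolled adults only.", "trial-a", supported=True),
]
completeness, correctness = citation_scores(claims)
```

Scoring the two separately means an output that cites everything but cites it wrongly cannot hide behind a high completeness number.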

Clinical research
For: Medical affairs
Train: source quality, citation
Source quality 0.93
Citation 0.91
Contradictions 0.85

Market intelligence
For: Strategy teams
Train: synthesis, reasoning
Synthesis 0.78
Reasoning 0.84
Recency 0.92

Know when the analysis is complete.

Does the agent combine multiple sources into a coherent answer? Does it surface contradictions rather than hiding them? Synthesis scoring separates good retrieval from good research.

5 axes · Research-specific eval dimensions · Configurable
0.91 · Source quality, top quartile · Leaderboard
0.88 · Citation accuracy, domain average · Benchmarks
0.74 (target: 0.85) · Synthesis, room to improve

Trace any conclusion back to its evidence.

Every source, every citation, every synthesis step resolves through a typed graph. Auditors and domain experts can trace any conclusion back to its evidence chain.

Figure: node resolution graph showing the provenance chain
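
A typed provenance graph of this kind could be sketched as nodes that each carry a type and a list of the nodes they were derived from. Tracing a conclusion is then a walk back through those links. The `Node` fields, the type names, and the `trace` function below are assumptions for illustration. The page does not describe the actual graph schema.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str  # hypothetical types: "conclusion", "synthesis", "citation", "source"
    derived_from: list = field(default_factory=list)  # IDs of upstream nodes


def trace(graph, node_id):
    """Depth-first walk from a node back to its evidence; returns (kind, id) pairs."""
    seen, order, stack = set(), [], [node_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        node = graph[current]
        order.append((node.kind, node.node_id))
        # Reverse so upstream nodes are visited in their listed order.
        stack.extend(reversed(node.derived_from))
    return order


graph = {n.node_id: n for n in [
    Node("c1", "conclusion", ["s1"]),
    Node("s1", "synthesis", ["cit1", "cit2"]),
    Node("cit1", "citation", ["src1"]),
    Node("cit2", "citation", ["src2"]),
    Node("src1", "source"),
    Node("src2", "source"),
]}
chain = trace(graph, "c1")
```

Here `chain` lists the conclusion first, then the synthesis step, then each citation followed by the source it points to.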

Trust the research your agents deliver.

Source quality, citation accuracy, and synthesis depth — scored on your domain data.