A disclosure that discloses nothing
Open almost any biomedical journal today and you will find some version of the same sentence: "Artificial intelligence tools were used for language editing and grammar checking." It appears in author contribution statements, in footnotes, sometimes in a dedicated AI disclosure box that journals have hastily introduced over the past two years. It ticks a box. It tells us almost nothing.
The problem is not that researchers are being dishonest. The problem is that we have built a disclosure culture around transparency of process while systematically ignoring transparency of risk. Knowing that an author used ChatGPT does not tell me whether the conclusions of their study can be trusted. And in biomedical research — where findings eventually touch clinical decisions, health policy, and patient lives — that is the question that matters.
What AI does to validity
When AI tools enter a research workflow, they do not merely accelerate it. They introduce specific, identifiable threats to the validity of the findings. An AI system trained predominantly on English-language, Western, high-income-country literature will systematically amplify certain assumptions and suppress others. A language model used to synthesize evidence does not randomly hallucinate — it confabulates in patterned ways that reflect the biases of its training data. An AI-assisted literature search will find what its architecture makes it likely to find, and miss what it does not.
These are not hypothetical concerns. They are documented, reproducible validity threats — the same class of threats that we routinely require researchers to address when they describe their statistical models, their sampling strategies, or their measurement instruments. We have simply not yet applied that same standard to AI.
The VOID framework: disclosing what matters
The framework I propose — VOID, for Validity-Oriented Informed Disclosure — starts from a different premise than current practice. Rather than asking "was AI used?", it asks "how does AI use affect what we can validly conclude?"
VOID organizes disclosure around four dimensions where AI creates validity risk: the representativeness of the data or literature AI was trained on; the potential for AI-introduced systematic bias in evidence synthesis or interpretation; the opacity of AI-generated outputs that cannot be independently verified; and the degree to which AI involvement altered the inferential logic of the study itself. An author using AI only to smooth sentence flow has a minimal VOID disclosure. An author using AI to screen abstracts, generate summary statements, or identify thematic patterns has a disclosure that should look very different.
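To make the contrast concrete, here is a rough sketch of what a structured VOID disclosure could look like as a machine-readable record attached to an article. The field names and example values are purely illustrative, not a fixed schema; they simply map the four dimensions above onto something a journal could collect alongside a manuscript.

```python
# A purely illustrative sketch: one possible machine-readable shape
# for a VOID disclosure. Field names and example values are
# hypothetical, not a prescribed schema.
from dataclasses import dataclass, field

@dataclass
class VOIDDisclosure:
    # Representativeness of the data or literature the AI was trained on
    training_representativeness: str
    # Potential for AI-introduced systematic bias in synthesis or interpretation
    systematic_bias: str
    # Opacity: whether AI-generated outputs were independently verified
    output_verifiability: str
    # Degree to which AI involvement altered the study's inferential logic
    inferential_role: str
    ai_tasks: list[str] = field(default_factory=list)

# An author who used AI only to smooth sentence flow
minimal = VOIDDisclosure(
    training_representativeness="not applicable: no content was generated",
    systematic_bias="negligible: sentence-level edits only",
    output_verifiability="full: every edit checked against the original text",
    inferential_role="none",
    ai_tasks=["language editing"],
)

# An author who used AI to screen abstracts and summarize themes
substantive = VOIDDisclosure(
    training_representativeness="model trained largely on English-language, "
                                "high-income-country literature",
    systematic_bias="possible under-retrieval of non-Western studies",
    output_verifiability="partial: a sample of AI-screened abstracts "
                         "re-screened by human reviewers",
    inferential_role="AI pre-filtered the evidence base before human review",
    ai_tasks=["abstract screening", "thematic summarization"],
)
```

The point of the structure is that the second record invites scrutiny in exactly the places where the first can be waved through.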
This is not about punishing AI use. It is about giving readers — and ultimately clinicians, policymakers, and patients — the information they need to calibrate how much to trust a finding.
Why biomedical science cannot afford to wait
Science has always moved faster than its self-regulatory mechanisms. We are in that lag period now with AI. Journals are adopting disclosure policies, but those policies are largely cosmetic. They offer the appearance of accountability without creating the epistemic accountability that the moment demands.
I have spent the past several years studying how biomedical neuroimaging research systematically misrepresents populations — how datasets built from Western, young, healthy, university-affiliated participants are used to draw conclusions about all human brains. The AI problem is structurally identical. It is the same error of treating a biased input as a neutral one and then building a literature on top of that assumption without disclosing it.
The solution, in both cases, is the same: name the threat, describe its likely direction, and let the reader decide how much weight to give the conclusion. That is not a burden. That is science.
Thorsten Rudroff
The writer is Docent of Experimental Neuroimaging at Turku PET Centre, University of Turku and Turku University Hospital. His research focuses on neuroimaging of fatigue in neurological disorders and on validity and diversity challenges in biomedical methodology.