Integrity & Provenance Hub — What We Check, When (AI-Assisted, Human-Led)

Our review process is AI-assisted and human-led. That means faster screening where machines excel, and careful editorial judgment where it matters. Below we explain each check, when it runs, and exactly what editors do if a flag appears.

Similarity · Paper-mill signals · Stats sanity · Reference DOIs · C2PA Content Credentials

When checks run (timeline)

  1. On upload (pre-submission or submission): similarity screening, image authenticity check, reference DOI lookup.
  2. Editor triage (before external review): paper-mill/manipulation signals, stats sanity pass, compliance checklist.
  3. During peer review: structured review forms prompt reviewers to confirm data/code availability, reporting standards, and any concerns.
  4. Pre-acceptance: editors verify that flags are resolved, references are complete, and figures carry Content Credentials.
  5. Post-publication: corrections and updates are tracked; provenance is visible on the article record.

We do not rely on black-box automation. Every decision is made by a human editor.

The five core checks: what they are and what editors do

1) Text similarity (Crossref Similarity Check / iThenticate)

We compare the manuscript’s text to a large corpus to detect substantial overlap with published or web content. Overlap can be legitimate (methods, boilerplate) or problematic (uncited reuse, salami slicing).

What editors do: We review the report manually, focusing on context rather than raw percentages. Common phrases and references are discounted; uncredited overlap prompts a request for revision or clarification.

Example: A manuscript shows 23% overall similarity, with most matches in the methods section. Editors note acceptable reuse of standard procedures; two paragraphs match an earlier preprint by the same authors, so the authors add a citation and rephrase.
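
For readers curious about the mechanics, here is a minimal, purely illustrative Python sketch of overlap detection based on shared word n-grams ("shingles"). It is not the Crossref Similarity Check / iThenticate algorithm, which matches against a large licensed corpus; the function names, n-gram length, and sample snippets are our own. It does show why boilerplate methods text naturally produces high raw overlap.

```python
# Conceptual sketch of text-overlap detection via word n-gram ("shingle") matching.
# Illustration only: not the Crossref Similarity Check / iThenticate algorithm.
def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Lower-cased word n-grams of a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(manuscript: str, source: str, n: int = 5) -> float:
    """Share of the manuscript's n-grams that also appear in the source (0 to 1)."""
    m, s = shingles(manuscript, n), shingles(source, n)
    return len(m & s) / len(m) if m else 0.0

# Hypothetical snippets: boilerplate methods wording overlaps almost completely.
passage_a = "samples were centrifuged at 10000 g for ten minutes at four degrees celsius"
passage_b = "all samples were centrifuged at 10000 g for ten minutes at four degrees celsius"
print(overlap_ratio(passage_a, passage_b))  # 1.0 here, because one snippet contains the other verbatim
```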

2) Paper-mill & manipulation signals (STM Integrity Hub)

Pattern-based checks highlight risky combinations such as fabricated email domains, recycled images across submissions, off-topic citations, or implausible author affiliations. These are signals, not verdicts.

What editors do: We verify identity details, request raw data or ethics documentation where needed, and may consult an image-forensics report. Strong signals pause the workflow until concerns are resolved.

Example: Multiple manuscripts from unrelated authors list the same non-institutional domain and share figure layouts. Editors request underlying data; one paper is withdrawn; others proceed after satisfactory evidence.
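
To make "signals, not verdicts" concrete, the toy Python sketch below combines two metadata heuristics of the kind such screens use. The rules, the domain list, and the function name are our own illustration, not the STM Integrity Hub's actual checks; any hit simply produces a note for an editor to investigate.

```python
# Toy metadata heuristics of the kind a paper-mill screen might combine.
# Illustration only: not the STM Integrity Hub's checks. A hit is a prompt for
# human investigation, never an automated verdict.
FREE_MAIL_DOMAINS = {"gmail.com", "hotmail.com", "163.com"}  # example list, not exhaustive

def submission_signals(author_emails: list[str], recent_unrelated_domains: set[str]) -> list[str]:
    """Return human-readable signals for an editor to review."""
    signals = []
    domains = {email.split("@")[-1].lower() for email in author_emails}
    if domains & FREE_MAIL_DOMAINS:
        signals.append("corresponding author uses a non-institutional email domain")
    repeated = sorted(domains & recent_unrelated_domains)
    if repeated:
        signals.append(f"email domain(s) {repeated} also appear on unrelated recent submissions")
    return signals

# Hypothetical example: both signals fire; the editor, not the script, decides what they mean.
print(submission_signals(["a.researcher@gmail.com"], recent_unrelated_domains={"gmail.com"}))
```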

3) Statistics sanity passes (e.g., statcheck for NHST)

Where applicable (e.g., reported t/F with p values), we auto-recalculate basic tests to catch common transcription inconsistencies. This is advisory, not a substitute for specialist review.

What editors do: If inconsistencies appear, we ask for a checked analysis file or clarification in the response letter. Complex models are directed to subject-expert reviewers or a statistical editor.

Example: Reported t(58)=2.10, p=0.12 recalculates to p=0.04. Authors correct a rounding error and upload their notebook; reviewers confirm conclusions remain supported.
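
As an illustration of what a statistics sanity pass recomputes, here is our own short Python sketch (not the statcheck package itself): it rebuilds the two-tailed p value from a reported t statistic and its degrees of freedom with SciPy and flags a mismatch larger than rounding of the reported values could explain. The function name and tolerance are ours.

```python
# Statcheck-style sanity pass (illustrative sketch, not the statcheck package).
from scipy import stats

def check_t_test(t_value: float, df: int, reported_p: float, tol: float = 0.005) -> dict:
    """Recompute a two-tailed p value and compare it with the reported one."""
    recomputed_p = 2 * stats.t.sf(abs(t_value), df)  # two-tailed p from the t distribution
    return {
        "recomputed_p": round(recomputed_p, 3),
        # tol = 0.005 allows for a reported p rounded to two decimal places
        "consistent": abs(recomputed_p - reported_p) <= tol,
    }

# The example above: t(58) = 2.10 reported alongside p = 0.12.
print(check_t_test(t_value=2.10, df=58, reported_p=0.12))
# The recomputed p comes out near 0.04, so the reported 0.12 is flagged as inconsistent.
```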

4) Reference validation (Crossref DOIs)

We resolve reference metadata and DOIs automatically and flag missing or malformed entries. Clean references improve citation tracking and help readers reach the right source.

What editors do: We return a highlighted list for author correction, fix obvious typos, and ensure key datasets/software are cited with persistent identifiers.

Example: 8 of 42 references lack DOIs; two have mismatched years. After automated suggestions, authors add DOIs and correct metadata before the manuscript moves forward.
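
For the technically curious, here is a minimal Python sketch of automated DOI validation against the public Crossref REST API (api.crossref.org). It illustrates the idea rather than our production pipeline: a 404 response indicates an unregistered or malformed DOI, and the returned metadata can be compared against the year given in the reference list. The function name and the placeholder DOI in the usage comment are ours.

```python
# Illustrative DOI check against the public Crossref REST API.
# Sketch of the idea only, not our production reference checker.
import requests

CROSSREF_WORKS = "https://api.crossref.org/works/"

def check_doi(doi: str, expected_year: int | None = None) -> dict:
    """Look up a DOI on Crossref and optionally compare the registered publication year."""
    resp = requests.get(CROSSREF_WORKS + doi, timeout=10)
    if resp.status_code == 404:
        return {"doi": doi, "resolves": False}
    resp.raise_for_status()
    work = resp.json()["message"]
    year = work.get("issued", {}).get("date-parts", [[None]])[0][0]
    result = {
        "doi": doi,
        "resolves": True,
        "title": (work.get("title") or [""])[0],
        "year": year,
    }
    if expected_year is not None and year is not None:
        result["year_matches"] = (year == expected_year)
    return result

# Hypothetical usage: validate one entry from a reference list.
# print(check_doi("10.xxxx/example-doi", expected_year=2021))
```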

5) Image authenticity (C2PA / Content Credentials)

We verify figures and graphical abstracts with Content Credentials (C2PA). Manifests record when, how, and with which tools media were created or edited, making alterations transparent.

What editors do: If credentials are missing or inconsistent with claims, we request the original files or a signed explanation. Clear manipulations result in rejection; honest edits with labels are acceptable.

Example: A gel image shows duplicated bands. Authors provide raw images and a corrected figure labeled “contrast adjusted.” The record includes a brief note; the corrected figure proceeds.
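
As a rough illustration only, the Python snippet below performs a crude byte-level probe for the JUMBF box tags under which C2PA manifests are commonly embedded in JPEG files. We are assuming the typical embedding (a "jumb" box labelled "c2pa"); the function and the placeholder filenames are ours. The probe only detects whether credentials appear to be present; verifying signatures and edit history requires a full C2PA verifier such as the open-source c2patool or a C2PA SDK.

```python
# Crude presence probe for embedded Content Credentials (illustration only).
# Assumption: C2PA manifests in JPEGs sit in JUMBF boxes tagged "jumb" and labelled "c2pa".
# Real verification of signatures and edit history needs a proper C2PA verifier.
from pathlib import Path

def has_c2pa_markers(image_path: str) -> bool:
    """Return True if the file contains byte patterns typical of a C2PA manifest store."""
    data = Path(image_path).read_bytes()
    return b"jumb" in data and b"c2pa" in data

# Hypothetical triage loop: list figures that arrive without detectable credentials.
figures = ["fig1.jpg", "fig2.jpg"]  # placeholder filenames
missing = [f for f in figures if Path(f).exists() and not has_c2pa_markers(f)]
print("figures to query with authors:", missing)
```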

How we handle flags (severity, actions, timelines)

  • Info: Minor issues (missing DOIs, unclear labels). Action: Author fixes during revision.
  • Caution: Ambiguity (text overlap needing citation, stats mismatch needing clarification). Action: Editorial query; proceed once resolved.
  • Critical: Likely misconduct or unreliable data (fabrication, image manipulation, ethics gaps). Action: Pause, request evidence, or reject per policy.

We aim to resolve most Info items within 3–5 working days; Caution items within a standard revision cycle; Critical items vary by case.

A day in the life of a manuscript (short narration)

Dr. Sharma uploads a manuscript on a new composite material. Within minutes, the system shows a similarity report (fine—mostly methods), confirms DOIs for 90% of references, and flags that two figures are missing Content Credentials. She re-exports the images with credentials and re-submits.

During triage, an editor sees low paper-mill risk, one statcheck inconsistency, and a missing dataset citation. Dr. Sharma uploads her Jupyter notebook and deposits the dataset with a DOI. Reviewers later use structured forms to focus on methods and practical relevance, not formatting. The paper moves to acceptance with a clean provenance trail.

How authors can help (quick checklist)

  • List all authors’ ORCID iDs; use institutional names that auto-complete to the correct affiliation.
  • Ensure every reference has a DOI (where available) and matches the source exactly.
  • Attach a data/code availability statement and provide repository links with persistent identifiers.
  • Export figures with Content Credentials or retain original files for verification.
  • If you used AI tools, add a short AI-use disclosure (what, where, how you verified outputs).

Privacy & confidentiality: Do not paste confidential manuscripts or data into public AI tools. If you need assistance, use institutionally approved services and keep a note of any tool usage in your disclosure.

FAQs

Do you reject papers based on a similarity percentage alone?

No. We assess where and why the overlap occurs. Methods and standard phrases are normal; unattributed reuse is not.

Are paper-mill checks the same as misconduct findings?

They are signals that prompt questions—not verdicts. Editors review evidence and contact authors before any decision.

What if my field doesn’t use NHST or standard test reporting?

Stats checks run only where applicable. Complex analyses are evaluated by subject-expert reviewers and, when needed, a statistical editor.

What are Content Credentials (C2PA) on images?

They are tamper-evident manifests describing how media was created and edited. If missing, we may request originals or a signed explanation.

Will readers see the provenance?

Yes. Post-publication, a provenance section summarizes key checks, links to datasets/software, and shows update status.

Last updated: 12 Aug 2025 • This page describes our AI-assisted, human-led integrity workflow for submissions and peer review.