Four fine-tuned models. Calibrated assessments. Actionable feedback.
Idea Gatekeeper uses four fine-tuned AI models (GPT-4.1, GPT-4.1-nano, OB-30B, and OB-4B), trained on 10,000+ editorial decisions, to evaluate your research concepts on Novelty and Usefulness. It delivers calibrated tier ratings and actionable improvement suggestions.
Begin Evaluation
Trusted by researchers to evaluate ideas before committing resources
Present your research idea in natural language, at any stage. You can even work with AI to refine it.
Four fine-tuned AI models independently assess Novelty and Usefulness.
Receive calibrated tier ratings with detailed probability analysis and improvement suggestions.
Every idea is placed on a four-tier scale anchored to real-world journal standards.
Premier journals such as AMJ, AMR, ASQ, JAP, Org Science, SMJ. Exceptional research potential with high novelty and usefulness.
Near-premier journals such as JOM, OBHDP, PPsych. Strong foundations with clear merit and minor refinements needed.
Mid-tier field journals. Solid concept with identifiable strengths that benefits from targeted iteration.
Lower-tier journals. Early-stage idea requiring further development, with feedback highlighting key areas for improvement.
“The gap between a good idea and a great one is often just the right feedback at the right time.”
What we have been building
Added Econ-30B local model (Qwen3-30B A3B) for economics evaluation. The Economics domain now uses a 2-model ensemble (GPT-4.1-nano + Econ-30B). Mobile chat reliability improved with timeout handling and error recovery. Admin analytics dashboard launched with live confidence distribution charts.
Feature
Ensemble tier prediction now uses 3-model probability averaging (GPT-4.1, GPT-4.1-nano, OB-30B). OB-4B retains the best individual accuracy (59.2%) and still participates in consensus voting, but its probability distributions are excluded from the ensemble average because they are overconfident. Excluding them improves calibration by 9.1 percentage points at high confidence thresholds.
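The split between averaging and voting described above can be illustrated with a minimal Python sketch. It is not the production implementation: the tier labels, the function names, the returned fields, and the example probabilities are all hypothetical, and only the choice of which models are averaged and which models vote comes from the description above.

```python
from collections import Counter

# Hypothetical tier labels; the page only states that the scale has four tiers.
TIERS = ["Tier 1", "Tier 2", "Tier 3", "Tier 4"]

# Models whose probability distributions enter the ensemble average.
# OB-4B is deliberately left out of this tuple.
AVERAGED_MODELS = ("GPT-4.1", "GPT-4.1-nano", "OB-30B")


def ensemble_probabilities(per_model):
    """Average the tier distributions of the three averaged models only."""
    dists = [per_model[name] for name in AVERAGED_MODELS]
    return [sum(d[i] for d in dists) / len(dists) for i in range(len(TIERS))]


def consensus_votes(per_model):
    """Count each model's top tier; every model votes, including OB-4B."""
    votes = Counter()
    for probs in per_model.values():
        votes[TIERS[probs.index(max(probs))]] += 1
    return votes


def predict(per_model):
    """Pick the tier with the highest averaged probability and report agreement."""
    avg = ensemble_probabilities(per_model)
    best = max(range(len(TIERS)), key=lambda i: avg[i])
    votes = consensus_votes(per_model)
    return {
        "tier": TIERS[best],
        "confidence": avg[best],
        "agreement": votes[TIERS[best]],  # number of models whose top tier matches
    }


# Made-up probabilities for illustration only; OB-4B is shown as very confident.
example = {
    "GPT-4.1": [0.10, 0.50, 0.30, 0.10],
    "GPT-4.1-nano": [0.20, 0.40, 0.30, 0.10],
    "OB-30B": [0.10, 0.30, 0.40, 0.20],
    "OB-4B": [0.00, 0.95, 0.05, 0.00],
}
result = predict(example)
print(result)
```

In this sketch, OB-4B's near-certain distribution does not inflate the reported confidence, yet its top choice still counts toward the agreement tally.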
Update
Full UI redesign with login animation, navigation restructure, Idea Lab 3-tab layout, chat split panel, and mobile-responsive breakpoints.
Update
A new SFT model trained on economics editorial decisions is now available. Select Management or Economics before submitting an evaluation.
Feature
Blind-rate AI-generated research ideas, earn credits for accuracy, and compete on the leaderboard. 200 ideas seeded at launch.
Launch
Idea Gatekeeper is developed by Professor LI Ning and the research group at Tsinghua University, focused on the intersection of artificial intelligence and scientific evaluation. Our work demonstrates that AI can learn the tacit evaluative judgment -- "scientific taste" -- that guides editorial decisions.
Advance AI for science by building tools that facilitate and accelerate the research process. We aim to democratize access to evaluative feedback that was previously available only through lengthy peer review cycles.
The underlying methodology is detailed in our paper, "Machines acquire scientific taste from institutional traces", which benchmarks SFT models against 48 expert journal editors and 174 doctoral researchers on 120 held-out papers with known publication outcomes.
Read the full paper