Background: Multiple risk scores claim to predict the probability of postoperative pancreatic fistula (POPF) after pancreatoduodenectomy. It is unclear which scores have undergone external validation and which are the most accurate. The aim of this study was to identify risk scores for POPF and assess their clinical validity. Methods: Areas under the receiver operating characteristic curve (AUROCs) were extracted from studies that performed external validation of POPF risk scores. These were pooled for each risk score using intercept-only random-effects meta-regression models. Results: Systematic review identified 34 risk scores, of which six had been subjected to external validation and were therefore included in the meta-analysis: the Tokyo (N=2 validation studies), Birmingham (N=5), FRS (N=19), a-FRS (N=12), m-FRS (N=3) and ua-FRS (N=3) scores. Overall predictive accuracies were similar for all six scores, with pooled AUROCs of 0.61, 0.70, 0.71, 0.70, 0.70 and 0.72, respectively. Considerable heterogeneity was observed, with I² statistics ranging from 52.1% to 88.6%. Conclusion: Most risk scores lack external validation; where this was performed, risk scores were found to have limited predictive accuracy. Consensus is needed on which score to use in clinical practice. Given the limited predictive accuracy, future studies to derive a more accurate risk score are warranted.
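
The pooling step described in the Methods can be illustrated with a minimal sketch. An intercept-only random-effects model reduces to a weighted average of the study-level AUROCs, with weights that account for both within-study variance and between-study variance (tau²). The abstract does not state which estimator or scale was used, so the sketch assumes the DerSimonian-Laird estimator and pools on the raw AUROC scale; the function name `pool_random_effects` and the input values in the usage note are hypothetical and are not taken from the study.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool study-level estimates (e.g. AUROCs) with a random-effects model.

    Assumption: DerSimonian-Laird estimator for the between-study
    variance; the source abstract does not specify the estimator.
    """
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching standard errors")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q measures dispersion of studies around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)

    # I2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "tau2": tau2,
        "i2": i2,
    }
```

Calling `pool_random_effects([0.60, 0.80], [0.05, 0.05])` on these two made-up validation studies returns the pooled AUROC, its 95% confidence interval, tau² and I². In practice AUROCs are sometimes pooled on a transformed (for example logit) scale, and a dedicated meta-analysis package would normally be preferred over hand-written code.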