Although the GAD-7 already contains relatively few items, even shorter instruments are increasingly sought to reduce testing time in large-scale mental health screenings. As a result, short forms derived from the GAD-7, such as the GAD-2 and GAD-mini, have become popular. However, the GAD-2 and GAD-mini have shown lower diagnostic accuracy in some cultural contexts, suggesting that a validated short form of the GAD-7 is still lacking for large-scale cross-cultural anxiety screening. To develop an optimal short form of the GAD-7 with cross-cultural stability, we used seven GAD-7 datasets from six countries, totaling 47,484 participants. Five short forms of the GAD, containing 2 to 6 items, were constructed using the RiskSLIM machine learning algorithm. We evaluated the diagnostic accuracy of these short forms in the training and test sets using the coefficient of determination (R²) and the area under the curve (AUC). The results showed that the GAD-R2 performed poorly in some cultures, whereas all of the 3- to 6-item short forms performed well across cultures, with the GAD-R6 showing the highest diagnostic accuracy in all cultures. The GAD-R3 outperformed the GAD-R2, GAD-2, and GAD-mini in all cultures and showed higher generalizability across cultures and special populations. Given that the GAD-R3 is shorter than the GAD-R6 and nearly as accurate, we recommend the GAD-R3 for clinical studies and epidemiologic investigations, with an optimal cutoff value of 15. Overall, we recommend the GAD-R3 as the short form of the GAD-7 for cross-cultural studies. However, the 2-item GAD scale is also a suitable short form for use in clinical practice.