Abstract: We examined how the selection of analytical data influences the ability of ITS2 sequences to serve as a DNA barcode for identification in the Cucurbitaceae. Three data sets containing ITS2 sequences from different Cucurbitaceae samples were built: Dataset 1 (sequences obtained experimentally), Dataset 2 (sequences from both the experiment and GenBank), and Dataset 3 (a subset of Dataset 2). The influence of data selection on the identification capacity of ITS2 was evaluated by comparing intra- and inter-specific variation, the barcoding gap, and identification efficiency among the three data sets using the BLAST 1 and Nearest Distance methods. The rate of successful identification with ITS2 reached 100% at the genus level in all three data sets. At the species level, it was 100%, 67.8%, and 90.6% for Datasets 1, 2, and 3, respectively, with the BLAST 1 method, and 100%, 52.5%, and 66.5%, respectively, with the Nearest Distance method. The selection of data therefore led to large discrepancies in the identification success rate. Among the three data sets, only Dataset 2 lacked an obvious barcoding gap. The inclusion criteria for data in DNA barcode analysis therefore deserve further investigation.
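
To make the two distance-based notions above concrete, the following is a minimal sketch of nearest-distance identification and a barcoding-gap check. It is illustrative only and is not the pipeline used in this study: it assumes pre-aligned sequences and uses the uncorrected p-distance as a stand-in for whatever distance model the analysis actually applied. The function names (`p_distance`, `nearest_distance_id`, `barcoding_gap`) and the toy sequences are hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Uncorrected p-distance: fraction of differing sites between two
    aligned sequences, ignoring sites where either has a gap ('-').
    Assumption: the sequences are already aligned to equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def nearest_distance_id(query, references):
    """Assign the query to the species of its nearest reference sequence.
    Returns None when the nearest references belong to more than one
    species (an ambiguous, i.e. unsuccessful, identification)."""
    dists = [(p_distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in dists)
    nearest = {species for d, species in dists if d == best}
    return nearest.pop() if len(nearest) == 1 else None

def barcoding_gap(references):
    """Return (max intra-specific distance, min inter-specific distance).
    A gap is present when the first value is smaller than the second."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(references, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra, default=0.0), min(inter)

# Hypothetical toy reference set: (species label, aligned sequence).
references = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTAC"),
    ("species_B", "ACGAACCTAC"),
]

max_intra, min_inter = barcoding_gap(references)
has_gap = max_intra < min_inter
unique_hit = nearest_distance_id("ACGAACCTAT", references)
ambiguous_hit = nearest_distance_id("ACGTACCTAC", references)
```

In this sketch, adding reference sequences that raise the maximum intra-specific distance or lower the minimum inter-specific distance narrows the gap and makes ties more likely, which is one way the composition of a data set can change the measured success rate.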