
Test Sets

You can provide manually labeled test sets to get an independent assessment of the quality of your models. Duet already provides a quality metric that is recalculated with every model update on the whole sampling set (the unlabeled dataset) associated with the model. A manually labeled test set is an additional, optional checkpoint on model quality.

Test sets are uploaded in CSV format only, with three columns: "text", the text document; "sub-category", the full path of the category in the schema (e.g. top category/child level 1/child level 2); and "label", which states whether the label is positive or negative. The maximum size of an uploaded test set is 5 MB.

Once you upload a manually labeled test set whose categories exactly match your model schema, Duet calculates a quality metric (F score) for each category in your schema. The full path of each category in the "sub-category" column has to match the schema exactly, so if you edit your schema, you need to update the "sub-category" column in your test set.
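As a minimal sketch of the format described above, the following Python script writes a small test set with the three required columns and compares its size to the 5 MB limit. The file name, the example rows, and the helper `write_test_set` are hypothetical. The UTF-8 encoding and the reading of "5 MB" as 5 * 1024 * 1024 bytes are assumptions, since neither is specified here.

```python
import csv
import os

# Column names required by the test set format described above.
HEADER = ["text", "sub-category", "label"]

# Assumption: "5 MB" means 5 * 1024 * 1024 bytes; the exact cutoff is not documented.
MAX_BYTES = 5 * 1024 * 1024


def write_test_set(path, rows):
    """Write (text, category_path, label) rows to a CSV file and return its size in bytes."""
    # Assumption: UTF-8 encoding; the expected encoding is not documented.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        for text, category_path, label in rows:
            if label not in ("positive", "negative"):
                raise ValueError(f"label must be 'positive' or 'negative', got {label!r}")
            writer.writerow([text, category_path, label])
    return os.path.getsize(path)


# Hypothetical rows; the category path must be the full path in your own schema.
example_rows = [
    ("The app crashes every time I log in.", "feedback/complaint", "positive"),
    ("Thanks, the new release works great.", "feedback/complaint", "negative"),
]

if __name__ == "__main__":
    size = write_test_set("test_set.csv", example_rows)
    print("within size limit:", size <= MAX_BYTES)
```

Using the `csv` module rather than joining strings by hand means that documents containing commas, quotes, or line breaks are quoted correctly in the "text" column.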

  1. Once you've reached a point where you want to test your model, select the Test tab in the top navigation bar.

  2. Select the Upload testset button at the top of the Test your Model widget.

  3. Either upload a new file or select one of the previously uploaded test sets.

  4. If you upload a new test set, make sure it is in the proper format. Test sets for classification must be in CSV format. The CSV must have three columns with the following headers: text, sub-category, and label. The sub-category value has to be the full path of the category in the schema (e.g., feedback/complaint), and the label is either positive or negative.

    You can download a schema format sample here.

  5. Select the test set that you'd like to use and click "Test". The results for each category are displayed on the right. Results above 50% are shown in green; all others are shown in red. Some categories may be ignored because their path in the sub-category column doesn't match the schema.
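The checks implied by the steps above can be sketched in Python before you upload a file. This is only an illustration under stated assumptions, not Duet's implementation: the helper names, the example schema, and the sample rows are hypothetical; path matching is assumed to be an exact, case-sensitive string comparison; and the "F score" is assumed to be the standard F1 score, since the exact variant is not stated here.

```python
import csv
import io


def unmatched_paths(csv_file, schema_paths):
    """Return the sub-category values in a test set that are not exact schema paths."""
    reader = csv.DictReader(csv_file)
    seen = {row["sub-category"] for row in reader}
    # Assumption: matching is an exact, case-sensitive string comparison.
    return sorted(seen - set(schema_paths))


def f1_score(tp, fp, fn):
    """Standard F1 from true positive, false positive, and false negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def result_color(score):
    """Mirror the display rule in step 5: above 50% is green, otherwise red."""
    return "green" if score > 0.5 else "red"


# Hypothetical schema and test set contents.
schema = ["feedback/complaint", "feedback/praise"]
sample = io.StringIO(
    "text,sub-category,label\n"
    "App keeps crashing,feedback/complaint,positive\n"
    "Love the update,feedback/Praise,positive\n"
)

# Under the case-sensitive assumption, "feedback/Praise" is not in the schema.
print(unmatched_paths(sample, schema))
print(result_color(f1_score(tp=8, fp=2, fn=2)))
```

Any path reported by `unmatched_paths` is a candidate for the "ignored" categories mentioned in step 5, so correcting those values in the sub-category column before uploading should help every category receive a result.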
