Continue Teaching to Get to the Level of Accuracy You Aspire To
Teaching happens interactively. Continue retrieving more documents to label by clicking the next sample button or searching for keywords. Read the document and find words that represent the entities in your schema on the right.
Once you move on to the next document, the top right of the page updates to show the total number of documents you have labeled so far. The counters next to each entity on the right also update to show the number of segments you have labeled for each entity. Note that one labeled document can contain one or more labeled segments, so the segment counters can grow faster than the document counter.
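The difference between the two counters can be sketched in a few lines of Python. This is only an illustration under assumptions: the data layout (a list of `(entity, segment_text)` pairs per document), the function name `count_labels`, and the sample values are hypothetical and are not taken from the tool.

```python
from collections import Counter


def count_labels(labeled_docs):
    """Return (labeled document count, per-entity segment counts).

    labeled_docs is a hypothetical structure: one list per document,
    each holding (entity, segment_text) pairs.
    """
    # A document counts once, no matter how many segments it contains.
    doc_count = sum(1 for segments in labeled_docs if segments)
    # Every labeled segment counts toward its entity.
    segment_counts = Counter(
        entity for segments in labeled_docs for entity, _ in segments
    )
    return doc_count, segment_counts


# Hypothetical sample: two documents, four labeled segments in total.
docs = [
    [("Address", "first address"), ("Address", "second address"), ("Street", "a street")],
    [("Address", "third address")],
]
doc_count, segment_counts = count_labels(docs)
```

Here the document counter would read 2 while the "Address" counter would read 3, which is why the two numbers should not be expected to match.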
Until you have labeled three documents, no predictions or errors are shown on the documents. After that, predictions are shown as solid underlines, and your labels are shown as highlights when the relevant schema category is selected. Dotted red underlines mark prediction errors. A prediction error is shown only where a user-provided label (word highlight) disagrees with the system prediction (underline).
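The display rules above can be summarized as a small decision function. This is a sketch of the described behavior, not the tool's implementation: the function name, the dictionary-based inputs, and the returned strings are all assumptions, and the threshold of three reflects one reading of "prior to the first three labeled documents".

```python
MIN_LABELED_DOCS = 3  # assumption: nothing is shown until three documents are labeled


def display_state(labeled_doc_count, user_labels, predictions):
    """Decide what to draw for each token index.

    user_labels: token_index -> True (positive) / False (negative); unlabeled tokens are absent.
    predictions: token_index -> True / False.
    Returns token_index -> "error" or "prediction"; other tokens get no underline.
    """
    if labeled_doc_count < MIN_LABELED_DOCS:
        return {}  # too early: neither predictions nor errors are shown
    state = {}
    for idx, predicted in predictions.items():
        if idx in user_labels and user_labels[idx] != predicted:
            state[idx] = "error"  # dotted red underline: label and prediction disagree
        elif predicted:
            state[idx] = "prediction"  # solid underline
    return state
```

Note that an unlabeled token can never be an error in this sketch, which matches the rule that errors appear only where a label and a prediction mismatch.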
From here, you can label in a few ways:
You can select the proper entity or sub-entity on the right, and then highlight the words in the document by clicking and dragging your cursor over them, or by double-clicking a word if you want to label a single word. At the early stage of the model, it is recommended that you label the top parent entity as well as its sub-entities, because labeling the parent teaches the system how the sub-entities are grouped.
In this example, an address is being predicted in a document. The prediction is shown as solid underlines whose colors correspond to the schema categories on the right. It looks correct, so let's label it. Select "Address" on the right, then click at "1299" and drag all the way to "20004". Then select each sub-entity in turn and label the corresponding address components.
Once labeled, each segment is covered by a highlight. Each schema entity has its own highlight color. You must select a schema entity on the right in order to see its labels on the document. Here, "Address" is selected, so its green highlight is visible. You can also see that the solid prediction underlines are still present alongside the highlight.
With every new document that you retrieve to label, the system will prompt you to verify the prediction of a particular phrase with respect to a particular entity. In the example below, the system is prompting you on the word "a" with the question "is this part of a street". You need to answer with "+", "-" or "?". Clicking "+" submits "a" as a positive label for street, clicking "-" submits "a" as a negative label for street, and clicking "?" submits "a" as "I don't know" for street.
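The three possible answers map one-to-one onto label values. A minimal sketch of that mapping follows; the `Answer` enum, the `answer_to_label` function, and the dictionary it returns are hypothetical names chosen for illustration and do not describe the tool's internals.

```python
from enum import Enum


class Answer(Enum):
    POSITIVE = "+"  # the token belongs to the entity
    NEGATIVE = "-"  # the token does not belong to the entity
    UNKNOWN = "?"   # "I don't know"


def answer_to_label(token, entity, symbol):
    """Turn a prompt answer into a (hypothetical) label record."""
    answer = Answer(symbol)  # raises ValueError for anything other than "+", "-", "?"
    return {"token": token, "entity": entity, "label": answer.name.lower()}


# The example from the text: the word "a" prompted for the entity "street".
record = answer_to_label("a", "street", "-")
```

Keeping "?" as its own value, rather than treating it as a negative, preserves the distinction between "not a street" and "not sure".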
When the system generates positive predictions that you want to label negative, you can label one or more tokens as negative. To label a token as negative, click the "Flip Label" button and then click the token. The Flip Label button flips a single token to a positive label if it was predicted negative, or to a negative label if it was predicted positive, with respect to the entity selected on the right. Providing negative labels fixes false positive predictions.
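The flip behavior can be sketched as follows. This is an assumption-laden illustration: the function `flip_label`, the dictionary inputs, and the rule that an existing user label takes precedence over the prediction are hypothetical and are not confirmed by the tool's documentation.

```python
def flip_label(user_labels, predictions, token_index):
    """Return a copy of user_labels with one token set to the opposite state.

    user_labels / predictions: token_index -> True (positive) / False (negative).
    Assumption: if the token already has a user label, that label is flipped;
    otherwise the prediction is flipped. Tokens with neither count as negative.
    """
    current = user_labels.get(token_index, predictions.get(token_index, False))
    updated = dict(user_labels)  # leave the caller's dictionary untouched
    updated[token_index] = not current
    return updated


# A token predicted positive (a false positive) becomes an explicit negative label.
labels = flip_label({}, {4: True}, 4)
```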
There are various buttons at the top-right of the document widget.
|Button|Description|
|Add Label|When selected, lets you highlight tokens to label them for the schema entity selected on the right.|
|Flip Label|Toggles the label of a single token (either positive or negative) to its opposite.|
|Erase Label|Erases a single label. With the button selected, click and drag over the label you want to erase.|
|Prediction|Lets you navigate to the predictions in the document. If any predictions are present, a notification number shows how many there are. With the button selected, you can step through them using the right arrow key as a shortcut.|
|Conflicts|Lets you navigate to the conflicts in the document. If any conflicts are present, a notification number shows how many there are. With the button selected, you can step through them using the right arrow key as a shortcut.|
|Show Features|Shows and hides the ML features triggering on the document you see. Feature triggers include both the features you have already added to the model and the suggested features. Use feature triggers when you are debugging conflicts, as they help explain the prediction behavior of the system on a given document. You can click the tooltip shown over a word to initiate feature creation.|
|Erase All Labels|Erases all labels on the document. This lets you easily start over on a document where your labeling was off base, in accordance with a new feature or schema decision.|
At the beginning of the model teaching process, the model won't have enough features to detect entities correctly. At this point, you might see the system over-predict, showing many positive predictions. An example of such over-prediction is shown below. When this occurs, a simple fix is to label one or more of the tokens that are predicted positive as negative for the parent entity "Address", using the "Flip Label" button as described above. These negative labels correct the false positive predictions.
You can check the quality of your model by testing it. You may find yourself unhappy with the results. There are two ways to raise the quality metric: editing the schema and resolving conflicts by adding suggested features. Schema edits (adding, renaming, deleting and moving schema labels) should be done as you label and add features to fix conflicts. The continuously updated model will show you conflicts as they emerge. In Duet, you address the conflicts by reviewing and adding suggested features, which are then reinforced with further labeling. This is called the iterative teaching process.
If you skip any documents by pressing "Next Sample" without labeling, you can access them by pressing the gear button in the top right.
Upon doing so, you will see a list of the skipped documents. Click a document to retrieve it so that you can label segments within it.