Find Features in Documents

To find important features in a document you are labeling or reviewing, look for the phrases of text that help distinguish the categories from one another. The smallest phrase you can add to a dictionary is a single word (token). A token is a contiguous span of letters or numbers that contains no spaces. Each punctuation mark is also treated as its own token.
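
The token rule above can be illustrated with a short Python sketch. It is only an approximation for building intuition, not the product's actual tokenizer: the function name `tokenize` is hypothetical, and the pattern assumes ASCII letters and digits, so accented characters and other scripts would need a broader pattern.

```python
import re

# Assumed rule, taken from the description above: a token is either a run of
# letters/digits with no spaces, or a single punctuation mark.
# ASCII-only by assumption; this is not the product's real tokenizer.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]")


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single-character punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    print(tokenize("Ship by 5pm, please!"))
    # ['Ship', 'by', '5pm', ',', 'please', '!']
```

Under this rule a contraction such as "don't" splits into three tokens (`don`, `'`, `t`), which is worth keeping in mind when choosing single words for a dictionary.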

In document classifiers, you can highlight words to start a new feature while labeling a new document or reviewing the label of a previously labeled document. Highlighting or double-clicking a single word lets you either create a dictionary item or add the selected phrase to an existing feature. Note that highlighting or double-clicking words does not initiate feature creation in entity extractors.

