quill Reference Manual - Introduction

This is the reference manual for the quill application, which is a program to help designers of pen-based user interfaces create better gestures. A gesture is a mark made with a pen or stylus that causes the computer to perform an action. For example, if you were editing text, you might use the following gesture to delete a word:



This reference manual describes the features of quill.



Table of Contents




Reference Manual

The Gesture Hierarchy

This section describes the different types of objects that the user can operate on with quill. In order of increasing size, they are:

Gestures

A gesture is a particular mark that invokes a command. For example, the following mark might invoke the "paste" command:




Gesture Categories

A gesture category is a collection of gestures that tells the recognizer how one type of gesture should be drawn. For example, an application might have gesture categories for "cut", "copy", and "paste" operations. Usually about 15 example gestures per category are adequate, although sometimes more may be required. A typical application will have a gesture category for each operation that the designer wants to be available using a gesture. However, if two gestures with very different shapes should invoke the same command, recognition may be better if they are in separate gesture categories than if they are all in one gesture category. For example, if a "square" category includes both very small squares and very large squares, it might be recognized better if it were split into two categories, "big square" and "small square."

Groups

A group is used by the designer to organize gestures. For example, one might have a "File" group for gestures dealing with file operations, a "Format" group for formatting gestures, etc. The recognizer ignores groups.

Sets

A set is a collection of gesture categories and/or groups. quill uses two different kinds of sets for two different purposes. A training set is a set that is used to train the recognizer. A test set is used to test the recognizability of the training set. For example, you might have a test set whose gestures are very neatly drawn and one whose gestures are sloppily drawn, to see how well neat vs. sloppy gestures are recognized.

There must be a training set for the recognizer to do recognition.

Packages

A package holds all the gesture information for one application (although applications that have multiple modes may require more than one package). A package contains a training set and may contain one or more test sets. quill creates one top-level window for each package. Each quill data file stores exactly one package (although older legacy gesture files may each store a set or a gesture category).

Terminology

Before the elements of the interface can be described, some interaction terms need to be defined.

click
Click the left mouse button or tap the pen.
right click
Click the right mouse button or press the barrel button on the pen.
double-click
Click twice rapidly. If at first it does not work, try to do it faster. It is sometimes difficult to perform, especially with a pen, so all double-click operations can also be done with the menu.
shift click
Hold down the shift key and click.
control click
Hold down the control key and click.
You can use a pen or a mouse with quill, although many people find it difficult to draw gestures using a mouse.

The Main Window

The quill main window is shown below:





Figure: main window


The remainder of this section describes the various parts of the main window.

Training area

This area shows the training set, its groups, and the gesture categories they contain. Individual gestures are not shown here. A line showing a category might look like this:


Clicking on the name, folder icon, or gesture icon selects an object (and deselects everything else in the tree). Clicking on the warning icon scrolls the log at the bottom of the window so that the relevant warning is in view. Shift-clicking extends the selection. Control-clicking toggles the selection of one object. Double-clicking creates a new window that shows the object.

Selected objects in the tree view may be edited using the Edit menu. The placement of newly created objects (see Gesture menu section) and the behavior of the gesture drawing area are determined by the selection.

Windows

Windows appear in the right part of the main window and are used to browse gesture categories, groups, and sets and to show information about suggestions in the log window (such as misrecognized training examples). Windows can be resized by dragging their title bars up and down. Clicking anywhere in a window will select it. Clicking the close box on the right side of the title bar will close the window.

In windows that display objects, individual objects may be selected using the same mechanisms as in the tree view. That is, clicking on an object selects it (and deselects everything else in the subwindow). Shift-clicking extends the selection. Control-clicking toggles the selection of one object. Double-clicking creates a new view of the object in a new window.

Selected objects in windows may be edited using the Edit menu. The placement of newly created objects (see Gesture menu section) and the behavior of the gesture drawing area are determined by the selection.

The following subsections describe specifics of particular types of windows.

Set window




This window shows the gesture categories and groups contained in a gesture set (e.g., the training set or a test set).

Group window




This window shows the gesture categories contained in a group.

Gesture category window




This window shows the gestures for a gesture category.

Gesture window




This window shows a single gesture.

Misrecognized gesture window




This window shows gestures that were misrecognized. Each gesture has a label and a button. The green label says which gesture category the gesture is supposed to be. The red button says which gesture category it was recognized as. Clicking on the red button creates a window that shows the gesture category it was recognized as.

Drawing area

This area is used for drawing gestures. If a gesture category is selected in the tree view or if a gesture category window is active, a gesture drawn here will be added to the selected gesture category. If a gesture window is active, a gesture drawn here will replace it. If a gesture group or set is selected in the tree or a gesture group or set window is active, an example drawn here will be recognized. Results of the recognition will be shown in the log, and the label for the recognized gesture will turn green in the training area and in any windows in which it appears.

During certain operations (e.g., training the recognizer), the application is unable to accept drawn gestures, and during this time this area will turn gray.

Log

This view is the primary means for the application to communicate to the user about what it is doing. Many different types of messages may appear here. Some examples are:

Menu bar

This section describes the menus and their operations.

File menu

Operations:

Edit menu

Operations:

For rules about where in the package pasted objects are placed, see the rules for new objects in Gesture menu.

View menu

Operations:

Gesture menu

Operations:

Newly created objects will be placed in the selected object, or in the container object closest to the selected object that can contain the new object. For example, a group cannot be inside of another group, so if gesture A inside group B is selected and the "New Group" operation is performed, the new group will be added after B (i.e., as a sibling of B), not inside of (i.e., as a child of) A or B.

Help menu

Operations:

Analyses and Suggestions

To help the designer create good gestures, quill periodically analyzes the training set and provides suggestions. There are several different analyzers, which are described below.

Training Example Misrecognition

One of the analyzers looks for gestures in the training set that are recognized as something other than the gesture category they belong to. Such gestures are usually caused by one of two things:

Gesture Categories Too Similar for Recognizer

Another analyzer looks at all possible pairs of gesture categories and computes how similar they are from the recognizer's point of view. If two categories are too similar, it will be more difficult for the recognizer to tell them apart, and it will be more likely to misrecognize gestures. This can be corrected by changing the gesture categories so that they are more distinct to the recognizer, usually by changing their training gestures to have different values for one or more features. When quill detects that two gesture categories are too similar, it will provide information about how to make them more different.

Duplicate Names

You rarely want to have multiple gesture categories with the same name. quill often refers to gesture categories by name in its displays and messages, so multiple gesture categories with the same name will be confusing to users and to other designers.

Also, when using test sets, the recognizer uses gesture category names to determine if the recognition is correct or not. If gesture category names are not unique, test gestures which should be marked incorrect may be marked correct.

Gestures Too Similar for Humans

Another analyzer looks at all possible pairs of gesture categories and computes how similar people will perceive them to be. Although it is unproven, we think it likely that if dissimilar operations have similar gesture categories, people may confuse them and find them harder to learn and remember. On the other hand, it is likely fine for similar operations to have similar gesture categories, and it may even be beneficial for learning and remembering them.

Outlying Gesture Category

Normally, it is easier for the recognizer to recognize gesture categories if they are very different from one another. However, sometimes the type of recognizer that quill uses can have difficulty recognizing gesture categories if many gesture categories are similar in some way but one is very different from them (i.e., is an outlier). When this problem occurs, quill will notify you and tell you how to change the gesture category to make it more like the others and improve the recognition.

Outlying Gesture

Sometimes designers misdraw training gestures, especially if they are unaccustomed to using a pen interface. quill looks for training gestures that are very different from the others in the same gesture category and notifies the designer, since such gestures may be misdrawn. If you see a gesture labeled as an outlier that is not misdrawn, press the "Gesture Ok" button to tell quill that this outlier is ok.

How Recognizers Work

Designers can take advantage of most of the features of quill without knowing how the gesture recognizer works. However, in some cases it is useful to know how the recognizer "sees" gestures in order to improve their recognition. In the future, quill may support multiple recognizers, but at present it supports only the recognizer by Dean Rubine [1]. This recognizer is described in the following section.

Rubine Recognizer

Overview

Rubine's recognizer is a feature-based recognizer because it categorizes gestures by measuring different features of the gestures. Features are measurable attributes of gestures, and are often geometric; examples are the total length and the distance between the first and last points. To recognize an unknown gesture, the values of the features are computed for it, and those feature values are compared with the feature values of the gestures in the training set. The unknown gesture is recognized as belonging to the gesture category whose feature values are most like its own.

The following sections describe in more detail how the recognizer works. First, the features are explained. Then the recognizer training process is described. Finally, recognition is described in more detail.

Features

As described above, Rubine's recognizer categorizes gestures by measuring features, which are measurable, often geometric attributes of gestures. quill uses the following features:

Bounding box

This is not really a feature in itself, but several of the features use it. The bounding box for a gesture is the smallest upright rectangle that encloses the gesture.
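For concreteness, the following Python sketch (not part of quill itself; the function name and the point-list representation of a gesture are assumptions for illustration) shows how a bounding box might be computed from a gesture given as a list of (x, y) points:

    def bounding_box(points):
        """Return (x_min, y_min, x_max, y_max) for a list of (x, y) points."""
        xs = [x for x, y in points]
        ys = [y for x, y in points]
        return min(xs), min(ys), max(xs), max(ys)

    # Example: an L-shaped gesture
    gesture = [(0, 0), (0, 30), (20, 30)]
    print(bounding_box(gesture))   # (0, 0, 20, 30)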






Cosine of the initial angle

This feature is how rightward the gesture goes at the beginning. This feature is highest for a gesture that begins directly to the right, and lowest for one that begins directly to the left. Only the first part of the gesture (the first 3 points) is significant.




Sine of the initial angle

This feature is how upward the gesture goes at the beginning. This feature is highest for a gesture that begins directly up, and lowest for one that begins directly down. Only the first part of the gesture (the first 3 points) is significant.
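A minimal Python sketch of these two initial-angle features, assuming the gesture is a list of (x, y) points with y increasing upward and using the third point to define the initial direction (the helper name is hypothetical):

    import math

    def initial_angle_cos_sin(points):
        """Cosine and sine of the angle from the first point to the third point."""
        (x0, y0), (x2, y2) = points[0], points[2]
        d = math.hypot(x2 - x0, y2 - y0)
        if d == 0:
            return 0.0, 0.0   # degenerate case: no initial movement
        return (x2 - x0) / d, (y2 - y0) / d

    print(initial_angle_cos_sin([(0, 0), (5, 1), (10, 2), (12, 8)]))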

Bounds size

This feature is the length of the bounding box diagonal.

Bounds angle

This feature is the angle that the bounding box diagonal makes with the bottom of the bounding box.
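A sketch of these two bounding-box features in Python (again an illustration only, assuming a list of (x, y) points with y increasing upward):

    import math

    def bounds_size_and_angle(points):
        """Diagonal length of the bounding box, and the angle the diagonal
        makes with the bottom edge of the box (in radians)."""
        xs = [x for x, y in points]
        ys = [y for x, y in points]
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        size = math.hypot(width, height)
        angle = math.atan2(height, width)   # 0 for a completely flat box
        return size, angle

    print(bounds_size_and_angle([(0, 0), (0, 30), (20, 30)]))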

Ends distance

This feature is the distance between the first and last points of the gesture.

Ends angle cosine

This feature is the horizontal distance that the end of the gesture is from the start, divided by the distance between the ends. If the end is to the left of the start, this feature is negative.

Ends angle sine

This feature is the vertical distance that the end of the gesture is from the start, divided by the distance between the ends. If the end is below the start, this feature is negative.
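The three end-point features can be sketched together in Python as follows (a hypothetical helper, with the same point-list representation and y increasing upward):

    import math

    def end_features(points):
        """Distance between the first and last points, plus the cosine and sine
        of the angle between them (zeros are returned if the ends coincide)."""
        (x0, y0), (xn, yn) = points[0], points[-1]
        dist = math.hypot(xn - x0, yn - y0)
        if dist == 0:
            return 0.0, 0.0, 0.0
        return dist, (xn - x0) / dist, (yn - y0) / dist

    print(end_features([(0, 0), (10, 0), (10, 10)]))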

Total length

This feature is the total length of the gesture.

Total angle

This feature is the total amount of counter-clockwise turning. It is negative for clockwise turning.

Total absolute angle

This feature is the total amount of turning that the gesture does in either direction.
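These three features can be computed from the turning angle between consecutive segments. A Python sketch (the helper name is hypothetical; counter-clockwise turning is taken as positive, assuming y increases upward):

    import math

    def length_and_turning(points):
        """Total path length, total signed turning (counter-clockwise positive),
        and total absolute turning, with angles in radians."""
        length = total_angle = abs_angle = 0.0
        for i in range(1, len(points)):
            dx = points[i][0] - points[i - 1][0]
            dy = points[i][1] - points[i - 1][1]
            length += math.hypot(dx, dy)
            if i >= 2:
                pdx = points[i - 1][0] - points[i - 2][0]
                pdy = points[i - 1][1] - points[i - 2][1]
                turn = math.atan2(pdx * dy - pdy * dx, pdx * dx + pdy * dy)
                total_angle += turn
                abs_angle += abs(turn)
        return length, total_angle, abs_angle

    # Example: a right-angle turn to the left (counter-clockwise)
    print(length_and_turning([(0, 0), (10, 0), (10, 10)]))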

Sharpness

This feature is intuitively how sharp, or pointy, the gesture is. A gesture with many sharp corners will have a high sharpness. A gesture with smooth, gentle curves will have a low sharpness. A gesture with no turns or corners will have the lowest sharpness.

More precisely, all gestures are composed of many small line segments, even the parts that look curved. This feature measures the angular change between each pair of consecutive line segments, squares each change, and adds them all together. The angular change is shown here:






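The sketch below computes sharpness this way (Python, using the same hypothetical point-list representation as in the earlier sketches):

    import math

    def sharpness(points):
        """Sum of squared turning angles between consecutive line segments."""
        total = 0.0
        for i in range(2, len(points)):
            dx = points[i][0] - points[i - 1][0]
            dy = points[i][1] - points[i - 1][1]
            pdx = points[i - 1][0] - points[i - 2][0]
            pdy = points[i - 1][1] - points[i - 2][1]
            turn = math.atan2(pdx * dy - pdy * dx, pdx * dx + pdy * dy)
            total += turn * turn
        return total

    # A sharp right-angle corner contributes (pi/2)**2; a straight line contributes 0.
    print(sharpness([(0, 0), (10, 0), (10, 10)]))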
Training

For every gesture, the recognizer computes a vector of these features called the feature vector. The feature vector is used in training and recognition as follows.

The recognizer works by first being trained on a gesture set. Then it is able to compare new examples with the training set to determine to which gesture category the new example belongs.

During training, for each gesture category the recognizer uses the feature vectors of its examples and computes a mean feature vector and a covariance matrix (i.e., a table indicating how the features vary together and what their standard deviations are) for that category.
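A simplified sketch of this per-category training step, assuming feature vectors are stored as NumPy arrays (this only illustrates the statistics described above; it is not quill's actual training code):

    import numpy as np

    def train_category(feature_vectors):
        """Mean feature vector and covariance matrix for one gesture category.

        feature_vectors: list of 1-D NumPy arrays, one per training example.
        """
        examples = np.vstack(feature_vectors)    # shape: (n_examples, n_features)
        mean = examples.mean(axis=0)
        cov = np.cov(examples, rowvar=False)     # how the features vary together
        return mean, cov

    # Example: three 2-feature training gestures for one category
    mean, cov = train_category([np.array([1.0, 2.0]),
                                np.array([1.2, 1.8]),
                                np.array([0.9, 2.1])])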

Recognition

When an example to be recognized is entered, its feature vector is computed and compared to the mean feature vectors of all gesture categories in the training set. The example is recognized as belonging to the gesture category whose mean feature vector is closest to its feature vector.
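As a simplified illustration of the "closest mean" idea, the Python sketch below classifies an unknown feature vector by the nearest category mean using plain Euclidean distance; the actual recognizer also weighs the comparison using the covariance information computed during training:

    import numpy as np

    def recognize(feature_vector, category_means):
        """Return the name of the category whose mean feature vector is closest.

        category_means: dict mapping category name -> mean feature vector.
        """
        return min(category_means,
                   key=lambda name: np.linalg.norm(feature_vector - category_means[name]))

    means = {"cut": np.array([1.0, 0.0]), "paste": np.array([0.0, 1.0])}
    print(recognize(np.array([0.9, 0.2]), means))   # -> "cut"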

For a feature-based recognizer to work perfectly, the values of each feature should be normally distributed within a gesture category and should vary greatly between categories. In practice, this is rarely exactly true, but it is usually close enough for good recognition.

Human Perception of Gesture Similarity

quill tries to predict when people will perceive gestures as very similar. Its prediction is based on geometric features, some of which the recognizer also uses. The features that are used for similarity prediction but not for recognition are described below.

Aspect

This feature is the extent to which the bounding box differs from a square. A gesture with a square bounding box has an aspect of zero.






Curviness

This feature is how curvy, as opposed to straight, the gesture is. Gestures with many curved lines have high curviness, while ones composed of straight lines have low curviness.






A gesture with no curves has zero curviness. There is no upper limit on curviness.

Roundaboutness

This feature is the length of the gesture divided by its endpoint distance.






The lowest value it can have is 1. There is no upper limit.

Density

Intuitively, this feature is how dense the lines in the gesture are. Formally, it is the length of the gesture divided by the size of its bounding box.






The lowest value it can have is 1. There is no upper limit.

Log area

This feature is the logarithm of the area of the bounding box.






Log aspect

This feature is the logarithm of the aspect.
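As an illustration, here is a Python sketch of the three similarity features whose formulas are fully stated above (roundaboutness, density, and log area); it uses the same hypothetical point-list representation as the earlier sketches, and it omits aspect, curviness, and log aspect because their exact formulas are not spelled out here:

    import math

    def similarity_features(points):
        """Roundaboutness, density, and log area, as described above.

        Assumes the end points differ and the bounding box has nonzero area.
        """
        xs = [x for x, y in points]
        ys = [y for x, y in points]
        width, height = max(xs) - min(xs), max(ys) - min(ys)
        diag = math.hypot(width, height)
        length = sum(math.hypot(points[i][0] - points[i - 1][0],
                                points[i][1] - points[i - 1][1])
                     for i in range(1, len(points)))
        ends = math.hypot(points[-1][0] - points[0][0],
                          points[-1][1] - points[0][1])
        roundaboutness = length / ends        # lowest possible value is 1
        density = length / diag               # lowest possible value is 1
        log_area = math.log(width * height)
        return roundaboutness, density, log_area

    print(similarity_features([(0, 0), (10, 0), (10, 10)]))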

Appendix: Logarithm

The logarithm is a mathematical function on positive numbers [2]. The logarithm of a number x is written as "log(x)". For our purposes, its important properties are:

The logarithm function looks like this:




[1] Rubine, D. "Specifying Gestures by Example". Proceedings of SIGGRAPH '91, pp. 329-337.

[2] For this reference, we'll pretend that you cannot take the logarithm of a negative number.