Last Modified: 11 April 1996
Abstract
This document describes a mechanism for specifying generic querying interfaces between an HTML document and a remote Internet resource. Such functionality is desired when specifying generic document-related tools, such as glossaries, dictionaries, navigational query tools and so on, where one needs particularly simple ways of extracting data from a document and sending it to a resource. These cases do not require a detailed specification of the user input mechanism (as in a fill-in FORM), but rather a generic specification of the desired selection mechanism and the interface for sending the data to a remote resource. This document proposes a HEAD-level element that specifies the interface and encoding mechanisms, as well as the desired input mechanisms. The actual implementation of the user interface is left entirely up to the user agent. Some possible interface mechanisms for user input are discussed.
A common problem in Web-based document systems is connecting a particular document to related server-side resources such as dictionaries, glossary tools, etc. As an example, one might want to allow a reader to select a word and quickly query a glossary or dictionary database for information about the word. Alternatively, one might want to allow the user to select strings or text segments from a document and query for documents correlated with the selected text.
As a study of such querying techniques, one research group [Masum 95] has developed an adaptation of Mosaic that allows the user to select strings from a document and submit a boolean database query based on the selected words. With the system, a user can easily mark and relate text in the document using a mouse button, click a button on the Mosaic window frame, and automatically send a query to a search engine, the result being returned to the client as an HTML [RFC 1866] document.
However, this relatively successful experiment is not easily generalizable, as it is hard-coded into the NCSA Mosaic (UNIX) source code. What is needed to make this a practical mechanism is a way of specifying, within the HTML markup, generic interface characteristics between a document and some server-side resource. For example, one would like to specify, within an HTML document, that the user can select a word or phrase from the text and send it to a glossary or dictionary engine for interpretation or explanation. Alternatively, and in the same document, one might wish the user to be able to send a block of text, for example a paragraph or several paragraphs, to a WAIS search database to search for documents correlated with the selected text. Finally, one might want the user to be able to choose a selection of computer code and send it to a syntax checker for validation.
At present, query interfaces within HTML documents are only possible using HTML FORMs. Thus, to do any of the above, a user must: (a) cut and paste text into appropriate boxes, (b) follow appropriate rules (this box takes only single words, this one takes phrases, etc.) and (c) press the right button. This is awkward and unwieldy, to the point where such tools are rarely found in current Web-based applications.
In fact, FORMs are largely inappropriate, as most queries of this type are intimately related to the textual or graphical content of the document, and depend very little on auxiliary text or FORM-field input by the user. Also, it is generally desirable that a single document contain many querying interfaces linked to many different tools (dictionaries, glossaries, relevance-feedback search engines). This is particularly difficult with FORM elements, as each tool requires its own FORM. The result is at best a document cluttered and obscured by many largely irrelevant FORMs.
The querying problem can be broken down into two issues:
- Indicating the resource that serves as the query tool, and the nature of the query tool (index, glossary, database, etc.)
- Giving the interface specification required by that tool, in a manner that can be unambiguously understood by a user agent.
The first issue is the realm of a LINK element, which specifies related resources and their relationship to the document containing the LINK element. For example, the element:
<LINK HREF="http://www.where.edu/cgi-bin/gloss" REL="glossary">
would indicate that the resource at the indicated URL is a glossary tool related to the current document.
The second issue can currently only be handled via FORM elements. But as noted above, the resulting functionality is not user friendly for the types of applications just mentioned: the user must actively copy and paste text, a separate FORM must be specified for each required service, and each FORM occupies space on the displayed area even if it is never used.
A FORM is inappropriate because we really want to specify an interface between a document and a resource, and not the detailed mechanisms for user input. There are then two parts to what is required: a specification for the encoding of the data and the transport mechanism by which the data are sent to a specified server resource, and generic input types that describe in very general, but unambiguous terms, how data can be selected for user input. Finally, the interface and input mechanisms should be largely hidden from the user unless called into action -- they are auxiliary to the text, and should not be presented with it. This falls under the category of document meta-information, since we are specifying an interface to the document, and not document content itself.
I propose a new HEAD-level METAFORM element to address these issues. METAFORM inherits data encoding and transport mechanisms from traditional FORM elements, but supports a different, and limited, selection of user-input mechanisms. METAFORM is non-empty, but can contain only MINPUT elements (modeled after INPUT elements). These elements specify how data can be selected as an element VALUE. As with traditional FORMs, MINPUT elements also take NAME attributes, which associate variable names with the user-assigned input element VALUE.
Finally, and different from a FORM, METAFORM does not specify the resource to which the encoded metaform data should be sent. Instead, this is specified by a LINK element, which specifies both the URL of the targeted resource and its relationship with the current document. This separates the specification of the interface from the specification of the resource.
The details of the METAFORM and MINPUT elements, and the required modifications to the LINK element, are discussed below.
The LINK element specifies associated resources (using the HREF attribute) and their relationship to the current document (using REL or REV). LINKed resources can, in principle and where appropriate, be presented to the user in a configurable button bar, whereupon the user could select the resource they want simply by selecting from a document-defined menubar (or whatever method is appropriate, given the nature of the user agent). Some example relationships are discussed in a draft document by Maloney and Quin [RelRev].
However, a LINK does not specify the interface between the document and the target resource. I propose the addition of a new LINK element attribute, called INTERFACE, which specifies a reference to the relevant interface. This reference would be a URL plus a fragment identifier, or simply a fragment identifier if the interface is within the same document. Here are two simple examples:
<LINK HREF="http://www.bla.edu/cgi-bin/gloss" TITLE="Glossary Tool" REL="glossary" INTERFACE="#name">
<LINK HREF="http://www.bla.edu/cgi-bin/waisquery" TITLE="Wais: relevance feedback interface" REL="index.wais" INTERFACE="./specs/interfaces.html#name">
In the former, the interface is specified somewhere in the document HEAD, while in the latter, the interface is specified in the document interfaces.html, located in the specs/ subdirectory. This would be a useful way of archiving a set of generic interfaces, such that they can be shared by a collection of documents.
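As a rough illustration of how a user agent might interpret the two forms of INTERFACE reference, the following Python sketch resolves the attribute value against the URL of the containing document and splits off the fragment identifier. The function name resolve_interface and the document URL used in the usage note are hypothetical; only the standard library's urllib.parse is assumed.

```python
from typing import Tuple
from urllib.parse import urldefrag, urljoin


def resolve_interface(document_url: str, interface: str) -> Tuple[str, str]:
    """Return (URL of the document holding the interface, interface name).

    `interface` is the value of the proposed INTERFACE attribute: either a
    bare fragment identifier ("#name") or a URL plus a fragment identifier.
    """
    # Resolve a relative reference against the containing document's URL.
    absolute = urljoin(document_url, interface)
    # Separate the fragment identifier (the interface name) from the URL.
    url, fragment = urldefrag(absolute)
    return url, fragment
```

With a hypothetical document at http://www.bla.edu/docs/paper.html, the reference "#name" resolves to that same document, while "./specs/interfaces.html#name" resolves to http://www.bla.edu/docs/specs/interfaces.html; in both cases the interface name is "name".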
METAFORM is a HEAD-level element that defines data naming, input and encoding schemes. From FORM, METAFORM inherits the METHOD and ENCTYPE attributes for specifying the HTTP method and encoding mechanism for data transport. Unlike FORM, a METAFORM does not specify the URL to which the data are to be sent: instead, METAFORM elements are labeled by ID attributes, which take name token values. The METAFORM-defined interface is then referenced from a LINK element (or elements) through the INTERFACE attribute, which references the METAFORM ID.
METAFORM is non-empty, but can contain only the MINPUT element. As MINPUT is empty, METAFORM is backward-compatible with current user agents.
Here is a code example illustrating how a METAFORM and LINK element would work:
<HEAD>
  <LINK HREF="url_to_resource"
        TITLE="Text description of the resource"
        REL="resource_type"
        INTERFACE="#name">
  <METAFORM ID="name" METHOD="http_method" ENCTYPE="encoding">
    <MINPUT TYPE="textselect" NAME="keywords"
            DESC="Text description of input quantity"
            SELTYPE="multiphrase" >
    <MINPUT TYPE="hidden" NAME="string0" VALUE="value0">
    <MINPUT TYPE="hidden" NAME="string2" VALUE="value1">
  </METAFORM>
  ..... more fun stuff ......
</HEAD>
The LINK indicates the URL of the targeted resource, described by TITLE, with REL giving the nature of the resource. INTERFACE references the interface specification, given here by the subsequent METAFORM element.
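To make the binding between LINK and METAFORM concrete, here is a minimal Python sketch of how a user agent might collect both elements from a document HEAD and pair each LINK with the METAFORM its INTERFACE attribute names. It is only an illustration under stated assumptions: the class HeadInterfaces, the function bind_interfaces and the sample markup are hypothetical, and only same-document references of the form "#name" are handled.

```python
from html.parser import HTMLParser


class HeadInterfaces(HTMLParser):
    """Collect LINK elements and METAFORM interfaces from document markup."""

    def __init__(self):
        super().__init__()
        self.links = []        # one attribute dict per LINK element
        self.metaforms = {}    # METAFORM ID -> {"attrs": {...}, "inputs": [...]}
        self._current = None   # METAFORM currently being read, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "link":
            self.links.append(attrs)
        elif tag == "metaform":
            self._current = {"attrs": attrs, "inputs": []}
            self.metaforms[attrs.get("id")] = self._current
        elif tag == "minput" and self._current is not None:
            self._current["inputs"].append(attrs)

    def handle_endtag(self, tag):
        if tag == "metaform":
            self._current = None


def bind_interfaces(markup):
    """Return (target URL, METAFORM) pairs for same-document INTERFACE refs."""
    parser = HeadInterfaces()
    parser.feed(markup)
    parser.close()
    bound = []
    for link in parser.links:
        ref = link.get("interface", "")
        if ref.startswith("#") and ref[1:] in parser.metaforms:
            bound.append((link.get("href"), parser.metaforms[ref[1:]]))
    return bound


# Hypothetical sample markup, modeled on the example above.
EXAMPLE_HEAD = """
<HEAD>
<LINK HREF="http://www.bla.edu/cgi-bin/gloss" TITLE="Glossary Tool"
      REL="glossary" INTERFACE="#name">
<METAFORM ID="name" METHOD="post">
<MINPUT TYPE="textselect" NAME="keywords" SELTYPE="multiphrase">
<MINPUT TYPE="hidden" NAME="string0" VALUE="value0">
</METAFORM>
</HEAD>
"""
```

Calling bind_interfaces(EXAMPLE_HEAD) yields a single pair: the glossary URL together with the METAFORM whose ID is "name" and its two MINPUT elements.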
METAFORM Attribute Specifications
The METAFORM attributes are essentially the same as with standard FORMs. The main differences are the absence of an ACTION attribute (this is obtained indirectly through the associated LINK element) and the addition of an ID attribute to label the METAFORM. The ID attribute is referenced by a LINK, to bind the METAFORM interface to a designated target URL. METHOD and ENCTYPE take the same values as for FORM, and specify the HTTP method and data encoding required by the interface. The attribute specifications are:
METHOD="HTTP method" (Optional)
METHOD specifies the HTTP method used for sending data to an HTTP server. Typical values are GET and POST. METHOD is optional, and in the absence of a specified method, the default value is GET.
ENCTYPE="encoding type" (Optional)
ENCTYPE specifies the encoding mechanism used to encode the data. The possible values are application/x-www-form-urlencoded and multipart/form-data [RFC1867]. ENCTYPE is optional: in the absence of a specified encoding type, the default value is application/x-www-form-urlencoded.
ID="name token" (Mandatory)
ID specifies the name token label associated with this METAFORM. This label can be referenced by a LINK element to associate the METAFORM interface with a designated Internet resource. ID is mandatory for a METAFORM.
User input mechanisms within a METAFORM are defined through MINPUT elements. These are similar to standard INPUT elements, but modified for the different desired input mechanisms.
All MINPUT elements can take the standard attributes NAME, VALUE, MAXLENGTH and DESC. NAME, VALUE and MAXLENGTH have the same meaning and function as with standard FORM INPUT elements. The DESC attribute takes a text description of the element, which is used to provide a description of the intent and/or role of the input element. This can also be presented as an error message, should the input element be used in an inappropriate way. Finally, the TYPE attribute specifies the type of the input element, while the SELTYPE attribute specifies the text-selection rules applicable to this element.
TYPE determines the type of input element. TYPE can take four values: "hidden" (which has the same function as with a FORM), "submit", "textselect", and "boolean". Note that a single METAFORM can have any number of TYPE="submit" or TYPE="hidden" MINPUT elements. However, it can have only one TYPE="textselect" or TYPE="boolean" element, and cannot have both.
- TYPE="hidden": These are hidden, non-editable elements, as with standard FORMs. They can be used to pass default parameters with the query. A METAFORM can contain any number of TYPE="hidden" MINPUT elements.
- TYPE="submit": Unlike the case within a FORM, TYPE="submit" elements are optional within a METAFORM -- the action of choosing the text and selecting the resource is enough to send the data. Instead, the presence of multiple TYPE="submit" elements is a directive to the user agent indicating that the user, upon making a text selection and attempting to submit a query, should be presented with a list of options, those options being given by the different TYPE="submit" elements. The optional DESC attribute can contain a text explanation for each option. In the absence of DESC values, the list of options should be presented as a list of the different TYPE="submit" element VALUEs.
A METAFORM can contain any number of TYPE="submit" MINPUT elements.
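The handling of multiple TYPE="submit" elements can be sketched as follows. This Python fragment is only an illustration: the function name submit_options and the representation of MINPUT elements as dictionaries with lower-case attribute names are assumptions, as is the choice to fall back from DESC to VALUE element by element.

```python
def submit_options(minputs):
    """Return the option labels a user agent might list for a METAFORM.

    `minputs` is a list of dicts, one per MINPUT element, keyed by
    lower-case attribute name (an assumed representation).
    """
    submits = [m for m in minputs if m.get("type", "").lower() == "submit"]
    # A single (or absent) submit element offers no choice to present.
    if len(submits) < 2:
        return []
    # Prefer the DESC text; fall back to the element VALUE when DESC is absent.
    return [m.get("desc") or m.get("value", "") for m in submits]
```

For a METAFORM with two submit elements, one with DESC="Select Oxford English Dictionary" and one with only VALUE="funk", this returns the DESC text for the first option and "funk" for the second.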
- TYPE="textselect": This type selector tells the user agent that the user can select text from the displayed document and store it under the assigned variable name. This could be accomplished by a cut-and-paste into a window, or by drag-and-drop. The description string DESC provides a text description of the intent of the input element, to act as a guide to the user. Thus DESC could be assigned a text string explaining the purpose of this input element (e.g. "Select words or phrases you want present in related documents"). Note that the actual text selection mechanisms are specified via the SELTYPE attribute.
Note Regarding Data Encoding -- When multiword selection is enabled, an extra level of data encoding is required to distinguish phrases such as "open the door" from a sequence of words and phrases, such as "open the" "door". I propose a simple three-stage encoding mechanism for all TYPE="textselect" VALUEs:
- Each selected word or phrase is URL encoded [RFC 1738], [RFC 1808] following the standard encoding mechanisms.
- Each word or phrase is surrounded by round brackets (e.g. "open the door" becomes (open+the+door)).
- The strings are joined together, using a colon (:) as separator.
As we shall see below, additional levels of encoding are required to represent boolean operations.
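The three stages above can be sketched in a few lines of Python. This is an illustration rather than a normative encoder: the function name encode_textselect is hypothetical, and the use of urllib.parse.quote_plus (which writes spaces as "+", matching the (open+the+door) example, and percent-encodes any brackets or colons inside the selected text) is an assumption about what "standard encoding mechanisms" means here.

```python
from urllib.parse import quote_plus


def encode_textselect(selections):
    """Encode a list of selected words or phrases as a TYPE="textselect" VALUE."""
    # Stage 1: URL-encode each selection (spaces become "+").
    # Stage 2: surround each encoded selection with round brackets.
    bracketed = ["(" + quote_plus(text) + ")" for text in selections]
    # Stage 3: join the bracketed strings with a colon separator.
    return ":".join(bracketed)
```

Under these assumptions the single phrase "open the door" encodes as (open+the+door), while the two selections "open the" and "door" encode as (open+the):(door), so the two cases remain distinguishable on the server.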
A METAFORM can contain at most one TYPE="textselect" input element. Also, TYPE="textselect" and TYPE="boolean" MINPUT elements are mutually exclusive: a METAFORM containing a TYPE="boolean" MINPUT element cannot contain a TYPE="textselect" MINPUT element, and vice versa.
- TYPE="boolean": This type selector tells the user agent that the user can select text from the displayed document and structure this text into a boolean query: the allowed relationships are AND, OR, and NOT. The mechanism for constructing this query is left entirely up to the user agent: some suggestions for an interface are given below.
Note on Data Encoding -- As with TYPE="textselect", extra levels of encoding are required to appropriately group text strings and logical operations. I propose the following simple mechanism:
- Each selected word or phrase is URL encoded, following the standard encoding mechanisms.
- Each word or phrase is surrounded by round brackets (e.g. "open the door" becomes (open+the+door)), to form an object.
- An object logically combined with another object is grouped with this object, separated by a symbol representing the logical operation. The characters denoting these operators are:
  Table: Special Characters Denoting Boolean Operators
  Operator   Symbol   Description
  AND        :        colon
  OR         ;        semi-colon
  NOT        ?        question mark
  This grouping is in turn surrounded by round brackets, to form another object. For example ((open+the+door):(close+the+window)).
- The preceding step is continued until there is only a single object, representing the encoded boolean query.
It is of course the responsibility of the user agent to construct a boolean query that can be unambiguously encoded in this way.
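The recursive grouping described above might be implemented along the following lines. This Python sketch is illustrative only. The query representation (a string for a word or phrase, or a tuple such as ("and", left, right)) and the function name encode_boolean are hypothetical. The placement of the NOT symbol is not spelled out in the text, so writing it as a prefix inside the brackets is purely an assumption.

```python
from urllib.parse import quote_plus

# Operator symbols from the table of special characters.
SYMBOLS = {"and": ":", "or": ";", "not": "?"}


def encode_boolean(node):
    """Encode a boolean query tree as a TYPE="boolean" VALUE.

    A node is either a string (a selected word or phrase) or a tuple:
    ("and", a, b), ("or", a, b) or ("not", a).
    """
    if isinstance(node, str):
        # A leaf: URL-encode and bracket the text to form an object.
        return "(" + quote_plus(node) + ")"
    operator, *operands = node
    symbol = SYMBOLS[operator]
    if operator == "not":
        # ASSUMPTION: NOT is written as a prefix to the negated object.
        (operand,) = operands
        return "(" + symbol + encode_boolean(operand) + ")"
    left, right = operands
    # Join the two objects with the operator symbol, then bracket the group.
    return "(" + encode_boolean(left) + symbol + encode_boolean(right) + ")"
```

For example, ("and", "open the door", "close the window") encodes as ((open+the+door):(close+the+window)), matching the example in the text. Because quote_plus percent-encodes any colon, semi-colon, question mark or bracket appearing in the selected text, the operator symbols stay unambiguous.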
A METAFORM can contain at most one TYPE="boolean" input element. Also, TYPE="textselect" and TYPE="boolean" MINPUT elements are mutually exclusive: a METAFORM containing a TYPE="boolean" MINPUT element cannot contain a TYPE="textselect" MINPUT element, and vice versa.
NAME="string" (Mandatory)
NAME specifies the variable name associated with the element VALUE.
VALUE="string" (Optional)
VALUE specifies a default value for the input element, to be used in the absence of any user-selected data. Note that VALUE is the only mechanism for assigning data to a NAME for TYPE="submit" or TYPE="hidden" elements. In the absence of any specified value, the user agent should assume the value to be a null string.
DESC="text" (Optional)
DESC specifies a text description string, which can be used to provide a text description of the intent or purpose of the input element. For example: "Select words or phrases you want present in related documents". DESC may be used by user-agent help utilities, or by non-graphical user agents.
MAXLENGTH="n" (Optional)
MAXLENGTH specifies the maximum number of bytes that can be placed within the VALUE of the element, and is used to restrict the size of strings sent to the server. The default is no limit.
SELTYPE="word"|"multiword"|"phrase"|"multiphrase"|"block"|"multiblock" (Optional; valid only with TYPE="textselect" or "boolean")
Text selection requires control over how text can be selected. For example, dictionaries can only take words, while relevance-feedback queries will want long blocks of text. The attribute SELTYPE defines how text can be selected for a text input element. The possible values are "word" (only single words can be selected), "phrase" (only single phrases can be selected), "multiword" (multiple single words can be selected), "multiphrase" (multiple multi-word phrases can be selected), "block" (large blocks of text spanning many sentences can be selected) and "multiblock" (multiple blocks of text can be selected). In this context, a phrase is defined as a sequence of words not containing a sentence delimiter (i.e., a period followed by one or more whitespace characters).
TAGS (Optional; valid only with TYPE="textselect")
By default, tag markup is not included with the text. In some cases you may wish to select both text and the surrounding markup (for example, when sending document content to a search tool that understands HTML markup). The attribute TAGS directs the user agent to include the markup with the selected text.
We will consider a document with the following document HEAD, defining four interfaces to four different resources: a glossary tool, a dictionary tool, an index tool and a general relevance-feedback querying tool. These are defined by the following markup:
<HEAD>
  <LINK REL="glossary"   HREF="http://www.saa.edu/htbin/gloss" INTERFACE="#glos">
  <LINK REL="dictionary" HREF="http://www.saa.edu/htbin/dict"  INTERFACE="#dict">
  <LINK REL="index"      HREF="http://www.saa.edu/htbin/dict"  INTERFACE="#index">
  <LINK REL="query"      HREF="http://www.saa.edu/htbin/dict"  INTERFACE="#query">
  <METAFORM ID="glos" METHOD="post">
    <MINPUT TYPE="textselect" NAME="phrase"
            DESC="Enter word or short phrase for glossary search"
            SELTYPE="phrase" >
  </METAFORM>
  <METAFORM ID="dict" METHOD="post">
    <MINPUT TYPE="textselect" NAME="lookup-word"
            DESC="Enter word for dictionary search"
            SELTYPE="word" >
    <MINPUT TYPE="submit" NAME="dictionary" VALUE="oed"
            DESC="Select Oxford English Dictionary">
    <MINPUT TYPE="submit" NAME="dictionary" VALUE="funk"
            DESC="Select Funk and Wagnalls Dictionary">
    <MINPUT TYPE="submit" NAME="dictionary" VALUE="harrap"
            DESC="Select Harraps Dictionary">
  </METAFORM>
  <METAFORM ID="index" METHOD="post">
    <MINPUT TYPE="boolean" NAME="index-string"
            DESC="Boolean Keyword search of the Index"
            SELTYPE="multiword" >
  </METAFORM>
  <METAFORM ID="query" METHOD="post">
    <MINPUT TYPE="textselect" NAME="query-string"
            DESC="Search for document related to selected text"
            SELTYPE="block" >
  </METAFORM>
</HEAD>
Here is one possible behaviour for the user interface. Upon reading the document, the browser tiles the top control panel with four buttons, representing the glossary, dictionary, index and query tools respectively. The user can then use the mouse to select text. If the user selects a single word, then all four tools are available: the user could drag the selection to any of the four buttons (or press any of the buttons) to submit the search. If, however, the user selects the dictionary button, s/he is presented with a selection menu, allowing a choice of dictionaries to use.
Text Selection Controls Allowed Options
Since the different tools have different rules for allowed input, user selection of text will disable unacceptable options. Thus, if the user selects a multi-word phrase, then the dictionary option is automatically grayed out, since this interface cannot accept multi-word input. If the user selects two non-adjacent words, then the glossary, dictionary and query tools are disabled, since their input conditions cannot be satisfied. If the user selects a phrase, then only the glossary and query tools are available, while selecting a block of text (spanning multiple sentences) leaves only the query tool available.
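The enabling and disabling of tools described above can be reproduced with a small Python sketch. It is an interpretation, not part of the proposal: the function names, the TOOLS table (which simply copies the SELTYPE values from the example HEAD) and in particular the assumed ordering word < phrase < block, under which an interface accepts any selection no larger than its SELTYPE unit, are all inferred from the behaviour described in this example.

```python
import re

# A sentence delimiter: a period followed by one or more whitespace characters.
SENTENCE_DELIMITER = re.compile(r"\.\s+")

# ASSUMPTION: selection units are ordered, and smaller units are acceptable
# wherever a larger unit is allowed.
RANK = {"word": 0, "phrase": 1, "block": 2}


def classify(selection):
    """Classify one selected string as "word", "phrase" or "block"."""
    text = selection.strip()
    if SENTENCE_DELIMITER.search(text):
        return "block"
    if len(text.split()) > 1:
        return "phrase"
    return "word"


def accepts(seltype, selections):
    """Return True if a list of selections satisfies the given SELTYPE."""
    multiple = seltype.startswith("multi")
    unit = seltype[len("multi"):] if multiple else seltype
    if not selections or (len(selections) > 1 and not multiple):
        return False
    return all(RANK[classify(s)] <= RANK[unit] for s in selections)


# SELTYPE of each interface in the example HEAD.
TOOLS = {"glossary": "phrase", "dictionary": "word",
         "index": "multiword", "query": "block"}


def available_tools(selections):
    """List the tools that remain enabled for the current selections."""
    return [name for name, seltype in TOOLS.items()
            if accepts(seltype, selections)]
```

Under these assumptions a single word leaves all four tools enabled, two separate words leave only the index tool, a multi-word phrase leaves the glossary and query tools, and a selection spanning two sentences leaves only the query tool, in agreement with the behaviour described above.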
Boolean operations are signified by graphically grouping and joining selected words and/or phrases. For example, simply selecting two words might be equivalent to a logical OR, while joining them together (by, for example, left-mouse dragging from one selected region to another and leaving a line joining the regions) could mean a logical AND [Masum 95]. Items could be grouped by selecting with the right mouse button, or by selecting items and using a keystroke (such as shift-mouse-click) to select and group. Groups could then be encircled with a line, negation indicated by dashed lines or different colors, and so on.
METAFORM interfaces could be applied to A anchors (A would need to take on an INTERFACE attribute). You could then use METAFORM to encode generic query data that should be appended to several different anchors (for example, a string encoding path-dependent information appropriate to a particular explored sequence of documents) -- every anchor referencing this METAFORM would automatically pass the appropriate information back to the server. Server preprocessing of a document to encode this path information prior to delivery to a user agent would only require processing the METAFORM content. This somewhat simplifies maintenance issues, at some expense to language complexity.
It should also be possible to include additional controls within the query interface, allowing the user to tune things such as word proximities (find documents with word A near word B, where the word separation is controlled by a numeric input element). However, as the complexity of the query rises, the advantages of METAFORM-like tools are largely lost, and it is probably best to use a fully implemented FORM-based utility.
Of course, what happens with the returned data is another issue that has not been addressed here. Ideally, one would like returned glossary or dictionary queries to appear in secondary windows or sidebars (à la Netscape frames).
The author would like to thank Manny Noik and Gene Golovchinsky for their comments and suggestions.
The following DTD fragment is based on the HTML 2.0 FORM DTD, and includes dependencies on that DTD (i.e., through entity definitions such as %linkExtraAttributes and ICADD SDA attributes).
<!-- CHANGES TO LINK -->
<!-- Only change from HTML 2.0 is to add INTERFACE attribute -->
<!ELEMENT LINK - O EMPTY>
<!ATTLIST LINK
        HREF CDATA #REQUIRED
        %linkExtraAttributes;
        INTERFACE CDATA #IMPLIED
        %SDAPREF; "Linked to : #AttVal (TITLE) (URN) (HREF)>"
        >
<!-- LINK                  Link from this document                -->
<!-- LINK HREF="..."       Address of link destination            -->
<!-- LINK URN="..."        Lasting name of destination            -->
<!-- LINK REL=...          Relationship to destination            -->
<!-- LINK REV=...          Relationship of destination to this    -->
<!-- LINK TITLE="..."      Title of destination (advisory)        -->
<!-- LINK METHODS="..."    Operations allowed (advisory)          -->
<!-- LINK INTERFACE="..."  Address of Interface to use            -->

<!-- NEW ADDITIONS RELATIVE TO HTML 2.0       -------- -->
<!-- **************************************** -------- -->
<!-- The elements METAFORM and MINPUT are new -------- -->
<!ELEMENT METAFORM - - (MINPUT)+>
<!ATTLIST METAFORM
        METHOD (%HTTP-Method) GET
        ENCTYPE %Content-Type; "application/x-www-form-urlencoded"
        ID ID #REQUIRED
        %SDAPREF; "<Para>Form:</Para>"
        %SDASUFF; "<Para>Form End.</Para>"
        >
<!-- METHOD=...       HTTP Method for submission          -->
<!-- ENCTYPE="..."    Encoding mechanism for data         -->
<!-- ID=...           Name token identifier for metaform  -->

<!ENTITY % MetaInputType "(BOOLEAN | TEXTSELECT | SUBMIT | HIDDEN)">
<!-- TEXTSELECT       user selection of document text     -->
<!-- HIDDEN           hidden fields                       -->
<!-- SUBMIT           submit data to server               -->

<!ELEMENT MINPUT - O EMPTY>
<!ATTLIST MINPUT
        TYPE %MetaInputType; TEXT
        NAME CDATA #IMPLIED
        VALUE CDATA #IMPLIED
        MAXLENGTH NUMBER #IMPLIED
        DESC CDATA #IMPLIED
        SELTYPE (word|multiword|phrase|multiphrase|block|multiblock) #IMPLIED
        TAGS (TAGS) #IMPLIED
        %SDAPREF; "Input: "
        >
<!-- TYPE=...         Type of input interaction               -->
<!-- NAME=...         Name of form datum                      -->
<!-- VALUE="..."      Default/initial/selected value          -->
<!-- MAXLENGTH=...    Data length maximum                     -->
<!-- DESC="..."       Text description of input element       -->
<!-- SELTYPE=...      Selection mode for TYPE="textselect"    -->
<!-- TAGS             "textselect" includes markup tags       -->