OpenAjax Metadata Specification Descriptive
Notes:
- This wiki page holds a portion of the latest internal editorial draft for what ultimately will become the OpenAjax Metadata 1.0 Specification. The base wiki page for this (latest) version of the draft specification is at http://www.openajax.org/member/wiki/OpenAjax_Metadata_Specification.
Here is the proposed way for the IDE WG to edit this specification:
- All textual content that has normal text properties (i.e., black) and has no colored status text (e.g., does not say Tentatively approved or Approved) represents preliminary proposed text that requires further review and discussion.
- Members of OpenAjax Alliance are encouraged to place inline comments into this document, preferably by identifying your name or initials before your comment, such as "Mary: This bullet should be moved above previous bullet".
- When an item has been deemed complete by the WG chair due to committee consensus, then the annotation Approved will be placed at the end of an approved block of text (e.g., paragraph or bullet). Note that during the development phase of this specification, the group can change its mind and "un-approve" text that was previously approved.
- When an item has been deemed near complete by the WG chair due to committee consensus, but further review is necessary (e.g., review of final wording), then the annotation Tentatively approved will be placed at the end of the given block of text (e.g., paragraph or bullet).
8 Descriptive elements and attributes
Introduction
This chapter describes the various elements and attributes that provide descriptive metadata. The sections below describe:
- <aboutMe> element: supplemental text information about an author
- <author> and <authors> elements: information about one or more authors
- <description> element: a text description (presumably longer than <title>)
- <example> and <examples> elements: one or more illustrative examples
- <icon> and <icons> elements: one or more small images (i.e., icons)
- <license> element: textual information about licensing
- <quote> element: a quotation that a person associates with his signature
- <reference> element and <references> element: indicates a reference to other related documentation
- <remarks> element: supplemental textual comment beyond the information found in the <description>
- <title> element and <directoryTitle> element: short descriptive text suitable as a label or title bar
- 'type' attribute: indicates whether an element's textual content is plain text (i.e., text/plain) or HTML rich text (i.e., text/html)
Subsequent sections describe the relevant attributes and elements.
Element and attribute collections defined in this chapter
The following is the schema definition for descriptive_elements, which is referenced by other elements defined in other chapters:
descriptive_elements = ( description_element? & examples_element? & remarks_element? & references_element? & reference_element* & title_element? )
Elements defined in this chapter
AboutMe elements
<aboutMe>
aboutMe_element = element aboutMe {
aboutMe_content & aboutMe_attributes & foreign_attributes
}
aboutMe_content = (
plain_text_or_html
)
aboutMe_attributes = (
locid? & type?
)
The <aboutMe> element is a child element of <author> and can be used to provide arbitrary supplemental textual information about the author beyond the information found in the <author> element's other attributes and sub-elements.
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Here is an example:
<widget xmlns="http://openajax.org/metadata">
<authors>
<author name="Charlie Chaplin">
<aboutMe>I have many skills</aboutMe>
</author>
</authors>
...
</widget>
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
2008-05-29 Tentatively approved: all of the above
Author elements
<author>
author_element = element author {
author_content & author_attributes & foreign_attributes
}
author_content = (
aboutMe_element? & quote_element?
)
author_attributes = (
email? & location? & name? & organization? &
photo? & website?
)
The <author> element describes one of the authors of the given APIs. Supplemental information about the author beyond what is expressed in the various attributes can be supplied in either the <aboutMe> or <quote> sub-elements.
The name attribute contains the author's name.
The email attribute contains the author's email address.
The location attribute specifies the author's primary geographical location, such as "Paris, France".
The organization attribute contains the organization (e.g., company) name with which the author is associated.
The photo attribute contains a URL where there is a picture of the person.
The website attribute contains the URI of the author's website.
2008-05-13, 2008-05-29 Tentatively approved: all of the above
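As an illustration of how a tool might read the <author> attributes and sub-elements described above, here is a minimal Python sketch. It is not part of the specification. The sample metadata and all attribute values other than the name and the <aboutMe> text are hypothetical, and the helper name read_authors is an assumption made for this example. The sketch only reads the leading text of <aboutMe> and <quote>, so HTML content with child elements would need additional handling.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <widget> example in this chapter.
NS = {"oam": "http://openajax.org/metadata"}

# Hypothetical metadata; the attribute values are made up for illustration.
SAMPLE = """\
<widget xmlns="http://openajax.org/metadata">
  <authors>
    <author name="Charlie Chaplin"
            email="charlie@example.com"
            location="Paris, France"
            organization="Example Studios"
            photo="http://example.com/charlie.png"
            website="http://example.com/">
      <aboutMe>I have many skills</aboutMe>
      <quote>A hypothetical quotation</quote>
    </author>
  </authors>
</widget>
"""

# The optional attributes listed in author_attributes.
AUTHOR_ATTRIBUTES = ("email", "location", "name", "organization", "photo", "website")


def read_authors(xml_text):
    """Return one dict per <author>; missing attributes map to None."""
    root = ET.fromstring(xml_text)
    authors = []
    for author in root.findall("oam:authors/oam:author", NS):
        info = {attr: author.get(attr) for attr in AUTHOR_ATTRIBUTES}
        # Only the leading text is read here (a simplification).
        info["aboutMe"] = author.findtext("oam:aboutMe", default=None, namespaces=NS)
        info["quote"] = author.findtext("oam:quote", default=None, namespaces=NS)
        authors.append(info)
    return authors


authors = read_authors(SAMPLE)
print(authors[0]["name"], "-", authors[0]["location"])
```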
<authors>
authors_element = element authors {
authors_content & authors_attributes & foreign_nodes
}
authors_content = (
author_element*
)
authors_attributes = ( empty )
The <authors> element holds zero or more <author> child elements.
DRAFT CONSENSUS: All of the black text above reflects draft consensus from previous discussion.
Description elements
<description>
2008-04-24: Draft consensus. Include <description>, <examples>, <remarks> in widgets.
description_element = element description {
description_content & description_attributes & foreign_attributes
}
description_content = (
plain_text_or_html
)
description_attributes = (
locid? & type?
)
The <description> element provides a textual description for the <description> element's parent. For example, in the following case:
<widget ...>
<description>Bar chart widget with configurable title and axis labels.</description>
...
</widget>
the <description> element provides a textual description for the <widget> element (its parent).
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
DRAFT CONSENSUS: All of the black text above reflects draft consensus from previous discussion.
Example elements
<example>
2008-04-24: Draft consensus. Include <description>, <examples>, <remarks> in widgets.
example_element = element example {
example_content & example_attributes & foreign_attributes
}
example_content = (
plain_text_or_html
)
example_attributes = (
locid? & type?
)
The <example> element provides a single illustrative example (often, source code) for the <example> element's grandparent (i.e., the parent of the [note the plural] <examples> element).
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
<examples>
2008-04-24: Draft consensus. Include <description>, <examples>, <remarks> in widgets.
examples_element = element examples {
examples_content & examples_attributes & foreign_nodes
}
examples_content = (
example_element*
)
examples_attributes = ( empty )
The <examples> element holds zero or more <example> child elements.
Icon elements
<icon>
2008-04-24 Tentatively approved: Will have both <icons> and <icon> elements.
icon_element = element icon {
icon_content & icon_attributes & foreign_attributes
}
icon_content = (
empty
)
icon_attributes = (
width? & height? & src
)
The <icon> element specifies the URI for a graphical representation for the <icon> element's grandparent element (i.e., the parent of the [note the plural] <icons> element).
2008-04-24 Tentatively approved: <icon> will only have width, height, and src attributes, where width and height are in pixels. Other information such as intended purpose for the icon is expected to be tool-dependent, so we decided that any such information should be provided via XML namespace extensibility. Tools can pick the icon that appears to be the closest match to their needs and rescale as necessary. We felt color depth wasn't worth the effort since most design-time environments support color.
2008-04-24: Editor: include information that suggests what icon sizes are in popular use, such as 16x16, 32x32 and 64x64
JON: Various details before the spec ships: Are width and height required or optional? (Presumably optional.) If optional, then what are the defaults? (How about 32x32?) What happens if the width and height are wrong? (How about leaving that scenario as undefined and whatever happens happens.) Is it OK for tools to sniff the width/height from the image and ignore the specified width/height? (I would say yes. In other words, the width/height attributes are recommended so that tools that don't open the images have information that they can use when attempting to use the icons, but it's OK for tools to ignore these attributes.) Editor: I think the answers from 2008-06-17 were optional, no defaults (i.e., implementation dependent), undefined, and ok to sniff. Need to research minutes to make sure, and then update the spec to reflect this.
The width attribute specifies not yet written.
The height attribute specifies not yet written.
The src attribute specifies not yet written.
An example:
<widget ... >
...
<icons>
<icon width="36" height="36" src="lightwindow_pi.gif" />
<icon width="16" height="16" src="lightwindow.gif" />
</icons>
...
</widget>
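The notes above say that tools can pick the icon that appears to be the closest match to their needs. One possible way to do that is sketched below in Python. This is an assumption-laden illustration, not specified behavior: the function name pick_icon, the distance measure, and the choice to skip icons that lack a width or height are all decisions made for this example only.

```python
import xml.etree.ElementTree as ET

NS = {"oam": "http://openajax.org/metadata"}

# Well-formed variant of the example above (the "..." placeholders are omitted).
SAMPLE = """\
<widget xmlns="http://openajax.org/metadata">
  <icons>
    <icon width="36" height="36" src="lightwindow_pi.gif" />
    <icon width="16" height="16" src="lightwindow.gif" />
  </icons>
</widget>
"""


def pick_icon(xml_text, wanted_width, wanted_height):
    """Return the src of the icon whose declared size is closest to the wanted size."""
    root = ET.fromstring(xml_text)
    best_src, best_distance = None, None
    for icon in root.findall("oam:icons/oam:icon", NS):
        src = icon.get("src")
        width, height = icon.get("width"), icon.get("height")
        if src is None or width is None or height is None:
            # Size unknown: this sketch skips such icons. A real tool might
            # instead inspect the image file to discover its dimensions.
            continue
        # Sum of absolute differences in pixels; other measures are possible.
        distance = abs(int(width) - wanted_width) + abs(int(height) - wanted_height)
        if best_distance is None or distance < best_distance:
            best_src, best_distance = src, distance
    return best_src


print(pick_icon(SAMPLE, 32, 32))  # lightwindow_pi.gif
print(pick_icon(SAMPLE, 16, 16))  # lightwindow.gif
```

After choosing an icon, the tool would rescale it as necessary for display.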
2008-06-17 DRAFT CONSENSUS: <icon> does not need 'alt'. Alt text can be determined from surrounding info. Editor: update the spec to say that conforming user agents need to manufacture an 'alt' attribute on generated HTML img tags. But still need to see what happens with W3C widgets with 'alt' and optional rich content.
<icons>
icons_element = element icons {
icons_content & icons_attributes & foreign_nodes
}
icons_content = (
icon_element*
)
icons_attributes = ( empty )
The <icons> element holds zero or more <icon> child elements.
2008-04-24 Tentatively approved: Will have both <icons> and <icon> elements.
License elements
<license>
license_element = element license {
license_content & license_attributes & foreign_attributes
}
license_content = (
plain_text_or_html
)
license_attributes = (
locid? & type?
)
The <license> element describes license information about this widget. The textual content might contain the actual license text or might include a URL where the license text can be found.
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
Quote elements
<quote>
quote_element = element quote {
quote_content & quote_attributes & foreign_attributes
}
quote_content = (
plain_text_or_html
)
quote_attributes = (
locid? & type?
)
The <quote> element provides either plain text or HTML-formatted content containing a quotation that a person associates with his signature. The quote sometimes appears after other information about the person.
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
2008-05-29 Tentatively approved: all of the above
Reference elements
<reference>
reference_element = element reference {
reference_content & reference_attributes & foreign_attributes
}
reference_content = (
plain_text_or_html
)
reference_attributes = (
locid? & type?
)
The <reference> element indicates a reference to other related documentation. The content of the element provides the text that describes the reference.
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
2008-07-29 JON: We gave second-time approval to allowing the use of a custom protocol handler such as "openajaxmetadata:foo" to indicate a reference to other objects that are described by OpenAjax Metadata, where you would specify such a reference using an <a> element, as in: see <a href="openajaxmetadata:foo">foo</a>
<references>
references_element = element references {
references_content & references_attributes & foreign_nodes
}
references_content = (
inclusion_elements &
reference_element*
)
references_attributes = ( empty )
The <references> element holds zero or more <reference> child elements.
2008-07-29 DRAFT CONSENSUS: We added a <references> element.
Remarks elements
<remarks>
2008-04-24: Draft consensus. Include <description>, <examples>, <remarks> in widgets.
remarks_element = element remarks {
remarks_content & remarks_attributes & foreign_attributes
}
remarks_content = (
plain_text_or_html
)
remarks_attributes = (
locid? & type?
)
The <remarks> element provides optional supplemental textual comment (i.e., remarks) about the <remarks> element's parent element. It is assumed that the primary descriptive text is contained in the <description> element. The information from the <remarks> element is expected to appear only after the user indicates that he would like to see the supplemental remarks.
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
Title elements
<title>
title_element = element title {
title_content & title_attributes & foreign_attributes
}
title_content = (
plain_text_or_html
)
title_attributes = (
locid? & type?
)
The <title> element contains short descriptive text suitable as a label or title bar.
The type attribute specifies a MIME type that specifies the type of the content. For further details, see the section titled "Content that can be plain text or HTML" that is found later in this chapter.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
2008-??-?? Tentatively approved: all of the above
<directoryTitle>
directoryTitle_element = element directoryTitle {
directoryTitle_content & directoryTitle_attributes & foreign_attributes
}
directoryTitle_content = (
text
)
directoryTitle_attributes = (
locid?
)
The <directoryTitle> element contains a short text string that can be used as a widget name when displayed within a widget directory or catalog.
Attributes described elsewhere:
- The locid attribute is defined in the Localization chapter.
2008-07-29 Draft Consensus: Move directoryTitle from attribute to element so it can be localized.
Content that can be plain text or HTML
Several elements within the OpenAjax Metadata specification, such as the <description> element, allow content that can be either "plain text or HTML". Each of these elements has a type attribute that specifies a MIME type for the element's content.
This specification defines behavior for two possible values for the type attribute:
- text/plain: (the default) for plain text that should be presented "as is" to the user
- text/html: for rich text expressed as an HTML snippet that should be suitable as the innerHTML of an HTML <div> element and that should be processed and rendered by an engine that can parse and render HTML
text/plain
If the type attribute has the value text/plain or is unspecified, then the content of the element should be presented "as is" to the user. For example:
<description>A small amount of text (i.e., "plain text")</description>
should result in the user seeing the following:
A small amount of text (i.e., "plain text")
With text/plain, newlines and white space should be preserved. For example:
<description>This is line 1
This is line 2
This is line 3</description>
should result in the user seeing the following:
This is line 1
This is line 2
This is line 3
Because the OpenAjax Metadata file is expressed in XML, the metadata author MUST take appropriate measures to ensure that the plain text content does not contain any of the literal markup characters that have special meaning to an XML parser: "&", "<" and ">". If the plain text content uses any of these characters, then those characters can be expressed using the following entities: "&amp;", "&lt;" and "&gt;". For example, if you want a description that will present this string to the user: "where N is an integer & N > 0 & N < 20", then write:
<description>where N is an integer &amp; N &gt; 0 &amp; N &lt; 20</description>
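A metadata authoring tool could apply this escaping automatically. The following Python sketch shows one way to do so with the standard library. The helper name description_xml is hypothetical and only serves this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def description_xml(plain_text):
    """Wrap plain text in a <description> element, escaping &, < and >."""
    return "<description>" + escape(plain_text) + "</description>"


text = "where N is an integer & N > 0 & N < 20"
xml_fragment = description_xml(text)
print(xml_fragment)

# An XML parser turns the entities back into the original characters,
# so the user sees the string exactly as the author wrote it.
round_tripped = ET.fromstring(xml_fragment).text
print(round_tripped)
```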
text/html
If the type attribute has the value text/html, then the content of the element should be an HTML snippet that is presented to the user as rich text.
Here are examples using the <description type="text/html"> element:
[1] <description type="text/html">Text that is to be rendered as HTML but contains no markup</description>
[2] <description type="text/html">A small amount of text with <b>limited use</b> of HTML markup</description>
[3] <description type="text/html">
<p>This is an example that shows two HTML paragraphs</p>
<p>This is the second paragraph</p>
</description>
[4] <description type="text/html">
<![CDATA[
<p>Some HTML text that is not well-formed XML</p>
<ul>
<li>A bullet without a proper end element
<li>A second bullet
</ul>
]]>
</description>
- Example [1] should be rendered as "Text that is to be rendered as HTML but contains no markup".
- Example [2] should render the words "limited use" using boldface.
- Example [3] shows an example that has two paragraphs.
- Example [4] shows the use of the CDATA construct to bracket content that is not proper XML (because the </li> end tags are missing).
User agents SHOULD support the ability to render formatted HTML content as rich text; however, if unable to render HTML rich text, the user agent SHOULD still render the content somehow, such as stripping out the markup. For example:
<description type="text/html"> this word is <b>bold</b>. </description>
which might render as (after stripping out the markup):
this word is bold.
or alternatively the user agent might simply render the original markup exactly as is:
this word is <b>bold</b>.
(Note: Metadata authors may want to format their content such that the content is still readable if a user agent chooses to strip out the markup.) Need to add a note about preventing XSS attacks by converting all HTML characters into entities.
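For user agents that cannot render rich text, the markup-stripping fallback described above could look something like the following Python sketch. It relies on the standard library's HTML parser. The names _TextOnly and strip_markup are made up for this example, and the sketch makes no attempt to treat <script> or <style> content specially, so the result should only ever be displayed as plain text.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects character data and ignores all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(snippet):
    """Return the text of an HTML snippet with all tags removed."""
    parser = _TextOnly()
    parser.feed(snippet)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(strip_markup(" this word is <b>bold</b>. ").strip())  # this word is bold.
```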
Because the HTML content is meant to be suitable for inclusion within an HTML <div> element, the content SHOULD NOT contain any of the following HTML elements: <base>, <body>, <iframe>, <frame>, <frameset>, <head>, <html>, or <title>, and most of the time also SHOULD NOT contain <link>, <meta>, <style>, or <script>.
Metadata authors need to be aware that different user agents will support different subsets of HTML. Therefore, for common cases, metadata authors are advised to use a simple common subset of HTML for any rich text content to maximize portability.
User agents MUST take appropriate measures to prevent malicious HTML content from executing unwanted JavaScript logic, such as cross-site scripting (XSS) attacks due to JavaScript logic secretly embedded within the content. (See discussion of cross-site scripting at: [1]). A recommended approach is to adopt a "whitelist" approach, which filters the HTML content to allow only a known subset of elements and attributes and removes any elements and attributes that are not included in the whitelist.
Note that there are several advanced ways to include executable content within HTML, such as
<a href="javascript:alert('hi')">press me</a>
so it usually makes sense to do careful research and look for off-the-shelf software that addresses all possible cases for the whitelisting.
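To make the whitelist idea concrete, here is a deliberately small Python sketch. It is an illustration only and has not been reviewed as a security control, so, as noted above, well-tested off-the-shelf software is preferable in practice. The allowed element and attribute sets, the decision to accept only http and https links, and the names WhitelistFilter and filter_html are all assumptions made for this example.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical whitelist; a real deployment would choose its own subset.
ALLOWED_TAGS = {"a", "b", "br", "em", "i", "li", "ol", "p", "strong", "ul"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = {"http", "https"}   # relative URLs are dropped too (conservative)
VOID_TAGS = {"br"}                 # elements that have no end tag
DROP_CONTENT = {"script", "style"} # discard these elements and their contents


def is_safe_url(value):
    try:
        return urlsplit(value.strip()).scheme.lower() in SAFE_SCHEMES
    except ValueError:
        return False


class WhitelistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # the tag is removed; its text content is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def filter_html(snippet):
    """Return the snippet with everything outside the whitelist removed."""
    parser = WhitelistFilter()
    parser.feed(snippet)
    parser.close()
    return "".join(parser.out)


print(filter_html("<a href=\"javascript:alert('hi')\">press me</a>"))  # <a>press me</a>
```

The sketch rebuilds the output from parsed tokens instead of editing the input string, so anything the parser does not recognize as whitelisted markup is either dropped or emitted as escaped text.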
The detailed processing model for the HTML descriptive content is as follows:
- From the original source metadata file, extract the complete string of characters between the start element tag and the end element tag, including any white space and newlines. For example, with the <description> element, extract all characters between the <description> start tag and the </description> end tag. For <description>Here is the description </description>, the extracted text would be "Here is the description".
- Strip out all CDATA constructs. For example, with <description>PPP <![CDATA[-bbb-]]> QQQ</description>, the result text string would be "PPP -bbb- QQQ".
- Take appropriate measures such as whitelisting to filter out potentially malicious HTML constructs.
- Render the result of the previous steps using an engine capable of parsing and rendering rich text HTML content.
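The first two steps of this processing model can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the regular-expression extraction does not handle nested elements of the same name, comments, or CDATA sections that contain the end tag, and the function names are hypothetical. The filtering step is passed in as a callable because its details are left to the user agent. The example uses html.escape as a deliberately conservative stand-in that shows all markup as literal text.

```python
import html
import re

CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def extract_raw_content(source, tag):
    """Step 1: return the raw characters between the start and end tags."""
    name = re.escape(tag)
    pattern = re.compile(rf"<{name}\b[^>]*>(.*?)</{name}\s*>", re.DOTALL)
    match = pattern.search(source)
    return match.group(1) if match else None


def strip_cdata(raw):
    """Step 2: remove the CDATA wrappers but keep what they enclose."""
    return CDATA_RE.sub(lambda m: m.group(1), raw)


def prepare_html(source, tag, sanitize):
    """Steps 1 to 3; the result is what would be handed to the renderer."""
    raw = extract_raw_content(source, tag)
    if raw is None:
        return None
    return sanitize(strip_cdata(raw))


source = "<description>PPP <![CDATA[-bbb-]]> QQQ</description>"
print(prepare_html(source, "description", html.escape))  # PPP -bbb- QQQ
```

The fourth step, rendering, depends on the environment. In a browser-based tool, for example, the filtered string might be assigned to the innerHTML of a <div> element.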
