Markup Scanner

From MemberWiki

Introduction

This page holds James Margaris's very early, preliminary JavaScript for the OpenAjax Hub markup scanner, posted for people to review.

Changes from last time (I tried to bold them but bold and pre don't play well together)

  1. Samples have been updated to use toolkit IDs properly rather than goofy names like "redButton"
  2. API changes make it clear that keys are toolkit IDs
  3. When the script is loaded, the OpenAjax onload handler is added automatically via the oaa addOnLoad() method. oaa.js is not merged with markupScanner.js, but markupScanner.js depends on oaa.js. Putting the call in the <body onload> handler is no longer needed.
  4. Toolkits now have a chance to examine nodes themselves to indicate that they would like to handle those nodes. This can be used when the default node examination is insufficient. In addition, the examination process can return an arbitrary object that will be passed back to the handler method. This avoids double examination: once to determine whether the node should be handled, and again during the node handling process. Note that if all toolkits choose to implement their own node examiner, much of the efficiency will be lost.
  5. When parsing the oaType attribute we handle values in the form of TOOLKIT_ID as well as TOOLKIT_ID:RANDOM_OTHER_STRING. In the second case the RANDOM_OTHER_STRING will be passed back to the handler method when that is called.
  6. Test files now print out expected results.
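The splitting rule in item 5 can be illustrated with a small standalone sketch. The function name `parseOaType` is hypothetical and is not part of the scanner; it only mirrors the rule described above (everything before the first colon is the toolkit ID, and the remainder is what gets passed back to the handler).

```javascript
// Hypothetical helper (not part of the scanner API): splits an oaType
// attribute value into the toolkit ID and the optional trailing string.
function parseOaType(value) {
	if (!value) {
		return null; // no oaType attribute, nothing to report
	}
	var colonIndex = value.indexOf(":");
	if (colonIndex === -1) {
		// plain TOOLKIT_ID form
		return { toolkitId: value, other: null };
	}
	// TOOLKIT_ID:RANDOM_OTHER_STRING form, split at the first colon only
	return {
		toolkitId: value.substring(0, colonIndex),
		other: value.substring(colonIndex + 1)
	};
}

var plain = parseOaType("toolkit1");
console.log(plain.toolkitId, plain.other);  // prints "toolkit1 null"

var composite = parseOaType("toolkit2:Button");
console.log(composite.toolkitId, composite.other);  // prints "toolkit2 Button"
```

Splitting at the first colon only means that any further colons stay inside the string handed to the handler, which leaves each toolkit free to define its own sub-syntax.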

JavaScript source code

Sorry about having to copy/paste from the inline code below.

// prevent re-definition of the OpenAjax object
if(!window["OpenAjax"]){
	window.OpenAjax= {

		examinerTable : {}, // map of toolkitId -> {func, scope}; a plain object, since keys are strings

		registerNodeExaminer : function(
			toolkitId, // the toolkit identifier
			refOrName,// function or string that will be called with the HTML subtree root element
			scope){ //object, default is window

			/* summary:
			 *		Registers a callback so each library can scan each node for their own
			 *		markup to help them identify nodes that they handle if they have needs
			 *		beyond our built-in mechanism. If we cannot find a default indication
			 * 		that the node is handled by a toolkit we will let all the registered
			 *		node examiners take a look.
			 *
			 * toolkitId:
			 *		The ID of the toolkit being registered as a node examiner.
			 *
			 * refOrName:
			 *		A function object reference or the name of a function to be
			 *		called with each node. This function should return a non-null
			 *		object if it positively identifies the node. That object will then
			 *		be passed back into the handler method.
			 * scope:
			 *		Optional. An Object in which to execute refOrName when
			 *		handling the event.
			 */


			if(!scope){
				scope = window;
			}

			if(typeof refOrName == "string"){
				// get a function object
				refOrName = scope[refOrName];
			}

			if(typeof refOrName != "function"){
				throw new Error("invalid function reference passed to OpenAjax.registerNodeExaminer()");
			}

			//TODO what if this is replacing one that is already there? error?

			this.examinerTable[toolkitId] = {
				"func": refOrName, 
				"scope": scope
			};
		},


		handlerTable : {}, // map of toolkitId -> {func, scope}; a plain object, since keys are strings

		registerNodeHandler : function( toolkitId, //the toolkit ID to look for
			refOrName,// function or string that will be called with the HTML subtree root element
			scope){ //object, default is window

			/* summary:
			 *		Registers a handler for the given toolkitId that can be found as one of the following:
			 *		<div oa:type="x"/> (actually oaType for now for simplification)
			 *		<oa:x ..../> (doesn't work yet)
			 *		<div class="oa-x someOtherClass etc etc"/> (doesn't work yet)
			 * toolkitId:
			 * 		The toolkit/library key to look for in the various places.
			 *
			 * refOrName:
			 *		A function object reference or the name of a function to be
			 *		called when the document is loaded.
			 * scope:
			 *		Optional. An Object in which to execute refOrName when
			 *		handling the event.
			 */


			if(!scope){
				scope = window;
			}

			if(typeof refOrName == "string"){
				// get a function object
				refOrName = scope[refOrName];
			}

			if(typeof refOrName != "function"){
				throw new Error("invalid function reference passed to OpenAjax.registerNodeHandler()");
			}

			//TODO what if this is replacing one that is already there? error?

			this.handlerTable[toolkitId] = {
				"func": refOrName, 
				"scope": scope
			};
		},

		getNodeHandler : function (toolkitId){ //the toolkit ID to return the handler info for

			/* summary:
			 *		Returns the function/scope object registered for this toolkitId.
			 */
			return this.handlerTable[toolkitId];
		},

		scanForIds : function( idsToScanFor ){ //array of IDs in the doc we want to cherry-pick
			//we could also make this scan arguments in order rather than assuming an array

			/* summary:
			 *		Scans the document for the specific array of IDs rather than
			 *		scanning every element.
			 */

			for (var i = 0 ; i<idsToScanFor.length; i++){
				this.scanNode(document.getElementById(idsToScanFor[i]), true);
			}	
		},

		scanDocument : function(){

			/* summary:
			 *		Scans the document. If idsToScanFor is declared this will
			 *		scan only those IDs, otherwise it will scan starting at body
			 */

			if (this.idsToScanFor){
				this.scanForIds(this.idsToScanFor);
			}
			else{
				this.scanNode(document.body);
			}
		},



		scanNode : function ( node , //the node to recursively search for oaType registered things
			shallowScan ){ //if true, only scan this node not any children

			/* summary:
			 *	Scans the given node for an oaType with a valid handler and passes
			 *	the node to the handler function. If the node is not handled
			 *	then children are recursively scanned.
			 *	Note that this will not behave well if the handler method messes
			 *	too much with the structure, for example if it removes the node
			 * 	being handled without replacing it, removes some previous siblings
			 * 	or something to that effect. Adding later siblings or children,
			 * 	or replacing the node with another should not be a problem.
			 * 	If we don't find an oaType we let the nodeExaminers examine the nodes 
			 *	for identifiers.
			 */

			if (!node || node.nodeType!=1) return;
			
			var toolkitInfo = this.extractToolkitInfo( node );


			//if we didn't find the toolkitId using our default mechanism, 
			//let each toolkit look at the node to look for whatever it might be expecting
			if (!toolkitInfo){
				for ( var i in this.examinerTable){
					var returnValue = this.examinerTable[i].func.call(this.examinerTable[i].scope, node);

					//if this returned a real result it means that toolkit is handling this node
					//and we make the returned object part of the info we'll pass in
					if (returnValue){
						toolkitInfo = {
							"toolkitId": i,
							"other": returnValue //TODO get a better name
						};
						break;
					}
				}
			}			


			var nodeHandled = false;

			if (toolkitInfo){
				var handler = this.getNodeHandler(toolkitInfo["toolkitId"]);

				//TODO is this an error if they have a type but it doesn't map to anything?
				if (handler){
					handler.func.call(handler.scope, node, toolkitInfo );
					nodeHandled = true;
				}
			}

			//if the node was handled then do not recursively parse the child nodes
			if (!nodeHandled && !shallowScan){		
				var childNode, i = 0, childNodes = node.childNodes;
				while(childNode = childNodes[i++]){
					this.scanNode(childNode);
				}
			}
		},

		extractToolkitInfo: function( node ){ //the node to look into to find the type

			/* summary:
			 *	Looks at a node and tries to extract toolkit info out of it.
			 * 	Currently this will look for TOOLKIT_ID or TOOLKIT_ID:XXXX
			 *	and return a composite object with "toolkitId" and "other" fields.
			 */

			if (node.nodeType!=1) return;

			//TODO look in className, tag name, oa:type, etc etc
			//and allow optimizations of that

			var toolkitString = node.getAttribute("oaType");
			var randomObject = null;

			if (!toolkitString ){
				return;
			}

			//if the form is TOOLKIT_ID:XXXX break that up into different fields
			var colonIndex = toolkitString.indexOf(":");
			if (colonIndex != -1){
				randomObject = toolkitString.substr(colonIndex+1);
				toolkitString = toolkitString.substr(0, colonIndex);
			}

			return {
				"toolkitId": toolkitString, 
				"other": randomObject //TODO change this name obviously
			};
		}

		
	}	
}

//automatically register the scanDocument method to run on onLoad,
//TODO we need a way to optionally turn this off, you can do that now by
//scanning for IDs with a blank array
//If window.oaa is not there we should just add ourselves directly to onload?
if (window.oaa && window.oaa.addOnLoad){
	window.oaa.addOnLoad("scanDocument", window.OpenAjax);
}


HTML test file 1

Sorry about the inline HTML below.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
	"http://www.w3.org/TR/html4/loose.dtd">
<html>
	<head>
		<title>Open Ajax Alliance Markup Scanner Reference Implementation Test Case</title>
		<script language="JavaScript" type="text/javascript" src="oaa.js"></script>
		<script language="JavaScript" type="text/javascript">
			function log(){
				var ldiv = document.createElement("div");
				for(var x=0; x<arguments.length; x++){
					ldiv.appendChild(document.createTextNode(arguments[x]+" "));
				}
				document.body.appendChild(ldiv);
			}

			window.oaa.addOnLoad(function(){
				log("Expected results:");
				log("------------------------");
				log("bar nodeName:DIV ToolkitID:toolkit2 Other:Button");
				log("bar nodeName:DIV ToolkitID:toolkit2 Other:Button");
				log("Handle element and recurse nodeName:DIV ToolkitID:toolkit3 Other:Button");
				log("Handle element nodeName:DIV ToolkitID:toolkit1 Other:Button");
				log("Handle element and recurse nodeName:DIV ToolkitID:toolkit4 Other:customerHandlerForToolkit4");
				log("------------------------");
				log("\n\n");
				log("Actual results:");
				log("------------------------");
			});
		</script>

		<script language="JavaScript" type="text/javascript" src="markupScanner.js"></script>
		<script language="JavaScript" type="text/javascript">

	
			function infoString(node, toolkitInfo){
				return "nodeName:" + node.nodeName + " ToolkitID:" + toolkitInfo.toolkitId + " Other:" + toolkitInfo.other;
			}			

			function handleElement( node , toolkitInfo){
				log("Handle element " + infoString(node, toolkitInfo));
			}

			function handleElementAndRecurse( node , toolkitInfo ){
				log("Handle element and recurse " + infoString(node, toolkitInfo));


				var childNode, i = 0, childNodes = node.childNodes;
				while(childNode = childNodes[i++]){
					OpenAjax.scanNode(childNode);
				}
			}

			function lookForSpecialAttribute( node ){
				var s = node.getAttribute("toolkit4");
				if (s && s=="true"){
					return "customerHandlerForToolkit4";
				}
			}


			var foo = {
				bar: function( node , toolkitInfo){ 
					log("bar " + infoString(node, toolkitInfo));
				}
			}

			OpenAjax.registerNodeHandler("toolkit1", handleElement);

			//test scope object and function name as string
			OpenAjax.registerNodeHandler("toolkit2", "bar", foo);

			//test handler doing manual recursion
			OpenAjax.registerNodeHandler("toolkit3", handleElementAndRecurse);

			//add a special node examiner that will look for toolkit4="true"
			//rather than oaType
			OpenAjax.registerNodeExaminer("toolkit4", lookForSpecialAttribute);
			OpenAjax.registerNodeHandler("toolkit4", handleElementAndRecurse);

		</script>
	</head>
	<body>
		This tests a variety of recursive and non-recursive handlers. It also registers a node examiner that looks for the custom attribute named "toolkit4" to augment the default way of determining if elements are handled by a toolkit.
<br><br><hr>
		<div>
			<div oaType="toolkit2:Button"></div>
		</div>
		<div>
			
			<!-- because recursion is not automatic and the handler for toolkit2:button
does not do it manually we won't see toolkit1:button getting handled -->
			<div oaType="toolkit2:Button">
				<div oaType="toolkit1:Button"></div>
			</div>
		</div>

		<div>

<!-- The handler for toolkit3:Button manually recurses so we should see toolkit1:Button handled -->
			<div oaType="toolkit3:Button">
				<div oaType="toolkit1:Button"></div>
			</div>
		</div>

		<div toolkit4="true"></div>

	</body>
</html>

Test scanning only for specific Element IDs

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
	"http://www.w3.org/TR/html4/loose.dtd">
<html>
	<head>
		<title>Open Ajax Alliance Markup Scanner Reference Implementation Test Case</title>
		<script language="JavaScript" type="text/javascript" src="oaa.js"></script>
		<script language="JavaScript" type="text/javascript">
			function log(){
				var ldiv = document.createElement("div");
				for(var x=0; x<arguments.length; x++){
					ldiv.appendChild(document.createTextNode(arguments[x]+" "));
				}
				document.body.appendChild(ldiv);
			}

			window.oaa.addOnLoad(function(){
				log("Expected results:");
				log("------------------------");
				log("Handle element and recurse nodeName:DIV ToolkitID:toolkit3 Other:button");
				log("Handle element nodeName:DIV ToolkitID:toolkit1 Other:button");
				log("------------------------");
				log("\n\n");
				log("Actual results:");
				log("------------------------");
			});
		</script>
		<script language="JavaScript" type="text/javascript" src="markupScanner.js"></script>

		<script language="JavaScript" type="text/javascript">

			function infoString(node, toolkitInfo){
				return "nodeName:" + node.nodeName + " ToolkitID:" + toolkitInfo.toolkitId + " Other:" + toolkitInfo.other;
			}

			function handleElement( node , toolkitInfo){
				log("Handle element " + infoString(node, toolkitInfo));
			}

			function handleElementAndRecurse( node, toolkitInfo ){
				log("Handle element and recurse " + infoString(node, toolkitInfo));

				var childNode, i = 0, childNodes = node.childNodes;
				while(childNode = childNodes[i++]){
					OpenAjax.scanNode(childNode);
				}
			}

			var foo = {
				bar: function( node , toolkitInfo ){ //toolkitInfo must be a parameter; it is used below
					log("bar " + infoString(node, toolkitInfo));

				}
			}

			OpenAjax.idsToScanFor = ["myDiv"];
			OpenAjax.registerNodeHandler("toolkit1", handleElement);

			//test scope object and function name as string
			OpenAjax.registerNodeHandler("toolkit2", "bar", foo);

			//test handler doing manual recursion
			OpenAjax.registerNodeHandler("toolkit3", handleElementAndRecurse);
		</script>
	</head>
	<body>
		This file has the same setup as testMarkupScanner.html but only looks at the "myDiv" element
	rather than every element. This is done with <code>OpenAjax.idsToScanFor = ["myDiv"];</code>
		<br><br><hr>

		<div>
			<div oaType="toolkit2:button"></div>
		</div>
		<div>
			
			<!-- because recursion is not automatic and the handler for toolkit2:button
does not do it manually we won't see toolkit1:button getting handled -->
			<div oaType="toolkit2:button">
				<div oaType="toolkit1:button"></div>
			</div>
		</div>

		<div>

<!-- The handler for toolkit3:button manually recurses so we should see toolkit1:button handled -->
			<div id="myDiv" oaType="toolkit3:button">
				<div oaType="toolkit1:button"></div>
			</div>
		</div>

	</body>
</html>

Issues

  1. [JF] ISSUE: Doesn't handlerKey need to support namespaced keys (e.g., mytoolkit:redbutton)? I think it does in order to prevent collisions between multiple Ajax libraries, which might try to register the same names. Right now the examples do not show namespace prefixes. Thus, instead of "mytoolkit:redbutton", the examples just show "redbutton". Does the current approach allow prefixes (with colons) in the values for handlerKey? I would think not, but I'm not knowledgeable enough with JavaScript syntax to know if colons cause JavaScript problems when indexing into handlerObject with a name that contains a colon. [/JF]

    [JM (James Margaris)] The value of an XML attribute should allow colons. The name of the XML attribute may be trickier, which is why I am looking for oaType instead of oa:type. There shouldn't be any problem with the attribute value having a colon, in either JavaScript or XML/HTML. (I don't think so, but I can check.) I should update the test code to clarify: really the handler key should be a toolkit specifier like "Dojo" or "Zimbra". So a more real-world example would be oaType="dojo", which means the element is handled by the dojo callback. "Redbutton" and "bluebutton" are poorly chosen names as they reflect the widget type rather than the toolkit type. [/JM]

    [TT (Ted Thibodeau)] Edited "redButton" and similar to "redKit:redButton"... Perhaps would be better as "red:button" to simultaneously demonstrate toolkit namespace safety? [/TT]

    [JF] Yes, I think red:button would be better. [/JF]

    [JM] The samples have now been changed to reflect these concerns, and the JavaScript code will handle oaType="ID" or oaType="ID:XXX" and give the node handler method access to that XXX string. [/JM]
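On the JavaScript side of the question raised in this issue: a colon is legal in a property name as long as the property is accessed with bracket notation (or written as a quoted key in an object literal). A minimal sketch, in which the `handlers` table is hypothetical and merely stands in for a lookup table keyed by namespaced strings:

```javascript
// Hypothetical lookup table keyed by namespaced strings.
var handlers = {};

// Bracket notation accepts any string as a key, including one with a colon.
handlers["mytoolkit:redbutton"] = function () { return "red"; };
handlers["othertoolkit:redbutton"] = function () { return "other red"; };

// The two namespaced keys do not collide.
console.log(handlers["mytoolkit:redbutton"]());    // prints "red"
console.log(handlers["othertoolkit:redbutton"]()); // prints "other red"

// Dot notation cannot express such keys (handlers.mytoolkit:redbutton is a
// syntax error), so bracket notation is required.
```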

  2. [JF] ISSUE: Should the hub look for particular strings ("redbutton") used as a tagname or an attribute value within a fixed set of attributes (e.g., oa:type, oaType, and class), or should the hub offer more general and flexible search facilities? My thinking is the latter. Why do we have to constrain the available options? One way to allow greater flexibility is to use a subset of XPath to define the matches. For example, handlerKey might be expressed as "@openajax:type='redbutton'". For class attribute searches, we would need to define an XPath extension function, such as openajaxClass(), as part of the supported XPath grammar. The obvious tradeoff is that the hub will have more JavaScript logic (for the XPath subset parser) and developers will have to learn this subset of XPath, but the XPath subset will be easy to explain by example and we would get generality, flexibility, and a growth path into the future that is compatible with standards and therefore would be more likely to promote native implementation of parts of the OpenAjax Hub within future browsers. [/JF]

    [JM] One issue with doing something with XPath would be that XPath would have to be run for each distinct query. Under the current model at most you scan the entire tree once. If you had "@openajax:type='redbutton'" and "@openajax:type='bluebutton'" as two separate identifiers each of those would have to be run separately, which would be two separate entire tree traversals. From an end-user perspective I'm not sure if leaving things open is better than having them well-defined more strictly. [/JM]

    [JF]During our phone call, there was little enthusiasm for XPath, which is fine with me, but it is also undesirable to invent complicated new APIs or a new sub-grammar. [/JF]

    [OZ (Ondrej Zara)] Another proposal for this: let each toolkit provide a bool-returning function, which decides whether the element in question should be handled by the respective toolkit? [/OZ]
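A rough sketch of the predicate idea proposed at the end of this issue, for comparison with the node examiner mechanism described in the change list (which returns an arbitrary object rather than a boolean). All names here (`registerPredicate`, `findOwner`, `fakeElement`) are hypothetical, and plain objects stand in for DOM elements so that the sketch runs outside a browser.

```javascript
// Hypothetical registry: each toolkit supplies a predicate that returns
// true when it wants to handle the given element.
var predicates = [];

function registerPredicate(toolkitId, predicate) {
	predicates.push({ toolkitId: toolkitId, predicate: predicate });
}

// Returns the ID of the first toolkit whose predicate accepts the node,
// or null when no toolkit claims it.
function findOwner(node) {
	for (var i = 0; i < predicates.length; i++) {
		if (predicates[i].predicate(node) === true) {
			return predicates[i].toolkitId;
		}
	}
	return null;
}

// Stand-in for a DOM element: only nodeType and getAttribute are mimicked.
function fakeElement(attributes) {
	return {
		nodeType: 1,
		getAttribute: function (name) {
			return Object.prototype.hasOwnProperty.call(attributes, name)
				? attributes[name]
				: null;
		}
	};
}

registerPredicate("toolkit4", function (node) {
	return node.getAttribute("toolkit4") === "true";
});

console.log(findOwner(fakeElement({ toolkit4: "true" }))); // prints "toolkit4"
console.log(findOwner(fakeElement({ id: "plain" })));      // prints null
```

Because registration order decides which toolkit wins when two predicates accept the same node, a real design would need to state that rule explicitly.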

  3. [JF] ISSUE: From a branding perspective, it would be better to use the eight letters "openajax", but obviously "oa" is shorter. In this case, I think the extra six letters are worth it. [/JF]

    [TT] "openajax" seems better from an inline documentation perspective as well: a quick Google search shows millions of pages mentioning "oaa" and "oa", but only a few mentioning "openajax"; likewise, there are more than 1500 matching "oaType", but none yet matching "openajaxType". [/TT]

    [JF]During our phone call, we agreed to use "openajax" for starters and see how cumbersome it is. [/JF]

  4. [JF] ISSUE: I realize that this is a work in progress, but in going through the minutes from the previous meeting, I thought I should mention a feature that we talked about which I don't see yet, which is declarative page scanning. The example above only shows procedural page scanning. One thing to consider is that maybe version 1 should only support procedural page scanning (i.e., the developer must manually invoke the hub's page scanner within the onload handler) and push off declarative page scanning to the future (although we should think it through now to make sure we aren't painting ourselves into a corner). [/JF]

    [JM] I'm not sure what exactly the distinction is between these (declarative vs. procedural scanning). Rather than having the scanner invoked by onLoad(), I would rather have it invoked automatically (unless turned off). Including the script file that contains the scanner would be giving tacit approval to scan the page. Ideally the page scanner would register itself as one of the onLoad() handlers on the OpenAjax event hub. So the event hub (the thing that handles delegation of load() and unload() and other events) would be loaded first, then the scanner would be loaded, register itself for the onLoad() callback, and run then. This is one thing I would like to do in the near future. [/JM]

    [JF]During our phone call, it was pointed out that most Ajax toolkits today work via implicit, automatic actions, associated with some sort of load event, so I believe there was consensus that the markup scanner needed to have a mechanism where in the default case it scanned the document without requiring the developer to make a JavaScript call to set it off. However, we definitely agreed that the developer needs a way to turn off the (default) scanning and have a way to invoke the scanning process himself.[/JF]

    [JM]The code now scans by default by hooking into the oaa.js mechanism for adding onload events. The user no longer has to do anything themselves in the onload() method of the body tag.[/JM]

    [OZ] Our toolkit handles inclusion of used libraries (.js files) by itself, by dynamically creating and appending the appropriate <script> nodes. This leads to the following issue: under certain circumstances (IE6), the 'onLoad' event is fired before all libraries are included. This is too early for the Markup Scanner to start work. I opt for manual invocation of the Markup Scanner (e.g. once all needed files have been successfully loaded; our system checks this). [/OZ]
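One way the manual invocation described in this issue could be wired up is a small counter that fires a callback once every dynamically requested library has reported in. This is only a sketch under assumptions: `createLoadTracker` and its methods are hypothetical names, and the callback stands in for a manual scan call such as OpenAjax.scanDocument().

```javascript
// Hypothetical tracker: counts outstanding library loads and runs a
// callback exactly once, after the last one reports completion.
function createLoadTracker(onAllLoaded) {
	var pending = 0;
	var started = false; // true once all loads have been requested
	var fired = false;

	function maybeFire() {
		if (started && pending === 0 && !fired) {
			fired = true;
			onAllLoaded();
		}
	}

	return {
		// call before requesting each library
		expect: function () { pending++; },
		// call from each library's load callback
		loaded: function () { pending--; maybeFire(); },
		// call after every library has been requested
		start: function () { started = true; maybeFire(); }
	};
}

// Usage: the callback stands in for a manual scan of the document.
var scans = 0;
var tracker = createLoadTracker(function () { scans++; });
tracker.expect();
tracker.expect();
tracker.start();
tracker.loaded();   // one library still pending, no scan yet
tracker.loaded();   // last library arrived, scan runs once
console.log(scans); // prints 1
```

The `started` flag guards against the case where the first library finishes loading before the second has even been requested, which would otherwise trigger the scan too early.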

  5. [OZ] ISSUE: The Markup Scanner should notify all participating toolkits that scanning has finished. Some widgets cannot be created piece by piece, but only after all the relevant pieces have been marked for execution. Example: a Tab control, which consists of one content container, multiple 'clicker' elements and multiple 'content' elements. Individual Tab pages (a 'clicker' and its 'content') should be created only after the 'main content area' has been processed. [/OZ]
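The notification requested in issue 5 might look roughly like the following. This is a sketch, not part of the current code: `registerScanCompleteListener` and `scanAndNotify` are hypothetical names, and the `scanFn` argument stands in for the real scan.

```javascript
// Hypothetical extension: listeners that run once a scan pass is complete.
var scanCompleteListeners = [];

function registerScanCompleteListener(listener) {
	scanCompleteListeners.push(listener);
}

// scanFn stands in for the real scan (for example OpenAjax.scanDocument).
function scanAndNotify(scanFn) {
	scanFn();
	for (var i = 0; i < scanCompleteListeners.length; i++) {
		scanCompleteListeners[i]();
	}
}

// Example: a tab control collects its pieces during the scan and only
// assembles itself when the scan is finished.
var collected = [];
var assembled = null;

registerScanCompleteListener(function () {
	assembled = collected.join("+");
});

scanAndNotify(function () {
	// the real node handlers would push pieces as they are encountered
	collected.push("container", "clicker", "content");
});

console.log(assembled); // prints "container+clicker+content"
```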
