Rich Text Editing

From RuntimeWiki

Title

More Standard and Consistent Rich Text Editing Support

Detailed write-up

Description

Browsers support rich text editing, but each in its own way. The implementations are inconsistent with one another and have all sorts of issues.

Why Is This Important?

Rich text editing is at the core of many wildly popular web applications (such as blogs and wikis) and will be at the core of many more web applications going forward as the community adopts more Ajax.


Possible Workarounds

Some Ajax libraries (and commercial products) have developed some heroic JavaScript to enable rich text editing in the browser. However, text editing is a hugely complex feature, with internationalization issues that depend on arcane, language-specific rules for text direction (left-to-right, right-to-left, top-to-bottom), glyph selection, glyph rotation, and text wrapping.


Possible Solutions

Operating systems have to implement this stuff anyway. Expose it within the browser as a native feature instead of requiring hundreds of thousands of lines of JavaScript by people who don't have full expertise.

However, "exposing it within the browser" is much easier said than done. There are so many complexities that, if attempted in a standards body, it might take years to work everything out in a committee setting. Perhaps it would be better if one of the browsers took a crack at addressing this in a way that the other browsers could implement, and then proposed that approach to a standards body.

HTML5 proposes a 'contenteditable' attribute ([1]) as the markup and provides some information about the underlying functionality that needs to be supported, but text editing is a complicated technology area and the HTML5 spec probably only captures a fraction of what is needed to serve the world's editing requirements.


Background material that requests this feature

To be added.


Discussion

In this section, contributors should express their opinions about this feature request, such as providing particular technical analysis or describing in prose why this feature is important (or not). It is recommended that each contributor create their own level-3 sub-section.

Coach Wei Comments

There are workarounds for this, with hundreds of lines of JavaScript. Enhanced rich text editing is a key feature for making the web more interactive and useful (don't we spend most of our time working on different kinds of documents?). What I'd like to see:

  • more unified editing API support;
  • Easy API to turn on/off a section (DIV, iFrame, etc) from "viewing mode" to "edit mode";
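As a rough illustration of the second wish, here is a minimal sketch of what such a toggle could look like on top of the 'contenteditable' attribute. The helper names `setEditMode` and `toggleEditMode` are hypothetical, not an existing API; the only thing assumed about the element is the standard `contentEditable` property. An iframe-based editor would typically switch the framed document's `designMode` instead.

```javascript
// Hypothetical helper: put an element into "edit mode" or "viewing mode".
// `element` is anything exposing the standard contentEditable property
// (an HTMLElement such as a DIV in a browser).
function setEditMode(element, editable) {
  element.contentEditable = editable ? "true" : "false";
  return element.contentEditable === "true";
}

// Hypothetical helper: flip between the two modes.
// Returns true when the element is editable after the call.
function toggleEditMode(element) {
  return setEditMode(element, element.contentEditable !== "true");
}
```

In a page this might be wired to a button, for example `toggleEditMode(document.getElementById("post"))`, where the id "post" is made up for the example.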

Jon Ferraiolo Comments

I'm not sure what is the best technical approach here, whether to define a 'contenteditable' attribute, which turns on a black-box text editing feature, and then let the browsers compete on who offers the best editing features, or to have the browsers provide a set of low-level editing APIs that will allow Ajax libraries to compete by offering better text editing at the JavaScript level. Ultimately, we probably need both.

Adrian Herscu Comments

It seems that the serialization format will be a big issue... Consider, for example, that Joe is editing his blog on MS-Windows. His text will be serialized using some MS-Word component (perhaps). What will happen when Sally wants to read or re-edit Joe's text on a Mac?

Jon Ferraiolo: Adrian, what do you mean by serialization format? Wouldn't it just be (X)HTML? I realize that HTML isn't the best possible format for text editing, but it has the virtue of being widely supported.

@Jon: (X)HTML is about structure; the style is defined using CSS rules. Now, suppose that the rich text editor on MS-Windows/IE supports images as a bullet style, but Firefox (or Opera, Safari, etc.) does not support such a feature. The users of these browsers will not be able to manipulate these bullets correctly. Moreover, there might be an issue with the CSS rule names: one editor might call its bold rule "ms-bold" and another "moz-bold". What will happen when the user puts the cursor on something that is flagged as "ms-bold" in a non-MS browser? Will the bold button in the toolbar be activated? And there are many more examples... Try to export a DOC from MS-Word as (X)HTML and then import it into OO-Writer, and vice versa. There are even issues between different versions of the same software.

Jon Ferraiolo: Thanks for the explanation. I understand what you are saying and recognize the importance, but what you are asking for is not going to be easy from an industry politics perspective. We are right now in a MAJOR and expensive standards war in the editable office documents arena, taking place at ISO and within various government standards bodies, between companies who believe ODF should be the one and only standard and others (led by Microsoft) who want the standards world to approve OOXML (the Office file format) as a second official standard. This standards battle is so epic that it has been covered by the mainstream consumer press.

Brad Neuberg's comments

The web was meant to be a two-way medium. Microsoft was an early innovator here in actually getting a rich text control into the browser (thanks Microsoft!), but the amount of JavaScript needed to paper over the cross-browser differences (and add new features) is serious black magic. I say we just see which of the browsers has the best feature set and does things well, document it, and say that is the de facto standard for text editing, similar to what Ian did with the canvas tag in HTML 5 by just seeing what Safari was doing (with a few tweaks if appropriate, but done with restraint).

Serialization formats are hard, and run into lots of (fun) tech industry political battles. Is there anyone who has actually hacked on this stuff who can comment? What is the best serialization format you've seen so far: basic HTML, HTML + styles, XHTML, etc.? Please only comment if you've actually built rich text editor normalization code that papers over the serialization differences between the different rich text editors, whether doing the normalization on the server side (à la Blogger, I think) or on the client side (à la FCKeditor, I think). Let's just grab one that is well done and call this the standard serialization format.

Frederico Caldeira Knabben's comments

Having developed FCKeditor for more than 5 years has made me love and hate browsers in many ways, for many different reasons. The root of this is that we lack precise and in-depth standards for editing features. To work around that, we have almost completely ignored the browser features and implemented all of them in our own code.

The definition of editing standards is definitely urgent, considering that it will still take several years to be completed and implemented across all browsers. The HTML5 group is working to address this field. I'm just afraid it still isn't the way to go. My reflections about it can be found in the following discussion, held earlier this year on the W3C public-html mailing list:

http://lists.w3.org/Archives/Public/public-html/2008May/0294.html

I understand that the easy way, also for backward compatibility, would be to stick to a de facto standard. But, to summarize, the current de facto standards for editing, brilliantly proposed by MS with IE4 and wrongly adopted by other vendors, are outdated.

The most important thing developers look for in a rich text editor is flexibility. They don't want the editor to dictate how to edit or what the output will look like. They want to instruct the editor precisely about that. For example, they want to get a <b> when hitting the Bold button... or a <strong>... or a <span style="font-weight:bold">... or even a <span class="MyBoldClass">. So, it is a waste of time to keep on defining how the execCommand('bold') call should work.

I think that any editing standard should instead focus on providing the basic set of features that make it possible for JavaScript applications like FCKeditor to provide such a flexible rich editing experience. We have implemented many of these features inside our editor with thousands of lines of code, but browsers could certainly take care of most of it. The following are some of the things we have customized in code:
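To make the Bold example concrete, here is a minimal sketch of a configurable "bold" that lets the application pick the output markup. It works on plain strings rather than on a live selection, and every name in it (`BOLD_STYLES`, `applyInlineStyle`, `escapeHtml`) is hypothetical; it is not FCKeditor's code or any browser API.

```javascript
// Hypothetical presets: the four outputs a developer might want from "Bold".
const BOLD_STYLES = {
  b: { tag: "b" },
  strong: { tag: "strong" },
  inlineStyle: { tag: "span", attrs: { style: "font-weight:bold" } },
  cssClass: { tag: "span", attrs: { class: "MyBoldClass" } },
};

// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap `text` in the element described by `style` ({ tag, attrs }).
function applyInlineStyle(text, style) {
  const attrs = Object.entries(style.attrs || {})
    .map(([name, value]) => ` ${name}="${escapeHtml(value)}"`)
    .join("");
  return `<${style.tag}${attrs}>${escapeHtml(text)}</${style.tag}>`;
}
```

With this shape, the choice of markup is data supplied by the application: `applyInlineStyle("hello", BOLD_STYLES.cssClass)` and `applyInlineStyle("hello", BOLD_STYLES.strong)` go through the same code path and differ only in the style description passed in.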

  • The ENTER and SHIFT+ENTER key behavior.
  • The behavior of other keys, like TAB, BACKSPACE or DELETE.
  • Correct positioning of the caret inside block and inline elements.
  • A generic way to create blocks like headings, paragraphs, etc.
  • Proper creation of special block elements, like tables and horizontal rules.
  • Creation of lists.
  • Indentation and alignment.
  • Creation of grouping elements, like blockquote and div.
  • Links
  • A generic way to apply formatting/semantic inline elements, including the basic ones like bold and italic.
  • Well controlled remove format.
  • "selectionchange" event.
  • Visual presentation for invisible/dynamic content, like anchors and flash content.
  • Find / Replace.
  • Output serialization to *any* format.
  • Output formatting, for readability.

Almost all of the above could be provided by the browser, if we could get flexible and predictable results.

There are other things provided by the browser that we don't handle in code, just because they are beyond our possibilities or would drive us crazy to implement. These include text selection, caret positioning, full keyboard behavior control, resizing of controls (like images and tables), and real clipboard and drag-and-drop control.

One thing to consider, which also touches the serialization aspect, is that while people use a browser to edit text, that text is not necessarily intended to be published in a standard web page. It could end up in e-mail messages, on mobile devices, in paper books, or even streamed inside other applications, like Flash. So browsers and standards should keep in mind that they should not dictate the semantic value of the content. It is enough to simply respect a generic, globally accepted DTD like XHTML 1.0 Transitional to generate the DOM structure.

Then, having a well-supported DOM structure, the serialization part is easy. Browsers could have a simple serialization function that makes it possible to specify the desired format (with a doctype, or a code like "html/xhtml") and gives the JavaScript code ways to participate in the serialization process.

So my conclusion is: if standards are to be written, let's take a real editor out there and understand how and why things are done the way they are. After all, those are the people who specialize in this and face the end users every day. You can be sure that we are open to it at FCKeditor.
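A minimal sketch of what such a serialization function could look like, assuming a simplified tree of plain objects (`{ tag, attrs, children }`, with strings as text nodes) in place of a real DOM. The `serialize` function, its `format` option, and the `onElement` hook are all hypothetical names chosen for illustration; no browser exposes this API.

```javascript
// Elements that never have children (a small subset, for illustration).
const VOID_TAGS = new Set(["br", "hr", "img"]);

function escapeText(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function escapeAttr(value) {
  return escapeText(String(value)).replace(/"/g, "&quot;");
}

// Serialize a node tree to markup.
// options.format:    "html" (default) or "xhtml"; only changes void elements here.
// options.onElement: optional hook; may return a replacement element,
//                    or undefined to keep the element as it is.
function serialize(node, options = {}) {
  if (typeof node === "string") return escapeText(node);

  const replaced = options.onElement ? options.onElement(node) : undefined;
  const el = replaced || node;

  const attrs = Object.entries(el.attrs || {})
    .map(([name, value]) => ` ${name}="${escapeAttr(value)}"`)
    .join("");

  if (VOID_TAGS.has(el.tag)) {
    return options.format === "xhtml" ? `<${el.tag}${attrs} />` : `<${el.tag}${attrs}>`;
  }

  const inner = (el.children || []).map((child) => serialize(child, options)).join("");
  return `<${el.tag}${attrs}>${inner}</${el.tag}>`;
}
```

The hook is where application code participates: for instance, an `onElement` callback that returns `{ ...el, tag: "strong" }` for every `b` element rewrites the output without touching the tree that was edited.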

Phase I Voting - Vote for Your Top 5 Features

NOTE: PHASE I VOTING IS NOW OPEN. (2008-04-01) We have now changed the voting procedure. Instead of putting votes on each separate wiki page, we are asking people to cast their Phase I Votes on the following wiki page:


Phase II Voting

More about this later.
