The Web’s Syntax Problem

As @aefaradien notes, the web has a syn­tax prob­lem. It’s this: A user wishes to post some­thing com­pli­cated — text with links, for­mat­ting, even inline graph­ics. They go to a web­site and are faced with a text box and a flash­ing cur­sor. What do they type? What syn­tax will help them achieve their goal?

It depends entirely on which website they’re on and what powers it. With any luck the text box will have an area below it explaining how to use it, but chances are the user won’t read it. The knowledgeable user has a whole bunch of questions:

  • Can I use HTML? The internet is made of HTML (and cats). Once the post is submitted, it’ll be sent to everyone else’s browser as HTML, so can I just write in HTML anyway? But HTML is complex: am I restricted to a certain subset? Do I have to worry about breaking the website’s formatting? Is the site using some weird CSS that’s going to distort my post? Could I introduce security vulnerabilities?
  • Is the syntax HTML-like? Am I using a phpBB-powered forum, or another one that supports its BBCode syntax? Something else that is HTML-like but not true HTML? To make something bold, do I write <b> or [b]?
  • Is the syn­tax Wiki-like? And what even is Wiki-like? Medi­aWiki, which pow­ers Wikipedia, prob­a­bly has the most pop­u­lar syn­tax out there, but each wiki is sub­tly dif­fer­ent. If I Camel­Case words, will they become links? If I sur­round a word with *aster­isks*, will it become bold? What about apos­tro­phes? Forward-slashes?
  • Is it some­thing much stranger? Could it be some­thing like Mark­down, which could inter­pret some unin­ten­tional mean­ing from my text because I don’t know its syntax?

To my mind, there’s no simple solution to this problem. Each syntax has its own strengths and weaknesses, and the developers of each web platform, blog or forum app have their own preferences. BBCode has some traction, but it’s so close to HTML that you have to ask: why not just use HTML? Wiki markup is great for linking to internal wiki pages, but not so great for anything else. And Markdown and its cohort of technically superior solutions just don’t have any traction in the real (non-geek) world.

I think that if this problem is ever to be solved (and I must say I don’t think it’s likely), we have no option but to pick the lowest common denominator, because nothing else will ever have enough traction.

And here’s where I make myself unpop­u­lar: the com­mon denom­i­na­tor is HTML. But HTML used with some intelligence:

  • Auto-link URLs, but deal with it if users want to use <a> tags. Nothing’s more annoy­ing than hav­ing to copy-paste a URL into your loca­tion bar because it’s not actu­ally a hyper­link. Also, it breaks the web.
  • Deal grace­fully with spe­cial char­ac­ters. If a user doesn’t know HTML, they should be penalised as lit­tle as pos­si­ble for using tri­an­gu­lar brack­ets in their text.
  • Limit HTML as lit­tle as pos­si­ble. Sure, don’t allow <IFRAME> or <SCRIPT>, but if there’s no way a user’s HTML could be harm­ful (includ­ing to lay­out and design), let them use it.
  • Don’t use weird CSS. If you don’t want users to use <h3> because your <h3> is 72px high, change your CSS. You design a web­site for its users, and that includes giv­ing them what they expect when they use their own HTML in their posts.
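
The first three points above can be sketched in a few lines of Python. This is only a rough illustration, not how any particular blog or forum engine does it: the function name render_post, the ALLOWED_TAGS set and the regular expression are all made up for this example, and a real site would want a proper HTML sanitiser rather than a regex. The idea is to escape everything by default, let a small allowlist of harmless tags through, accept simple <a href="..."> links, and turn bare URLs into links.

```python
import html
import re

# Hypothetical allowlist of attribute-free tags; a real site would pick its own.
ALLOWED_TAGS = {"b", "i", "em", "strong", "code"}

# One pattern with three alternatives, tried in this order:
#   1. a simple <a href="http(s)://...">label</a> link
#   2. an attribute-free opening or closing tag such as <b> or </em>
#   3. a bare http(s) URL
TOKEN_RE = re.compile(
    r'<a\s+href="(?P<href>https?://[^"\s<>]+)"\s*>(?P<label>[^<]*)</a>'
    r"|<(?P<slash>/?)(?P<tag>[a-zA-Z][a-zA-Z0-9]*)>"
    r'|(?P<url>https?://[^\s<>"]+)',
    re.IGNORECASE,
)


def render_post(raw: str) -> str:
    """Turn a user's post into HTML, escaping anything not explicitly allowed."""
    out = []
    open_tags = []  # stack of allowed tags that are currently open
    pos = 0
    for m in TOKEN_RE.finditer(raw):
        # Text between tokens is escaped, so stray < and & show up literally.
        out.append(html.escape(raw[pos:m.start()]))
        pos = m.end()
        if m.group("href") is not None:
            href = html.escape(m.group("href"))
            label = html.escape(m.group("label"))
            out.append(f'<a href="{href}">{label}</a>')
        elif m.group("url") is not None:
            url = m.group("url")
            # Keep sentence punctuation that follows a URL out of the link.
            link = url.rstrip(".,;:!?)")
            safe = html.escape(link)
            out.append(f'<a href="{safe}">{safe}</a>' + html.escape(url[len(link):]))
        else:
            tag = m.group("tag").lower()
            closing = m.group("slash") == "/"
            if tag not in ALLOWED_TAGS or (closing and tag not in open_tags):
                # Disallowed tag or stray closing tag: show it as plain text.
                out.append(html.escape(m.group(0)))
            elif closing:
                # Close everything opened since the matching opening tag.
                while True:
                    top = open_tags.pop()
                    out.append(f"</{top}>")
                    if top == tag:
                        break
            else:
                open_tags.append(tag)
                out.append(f"<{tag}>")
    out.append(html.escape(raw[pos:]))
    # Close anything the user forgot to close, so the page layout survives.
    out.extend(f"</{t}>" for t in reversed(open_tags))
    return "".join(out)


print(render_post("if a < b, use <b>bold</b>, not <script>"))
# if a &lt; b, use <b>bold</b>, not &lt;script&gt;
```

A user who knows no HTML gets their triangular brackets shown as typed, a user who does know HTML gets their <b> and <a> tags, and nobody gets to inject a <SCRIPT>. Tags with attributes (other than the simple link form) are simply shown as text here, which is stricter than it needs to be; loosening that safely is where a real sanitiser earns its keep.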

And that’s that. By auto-linking URLs and gracefully dealing with triangular brackets, we’re giving users who don’t know the syntax what they expect. For users who know HTML, we’re not making them learn some other new syntax that offers only a slight improvement. And users who want to learn the syntax so that they can do more complex things will be learning HTML, which opens up far more of the internet to them than knowing BBCode or Markdown syntax.

Thoughts, as always, appreciated!
