xmdp-brainstorming

From Microformats Wiki
Revision as of 18:21, 30 July 2008 by Tantek (talk | contribs) (rm tying of a brainstorm to a version number - it's premature and bad practice to imply any specific brainstorm will go into any specific version, noted prev attempts at full parsability have failed)

XMDP Brainstorming

contributors

Add your name here if you make significant contributions to this page and wish to take responsibility for them.

introduction

Tantek Çelik developed XMDP to define extensions to XHTML, including rel values, class names, and <meta name> properties and values. Per the XMDP spec, a link to a microformat's XMDP in the profile attribute of the head element indicates that the microformat's vocabulary is formally defined in that XMDP document. A parser could read the allowed attribute values from the linked XMDP and thus know explicitly which microformats may be in use, and which class names are meant to convey which meanings.
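
As a rough illustration of "reading the allowed attribute values", the following Python sketch walks a profile laid out as nested definition lists and collects the terms it finds. The sample profile, the class name XMDPVocabulary, and the function read_vocabulary are made up for this example, and the sketch assumes each term carries an id attribute on its <dt>; it is not a complete XMDP parser.

```python
from html.parser import HTMLParser

# Hypothetical sample profile (not a real published XMDP).
SAMPLE_XMDP = """
<dl class="profile">
  <dt id="rel">rel</dt>
  <dd>
    <dl>
      <dt id="tag">tag</dt>
      <dd>The link target is a tag for the current page.</dd>
    </dl>
  </dd>
  <dt id="class">class</dt>
  <dd>
    <dl>
      <dt id="vcard">vcard</dt>
      <dd>Root of a contact card.</dd>
      <dt id="fn">fn</dt>
      <dd>Formatted name.</dd>
    </dl>
  </dd>
</dl>
"""


class XMDPVocabulary(HTMLParser):
    """Collect {attribute name: [allowed values]} from nested <dl> lists."""

    def __init__(self):
        super().__init__()
        self.dl_depth = 0
        self.current_attribute = None
        self.vocabulary = {}

    def handle_starttag(self, tag, attrs):
        if tag == "dl":
            self.dl_depth += 1
        elif tag == "dt":
            term_id = dict(attrs).get("id")
            if term_id is None:
                return
            if self.dl_depth == 1:
                # Outer list: the term names an attribute (rel, class, ...).
                self.current_attribute = term_id
                self.vocabulary.setdefault(term_id, [])
            elif self.dl_depth == 2 and self.current_attribute:
                # Inner list: the term is an allowed value of that attribute.
                self.vocabulary[self.current_attribute].append(term_id)

    def handle_endtag(self, tag):
        if tag == "dl":
            self.dl_depth -= 1


def read_vocabulary(xmdp_source):
    """Parse XMDP markup and return the vocabulary it declares."""
    parser = XMDPVocabulary()
    parser.feed(xmdp_source)
    return parser.vocabulary
```

Calling read_vocabulary(SAMPLE_XMDP) yields a dictionary with the keys "rel" and "class", which a parser could then use to decide which attribute values to look for.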

This page is for exploring possible additions / extensions to XMDP.

See xmdp-faq and xmdp-issues for questions and issues.


Possible XMDP Additions

resolving when microformats may be in use

Currently the potential existence of microformats in a document can be declared by referencing the profile URLs for those microformats in the profile attribute of the head element of that document.

In addition to the profile attribute, the rel-profile value is being strongly considered for inclusion in an update to XMDP. See the rel-profile page for details.

In short: another way would be to include a link such as <a rel="profile" href="XMDP URL">powered by microformat xyz</a> within the container element for the microformat. The XMDP spec could then specify that an <a> element used in this way indicates that the microformat is used by the element containing it.

Issues:

  • Not every microformat has a container element. Consider rel-tag, one of the most widely used microformats.
    • RESOLVED. This is easily resolved by having the context of the rel-profile be the parent of the element with rel-profile and its descendants, or perhaps the later siblings of the element with rel-profile and their descendants.
  • To some extent, using microformats adds to the size of the document, just as using markup adds to the size of a plain text document. Putting <a> elements with each microformat adds unwanted links on top of that.
    • RESOLVED. There is no need to add an <a> for each instance of a microformat, as the profile for a microformat can be declared once, perhaps near the top of the body of the document. In practice, many pages that use microformats already link to the microformats specs themselves with badges or "powered by" links which could easily be modified to link to profiles using <a rel="profile"> hyperlinks, no additional links needed.
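
The first scoping rule suggested above (the context is the parent of the rel-profile link, plus that parent's descendants) could be sketched as follows. This is only one reading of the proposal: the sample markup, the example.org URL, and the function names profile_scopes and in_scope are assumptions for illustration, and the sketch requires well-formed XHTML because it uses an XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; the profile URL is a placeholder.
SAMPLE_BODY = """
<body>
  <div id="sidebar">
    <a rel="profile" href="http://example.org/profiles/rel-tag">powered by rel-tag</a>
    <a rel="tag" href="/tags/xmdp">xmdp</a>
  </div>
  <div id="main">
    <a rel="tag" href="/tags/other">other</a>
  </div>
</body>
"""


def profile_scopes(root):
    """Map each profile URL to the list of elements that scope it."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    scopes = {}
    for link in root.iter("a"):
        if "profile" not in link.get("rel", "").split():
            continue
        # Scope is the parent of the rel-profile link (or the root itself).
        scope = parent_of.get(link, root)
        scopes.setdefault(link.get("href"), []).append(scope)
    return scopes


def in_scope(element, scope):
    """True if element is the scope element or one of its descendants."""
    return any(node is element for node in scope.iter())
```

With the sample above, the rel-tag link in the sidebar falls inside the declared scope, while the one in the main column does not. Declaring the profile once near the top of the body, as the resolution suggests, would make the body element the scope.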

root class name identification

It could be quite convenient for "generic/universal" microformat parsers if they could read an XMDP profile and understand which of the defined class names were root class names for microformats, and thus be able to distinguish those object boundaries.

One simple thought would be that the first class name defined in a profile (e.g. hcard-profile) is the root class for that microformat. Problems:

  • What about an XMDP that defines multiple microformats?
  • What about a microformat that defines multiple possible root class names (e.g. hCalendar permits "vcalendar" or "vevent", hAtom permits "hfeed" or "hentry")?

Another possible solution is to name the root class in the fragment identifier of the profile URL:

<!-- This profile link indicates that "vcard" is a root class name. -->
<head profile="http://www.w3.org/2006/03/hcard#vcard">
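
A parser could recover the root class name from such a URL with very little code. The Python sketch below is a minimal illustration; split_profile is a hypothetical helper name, and treating the fragment as a root class name is the convention proposed here, not something the XMDP spec defines.

```python
from urllib.parse import urldefrag


def split_profile(profile_attribute):
    """Return (profile URL, root class name or None) pairs.

    The profile attribute may hold several space-separated URIs.
    """
    pairs = []
    for uri in profile_attribute.split():
        url, fragment = urldefrag(uri)
        # An empty fragment means no root class was declared for this URI.
        pairs.append((url, fragment or None))
    return pairs
```

Under this sketch, a microformat with several possible root class names might be handled by listing the same profile more than once with different fragments, though that is an assumption rather than part of the proposal.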

linking to the XMDP

As hinted in the note on "when microformats may be in use", there are additional methods under discussion for linking to the XMDP in addition to the current method of using the profile attribute of the head element:

  • Using <link rel="profile" href="link to XMDP"/>. This method can be used now and will be formalized in XHTML 2.
    • A problem with this method is that it (still) requires access to the head element.
  • Using <a rel="profile" href="link to XMDP">powered by microformat xyz</a> in the body of the document.
    • As noted by a number of people, this approach has the added benefit of creating a viral marketing opportunity for the microformats used. For instance, developers could add badges saying they are using microformat xyz as suggested by the example.
    • Blog authoring environments allow you to insert links at will, so this squarely obviates the need to access the head element.
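
A consumer that wanted to honour all three linking methods could gather profile URLs along these lines. This is a sketch using Python's standard html.parser; the class name ProfileLinkCollector, the function collect_profiles, and the example.org URLs in the usage note are invented for illustration.

```python
from html.parser import HTMLParser


class ProfileLinkCollector(HTMLParser):
    """Gather profile URLs from head@profile, <link> and <a> elements."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def _add(self, url):
        # Keep first-seen order and skip duplicates or missing hrefs.
        if url and url not in self.profiles:
            self.profiles.append(url)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "head":
            # The profile attribute is a space-separated list of URIs.
            for url in (attributes.get("profile") or "").split():
                self._add(url)
        elif tag in ("link", "a"):
            rel_values = (attributes.get("rel") or "").lower().split()
            if "profile" in rel_values:
                self._add(attributes.get("href"))


def collect_profiles(html_source):
    """Return every profile URL declared in the page, in document order."""
    collector = ProfileLinkCollector()
    collector.feed(html_source)
    return collector.profiles
```

For a page whose head carries profile="http://example.org/p/one" and whose body contains <a rel="profile" href="http://example.org/p/two">, collect_profiles returns both URLs, head first.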

includes / aggregate profiles

Methods for including one or more values, properties, or an entire XMDP into another XMDP, as a way of creating an aggregate profile that effectively contains definitions from multiple profiles, would be quite useful. They would enable documents with microformats to refer to a single profile URL rather than a complete space-separated set of all the profile URLs of the microformats that may be in use.
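
No include syntax has been proposed yet, but the resolution step a parser would need can be sketched independently of the markup. In the Python sketch below, the in-memory PROFILES registry, its "includes" and "defines" keys, and the example.org URLs are all assumptions made for illustration; a real implementation would fetch and parse each profile instead.

```python
# Hypothetical registry of already-parsed profiles, keyed by URL.
PROFILES = {
    "http://example.org/p/hcard": {"defines": {"class": {"vcard", "fn"}}},
    "http://example.org/p/rel-tag": {"defines": {"rel": {"tag"}}},
    "http://example.org/p/blog": {
        "includes": ["http://example.org/p/hcard", "http://example.org/p/rel-tag"],
        "defines": {"class": {"post"}},
    },
}


def resolve_profile(url, profiles, _seen=None):
    """Merge the vocabulary of `url` with everything it includes."""
    seen = _seen if _seen is not None else set()
    if url in seen:
        # Guard against profiles that include each other in a cycle.
        return {}
    seen.add(url)
    entry = profiles[url]
    merged = {}
    for included in entry.get("includes", []):
        for attribute, values in resolve_profile(included, profiles, seen).items():
            merged.setdefault(attribute, set()).update(values)
    for attribute, values in entry.get("defines", {}).items():
        merged.setdefault(attribute, set()).update(values)
    return merged
```

A document would then only need to reference the aggregate URL (here the hypothetical blog profile) to pull in the hCard and rel-tag vocabularies as well.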

vocabulary aliasing

An XMDP document could be used to define a microformat profile that is nothing more than a simple dictionary mapping between an existing, non-standard set of HTML classes and the terms in a standard microformat profile. This would allow a publisher to support a given microformat by merely using the URI of a new profile document as the value of an individual document's head/profile attribute, rather than modifying the individual class values throughout each document to conform to an existing profile. Initial suggestion with use case description in this microformats-discuss post. Note (from Kevin's response) that HTML class attributes can contain multiple values, e.g. class="post hentry", so a publisher doesn't have to discard their existing class values to use those of a microformat.
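
On the parsing side, such a dictionary profile would amount to a lookup applied to each class attribute before normal microformat parsing. The Python sketch below shows the idea; the ALIASES table (mapping hypothetical publisher class names to hAtom terms) and the function effective_classes are illustrative only.

```python
# Hypothetical alias table: publisher's class name -> microformat class name.
ALIASES = {
    "post": "hentry",
    "post-title": "entry-title",
    "post-body": "entry-content",
}


def effective_classes(class_attribute, aliases):
    """Return the class tokens a parser would see after applying aliases."""
    tokens = class_attribute.split()
    result = list(tokens)
    for token in tokens:
        standard = aliases.get(token)
        # Add the standard term once, keeping the publisher's own classes.
        if standard and standard not in result:
            result.append(standard)
    return result
```

For example, effective_classes("post featured", ALIASES) gives ["post", "featured", "hentry"], which is the same token set the publisher could have written by hand as class="post featured hentry".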

subclassing / ontology addition

One may want to introduce a new property (or value) and base it on an existing property (or value). In this sample XMDP, the value "self" is defined, based on the value "me" from XFN 1.1:


<dl class="rel">
  <dt id='self'><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
   <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>

Two interesting pieces have been added: a URL with a fragment identifier pointing into another XMDP profile, and a rev attribute. The rev value in this example is 'extends'. This means that the value the link refers to ("me") is extended by the value "self". One could then write an XMDP that lists all the possible rev values ('extends', 'inverse', 'equivalent', etc.) and use them to 'alias' one microformat property to another.

A universal XMDP validator/parser/etc could extract data across two or more XMDP profiles and potentially reason between them. This could create a small ontology.
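
Extracting those relationships is straightforward when the profile is well-formed XHTML. The Python sketch below reads the sample above and emits (term, relation, target) triples; the function name term_relations is hypothetical, and what a reasoner would then do with the triples is left open.

```python
import xml.etree.ElementTree as ET

# The sample XMDP fragment from this section.
SAMPLE = """
<dl class="rel">
  <dt id="self"><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
  <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>
"""


def term_relations(xmdp_fragment):
    """Return (local term, relation, target URL) triples from rev links."""
    root = ET.fromstring(xmdp_fragment)
    triples = []
    for term in root.iter("dt"):
        for link in term.iter("a"):
            # rev may hold several space-separated relation names.
            for relation in link.get("rev", "").split():
                triples.append((term.get("id"), relation, link.get("href")))
    return triples
```

Running term_relations(SAMPLE) produces a single triple linking "self" to the XFN "me" value via "extends".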

It is not clear if this idea actually has utility or is simply a solution looking for a problem.

XMDP XML Schema

The link shows a bad example of creating XMDP from an XSD schema. The big question, I guess, is why? Having XMDP defined in XSD should make it easier for machines to read Microformats: rules and strict data typing would allow Microformats to be validated when contained within an XML/XHTML document. If a document uses microformats with an XSD behind them, simple XPath queries can be used to harvest the information, which can then be rendered to plain XML for translation to RDF or other XML transport formats.

An XSD behind XMDP also has distinct advantages for CMS authors: the XSD sitting behind the xforms or sxforms used for data entry into a CMS can also be used to generate XMDP and valid Microformats when rendering content. In theory this should make it easier for CMS authors to develop a semantic core around data before exporting to XHTML + Microformats, RDF etc., and/or make data querying via web services a little more straightforward.

Follow up

Having looked into Microformats a little more I realise how bad that example is; however I still feel that placing a schema behind XMDP is a worthwhile exercise. I don't mind spending a little time on this if anyone feels it's a worthwhile exercise, but I'd propose the following:

  • Define a loose set of microformat conventions (i.e. a meta property will be bound to an attribute etc.), and have these defined in a microformat namespace (mf:?).
  • Create an XSD for common microformat fields without structures (dtStart etc.), with XSD typing and mf: rules (e.g. mf:optional-html-attribute-binding="title" or mf:html-attribute-binding="href"; names were never my strong point)
  • Start working towards creating XSD schema including the common schema for agreed specifications

There would still need to be some form of link between the XMDP and the defining XSD (profile attribute or link element?). With these in place it should be possible for an application like tails, or new apps to pick up on any Microformat in a page and display the data, without the application having to be aware of the specific Microformat standard.

Microformats are cool, especially the fact that you don't have to be a rocket scientist to start using them. However if there can be a way of interleaving grassroots microformat adoption into the more complex semantic forms (RDF etc.), through XML then that's got to be a bonus?

more here

ID Attribute

A problem that I've had using XMDP is that it requires the use of the ID attribute (e.g. <dt id="foo">foo</dt>) to define the term "foo". As (X)HTML only allows one element with any given ID, this raises problems if you need to define the same term multiple times -- e.g. to define "category" as a class within both hcard and hcalendar, or to define "copyright" as both a class value and a rel value. TobyInk 06:26, 18 Feb 2008 (PST)

automatic parsability enabling

The current XMDP is useful for people to read and learn about a microformat, but of very limited utility for automating the parsing of microformats/poshformats (it only identifies the vocabulary to parse for, and which attributes to look for it in). It would be nice if people could design their own poshformats, create an XMDP profile, and have the poshformat be instantly parsable by machines as a result. Here is the information that I think would need to be added to XMDP for this to be possible:

For each profile defined:

  • What is/are the root class name(s) of the microformats being defined by the XMDP? (As previously brainstormed above: root class name identification.) (required)
  • What are the properties of each microformat? Or alternatively (and preferably), which microformat(s) may a property be used with? (to handle the common and encouraged case of vocabulary re-use across microformats) (required)

For each property defined:

  • A human-readable description of what the property means (XMDP already has this)
  • Is it a class/rel/id (or rev, but deprecated) value (XMDP already has this)
  • Is it singular or plural? (default: plural)
  • What datatype is it? (e.g. text, URI, email, datetime, duration. default:text)
  • Might it contain a nested poshformat/microformat? If so, then this profile should link to the profile of the nested poshformat/microformat. (Multiple formats could be defined in the same XMDP profile, using ID attributes to link from one to the other.)
  • What nested subproperties might be found within it? Or alternatively (and preferably), whether a property is actually a subproperty, and if so, which properties may it be used inside? (again, to handle the common and encouraged case of vocabulary re-use) (Perhaps this could be indicated using a nested profile.)
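
To make the checklist above concrete, here is one way the extra information might look once a parser has loaded it. The Python data structures below are a sketch only: the names PropertyDefinition, ProfileDefinition, and used_in are invented, the defaults follow the list above (plural, text), and the small hCard-like sample is illustrative rather than a faithful copy of the hCard profile.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PropertyDefinition:
    name: str
    description: str = ""                  # human-readable meaning
    attribute: str = "class"               # class, rel, id (or rev)
    plural: bool = True                    # default: plural
    datatype: str = "text"                 # text, URI, email, datetime, ...
    nested_profile: Optional[str] = None   # URL of a nested format's profile
    # Root class names or parent properties this property may appear in.
    used_in: List[str] = field(default_factory=list)


@dataclass
class ProfileDefinition:
    root_classes: List[str]
    properties: List[PropertyDefinition]

    def properties_for(self, container):
        """Names of the properties that may appear inside `container`."""
        return [p.name for p in self.properties if container in p.used_in]


# Illustrative sample, loosely modelled on hCard.
HCARD_SKETCH = ProfileDefinition(
    root_classes=["vcard"],
    properties=[
        PropertyDefinition("fn", "Formatted name", plural=False, used_in=["vcard"]),
        PropertyDefinition("url", "Home page", datatype="URI", used_in=["vcard"]),
        PropertyDefinition("street-address", "Street", used_in=["adr"]),
    ],
)
```

Recording where a property may be used (used_in) rather than listing each microformat's properties follows the preferred alternative in the list, since it lets one property definition be shared across several microformats.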

We must expect that there will always be some parsing rules (e.g. hAtom's "hunt the author" game) which will not be expressible in a machine readable profile format, but it may be possible to cover 90% of the information a parser should need for most microformats.

Indeed, experience has shown that any "real world" semantic markup language that gets significant use requires LOTS of special custom parsing rules (e.g. HTML is not fully parseable simply from the DTD, nor is RSS from the RSS DTD).

Thus while it may make sense to take incremental steps towards capturing more about a microformat in XMDP, full enabling of machine parsability should not be a short-term (nor even medium-term) goal, as others have tried (DTD, RelaxNG, XML Schema) and failed to achieve this.

See Also

Parsing Microformats