Advanced semantic linking and transclusion.



kgun
07-03-2008, 02:07 PM
Below you find an example of how future markup and linking may be written:

1. links.xml


<?xml version="1.0"?>
<links xmlns:xlink="http://www.w3org/1999/xlink/namespace/">
<link xlink:type="extended" xlink:role="product-manufacturer">
<loc xlink:type="locator"
xlink:href="products.xml#xpointer(id('23428'))"
xlink:label="item"/>
<loc xlink:type="locator"
xlink:href="manufacturers.xml#xpointer(id('ABC'))"
xlink:label="madeby"/>
<go xlink:type="arc"
xlink:from="item"
xlink:to="madeby"/>
</link>
<link xlink:type="extended" xlink:role="similar-products">
<loc xlink:type="locator"
xlink:href="products.xml#xpointer(id('23428'))"/>
<loc xlink:type="locator"
xlink:href="products.xml#xpointer(id('75386'))"/>
<loc xlink:type="locator"
xlink:href="products.xml#xpointer(id('11111'))"/>
</link>
<link xlink:type="extended" xlink:role="similar-products">
<loc xlink:type="locator"
xlink:href="products.xml#xpointer(id('99999'))"/>
<loc xlink:type="locator"
xlink:href="products.xml#xpointer(id('11111'))"/>
</link>
</links>
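
To make the structure of the linkbase easier to see, here is a minimal Python sketch that reads it with the standard library and lists each extended link's role, locators, and arcs. It is only an illustration and not from the book: the names `read_linkbase`, `parse_href`, and `HREF_RE` are hypothetical, the XML is an abbreviated in-memory copy of links.xml, and it assumes every href has the form `file#xpointer(id('...'))` with the id in quotes.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI copied verbatim from the post so the sketch matches links.xml as given.
# (The W3C XLink namespace is http://www.w3.org/1999/xlink.)
XLINK = "http://www.w3org/1999/xlink/namespace/"

# Abbreviated in-memory copy of links.xml (assumption: ids are quoted).
LINKS_XML = """<links xmlns:xlink="http://www.w3org/1999/xlink/namespace/">
  <link xlink:type="extended" xlink:role="product-manufacturer">
    <loc xlink:type="locator" xlink:label="item"
         xlink:href="products.xml#xpointer(id('23428'))"/>
    <loc xlink:type="locator" xlink:label="madeby"
         xlink:href="manufacturers.xml#xpointer(id('ABC'))"/>
    <go xlink:type="arc" xlink:from="item" xlink:to="madeby"/>
  </link>
  <link xlink:type="extended" xlink:role="similar-products">
    <loc xlink:type="locator" xlink:href="products.xml#xpointer(id('23428'))"/>
    <loc xlink:type="locator" xlink:href="products.xml#xpointer(id('75386'))"/>
  </link>
</links>"""

# Matches only the simple "file#xpointer(id('X'))" form used in the example.
HREF_RE = re.compile(r"^([^#]+)#xpointer\(id\('([^']+)'\)\)$")


def parse_href(href):
    """Split a locator href into (document name, element id)."""
    match = HREF_RE.match(href)
    if match is None:
        raise ValueError(f"unsupported href: {href!r}")
    return match.group(1), match.group(2)


def read_linkbase(xml_text):
    """Return one dict per extended link: its role, locators, and arcs."""
    result = []
    for link in ET.fromstring(xml_text).findall("link"):
        locators = [parse_href(loc.get(f"{{{XLINK}}}href"))
                    for loc in link.findall("loc")]
        arcs = [(go.get(f"{{{XLINK}}}from"), go.get(f"{{{XLINK}}}to"))
                for go in link.findall("go")]
        result.append({"role": link.get(f"{{{XLINK}}}role"),
                       "locators": locators,
                       "arcs": arcs})
    return result


if __name__ == "__main__":
    for entry in read_linkbase(LINKS_XML):
        print(entry["role"], entry["locators"], entry["arcs"])
```

The point is that the link semantics (role, endpoints, direction) live outside the linked documents, so any consumer that understands the linkbase can read them without touching products.xml or manufacturers.xml.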


2. manufacturers.xml


<?xml version="1.0"?>
<mcatalog>
<manufacturer id='ABC'>
<title>ABC</title>
<description>ABC Ltd.</description>
</manufacturer>
<manufacturer id='XYZ'>
<title>XYZ</title>
<description>XYZ Ltd</description>
</manufacturer>
<manufacturer id='QRS'>
<title>QRS</title>
<description>QRS Ltd</description>
</manufacturer>
</mcatalog>


3. products.xml



<?xml version="1.0"?>
<catalog>
<product id='23428'>
<title>ABC Microwave Oven - Model 34X</title>
<description>Great oven!</description>
</product>
<product id='75386'>
<title>XYZ Microwave Oven - Model TRL7</title>
<description>Even a better model!</description>
</product>
<product id='11111'>
<title>QRS Microwave Oven - Model SDF</title>
<description>The ultimate in oven construction!</description>
</product>
<product id='99999'>
<title>QRS Convection Oven - Model LKJG</title>
<description>You won't believe how good this oven is!</description>
</product>
</catalog>

4. products.xsl


<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xlink="http://www.w3org/1999/xlink/namespace/">

<xsl:output method="html"/>
<xsl:template match="product">
<xsl:variable name="prod-id" select="@id"/>
<h1><xsl:value-of select="title"/></h1>
<p>These files cannot be combined in modern browsers</p>

<h3>Other similar products</h3>
<!--
Select all similar products where
- they are different from the current product
- there is a link that
- has the correct role (i.e., product similarity)
- includes the similar product
- includes the current product
-->

<xsl:for-each select="document('products.xml')/catalog/product">
<xsl:variable name="this-prod-id" select="@id"/>

<xsl:if test="($this-prod-id != $prod-id) and
document('links.xml')/links/link
[@xlink:role='similar-products']
[loc/@xlink:href[substring(substring-after(string(),
'#xpointer(id('), 2, 5)=$prod-id]]
[loc/@xlink:href[substring(substring-after(string(),
'#xpointer(id('), 2, 5)=$this-prod-id]]">
<xsl:value-of select="title"/><br/>
</xsl:if>
</xsl:for-each>

<hr/>
</xsl:template>
</xsl:stylesheet>

Source: Erik Wilde and David Lowe: XPath, XLink, XPointer, and XML: A Practical Guide to Web Hyperlinking and Transclusion (http://dret.net/transcluding/) chapter 8.4.3. Link Semantics.

There may be errors in the code that I have had no time to locate.
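
As a cross-check of what the stylesheet above is meant to select, here is a minimal Python sketch of the same "similar products" rule: another product counts as similar when some link with the role similar-products has locators for both it and the current product. This is an illustration only, not the book's code. The function name `similar_products` is hypothetical, the data is an in-memory stand-in for links.xml and products.xml with the titles abbreviated to the model names, and the ids are pulled out with a regular expression instead of the substring arithmetic used in the XSLT.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI copied verbatim from the post.
XLINK = "http://www.w3org/1999/xlink/namespace/"

# In-memory stand-ins for links.xml (similar-products links only) and products.xml.
LINKS_XML = """<links xmlns:xlink="http://www.w3org/1999/xlink/namespace/">
  <link xlink:type="extended" xlink:role="similar-products">
    <loc xlink:type="locator" xlink:href="products.xml#xpointer(id('23428'))"/>
    <loc xlink:type="locator" xlink:href="products.xml#xpointer(id('75386'))"/>
    <loc xlink:type="locator" xlink:href="products.xml#xpointer(id('11111'))"/>
  </link>
  <link xlink:type="extended" xlink:role="similar-products">
    <loc xlink:type="locator" xlink:href="products.xml#xpointer(id('99999'))"/>
    <loc xlink:type="locator" xlink:href="products.xml#xpointer(id('11111'))"/>
  </link>
</links>"""

PRODUCTS_XML = """<catalog>
  <product id="23428"><title>Model 34X</title></product>
  <product id="75386"><title>Model TRL7</title></product>
  <product id="11111"><title>Model SDF</title></product>
  <product id="99999"><title>Model LKJG</title></product>
</catalog>"""

# Extracts X from "...#xpointer(id('X'))".
ID_RE = re.compile(r"#xpointer\(id\('([^']+)'\)\)")


def similar_products(prod_id, links_xml=LINKS_XML, products_xml=PRODUCTS_XML):
    """Return (id, title) pairs for products linked as similar to prod_id."""
    catalog = ET.fromstring(products_xml)
    titles = {p.get("id"): p.findtext("title") for p in catalog.findall("product")}

    similar = set()
    for link in ET.fromstring(links_xml).findall("link"):
        if link.get(f"{{{XLINK}}}role") != "similar-products":
            continue
        ids = set()
        for loc in link.findall("loc"):
            match = ID_RE.search(loc.get(f"{{{XLINK}}}href", ""))
            if match:
                ids.add(match.group(1))
        # The link must include the current product; the others are its peers.
        if prod_id in ids:
            similar |= ids - {prod_id}

    # Report in catalog order, like the xsl:for-each over products.xml.
    return [(pid, title) for pid, title in titles.items() if pid in similar]


if __name__ == "__main__":
    print(similar_products("23428"))
    print(similar_products("11111"))
```

With this data, product 23428 is similar to 75386 and 11111, while 11111 picks up 23428, 75386, and 99999 because it appears in both links.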


What are the implications for web browsers and search engine bots? (At least it should be easier to crawl and index than JavaScript and Flash.)
What are the SEO (linking) implications?

I hope for a good discussion while I am on holiday.

danlefree
07-03-2008, 03:40 PM
XML + XSLT already works quite well in major browsers, so the only reason to avoid creating content as you've posited would be spidering with some SE's (I somehow doubt that Yahoo has embraced this format, though I may be wrong).

I have not tested across Google, MSN, and Yahoo for regular content pages; however, I like to style my RSS feeds with XSLT (why bother having a news page *and* an RSS feed?) and I have not noted any problems there.
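
For anyone who has not tried the RSS approach: the usual way to attach a stylesheet to an XML document is an xml-stylesheet processing instruction placed ahead of the root element. Below is a minimal Python sketch that adds one to a feed using the standard library. The function name `attach_stylesheet`, the feed content, and the file name feed.xsl are hypothetical, and `text/xsl` is used on the assumption that it is the type the target browsers accept.

```python
from xml.dom import minidom

# Tiny stand-in feed; a real feed would have items, links, dates, and so on.
FEED_XML = '<rss version="2.0"><channel><title>News</title></channel></rss>'


def attach_stylesheet(feed_xml, xsl_href):
    """Return feed_xml with an xml-stylesheet instruction before the root element."""
    doc = minidom.parseString(feed_xml)
    instruction = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="text/xsl" href="{xsl_href}"'
    )
    # The instruction has to come before the document element to take effect.
    doc.insertBefore(instruction, doc.documentElement)
    return doc.toxml()


if __name__ == "__main__":
    print(attach_stylesheet(FEED_XML, "feed.xsl"))
```

A feed reader ignores the instruction and sees plain RSS, while a browser that supports XSLT applies feed.xsl and shows the styled page.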

Google Base's flexibility should demonstrate Google's willingness to embrace "the future" - though I have a hard time seeing XML + XSLT catch on across the board any time soon. Putting together a functional and valid XHTML document or XHTML-generation application without abstracting the content and presentation is hard enough for most as it is.

The benefits - simplicity, reduction of irrelevant markup, semantic correctness, metadata inclusion - of moving to pure XML are definitely worth the effort, especially when the potential for sites to adopt standards which allow search heuristics to immediately identify key data comes to fruition.

Update: I might add that everything old will be new again in the blackhat arena - I'd expect that keyword spamming, tag spamming, hidden text, and all the other old tricks will see the light of day again (though I'd expect they'll also earn penalties faster than ever).

kgun
07-03-2008, 03:52 PM
danlefree wrote: "XML + XSLT already works quite well in major browsers, so the only reason to avoid creating content as you've posited would be spidering with some SE's (I somehow doubt that Yahoo has embraced this format, though I may be wrong)."

I agree with that, but if you study the code above in more detail, you will see non-standard linking: XLink with different roles given to links. In addition, XPointer was not well implemented in the major browsers the last time I tried. Maybe Google is ahead of the other SEs and browser developers.
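
Since browsers cannot be relied on to follow such links, a consumer has to resolve them itself. Here is a minimal Python sketch, under stated assumptions, that follows the product-manufacturer arc from its "item" end to its "madeby" end. The names `resolve` and `follow_arcs` are hypothetical, and the documents are in-memory stand-ins trimmed to one entry each. Note one shortcut: XPointer's id() relies on attributes declared as type ID in a DTD or schema, and the sample files declare none, so the sketch simply treats any attribute named id as the identifier.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI copied verbatim from the post.
XLINK = "http://www.w3org/1999/xlink/namespace/"

# In-memory stand-ins for the two target documents.
DOCS = {
    "products.xml": """<catalog>
      <product id="23428"><title>ABC Microwave Oven - Model 34X</title></product>
    </catalog>""",
    "manufacturers.xml": """<mcatalog>
      <manufacturer id="ABC"><title>ABC</title><description>ABC Ltd.</description></manufacturer>
    </mcatalog>""",
}

# The product-manufacturer link from links.xml, as a standalone element.
LINK_XML = """<link xmlns:xlink="http://www.w3org/1999/xlink/namespace/"
      xlink:type="extended" xlink:role="product-manufacturer">
  <loc xlink:type="locator" xlink:label="item"
       xlink:href="products.xml#xpointer(id('23428'))"/>
  <loc xlink:type="locator" xlink:label="madeby"
       xlink:href="manufacturers.xml#xpointer(id('ABC'))"/>
  <go xlink:type="arc" xlink:from="item" xlink:to="madeby"/>
</link>"""

HREF_RE = re.compile(r"^([^#]+)#xpointer\(id\('([^']+)'\)\)$")


def resolve(href, docs=DOCS):
    """Return the element an href points at, or None if nothing matches."""
    match = HREF_RE.match(href)
    if match is None or match.group(1) not in docs:
        return None
    root = ET.fromstring(docs[match.group(1)])
    for element in root.iter():
        # Assumption: an attribute named "id" stands in for a declared ID.
        if element.get("id") == match.group(2):
            return element
    return None


def follow_arcs(link_xml):
    """Return (source element, target element) pairs for every arc in the link."""
    link = ET.fromstring(link_xml)
    by_label = {}
    for loc in link.findall("loc"):
        by_label.setdefault(loc.get(f"{{{XLINK}}}label"), []).append(
            loc.get(f"{{{XLINK}}}href"))
    pairs = []
    for arc in link.findall("go"):
        for src in by_label.get(arc.get(f"{{{XLINK}}}from"), []):
            for dst in by_label.get(arc.get(f"{{{XLINK}}}to"), []):
                pairs.append((resolve(src), resolve(dst)))
    return pairs


if __name__ == "__main__":
    for product, maker in follow_arcs(LINK_XML):
        print(product.findtext("title"), "is made by", maker.findtext("description"))
```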



I have found that Google can crawl almost any link in an XML+XSLT site, be it a link contained in the XML file itself or a link generated by the XSLT stylesheet. This seems to me pretty impressive considering how uncommon this technology is at this time.

http://www.webproworld.com/submit-your-site-review/65074-please-review-my-xml-xslt-site.html#post351678

Are you aware of this sticky?

http://www.webproworld.com/webmaster-resources-discussion-forum/64362-xml-driven-site-read-here.html#post346534



danlefree wrote: "The benefits - simplicity, reduction of irrelevant markup, semantic correctness, metadata inclusion - of moving to pure XML are definitely worth the effort, especially when the potential for sites to adopt standards which allow search heuristics to immediately identify key data comes to fruition."

And especially for:

Consistent document handling (XML Schema) in large companies and
One (XML) source and many applications (XSL(T) transformations).