Clément Renaud blog




Display articles from the Guardian in a timeline

December 9, 2011

I finally had some time to look at The Guardian API. I really wanted to use it somehow, so I decided to run a very basic test: putting all the articles related to the Chinese artist Ai Weiwei on a timeline. Here is a short tutorial on how to do it:

An API to access all the content of the Guardian

To get a first feel for the API, you can try their content explorer, which has a search interface for your favorite topics. You can combine multiple queries to narrow your search. Here I first browsed the website itself to find the name of the item dedicated to Ai Weiwei (which is a tag).

 

With that, you get a URL that returns XML containing titles, publication dates and other information from The Guardian database (all content related to Ai Weiwei). If you want to get whole articles, you will have to register for an account. The basic free version is limited to 12 calls per second and 5,000 calls per day, which is more than enough for what we are doing here.

 

Put your content in a timeline

Now let’s find a timeline service somewhere. My first preference goes to SIMILE Timeline because I have used it in many projects already. However, there are several problems with it: 1. I would have to host it; 2. The configuration of colors, sizes, etc. has to be coded in JS, which is not really user friendly; 3. I don’t want to spend time integrating this timeline into another website, as it is just a test.

Fortunately, web services have evolved since the SIMILE project started, and I recently came across Dipity, where you can host and design timelines online. It is a pretty cool website where you can make timelines from many sources, including RSS, Flickr, Twitter, etc. It even accepts XML following the SIMILE Timeline DTD, which will be my choice, so I can reuse the script if one day I need to host a timeline myself.

So we have XML provided by The Guardian API and we need to transform it into SIMILE Timeline XML. We will simply apply an XSL stylesheet and we are done. Here is the XSL file:

<pre><xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" media-type="text/xml" />

	<xsl:template match="/">
		<data>
			<xsl:apply-templates/>
		</data>
	</xsl:template>

	<xsl:template match="results/content">
		<event>
			<xsl:attribute name="start"><xsl:value-of select="@web-publication-date"/></xsl:attribute>
			<xsl:attribute name="isDuration">false</xsl:attribute>
			<xsl:attribute name="title"><xsl:value-of select="@web-title"/></xsl:attribute>
			<xsl:value-of select="@web-url"/>
		</event>
	</xsl:template>

</xsl:stylesheet></pre>

Now we need a little PHP script to trigger the transformation. The maximum number of results per page on the Guardian API is 50, so we need to add a pagination parameter to be able to retrieve all the results (134 articles related to Ai Weiwei). Here is the code:

<pre>// get the page number for XML pagination (defaults to the first page)
	$page = isset($_GET["page"]) ? max(1, (int) $_GET["page"]) : 1;
	$xml_url = "http://content.guardianapis.com/artanddesign/ai-weiwei?format=xml&page=".$page."&page-size=50&order-by=newest&api-key=";

	// create the XSLT processor
	$xp = new XSLTProcessor();

	// create a DOM document and load the XSL stylesheet
	$xsl = new DOMDocument;
	$xsl->load('guardian.xsl');

	// import the XSL stylesheet into the XSLT processor
	$xp->importStylesheet($xsl);

	// create a DOM document and load the XML data
	$xml = new DOMDocument;
	$xml->load($xml_url);

	// transform the Guardian XML into SIMILE Timeline XML using the XSL file
	if ($result = $xp->transformToXML($xml)) {
		echo $result;
	} else {
		trigger_error('XSL transformation failed.', E_USER_ERROR);
	}</pre>
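If you want to check the stylesheet before uploading anything, here is a rough sketch that runs it on a hand-written sample instead of calling the API. The sample XML is made up: its element and attribute names are only inferred from what the stylesheet matches, not copied from a real response. The function name transformWithStylesheet is hypothetical too, and the sketch assumes the PHP xsl extension is installed, just like the script above.

```php
<?php
// The stylesheet from this tutorial, embedded so the check needs no extra file.
$stylesheet = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" media-type="text/xml" />
  <xsl:template match="/">
    <data>
      <xsl:apply-templates/>
    </data>
  </xsl:template>
  <xsl:template match="results/content">
    <event>
      <xsl:attribute name="start"><xsl:value-of select="@web-publication-date"/></xsl:attribute>
      <xsl:attribute name="isDuration">false</xsl:attribute>
      <xsl:attribute name="title"><xsl:value-of select="@web-title"/></xsl:attribute>
      <xsl:value-of select="@web-url"/>
    </event>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical sample input (assumption): one <content> item inside <results>,
// carrying the three attributes the stylesheet reads.
$sampleXml = '<response><results>'
    . '<content web-publication-date="2011-12-01T10:00:00Z"'
    . ' web-title="Sample article"'
    . ' web-url="http://example.com/sample-article"/>'
    . '</results></response>';

// Apply an XSL stylesheet (given as a string) to an XML string.
function transformWithStylesheet(string $xmlString, string $xslString): string
{
    $xsl = new DOMDocument;
    $xsl->loadXML($xslString);

    $xml = new DOMDocument;
    $xml->loadXML($xmlString);

    $xp = new XSLTProcessor();
    $xp->importStylesheet($xsl);

    $result = $xp->transformToXml($xml);
    if ($result === false || $result === null) {
        throw new RuntimeException('XSL transformation failed.');
    }
    return $result;
}

// Should print a <data> element wrapping one <event> for the sample item.
echo transformWithStylesheet($sampleXml, $stylesheet), "\n";
```

If the event comes out with its start, isDuration and title attributes and the article URL as text, the stylesheet is doing its job and any remaining problem is probably on the API side.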

OK, we are done. Now just upload the PHP and XSL files somewhere, copy the URL of the PHP script, and we are ready to add our sources to our Dipity timeline.
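Since the script returns one page of 50 results at a time, covering the 134 articles means one source URL per page. Here is a small sketch of that arithmetic. The helper names pageCount and pageUrls are hypothetical, and http://example.com/guardian.php is only a placeholder for wherever you uploaded the script.

```php
<?php
// Number of pages needed to cover $total results at $pageSize results per page.
function pageCount(int $total, int $pageSize): int
{
    return (int) ceil($total / $pageSize);
}

// Hypothetical helper: build one URL of the transformation script per page.
function pageUrls(string $scriptUrl, int $total, int $pageSize = 50): array
{
    $urls = [];
    $pages = pageCount($total, $pageSize);
    for ($page = 1; $page <= $pages; $page++) {
        $urls[] = $scriptUrl . '?' . http_build_query(['page' => $page]);
    }
    return $urls;
}

// Placeholder script URL (assumption); 134 is the article count from this post.
foreach (pageUrls('http://example.com/guardian.php', 134) as $url) {
    echo $url, "\n";
}
```

With 134 articles and 50 per page this gives three URLs (page=1 to page=3), each of which can be added to the timeline as a separate source.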

 

After a couple of seconds, our timeline has been updated with the articles. We’re done! The resulting timeline is hosted on Dipity.


Creative Commons License
All content licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.