Report from the IKS Workshop in Amsterdam

  • Saturday, December 11 2010 @ 08:00 AM EST

IKS (Interactive Knowledge Stack) is an initiative to help Content Management Systems enrich their content with semantic information (think Semantic Web). The initiative, which is funded in part by the European Union, has now reached a point where a first usable version is available. The IKS Project is running an early adopters program and a series of workshops to help CMS vendors become familiar with the system and provide early feedback.

One such workshop was held on December 9 and 10 in Amsterdam, Netherlands. I (Dirk) attended this workshop, representing Geeklog, to get a better idea of what this project is all about.

What the IKS Project has produced so far is a webservice with a REST API to which a CMS can send its content, e.g. an article. The webservice then sends back some RDFa that contains the semantic information for the article. So if the article contains the word "Paris", you'd get back information stating that Paris is a place, along with a list of possible places called Paris that the article may be referring to. Since there are several places in the world called Paris, the RDFa will also include a confidence level for each of the options. The quality of the results depends on the semantic engines behind the webservice.
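To make that round trip a little more concrete, here is a minimal sketch in Python (not Geeklog's PHP). The endpoint URL, the content type, the simplified response structure, and the confidence numbers are all assumptions made up for illustration; the real service answers with RDFa, which would need proper parsing. The sketch only prepares the request and does not send it.

```python
import urllib.request

# Hypothetical endpoint; a real installation may expose a different path.
ENHANCER_URL = "http://localhost:8080/engines"


def build_request(article_text, url=ENHANCER_URL):
    """Prepare (but do not send) a POST request carrying the article text."""
    return urllib.request.Request(
        url,
        data=article_text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def best_candidate(candidates):
    """Return the candidate with the highest confidence, or None if empty."""
    return max(candidates, key=lambda c: c["confidence"], default=None)


# Made-up, simplified stand-in for the semantic information sent back.
sample_response = {
    "Paris": [
        {"label": "Paris, France", "type": "place", "confidence": 0.87},
        {"label": "Paris, Texas", "type": "place", "confidence": 0.41},
    ]
}

request = build_request("We spent a week in Paris.")
top = best_candidate(sample_response["Paris"])
```

A CMS would probably not just take the top candidate blindly; it could, for example, ignore everything below some confidence threshold or let the author pick from the list.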

The webservice used to be called FISE but has since been accepted by the Apache Software Foundation as an incubating project and will be known as Apache Stanbol from now on.

Stanbol can be thought of as middleware with the REST API on one end and a collection of semantic engines on the other end. As mentioned above, the quality of the results depends on these semantic engines. The idea is that you can select which engines you are using for your CMS or website. This allows for specialized engines that can identify, say, medical terms. Or you may drop the engine that identifies places and use one that knows about Greek sagas, where "Paris" would most likely refer to a person.
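The effect of swapping engines can be illustrated with a toy example. This is only a sketch of the idea, not how Stanbol is implemented: the `Enhancer` class, both engine functions, and their tiny word lists are hypothetical.

```python
def place_engine(text):
    """Toy engine: tags a few known place names."""
    known = {"Paris": "place", "Amsterdam": "place"}
    return [(name, kind) for name, kind in known.items() if name in text]


def mythology_engine(text):
    """Toy engine: tags a few figures from Greek sagas."""
    known = {"Paris": "person", "Helen": "person"}
    return [(name, kind) for name, kind in known.items() if name in text]


class Enhancer:
    """Runs the text through whichever engines were selected."""

    def __init__(self, engines):
        self.engines = list(engines)

    def enhance(self, text):
        annotations = []
        for engine in self.engines:
            annotations.extend(engine(text))
        return annotations


text = "Paris carried Helen off to Troy."
geo = Enhancer([place_engine]).enhance(text)
myth = Enhancer([mythology_engine]).enhance(text)
```

With only the place engine selected, "Paris" comes back as a place and "Helen" is not recognized at all; with the mythology engine, both are tagged as persons. The same text yields different semantic information depending on which engines sit behind the API.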

From a technical point of view, you would typically run your own instance of Stanbol (although it can also be shared between sites). Being written in Java, Stanbol is a bit of a heavyweight but simple enough to install. In preparation for the workshop, I hacked together a very primitive Geeklog plugin that would simply send every story to Stanbol when it is saved. The REST API makes this really easy. The trickier part is parsing and interpreting the RDFa and using it for something interesting. For my prototype I settled on making it highlight the places and persons it had identified. There will probably be PHP libraries developed as part of Stanbol, which would also offer a way for Geeklog to contribute to the project.
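The highlighting step could look roughly like the following. This is a sketch in Python, not the prototype plugin's code; it assumes the RDFa has already been parsed into a plain mapping from entity name to type, and the `highlight` function and the CSS class names are made up for illustration.

```python
import html
import re


def highlight(text, entities):
    """Wrap each identified mention in a <span> whose class is its type.

    `entities` maps a name (e.g. "Amsterdam") to a type (e.g. "place").
    """
    escaped = html.escape(text)
    if not entities:
        return escaped
    # Longest names first, so a longer name wins over a shorter one inside it.
    names = sorted(entities, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(html.escape(n)) for n in names) + r")\b"
    )

    def wrap(match):
        name = html.unescape(match.group(1))
        return '<span class="{}">{}</span>'.format(entities[name], match.group(1))

    return pattern.sub(wrap, escaped)


story = "Dirk flew to Amsterdam."
marked_up = highlight(story, {"Dirk": "person", "Amsterdam": "place"})
# marked_up is:
# <span class="person">Dirk</span> flew to <span class="place">Amsterdam</span>.
```

A stylesheet can then render `.place` and `.person` differently. Matching on plain names is crude; a real implementation would rely on the positions reported in the semantic data rather than searching the text again.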

Speaking of Geeklog: What does all this mean for Geeklog now? That's up to us, i.e. the Geeklog community, really. Is there enough interest in getting involved with what looks like a promising project to finally make the Semantic Web (proposed in 1999, after all) a reality? Can you think of use cases for adding semantic information to your Geeklog site? Please leave comments below.

To throw out just one idea: Google is already interpreting some semantic information embedded in pages (see Google Rich Snippets), so this may help with SEO, at least for some sites.


  • Report from the IKS Workshop in Amsterdam
  • Authored by: suprsidr on Saturday, December 11 2010 @ 11:33 PM EST
Wow, probably 4+ years since I read anything about RDFa - but I've always considered it a brilliant format.
Honestly anything to boost SEO is a step forward.
Is a plugin going to be hearty enough, or should it be a built-in service for any plugin to take advantage of?

-s
---
FlashYourWeb and Your Gallery with the E2 XML Media Player for Gallery2 - http://www.flashyourweb.com
  • Report from the IKS Workshop in Amsterdam
  • Authored by: Dirk on Sunday, December 12 2010 @ 07:13 AM EST

Whether this should become a plugin or core functionality really depends on the use cases we come up with. Which is why I'm reaching out to the community for ideas and feedback.

From a technical point of view, my prototype plugin simply hooks into PLG_itemSaved. Add some functions to make the parsing of the RDFa easier, and this could very well be implemented as a plugin (plus maybe some tweaks in the plugin API here and there). However, if there's a lot of interest in this or a great use case, we may want to make it a core feature ...

  • Report from the IKS Workshop in Amsterdam
  • Authored by: suprsidr on Sunday, December 12 2010 @ 09:06 AM EST
In your tests, was there any noticeable processing delay, for long or short articles?
Would be a great addition to the commerce plugins and would also make sense to make it available to staticpages - heck any plugin should have access to it.

-s
  • Report from the IKS Workshop in Amsterdam
  • Authored by: Dirk on Sunday, December 12 2010 @ 04:13 PM EST

I haven't done any performance tests. It's probably too early in the development of Stanbol for that anyway.
