Monday, June 24, 2013

Semantic Trilogy preparation

The Swedish Midsummer weekend is over and it's time to look forward. Saturday 6th to Friday 12th of July I'll attend the Semantic Trilogy in Montreal, Qc, Canada.

I plan to attend these events during the week:
In 2011 I, together with three colleagues, attended the ICBO 2011 event (see my three blog posts: Preparations part 1 and part 2, and report). So I look forward to reconnecting with people in the OBO (The Open Biological and Biomedical Ontologies) community.

I also look forward to meeting, face to face, interesting people in the W3C HCLS (Semantic Web Health Care and Life Sciences Interest Group), as well as people interested in ontologies and the semantic web who work for, e.g., Sanofi, Novo Nordisk and Mayo Clinic.

I'm also very happy that I'll get the opportunity to attend my third semantic web related event in Canada.
  • In 2007 I attended the WWW2007 conference in wonderful Banff.
"During the WWW2007 conference a breakthrough of the Linked Data idea happened in a session where web experts demonstrated the power of a new generation of the web, a web of data. For us attending the session it was hard to imagine the full potential on what this idea would mean for individual scientists and for a pharmaceutical company." 
From Linked Data, an opportunity to mitigate complexity in pharmaceutical research and development, Bo Andersson and Kerstin Forsberg, LWDM 2011

And yes, I do hope to also get some time during the weekend to visit the Jazz Festival.

Tuesday, June 11, 2013

Standards for common aspects

Over the last three years I have been engaged with different groups working on standards, both for data exchange, such as CDISC, and for vocabularies, such as MedDRA MSSO and NCI EVS. They are now starting to see the value of using "standards for standards".

Push Back
From Flickr bitpuddle

Standards for standards

So, "I push back" to standards organisations to use semantic web standards and linked data principles to make their standards directly usable for humans and for machines.

A good example is CDISC and their growing interest in using semantic web standards (based on RDF, Resource Description Framework): CDISC2RDF. For some background see Clinical studies and the road to Linked Data. Today FDA, CDISC, pharma companies, CROs and software vendors are working together on this in an FDA working group for Semantic Technology organised by PhUSE.

Standards for common aspects

Over the last year or so, I have also tried to keep up to date with groups developing RDF-based standards for common aspects such as:
  • data descriptions (VoID)
  • data provenance and versioning (PROV and PAV)
  • concept based vocabularies and value sets (SKOS)
  • multi-dimensional statistical data (RDF Data Cube)
I try to ensure that we have a good view of the maturity and applicability of these standards so we can use them in our internal “integration factory”. But most of all I want to “push back” to vendors. I foresee that, in the same way we started to add requirements on web interfaces for better end-user usability back in the late 90s, we should now start to add requirements on web interfaces for better machine usability. So we need to understand how to incorporate these common aspects in our URSs, RFIs, RFPs, etc.

I would like software vendors to use RDF-based standards for common aspects, for example:
  • MediData's Rave and Perceptive's IMPACT to describe datasets using VoID.
  • Accelrys' Pipeline Pilot to use W3C PROV.
  • Microsoft's SharePoint to use term sets for tagging in SKOS.
  • SAS Institute's Drug Development to create analysis results using RDF Data Cube.
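
As a rough illustration of what "machine usability" could mean for the first example above, here is a minimal Python sketch that emits a VoID dataset description as N-Triples. The VoID, Dublin Core and XSD terms are the published ones. The function names, the dataset URI, the title and the triple count are made up for illustration, and nothing here reflects what any of the named products actually do.

```python
# Published namespace URIs for the vocabularies used in this sketch.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOID = "http://rdfs.org/ns/void#"
DCTERMS = "http://purl.org/dc/terms/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value, datatype=None):
    """Quote a literal, escaping backslashes and double quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    quoted = '"' + escaped + '"'
    return quoted + "^^" + iri(datatype) if datatype else quoted


def describe_dataset(dataset_uri, title, triple_count, vocabularies):
    """Return N-Triples lines describing one dataset with VoID terms."""
    subject = iri(dataset_uri)
    lines = [
        " ".join([subject, iri(RDF_TYPE), iri(VOID + "Dataset"), "."]),
        " ".join([subject, iri(DCTERMS + "title"), literal(title), "."]),
        " ".join([subject, iri(VOID + "triples"),
                  literal(triple_count, XSD_INTEGER), "."]),
    ]
    # One void:vocabulary statement per vocabulary the dataset uses.
    for vocabulary in vocabularies:
        lines.append(" ".join([subject, iri(VOID + "vocabulary"),
                               iri(vocabulary), "."]))
    return lines


if __name__ == "__main__":
    # Hypothetical dataset; the URI and numbers are placeholders.
    for line in describe_dataset(
        "http://example.org/dataset/study-123",
        "Study 123 clinical data",
        1200,
        ["http://purl.org/dc/terms/"],
    ):
        print(line)
```

A requirement in a URS or RFP could then ask that a system publishes this kind of description for every dataset it exposes, so that other systems can discover what is available without manual documentation.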

So, this interview with Reza B'Far, Vice President of Development at Oracle, on the W3C blog made me very glad: Oracle on Data on the Web
Oracle to use W3C provenance standard to create a single audit time line across systems
"One of the hugest problems we faced was maintaining transaction audit trails in a heterogeneous environment in a standard and compatible way. Audit trails are described with literally millions of different formats in different organizations. This used to mean it was impossible to create a single audit time line. PROV solves this problem. We now provide (and consume) a PROV feed that unifies the audit trails generated by transactions across heterogeneous systems."
See also the Implementation report with 60+ examples of usage of the W3C Provenance specifications.
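
To make the idea of a single audit time line a bit more concrete, here is a small Python sketch. It is not Oracle's implementation. It assumes two hypothetical systems with different audit formats, maps each record to a few PROV-O terms (prov:Activity, prov:wasAssociatedWith, prov:endedAtTime) and sorts the result by time. The record layouts, field names and function names are invented for illustration.

```python
from datetime import datetime, timezone


def from_system_a(record):
    """Hypothetical system A logs ISO 8601 timestamps with an explicit offset."""
    return {
        "type": "prov:Activity",
        "label": record["action"],
        "prov:wasAssociatedWith": record["user"],
        "prov:endedAtTime": datetime.fromisoformat(record["ts"]),
    }


def from_system_b(record):
    """Hypothetical system B logs Unix epoch seconds (UTC)."""
    return {
        "type": "prov:Activity",
        "label": record["what"],
        "prov:wasAssociatedWith": record["who"],
        "prov:endedAtTime": datetime.fromtimestamp(record["when"], tz=timezone.utc),
    }


def single_timeline(a_records, b_records):
    """Normalise both formats and return one list ordered by end time."""
    activities = [from_system_a(r) for r in a_records]
    activities += [from_system_b(r) for r in b_records]
    return sorted(activities, key=lambda a: a["prov:endedAtTime"])


if __name__ == "__main__":
    # Made-up audit records from the two hypothetical systems.
    system_a = [
        {"user": "alice", "action": "lock database", "ts": "2013-06-11T09:00:00+00:00"},
        {"user": "alice", "action": "export dataset", "ts": "2013-06-11T11:00:00+00:00"},
    ]
    system_b = [
        {"who": "bob", "what": "approve analysis plan", "when": 1370944800},
    ]
    for activity in single_timeline(system_a, system_b):
        print(activity["prov:endedAtTime"].isoformat(),
              activity["prov:wasAssociatedWith"],
              activity["label"])
```

The point of the sketch is only that once every system expresses its audit events with the same vocabulary, combining them becomes a simple merge rather than a format-by-format integration project.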

For a nice intro to the W3C Provenance Specifications, see the tutorial by Paul Groth (@pgroth) at the Extended (European) Semantic Web conference.