Data Portability, Traceability and Data Ownership - Part IV
Connecting Portability to Traceability
Let’s begin this final part with a nicely presented video interview of Tim Berners-Lee, the widely acclaimed inventor of the World Wide Web, by Technology Review.
Berners-Lee has a degree in physics from The Queen’s College, Oxford. In the video he ably conveys the perspective of an academic technologist making the case for the emerging Semantic Web as, essentially, one big, connected database.
For instance, Berners-Lee discusses life sciences not once but twice during this interview in the context of making more and better semantically connected information available to doctors, emergency responders and other healthcare workers. He sees this, and rightly so, as being particularly important in fighting both (a) epidemics and pandemics, and (b) more persistent diseases like cancer and Alzheimer’s. Presumably that means access to personal health records. However, there is no mention in this interview of concerns over the ownership of information.
Here’s an excerpt from a more recent interview, conducted in March 2008 by Paul Miller of ZDNet, in which Berners-Lee does acknowledge data ownership fear factors.
Miller (03:21): “You talked a little bit about people's concerns … with loss of control or loss of credibility, or loss of visibility. Are those concerns justified or is it simply an outmoded way of looking at how you appear on the Web?”
Berners-Lee: “I think that both are true. In a way it is reasonable to worry in an organization … You own that data, you are worried that if it is exposed, people will start criticizing [you] ….
So, there are some organizations where if you do just sort of naively expose data, society doesn't work very well and you have to be careful to watch your backside. But, on the other hand, if that is the case, there is a problem. [T]he Semantic Web is about integration, it is like getting power when you use the data, it is giving people in the company the ability to do queries across the huge amounts of data the company has.
And if a company doesn't do that, then, it will be seriously disadvantaged competitively. If a company has got this feeling where people don't want other people in the company to know what is going on, then, it has already got a problem ….
Well actually, it would expose... all these inconsistencies. Well, in a way, you (sic) got the inconsistencies already, if it exposes them then actually it helps you. So, I think, it is important for the leadership in the company … to give kudos to the people who provided the data upon which a decision was made, even though they weren't the people who made the decision.” (emphasis added)
Elsewhere in this ZDNet interview, Berners-Lee announces that the core pieces for development of the Semantic Web are now in place (i.e., SPARQL, RDF, URI, XML, OWL, and GRDDL). But, again, what I find lacking is that these core pieces do not by themselves provide a mechanism for addressing data ownership issues.
I wish I could introduce Berners-Lee to Marshall Van Alstyne.
Actually, they may already know each other. Like Berners-Lee, Van Alstyne is a professor at the Massachusetts Institute of Technology. Van Alstyne is an information economist whose work in the area of data ownership I have greatly admired for some time (though I have yet to have the pleasure of making his acquaintance).
There are other noteworthy recent papers by Van Alstyne but, since I first came across it several years ago, I have continued to be enamored with the prescience of a 1994 publication he co-authored entitled, Why Not One Big Database? Ownership Principles for Database Design. Here’s my favorite quote from that paper.
“The fundamental point of this research is that ownership matters. Any group that provides data to other parts of an organization requires compensation for being the source of that data. When it is impossible to provide an explicit contract that rewards those who create and maintain data, "ownership" will be the best way to provide incentives. Otherwise, and despite the best available technology, an organization has not chosen its best incentives and the subtle intangible costs of low effort will appear as distorted, missing, or unusable data.” (emphasis added)
Whether they know each other or not, the reason I would want to see them introduced is that I don’t hear Van Alstyne’s socio-economic themes in the voice of Berners-Lee. In fact I have checked out the online biographies provided by the World Wide Web Consortium (W3C) of the very fine team that Berners-Lee, as the head of W3C, has brought together. I find no references to academic degrees or experiential backgrounds in either sociology or economics. The W3C team is heavily laden with technologists.
And, why not? After all, the mission of the W3C is one of setting standards for the technological marvel that is the World Wide Web. One must set boundaries and bring focus to any enterprise or endeavor, and Berners-Lee has reasonably done so by directing the W3C team to connect the data that society is already providing free of data ownership concerns (i.e., the information already available in massively populated government databases, academic databases, or other publicly accessible sources).
It’s just that I wish there were some cross-pollination going on between the W3C and the likes of Van Alstyne, resulting, for instance, in something like author-controlled XML (A-XML) as exemplified in Parts II and III, above (and, again, below).
That the W3C is not focusing on data ownership is an opportunity for the likes of Dataportability.org. Similarly, as mentioned in Part III, above, in the world of supply chains a likely candidate for a central ‘any product data bank’ would be EPCglobal, the non-profit supply chain consortium. But EPCglobal is a long way from focusing on the kind of data ownership proposed in this writing, or perhaps even from envisioning, as an organization, that it might want to do so.
Like EPCglobal within the ecology of supply chains, Dataportability.org has seated at its table some very powerful members of the social networking ecology (i.e., Google, Plaxo, Facebook, LinkedIn, Twitter, Flickr, SixApart and Microsoft). There is a critical mass in those members that provides an opportunity for an organization like Dataportability.org to become a neutral, central data bank for portable information among its members for the benefit of social networking subscribers.
For instance, for e-mail addresses desired by a Facebook subscriber to be portable to other social networking websites, Facebook would add tools to the subscriber's interface for seamless registration of the e-mail addresses with a central, portability database branded with Facebook's trademark (but in fact separately administered by Dataportability.org). The subscriber would merely enter the chosen e-mail addresses into his or her interface, click on the 'register' button, and automatically author the following draft XML object ...
<?xml version="1.0" encoding="UTF-8" ?>
<PortabilityDictionary_DraftElements>
<emailaddr>noname01@pardalis.com</emailaddr>
<emailaddr>noname02@pardalis.com</emailaddr>
<emailaddr>noname03@pardalis.com</emailaddr>
</PortabilityDictionary_DraftElements>
... which would come to be registered in the central portability 'bank' (again, administered by Dataportability.org) as the following XML object.
<?xml version="1.0" encoding="UTF-8" ?>
<PortabilityDictionary_RegisteredElements>
<emailaddr UniquePointer="http://www.centralportabilitybank.org/email_IDs/21263"/>
<emailaddr UniquePointer="http://www.centralportabilitybank.org/email_IDs/21264"/>
<emailaddr UniquePointer="http://www.centralportabilitybank.org/email_IDs/21265"/>
</PortabilityDictionary_RegisteredElements>
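The draft-to-registered step shown above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Dataportability.org or any member site actually works: the `PortabilityBank` class, its `register` and `resolve` methods, the in-memory storage, and the starting ID of 21263 are all hypothetical, chosen so the output mirrors the example objects above.

```python
import itertools
import xml.etree.ElementTree as ET

# Assumption: base URL copied from the registered example object above.
BASE_URL = "http://www.centralportabilitybank.org/email_IDs/"

# The draft object from the text, as bytes so the encoding declaration is honored.
DRAFT = b"""<?xml version="1.0" encoding="UTF-8" ?>
<PortabilityDictionary_DraftElements>
<emailaddr>noname01@pardalis.com</emailaddr>
<emailaddr>noname02@pardalis.com</emailaddr>
<emailaddr>noname03@pardalis.com</emailaddr>
</PortabilityDictionary_DraftElements>"""


class PortabilityBank:
    """Hypothetical, in-memory stand-in for the central portability 'bank'."""

    def __init__(self, first_id=21263):
        self._ids = itertools.count(first_id)  # assumed sequential IDs
        self._values = {}                      # pointer URL -> e-mail address

    def register(self, draft_xml):
        """Swap each draft <emailaddr> value for a unique pointer."""
        draft = ET.fromstring(draft_xml)
        registered = ET.Element("PortabilityDictionary_RegisteredElements")
        for elem in draft.findall("emailaddr"):
            pointer = f"{BASE_URL}{next(self._ids)}"
            # The value stays in the bank; only the pointer is published.
            self._values[pointer] = (elem.text or "").strip()
            ET.SubElement(registered, "emailaddr", {"UniquePointer": pointer})
        return ET.tostring(registered, encoding="unicode")

    def resolve(self, pointer):
        """Look up the value behind a pointer (no permission checks here)."""
        return self._values[pointer]


if __name__ == "__main__":
    bank = PortabilityBank()
    print(bank.register(DRAFT))
```

The point of the sketch is the separation it makes visible: the registered object carries only pointers, while the values themselves remain with the bank, which is where subscriber-set permissions would have to be enforced.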
Again, as illustrated in Part III, above, this would set the stage for a viable model for Dataportability.org, as a non-profit consortium managed by the likes of Facebook, Flickr, etc., to provide more than just portability services. Now, with a centralized registry service for A-XML objects (i.e., author-controlled, informational objects) the portability service could easily be stretched into a non-collaborative data authoring and sharing service.
IP Comment: Compare and contrast the collaborative data authoring and sharing systems illustrated by Xerox's US Patent 5,220,657, Updating local copy of shared data in a collaborative system, and eiSolutions' US Patent 6,240,414, Method of resolving data conflicts in a shared data environment.
And, again, the 'data ownership' service would presumably be branded by each of the distributed ‘bank members’ (like Facebook, Flickr, etc.) as their own service.
What might this data ownership service entail? To instill confidence in subscribers that they ‘own’ their portable data, what could be provided to members by Facebook, Flickr, etc. as part of the data ownership service made possible by the central Dataportability.org?
For instance:
- Each time an administrative action is taken by Dataportability.org affecting the registered data object - or a granular data element within a registered object - the subscriber could choose to be automatically notified with a fine-grained report.
- Each time the registered data object is shared - or data elements within the object are granularly shared - according to the permissions established by the subscriber, he or she could choose to be immediately, electronically notified with a fine-grained report.
- Online, on-demand granular information traceability reports (i.e., fine-grained reports mapping out who accesses or uses a subscriber's shared information)
- Catastrophe data back-up services
- etc.
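The notification and traceability items in the list above might look something like the following Python sketch. Everything in it is an assumption made for illustration: the `OwnershipLedger` and `AccessEvent` names, the action labels, and the idea of passing a callback for opt-in notifications are hypothetical and are not drawn from any Dataportability.org specification.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass(frozen=True)
class AccessEvent:
    pointer: str     # registered element that was touched
    accessor: str    # who touched it, e.g. a member site
    action: str      # hypothetical labels: "shared", "admin_update", ...
    when: datetime   # UTC timestamp of the event


@dataclass
class OwnershipLedger:
    """Hypothetical per-subscriber log of actions on registered elements."""

    # Set only if the subscriber has chosen to be notified.
    notify: Optional[Callable[[AccessEvent], None]] = None
    events: List[AccessEvent] = field(default_factory=list)

    def record(self, pointer: str, accessor: str, action: str) -> AccessEvent:
        """Log one event and, if opted in, notify the subscriber."""
        event = AccessEvent(pointer, accessor, action, datetime.now(timezone.utc))
        self.events.append(event)
        if self.notify is not None:
            self.notify(event)
        return event

    def trace(self, pointer: str) -> List[AccessEvent]:
        """On-demand traceability report for a single registered element."""
        return [e for e in self.events if e.pointer == pointer]


if __name__ == "__main__":
    inbox: List[AccessEvent] = []          # stands in for e-mail or SMS delivery
    ledger = OwnershipLedger(notify=inbox.append)
    ptr = "http://www.centralportabilitybank.org/email_IDs/21263"
    ledger.record(ptr, "Flickr", "shared")
    ledger.record(ptr, "Dataportability.org", "admin_update")
    for e in ledger.trace(ptr):
        print(e.when.isoformat(), e.accessor, e.action)
```

Keeping the log per element, rather than per object, is what would make the reports 'fine-grained' in the sense used above: a subscriber could ask who touched one e-mail address without wading through the history of the whole object.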
Thus could Dataportability.org light a data ownership pathway for both the W3C and EPCglobal.
Concluding Remarks
The fundamental point of this multi-entry blog is that data ownership matters. With it, the Semantic Web stands the best chance for reaching its full potential for the porting of records between and among social networking sites, and for the tracking and discovering of information along both information and product supply chains.
And holding that positive thought in mind, it’s time to end this writing with a little portability rock n’ roll. It's courtesy of Danny Ayers. Enjoy!
Of relevant interest is the blog entry, Empathic Web, posted by Zach Beauvais to Nodalities on 3 June 2008 -
This concept: empathy at a distance or a digitally-connected community, made me consider the connections in the Semantic Web. The in’s and out’s of the SemWeb have been argued, discussed, debated, and explored technologically. Many blogs and sites have huge amounts of content devoted to the definitions of SPARQL and RDF. Abstractions have been published discussing the applications of this new technology. Sir Tim Berners-Lee refers to the Semantic Web as ‘The Web done right.’
But, what is being done right? Is the Semantic Web the Web done technologically right? Is it an upgrade to the existing framework or a patch to fix what was wrong? Maybe. But it makes me wonder about looking at this from a sociological or communicative perspective. The Semantic Web, technologically, is important to humanity only so far as it’s a medium for our connections.