As enterprise supply chains and consumer demand chains have become globalized, they continue to share information inefficiently, “one-up/one-down”. Profound "bullwhip effects" in these chains leave managers scrambling with inventory shortages and consumers struggling to understand product recalls, especially food safety recalls. Add to this the increasing use of personal mobile devices by managers and consumers seeking real-time information about products, materials and ingredient sources. The popularity of mobile devices with consumers is inexorably tugging enterprise IT departments toward apps and services. But both consumer and enterprise data are proprietary assets that must be selectively shared to be efficiently shared.
There were 12 references to prior patents at issuance. These references are unremarkable for the scope and purposes of this blog.
An advanced search at USPTO online on May 15, 2008, using ref/5,181,162, reveals 334 references to this patent after its issuance. A more refined search reveals the four US patents identified above.
Abstract:
It is an object of the present invention:
to provide a novel system for creating, distributing, producing and managing various types of complex documents.
to support coordinated, multiple user access to various components of complex documents.
to maintain individual document components as discrete units that may be accessed selectively and combined by the user or by means of external programming.
to provide a general platform which may be customized to suit a variety of publishing, case management and document handling applications.
to provide an object-oriented, data-base-centered computational environment for the storage, modification, organization and retrieval of documents.
Independent claims (as numbered):
1. A document management and production system for accepting and organizing document information, the system comprising:
a. electronic data-storage means comprising a plurality of data locations;
b. means for storing in said data locations, for each document, information representative of document components that collectively specify content, organization and appearance of the document, said information including:
1) logical document components defining structural divisions and structural relationships among information-bearing constituents of the document;
2) attributes, if any, of such logical document components;
3) layout document components that define how content is physically distributed and located within the document; and
4) attributes, if any, of such layout document components;
c. database-management means for specifying ordinal and hierarchical relationships among logical document components and among layout document components; and
d. document-management means for integrating the logical and layout components into a single, organized document;
wherein at least some of the attributes associated with the logical document components contain information specifying locational preferences and positions of such components within the document, thereby facilitating mapping of logical document components that specify information-bearing constituents to layout document components to produce an integrated document.
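To give a concrete feel for what this claim describes, here is a minimal sketch in Python of logical components, layout components, and the attribute-driven mapping between them. It is my own illustration under assumed names (LogicalComponent, LayoutComponent, a "prefers" attribute), not the patent's implementation:

```python
# A minimal, hypothetical sketch of claim 1's data model: logical components
# (structural divisions), layout components (physical placement), and
# attributes on logical components expressing locational preferences.
from dataclasses import dataclass, field

@dataclass
class LogicalComponent:
    name: str                                        # e.g., "chapter", "section"
    attributes: dict = field(default_factory=dict)   # may carry locational preferences
    children: list = field(default_factory=list)     # hierarchical relationships

@dataclass
class LayoutComponent:
    name: str                                        # e.g., "page", "column", "frame"
    attributes: dict = field(default_factory=dict)

def integrate(logical_root, layout_components):
    """Map logical components onto layout components using the locational
    preferences carried in the logical components' attributes."""
    placements = []
    def walk(node):
        target = node.attributes.get("prefers")      # locational preference
        if target and target in layout_components:
            placements.append((node.name, target))
        for child in node.children:
            walk(child)
    walk(logical_root)
    return placements

# Example: a section that prefers to be placed in a two-column frame.
doc = LogicalComponent("chapter", children=[
    LogicalComponent("section", attributes={"prefers": "two-column-frame"})])
layouts = {"two-column-frame": LayoutComponent("two-column-frame")}
print(integrate(doc, layouts))  # [('section', 'two-column-frame')]
```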
The following video is provided by UChannel, a collection of public affairs lectures, panels and events from academic institutions all over the world. This video was taken at a conference held at Princeton University's Center for Information Technology Policy on January 14, 2008. The conference was sponsored by Microsoft.
What you will see is a panel and discussion format moderated by Ed Felten, Director of the CITP. The panel members are:
Timothy B. Lee, blogger at Technology Liberation Front and adjunct scholar, Cato Institute
Joel Reidenberg, Professor of Law, Fordham University
Marc Rotenberg, Executive Director, Electronic Privacy Information Center
Here is a paragraph describing the questions addressed by the panel.
"In cloud computing, a provider's data center holds information that would more traditionally have been stored on the end user's computer. How does this impact user privacy? To what extent do users "own" this data, and what obligations do the service providers have? What obligations should they have? Does moving the data to the provider's data center improve security or endanger it?"
The video, entitled "Computing in the Cloud: Possession and ownership of data", is useful and timely. And the panel is well constructed.
Tim Lee, who readily states that he is not a lawyer, very much serves as an apologist for the online companies who believe that "total, one-hundred percent online privacy would mean ... that there wouldn't be any online [sharing] services at all" (Video Time ~ 2:07).
The online services Lee briefly touches upon by way of example are the ubiquitous use of Web cookies for collecting a wide variety of information about Internet usage by online users (~5:30); Google's Gmail, which employs a business model of examining the contents of users' e-mail and tailoring the advertising presented to them (~8:05); Facebook's News Feed service, which permits users to keep track of changes to their 'friends' accounts; and Facebook's Beacon service, which sends data from external websites to Facebook accounts for the purpose of targeted advertising (~10:54).
Joel Reidenberg, a professor of law, believes that the distinction between government and the private sector loses its meaning when we think of computing in the cloud (~ 15:10), but that the prospect of cloud computing also reinforces the need for fair information practice standards (~16:00). He is of the opinion that as computing moves into the cloud it will be easier to regulate centralized gate-keepers by law and/or by technical standards (~23:50).
Marc Rotenberg, also a law professor, emphasizes that without user anonymity, and without transparency from the online companies, there will be no privacy for users in the cloud (~29:47 - 37:20). In doing so, Rotenberg challenges Tim Lee's statement that there cannot be complete user privacy if the online companies are to provide the services they do (~33:30). This makes for the most interesting exchanges of the video, from the 38:00 mark to the 44:00 mark.
There is also an interesting dialogue regarding the application of the Fourth Amendment. One of the conference attendees asked the panel why there had been no mention of the Fourth Amendment in any of their presentations. Here is the response from Reidenberg at the 53:30 mark:
"Cloud computing is threatening the vitality of the Fourth Amendment ... [because] the more we see centralization of data [by private, online companies], and the more that data is used for secondary purposes, the easier it is for the government to gain access outside the kind of restraints we put on states in the Fourth Amendment."
In other words, why should the government worry about overcoming Fourth Amendment hurdles to seizing a person's data when it can sit back and relatively easily purchase or otherwise obtain the same personal data from the big online companies? And do so even in real time? Why, indeed.
For me, the second 'take away' from this video is found in another cogent comment by Professor Reidenberg at the 88:53 mark:
"The [online] company that ... figures out ways of ... building into [its] compliance systems ... [privacy] compliance mechanisms ... will be putting itself at a tremendous competitive advantage for attracting the services to operate in [the cloud computing environment]."
The technological data ownership discussed and described in Portability, Traceability and Data Ownership - Part IV, supra, is a privacy compliance mechanism.
For those who are interested in the legalities and government policies revolving around burgeoning data ownership issues related to software as a service, the Semantic Web and Cloud Computing, and who are motivated to sit through a 90 minute presentation, here is the video clip ....
In the United States, and notwithstanding the impotence of Fourth Amendment protections against government search and seizure, it is an irony that the growing centralization of the Cloud may well render it more amenable to government regulation and lawsuit liability.
"Cloud computing opens doors for privacy enhancements [driven by regulation]. It's easier to target for regulation by law or by technical configuration [the] gatekeepers. So to the extent there is a central management, they're easier to find, they're easier to regulate, and they're easier to hold liable than distributed systems."
The following excerpt is from an article published in the print edition of The Economist on May 8, 2008.
"Item 100237099149 was a bargain. Offered for sale on eBay on May 5th, it was a CD of the “income-tax returns for 2005 of the entire Italian people”. Until it was removed after a few hours, it could have been bought for around $75.
Five days earlier Italians had learnt, to their varying dismay, amusement and fascination, that—without warning or consultation with the data-protection authority—the tax authorities had put all 38.5m tax returns for 2005 up on the internet. The site was promptly jammed by the volume of hits. Before being blacked out at the insistence of data protectors, vast amounts of data were downloaded, posted to other sites or, as eBay found, burned on to disks."
There were 12 references to prior patents at issuance, including US Patent 4,962,475 and US Patent 5,181,162, referenced above. The remaining 10 references are unremarkable for the scope and purposes of this blog.
An advanced search at USPTO online on May 9, 2008, using ref/5655130, reveals 42 references to this patent after its issuance. A more refined search reveals no patents deemed relevant for the scope and purposes of this blog.
Abstract:
The present invention relates to computer-implemented methods of document production, and in particular to a system and method for producing a variety of documents from a common document database.
Independent claims (as numbered):
1. A computer-implemented method for creating a plurality of versions of custom documents from a document source file, comprising the steps of:
defining a source file having a plurality of encapsulated data elements, wherein each encapsulated data element includes document information including text or graphics, wherein the plurality of encapsulated data elements includes a plurality of first encapsulated data elements and a plurality of second encapsulated data elements;
defining a first class, wherein the first class includes a plurality of first class variation names and wherein the plurality of first class variation names includes a first and a second variation name;
tagging the plurality of first encapsulated data elements with the first variation name in the first class;
tagging the plurality of second encapsulated data elements with the second variation name in the first class;
selecting encapsulated data elements from the plurality of encapsulated data elements wherein the step of selecting includes the step of choosing a set of variation names including the first variation name; and
filtering the source file with the set of variation names, wherein the step of filtering comprises forming a filtered source file comprising the selected encapsulated data elements.
11. A computer-implemented method of generating a version of a document from a document database, comprising the steps of:
providing a plurality of data objects, wherein each data object includes document information including text or graphics, wherein the plurality of data objects comprises a plurality of first data objects, a plurality of second data objects and a plurality of common data objects;
defining a first class having a first variation name and a second variation name;
associating the first variation name with the plurality of first data objects;
associating the second variation name with the plurality of second data objects;
selecting the data objects associated with a set of variation names, wherein the step of selecting includes the step of adding the first variation name to the set of variation names; and
forming an output document wherein the step of forming includes the step of removing the unselected second data objects.
18. A document generation system for generating a variety of documents from a common document database, comprising:
authoring means for entering a document having a plurality of data objects, wherein the plurality of data objects includes a plurality of first data objects, a plurality of second data objects and a plurality of common data objects, wherein each data object includes document information including text or graphics, wherein the authoring means includes first class assigning means for assigning a first class having a first and a second variation name to each of the first and second data objects, wherein the first class assigning means comprises means for associating the plurality of first data objects with the first variation name and means for associating the plurality of second data objects with the second variation name;
document validation means for determining that the document is in a predetermined format; and
document filtering means for removing data objects associated with the second variation name.
22. A computer implemented method of creating multiple variations of documentation, comprising the steps of:
defining a source file having a plurality of document elements, wherein the document elements include document information including text or graphics;
tagging predetermined ones of said plurality of document elements as first variation document elements;
tagging predetermined ones of said plurality of document elements as second variation document elements;
tagging predetermined ones of said plurality of document elements as common document elements;
selecting a first variation;
scanning said source file for selected document elements, wherein the step of scanning includes the step of scanning said source file for first variation document elements and for common document elements; and
generating an output document from the document information contained in the selected document elements.
24. A computer-implemented method of generating a version of a document from a document database, comprising the steps of:
providing a plurality of document section objects, wherein the plurality of document section objects includes document section objects having one or more paragraphs and document section objects having one or more illustrations;
dividing the plurality of document section objects into a plurality of first document section objects, a plurality of second document section objects and a plurality of common document section objects;
defining a first class having a first variation name and a second variation name;
associating the first variation name with the plurality of first document section objects;
associating the second variation name with the plurality of second document section objects;
selecting the document section objects associated with a set of variation names, wherein the step of selecting includes the step of adding the first variation name to the set of variation names; and
filtering the document database to form an output document comprising the common document section objects and the selected document section objects.
29. A document generation system for generating a variety of documents from a common document database, comprising:
an input/output device, wherein the input/output device includes authoring means for entering a document having a plurality of encapsulated paragraphs and means for grouping each of the plurality of encapsulated paragraphs into first, second and common encapsulated paragraphs, wherein the authoring means includes first class assigning means for assigning a first class having a first and a second variation name to each of the first and second encapsulated paragraphs, wherein the first class assigning means comprises means for associating the first encapsulated paragraphs with the first variation name and means for associating the second encapsulated paragraphs with the second variation name;
document validation means, connected to the input/output device, for determining that the document is in a predetermined format;
storage means, connected to the document validation means, for storing the document; and
document filtering means, connected to the storage means, for removing encapsulated paragraphs associated with the second variation name.
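To make the variation-name mechanism running through these claims concrete, here is a minimal sketch in Python of claim 1's tagging-and-filtering method. It is my own illustration (the DataElement type and filter_source function are assumed names), not code from the patent:

```python
# A minimal, hypothetical sketch of claim 1: encapsulated data elements are
# tagged with variation names, and the source file is filtered with a chosen
# set of variation names to form one version of the document.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataElement:
    content: str                 # document information: text or graphics
    variation: Optional[str]     # variation name, or None for common elements

def filter_source(source, selected_variations):
    """Form a filtered source file containing the common elements plus the
    elements tagged with any selected variation name."""
    return [e for e in source
            if e.variation is None or e.variation in selected_variations]

# Example: one source file yields two custom document versions.
source = [
    DataElement("Shared introduction", None),
    DataElement("Model A wiring diagram", "model-a"),
    DataElement("Model B wiring diagram", "model-b"),
]
print([e.content for e in filter_source(source, {"model-a"})])
# ['Shared introduction', 'Model A wiring diagram']
```

The same filtering step, run with {"model-b"} instead, yields the second document version from the same source file.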
Let’s begin this final part with a nicely presented Technology Review video interview of Tim Berners-Lee, the widely acclaimed inventor of the World Wide Web.
Berners-Lee has a degree in physics from The Queen’s College, Oxford. In the video he expresses the insight of an academic technologist preaching the benefits of the emerging Semantic Web as, essentially, one big, connected database.
For instance, Berners-Lee discusses life sciences not once but twice during this interview, in the context of making more and better semantically connected information available to doctors, emergency responders and other healthcare workers. He sees this, and rightly so, as being particularly important in fighting both (a) epidemics and pandemics, and (b) more persistent diseases like cancer and Alzheimer’s. Presumably that means access to personal health records. However, there is no mention in this interview of concerns over the ownership of information.
Here’s an excerpt from a more recent interview, conducted in March 2008 by Paul Miller of ZDNet, in which Berners-Lee does acknowledge data ownership fear factors.
Miller (03:21): “You talked a little bit about people's concerns … with loss of control or loss of credibility, or loss of visibility. Are those concerns justified or is it simply an outmoded way of looking at how you appear on the Web?”
Berners-Lee: “I think that both are true. In a way it is reasonable to worry in an organization … You own that data, you are worried that if it is exposed, people will start criticizing [you] ….
So, there are some organizations where if you do just sort of naively expose data, society doesn't work very well and you have to be careful to watch your backside. But, on the other hand, if that is the case, there is a problem. [T]he Semantic Web is about integration, it is like getting power when you use the data, it is giving people in the company the ability to do queries across the huge amounts of data the company has.
And if a company doesn't do that, then, it will be seriously disadvantaged competitively. If a company has got this feeling where people don't want other people in the company to know what is going on, then, it has already got a problem ….
Well actually, it would expose... all these inconsistencies. Well, in a way, you (sic) got the inconsistencies already, if it exposes them then actually it helps you. So, I think, it is important for the leadership in the company … to give kudos to the people who provided the data upon which a decision was made, even though they weren't the people who made the decision.” (emphasis added)
Elsewhere in this ZDNet interview, Berners-Lee announces that the core pieces for development of the Semantic Web are now in place (i.e., SPARQL, RDF, URI, XML, OWL, and GRDDL). But, again, these core pieces do not by themselves provide a mechanism for addressing data ownership issues, and that is what I find lacking.
If I could, I would introduce Berners-Lee to Marshall Van Alstyne. Actually, they may already know each other: like Berners-Lee, Van Alstyne is a professor at the Massachusetts Institute of Technology. Van Alstyne is an information economist whose work in the area of data ownership I have greatly admired for some time (though I have not yet had the pleasure of making his acquaintance).
There are other noteworthy recent papers by Van Alstyne but, since I first came across it several years ago, I have been enamored with the prescience of a 1994 publication he co-authored, entitled Why Not One Big Database? Ownership Principles for Database Design. Here’s my favorite quote from that paper.
“The fundamental point of this research is that ownership matters. Any group that provides data to other parts of an organization requires compensation for being the source of that data. When it is impossible to provide an explicit contract that rewards those who create and maintain data, "ownership" will be the best way to provide incentives. Otherwise, and despite the best available technology, an organization has not chosen its best incentives and the subtle intangible costs of low effort will appear as distorted, missing, or unusable data.” (emphasis added)
Whether they know each other or not, the reason I would want to see them introduced is that I don’t hear Van Alstyne’s socio-economic themes in the voice of Berners-Lee. In fact I have checked out the online biographies provided by the World Wide Web Consortium (W3C) of the very fine team that Berners-Lee, as the head of W3C, has brought together. I find no references to academic degrees or experiential backgrounds in either sociology or economics. The W3C team is heavily laden with technologists.
And, why not? After all, the mission of the W3C is one of setting standards for the technological marvel that is the World Wide Web. One must set boundaries and bring focus to any enterprise or endeavor, and Berners-Lee has reasonably done so by directing the W3C team to connect the data that society is already providing free of data ownership concerns (i.e., the information already available in massively populated government databases, academic databases, and other publicly accessible sources).
It’s just that I wish there were some cross-pollination going on between the W3C and the likes of Van Alstyne, resulting, for instance, in something like the author-controlled XML (A-XML) exampled in Parts II and III, above (and, again, below).
That the W3C is not focusing on data ownership is an opportunity for the likes of Dataportability.org. Similarly, as mentioned in Part III, above, in the world of supply chains a likely candidate for a central ‘any product data bank’ would be EPCglobal, the non-profit supply chain consortium. But EPCglobal is a long way from focusing on the kind of data ownership proposed in this writing, or perhaps even from envisioning, as an organization, that it might want to do so.
Like EPCglobal within the ecology of supply chains, Dataportability.org has seated at its table some very powerful members of the social networking ecology (i.e., Google, Plaxo, Facebook, LinkedIn, Twitter, Flickr, SixApart and Microsoft). There is a critical mass in those members that provides an opportunity for an organization like Dataportability.org to become a neutral, central data bank for portable information among its members for the benefit of social networking subscribers.
For instance, for e-mail addresses desired by a Facebook subscriber to be portable to other social networking websites, Facebook would add tools to the subscriber's interface for seamless registration of the e-mail addresses with a central, portability database branded with Facebook's trademark (but in fact separately administered by Dataportability.org). The subscriber would merely enter the chosen e-mail addresses into his or her interface, click on the 'register' button, and automatically author the following draft XML object ...
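As a rough sketch, with wholly hypothetical element and attribute names (AXMLObject, EmailAddresses, the permission attribute are my own assumptions, not anything specified by Dataportability.org or Facebook), such a draft object might look something like this, generated here with Python's standard library:

```python
# A rough, hypothetical sketch of the kind of author-controlled XML (A-XML)
# object a subscriber might register; element and attribute names are my own
# assumptions, not Dataportability.org's or Facebook's.
import xml.etree.ElementTree as ET

obj = ET.Element("AXMLObject",
                 author="facebook:subscriber-123",    # hypothetical author ID
                 registry="dataportability.org")      # hypothetical registry
addresses = ET.SubElement(obj, "EmailAddresses")
for addr in ("alice@example.com", "alice@work.example.org"):
    element = ET.SubElement(addresses, "EmailAddress",
                            permission="share-on-request")  # author-set permission
    element.text = addr
print(ET.tostring(obj, encoding="unicode"))
```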
Again, as illustrated in Part III, above, this would set the stage for a viable model for Dataportability.org, as a non-profit consortium managed by the likes of Facebook, Flickr, etc., to provide more than just portability services. Now, with a centralized registry service for A-XML objects (i.e., author-controlled, informational objects) the portability service could easily be stretched into a non-collaborative data authoring and sharing service.
And, again, the 'data ownership' service would presumably be branded by each of the distributed ‘bank members’ (like Facebook, Flickr, etc.) as their own service.
What might this data ownership service entail? To instill confidence in subscribers that they ‘own’ their portable data, what could be provided to members by Facebook, Flickr, etc. as part of the data ownership service made possible by the central Dataportability.org?
For instance:
Each time an administrative action is taken by Dataportability.org affecting the registered data object - or a granular data element within a registered object - the subscriber could choose to be automatically notified with a fine-grained report.
Each time the registered data object is shared - or data elements within the object are granularly shared - according to the permissions established by the subscriber, he or she could choose to be immediately, electronically notified with a fine-grained report.
Online, on-demand granular information traceability reports (i.e., fine-grained reports mapping out who accesses or uses a subscriber's shared information; see the sketch following this list)
Catastrophe data back-up services
etc.
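As a sketch of the fine-grained reporting behind these services (the AccessEvent fields and traceability_report function are my own assumptions, not anything specified by Dataportability.org):

```python
# A hypothetical sketch of the granular access events that could back the
# notification and traceability reports listed above.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class AccessEvent:
    object_id: str      # the registered A-XML object
    element: str        # the granular data element touched
    actor: str          # who accessed or shared it
    action: str         # e.g., "read", "share", "admin-change"
    timestamp: str

def traceability_report(events, object_id):
    """Return the access history for one registered object, suitable for an
    on-demand, fine-grained report to the owning subscriber."""
    return [asdict(e) for e in events if e.object_id == object_id]

events = [AccessEvent("axml-42", "EmailAddress[1]", "flickr.com",
                      "read", datetime.now(timezone.utc).isoformat())]
for row in traceability_report(events, "axml-42"):
    print(row)
```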
Thus could Dataportability.org light a data ownership pathway for both the W3C and EPCglobal.
Concluding Remarks
The fundamental point of this multi-entry blog is that data ownership matters. With it, the Semantic Web stands the best chance of reaching its full potential for the porting of records between and among social networking sites, and for the tracking and discovery of information along both information and product supply chains.
And holding that positive thought in mind, it’s time to end this writing with a little portability rock n’ roll. It's courtesy of Danny Ayers. Enjoy!
Of relevant interest is the blog entry, Empathic Web, posted by Zach Beauvais to Nodalities on 3 June 2008:
This concept: empathy at a distance or a digitally-connected community, made me consider the connections in the Semantic Web. The in’s and out’s of the SemWeb have been argued, discussed, debated, and explored technologically. Many blogs and sites have huge amounts of content devoted to the definitions of SPARQL and RDF. Abstractions have been published discussing the applications of this new technology. Sir Tim Berners-Lee refers to the Semantic Web as ‘The Web done right.’
But, what is being done right? Is the Semantic Web the Web done technologically right? Is it an upgrade to the existing framework or a patch to fix what was wrong? Maybe. But it makes me wonder about looking at this from a sociological or communicative perspective. The Semantic Web, technologically, is important to humanity only so far as it’s a medium for our connections.