The explosive demand for global data access and integrated business solutions
The following quoted text (indented at the bullet) is taken from the statement of prior art in US Patent 6,988,109, entitled "System, method, software architecture, and business model for an intelligent object based information technology platform," filed in 2001 by IO Informatics, Inc.
- As demand for Information Technology (IT) software and hardware to provide global data access and integrated business solutions has exploded, significant challenges have become evident. A central problem is the access, integration, and utilization of the large amounts of new and valuable information generated in each of the major industries. The lack of unified, global, real-time data access and analysis is detrimental to crucial business processes, including new product discovery, product development, decision-making, product testing and validation, and product time-to-market.
With the completion of the sequence of the human genome and the continued effort in understanding protein expression in the life sciences, a wealth of new genes is being discovered that will have potential as targets for therapeutic intervention. As a result of this new information, however, biotech and pharmaceutical companies are drowning in a flood of data. In the life sciences alone, approximately 1 terabyte of data is generated per company per day, the vast majority of which currently goes unutilized, for several reasons.
First, data are contained in diversified system environments using different formats and heterogeneous databases, and have been analyzed using different applications, each of which may apply different processing to those data. Competitive software, based on proprietary platforms for network and applications analysis, has utilized data platform technologies such as SQL with Open Database Connectivity (ODBC), Component Object Model (COM), Object Linking and Embedding (OLE), and/or proprietary applications for analysis, as evidenced in patents for data management and analysis from such companies as Sybase, Kodak, IBM, and Cellomics in U.S. Pat. Nos. 6,161,148, 6,132,969, 5,989,835, and 5,784,294, each of which is hereby incorporated by reference. Because of this diversity, there is a significant lack of data handling methods that can utilize these data in a secure, manageable way. The shortcomings of these technologies are evident within heterogeneous software and hardware environments with global data resources. Despite the fact that the seamless integration of public, legacy, and new data is crucial to efficient research (particularly in the life sciences), product discovery (such as, for example, drug or treatment regime discovery), and distribution, current data mining tools cannot handle or validate all of these diverse data simultaneously.
Second, with the expansion of large volumes of dense data in a global environment, user queries often require costly massively parallel or other supercomputer-oriented processing in the form of mainframe computers and/or cluster servers, with various types of network integration software (e.g., Java, CORBA, "wrapping", XML) pieced together for translation and access functionality, as evidenced by such companies as NetGenics, IBM, and ChannelPoint in U.S. Pat. Nos. 6,125,383, 6,078,924, 6,141,660, and 6,148,298, each of which is herein incorporated by reference, and networked supercomputing hardware, as evidenced by such companies as IBM, Compaq, and others in patents such as U.S. Pat. Nos. 6,041,398 and 5,842,031, each of which is hereby incorporated by reference. Even with these expensive software and hardware infrastructures, significant time delays in result generation remain the norm.
Third, due in part to the flood of data and for other reasons as well, there is significant redundancy within the data, making queries more time-consuming and their results less efficient.
Fourth, an additional consideration that is prohibitive to change toward a more homogeneous infrastructure is cost. Bringing legacy systems up to date, retooling a company's Intranet-based software systems, carrying out analysis with existing tools, or even adding new applications can be very expensive. Conventional practices require retooling and/or translating at the application and hardware layers, as evidenced by such companies as Unisys and IBM in U.S. Pat. Nos. 6,038,393 and 5,634,015.
Because of the constraints outlined above, it is nearly impossible to extract useful, relevant information from the entirety of the data within a reasonable amount of computing time and effort. For this reason, the development of an architecture to overcome these obstacles is needed.
These are not the only limitations. With the advent of distinct specializations in the fields of genomics, proteomics, and bioinformatics, and the need for informed decision-making in the life sciences, the state of object data is crucial to their overall validation and weight in complex, multi-disciplinary queries. This is even more important due to the inter-dependencies of a variety of data at different states. Furthermore, because biological data describe a "snapshot" of complex processes at a defined state of the organism, data obtained at any time refer to this unique phase of metabolism. Thus, to allow meaningful comparison, only data in similar states can be utilized. Therefore, there is a growing need for an object data state processing engine that allows the data state to be continuously monitored, governed, validated, and updated based on any activities of intelligent molecular objects in real time.
Data translation processes between different data types are time-consuming and require the provision of information on data structure and dependencies, in spite of advances in information technology. These processes, although available and used, have a number of shortcomings. Data contained in diversified system environments may use different formats, heterogeneous databases, and different applications, each of which may apply different processing to those data. Because of that, despite the fact that the seamless integration of public, legacy, and new data is crucial to efficient drug discovery and life science research, several different applications and/or components have to be designed in order to translate each of those data sets correctly. These require significant effort and resources in both software development and data processing. With the advent of distinct specializations in the fields of genomics, proteomics, and bioinformatics, and the need for informed decision-making in the life sciences, access to all data is crucial for overall validation and weight in complex, multi-disciplinary queries. This is even more important due to the inter-dependencies of a variety of data at different states. The current individual data translation approach does not support these needs. Because biological data describe a "snapshot" of complex processes at a defined state of the organism, data obtained at any time refer to this unique phase of metabolism. Thus, to allow meaningful comparison, only data in similar states can be utilized. The latter requires real-time processing and automated, instant translation of data from different sources. Therefore, there is a growing need for an object data translation engine that allows for the bi-directional translation of multidimensional data from various sources into intelligent molecular objects in real time.
The flood of new and legacy data results in significant redundancy within the data, making queries more time-consuming and their results less efficient. There is a lack of defined sets of user interaction and environment definition protocols, which are needed to provide the means for intelligent data mining and for optimization in result validation toward real solutions and answers. An additional consideration that is prohibitive to change toward a more homogeneous infrastructure is the absence of object representation definition protocols to prepare and present data objects for interaction within heterogeneous environments. Lastly, data are currently accessed and presented through diverse user interfaces with dedicated, unique features and protocols, preventing universal, unified user access. Thus, a homogeneous, unified presentation, such as a web-enabled graphical user interface that integrates components from diverse applications and laboratory systems environments, is highly desirable but currently non-existent for objects in real time.
Because of these constraints, it is nearly impossible to extract useful, relevant information from the entirety of the data within a reasonable amount of computing time and effort. For this reason, the development of an architecture and unifying user interface to overcome these obstacles is needed.
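The patent's recurring premise, that biological data are "snapshots" taken at a defined state of the organism and that only data in similar states can be meaningfully compared, is easy to illustrate. Below is a minimal Python sketch of that state constraint. It is my own illustration, not anything from the patent; the record fields and state labels are hypothetical.

```python
# Sketch of the "data state" constraint: records carry a state tag, and
# comparisons are only ever made between records in the same state.
from collections import defaultdict
from itertools import combinations
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]

def group_by_state(records: Iterable[Record]) -> Dict[str, List[Record]]:
    """Bucket records by their state tag so queries never mix states."""
    buckets: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        buckets[rec["state"]].append(rec)
    return buckets

def comparable_pairs(records: Iterable[Record]) -> List[Tuple[Record, Record]]:
    """Return only the pairs of records whose states match."""
    pairs: List[Tuple[Record, Record]] = []
    for bucket in group_by_state(records).values():
        pairs.extend(combinations(bucket, 2))
    return pairs

# Hypothetical measurements: two taken in a "fasted" state, one "fed".
samples = [
    {"id": 1, "state": "fasted", "glucose": 4.8},
    {"id": 2, "state": "fed",    "glucose": 7.1},
    {"id": 3, "state": "fasted", "glucose": 5.0},
]
# Only records 1 and 3 share a state, so only they are compared.
print(comparable_pairs(samples))
```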
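Likewise, the "object data translation engine" the patent calls for can be read, at its core, as a registry of bi-directional adapters between source-specific formats and a common object form. Here is a hedged sketch under that reading; all names (MolecularObject, TranslationEngine, the "csv-row" adapter) are hypothetical and of my own choosing.

```python
# Sketch of a bi-directional translation engine: each source format
# registers a pair of converters to and from a common object form.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass
class MolecularObject:
    """Common object form: payload plus provenance and state metadata."""
    source: str                      # where the record came from
    state: str                       # e.g. experimental state / phase
    data: Dict[str, Any] = field(default_factory=dict)

class TranslationEngine:
    """Registry of bi-directional format adapters."""
    def __init__(self) -> None:
        self._to_obj: Dict[str, Callable[[Any], MolecularObject]] = {}
        self._from_obj: Dict[str, Callable[[MolecularObject], Any]] = {}

    def register(self, fmt: str,
                 to_obj: Callable[[Any], MolecularObject],
                 from_obj: Callable[[MolecularObject], Any]) -> None:
        self._to_obj[fmt] = to_obj
        self._from_obj[fmt] = from_obj

    def translate_in(self, fmt: str, record: Any) -> MolecularObject:
        return self._to_obj[fmt](record)     # source format -> object

    def translate_out(self, fmt: str, obj: MolecularObject) -> Any:
        return self._from_obj[fmt](obj)      # object -> source format

# Usage: a toy CSV-row adapter. A real adapter would carry schema details.
engine = TranslationEngine()
engine.register(
    "csv-row",
    to_obj=lambda row: MolecularObject(source="csv", state=row[0],
                                       data={"gene": row[1], "value": row[2]}),
    from_obj=lambda o: [o.state, o.data["gene"], o.data["value"]],
)
obj = engine.translate_in("csv-row", ["resting", "TP53", "0.42"])
print(engine.translate_out("csv-row", obj))  # round-trips the record
```

A real engine of this kind would add schema discovery, validation, and the state metadata discussed above, but the register / translate-in / translate-out shape is the essence of bi-directional translation between heterogeneous sources and a common object form.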
I am filing this entry in the Reference Library of this blog.