Data Integrity Will Enable Big Data Analytics
We continue to see real-world, mostly bad-news scenarios in which it is easy
to conclude that the information needed to avoid much of the bad news was
“available”, but the data was not processed and managed in time to produce an
actionable result. Partly as a result, analytics at cloud scale that inform
truly valuable assessments or suggest key courses of action, especially
high-performance analytics using in-memory approaches, are deservedly seeing
significant investment in both resources and brain power.
Analytics require access to data. But how do we convince data owners that the
benefit of pointing their enterprise telemetry to an analytics engine outweighs
the risk that their content will be compromised or corrupted while in the
engine? Data owners must be enticed to make the concept of data availability an easily checked box, rather than a painful decision point. For many, the big-data value
proposition at the macro level brings with it a natural uncertainty about
control over data at the tactical level, which ultimately relates to whether
the analytics product is truly actionable for that customer.
Enterprise owners are more likely to embrace analytics –
especially outsourced analytics – if they are not required to trust the
analytics provider, and especially the provider’s insiders. Trust
as a prerequisite to data sharing will not scale. But if the provider can offer transparency
and mutual auditability of the content feeding their analytics solution(s),
taking the question of trust mostly off the table, the result will be increased
attractiveness to potential customers. The analytic services can be assessed on their
own merits, without the distraction of uncertain integrity as a risk factor on
the ROI. And as a byproduct, the
analytic services are themselves improved, as they’ll be fed by a larger body
of content - further improving the ROI for all.
Keyless Signature Infrastructure (KSI) offers the scalable
data integrity solution needed to enable data availability to these engines,
and thus a sustainable large-scale analytics business model. KSI, when used as a complement to big-data
analytics and related services, offers the proof of integrity that the
analytics alone cannot.
Just as it does for a multi-tenant object store, KSI enables the
multi-tenant analytics approach. Many
potential users of analytics face problems with common denominators, and in
those cases, analytics which draw on content from multiple contributors will
add value for all with similar challenges. Even competitors in a market may decide to
contribute their content to a common analytics engine working shared problems,
but only if they have evidence-quality proof of who is and is not touching
their data, along with complete assurance that their content is intact. In addition, they must be confident that the
assessment returned from the analytics was based on authentic information - even
if the complete information set itself is not exposed in the assessment (e.g. because
some of it came from a different tenant).
This is again fundamental to the question of whether the product is
actionable. Use of KSI offers this potential,
and will again incentivize data owners to make their content “available” to the
analytics.
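
As an illustration of how a tenant could check that its own content is intact without seeing anyone else's, the following is a minimal sketch of a generic hash-tree (Merkle tree) inclusion proof. It is not KSI's actual data structure, signature format, or API; the function names (build_levels, inclusion_proof, verify) and the sample records are hypothetical, and a production scheme would add details this sketch omits, such as distinguishing leaf hashes from interior hashes.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).digest()


def build_levels(records):
    """Return every level of the hash tree, from leaf hashes up to the root."""
    level = [h(r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # odd level: pair the last node with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # this node was paired with itself
        sibling_is_right = index % 2 == 0
        proof.append((level[sibling], sibling_is_right))
        index //= 2
    return proof


def verify(record, proof, root):
    """Recompute the root from one record and its proof; compare to the published root."""
    node = h(record)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


# Hypothetical records contributed by three different tenants.
records = [b"tenant-A telemetry", b"tenant-B telemetry", b"tenant-C telemetry"]
levels = build_levels(records)
root = levels[-1][0]

# Tenant C checks its own record using only sibling hashes, never other tenants' data.
proof_c = inclusion_proof(levels, 2)
assert verify(records[2], proof_c, root)
assert not verify(b"tampered telemetry", proof_c, root)
```

The point of the sketch is the shape of the check: the proof consists only of hashes, so a contributor can confirm that its content fed into the published root while the other contributors' content stays unexposed.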
Going further, KSI can also enable a future in which the
customer doesn’t need to choose among the wide and growing array of analytics
providers, and can instead leverage analytics “brokers”, who maintain
current situational awareness on the strengths of multiple engines, and can
tailor the service delivery over time to what best fits the customer’s needs. Again, assurance of data integrity is a
critical enabler of data availability to feed such a model, as the customer is
now agreeing to let their content reach multiple third-party engines. KSI
is critical in that scenario, as it enables unique and actionable
information to reach that customer, who can now achieve a breakthrough or avoid
a disaster.