
The top 7 emerging data trends for 2014


By Mervyn Mooi, director at Knowledge Integration Dynamics (KID).
Johannesburg, 16 Jan 2014

1. Self-service BI and visual discovery

These are not new trends; the two have been working together for some time to improve the rate at which non-analytics business people can gain access to corporate information, improving their response times and giving them more pertinent information on which to base their business decisions. People such as managers, sales people and field technicians can access information in the form of dashboards, charts, graphs and other graphical representations.

An additional development likely to emerge in 2014 is, as TDWI suggests, the combination of those capabilities with process improvement. Business people who are not analytics experts will be able to automate tasks for which they are responsible to a far greater degree, says Mervyn Mooi, director at Knowledge Integration Dynamics (KID).
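As a rough illustration only, the short Python sketch below aggregates a small, invented sales extract and renders it as a chart of the kind a self-service dashboard would expose; the column names and figures are made up for the example.

```python
# A minimal self-service-style chart: aggregate a small, hypothetical
# sales extract by region and plot it, with no analytics expertise needed.
import pandas as pd
import matplotlib.pyplot as plt

# Invented sample data standing in for an extract a business user might pull.
sales = pd.DataFrame({
    "region": ["Gauteng", "Western Cape", "KwaZulu-Natal", "Gauteng", "Western Cape"],
    "revenue": [120_000, 95_000, 78_000, 134_000, 88_000],
})

# Aggregate and visualise: the kind of one-click summary a dashboard exposes.
summary = sales.groupby("region")["revenue"].sum().sort_values()
summary.plot(kind="barh", title="Revenue by region")
plt.xlabel("Revenue (ZAR)")
plt.tight_layout()
plt.show()
```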

2. The speed of business

As many organisations shift their focus from being product-centric to customer-centric, the ability to respond to customer needs quickly becomes more of a competitive issue. The only way to do that effectively is to move data from the point of capture to consumption and meaningful insight, as well as regurgitation for customer service, as quickly as possible.

Again, as TDWI notes: "The focus has been on delivery of information for daily, real-time decisions by humans and automated systems; deeper real-time analytics are next. In 2014, we will see advances in use of the above technologies plus big data solutions such as Apache Spark for running real-time, interactive queries against Hadoop systems."
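As a minimal sketch of that idea, and assuming a working Spark installation, the Python snippet below runs an interactive SQL query over data held in Hadoop; the HDFS path and column names are hypothetical.

```python
# A minimal PySpark sketch: run an interactive SQL query over data stored in
# Hadoop (HDFS). The HDFS path and column names below are illustrative only.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("interactive-query-sketch").getOrCreate()

# Load a Parquet dataset from HDFS into a DataFrame (path is hypothetical).
events = spark.read.parquet("hdfs:///data/events/2014/")
events.createOrReplaceTempView("events")

# An ad hoc, interactive aggregation of the kind described above.
recent = spark.sql("""
    SELECT customer_id, COUNT(*) AS interactions
    FROM events
    WHERE event_date >= '2014-01-01'
    GROUP BY customer_id
    ORDER BY interactions DESC
    LIMIT 20
""")
recent.show()

spark.stop()
```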

3. A greater variety of data warehousing and integration

Data federation and integration will be a key touchstone of 2014 as companies need to access a greater variety of data from a greater variety of sources, and seek to integrate it into existing systems as well as drive operational process efficiencies. Many data-driven businesses, such as the great variety of financial services organisations, rely on vast quantities of paperwork and other data sources. Automated integration of those sources into physical and virtual repositories has been shown to dramatically improve information flow efficiency and speed. While those types of projects have been under way for some time in South Africa, with varying degrees of success, newer technologies will most likely be brought to bear on accessing that information more quickly and feeding it to the coalface, where it can be used most effectively.
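As a rough, simplified sketch of that kind of integration, the Python snippet below joins a flat-file extract with rows drawn from a relational source into a single combined view; the file, table and column names are invented for illustration.

```python
# A federation-style sketch: join a flat-file extract with rows pulled from a
# relational source into one combined, in-memory view, without copying either
# source into a permanent warehouse. File, table and column names are invented.
import sqlite3
import pandas as pd

# Source 1: a paperwork-derived CSV extract (hypothetical file and columns).
claims = pd.read_csv("claims_extract.csv")   # columns: claim_id, customer_id, amount

# Source 2: an operational database (a local SQLite file stands in here).
with sqlite3.connect("crm.db") as conn:
    customers = pd.read_sql_query(
        "SELECT customer_id, customer_name, segment FROM customers", conn
    )

# Virtual integration: one joined view served to downstream users.
combined = claims.merge(customers, on="customer_id", how="left")
print(combined.head())
```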

4. Data and information governance

The purpose of information governance is to ensure that the regulations and rules governing the collection, storage, use and destruction of data are adhered to. This cannot be accomplished without maintaining control over the environment: knowing who accesses what data and information, how they access it, whether or not it is secure, which standards apply, and where the data and information are to be found and in what form. While governance or control processes can be wrapped around almost any data and information resource to make sure it is properly tracked, managed and reported on, doing so requires an investment. Investing in an inefficient environment, where there is no framework, business intent or directive, increases costs and erodes benefit, much to everyone's detriment.
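As a minimal illustration of the tracking such control implies, the Python sketch below appends one structured audit record per data access; the field names are assumptions, not a prescribed standard.

```python
# A minimal access-audit sketch: record who touched which data asset, when and
# how, so governance reporting has something to track. Fields are illustrative.
import json
from datetime import datetime, timezone

def audit_access(user: str, asset: str, action: str,
                 log_path: str = "access_audit.log") -> None:
    """Append one structured audit record per data access."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "asset": asset,          # e.g. a table, report or file identifier
        "action": action,        # e.g. "read", "export", "delete"
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")

# Example: log a read of a customer master table by a named user.
audit_access("j.smith", "crm.customer_master", "read")
```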

The data governance framework, then, will be a paramount issue for South African organisations in 2014. Many businesses have yet to implement data governance practices. In 2013, companies sought to establish policies, and some went so far as to deploy tools to enable data governance, but many simply ticked the box: they investigated the situation and designed policies, yet never actually deployed them into daily operations. That will become a pressing issue in 2014.

5. Life cycle management of data and information assets

Once more, the data governance framework will take centre stage in 2014 as companies seek to redress the often chaotic quagmire that is the life cycle management of their data and information assets. The information governance framework incorporates policies, principles, standards and guidelines for information life cycle management (ILM), the data management life cycle (DMLC), enterprise information management (EIM), and information and data management. The latter includes enterprise information architecture (EIA), master data management (MDM), data quality management (DQM), records and content management, metadata management, information security and privacy management, data warehousing and business intelligence (BI), and information and data integration. It also includes storage and process optimisation and e-discovery.

6. Unstructured big data management (content and electronic records management)

Metadata does not fit into the business data and information content itself, but rather into the models, definitions, programs, scripting and specifications of all ICT artefacts and resources. It sits a layer above those artefacts and resources, which is why metadata is the universe in which all data, models, information and process objects exist, and which in turn includes big data and data warehouses. That is why it is in metadata that the flexibility required by the new usage models of big data is realised.

The solution to realising the benefits of big data does not reside solely in the employment of big data technologies and systems, or even other technologies and tools, but rather in the architecture of the data domain that relies on reliable, consistent and available metadata to drive flexibility within the confines of good management practices and processes.
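As a small, illustrative sketch of metadata sitting a layer above the data itself, the Python snippet below defines a catalogue entry that describes a data asset without containing any of its business content; the fields and values are invented.

```python
# A sketch of metadata "a layer above" the data: a catalogue entry that
# describes an asset (schema, owner, retention, location) without holding
# any of its business content. Field names and values are illustrative.
from dataclasses import dataclass, field

@dataclass
class DatasetMetadata:
    name: str
    location: str                  # physical or virtual source, e.g. an HDFS path
    owner: str
    retention_days: int
    columns: dict = field(default_factory=dict)   # column name -> type/definition

# The same kind of entry can describe a warehouse table or a big data file.
clickstream = DatasetMetadata(
    name="web_clickstream",
    location="hdfs:///raw/clickstream/",
    owner="Digital Channels",
    retention_days=365,
    columns={"session_id": "string", "url": "string", "event_time": "timestamp"},
)
print(clickstream)
```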

7. Developments in advancing virtualisation

One example is larger buffering or persistence-of-data capability, which will depend on advances in network and server/PC speeds, bandwidth and throughput. Data virtualisation developments are crucial because many organisations today face a growing number of data assets and increasingly fragmented data sources. Physical consolidation often implies higher licensing costs, slower response to change, higher data delivery latencies, a greater maintenance requirement and generally more employees to administer it. Technical features such as intelligent caching improve performance, drive latencies down, reduce the impact on sources and reduce source replication. Those become highly important goals when dealing with increasingly disparate, heterogeneous and dispersed data sources.
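As a toy illustration of the caching idea, and not of any particular virtualisation product, the Python sketch below answers repeat queries from a local cache within a short time-to-live, reducing round trips to the source; the simulated source call stands in for a real remote system.

```python
# A toy illustration of intelligent caching in data virtualisation: repeat
# queries are answered from a local cache for a short time-to-live, reducing
# round trips to the remote source. The "remote" fetch below is simulated.
import time

CACHE_TTL_SECONDS = 60
_cache = {}   # query -> (timestamp, result)

def fetch_from_source(query: str):
    """Stand-in for a slow call to a remote, dispersed data source."""
    time.sleep(0.5)                      # simulated network/source latency
    return f"rows for: {query}"

def cached_query(query: str):
    """Serve from cache when fresh; otherwise hit the source and cache it."""
    now = time.time()
    if query in _cache and now - _cache[query][0] < CACHE_TTL_SECONDS:
        return _cache[query][1]          # cache hit: no source impact
    result = fetch_from_source(query)    # cache miss: one trip to the source
    _cache[query] = (now, result)
    return result

print(cached_query("SELECT * FROM customers"))   # slow: goes to the source
print(cached_query("SELECT * FROM customers"))   # fast: served from the cache
```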

The ability to leave source data in its source repository, while gaining all the proposed advantages of virtualisation, will also depend, in South Africa, on the availability of lower-cost, higher-bandwidth, reliable connectivity. Just as data governance requires not so much individual tools or technologies as an overarching framework that governs them all appropriately and deploys them in designs that offer reliable functionality, so too must physical network resources be applied intelligently to provide the connectivity that these data-driven organisations will require in customer-centric worlds. Technologies such as server and network virtualisation, coupled with software-defined networking (SDN) and multi-channel centre-point Gbps fibre connectivity, promise the leaps and bounds necessary to effect that.
