AI and machine learning's growing role in data analytics
Artificial intelligence (AI) in today's world doesn't look quite the way great science-fiction writers and film directors envisioned it in decades past. That said, there's no denying its near-ubiquity in our professional and personal lives, or its vital role in the management and optimization of analytics.
Machine learning (ML) is the most important branch of AI for data analytics, for a number of reasons. Complex ML algorithms can process, distribute, and analyze large volumes of structured or unstructured data in minutes or even seconds. Even more importantly, ML platforms continuously learn from the data that passes through them.
Not only does this allow ML systems to refine themselves constantly and steadily improve the quality of their output, but it also unlocks the potential of predictive analytics and prescriptive analytics. In a guest post for the Forbes Technology Council, InfoVision Chief Technology and Innovation Officer Chithrai Mani explained that enterprises have begun to use ML-derived actionable insight to conduct more accurate market research, predict behaviors of individual customers and large consumer demographics, and devise improvement strategies in countless operational areas: everything from marketing and customer service to supply chain management and maintenance.
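To make the idea of a model that keeps learning from the data passing through it a little more concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any particular ML platform works: the `OnlineLinearModel` class, the `stream` generator, and the learning rate are all assumptions invented for this example. The model sees one observation at a time, makes a prediction, and then nudges its parameters toward the right answer, so its error shrinks as more data arrives.

```python
class OnlineLinearModel:
    """One-feature linear model updated one observation at a time (SGD)."""

    def __init__(self, learning_rate: float = 0.1) -> None:
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: float) -> float:
        return self.weight * x + self.bias

    def learn_one(self, x: float, y: float) -> float:
        """Update on a single (x, y) pair; return the error made before updating."""
        error = self.predict(x) - y
        # Gradient step on the squared error for this one observation.
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        return error


def stream(n: int):
    """Hypothetical data stream: y = 2x + 1, with x cycling over 0.0 to 0.9."""
    for i in range(n):
        x = (i % 10) / 10
        yield x, 2 * x + 1


model = OnlineLinearModel()
errors = [abs(model.learn_one(x, y)) for x, y in stream(5000)]

# Compare the average error on the first and last 50 observations.
early_error = sum(errors[:50]) / 50
late_error = sum(errors[-50:]) / 50
print(f"early error: {early_error:.4f}, late error: {late_error:.6f}")
```

Real platforms use far richer models and noisier data, but the loop is the same in spirit: predict, compare, adjust, repeat.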
It's no surprise, then, that the use of ML is among the most notable data analytics trends to follow in the coming years. According to Gartner, by 2024, 75% of organizations will have shifted from AI and ML pilot programs to full-fledged use of these technologies. ML has the potential to galvanize processes such as decision modeling, personalization, and data management, and it will become far more pervasive at the edge, given how much data is processed there. As noted by Data Science Central, the success of these efforts will require high-quality data engineering.
Another Gartner report found that it will be critical for enterprises to scale up their AI operations, including ML, in response to ever-expanding volumes of data and organizations' increasing reliance upon it. The research firm also pointed out that historical data will become somewhat less important for analytics due to the seismic shift in the economy caused by the COVID-19 pandemic. As such, ML tools will have to adapt to having less information to base calculations on, but the very nature of the technology suggests it can accommodate this challenge.
More data in multi-cloud, hybrid cloud and intercloud architectures
Enterprises are moving large shares of their workloads and the associated data to the cloud: 75% of all databases will be cloud-based by 2022, per Gartner's projections, and by 2023, half of the revenue from the database management system (DBMS) market will stem from cloud DBMS adoption. There are, of course, still plenty of reasons to keep certain data on-premises; migration does not have to be total to have great value. But taking Gartner's figures into account, it's hard to imagine this trend of cloud migration reversing or even significantly slowing down.
The quantity and variety of the data involved in this trend mean that numerous enterprises are not migrating to one cloud but several:
- Those opting for the multi-cloud approach are using multiple clouds from one or more cloud service providers (CSPs).
- Hybrid cloud entails the simultaneous use of on-premises infrastructure and public cloud resources.
- Intercloud links public clouds from multiple CSPs as a single holistic architecture. This strategy of cross-cloud utilization allows workloads to be automatically moved and helps leverage specific advantages of each cloud according to real-time business needs.
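The three definitions above overlap, so a single deployment can fit more than one of them. The short Python sketch below encodes each definition as a rule and returns every label that applies. The `Cloud` class, the `deployment_styles` function, the provider names, and the assumption that every listed cloud is a public cloud are all hypothetical simplifications for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cloud:
    provider: str  # the CSP (placeholder names are used below)
    name: str


def deployment_styles(clouds: list[Cloud], on_premises: bool, linked: bool) -> set[str]:
    """Return every deployment style from the list above that applies.

    Assumes each entry in `clouds` is a public cloud. `linked` means the
    clouds are joined into one architecture across which workloads can move.
    """
    styles: set[str] = set()
    providers = {cloud.provider for cloud in clouds}
    if len(clouds) >= 2:
        styles.add("multi-cloud")    # multiple clouds, one or more CSPs
    if on_premises and clouds:
        styles.add("hybrid cloud")   # on-premises plus public cloud
    if linked and len(providers) >= 2:
        styles.add("intercloud")     # linked public clouds from multiple CSPs
    return styles


estate = [Cloud("provider_a", "analytics"), Cloud("provider_b", "archive")]
print(deployment_styles(estate, on_premises=True, linked=False))
```

In this example the estate counts as both multi-cloud and hybrid cloud, but not intercloud, because the two clouds are not linked into a single architecture.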
Certain cloud services have also emerged to serve the needs of specific business sectors. According to VentureBeat, this trend involves so-called "industry cloud" solutions like Microsoft Cloud for Manufacturing as well as the use of SaaS-based electronic health records (EHR) tools.
From a business analytics perspective, it's worth noting that there may not yet be a significant difference between the general-purpose and industry-specific cloud offerings of major providers like Microsoft, Amazon, and Google. But this development may still be worth exploring for organizations with concerns about standard public cloud offerings' ability to handle the demands of their sector.
No matter which style of deployment enterprises choose for their cloud migration, it will be critical to ensure that their data analytics tools are cloud-ready.
The rise of the data mesh design pattern
To make the most of data analytics, enterprises cannot simply think about where data is stored, but must also consider how it is arranged — its delineations and architecture. In the last year or so, there's been a great deal of conversation involving terms like data lake, data warehouse, data lakehouse, data mesh, and data fabric, to name just a few, and it can all be more than a little bit confusing.
In the interest of avoiding said confusion, we're focusing principally on just one of those here: data mesh. In this approach, data domains for different areas of an enterprise's operations (e.g., marketing and accounting) are controlled independently of one another — almost as if siloed, even though they aren't — by those closest to the data relevant to their business unit.
If each domain has separate schemas — an approach Teradata recommends — there will be little to none of the bottlenecking that can occur in instances where enterprise data is centralized as a single schema. Each domain follows data governance stipulations as necessary, and data products developed within one domain are crafted according to agreed-upon interoperability standards so that they can be used by their counterparts. Also, in a data mesh, domain schemas can be isolated, co-located under a single database, or simply connected to one another, the latter two of which are particularly well-suited to enterprise analytics.
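As a rough illustration of domains that own their schemas while still following a shared standard, consider the Python sketch below. It is a toy model under stated assumptions, not Teradata's implementation or any product's API: the `Domain` and `DataProduct` classes, the `REQUIRED_METADATA` fields standing in for "agreed-upon interoperability standards," and all names and values are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical interoperability standard: every published data product
# must carry these metadata fields, whichever domain it comes from.
REQUIRED_METADATA = {"owner", "description", "refresh_schedule"}


@dataclass
class DataProduct:
    name: str
    columns: dict[str, str]   # column name -> type name
    metadata: dict[str, str]

    def missing_metadata(self) -> set[str]:
        return REQUIRED_METADATA - set(self.metadata)


@dataclass
class Domain:
    """A business domain that owns its own schema and its data products."""
    schema: str
    products: dict[str, DataProduct] = field(default_factory=dict)

    def publish(self, product: DataProduct) -> str:
        """Publish a product if it meets the shared standard; return its address."""
        missing = product.missing_metadata()
        if missing:
            raise ValueError(f"{product.name} is missing metadata: {sorted(missing)}")
        self.products[product.name] = product
        return f"{self.schema}.{product.name}"


# Each domain is managed independently, with its own schema.
marketing = Domain(schema="marketing")
accounting = Domain(schema="accounting")

address = marketing.publish(DataProduct(
    name="campaign_results",
    columns={"campaign_id": "int", "spend": "decimal"},
    metadata={
        "owner": "marketing-analytics",
        "description": "Spend and outcome per campaign",
        "refresh_schedule": "daily",
    },
))
print(address)
```

The point of the design is that the marketing team decides what goes into its own schema, while the shared check in `publish` ensures that anything it releases can be discovered and used by other domains such as accounting.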
Data mesh allows for the accelerated development and delivery of complex data products, and also makes it much easier to share them across the enterprise. This can be a critical advantage for cross-functional teams within any company — groups that must remain simultaneously cognizant of multiple aspects of operations.
It must be said that data mesh can lead to certain difficulties. By its nature, the approach creates a great many data sets, which can lead to duplication, performance and quality degradation, and governance challenges. But for large enterprises, domain decentralization can be quite useful for ensuring that the data across many domains is leveraged most effectively. Teradata's QueryGrid solution can be an ideal companion to a data mesh ownership and maintenance framework.
Growing demand for data scientists and CDOs
Another major trend in data analytics management involves those who perform the "nuts and bolts" work of analytics and uncover the insights within.
For one, there aren't many of them: Per data from the Bureau of Labor Statistics (BLS) for May 2020, the most recent date for which accurate figures are available, there are fewer than 60,000 "data scientists," "data engineers," or individuals with similar titles in the American labor force. That supply remains unequal to the demand, and it seems unlikely that demand for individuals with such skills will diminish. Also, Burtch Works' 2021 survey of analytics professionals and data scientists found that their median salaries remain high despite the pandemic, so competition for these experts' services will be fierce.
According to MIT Sloan, enterprises that already have chief data officers (CDOs) on the payroll may have a solution to this issue: offering training to employees outside the data science realm on the tenets and technologies integral to data analytics. Towards Data Science noted that many data professionals didn't get degrees in the field but transitioned to it from other fields and learned by doing. With the stewardship of an effective CDO, enterprises could create "citizen data scientists" out of their employees and develop an analytics-forward culture. If the shortage of data experts and scientists persists, more and more organizations may adopt such a strategy.
These four trends are hardly the only developments to follow in data analytics — and the Teradata team is tracking them diligently. Take a look at our blog to learn about the importance of analytics to merchant acquiring, 5G technology's embedded-analytics approach, and much more.