Generative AI deployment: Strategies for smooth scaling

After a procession of overhyped technologies like Web3, the metaverse, and blockchain, executives are bracing for the tidal wave of generative AI, a shift some consider to be on par with the advent of the internet or the desktop computer. But with power comes responsibility, and generative AI offers as much risk as reward. The technology is testing legal regimes in copyright and intellectual property, creating new cyber and data governance threats, and setting off automation anxiety in the workforce.

Organizations need to move quickly to keep up with stakeholder expectations, yet they must proceed carefully to ensure they do not fall foul of regulations or ethical standards in areas like data privacy and bias. Operationally, enterprises need to reconfigure their workforce and forge partnerships with tech companies to design safe, effective, and reliable generative AI.

To gauge the thinking of business decision-makers at this crossroads, MIT Technology Review Insights polled 1,000 executives about their current and expected generative AI use cases, implementation barriers, technology strategies, and workforce planning. Combined with insights from an expert interview panel, this poll offers a view into today’s major strategic considerations for generative AI, helping executives reason through the major decisions they are being called upon to make.

Key findings from the poll and interviews include the following:

  • Executives recognize the transformational potential of generative AI, but they are moving cautiously to deploy. Nearly all firms believe generative AI will affect their business, with a mere 4% saying it will not affect them. But at this point, only 9% have fully deployed a generative AI use case in their organization. This figure is as low as 2% in the government sector, while financial services (17%) and IT (28%) are the most likely to have deployed a use case. The biggest hurdle to deployment is understanding generative AI risks, selected as a top-three challenge by 59% of respondents.
  • Companies will not go it alone: Partnerships with both startups and Big Tech will be critical to smooth scaling. Most executives (75%) plan to work with partners to bring generative AI to their organization at scale, and very few (10%) consider partnering to be a top implementation challenge, suggesting that a strong ecosystem of providers and services is available for collaboration and co-creation. While Big Tech, as developers of generative AI models and purveyors of AI-enabled software, has an ecosystem advantage, startups enjoy advantages in several specialized niches. Executives are somewhat more likely to plan to team up with small AI-focused companies (43%) than large tech firms (32%).
  • Access to generative AI will be democratized across the economy. Company size has no bearing on a firm’s likelihood to be experimenting with generative AI, our poll found. Small companies (those with annual revenue less than $500 million) were three times more likely than mid-sized firms ($500 million to $1 billion) to have already deployed a generative AI use case (13% versus 4%). In fact, these small companies had deployment and experimentation rates similar to those of the very largest companies (those with revenue greater than $10 billion). Affordable generative AI tools could boost smaller businesses in the same way as cloud computing, which granted companies access to tools and computational resources that would once have required huge financial investments in hardware and technical expertise.
  • One-quarter of respondents expect generative AI’s primary effect to be a reduction in their workforce. The figure was higher in industrial sectors like energy and utilities (43%), manufacturing (34%), and transport and logistics (31%). It was lowest in IT and telecommunications (7%). Overall, this is a modest figure compared to the more dystopian job replacement scenarios in circulation. Demand for skills is increasing in technical fields that focus on operationalizing AI models and in organizational and management positions tackling thorny topics including ethics and risk. AI is democratizing technical skills across the workforce in ways that could lead to new job opportunities and increased employee satisfaction. But experts caution that, if deployed poorly and without meaningful consultation, generative AI could degrade the qualitative experience of human work.
  • Regulation looms, but uncertainty is today’s greatest challenge. Generative AI has spurred a flurry of activity as legislators try to get their arms around the risks, but truly impactful regulation will move at the speed of government. In the meantime, many business leaders (40%) consider engaging with regulation or regulatory uncertainty a primary challenge of generative AI adoption. This varies greatly by industry, from a high of 54% in government to a low of 20% in IT and telecommunications.

Download the report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Laying the foundation for data- and AI-led growth

Enterprise adoption of AI is ready to shift into higher gear. The capabilities of generative AI have captured management attention across the organization, and technology executives are moving quickly to deploy or experiment with it. Many organizations intend to increase their spending on the wider family of AI capabilities and the data infrastructure that supports them by double digits during the next year. And notwithstanding concerns about unfavorable economic conditions, executives see opportunities to leverage data and AI to deliver more growth to their organizations, to both the top and bottom lines.

Based on a global survey of 600 technology leaders and a series of in-depth interviews, this report finds that organizations are sharply focused on retooling for a data and AI-driven future. Everything from data architecture to AI-enabled automation is on the table, as technology executives strive to find new efficiencies and new sources of growth. At the same time, the pressure to democratize the power of data and AI creates renewed urgency to bolster data governance and security.

Following are the study’s key findings:

  • CIOs are doubling down on their investments in data and AI. Faced with increasing audience expectations, new competitive pressures, a challenging economic backdrop, and an unprecedented speed of innovation, technology leaders need their data and AI assets to deliver more growth to the business than ever before. They are investing to secure this future: every organization surveyed will boost its spending on modernizing data infrastructure and adopting AI during the next year, and for nearly half (46%), the increase will exceed 25%.
  • Consolidation of data and AI systems is a priority. The proliferation of data and AI systems is particularly extensive in the survey’s largest organizations (those with annual revenue of more than $10 billion). Among these, 81% operate 10 or more of these systems, and 28% use more than 20. The executives we interviewed aim to pare down their multiple systems, connecting data from across the enterprise in unified platforms to break down silos and enable AI initiatives to scale.
  • Democratization of AI raises the stakes for governance. As business units and their staff clamor to use generative AI, executives seek assurance that governance frameworks for the technology can provide not only the needed data accuracy and integrity but also adequate data privacy and security. That’s probably why 60% of respondents say a single governance model for data and AI is “very important.”
  • Executives expect AI adoption to be transformative in the short term. Eighty percent of survey respondents expect AI to boost efficiency in their industry by at least 25% in the next two years. One-third say the gain will be at least 50%.
  • As generative AI spreads, flexible approaches are favored. Eighty-eight percent of organizations are using generative AI, with one-quarter (26%) investing in and adopting it and another 62% experimenting with it. The majority (58%) are taking a hybrid approach to developing these capabilities, using vendors’ large language models (LLMs) for some use cases and building their own models when IP ownership, privacy, security, and accuracy requirements are tighter.
  • Lakehouse has become the data architecture of choice for the era of generative AI. Nearly three-quarters of surveyed organizations have adopted a lakehouse architecture, and almost all of the rest expect to do so in the next three years. Survey respondents say they need their data architecture to support streaming data workloads for real-time analytics (a capability deemed “very important” by 72%), easy integration of emerging technologies (66%), and sharing of live data across platforms (64%). Ninety-nine percent of lakehouse adopters say the architecture is helping them achieve their data and AI goals, and 74% say the help is “significant.”
  • Investment in people will unlock more value from data and AI. In our survey, talent and skills gaps overshadow organizations’ other data and AI challenges. When asked where their company’s data strategy needs to improve, the largest share of respondents (39%) say investing in talent. The number-one difficulty they face with their data and AI platforms, with 40% citing this as a top concern, is training and upskilling staff to use them.

A subsequent report will examine these survey results in detail, accompanied by insights from additional executive interviews across six sectors: financial services, health care and life sciences, retail and consumer packaged goods, manufacturing, media and entertainment, and government.

Download the report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Driving companywide efficiencies with AI

Autonomous shopping carts that follow grocery store customers and robots that pick ripe cucumbers faster than humans may grab headlines, but the most compelling applications of AI and ML technology are behind the scenes. Increasingly, organizations are finding substantial efficiency gains by applying AI- and ML-powered tools to back-office procedures such as document processing, data entry, employee onboarding, and workflow automation.

The power of automation to augment productivity in the back office has been clear for decades, but the recent emergence of advanced AI and ML tools offers a step change in what automation can accomplish, including in highly regulated industries such as health care.

“In the past, AI was seen as a complex and expensive technology that was only accessible to large companies with deep pockets,” says Himadri Sarkar, executive vice president and global head of consulting at Teleperformance, a digital business services company. “However, the development of easy-to-use generative AI tools has made it possible for businesses of all sizes to experiment with AI and see how it can benefit their operations.”

Organizations are taking note, pursuing innovative use cases that not only promise to improve back-office operations but also deliver bottom-line benefits, from cost savings to productivity gains.

AI in action

According to McKinsey’s 2022 Global Survey on AI, AI adoption has more than doubled—from 20% of respondents having adopted AI in at least one business area in 2017 to 50% today. It’s easy to understand this technology’s growing popularity: as challenging economic times meet increasing customer expectations, organizations are being asked to do more with less.

“Companies are trying to optimize their use of resources in an inflationary environment,” says Omer Minkara, vice president and principal analyst with Aberdeen Strategy and Research. “Adding to the pressure is the fact that many companies have to defer their technology spend and headcount increases.”

Fortunately, AI and ML solutions can help bridge this gap for a wide range of industries by automating and optimizing various back-office tasks and processes. A retailer, for example, may use AI-powered chatbots to handle routine customer inquiries, track orders, and respond to refund requests, improving response times, enhancing customer experience, and freeing up contact center agents. At the same time, financial institutions are discovering the power of ML to identify anomalies within large volumes of data that may indicate fraud—an early warning system against financial loss. Organizations across industries can employ AI and ML tools to extract and analyze information from documents, such as invoices, contracts, and reports, and to reduce the burden of manual data entry while speeding up processing times and minimizing human errors.
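As a concrete illustration of the anomaly-flagging idea, the sketch below screens transaction amounts with a simple median-absolute-deviation rule. It is a toy heuristic on invented numbers, not any particular institution's fraud model; real systems combine many such signals with trained models, but the shape of the early-warning step is the same.

```python
import statistics

def flag_anomalies(amounts, threshold=3.5):
    """Return indices of amounts far from the median, measured in
    robust MAD units -- an early-warning screen, not a fraud verdict."""
    median = statistics.median(amounts)
    mad = statistics.median(abs(a - median) for a in amounts)
    if mad == 0:
        return []  # all values (nearly) identical: nothing stands out
    # 0.6745 rescales MAD so the score is comparable to a z-score
    return [i for i, a in enumerate(amounts)
            if 0.6745 * abs(a - median) / mad > threshold]

# Routine ~$100 transactions with one $5,000 outlier at index 5
txns = [102.5, 98.0, 101.2, 99.9, 100.4, 5000.0, 97.8, 103.1]
print(flag_anomalies(txns))  # → [5]
```

A median-based rule is used here rather than a mean-based z-score because a single extreme value drags the mean and standard deviation toward itself, masking the very outlier being hunted.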

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Data analytics reveal real business value

Business data provides an often untapped well of organizational value. Customer interaction data, supply chain data, operational data, human resource data, financial data, market research data, back-office data—these oft-hidden data sources “hold immense potential for operational insights and value creation,” says Sidharth Mukherjee, chief digital officer of Teleperformance, a global digital business services company.

Making sense of today’s vast volumes of business data, however, is a daunting process. For starters, there’s so much of it, it’s often freeform in structure, and it’s frequently unknown or siloed within the organization. In fact, research firm Forrester identified “activation of unstructured ‘dark’ data” as a 2022 customer service megatrend.

The arrival of enterprise-ready generative AI tools in late 2022 put the need to leverage this data in sharp focus. Given recent months’ enormous hype and heightened expectations around generative AI, having a robust data strategy has become the key imperative for organizations keen to leverage its potential.

Fortunately, data analytics can help organizations identify and extract actionable insights from this underutilized data to support smarter decision-making, streamlined back-office processes, and enhanced business performance. To accomplish this feat, though, business and analytics leaders must ensure data quality while securing the right leadership, employee buy-in, and a data-driven culture.

The benefits of operationalizing data

By 2025, the amount of data in the world will grow to more than 180 zettabytes, according to Statista. This includes the massive streams of data generated by everyday business applications: customer interaction logs, supplier contacts, conversion tracking results, employee and workforce management information, customer feedback data, research results, invoice processing receipts, vendor management. From payroll processing solutions to employee onboarding tools, these technologies produce data whose potential is often underleveraged. That’s changing, however, as organizations turn to data analytics to examine this data, identify patterns, and create models that surface relevant information and recommendations that can lead to more informed decisions.

“Data analytics technology has made huge strides in the last couple of years,” says Sharang Sharma, vice president of business process services at Everest Group. “It’s really phenomenal to see the amount of data that some of these tools can analyze and generate insights from.” In fact, the analytics and business intelligence software market is expected to double in size by 2025, reaching a value of $13 billion, according to Gartner research.

Organizations are already discovering new and innovative ways of operationalizing business data through data analytics. These use cases span industries and demonstrate the power of data analytics to identify inefficient internal processes, particularly back-office workflows, and enhance them for improved business performance.

A grocery store chain, for example, might examine its supply chain data to pinpoint the causes of bottlenecks and delays. Not only do these insights allow the retailer to address delays and act ahead of the curve, but they also enable warehouse and procurement managers to optimize inventory in ways that can prevent product waste, customer frustration, and unnecessary costs.

An insurance business might analyze the data generated by human resource management systems to develop new operational insights. Consider, for example, a health insurance company that takes the time to examine data associated with its employee onboarding process. It might identify factors that cause some new hires to take longer than others to become fully productive—and as a result, the business can implement training modules that are designed to boost productivity and minimize turnover. These types of applications are a particular advantage, of course, in highly competitive sectors and in today’s tight labor market.

In a customer support environment, operational efficiencies can be achieved when data analytics tools are used to monitor interaction activity. Certain data patterns may point, for example, to a sudden surge in call volume. Recognizing these patterns can help organizations prepare their staff for upticks and more strategically allocate resources based on fluctuating demand. The result: cost savings, improved customer experience, and new operational efficiencies.
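The surge-recognition pattern described above can be sketched in a few lines: compare each hour's call volume to a trailing average and flag hours that jump well above the baseline. The window, multiplier, and data below are illustrative assumptions, not figures from any contact-center product.

```python
def detect_surges(hourly_calls, window=24, factor=2.0):
    """Return indices of hours whose call volume exceeds `factor` times
    the trailing `window`-hour average -- a simple staffing signal."""
    surges = []
    for i in range(window, len(hourly_calls)):
        baseline = sum(hourly_calls[i - window:i]) / window
        if baseline > 0 and hourly_calls[i] > factor * baseline:
            surges.append(i)
    return surges

# A steady ~100 calls per hour, then a sudden spike at hour 30
volumes = [100] * 30 + [320] + [105] * 5
print(detect_surges(volumes))  # → [30]
```

In practice the flagged hours would feed a staffing dashboard or alert, letting managers allocate agents ahead of the demand they signal.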

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Why embracing complexity is the real challenge in software today

Technology Radar is a snapshot of the current technology landscape produced by Thoughtworks twice a year; it’s based on technologies we’ve been using as an organization and communicates our perspective on them. There is always a long list of candidate technologies for the group to work through and discuss, and with each edition that list grows longer. It seems there are, increasingly, more and more ways to solve a problem. On the one hand this is a good thing—the marketplace is doing its job, offering technologists a wealth of options. Yet on the other, it adds to our cognitive load: there are more things to learn about and evaluate.

It’s no accident that many of the most widely discussed trends in technology—such as data mesh and, most recently, generative AI (GenAI)—are presented as solutions to this complexity. However, it’s important that we don’t ignore complexity or see it as something that can be fixed: we need to embrace it and use it to our advantage.

Redistributing complexity

The reason we can’t just wish away or “fix” complexity is that every solution—whether it’s a technology or methodology—redistributes complexity in some way. Solutions reorganize problems. When microservices emerged (a software architecture approach where an application or system is composed of many smaller parts), they seemingly solved many of the maintenance and development challenges posed by monolithic architectures (where the application is one single interlocking system). However, in doing so microservices placed new demands on engineering teams; they require greater maturity in terms of practices and processes. This is one of the reasons why we cautioned people against what we call “microservice envy” in a 2018 edition of the Technology Radar, with CTO Rebecca Parsons writing that microservices would never be recommended for adoption on Technology Radar because “not all organizations are microservices-ready.” We noticed there was a tendency to look to adopt microservices simply because it was fashionable.

This doesn’t mean the solution is poor or defective. It’s more that we need to recognize the solution is a tradeoff. At Thoughtworks, we’re fond of saying “it depends” when people ask questions about the value of a certain technology or approach. It’s about how it fits with your organization’s needs and, of course, your ability to manage its particular demands. This is an example of essential complexity in tech—it’s something that can’t be removed and which will persist however much you want to get to a level of simplicity you find comfortable.

In terms of microservices, we’ve noticed increasing caution about rushing to embrace this particular architectural approach. Some of our colleagues even suggested the term “monolith revivalists” to describe those turning away from microservices back to monolithic software architecture. While it’s unlikely that the software world is going to make a full return to monoliths, frameworks like Spring Modulith—a framework that helps developers structure code in such a way that it becomes easier to break apart a monolith into smaller microservices when needed—suggest that practitioners are becoming more keenly aware of managing the tradeoffs of different approaches to building and maintaining software.

Supporting practitioners with concepts and tools

Because technical solutions have a habit of reorganizing complexity, we need to carefully attend to how this complexity is managed. Failing to do so can have serious implications for the productivity and effectiveness of engineering teams. At Thoughtworks we have a number of concepts and approaches that we use to manage complexity. Sensible defaults, for instance, are starting points for a project or piece of work. They’re not things that we need to simply embrace as a rule, but instead practices and tools that we collectively recognize are effective for most projects. They give individuals and teams a baseline to make judgements about what might be done differently.

One of the benefits of sensible defaults is that they can guard you against the allure of novelty and hype. As interesting or exciting as a new technology might be, sensible defaults can anchor you in what matters to you. This isn’t to say that new technologies like generative AI shouldn’t be treated with enthusiasm and excitement—some of our teams have been experimenting with these tools and seen impressive results—but instead that adopting new tools needs to be done in a way that properly integrates with the way you work and what you want to achieve. Indeed, there is a wealth of approaches to GenAI, from high-profile tools like ChatGPT to self-hosted LLMs. Using GenAI effectively is as much a question of knowing the right way to implement it for you and your team as it is about technical expertise.

Interestingly, the tools that can help us manage complexity aren’t necessarily new. One thing that came up in the latest edition of Technology Radar was something called risk-based failure modeling, a process used to understand the impact, likelihood and ability of detecting the various ways that a system can fail. This has origins in failure modes and effects analysis (FMEA), a practice that dates back to the period following World War II, used in complex engineering projects in fields such as aerospace. This signals that there are some challenges that endure; while new solutions will always emerge to combat them, we should also be comfortable looking to the past for tools and techniques.
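FMEA's classic scoring can be made concrete: each failure mode is rated for severity, likelihood of occurrence, and difficulty of detection, and their product (the risk priority number, or RPN) ranks where attention should go first. The failure modes and ratings below are invented for illustration; real analyses calibrate the 1–10 scales to the system at hand.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int    # 1-10: impact if the failure occurs
    occurrence: int  # 1-10: likelihood of occurring
    detection: int   # 1-10: difficulty of detection (10 = hardest)

    @property
    def rpn(self) -> int:
        # Risk Priority Number: the classic FMEA ranking score
        return self.severity * self.occurrence * self.detection

modes = [
    FailureMode("stale cache served", severity=4, occurrence=7, detection=3),
    FailureMode("payment double-charge", severity=9, occurrence=2, detection=6),
    FailureMode("silent data corruption", severity=10, occurrence=3, detection=9),
]

# Highest-risk modes first: review effort goes where the RPN is largest
for m in sorted(modes, key=lambda m: m.rpn, reverse=True):
    print(f"{m.rpn:4d}  {m.name}")
```

Note how the ranking rewards failures that are both damaging and hard to detect, which is exactly the intuition behind applying the technique to complex software systems.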

Learning to live with complexity

McKinsey’s argument that the productivity of development teams can be successfully measured caused a stir across the software engineering landscape. While having the right metrics in place is certainly important, prioritizing productivity in our thinking can cause more problems than it solves when it comes to complex systems and an ever-changing landscape of solutions. Technology Radar called this out in an edition with the theme “How productive is measuring productivity?”, which highlighted the importance of focusing on developer experience with the help of tools like DX DevEx 360.

Focusing on productivity in the way McKinsey suggests can cause us to mistakenly see coding as the “real” work of software engineering, overlooking things like architectural decisions, tests, security analysis, and performance monitoring. This is risky—organizations that adopt such a view will struggle to see tangible benefits from their digital projects. This is why the key challenge in software today is embracing complexity: not treating it as something to be minimized at all costs, but as a challenge that requires thoughtfulness in processes, practices, and governance. The key question is whether the industry realizes this.

This content was produced by Thoughtworks. It was not written by MIT Technology Review’s editorial staff.

New approaches to the tech talent shortage

We live in a tech-enabled world, but for organizations to advance world-changing innovations, they need skilled people who can build, install, and maintain the systems that underlie them. Finding that talent is one of the biggest ongoing problems — and opportunities — in tech.

The IT staffing shortages brought on by covid-19 and the Great Resignation are still affecting companies today. In a poll of global tech leaders conducted by MIT Technology Review Insights, 64% of respondents say candidates for their IT and tech jobs lack necessary skills or experience. Another 56% cite an overall shortage of candidates as a concern.

A 2021 Gartner survey of IT executives shows that a majority — 64% — believe the ongoing tech talent shortage is the most significant barrier to the adoption of emerging technologies. By 2030, more than 85 million jobs might go unfilled, “because there aren’t enough skilled people to take them,” according to Korn Ferry. Without that talented workforce, companies could lose out on $8.5 trillion in annual revenue.

Companies are all looking for ways to address this talent shortage in the short term. As the Great Resignation has given way to a Great Reshuffle, with tech employees — including those affected by the tech layoffs of late 2022 and early 2023 — seeking new roles that meet their needs for flexibility, work-life balance, and career growth, some employers have seen the opportunity to differentiate themselves with their career offerings. They compete fiercely to offer the best salaries, benefits, and working conditions; court freshly minted university graduates as well as experienced talent; and bring on contract and temporary workers to bridge the gap.

But tech doesn’t just need short-term bridges. It needs long-term solutions. That’s why some companies are looking earlier in the pipeline — and even building their own pipeline. Innovative tech leaders have begun targeting less traditionally qualified candidates, including those who have just finished secondary school, and they are cultivating that future potential through new early-career programs. 

A new approach to early-career candidates

For many people, the traditional path from education to career has followed a linear trajectory: Graduate high school. Go to college, university, or trade school. Get a job. But that approach has its risks — both for students and for potential future employers. 

For students, the cost of a university degree can be reason enough to pursue a different path. The College Board reports the average U.S. in-state student pays $10,740 per year for tuition at a public, four-year college (plus an average of $11,950 per year for room and board). According to the same data, the average student will take out $30,000 in loans to earn a bachelor’s degree.

Those prohibitively high costs have impacted diversity within the tech industry. Students who can’t afford a tech degree don’t go to school, and then they don’t join the industry. Further down the line, when future students don’t see tech leaders who come from backgrounds similar to their own, they may opt for a different path. 

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Making sense of sensor data

Consider a supply chain where delivery vehicles, shipping containers, and individual products are sensor-equipped. Real-time insights enable workers to optimize routes, reduce delays, and efficiently manage inventory. This smart orchestration boosts efficiency, minimizes waste, and lowers costs.

Many industries are rapidly integrating sensors, creating vast data streams that can be leveraged to open profound business possibilities. In energy management, growing use of sensors and drone footage promises to enable efficient energy distribution, lower costs, and reduced environmental impact. In smart cities, sensor networks can enhance urban life by monitoring traffic flow, energy consumption, safety concerns, and waste management.

These aren’t glimpses of a distant future, but realities made possible today by the increasingly digitally instrumented world. Internet of Things (IoT) sensors have been rapidly integrated across industries, and now constantly track and measure properties like temperature, pressure, humidity, motion, light levels, signal strength, speed, weather events, inventory, heart rate, and traffic.

The information these devices collect—sensor and machine data—provides insight into the real-time status and trends of these physical parameters. This data can then be used to make informed decisions and take action—capabilities that unlock transformative business opportunities, from streamlined supply chains to futuristic smart cities.

John Rydning, research vice president at IDC, projects that sensor and machine data volumes will soar over the next five years, achieving a greater than 40% compound annual growth rate through 2027. He attributes that not primarily to an increasing number of devices, as IoT devices are already quite prevalent, but rather to more data being generated by each one as businesses learn to make use of their ability to produce real-time streaming data.
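
To put that growth rate in perspective, compounding 40% annually is a quick back-of-the-envelope calculation:

```python
# How much does data volume grow at a 40% compound annual growth rate?
cagr = 0.40
years = 5

growth_factor = (1 + cagr) ** years
print(f"A 40% CAGR over {years} years multiplies volume by {growth_factor:.2f}x")
```

In other words, sustained 40% annual growth more than quintuples data volume within five years.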

Meanwhile, sensors are growing more interconnected and sophisticated, while the data they generate increasingly includes a location in addition to a timestamp. These spatial and temporal features not only capture data changes over time, but also create intricate maps of how these shifts unfold across locations—facilitating more comprehensive insights and predictions.
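
A minimal sketch of the spatio-temporal idea: indexing each reading by both its location and its time window, so trends can be tracked per place over time. The locations and readings here are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sensor readings: (location, hour_of_day, temperature_c)
readings = [
    ("dock-a", 9, 4.1), ("dock-a", 9, 4.3), ("dock-a", 10, 5.0),
    ("dock-b", 9, 7.2), ("dock-b", 10, 7.8), ("dock-b", 10, 8.0),
]

# Key each reading by its spatial and temporal coordinates together,
# rather than by either alone.
by_loc_hour = defaultdict(list)
for loc, hour, temp in readings:
    by_loc_hour[(loc, hour)].append(temp)

# Average temperature per (location, hour) -- a simple map of how
# conditions shift across both time and space.
averages = {key: mean(vals) for key, vals in by_loc_hour.items()}
```

Grouping on the combined key is what distinguishes spatio-temporal analysis from treating time series and locations independently.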

But as sensor data grows more complex and voluminous, legacy data infrastructure struggles to keep pace. The continuous readings over time and space captured by sensor devices now require a new set of design patterns to unlock maximum value. While businesses have capitalized on spatial and time-series data independently for over a decade, the data's full potential is only realized when the two are considered in tandem, in context, and with the capacity for real-time insights.

Download the report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Bolstering enterprise LLMs with machine learning operations foundations

Generative AI, particularly large language models (LLMs), will play a crucial role in the future of customer and employee experiences, software development, and more. Building a solid foundation in machine learning operations (MLOps) will be critical for companies to effectively deploy and scale LLMs, and generative AI capabilities broadly. In this uncharted territory, improper management can lead to complexities organizations may not be equipped to handle.

Back to basics for emerging AI

To develop and scale enterprise-grade LLMs, companies should demonstrate five core characteristics of a successful MLOps program, starting with deploying ML models consistently. Standardized processes and controls should monitor production models for drift as well as for data and feature quality. Companies should be able to replicate and retrain ML models with confidence, moving through quality assurance and governance processes to deployment without much manual work or rewriting. Lastly, they should ensure their ML infrastructure is resilient (with multiregional availability and failure recovery), consistently scanned for cyber vulnerabilities, and well managed.

Once these components are in place, more complex LLM challenges will require nuanced approaches and considerations—from infrastructure to capabilities, risk mitigation, and talent.

Deploying LLMs as a backend

Inferencing with traditional ML models typically involves packaging a model object as a container and deploying it on an inferencing server. As demands on the model increase—more requests and more customers requiring more run-time decisions (higher QPS within a latency bound)—all it takes to scale the model is to add more containers and servers. In most enterprise settings, CPUs work fine for traditional model inferencing. But hosting LLMs is a much more complex process that requires additional considerations.
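
The scale-out pattern for traditional models can be sketched in miniature. Here each "replica" is just a Python function standing in for a container behind a load balancer, and the doubling "model" is a placeholder:

```python
from itertools import cycle

# Toy sketch of horizontally scaled, stateless inference containers.
def make_replica(name):
    def predict(x):
        return {"replica": name, "score": x * 2}  # placeholder model
    return predict

# Scaling out is just adding replicas to the pool; the model itself
# never changes -- this is why CPU-based traditional inferencing
# scales so simply.
replicas = [make_replica(f"replica-{i}") for i in range(3)]
router = cycle(replicas)  # naive round-robin load balancing

results = [next(router)(x) for x in [1, 2, 3, 4]]
```

With three replicas, the fourth request wraps back to `replica-0`; in production the router would be a real load balancer and each replica a container on an inferencing server.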

LLMs operate on tokens—the basic subword units a model uses to generate human-like language. They generally make predictions token by token in an autoregressive manner, conditioning on previously generated tokens until a stop token is reached. The process can become cumbersome quickly: tokenizations vary based on the model, task, language, and computational resources. Engineers deploying LLMs need not only infrastructure experience, such as deploying containers in the cloud; they also need to know the latest techniques to keep inferencing costs manageable and meet performance SLAs.
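
The token-by-token loop can be sketched as follows. Here `next_token` is a stand-in for a real model's forward pass, returning canned continuations for illustration; a real LLM would instead produce a probability distribution over its vocabulary at each step.

```python
# Toy autoregressive decoding loop with a canned "model".
def next_token(context):
    canned = {
        (): "The",
        ("The",): "shipment",
        ("The", "shipment"): "arrived",
        ("The", "shipment", "arrived"): "<stop>",
    }
    return canned.get(tuple(context), "<stop>")

def generate(max_tokens=16, stop="<stop>"):
    tokens = []
    for _ in range(max_tokens):
        tok = next_token(tokens)  # each prediction conditions on all prior tokens
        if tok == stop:
            break
        tokens.append(tok)
    return tokens

print(generate())  # ['The', 'shipment', 'arrived']
```

The sequential dependency is the key point: each token must wait for the one before it, which is why LLM inference latency and cost scale with output length in a way traditional one-shot model predictions do not.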

Vector databases as knowledge repositories

Deploying LLMs in an enterprise context means establishing vector databases and other knowledge bases that work together in real time with document repositories and language models to produce reasonable, contextually relevant, and accurate outputs. For example, a retailer may use an LLM to power a conversation with a customer over a messaging interface. The model needs access to a database with real-time business data to call up accurate, up-to-date information: recent interactions, the product catalog, conversation history, return policies, recent promotions and ads in the market, customer service guidelines, and FAQs. These knowledge repositories are increasingly built as vector databases, which enable fast retrieval against queries via vector search and indexing algorithms.
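
A brute-force sketch of the retrieval idea, assuming made-up three-dimensional embeddings (a production vector database would use approximate-nearest-neighbor indexes over high-dimensional embeddings instead of exhaustive scoring):

```python
from math import sqrt

# Hypothetical knowledge base: document name -> toy embedding vector.
knowledge_base = {
    "return policy": [0.9, 0.1, 0.0],
    "product catalog": [0.1, 0.8, 0.3],
    "current promotions": [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def retrieve(query_embedding, k=1):
    """Return the k documents most similar to the query embedding."""
    scored = sorted(knowledge_base.items(),
                    key=lambda item: cosine(query_embedding, item[1]),
                    reverse=True)
    return [doc for doc, _ in scored[:k]]

# A customer question about refunds would embed close to "return policy".
print(retrieve([0.8, 0.2, 0.1]))
```

The retrieved documents are then injected into the LLM's prompt as context, which is what lets the model answer with current business data rather than only what it saw in training.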

Training and fine-tuning with hardware accelerators

LLMs have an additional challenge: fine-tuning for optimal performance against specific enterprise tasks. Large enterprise language models could have billions of parameters. This requires more sophisticated approaches than traditional ML models, including a persistent compute cluster with high-speed network interfaces and hardware accelerators such as GPUs (see below) for training and fine-tuning. Once trained, these large models also need multi-GPU nodes for inferencing with memory optimizations and distributed computing enabled.
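
As a rough illustration of why multi-GPU inferencing becomes necessary, consider the weight memory alone. The model size, byte count, and card capacity below are illustrative assumptions, not a sizing guide; activations, KV cache, and framework overhead add substantially more.

```python
import math

def weights_memory_gb(params_billions, bytes_per_param=2):
    """Memory for model weights only, assuming 16-bit (2-byte) parameters."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# A hypothetical 70B-parameter model needs ~140 GB just for its
# 16-bit weights -- more than a single 80 GB accelerator can hold.
weights = weights_memory_gb(70)
gpus_needed = math.ceil(weights / 80)  # ceiling against 80 GB cards
print(f"{weights:.0f} GB of weights -> at least {gpus_needed} GPUs")
```

This is the arithmetic behind the need for multi-GPU nodes, memory optimizations, and distributed computing once models reach billions of parameters.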

To meet computational demands, organizations will need to make more extensive investments in specialized GPU clusters or other hardware accelerators. These programmable hardware devices can be customized to accelerate specific computations such as matrix-vector operations. Public cloud infrastructure is an important enabler for these clusters.

A new approach to governance and guardrails

Risk mitigation is paramount throughout the entire lifecycle of the model. Observability, logging, and tracing are core components of MLOps processes, which help monitor models for accuracy, performance, data quality, and drift after their release. This is critical for LLMs too, but there are additional infrastructure layers to consider.

LLMs can “hallucinate,” occasionally outputting false information. Organizations need proper guardrails—controls that enforce a specific format or policy—to ensure LLMs in production return acceptable responses. Traditional ML models rely on quantitative, statistical approaches for root cause analyses of model inaccuracy and drift in production. With LLMs, this is more subjective: it may involve running a qualitative scoring of the LLM’s outputs, then checking them against an API with pre-set guardrails to ensure an acceptable answer.
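
A minimal sketch of an output guardrail, with hypothetical blocked patterns and a canned fallback response. Real guardrail frameworks layer many such checks (format, topic, toxicity, PII) and may themselves invoke a scoring model.

```python
import re

# Post-generation policy checks applied before a response reaches the user.
BLOCKED_PATTERNS = [
    re.compile(r"\bssn\b", re.IGNORECASE),             # never echo SSNs
    re.compile(r"guaranteed returns", re.IGNORECASE),  # no financial promises
]

def apply_guardrails(llm_output, fallback="I'm sorry, I can't help with that."):
    """Return the LLM's output if it passes policy, else a safe fallback."""
    if any(p.search(llm_output) for p in BLOCKED_PATTERNS):
        return fallback
    return llm_output

print(apply_guardrails("Your order ships Tuesday."))          # passes through
print(apply_guardrails("This fund has guaranteed returns!"))  # replaced
```

In practice the check layer would also log rejected outputs, feeding the qualitative scoring loop described above.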

Governance of enterprise LLMs will be both an art and a science, and many organizations are still working out how to codify it into actionable risk thresholds. With new advances emerging rapidly, it’s wise to experiment with both open-source and commercial solutions that can be tailored to specific use cases and governance requirements. This requires a very flexible ML platform, with a highly abstracted control plane as its foundation. That abstraction allows the platform team to add or subtract capabilities, and keep pace with the broader ecosystem, without affecting its users and applications. Capital One views building out a scaled, well-managed platform control plane with high levels of abstraction and multitenancy as critical to addressing these requirements.

Recruiting and retaining specialized talent

Depending on how much context the LLM is trained on and the tokens it generates, performance can vary significantly. Training or fine-tuning very large models and serving them in production at scale poses significant scientific and engineering challenges. This will require companies to recruit and retain a wide array of AI experts, engineers, and researchers.

For example, deploying LLMs and vector databases for a service-agent assistant used by tens of thousands of employees across a company means bringing together engineers experienced in domains such as low-latency/high-throughput serving, distributed computing, GPUs, guardrails, and well-managed APIs. LLMs also depend on well-tailored prompts to provide accurate answers, which requires sophisticated prompt engineering expertise.
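
As a rough illustration of what prompt engineering involves, here is a hypothetical template that constrains the model to supplied business context. All names and text are illustrative; real templates are iterated on and evaluated extensively.

```python
# A tailored prompt grounds the model in retrieved business context and
# instructs it to refuse rather than guess -- one common hallucination
# mitigation.
PROMPT_TEMPLATE = """You are a customer-service assistant.
Answer using ONLY the context below. If the answer is not in the
context, say you don't know.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context, question):
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    context="Returns are accepted within 30 days with a receipt.",
    question="Can I return an item after six weeks?",
)
print(prompt)
```

The same underlying model behaves very differently depending on the instructions and context it is given, which is why prompt design is treated as an engineering discipline in its own right.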

A deep bench of AI research experts is required to stay abreast of the latest developments, build and fine-tune models, and contribute research to the AI community. This virtuous cycle of open contribution and adoption is key to a successful AI strategy. Long-term success for any AI program will involve a diverse set of talent and experience, combining data science, research, design, product, risk, legal, and engineering experts who keep the human user at the center.

Balancing opportunity with safeguards

While it is still early days for enterprise LLMs and new technical capabilities evolve on a daily basis, one of the keys to success is a solid foundational ML and AI infrastructure.

AI will continue to advance rapidly, particularly in the LLM space. These advances promise to be transformative in ways that haven’t been possible before. As with any emerging technology, the potential benefits must be balanced with well-managed operational practices and risk management. A targeted MLOps strategy that considers the entire spectrum of models can offer a comprehensive approach to accelerating broader AI capabilities.

This content was produced by Capital One. It was not written by MIT Technology Review’s editorial staff.