
Stay with SaaS or move to DaaS? The pros and cons of using emerging data services

Data-as-a-Service (DaaS) has become a hot commodity, with companies around the world enjoying access to multiple rapidly growing, full-spectrum data marketplaces provided by the likes of IBM and Google, as well as S&P Global, Snowflake, and others.

Their rising popularity is driven by hedge funds, investors, and financial services companies, which are increasingly discovering how to leverage these rich datasets to make informed decisions.

Governments around the world have also been quick to adopt DaaS practices, especially when it comes to data monetization drives in such fields as health care and smart city projects.

My aim with this article is to explain what DaaS is, what it is packing “under the hood”, and what data collection alternatives are currently available to businesses and organizations.

What is DaaS?

In a nutshell, Data-as-a-Service is a cloud-based type of software that provides users with multi-source data on demand via APIs, rather than as a standalone product. In other words, by paying a subscription fee based on data usage, businesses gain access to numerous data sets rated by other users, making it easy for them to find what they need.

DaaS facilitates the consolidation of enterprise data in one place and, unlike other data management frameworks, doesn’t require users to have extensive on-premises IT infrastructure or expertise to store, manage, retrieve, and otherwise handle massive amounts of data. As of 2024, there are a total of 60 publicly traded DaaS companies.

As with any other technology, DaaS comes with its own set of pros and cons. On the pro side, it has been used to reduce licensing costs, streamline workloads by leveraging cloud services, speed up software development, create enterprise benchmarking reports, and boost the efficiency of business intelligence.

DaaS’s cons, meanwhile, include risks like data privacy breaches, security violations when dealing with sensitive data, and the low quality of granular, niche data types.

What powers DaaS?

DaaS makes extensive use of alternative data. How do we define that, though? It helps to start with its counterpart: traditional data can be defined as pretty much all publicly available, structured data produced under legal and official supervision. This would include data from statistics departments, press releases, financial statements, and so forth.

Alternative data, on the other hand, is typically unstructured, stored in multiple different formats (from blocks of text to video clips), and extracted for specific purposes. Some businesses — particularly financial services and investment companies — are highly reliant on exactly this type of data already. A key reason for this is that alternative data often contains unique investment signals that aren’t present in its traditional counterpart.

The most common method of collecting alternative data is web scraping, a technique that traces its roots to the birth of the World Wide Web around 1989. In basic terms, web scraping uses specialized software to automatically copy large amounts of unstructured public data from websites and transfer it to a central database or spreadsheet, where it is converted into structured data for later analysis. This form of data gathering is arguably the main engine behind the power of DaaS marketplaces.
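To make the "unstructured page in, structured rows out" step concrete, here is a minimal sketch using only Python's standard library. The HTML fragment, the `product`, `name`, and `price` class names, and the `ProductParser`, `scrape`, and `to_csv` helpers are all assumptions made up for illustration; a real scraper would download pages over HTTP and would need to match each site's actual markup.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would fetch this over HTTP.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">$9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">$24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects one dict per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # class of the <span> currently being read

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "product":
            self.rows.append({})
        elif tag == "span" and cls in ("name", "price") and self.rows:
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape(html):
    """Turn raw HTML into a list of typed records."""
    parser = ProductParser()
    parser.feed(html)
    return [
        {"name": r["name"], "price": float(r["price"].lstrip("$"))}
        for r in parser.rows
    ]


def to_csv(rows):
    """Serialize the records as CSV text, ready for a spreadsheet or database load."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = scrape(SAMPLE_HTML)
print(to_csv(rows))
```

Production scrapers typically add request throttling, proxy rotation, and a more forgiving parser, but the shape of the job stays the same: locate the elements, extract the text, coerce it into typed fields.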

Web scraping has been rapidly rising in popularity over the past decade. As global competition continues to heat up, more and more companies are turning to web scraping to make better-informed business decisions and gain an edge over their competitors. And with vast amounts of data being generated every single day (around 2.5 quintillion bytes, by one commonly cited estimate), the ways of collecting alternative data and putting it to good use are potentially limitless.

To sum up, DaaS marketplaces often use web scraping to collect alternative data, which may contain unique signals, and provide their customers with powerful cloud infrastructure they can use to analyze it according to their individual needs.

SaaS, DaaS, or… ?

Businesses and organizations have three ways of acquiring the alternative data they need. Let’s take a brief look at each in turn.

Web scraping with SaaS

The first method entails the use of web scraping software (e.g., via a customized scraper API) and an ethical proxy network from a reliable provider. This in-house approach comes with a substantial learning curve and requires a good deal of maintenance — e.g., when a website changes its HTML structure and content, the scraping pipelines can break. There are also potential legal issues, such as accidentally scraping sensitive data.
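One common way to soften the maintenance burden is to check every scrape run for signs that the page layout has changed, rather than silently loading empty or partial rows. The sketch below assumes scraped records arrive as dictionaries; the field names and the `find_extraction_problems` helper are hypothetical.

```python
REQUIRED_FIELDS = ("name", "price")  # hypothetical schema for the scraped records


def find_extraction_problems(rows, required=REQUIRED_FIELDS, min_rows=1):
    """Return human-readable warnings that suggest the site's HTML has changed."""
    problems = []
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")
    for i, row in enumerate(rows):
        # A field counts as missing if it is absent or an empty string.
        missing = [f for f in required if row.get(f) in (None, "")]
        if missing:
            problems.append(f"row {i} is missing: {', '.join(missing)}")
    return problems


healthy = [{"name": "Widget", "price": 9.99}]
broken = [{"name": "Widget"}]  # e.g. the price element was renamed on the site

print(find_extraction_problems(healthy))
print(find_extraction_problems(broken))
print(find_extraction_problems([]))
```

Wiring a check like this into an alert means a layout change is noticed on the first failed run, not weeks later in a skewed report.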

The main benefit of web scraping with SaaS is that you actually control the data you get. This makes it easier to control its quality, to target specific data, and to maintain security, privacy, and compliance, especially if you’re gathering sensitive information. In addition, you get your data fast and at relatively low cost, and it comes in a structured form.

Buying ready-made datasets

The second method is to buy ready-made datasets that match your needs. While this does mean that you’ll be dependent on a specific vendor (or vendors) for updates and support, and that your customization options will be somewhat limited, you also get many benefits. For instance, since you’re buying a finished product, you can start using it immediately: no in-house data collection is required, and there is no infrastructure-related overhead.

Ready-made datasets are relatively cheap, as licensing or subscription fees replace substantial upfront investments. Additionally, since you’re getting your data pre-packaged by a reputable vendor, it’s likely to be of high quality. This is because companies that sell datasets use various data validation techniques to ensure their accuracy.
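What such validation looks like varies from vendor to vendor, and the article does not describe any specific pipeline. As a rough illustration only, the sketch below applies three generic checks (duplicates, out-of-range values, and stale records) to a made-up price dataset; the `ticker`, `as_of`, and `price` fields and the 30-day freshness threshold are assumptions.

```python
from datetime import date


def validate_records(records, today, max_age_days=30):
    """Split records into (valid, rejected), where rejected holds (record, reason) pairs."""
    seen = set()
    valid, rejected = [], []
    for rec in records:
        key = (rec["ticker"], rec["as_of"])
        if key in seen:
            rejected.append((rec, "duplicate"))
        elif rec["price"] <= 0:
            rejected.append((rec, "non-positive price"))
        elif (today - rec["as_of"]).days > max_age_days:
            rejected.append((rec, "stale"))
        else:
            valid.append(rec)
        seen.add(key)
    return valid, rejected


# Made-up sample data for illustration.
records = [
    {"ticker": "AAA", "as_of": date(2024, 5, 1), "price": 10.0},
    {"ticker": "AAA", "as_of": date(2024, 5, 1), "price": 10.0},  # exact repeat
    {"ticker": "BBB", "as_of": date(2024, 5, 1), "price": -1.0},  # impossible value
    {"ticker": "CCC", "as_of": date(2024, 1, 1), "price": 5.0},   # too old
]

valid, rejected = validate_records(records, today=date(2024, 5, 10))
for rec, reason in rejected:
    print(rec["ticker"], reason)
```

Buyers can run similar spot checks on a sample before committing to a subscription, which is a cheap way to test a vendor's quality claims.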

Sourcing from a DaaS marketplace

Finally, you may opt for getting your data from a DaaS marketplace, which combines data collection, storage, and management. Since we’ve already covered “vanilla” DaaS, it might be worthwhile mentioning Big Data-as-a-Service (BDaaS) here. In basic terms, what you get with BDaaS is not simply more data, but also a whole data analytics package designed to help companies extract the insights they need.

Given that, according to current projections, the BDaaS market value will reach over $52 billion by 2026, you might want to keep an eye on this in the future. For now, however, if you don’t have much experience in working with data, it’s probably best to sit this one out.

Final word

Ultimately, which method is best depends on your needs. Can you get the data you require in the form of standalone datasets? Does it make sense to collect it yourself using a custom SaaS web scraper? Or perhaps you need access to larger pools of data via something like BDaaS?

Before making that all-important decision, make sure you’ve established exactly what type of data you actually require, and what the easiest, most cost-effective way of obtaining it is.


This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro