Alternative Data: How to Collect It?
“Alternative” data refers to the footprint users leave while surfing the internet, using financial instruments, or filing documents. Like any data, it can inform decisions and discoveries in business and science. The main challenge is balancing its use against the privacy and security of everyone involved. Here are some ways to do it.
What Is Alternative Data?
“Alternative” data is a term coined for lack of a better one at the time. It simply meant “data gathered by novel methods instead of the historical, traditional ones.”
Alternative data acquisition uses tracking, automation, and computer models to get and process data from the internet and online services.
The scale at which business is conducted has grown thanks to more accessible financial tools, borderless expansion of markets, and zero-cost replication of digital products. With it, the scale of gathering insight into the customer, the ecosystem, and market conditions has changed, too.
One example: in the past, it was possible to meet with a credit applicant in person and evaluate their situation firsthand. At the scale on which financial tools are sold and used now, in-person evaluations have become largely impractical. Instead, the information comes from multiple online accounts, mobile service companies, digital bill payments, and more.
By now, what used to be “alternative” data can easily be considered the primary source for many population segments.
Why Is Alternative Data Useful?
For the business, it is essential. The evolving environment requires data that is native and relevant to the users’ behaviors and lifestyles. It also has to be feasible for the businesses to acquire and keep up to date, which calls for automation.
More relevant and recent insight means more accurate forecasting, more attractive offer customization, and, ultimately, competitive success.
For the customer, it cuts both ways. On the one hand, the data-gathering process is far more seamless. Most trackers are not disruptive, and the data exchange takes place without disturbing the customer. On the other hand, it means less control over one’s data and potential infringement of privacy behind the scenes.
Alternative data is bringing advantages to the users of financial instruments. As an example, the credit scoring system in the US has long relied on a few credit bureaus supplying information to all institutions. A low score or no score at all meant no access to mainstream financial instruments and overpriced credit, with interest rates of up to 30%, from “loan sharks.” The system was far from inclusive: many individuals who had not lived in the US from birth or had no access to personal banking were falling through the gaps.
Alternative data provides banks and credit unions with rich context for more granular risk assessment. Advanced models try to forecast borrower risk from customers’ shopping habits, phone bills, and online behaviors. Hopefully, this will open access to credit for previously underserved populations.
Types of Alternative Data by Sources
There are many sources for alternative data, and more appear every year. There are business models built entirely on collecting, processing, and selling customer data and extracted insights.
And, of course, you can always collect your own data, depending on what exactly you want to learn.
The Behavior of the Audiences by Segments
The source you have the most control over is your digital property: websites, blogs, and social media. It is a well-understood and fairly well-accepted method of gathering information for online shops and services.
You can install trackers, observe the customers’ typical behaviors, conduct experiments, and optimize the design to lead customers to your target action points.
You can then segment the audience by the size of the basket or associated risks and manage each segment differently, improving both business outcomes and access to opportunities for the customers.
You can go further and set up multiple proxy types to conduct marketing experiments and gather insights across different geo markets and media delivery channels.
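The basket-size segmentation mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended model: the order records, the customer IDs, and the two cut-off values are all assumptions made up for the example, and real thresholds would have to come from your own data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical order records: (customer_id, basket_total). Illustrative values only.
orders = [
    ("c1", 12.50), ("c1", 18.00),
    ("c2", 95.00),
    ("c3", 240.00), ("c3", 310.00),
]

# Assumed cut-offs for the example; derive real ones from your own data.
SMALL_MAX = 50.0
MEDIUM_MAX = 200.0


def segment_by_basket(orders, small_max=SMALL_MAX, medium_max=MEDIUM_MAX):
    """Group customers into small/medium/large segments by average basket total."""
    totals = defaultdict(list)
    for customer_id, basket_total in orders:
        totals[customer_id].append(basket_total)

    segments = {"small": [], "medium": [], "large": []}
    for customer_id, values in totals.items():
        average = mean(values)
        if average <= small_max:
            segments["small"].append(customer_id)
        elif average <= medium_max:
            segments["medium"].append(customer_id)
        else:
            segments["large"].append(customer_id)
    return segments


if __name__ == "__main__":
    print(segment_by_basket(orders))
    # {'small': ['c1'], 'medium': ['c2'], 'large': ['c3']}
```

Averaging per customer, rather than looking at single orders, keeps one unusually large purchase from moving a customer into a different segment.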
Online and Omnichannel Shopping Trends
Large marketplaces monetize customer data by providing insight into the trends and behaviors across multiple categories. Businesses use this insight to adjust their product mix, negotiate with vendors, and remain competitive.
Geolocation and Movement Data
If you carry a phone in your pocket or wear a smartwatch on your wrist, the entire history of your movement is available. Devices with GPS chips are not the only ones capable of geolocation: mobile networks, WiFi, and even Bluetooth all put dots on the map when they attempt to connect to nearby hotspots.
Combined with the demographic data from the forms you fill out when purchasing a SIM card or signing a contract with a mobile network, this data becomes one of the most accurate and complete profiles of someone’s lifestyle.
This data can be used to evaluate population by area, target customers with geo-relevant offers, make investment decisions for property, transportation, and infrastructure, and even improve safety in tourist areas.
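To make geo-relevant targeting concrete, here is a minimal Python sketch that finds which users had at least one location ping within a given radius of a point of interest. It uses the standard haversine great-circle formula. The store coordinates, user IDs, pings, and the 2 km radius are hypothetical values chosen for illustration, and the sketch assumes the data is consented and pseudonymized.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def users_near(store, pings, radius_km):
    """Return sorted IDs of users with at least one ping within radius_km of store."""
    nearby = set()
    for user_id, lat, lon in pings:
        if haversine_km(store[0], store[1], lat, lon) <= radius_km:
            nearby.add(user_id)
    return sorted(nearby)


# Hypothetical point of interest and pseudonymized pings: (user_id, lat, lon).
store = (52.5200, 13.4050)
pings = [
    ("u1", 52.5205, 13.4049),  # a few dozen metres away
    ("u2", 52.4000, 13.1000),  # tens of kilometres away
    ("u3", 52.5300, 13.4100),  # roughly one kilometre away
]

if __name__ == "__main__":
    print(users_near(store, pings, radius_km=2.0))
```

The same distance check, aggregated by area instead of by user, is one way to approximate population counts without exposing individual tracks.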
Intent and Interests Trends
Search engines and social media are a powerful source of insight on everything from consumer behaviors to sentiment. For the last couple of generations, a natural response to a question or wish for many of us has been “to go search for it.”
Google and most other search engines publish free reports on the insights they generate, make general statistics on search trends available via analytics accounts, and use precise behavioral and demographic targeting in their advertising tools.
Just as users reveal their interests and intent on search engines, they do not hesitate to share the smallest opinions and most significant life events online, enriching them with pictures, locations, and ways to contact them.
Social media and job search websites are notorious for tracking and selling customer data, including private exchanges. In most cases, the data is anonymized. Any single piece of this information is short-lived, but collected at scale, it reveals waves of public sentiment that are used in politics, brand loyalty, and behavioral investing.
Exploiting the reactivity of public sentiment and targeting emotions have become staples of the not-so-scrupulous tactics of political populism. There are more direct channels and more propaganda instruments for manipulating and directing public opinion than ever before.
Bank Card Activity, Billing, and Online Shopping
Paying by tapping a phone or a card is incredibly convenient for users and just as convenient for analysts. The information about purchases and returns goes straight to the database.
Detailed and personal tracking of purchasing behavior enabled automated borrower risk evaluation and led to the development of instant micro-credit and point-of-sale financial tools.
Accessing the card activity databases is expensive, and most banks and companies utilize them internally rather than selling them. However, multiple “perks and benefits” apps that create a layer on top of your regular banking, shopping, and billing might not be as picky.
Data from Public Websites and Plain Facts
Private information is helpful, but publicly available information can be insightful, too. Brands and scientific institutions make their product data as readily available as possible, and many users share their opinions without any care for privacy. This information is utilized for investment, marketing, and social sciences.
How to Collect Alternative Data?
You can always do some searching and collecting by hand, but once the data you need grows too large or changes too rapidly, you need a tool or a provider to do the collecting for you.
Analytical Tools and Tracking
Analytical tools evolve to keep up with the transition of our activities from offline to online. You can track pretty much everything: from emails to interaction with website elements, from cursor movement to customer attention patterns.
With opt-in, you can record the user’s interaction with your app or game in real time. A simple app that improves your T9 and grammar can also record everything you type on your smartphone. Browsers record which pages you visited, and media websites record how you reacted to videos.
Online activity is ultimately trackable and traceable. We need to be honest with ourselves: the best way to protect our privacy is not through a VPN or a proxy but by having deliberate and conscious control over what we say or do. Proxies and VPNs are instrumental in accessing information and opportunities, but they do not replace our common sense.
Third-party Licensing and Alternative Data Vendors
Many business models make money by selling data, explicitly or not. Next time you install a free product, think about its business model. Businesses need to survive, pay for their hardware, and maintain their staff. Unless the product is run as a charity, all of that has to be funded somehow. In many cases, customer data and insights are the low-hanging fruit.
Among the most significant, if little-noticed, suppliers of alternative data in the US are mobile network operators. They sell non-anonymized customer data to financial institutions, which helps to provide a rich context for underwriting and credit decision-making.
Web Scraping with Proxies
With a web scraper and a proxy server, you can collect information from various public websites, including plain facts such as brand and product information and competitor prices. Public social media posts are a source of public sentiment that does not require infringing on privacy, and the repeated structure of individual pages makes the task easier. In practice, you need three things: a proxy server, a web scraping tool, and the infrastructure to maintain the connection.
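As a rough illustration of the scraper-plus-proxy setup, here is a minimal Python sketch that uses only the standard library. The proxy address, the user agent string, and the `product-name` and `product-price` class names are assumptions invented for the example; a real page will use its own markup, and a real proxy endpoint comes from your provider. The demo at the bottom parses an inline HTML sample, so it runs without any network access.

```python
import urllib.request
from html.parser import HTMLParser

# Hypothetical proxy endpoint; replace with the one your provider gives you.
PROXY_URL = "http://proxy.example.com:8080"


def fetch_via_proxy(url, proxy_url=PROXY_URL, timeout=10):
    """Download a page through an HTTP(S) proxy and return its decoded text."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    request = urllib.request.Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with opener.open(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class PriceParser(HTMLParser):
    """Collect (name, price) pairs from elements with the assumed class names."""

    def __init__(self):
        super().__init__()
        self._field = None  # field expected in the next text node, if any
        self._name = None   # last product name seen, waiting for its price
        self.products = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "product-name" in classes:
            self._field = "name"
        elif "product-price" in classes:
            self._field = "price"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "name":
            self._name = text
        elif self._field == "price" and self._name is not None:
            self.products.append((self._name, text))
            self._name = None
        self._field = None


def parse_products(html):
    """Return a list of (name, price) string pairs found in the HTML."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.products


# Inline sample standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li><span class="product-name">Widget</span> <span class="product-price">9.99</span></li>
  <li><span class="product-name">Gadget</span> <span class="product-price">24.50</span></li>
</ul>
"""

if __name__ == "__main__":
    print(parse_products(SAMPLE_HTML))
    # For a live page: parse_products(fetch_via_proxy("https://example.com/catalog"))
```

Keeping the download and the parsing in separate functions means the parser can be tested on saved pages, and the proxy can be swapped without touching the extraction logic.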
Advanced technology should be balanced by advanced ethics. We do not advocate for breaking the trust and boundaries of others. However, where there are ways to serve customers better and offer solutions to underserved populations, data, alternative or not, comes in handy.
Remember to conduct your customer interviews, too: quantitative information without qualitative context can quickly become misleading, and large-scale statistics have little predictive value for any single individual.