
Big Data for Telco – Where is the Data?

If you’ve worked in OSS for more than 5 minutes you know that data, the availability of it, and its accuracy, is a bit of a thing. So, if you’re thinking about ‘doing’ some big data as part of an OSS or BSS project, what data-related challenges are you likely to face, and what is the reward for your efforts?

A few months ago I attended Informa’s Telco Big Data and Real Time Analytics event in London. My assumption had been that there would be a great deal of focus on customer data. Pretty much every analytics-themed telco session I had been to in the past focused on the classic topics of determining which customers are likely to churn (a nicer way of saying ‘leave you’) and what products you should try to up-sell to each customer. Typically this is based on call/billing records, purchase history, payment records, and CDRs: all data that is relatively easy to acquire and available no matter which OSS/BSS/hardware vendors’ systems you have.

Network, OSS and operational data is much less consistent and much harder to acquire. So I was pleased to find that these issues were covered by a number of presenters at Telco Big Data, including case studies from Sprint and Telefonica.

Telco data is divided into two broad categories, familiar to any data analyst:

  • Structured data, which is easily processed because it is held in a consistent format (XML, spreadsheets, databases) and is usually available over system interfaces. Examples include: Call records; billing records; electronic records (IP, service); location records; inventory; MIBs
  • Unstructured data, which has no consistent, logical structure making it hard to derive insight from its content without sophisticated, specialist processing. Examples include: Voice calls; texts; blog posts; web content; file downloads; apps; media content

You could also categorise these as ‘data’, the actual content flowing across the network, and ‘meta-data’, data describing the properties, sources, costs, etc. relating to the content data.

The structured meta-data is usually found in OSS and BSS systems. In OSS, the data quality and format will vary greatly between CSPs as each has their own software vendors (different inventory databases) and hardware vendors (different NMS interfaces, device MIBs, and stat collection capabilities).

The cost of acquiring OSS data is often high. There’s the ‘integration tax’, referring to the development and data cleansing needed to extract and use available data. And, often, OSS vendors and hardware vendors will charge a license fee to allow use of their APIs to get at the data that’s there, sometimes costing several hundreds of thousands of dollars.
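To give a feel for what that ‘integration tax’ looks like in practice, here is a minimal Python sketch of normalising inventory exports from two vendors into one common record. Everything in it is an assumption made up for illustration: the two vendor formats, the field names (`HOSTNAME`, `SITE_CODE`, `node`) and the `Device` record are hypothetical, not taken from any real OSS product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Common inventory record (hypothetical schema)."""
    hostname: str
    site: str
    vendor: str


def from_vendor_a(row: dict) -> Device:
    # Hypothetical vendor A export: upper-case keys, padded values.
    return Device(
        hostname=row["HOSTNAME"].strip().lower(),
        site=row["SITE_CODE"].strip().upper(),
        vendor="A",
    )


def from_vendor_b(row: dict) -> Device:
    # Hypothetical vendor B export: a single "name@site" field.
    name, _, site = row["node"].partition("@")
    return Device(
        hostname=name.strip().lower(),
        site=site.strip().upper(),
        vendor="B",
    )


def merge(*sources):
    """Deduplicate by hostname, keeping the first record seen."""
    merged = {}
    for source in sources:
        for device in source:
            merged.setdefault(device.hostname, device)
    return list(merged.values())


# Made-up sample rows; the same router appears in both exports.
vendor_a_rows = [{"HOSTNAME": " LON-RTR-01 ", "SITE_CODE": "lon1"}]
vendor_b_rows = [{"node": "lon-rtr-01@LON1"}, {"node": "man-sw-07@man2"}]

inventory = merge(
    (from_vendor_a(r) for r in vendor_a_rows),
    (from_vendor_b(r) for r in vendor_b_rows),
)
for device in inventory:
    print(device)
```

Even in this toy case, each source needs its own adapter plus a rule for resolving duplicates. Real integrations multiply that by every vendor, interface and data quality quirk, which is where the cost comes from.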

In the unstructured data, telcos have access to real social data that the likes of Facebook and Google can currently only dream of – relationships, family, work patterns and so on are available, as is accurate location data. It was suggested at the Telco Big Data event that with relatively simple analysis it is even possible to determine the sex and age of customers, even if they aren’t named on the household bill.

Of course, monitoring and analysing data in this way quickly leads to privacy questions. I’m not going to attempt to discuss them here.

If big data analysis is done with appropriate care and respect for the customer’s privacy, it can pay off for the service provider. At the Telco Big Data event, Sprint claimed they were already generating $10 million in revenue by selling market insight about their customers externally; many other CSPs claimed double-digit percentage increases in product sales from better targeting; and Guavus, a vendor of telco-focused analytics software, claimed to have reduced network equipment costs for one customer from $1bn to $54m, based on a better understanding of how the network is being used.

What stood out at Informa’s Telco Big Data event was the relative maturity of the solutions being implemented, the quality of the case study results and the pragmatic approach to identifying and leveraging available data sources. Unlike emerging technologies such as SDN, big data is ready for telco now, and has proven cost-cutting and revenue-increasing benefits.

PS. Some would argue the correct title should be “where are the data”. I checked. Use of ‘data’ in the singular is valid; it’s just that many in-house journo/publishing style guidelines standardise on using it only in the plural.

 
