How does a customer data platform work?

How exactly does a customer data platform (CDP) work, and how does it help marketers use data to gain a better, more accurate understanding of customer behaviour?

Ingesting and integrating data

The first step is ingesting data. CDPs ingest customer data from multiple sources. Typically, these include website, paid digital, transactional, direct mail, retail, email and call centre data. All data received by a CDP relates in some shape or form to a customer. The data is usually sent to a CDP via an API or an SFTP site.

Customers have multiple identifiers, such as mobile phone number, email address, cookie ID, postal address, customer reference or landline number, and these change over time. This data is collected, and the identifiers are used to generate a single customer view, a process also known as ‘identity resolution’. For example, if someone logs into your website with their current email address but a different cookie ID, the new cookie ID is added to that customer’s record on the assumption that they are using a new device. Equally, if a new transaction record is received with the same customer reference but a new address, the new address is added on the basis that they have either moved or added an extra residence.

As new data is ingested, each record goes through what is called the ‘purning’ process. This is the stage at which the record’s personal identifier(s) are matched against all other customer records that are held in the CDP until a match is or is not found. At this point the data may be matched into an existing single customer view or a new one created. Each recognised customer is given a permanent unique record number or ‘purn’.
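The matching step described above can be sketched in a few lines of Python. This is a minimal illustration only, not UniFida's implementation: the class name `IdentityIndex`, the `resolve` method and the crude normalisation are assumptions made for the example, and it does not handle the case where one incoming record links two previously separate customers.

```python
from itertools import count


class IdentityIndex:
    """Hypothetical lookup from (identifier type, value) to a purn."""

    def __init__(self):
        self._purn_by_identifier = {}
        self._next_purn = count(1)

    def resolve(self, identifiers):
        """Return the purn for an incoming record, creating one if nothing matches.

        `identifiers` is a dict such as {"email": "...", "cookie_id": "..."}.
        """
        # Crude normalisation (assumption): trim whitespace and lower-case.
        keys = [
            (kind, value.strip().lower())
            for kind, value in identifiers.items()
            if value
        ]
        # Look for any identifier that is already attached to a customer.
        purn = next(
            (self._purn_by_identifier[k] for k in keys if k in self._purn_by_identifier),
            None,
        )
        if purn is None:
            purn = next(self._next_purn)  # no match: create a new customer
        # Attach any identifiers not seen before to the resolved customer.
        for k in keys:
            self._purn_by_identifier.setdefault(k, purn)
        return purn
```

Calling `resolve` with a known email and an unseen cookie ID returns the existing purn and links the new cookie ID to it, which mirrors the new-device example earlier in this section.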

Identity resolution

Identity resolution is at the heart of a CDP and is central to the rest of its functionality. A good CDP’s functionality is rooted in the knowledge that people have multiple identifiers, and that many or all of these are likely to change over time. The CDP should keep a history of every version of each identifier, while regarding the latest versions as the most likely to be current. This collection of identifiers is what it calls on to build the single customer view.
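One way to picture the identifier history is as a per-customer store that remembers every version and treats the most recently seen one as current. The sketch below is an assumption about how this could be modelled; the class and method names are hypothetical.

```python
from collections import defaultdict
from datetime import date


class IdentifierHistory:
    """Hypothetical per-customer history of identifier versions."""

    def __init__(self):
        # identifier type -> {value: date the value was last seen}
        self._last_seen = defaultdict(dict)

    def record(self, kind, value, seen_on):
        previous = self._last_seen[kind].get(value)
        if previous is None or seen_on > previous:
            self._last_seen[kind][value] = seen_on

    def current(self, kind):
        """The most recently seen value, regarded as the likely current one."""
        versions = self._last_seen.get(kind)
        if not versions:
            return None
        return max(versions, key=versions.get)

    def all_versions(self, kind):
        """Every value ever seen for this identifier type, oldest first."""
        versions = self._last_seen.get(kind, {})
        return sorted(versions, key=versions.get)


history = IdentifierHistory()
history.record("email", "old@example.com", date(2019, 1, 1))
history.record("email", "new@example.com", date(2021, 6, 1))
```

Older versions stay available for matching, so a late-arriving record carrying the old email address can still be linked to the right customer.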

The data in a CDP is held in what is called a schema. This is the way in which the data is organised. Every organisation using a CDP will need their own schema although within an industry, schemas will have a lot of similarities.

Engineering derived data

Engineered data is important for the value it provides for selecting specific customer groups for communications or developing customer insight. It can comprise any variable that can be calculated using an algorithm or other means from the raw data in the customer data platform.

Data engineering can take many forms, from simple examples like banding variables such as age, to more complex ones like keeping a running total of a customer’s historic value. A major use of engineered data is in developing and recording scores derived from algorithms such as propensity models.
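Banding is the simplest kind of engineered variable. As a rough sketch, with band edges and labels that are purely illustrative assumptions:

```python
import bisect

# Hypothetical band edges; each organisation would choose its own.
AGE_BAND_EDGES = [18, 25, 35, 45, 55, 65]
AGE_BAND_LABELS = ["under 18", "18-24", "25-34", "35-44", "45-54", "55-64", "65+"]


def age_band(age):
    """Map a raw age onto a band label."""
    # bisect_right counts how many edges are <= age, which is the band index.
    return AGE_BAND_LABELS[bisect.bisect_right(AGE_BAND_EDGES, age)]
```

The banded value can then be stored alongside the raw age and used directly in selections or as a chart axis.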

An example of an engineered data field is where we want to know what each customer has contributed to a business after the cost of acquiring them. We can then:

  • Use historic purchase data for each individual in say their first and second year since recruitment
  • Deduct the cost of acquisition, which can be derived from the channel they came in from
  • Deduct the cost of communications sent to them in the same period which is held in the contact history area
  • Calculate an individual customer contribution
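The steps above can be sketched as a single function. The channel costs, field layout and function name are assumptions made for illustration, and purchase value is used as given; an organisation might prefer to substitute margin.

```python
from datetime import date, timedelta

# Hypothetical acquisition cost per recruitment channel.
ACQUISITION_COST_BY_CHANNEL = {
    "paid_search": 12.50,
    "direct_mail": 8.00,
    "organic": 0.00,
}


def customer_contribution(recruited_on, channel, purchases, contacts, years=2):
    """Contribution over the first `years` years since recruitment.

    `purchases` and `contacts` are lists of (date, amount) pairs, where the
    contact amounts are the cost of each communication from contact history.
    """
    window_end = recruited_on + timedelta(days=365 * years)

    def in_window(when):
        return recruited_on <= when < window_end

    revenue = sum(amount for when, amount in purchases if in_window(when))
    contact_cost = sum(cost for when, cost in contacts if in_window(when))
    return revenue - ACQUISITION_COST_BY_CHANNEL[channel] - contact_cost


contribution = customer_contribution(
    recruited_on=date(2020, 1, 1),
    channel="paid_search",
    purchases=[(date(2020, 3, 1), 40.0), (date(2021, 2, 1), 60.0), (date(2023, 1, 1), 100.0)],
    contacts=[(date(2020, 2, 1), 0.5), (date(2021, 1, 15), 0.5), (date(2023, 2, 1), 0.5)],
)
```

In this example the 2023 purchase and contact fall outside the two-year window, so the contribution is 100.0 of purchases less 12.50 acquisition cost and 1.0 of contact cost.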

Engineered data is updated at an individual level every time a relevant event happens, so each new home shopping purchase, eCommerce transaction or physical retail transaction can lead to a changed score in the engineered data section. A great benefit of engineered data is that it allows you to base chart axes or campaign selections on these additional variables.

Analysing customer data

A CDP is essential for gaining a full and accurate understanding of customer behaviour. For instance, without a CDP that combines web browsing history with transactions, it would not be possible to understand the relationship between the two. Again, if individual contact history is not held against a customer record then the effectiveness of campaigns that are sent to the customer, and to which the customer may respond through different channels, cannot be accurately measured.

The CDP builds the single customer view, and it is against this that customer analysis can take place. It provides the dataset that becomes the one authoritative source of information about customer behaviour for an organisation. With this in place decision makers have a firm basis on which to proceed.

There are so many aspects to the analytical tools that can be used to analyse customer data that there is little merit in trying to list them all. Some are built into the CDP and others require data to be first extracted from the CDP and then transferred to them. What matters is that they have the best possible customer data set to analyse.

So, the results from customer analysis form the basis on which key decisions about customer marketing can be made. These include such areas as:

  • Customer acquisition (targeting and channel choice)
  • Digital planning
  • Product development
  • Customer relationship management
  • Salesforce management
  • Pricing

They can even extend to corporate mergers and company valuations.

Given how important these decisions are, it makes good sense when designing a CDP to start with a list of the results that will be required from customer analysis, so that, for instance, data is held with sufficient granularity to make them possible.

Connectivity to external systems

The CDP can support other systems in their personalisation and management of customer communications. Typical examples are:

  • Providing customer selections for email marketing systems
  • Customer segmentations for web personalisation technology
  • Names and addresses for postal marketing
  • Target audiences for social media

So just as the CDP ingests data from multiple sources it also provides selected data to external systems. These connections are usually made via an API or via transfer of data to an SFTP site.

Delivering personalised customer experiences

Within the CDP we expect to find functionality for selecting specific customer groups, either on a one-off or on a recurring basis. These groups are usually selected for output to external systems that manage the actual communications. The selections themselves can be simple, based on Boolean logic rules, or more complex, based on propensity scores held in the engineered data. They can also be based on triggers, such as a new customer having just been recruited.
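The three kinds of selection can be illustrated with plain predicate functions. The field names, thresholds and sample customers below are assumptions invented for the sketch.

```python
from datetime import date, timedelta


def select_customers(customers, rule):
    """Return the purns of all customers for whom `rule` is true."""
    return [c["purn"] for c in customers if rule(c)]


def boolean_rule(c):
    # Simple Boolean logic: opted in to email AND at least two recent orders.
    return c["email_opt_in"] and c["orders_last_12m"] >= 2


def propensity_rule(c, threshold=0.7):
    # Score-based: uses a propensity score held in the engineered data.
    return c["propensity_score"] >= threshold


def make_trigger_rule(today, days=7):
    # Trigger-based: customers recruited within the last `days` days.
    return lambda c: timedelta(0) <= today - c["recruited_on"] <= timedelta(days=days)


customers = [
    {"purn": 1, "email_opt_in": True, "orders_last_12m": 3,
     "propensity_score": 0.40, "recruited_on": date(2020, 1, 1)},
    {"purn": 2, "email_opt_in": False, "orders_last_12m": 5,
     "propensity_score": 0.90, "recruited_on": date(2021, 5, 28)},
    {"purn": 3, "email_opt_in": True, "orders_last_12m": 1,
     "propensity_score": 0.75, "recruited_on": date(2019, 1, 1)},
]
```

Because each rule is just a function of the customer record, the same selection can be run once or scheduled on a recurring basis.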

The CDP needs to enable these different types of selection, and crucially record what contacts each individual customer has been selected for. Functionality is also required for test and control, and for including source codes with the selection.
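A test and control split with a source code attached to every contact might look like the following. The hashing approach is one common technique chosen for this sketch, not something the platform is known to use; the function names and the 10% control share are assumptions.

```python
import hashlib


def assign_cell(purn, source_code, control_share=0.1):
    """Deterministically place a customer in the test or control cell."""
    digest = hashlib.sha256(f"{source_code}:{purn}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a number in the range [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return "control" if bucket < control_share else "test"


def build_contact_history(purns, source_code, control_share=0.1):
    """One contact history row per selected customer, tagged with the source code."""
    return [
        {
            "purn": purn,
            "source_code": source_code,
            "cell": assign_cell(purn, source_code, control_share),
        }
        for purn in purns
    ]
```

Hashing the purn together with the source code means a customer always lands in the same cell for a given campaign, while different campaigns get independent splits.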

Delivering personalised customer experiences needs to be accompanied by functionality for measuring the results of campaigns. This is often automated within the design of the CDP and should always include the ability to attribute results such as orders back to campaigns, even if customers respond through a different channel from the one in which they were contacted.
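As a rough illustration of attribution, the sketch below credits each order to the most recent campaign the same customer was contacted with in the preceding 30 days, whatever channel the order arrived through. Last-contact attribution and the 30-day window are assumptions chosen for simplicity; other attribution rules are equally possible.

```python
from datetime import date, timedelta


def attribute_orders(contacts, orders, window_days=30):
    """Sum order value per source code using last-contact attribution.

    `contacts` rows need purn, source_code and sent_on; `orders` rows need
    purn, ordered_on and value. The order channel is deliberately ignored.
    """
    window = timedelta(days=window_days)
    value_by_source_code = {}
    for order in orders:
        candidates = [
            c for c in contacts
            if c["purn"] == order["purn"]
            and timedelta(0) <= order["ordered_on"] - c["sent_on"] <= window
        ]
        if not candidates:
            continue  # no qualifying contact: the order stays unattributed
        latest = max(candidates, key=lambda c: c["sent_on"])
        code = latest["source_code"]
        value_by_source_code[code] = value_by_source_code.get(code, 0.0) + order["value"]
    return value_by_source_code


contacts = [
    {"purn": 1, "source_code": "EM01", "sent_on": date(2021, 3, 1)},
    {"purn": 1, "source_code": "DM02", "sent_on": date(2021, 3, 10)},
    {"purn": 2, "source_code": "EM01", "sent_on": date(2021, 3, 1)},
]
orders = [
    {"purn": 1, "ordered_on": date(2021, 3, 15), "value": 50.0, "channel": "web"},
    {"purn": 2, "ordered_on": date(2021, 3, 5), "value": 20.0, "channel": "phone"},
    {"purn": 2, "ordered_on": date(2021, 6, 1), "value": 99.0, "channel": "web"},
    {"purn": 3, "ordered_on": date(2021, 3, 2), "value": 10.0, "channel": "retail"},
]
```

The join works only because contact history and orders share the same customer record number, which is the point the paragraph above makes.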

UniFida is the trading name of Marketing Planning Services Ltd, a London based technology and data science company set up in 2014. Our overall aim is to help organisations build more customer value at less marketing cost.

Our technology focus has been to develop UniFida. Our data science business comes both from existing users of UniFida, and from clients looking to us to solve their more complex data related marketing questions.

Marketing is changing at an explosive speed, and our ambition is to help our clients stay empowered and ahead in this challenging environment.

Are you grasping the moment?

One thing is certain: we won’t forget what we did in these very special coronavirus quarantine days.

And although none of us know how long our confinement is going to last, we do know that it is going to end.

We are experiencing an unprecedented, once-in-a-lifetime chance to do those things we don’t normally do and to get marketing shipshape for when we are eventually let out and the wheels start turning at full speed again.

Freed from the daily commute and the office banter, we can make real progress with the infrastructure and the customer knowledge that should drive marketing, and which is so often ignored when times are busy.

By now you must have started to build a ‘must do when away from the office’ list; and we have a few suggestions of things that you might consider including:

– finding out which of all your marketing initiatives in the last year actually made you money
– finishing the unmentionable GDPR project
– planning future recruitment to avoid black hole areas
– getting customer data into a single customer view
– asking customers what they like and don’t like about what you do
– making sure you have the dashboards you need to steer the ship
– getting the team to use some of the incredible array of free online training resources
– achieving consensus on the five most important areas to focus on once you are free
– and so on, and on. The list can get very long very quickly

We are here to help; as well as our data science and technology arms, we have a marketing consultancy.  Its aim is simply to ensure that you have the tools and the customer knowledge to unlock the most customer value at the least marketing cost.

If you would like us to help you write or deliver on your shortlist, we are here to provide expert support.

For a quick dip into what our consultancy normally covers, please click here to view a short PowerPoint.

Hoping that we can help you make the best use of the quarantine.

What is the biggest reason for firms wanting to comply with GDPR?

It is in fact the need to meet customer expectations, and not, as many of us would have thought, the fear of legal action.

This is the result of a recent TrustArc survey of more than 600 UK and US companies. See the full July 2018 survey here.

Focussing on customer expectations, what we suspect customers actually want is a simple way to access the data that companies hold on them, to check and adjust their consents, and if necessary to exercise the right to be forgotten.

If you can give customers this option by using a simple click through from your website, you will be taking the lead in this important aspect of the customer experience.

And it is with this in mind that we have built this capability into UniFida, our cloud-based customer data platform technology. So when we are connected to your website and customers have authenticated themselves, they can self-serve all the necessary GDPR functionality.

It’s not magic, it’s just technology!

Tech essentials for managing personal data post GDPR

We have tried to keep this at a very high level, but we wanted to share with you our thoughts on the minimum tech you will need:

Please don’t hesitate to make contact if you need help, as we have an affordable, pre-packaged, cloud-hosted solution ready and waiting for you.

The far from simple task of fulfilling subject access requests!

What’s the full process in a GDPR request?

When individuals call into your fulfilment centre, or reach you via email or letter, with a request exercising their rights under GDPR, they will be triggering what is in reality a complex process.

They may alternatively be directly accessing your on-line privacy portal, using self-service, but the steps that they will follow will be broadly the same.

Step one is to have all your data relating to each individual that your organisation deals with joined together into a single customer view. This will need to include on-line data you are holding, like pages browsed linked to cookie IDs, as well as off-line data such as transactions. To make matters more difficult, the personal data may be held in an unstructured form such as emails or reports. It will be far beyond the capabilities of most organisations to have the unstructured data pre-packaged as part of the single customer view, but you will at least need the capability of searching for it.

Step two is to identify that the individual approaching you is who they purport to be. If they reach you by email or letter, you will most probably have a requirement to verify them by checking on some other identifiers you may hold, to avoid handing over personal information to the wrong recipient or making false changes to the information you hold on someone.

Step three is to be able to access what some people are now calling a consent vault: the place where all the opt-ins and opt-outs are held. GDPR has defined the information you need to hold about each consent that has been provided, such as how it was obtained and what statement the individual is agreeing or not agreeing to. The consent vault will, we expect, naturally form part of the single customer view. However, as well as holding the individual consents you will need to interpret them so that you can inform Mrs Smith of what, as things stand, you may or may not use her data for. We suggest developing a set of ‘traffic lights’ that work off the consents already provided, and which give clear guidance about what types of activity may be undertaken by which channel.
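A traffic light view could be derived from the consent vault along these lines. The record layout, the channel list and in particular the meaning given to amber (no consent on record, so review before use) are assumptions made for this sketch.

```python
from datetime import date


def traffic_lights(consents, channels=("email", "sms", "post", "phone")):
    """Summarise the latest consent per channel as green, red or amber.

    `consents` is a list of dicts with channel, granted (bool) and recorded_on.
    """
    lights = {}
    for channel in channels:
        history = [c for c in consents if c["channel"] == channel]
        if not history:
            lights[channel] = "amber"  # assumption: nothing recorded, needs review
            continue
        latest = max(history, key=lambda c: c["recorded_on"])
        lights[channel] = "green" if latest["granted"] else "red"
    return lights


mrs_smith_consents = [
    {"channel": "email", "granted": True, "recorded_on": date(2018, 1, 10)},
    {"channel": "email", "granted": False, "recorded_on": date(2018, 6, 2)},
    {"channel": "post", "granted": True, "recorded_on": date(2018, 2, 5)},
]
```

Here the later withdrawal overrides the earlier email opt-in, so email shows red while post shows green.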

Step four is to allow Mrs Smith to change her consents. This is going to be much easier if you have the traffic light system, as Mrs Smith will have a clear idea of what is in place for her now, and hence what she might want to change. The new consents or withdrawals of consent will need to be captured, and potentially a record of the change sent to Mrs Smith.

Step five comes when Mrs Smith asks for a copy of all the information you hold about her. A relatively easy step if you have the single customer view in place, but a much more difficult one if you don’t. And then if you have unstructured data referring to Mrs Smith this will also need to be searched. There are technology tools around to help your search process if the amount of unstructured data is very considerable or spread over several different systems.

Step six comes when Mrs Smith sees her data and wants to correct it. The corrections will need to be data captured and the changes will need to be communicated to any systems that are upstream of where the single customer view is being held. Good practice will, we expect, be to send Mrs Smith some form of notification of the new details you are holding.

Step seven happens when Mrs Smith exercises her rights to data portability. You will then have to provide her data in machine readable format to another data controller that she specifies. We envisage creating an HTML or equivalent file, and sending it to Mrs Smith by email. The data transferred should include not just data provided by Mrs Smith but data generated by you.

Step eight happens when Mrs Smith exercises her right to be forgotten. In this case you can maintain any non-personal data like transactions relating to her, but you have to delete or overwrite any personal data like email, mobile phone number, postal address, cookie ID etc. As well as deleting them in the single customer view, you will need to inform the upstream systems of the request so that they can do the same thing.
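In code, the erasure step amounts to overwriting the personal fields, leaving non-personal data in place, and producing a list of upstream systems to notify. The field list, record layout and system names below are hypothetical; which fields count as personal needs to be decided per organisation.

```python
# Hypothetical list of fields treated as personal data.
PERSONAL_FIELDS = ("name", "email", "mobile", "postal_address", "cookie_ids")


def erase_personal_data(record, upstream_systems):
    """Return an erased copy of the record plus the notifications to send."""
    erased = dict(record)
    for field in PERSONAL_FIELDS:
        if field in erased:
            erased[field] = None  # overwrite rather than drop the field
    # Each upstream system must be told to carry out the same erasure.
    notifications = [(system, record["purn"]) for system in upstream_systems]
    return erased, notifications


record = {
    "purn": 42,
    "name": "Mrs Smith",
    "email": "smith@example.com",
    "mobile": "07000 000000",
    "transactions": [{"order_id": "A1", "value": 25.0}],
}
erased, notifications = erase_personal_data(record, ["ecommerce", "call_centre"])
```

The transactions survive under the record number, so aggregate reporting still works, but nothing remaining in the record points back to Mrs Smith.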

Step nine involves taking account of Mrs Smith’s requests when it comes to further processing of her data. She may have opted out of profiling, which means that you will not be able to manipulate her data using algorithms to make decisions concerning what you do or do not want to say to her, or what offers you want to make to her. She may alternatively not have provided positive consent to be emailed, so you must not include her in email campaigns etc. etc.

Step ten is to maintain an audit trail of what has been done in respect of GDPR requests. We suggest that these actions are most conveniently recorded as part of the information held in the single customer view. In this way you can meet any challenges from an individual or the ICO concerning how you are managing the GDPR processes.

We have tried to summarise in these ten steps all the process intricacies involved in dealing with GDPR requests.

We have developed our own cloud-based technology, called UniFida, to support clients in fulfilling such individual requests.

Contact us if you’d like our help with this.


As many of us know, the right to erasure of ‘personal data’ is one of the key elements within GDPR.

But that leaves a big question mark over exactly what ‘personal data’ is, and hence what data is to be erased.

Back in 2007 the European Union’s Article 29 Data Protection Working Party published its opinion on the concept of personal data, working from the definition ‘any information relating to an identifiable natural person’.

They went on to say that ‘a natural person can be identified when, within a group of persons, he or she is distinguished from other members of the group’.

The ICO recently wrote ‘the more expansive [GDPR] definition provides for a wide range of personal identifiers to constitute personal data’. Here I assume that they are not changing the definition itself, but rather just adding in new forms of identifier, like cookie ID.

So, are we being asked to deduce that if a data item is not an identifier relating to a natural person, it is not personal data?

Assuming that this is the case, two important questions follow. First, when does an item of data definitely not relate to a natural person, and hence not need to be erased? Second, within the remaining population of possible identifiers, how uniquely does an identifier need to point to a single individual to be classed as a true identifier?

Personal data also needs to be about an individual in order to relate to them. The value of a house is not personal data until you relate it to an owner, in which case it tells us something about how rich they are.

Many ordinary transactions for instance would not on their own relate to or identify an individual, but certain ones might do.

To take an extreme example, a sale record in the Land Registry for Buckingham Palace would certainly identify the transaction as being associated with the Queen. But less extreme examples like a pattern of phone calls made at certain times of day could also point to a unique individual.

And although a number on its own is clearly not an identifier, when you put it in the category of a list of customers, each with a customer number, then it definitely can become one.

So the question of whether an item of data does or does not point to an individual will depend on its context.

However, I suspect that GDPR is not expecting us marketers to use GCHQ-level intercepts to trace links to an individual, which would make almost every item of data a potential identifier, but rather expects us to employ simpler and more straightforward identifiers like an IP address, name or email.

But, at this point a further problem arises, which is that many of these ‘straightforward’ identifiers, like forename and surname, can point to more than one person. When I Googled I found that there are 50 people called Julian Berry in the UK, so Julian Berry in itself is not a unique identifier until it is associated with other information like his address.

Another experiment we ran was to look at whether, knowing date of birth, outward postcode and gender, you could uniquely identify an individual from a UK-wide lifestyle database. The unexpected answer was that in more than 70% of cases you can.

This shows that although none of these data items on its own uniquely identifies an individual, in combination they often do.
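The kind of test described here is easy to reproduce on your own data. The sketch below measures the share of records that are the only one with their particular combination of values; the field names and the four sample records are made up for illustration and say nothing about the original experiment's data.

```python
from collections import Counter


def share_uniquely_identified(records, keys):
    """Fraction of records whose combination of `keys` occurs exactly once."""
    def combo(record):
        return tuple(record[k] for k in keys)

    counts = Counter(combo(r) for r in records)
    unique = sum(1 for r in records if counts[combo(r)] == 1)
    return unique / len(records)


sample = [
    {"date_of_birth": "1970-01-01", "outward_postcode": "SW1A", "gender": "F"},
    {"date_of_birth": "1970-01-01", "outward_postcode": "SW1A", "gender": "F"},
    {"date_of_birth": "1980-05-05", "outward_postcode": "SW1A", "gender": "M"},
    {"date_of_birth": "1970-01-01", "outward_postcode": "M1", "gender": "F"},
]
```

Running the function with a single key and then with all three shows how quickly uniqueness rises as data items are combined.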

So where is this taking us?

I would suggest that:

– data items on their own may not relate to an individual or act as unique identifiers, and hence may not be personal information, but they may become so when associated with other data items. A forename and surname plus an address becomes a unique personal identifier. This means that, when deciding what is personal data, we will need to identify either individual data items that point uniquely to an individual, like an email address, or groups of data items that in combination may do so.

– data items that either on their own, or in association with other data items on a database, definitely do not relate to or point to an individual, are not personal information. Broadly speaking this will mean that when erasing personal data, we can leave areas like ‘transactions’ or ‘donations’ untouched.

This implies that we will need to do a review of all the data held by an organisation to define what could be, and what could not be, personal data, before setting up the technology to erase that personal data when requested to do so.

And when erasing it, we will need not just to erase the personal data held in downstream systems like a single customer view, but also upstream in source data systems. So this implies knowing where all the personal data about an individual has originated, and where it is currently being stored.

Has GDPR ignored the elephant in the database?

So has GDPR ignored the elephant in the database? ‘Personal data’ is defined in GDPR as ‘any information relating to a person who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that person’.

So each of us can be seen as ‘bristling’ with multiple potential identifiers, any or all of which may be stored by organisations using our personal data. And to add another layer of complexity, most of the commonly used identifiers, like email addresses or mobile phone numbers, may change on a regular basis.

All of us, as data subjects, can ask any organisation holding our personal data to delete it, transfer it, stop using it for marketing communications or profiling, not sell it to anyone else, etc. etc.

We may also change our minds about how our data can be used, and most probably forget what we have requested in the first place, because it’s not at all important to us.

So, for example, using our name and address as our ID, we request that organisation X does not profile our data, whilst using our email we ask to have our data deleted, and via our mobile phone number then expect to have our recent order traced.

GDPR tacitly assumes that persons about whom personal data is held can each be recognised uniquely, across all the identifiers they care to use, and as they change identifiers over time; and that from this basis rational interpretations can be made of their instructions.

This is evidently a delusion.

As vendors of a technology to build single customer views, we know how difficult the identity problem is. The normal ‘shrinkage’ when we deduplicate a customer base across just, say, a couple of identifiers is around 20-25%; the more types of identifier there are, the greater the chance of duplicate records.
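Shrinkage can be measured by grouping records that share any identifier and comparing the number of groups with the number of original records. The sketch below uses a union-find structure for the grouping; that choice, the function names and the two identifier fields are assumptions for illustration rather than a description of UniFida's method.

```python
def count_distinct_customers(records, identifier_fields=("email", "mobile")):
    """Count groups of records linked, directly or indirectly, by a shared identifier."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (field, value) -> index of the first record carrying it
    for i, record in enumerate(records):
        for field in identifier_fields:
            value = record.get(field)
            if not value:
                continue
            key = (field, value)
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])  # merge the two groups
            else:
                first_seen[key] = i
    return len({find(i) for i in range(len(records))})


def shrinkage(records, identifier_fields=("email", "mobile")):
    """Share of records removed by deduplication."""
    return 1 - count_distinct_customers(records, identifier_fields) / len(records)


base = [
    {"email": "a@example.com", "mobile": "111"},
    {"email": "a@example.com", "mobile": "222"},
    {"email": "b@example.com", "mobile": "222"},
    {"email": "c@example.com", "mobile": "333"},
    {"email": "d@example.com", "mobile": None},
]
```

In this sample the first three records chain together through a shared email and then a shared mobile number, leaving three distinct customers from five records.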

The technology we have developed to try to solve the problem is called UniFida, and it approaches the question of personal identifiers in a rather different way. It assumes, correctly, that all our common identifiers like email addresses, mobile numbers, cookie IDs etc. will change over time, and that individuals may have multiple versions of them.

So, it stores a history, for each individual, of all the identifiers it has been able to link. When an identifier arrives at UniFida as part of an on-line or off-line data feed, it searches the entire library of identifiers to see if it can get a match. In this way, it brings as much information about an individual together as is possible.

To find out a little more about UniFida, please contact us. It may make complying with GDPR a little bit more possible.