How to analyse your customers' social profile in 24 hours (Part II – analysis)

(This is the second part of the post – How to analyse your customer social profile in 24 hours – Data and Collection)

Community Level

After collecting the data as described in the previous post, we can look into it and visualise some aspects of it. There are many questions we can ask of this data, but an obvious one is: what are the people engaged in the given topic talking about? Understanding this helps identify how the organisation can approach these users and market itself. We can get a quick overview by producing a word cloud of the most used words in the tweets.
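The counts behind such a word cloud can be sketched as follows. This is a minimal illustration, not the code used for the post: the stop-word list, the tokenising pattern and the function name `top_words` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "for", "on", "rt"}

# Matches words, optionally prefixed by a hashtag or mention marker.
TOKEN_RE = re.compile(r"[#@]?\w+")


def top_words(tweets, n=50):
    """Return the n most frequent words across the tweets as (word, count) pairs."""
    counts = Counter()
    for text in tweets:
        # Lower-case the tweet and drop links before tokenising.
        text = re.sub(r"https?://\S+", " ", text.lower())
        for token in TOKEN_RE.findall(text):
            word = token.lstrip("#@")
            # Skip very short tokens and stop words.
            if len(word) > 2 and word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)
```

The resulting (word, count) pairs can be passed to whichever word cloud renderer is at hand; the font size of each word is then driven by its count.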

The word cloud gives a high-level overview of the topics that might be worth looking into. Note that it goes beyond the initially specified tags (see the previous post), as it surfaces associations between the given topics and what people actually talk about. This high-level extraction can be replaced by more sophisticated methods that assign importance to topics based on the influence of the user and how close they are to the target organisation. Essentially, this is a trade-off between exploring new topics and exploiting known ones.
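One simple way to let influence affect topic importance is to weight each tweet by its author's audience. The sketch below is only an assumption about how that could look: the weighting rule (one plus the logarithm of the follower count) and the name `weighted_term_scores` are illustrative, not taken from the analysis itself.

```python
import math
from collections import defaultdict


def weighted_term_scores(tweets):
    """Score words by the influence of the users who tweeted them.

    tweets: iterable of (words, follower_count) pairs, one pair per tweet.
    Returns a dict mapping each word to its accumulated score.
    """
    scores = defaultdict(float)
    for words, followers in tweets:
        # Assumed weighting: every tweet counts at least once, and a larger
        # audience adds a log-dampened bonus.
        weight = 1.0 + math.log1p(followers)
        # Count each word once per tweet so repetition does not inflate it.
        for word in set(words):
            scores[word] += weight
    return dict(scores)
```

With this rule a word used once by a widely followed account can outrank a word used several times by accounts with no followers, which is the kind of shift towards exploiting known, influential sources that the trade-off above describes.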

We can also produce a word cloud of the topics that followers of the organisation were talking about.

As these tweets were not restricted to the topics of the target organisation, we get a wide range of keywords here. Apart from the main topics that interest most people, the followers clearly diverge slightly from the average, towards the target organisation. This divergence is what we are interested in identifying. You can spot some keywords that would not turn up in a randomly sampled population of Twitter users, for example cfpreform, GreenpeaceUK, overfishing and BristolZooGdns. This suggests that the followers of this account are more interested in nature-related topics.
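This divergence can be made measurable by comparing how often the followers use a word with how often a background sample does. The sketch below assumes such a background sample of random tweets is available, which the post does not describe collecting; the smoothed frequency ratio and the name `distinctive_terms` are likewise illustrative choices.

```python
from collections import Counter


def distinctive_terms(follower_words, background_words, n=10, smoothing=1.0):
    """Rank words by how much more often followers use them than the background."""
    fg = Counter(follower_words)
    bg = Counter(background_words)
    vocab = set(fg) | set(bg)
    # Additive smoothing keeps words that are absent from the background
    # from causing a division by zero.
    fg_total = sum(fg.values()) + smoothing * len(vocab)
    bg_total = sum(bg.values()) + smoothing * len(vocab)
    ratios = {
        word: ((fg[word] + smoothing) / fg_total) / ((bg[word] + smoothing) / bg_total)
        for word in fg
    }
    return sorted(ratios, key=ratios.get, reverse=True)[:n]
```

Words that everyone uses score close to 1 and drop down the ranking, while words that are common among followers but rare in the background rise to the top.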

Individual Level

Now we can look at what can be derived about the people who form these two communities. We would like to understand who the influential people are, that is, who forms opinions and spreads information. In other words, we aim to identify the influencers and therefore who the organisation should start to engage with more closely. This information can also be fed back into the previous analysis by weighting topics and keywords depending on how influential the originator of the tweet is. The simplest way to identify these people is to look at how active they are and how many followers they have (note that these two factors are not independent). In the literature, other factors have been incorporated into this score, including the number of retweets and mentions. The figure below shows the top-20 followers of the RSPB account.
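A ranking based on activity and follower count might be sketched like this. The scoring rule (a sum of log-dampened counts) is an assumption chosen for illustration, and the `User` record and function names are hypothetical; the post does not state the exact formula used.

```python
import math
from typing import NamedTuple


class User(NamedTuple):
    name: str
    tweet_count: int
    followers: int


def influence_score(user):
    """Assumed score: log-dampened activity plus log-dampened audience size."""
    return math.log1p(user.tweet_count) + math.log1p(user.followers)


def top_influencers(users, n=20):
    """Return the n users with the highest influence score, best first."""
    return sorted(users, key=influence_score, reverse=True)[:n]
```

Because the two factors are correlated, a simple sum like this counts part of the same signal twice; retweets and mentions could be added as further terms in the same way.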

The top-20 followers produced around 7% of the tweets we collected. It is important to concentrate on these people, as the information they find interesting is very likely to spread. Users in the long tail should not be abandoned either; however, it is harder to define a strategy for reaching them. The same approach can be applied to the users of the target topic: the top influencers can be identified and targeted. However, the long-tail distribution is even more apparent in that case, as the top-20 users produced only 5% of the tweets.
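The share of tweets attributable to a chosen group of users, such as the 7% and 5% figures above, is a simple ratio. A minimal sketch, assuming the collected tweets are available as a list of author names (the function name is hypothetical):

```python
def share_of_tweets(tweet_authors, selected_users):
    """Fraction of tweets written by the selected users.

    tweet_authors: list with one author name per collected tweet.
    selected_users: the group of interest, e.g. the top-20 users.
    """
    selected = set(selected_users)
    total = len(tweet_authors)
    if total == 0:
        return 0.0
    written_by_selected = sum(1 for author in tweet_authors if author in selected)
    return written_by_selected / total
```

A low share for the top group is one sign of a long-tail distribution: most of the content comes from the many users outside it.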

User Profiling

Apart from detecting whether a user is influential, a number of additional characteristics can be inferred. For example, using various data-mining and predictive techniques, an analysis of my Twitter profile says that:

  • I am 25-34 years old
  • I live near London, England, United Kingdom
  • I can potentially reach 1378 users
  • My network is composed of 25-34 year olds followed by 18-20 year olds
  • I frequently talk about android, apple and big data
  • My personality type is inquisitive, cautious (source: )
  • My style is academic

This analysis can be extended to obtain information on personality traits (openness to experience, conscientiousness, extroversion, agreeableness, neuroticism), wealth etc. Such a profile helps personalise the strategy used to approach the user and recommend the product that the user is most likely to find useful.

I have described a quick, high-level analysis of Twitter profiles for RSPB, but there is great potential to dig deeper if needed. There are a number of possible directions, including more sophisticated topic extraction, sentiment analysis and latent topic modelling. To summarise, the approach is to target two “communities”: people who talk about certain topics related to the organisation and people who follow the organisation. The former mainly helps to reach out to people who might be interested in what the organisation can offer but, as we discovered, it can also help identify additional topics. The latter identifies the topics that followers are talking about, which can be used to determine how the organisation might advertise and market itself to those individuals. This way we can cover both new topics and new users of interest, and identify users and topics that need further analysis.

 

Posted on June 1, 2012 in Blog, Business, General, Technology
