Big data: Man vs the machine
Rugby fans, imagine cracking the top 0.5% of SuperBru, meaning your rugby predictions are more accurate than 99.5% of others worldwide. That's just what Principa's two teams - Nero and Trojan - did, by looking at a wide range of data to predict as accurately as possible the win-lose outcome of each match and the margin between the two teams. They've looked at data from over 6,000 matches played by 99 teams going back 20 years to identify patterns that are highly likely to repeat in future. One of their earlier models even pointed to the upset between South Africa and Japan.
It's this mix of predictive analytics and machine learning used with big data that the South Africa-based data analytics company has dubbed 'Man vs the Machine'. Rossouw tells us more...
1. Firstly, your analytics actually predicted the Japan-SA defeat? That's impressive!
Rossouw: One of our teams' earlier models indeed predicted a favourable result for Japan - remember, South Africa weren't coming off good recent performances, but Japan were. There was also no track record of matches between the two sides.
The data that we had initially indicated a chance of an upset. We later brought in some additional data and our model retrained and decided that SA would win. We went with the updated model.
2. Why do you have two teams of data scientists vying against each other?
Rossouw: For fun and to generate some internal competition, just in case the 2015 RWC competition proved a bit dull, which it certainly has not. We also wanted to see whether we would get different results by looking at different data sources and applying different underlying algorithms. Interestingly, our predictions are quite different, but our positions in the SuperBru contest are quite close. We are watching with close interest how the rest of the tournament unfolds for both teams.
3. What's the importance of predictive analytics for business?
Rossouw: Predictive analytics is a way of reaching a purely objective decision by looking at various disparate data sources. The human brain is extremely fast at picking out everyday patterns like: "is that a leopard in the tree and should I start running?" A computer cannot compete with the brain's processing speed in this type of situation. However, computers come into their own when the data sources are too numerous and the interactions between them are too complex for the human mind to identify patterns - especially if the datasets are made up of disparate information in a tabulated form instead of in the form of an image.
Predictive analytics techniques can be applied across many fields. For example, a retailer may want to determine which customers are most likely to purchase or to leave for the competition, an insurer may want to know whether a customer is claiming fraudulently, and a fan may want to know which team will win the upcoming rugby game. By giving businesses the ability to predict the most likely outcome based on past behaviour, predictive analytics enables them to optimise their operations and improve the results of everything from customer acquisition and retention strategies to fraud prevention and collections strategies.
4. How do you apply these predictive analytics based on prior customer behaviour?
Rossouw: The concepts that we use to predict whether a future customer will be a good paying customer or which product should be offered to a specific existing customer are the same concepts that have been used to predict the rugby scores. We assume that what has been observed in the past will occur again in the future. This is the fundamental assumption and as long as there is consistency in the system, the concept works time and time again and across numerous fields of study.
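To make the "past predicts future" assumption concrete, here is a minimal Python sketch. It is not Principa's model: the customer segments, the records and the function names `fit_rates` and `predict` are all hypothetical, and a real system would use far more data and a proper statistical model. It simply estimates, from past observations, how often customers in each segment turned out to be good payers, and uses that rate as the prediction for a new customer.

```python
from collections import defaultdict


def fit_rates(history):
    """Estimate the share of good payers per segment from past records.

    `history` is an iterable of (segment, paid) pairs, where `paid` is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # segment -> [good payers, total seen]
    for segment, paid in history:
        counts[segment][0] += int(paid)
        counts[segment][1] += 1
    return {seg: good / total for seg, (good, total) in counts.items()}


def predict(rates, segment, default=0.5):
    """Predict the chance a new customer pays, based on their segment.

    Segments never seen before fall back to `default` (an assumption).
    """
    return rates.get(segment, default)


# Hypothetical past behaviour, invented purely for illustration.
history = [
    ("salaried", True), ("salaried", True), ("salaried", False), ("salaried", True),
    ("self_employed", True), ("self_employed", False),
]

rates = fit_rates(history)
print(predict(rates, "salaried"))  # 3 of 4 past salaried customers paid: 0.75
print(predict(rates, "student"))   # unseen segment, falls back to 0.5
```

The prediction is only as good as the consistency of the system: if customer behaviour shifts, the historical rates stop being a reliable guide.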
5. What's the value of this type of application for business?
Rossouw: The value is essentially a business version of a crystal ball. Ask it any question, and by using predictive analytics - and enough of the right data - we can provide companies with the answers to most questions with a relatively high degree of accuracy, such as which customers are most likely to respond to a campaign and which account holders are most likely to pay money owed. Because businesses can predict a result with a high degree of accuracy based on patterns in past behaviour, they are able to reduce the cost and time spent on campaigns or collections by targeting only those contacts "most likely to X." The same concept can be applied to operations to improve efficiency and reduce costs, such as: "if I call these specific contacts in the evening, I am most likely to get a response."
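The "most likely to X" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any real campaign: the contact names and response probabilities are invented, and `select_targets` and `expected_responses` are hypothetical helper names. Given a predicted response probability per contact, the sketch contacts only the top few and compares the expected number of responses with contacting everyone.

```python
def select_targets(scores, budget):
    """Return the `budget` contacts with the highest predicted response probability."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [contact for contact, _ in ranked[:budget]]


def expected_responses(scores, contacts):
    """Expected number of responses if we reach out to `contacts`."""
    return sum(scores[c] for c in contacts)


# Hypothetical model output: contact -> predicted probability of responding.
scores = {"A": 0.40, "B": 0.05, "C": 0.25, "D": 0.10}

targets = select_targets(scores, budget=2)
print(targets)  # ['A', 'C']

share = expected_responses(scores, targets) / expected_responses(scores, scores)
print(f"Half the calls, {share:.0%} of the expected responses")
```

In this toy example, calling only two of the four contacts still captures most of the expected responses, which is where the saving in cost and time comes from.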
6. Tell us more about 'machine learning'.
Rossouw: Machine learning and predictive analytics are closely related. Machine learning is all about setting up a computer-based system so that it learns or re-trains automatically from previous results. For example, if Japan beats South Africa in the RWC, then the machine learning system will look at what resulted in this upset - along with what happened in all the other previous games, of course - and then the recent learnings will be applied to the next prediction. The concept of machine learning is very exciting, because it means that one can set up a system that automatically adapts to recent changes in the data with very little intervention from the user. The concept of such a system that naturally "follows the trend" makes a lot of sense to many. The applications within the commercial space are numerous, such as on-going improvements in fraud detection, call centre optimisation (whom to call and when, how to channel the various incoming calls), responding automatically to basic email queries, and various voice and text analytics applications. There are also applications in other areas such as traffic control, logistics, object recognition, stock market analysis and many others.
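One simple way to picture a system that re-trains itself after every result is an Elo-style rating update, sketched below in Python. This is a swapped-in, well-known technique used purely for illustration; the interview does not say which algorithms Principa's teams use. The starting ratings and the update factor `k` are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo formula: chance that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(ratings, winner, loser, k=32.0):
    """Shift ratings after a match; surprising results move them more."""
    surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * surprise
    ratings[loser] -= k * surprise


# Hypothetical starting ratings, not real team ratings.
ratings = {"South Africa": 1700.0, "Japan": 1500.0}

before = expected_score(ratings["Japan"], ratings["South Africa"])
update(ratings, winner="Japan", loser="South Africa")  # the upset happens
after = expected_score(ratings["Japan"], ratings["South Africa"])

print(f"Japan's win chance before the upset: {before:.2f}")
print(f"Japan's win chance after the upset:  {after:.2f}")
```

After the upset is fed back in, the model rates Japan's chances in a rematch higher than before, with no manual intervention. That is the "follows the trend" behaviour described above, in miniature.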
It's a concept that's certainly relevant to any industry. If you're interested in finding out more, Principa is posting its teams' data-driven predictions before every match on its website as well as via its Twitter and Facebook accounts.