Building trust with business teams by predicting call reiteration to customer service

@ Orange


After a year spent building Orange's B2B Big Data platform and its first data exposure projects, our team had made a large amount of data available in one place.

We were ready to take the next step and add value by using this data in Machine Learning projects.


We quickly realized that nobody would entrust us with Machine Learning projects unless we could show that we had already succeeded at one.

We had to build an attention-grabbing project to showcase.

We had all the customer relations data available. We thought that clients with unsolved problems might behave in specific ways that lead to repeated calls to customer service.

Identifying these dissatisfied customers could help us take action to serve them better.



We formulated the problem as a supervised classification task: for each client, we should be able to predict the likelihood of call reiteration based on historical contact data.
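As an illustration of this formulation, here is a minimal Python sketch of how training labels could be derived from a contact log. It is not the project's actual pipeline (which ran on Spark): the `calls` data, the `label_calls` helper and the 7-day `HORIZON` are all assumptions made for the example, and the sketch labels each call rather than each client.

```python
from datetime import date, timedelta

# Hypothetical contact log: (client_id, call_date). Names and values are illustrative.
calls = [
    ("A", date(2020, 1, 2)),
    ("A", date(2020, 1, 5)),
    ("A", date(2020, 2, 20)),
    ("B", date(2020, 1, 10)),
]

# Assumed window defining a "reiteration"; the real project's definition is not stated.
HORIZON = timedelta(days=7)


def label_calls(calls, horizon=HORIZON):
    """Return one (client_id, call_date, label) row per call.

    label is 1 when the same client calls again within `horizon` of that call.
    """
    by_client = {}
    for client_id, day in calls:
        by_client.setdefault(client_id, []).append(day)

    rows = []
    for client_id, days in by_client.items():
        days.sort()
        for i, day in enumerate(days):
            next_day = days[i + 1] if i + 1 < len(days) else None
            label = int(next_day is not None and next_day - day <= horizon)
            rows.append((client_id, day, label))
    return rows


labeled = label_calls(calls)
```

Each labeled row can then be joined with features computed from the contacts that precede it and fed to a classifier such as logistic regression, which outputs a probability rather than a hard yes/no.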


Showing our first, imperfect results (~65% accuracy) was enough to convince the customer service and marketing departments to entrust us with two projects they were working on.

We put this first project on hold so that we could launch those two quickly.

The most promising avenue for improving this first project was to engineer more meaningful features from the details of past contacts.
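To give an idea of what such features might look like, here is a small Python sketch that summarises a client's contact history up to a given date. The `contact_features` helper, the field names, the 30-day window and the chosen metrics are hypothetical examples, not the features used in the project.

```python
from datetime import date


def contact_features(history, as_of, window_days=30):
    """Summarise a client's contacts strictly before `as_of`.

    `history` is a list of (contact_date, channel) tuples. The fields and the
    default 30-day window are illustrative assumptions.
    """
    past = sorted(d for d, _ in history if d < as_of)
    recent = [d for d in past if (as_of - d).days <= window_days]
    channels = {c for d, c in history if d < as_of}
    # Gaps in days between consecutive past contacts.
    gaps = [(b - a).days for a, b in zip(past, past[1:])]
    return {
        "contacts_total": len(past),
        "contacts_recent": len(recent),
        "days_since_last": (as_of - past[-1]).days if past else None,
        "mean_gap_days": sum(gaps) / len(gaps) if gaps else None,
        "distinct_channels": len(channels),
    }


history = [
    (date(2020, 1, 2), "phone"),
    (date(2020, 1, 5), "phone"),
    (date(2020, 1, 20), "email"),
    (date(2020, 3, 1), "phone"),  # after as_of below, so it is ignored
]
features = contact_features(history, as_of=date(2020, 2, 1))
```

Only contacts before `as_of` are used, so that a feature never leaks information from the period the model is supposed to predict.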

Tools & methods:

Hadoop, Hive, Scala, Spark, Spark ML, Spark MLlib, Supervised learning, Logistic Regression.

Interested in collaborating, or would like to discuss anything?