Context and challenges
We took over a complicated existing situation marked by many abandoned initiatives.
We had to help Transdev regain control of its data by centralizing it in one place and, above all, convince the business units and management of the value of these technologies.
A first batch of 8 use cases was meant to deliver concrete results.
Initially, we focused on quick-win pilot projects. We then broadened our involvement to cover all data topics.
Goals
Definition of the Data strategy
Support the customer in centralizing and enhancing its data through the creation of its DataHub (data warehouse).
Industrialization of 5 BI use cases
Demonstrate the value of Data Science through 8 use cases
Preparation and support for a more global deployment
Establish a Data-Driven and DevOps culture
Our involvement
Audit
- Assessment of maturity and knowledge regarding data topics and data culture
- Analysis and inventory of the technologies in use
- Organization of the architecture and typology of the group's various databases in order to define a representative target scope
- IS impact study
Data Strategy – Definition and implementation:
- Drafting of the specifications (CDC) for consulting ETL vendors
- Drafting of the specifications (CDC) for consulting vendors of a system that makes it easy to host standard data
- Drafting of the specifications (CDC) for consulting vendors of data analysis tools
- Solution benchmark and pilot project launch with Dataiku
AMOA (project owner support):
- Definition of use cases
- Definition of a target scope: business, infrastructure, etc.
- Definition of the roadmap
Data Architecture – definition and design of the data warehouse:
- Infrastructure architecture on AWS
- Definition of the Data ecosystem
- Setting up the environment
Data Engineering / Big Data development:
- Development and industrialization of ingestion pipelines in Spark and Scala
- BI architecture: development and implementation of BI modules for 5 use cases
- Industrialization of models in Python and Scala
- DevOps: Implementation of the ecosystem and related practices
BI Architecture: development and implementation of BI modules for the 5 use cases:
- Analysis of business requirements
- Mapping of data and repositories
- Setting up the BI ecosystem: Power BI and Tableau
Data Analysis and Data Visualization: development and deployment of use cases:
- Scorecard Matrix (drivers and managers): provides drivers and their managers with key indicators to trigger operational responses, manage individual performance properly and improve the company's performance in its relationship with the organizing authorities (AOs).
- Use case 2: visualization of network traffic in relation to the theoretical offer and the various services offered to travelers.
- Management Dashboard: provides operational managers and the executive committee (COMEX) with consistent indicators for triggering the necessary operational responses, while reducing each entity's reporting burden and improving data quality (standardization of definitions, harmonization of benchmarks…).
Data Science: development and industrialization of models:
- Incident classification (NLP and time series): automatic incident analysis and classification, and prediction of the average resolution time.
- Predictive maintenance (time series): prediction of the failure rate of bus-type rolling stock
- License plate detection and reading (computer vision): identification of buses returning to the depot
- Sentiment analysis: analysis of social networks to pinpoint issues, and analysis of satisfaction questionnaires
- Churn: analysis and identification of patterns among customers likely to churn
Results
Operational DataHub for the entire target scope
4 dashboards put into production instead of the 5 planned
7 Data Science use cases deployed
Technical environment
AWS – Snowflake – Talend – Python – Scala – Spark – Docker – Elasticsearch – Keras – TensorFlow – PyTorch – Tableau – Power BI – Dataiku