Responsibilities:
During the project definition phase
Design of data ingestion chains
Design of data preparation chains
Design of basic ML algorithms
Data product design
Design of NoSQL data models
Data visualization design
Participation in the selection of the services and solutions to be used, according to the use case
Participation in the development of a data toolbox
During the iterative development phase
Implementation of data ingestion chains
Implementation of data preparation chains
Implementation of basic ML algorithms
Implementation of data visualizations
Use of ML frameworks
Implementation of data products
Exposure of data products
Configuration of NoSQL databases
Distributed processing implementation
Use of functional languages
Debugging distributed processing and algorithms
Identification and cataloging of reusable items
Contribution to the evolution of work standards
Contribution and advice on data processing problems
During integration and deployment
Participation in problem solving
During the operational (run) phase
Participation in the monitoring of operations
Participation in problem solving
Competencies:
· Leadership
Mentoring and managing junior technical staff
· Business acumen
Good understanding of the industry context
· Analytics skills
Interest in innovative technologies and desire to work on pioneering engagements
· Methodological competence, e.g., agile software and test-driven development
· Degree in computer science, electrical engineering, or another relevant engineering discipline
· Strong drive and motivation
· Fluent in English (verbal and written)
· Expertise in the implementation of end-to-end data processing chains
· Mastery of distributed development
· Basic knowledge of, and interest in, the development of ML algorithms
· Knowledge of ingestion frameworks
Apache Beam
· Knowledge of Beam and its different execution modes on Dataflow
Batch or streaming
· Knowledge of Spark and its different modules
· Mastery of Java (plus Scala and Python)
Python acceptable; Java preferred
· Knowledge of the GCP ecosystem (Dataproc, Dataflow, BigQuery, Pub/Sub, PostgreSQL/Composer, Cloud Functions, Stackdriver)
· Knowledge of Solace
· Knowledge of Spotfire & Dynatrace
· Knowledge of the NoSQL database ecosystem
· Knowledge in building data product APIs
· Knowledge of data visualization tools and libraries
· Ease in debugging Beam (and Spark) pipelines and distributed systems
· Ability to explain complex systems in accessible terms
· Proficiency in the use of data notebooks
· Expertise in data testing strategies
· Strong problem-solving skills, intelligence, initiative, and the ability to work under pressure
· Excellent interpersonal and communication skills (able to go into detail when needed)
General role:
Contribute to the business value of data-oriented products based on an on-premise Datalake or on cloud environments, by implementing end-to-end data processing chains, from ingestion to API exposure and data visualization.
General responsibility: Quality of the data transformed in the Datalake, proper functioning of the data processing chains, and optimization of the on-premise or cloud cluster resources used by those chains.
General skills: Experience implementing end-to-end data processing chains and big data architectures in the cloud (GCP); mastery of the languages and frameworks for processing massive data, particularly in streaming mode (Beam/Dataflow, Java, Spark/Scala/Dataproc); practice of agile methods.
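To illustrate the kind of end-to-end processing chain referred to above, below is a minimal Apache Beam pipeline sketch in Java. It is only an illustration under stated assumptions: the class name and gs:// paths are hypothetical, the same code can run in batch mode locally or on Dataflow depending on the runner passed at launch, and a streaming variant would typically replace the TextIO source with PubsubIO and add a windowing strategy.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptors;

// Hypothetical example class; paths below are placeholders, not real buckets.
public class LineCountSketch {
    public static void main(String[] args) {
        // The runner is chosen at launch time (--runner=DirectRunner for local tests,
        // --runner=DataflowRunner to execute on GCP); the pipeline code is unchanged.
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
        Pipeline pipeline = Pipeline.create(options);

        pipeline
            // Ingest: read text files from a (hypothetical) GCS location.
            .apply("ReadLines", TextIO.read().from("gs://example-bucket/input/*.txt"))
            // Prepare: a trivial transformation, counting all lines globally.
            .apply("CountLines", Count.globally())
            // Format the single result value as a string.
            .apply("FormatResult",
                MapElements.into(TypeDescriptors.strings())
                           .via((Long count) -> "lines: " + count))
            // Expose: write the result back to a (hypothetical) output location.
            .apply("WriteResult", TextIO.write().to("gs://example-bucket/output/line-count"));

        pipeline.run().waitUntilFinish();
    }
}
```

Launched with --runner=DataflowRunner plus the usual project, region, and tempLocation options, the identical pipeline executes on Dataflow instead of the local runner, which is the batch/streaming portability the posting refers to.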