Country
Czechia
Job Family
Technology
At GfK, we are on a mission to drive sustainable growth for our people, our clients, and the world around us. We combine prescriptive insights and consulting expertise to analyze, explain and predict what is happening in today’s fast-changing world.
Our employees, the shapers of tomorrow, are empowered to bring bold new ideas to life by connecting unique datasets, science, and digital research. We encourage innovation and offer global career opportunities and fast professional development. This is why the world’s largest companies and leading brands know GfK as their trusted partner.
Job Description
Responsibilities
- Architect, design, and implement data processing pipelines from scratch
- Maintain existing data processing pipelines and data science products running in production
- Decide which problems to solve with scale-up approaches, which to tackle with scale-out approaches, and how to combine both within data pipelines
- Apply a deep understanding of the trade-offs between batch and (near) real-time processing
- Manage and collaborate with Data Scientists / Statisticians to build production-ready, configurable products for internal and external client projects and work streams
- Support Data Scientists in quickly building machine learning models for proofs of concept and prototypes by providing infrastructure support and technical guidance
- Integrate partial solutions and R&D results into production-ready, scalable, and configurable data pipelines
- Create new methodologies and tooling, improve existing ones, and contribute to Data Engineering best practices
- Advise on data transformation workflow design and optimization across cost, performance, and delivery mechanisms (streams, increments, etc.)
- Bring, and present when needed, an economic view on the operability of the data pipelines you build, including but not limited to:
  - Maintaining data integrity when data is reprocessed
  - Understanding whether and how internal or external customer data deliveries are affected by software changes
- Stay aware of the organization’s data design patterns and of general software development best practices
- Design quality control / validation processes and quality gates
- Regularly participate in Communities of Practice
- Present GfK’s Data Engineering capabilities at conferences and workshops
- Support talent development and the organization of required training
Qualifications
- Solid understanding of, and practical experience with, Big Data architectures and environments (e.g., Cloudera Hadoop, Hortonworks Data Platform, MapReduce, Hive)
- Expert knowledge in cloud infrastructure services (preferably AWS)
- Solid skills in performance optimization and scalability (e.g., parallelization, code optimization, containerization, function-as-a-service)
- Expert knowledge of programming languages (especially Python, Java, and Scala) and of analytic languages such as R, together with their ecosystems (e.g., RStudio Server)
- Expert knowledge of scalable distributed engines such as Spark
- Expert knowledge of Terraform for maintaining Infrastructure as Code
- Expert knowledge of orchestration tools (e.g., Airflow), including their setup, configuration, and usage
- Experience setting up logging and monitoring frameworks for existing and new products and pipelines
- Solid skills in using Git and in setting up CI/CD pipelines (e.g., with GitLab)
- Practical experience with software development best practices such as TDD and pair programming
- Expert database skills, in particular SQL and NoSQL
- Good knowledge of data design patterns, data lake architecture, data modelling, schema evolution, etc.
- Expert knowledge of optimization, machine learning, and/or deep learning algorithms
- Domain knowledge of media measurement for TV, radio, and the Internet is preferred
- Experience working in an agile environment
- Ability to translate and clearly communicate sophisticated analytical information to non-expert stakeholders and clients
- Proven project management skills
- Proven client facing skills
- Ability to communicate fluently in an English-speaking business environment
- Master’s degree reflecting strong IT / Big Data / computer science knowledge
- Typically 5+ years of industry experience working with the technologies listed above
Don't meet every single requirement? Some people are less likely to apply unless they meet all the requirements listed in a job specification. GfK is looking for self-starters to join our innovative team keen to take on a new challenge. So, if you're excited about this role but your skills and experience don't align perfectly with every requirement we've listed, we still encourage you to apply. You may be just the right candidate for this or other roles.
We are an ethical and honest company that is wholly committed to its clients and employees. We are proud to be an inclusive workplace for all and are committed to equal opportunity in employment, focused on helping all of our employees reach their full potential. We are looking forward to meeting you!