Our customer is a niche engineering company based in Leuven, Belgium, specializing in building Data Lakes using Scala, Spark, AWS, Airflow and a bit of Python. Their customers are large enterprises in the Benelux region in telecom, e-commerce, energy and other industries.
Responsibilities:
- Developing data pipelines and data marts on Spark and Scala;
- Creating new high-load data processing services;
- Architecting, designing, coding and maintaining components for aggregating billions of DB records;
- Managing the cloud-based data & analytics platform;
- Deploying updates and fixes and assisting technical support;
- Working directly with the business teams to rapidly prototype analytics solutions based upon business requirements;
- Exploring and validating data from various sources and generating new reports based on the investigated data;
- Gathering, analyzing and documenting requirements for data processing and reporting;
- Developing data-processing pipelines, monitoring key operational metrics and determining the causes of deviations from expected values.
Requirements:
- At least 2 years of experience as a Data Engineer using Spark/Scala;
- Proficiency in SQL;
- Hands-on experience with Scala/Java;
- Working experience with AWS Cloud;
- Understanding of the principles of Massively Parallel Processing (MPP);
- Knowledge of BI tools;
- Understanding of data modeling and data warehousing concepts;
- Excellent communication and interpersonal skills.
Nice to have:
- ETL development background;
- Column-based data warehouses: Amazon Redshift / Snowflake / Google BigTable / Teradata / CosmosDB;
- Process orchestration software: Apache Airflow / Prefect / Dagster;
- Serverless computation frameworks.