Project Overview:
Requirements:
- Strong SQL experience (ad hoc queries, query optimization techniques; PostgreSQL preferred);
- Python;
- ETL/microservices;
- Data processing & visualization (pandas; matplotlib / hvPlot / plotly; jupyter / jupyterhub);
- Basic web development (Django / Flask / Falcon; SQLAlchemy);
- ETL experience / Data Warehouse concepts;
- Schema architecture (e.g. flat-file / star / snowflake);
- Processing paradigms (batch / mini-batch / streaming / change data capture / lambda / kappa);
- Orchestration: Airflow / Prefect / Luigi / Jenkins;
- Storage infrastructure for scalable OLAP processing: PostgreSQL / Amazon Redshift;
- Experience with any batch processing technology: SQL / Talend / Elastic / Spark / Custom Built;
- Experience with any streaming processing technology: Kafka / Spark Streaming / Storm / Flink;
- Basic experience with a Business Intelligence visualization tool (e.g. Power BI, Tableau);
- Good conceptual knowledge of stream processing (eventual consistency, duplicate handling, data latency, watermarks, stateless streams);
- Good knowledge of Docker;
- English: Upper-Intermediate or Advanced.
Nice to have:
- AWS infrastructure;
- Kubernetes;
- CI/CD knowledge;
- Basic Java.
Higher Education: Bachelor’s Degree.