- Create and maintain optimal data pipeline architecture, and assemble large, complex datasets that meet functional and non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
- Create data tools for analytics and data science team members that help them build and optimize our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Support cross-functional, cross-BU data integration tasks.
- Relevant Computer Science knowledge or a degree in Computer Science, IT, or a similar field; a Master's degree is a plus.
- At least 2 years of experience with Shell and Python 3.8+, covering 1-2 completed project life cycles.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and GCP 'big data' technologies, e.g. Airflow, Dagster, Argo Workflows (a minimal DAG sketch appears below this list).
- Experience with stream-processing systems, e.g. Pub/Sub, Kafka, Spark Streaming, Apache Flink, Cloud Dataflow, Apache Beam (see the streaming sketch below this list).
- Experience with RDBMS and NoSQL databases, e.g. PostgreSQL, GCS, BigQuery, CloudSQL, Dataproc, Cassandra, Scylla, Elasticsearch, Druid, Redis.
- Experience with backend API development and deployment, e.g. FastAPI, Django, Flask, Sanic (see the API sketch below this list).
- Experience with Kubernetes and Docker, e.g. building Docker images.
- Experience with GNU/Linux systems, including deploying and debugging in that environment.
- Basic knowledge of machine learning algorithms.
- Basic knowledge of Data Lake and Data Warehouse Design.
- Basic knowledge of Data Modeling for OLTP and OLAP systems.
- Experience with the complete project life cycle.
- Experience with the following programming languages: Go, Java, Scala.
- Experience with first-line or second-line system operations and maintenance.
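
As a point of reference for the orchestration requirement above, here is a minimal sketch of a daily ETL DAG, assuming Airflow 2.x; the DAG name and the extract/load callable are illustrative placeholders, not a description of our actual pipelines.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_load():
    # Placeholder: pull rows from a source system and load them into the warehouse.
    pass


with DAG(
    dag_id="daily_sales_etl",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="extract_and_load",
        python_callable=extract_and_load,
    )
```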
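
For the streaming requirement, a minimal sketch using the Apache Beam Python SDK, assuming a Pub/Sub subscription and an existing BigQuery table; the project, subscription, and table names are placeholders.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Streaming pipeline: read JSON events from Pub/Sub and append them to BigQuery.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub")  # placeholder
        | "ParseJson" >> beam.Map(json.loads)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",  # placeholder; table assumed to exist
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )
```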
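
And for the backend-API requirement, a minimal FastAPI sketch, assuming uvicorn as the ASGI server; the endpoint and request model are illustrative only.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Event(BaseModel):
    # Illustrative request body for an ingestion endpoint.
    user_id: str
    action: str


@app.post("/events")
def ingest_event(event: Event) -> dict:
    # Placeholder: a real service would validate, enqueue, or persist the event.
    return {"status": "accepted", "user_id": event.user_id}

# Local run (assumption): uvicorn main:app --reload
```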
To apply for this job email your details to email@example.com