Cloud Data Engineer
Job Description
Design and Development:
- Design, develop, and maintain scalable ETL pipelines using cloud-native tools (AWS DMS, AWS Glue, Kafka, Azure Data Factory, GCP Dataflow, etc.).
- Architect and implement data lakes and data warehouses on cloud platforms (AWS, Azure, GCP).
- Develop and optimize data ingestion, transformation, and loading processes using Databricks, Snowflake, Redshift, BigQuery, and Azure Synapse.
- Implement ETL processes using tools like Informatica, SAP Data Intelligence, and others.
- Develop and optimize data processing jobs using Spark with Scala.
Data Integration and Management:
- Integrate various data sources, including relational databases, APIs, unstructured data, and ERP systems, into the data lake.
- Ensure data quality and integrity through rigorous testing and validation.
- Perform data extraction from SAP or ERP systems when necessary.
Responsibilities and Duties:
Performance Optimization:
- Monitor and optimize the performance of data pipelines and ETL processes.
- Implement best practices for data management, including data governance, security, and compliance.
Collaboration and Communication:
- Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
- Collaborate with cross-functional teams to design and implement data solutions that meet business needs.
Documentation and Maintenance:
- Document technical solutions, processes, and workflows.
- Maintain and troubleshoot existing ETL pipelines and data integrations.
Key Skills:
- Strong programming skills in Python, Java, or Scala.
- Proficient in SQL and query optimization techniques.
- Familiarity with data modeling, ETL/ELT processes, and data warehousing concepts.
- Knowledge of data governance, security, and compliance best practices.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
Experience and Qualifications:
- 7+ years of experience as a Data Engineer or in a similar role.
- Proven experience with cloud platforms: AWS, Azure, and GCP.
- Hands-on experience with cloud-native ETL tools such as AWS DMS, AWS Glue, Kafka, Azure Data Factory, and GCP Dataflow.
- Experience with other ETL tools such as Informatica and SAP Data Intelligence.
- Experience in building and managing data lakes and data warehouses.
- Proficiency with data platforms like Redshift, Snowflake, BigQuery, Databricks, and Azure Synapse.
- Experience with data extraction from SAP or ERP systems is a plus.
- Strong experience with Spark and Scala for data processing.
Benefits:
Training, health insurance, commuting support, lunch service, etc.