Data Engineer

We're looking for a talented Data Engineer to join our team and build innovative AI solutions. You'll design, develop, and maintain our data architecture and data pipelines, tailored specifically for AI applications. The Data Engineer collaborates closely with our Business Analysts, Architects, and Data Scientists on data initiatives, ensuring optimal, consistent data delivery across ongoing projects. The ideal candidate is self-directed, adept at supporting the data needs of multiple business units, systems, and products, and an excellent communicator who can capture requirements and understand data context to build robust pipelines.

What you’ll do

  • Create and maintain data pipelines on Google Cloud Platform using Dataflow, Pub/Sub, BigQuery, and Cloud Storage (or their equivalents on other platforms such as AWS, Azure, or Hadoop); a minimal pipeline sketch follows this list
  • Partner with data scientists to build and optimize our AI solutions, extending the functionality of our data systems
  • Build analytics tools that use the data pipeline to deliver actionable insights into operational efficiency
  • Identify, design, and implement process improvements, such as optimizing data delivery and automating manual processes
  • Work with stakeholders, including Product Owners and the Technology, Data, and Architecture teams, to resolve data-related technical issues and support their data needs
  • Maintain data integrity and regionalization by defining data boundaries across multiple GCP regions and zones
  • Design and develop Tableau visualizations to support statistical analysis of data
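
For context, here is a minimal sketch of the kind of streaming pipeline this role involves: reading events from Pub/Sub and writing them to BigQuery with Apache Beam, the SDK behind Dataflow. The project, topic, table, and schema names are illustrative assumptions, not part of our actual stack.

```python
# Minimal sketch: streaming JSON events from Pub/Sub into BigQuery
# with Apache Beam (runs on Dataflow). All resource names below are
# hypothetical placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    # streaming=True puts the pipeline in streaming mode; Dataflow
    # runner flags (project, region, etc.) would also be set here.
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as p:
        (
            p
            # Read raw message bytes from a (hypothetical) Pub/Sub topic
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/events")
            # Decode and parse each message as JSON
            | "ParseJson" >> beam.Map(
                lambda msg: json.loads(msg.decode("utf-8")))
            # Append rows to a (hypothetical) BigQuery table
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "my-project:analytics.events",
                schema="event_id:STRING,ts:TIMESTAMP,payload:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```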

What experience you need 

  • Bachelor’s Degree in Computer Science, Statistics, Mathematics or another quantitative field
  • At least 4 years of relevant work experience
  • Experience with big data cloud tools: Pub/Sub, Dataflow, BigQuery (or Hadoop, Spark, Kafka, etc.)
  • At least 2 years of experience with relational SQL and NoSQL databases
  • At least 2 years of experience with object-oriented/functional scripting languages: Python, Java, Scala, etc.
  • At least 2 years of experience working with data integration tools such as Informatica, Pentaho, Talend, DataStage

What could set you apart

  • GCP, AWS or Azure cloud certifications
  • Experience working in an agile development environment
  • Experience supporting Machine Learning or AI projects, including data preprocessing, feature engineering, and building pipelines for model training and serving datasets
  • Familiarity with containerization technologies such as Docker and orchestration systems such as Kubernetes

Primary Location:

CRI-Sabana

Function:

Data and Analytics

Schedule:

Full time
