
Data Engineer

Come solve real problems, with really awesome people :)


At Noodle.ai, we are not just building AI applications. We are going deep into industries that have yet to leverage AI at scale, such as steel mills, distribution & logistics companies, and consumer packaged goods manufacturers. Our applications fit and integrate deeply into the supply chain, from raw materials to shelf. The applications we build not only need to integrate with the existing software in these industries, but also need to talk to each other to really drive value from AI. Turns out, we are one of the pioneers here, charting a new course. This means the science behind building the software, and the AI within it, has not yet settled. You will be part of a team charting this new course, figuring out how to adapt software engineering best practices to delivering AI applications that fit within legacy software in non-tech industries. This is going to be an exciting ride, full of opportunities for impact, learning, and challenges we will tackle together.

Noodle.ai’s Data Engineers have a strong understanding of database structures, modeling, and data warehousing techniques; they know how to create SQL queries, stored procedures, and views, and how to define best practices for engineering scalable, secure data pipelines. We are looking for people who are not afraid of the unknown, are experts at their craft, and can adapt and learn as we create a suite of new AI applications.

Roles and responsibilities
  • Support and monitor multiple data pipelines across different customers
  • Work closely with the development team to understand the changes in each release
  • Contribute to the data engineering development work when needed
  • Collaborate with multiple stakeholders, including but not limited to the Infrastructure, DevOps, and Data Science teams
  • Interface with customer-facing teams 

Must-haves
  • At least 2 years of relevant experience
  • Undergraduate degree in a relevant field (e.g., Computer Science) or equivalent experience
  • Good knowledge of Python, especially for data processing
  • Very good knowledge of SQL and experience writing complex queries on PostgreSQL
  • Experience with data pipeline orchestration tools, preferably Airflow
  • Basic understanding of containers and familiarity with Docker commands
  • Working knowledge of distributed systems such as Spark or Hadoop
  • Experience on data engineering engagements building complex data pipelines or ETL/ELT processes that ingest and process data (a minimal illustration follows this list)
  • Very good debugging skills
  • Flexibility to learn new technologies and adapt to a dynamic environment
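
For a sense of the day-to-day, here is a minimal, illustrative sketch of the kind of Airflow-orchestrated PostgreSQL pipeline the must-haves above describe. The connection id, table, and schedule are hypothetical placeholders rather than details of any actual Noodle.ai pipeline, and the sketch assumes Airflow 2.x with the postgres provider installed.

  from datetime import datetime

  from airflow import DAG
  from airflow.operators.python import PythonOperator
  from airflow.providers.postgres.hooks.postgres import PostgresHook

  def extract_orders(**context):
      # "analytics_db" and the orders table are hypothetical placeholders
      hook = PostgresHook(postgres_conn_id="analytics_db")
      rows = hook.get_records(
          "SELECT order_id, amount FROM orders WHERE created_at::date = %s",
          parameters=[context["ds"]],  # ds = the DAG run's logical date
      )
      return len(rows)  # row count goes to XCom for downstream checks

  with DAG(
      dag_id="example_orders_pipeline",
      start_date=datetime(2024, 1, 1),
      schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
      catchup=False,
  ) as dag:
      PythonOperator(task_id="extract_orders", python_callable=extract_orders)

In practice, pipelines like this grow additional tasks for validation, loading, and alerting, which is where the monitoring and debugging responsibilities above come in.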
 
Nice to have
  • Exposure to cloud (preferably AWS)
  • Working experience with Snowflake
  • Basic understanding and usage of Jenkins for continuous deployment
  • Understanding of ML model lifecycle and pipelines