JOB SUMMARY: We are seeking a highly experienced Senior Data Engineer to join our team in building robust, scalable data pipelines and platforms for factory-scale AI deployments. This role focuses on architecting and maintaining data flows from edge collection through to the Lakehouse, using tools such as Kafka, Airflow, and Fluent Bit. You will work closely with ML, infrastructure, and software teams to ensure data is high quality, accessible, and optimized for analytics and AI.
RESPONSIBILITIES:
- Lead the design, implementation, and maintenance of scalable data pipelines from industrial machines to hot and cold storage tiers.
- Design robust ingestion systems using Kafka, Fluent Bit, and Airflow to support both real-time and batch processing (see the ingestion sketch after this list).
- Create efficient data models in PostgreSQL and manage data transformation into Parquet-based storage (see the export sketch after this list).
- Collaborate with AI/ML engineers on data preparation for model training, RAG systems, and vector storage.
- Ensure end-to-end data quality by implementing validation checks and observability tooling (e.g., Grafana, Elasticsearch, Loki); see the validation sketch after this list.
- Establish and enforce data governance and version control across the Lakehouse architecture.
- Document architecture decisions and mentor junior team members on best practices.
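To give candidates a concrete flavor of the ingestion work, here is a minimal sketch of a scheduled Airflow task that drains a Kafka topic and lands the records as Parquet. It assumes the kafka-python and pyarrow packages; the topic name, broker address, and output path are illustrative placeholders, not our actual stack.

```python
# Minimal Airflow DAG sketch: drain a Kafka topic on a schedule and land
# the records as Parquet. Assumes kafka-python and pyarrow are installed;
# topic, broker, and path names below are illustrative placeholders.
import json
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def drain_topic_to_parquet():
    from kafka import KafkaConsumer
    import pyarrow as pa
    import pyarrow.parquet as pq

    consumer = KafkaConsumer(
        "machine-telemetry",                 # hypothetical topic name
        bootstrap_servers="kafka:9092",      # illustrative broker address
        auto_offset_reset="earliest",
        consumer_timeout_ms=10_000,          # stop once the topic is drained
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    records = [msg.value for msg in consumer]
    consumer.close()
    if records:
        table = pa.Table.from_pylist(records)
        pq.write_table(
            table,
            f"/data/cold/telemetry_{datetime.utcnow():%Y%m%dT%H%M%S}.parquet",
        )


with DAG(
    dag_id="telemetry_batch_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",                      # 'schedule' kwarg is Airflow 2.4+
    catchup=False,
) as dag:
    PythonOperator(task_id="drain_topic", python_callable=drain_topic_to_parquet)
```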
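Similarly, the PostgreSQL-to-Parquet responsibility might look like the export sketch below: a batch job that pulls rows out of a relational model and writes them as a partitioned Parquet dataset for cold storage. It assumes psycopg2 and pyarrow; the DSN, table, and column names are hypothetical stand-ins for whatever the real data model defines.

```python
# Sketch of a batch transform from PostgreSQL into partitioned Parquet.
# Assumes psycopg2 and pyarrow; the DSN, table, and columns are
# hypothetical placeholders, not the actual schema.
import psycopg2
import pyarrow as pa
import pyarrow.parquet as pq


def export_readings_to_parquet(dsn: str, root_path: str) -> None:
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT machine_id, recorded_at::date AS day, sensor, reading "
                "FROM sensor_readings"       # hypothetical table name
            )
            cols = [desc[0] for desc in cur.description]
            rows = cur.fetchall()
    finally:
        conn.close()

    # Build a columnar table and partition on disk by day so downstream
    # queries can prune partitions cheaply.
    table = pa.Table.from_pylist([dict(zip(cols, r)) for r in rows])
    pq.write_to_dataset(table, root_path=root_path, partition_cols=["day"])


# Usage: export_readings_to_parquet("dbname=factory user=etl", "/data/cold/readings")
```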
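Finally, the data-quality responsibility includes gates like the plain-Python validation sketch below, which could run inside a pipeline task before data is promoted to the Lakehouse. The required fields and bounds are hypothetical examples, not the actual schema.

```python
# Plain-Python sketch of a row-level validation gate. Field names and
# plausibility bounds are hypothetical examples.
from typing import Any

REQUIRED_FIELDS = ("machine_id", "recorded_at", "reading")


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing required field: {field}")
    reading = record.get("reading")
    if isinstance(reading, (int, float)) and not (-1e6 <= reading <= 1e6):
        problems.append(f"reading out of plausible range: {reading}")
    return problems


# Failing records can be routed to a dead-letter topic, with the failure
# rate exported as a metric for Grafana dashboards and alerting.
```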