Data Engineer

Job ID:  14650
Date:  12 Jun 2025
Location:  Dholera, GJ, IN
Department:  Fab AI.Digital
Business:  Semifab

About The Business - 

 

Tata Electronics Private Limited (TEPL) is a greenfield venture of the Tata Group with expertise in manufacturing precision components.

Tata Electronics (a wholly owned subsidiary of Tata Sons Pvt. Ltd.) is building India’s first AI-enabled state-of-the-art Semiconductor Foundry. This facility will produce chips for applications such as power management IC, display drivers, microcontrollers (MCU) and high-performance computing logic, addressing the growing demand in markets such as automotive, computing and data storage, wireless communications and artificial intelligence.

Tata Electronics is a subsidiary of the Tata Group. The Tata Group operates in more than 100 countries across six continents, with the mission ‘To improve the quality of life of the communities we serve globally, through long term stakeholder value creation based on leadership with Trust.’

 

Job Responsibilities -

 

•    Architect and implement scalable offline data pipelines for manufacturing systems including AMHS, MES, SCADA, PLCs, vision systems, and sensor data. 
•    Design and optimize ETL/ELT workflows using Python, Spark, SQL, and orchestration tools (e.g., Airflow) to transform raw data into actionable insights. 
•    Lead database design and performance tuning across SQL and NoSQL systems, optimizing schema design, queries, and indexing strategies for manufacturing data. 
•    Enforce robust data governance by implementing data quality checks, lineage tracking, access controls, security measures, and retention policies. 
•    Optimize storage and processing efficiency through strategic use of formats (Parquet, ORC), compression, partitioning, and indexing for high-performance analytics. 
•    Implement streaming data solutions (using Kafka/RabbitMQ) to handle real-time data flows and ensure synchronization across control systems. 
•    Build dashboards using analytics tools such as Grafana.
•    Maintain a good understanding of the Hadoop ecosystem.
•    Develop standardized data models and APIs to ensure consistency across manufacturing systems and enable data consumption by downstream applications. 
•    Collaborate cross-functionally with Platform Engineers, Data Scientists, Automation teams, IT Operations, Manufacturing, and Quality departments. 
•    Mentor junior engineers while establishing best practices, documentation standards, and fostering a data-driven culture throughout the organization.

 

Essential Attributes -

 

•    Expertise in Python programming for building robust ETL/ELT pipelines and automating data workflows.
•    Proficiency with the Hadoop ecosystem.
•    Hands-on experience with Apache Spark (PySpark) for distributed data processing and large-scale transformations.
•    Strong proficiency in SQL for data extraction, transformation, and performance tuning across structured datasets.
•    Proficient in using Apache Airflow to orchestrate and monitor complex data workflows reliably.
•    Skilled in real-time data streaming using Kafka or RabbitMQ to handle data from manufacturing control systems.
•    Experience with both SQL and NoSQL databases, including PostgreSQL, TimescaleDB, and MongoDB, for managing diverse data types.
•    In-depth knowledge of data lake architectures and efficient file formats like Parquet and ORC for high-performance analytics.
•    Proficient in containerization and CI/CD practices using Docker and Jenkins or GitHub Actions for production-grade deployments.
•    Strong understanding of data governance principles, including data quality, lineage tracking, and access control.
•    Ability to design and expose RESTful APIs using FastAPI or Flask to enable standardized and scalable data consumption.

 

Qualifications -

 

•    BE/ME degree in Computer Science, Electronics, or Electrical Engineering.

 

Desired Experience Level -

 

•    Master's + 2 years of relevant experience.
•    Bachelor's + 4 years of relevant experience.
•    Experience with the semiconductor industry is a plus.

Learn More about TATA Electronics
