The Office of CAN, in collaboration with the Department of Pure and Applied Mathematics, Alliance University, conducted a hands-on workshop on Data Pipelines and the ETL Process. The event aimed to bridge academic learning with real-world data science practices, providing students with practical exposure to modern data workflows.
Insights
The workshop was led by Akash Kamerkar, Data Scientist at ABB Global Industries and Services Pvt Ltd. With experience in building data-driven solutions in industrial environments, Mr Kamerkar guided participants through the complete process of designing and optimising data pipelines using Microsoft Azure.
The core focus of the session was the ETL workflow—Extract, Transform, Load. The workshop covered methods to extract data from various sources, transform it into usable formats, and load it into appropriate storage systems for further processing. Emphasis was placed on Microsoft Azure as a cloud-based platform for managing and executing data pipeline processes.
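The three ETL stages can be sketched end to end in a few lines of plain Python. This is a minimal local illustration of the pattern, not the Azure-based workflow used in the workshop; the sensor data and table name are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for an extracted source file.
RAW_CSV = """sensor_id,reading
a1,20.5
a2,21.7
a1,19.0
"""

def extract(raw: str) -> list[dict]:
    """Extract: parse rows out of a CSV source."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list[dict]) -> list[tuple]:
    """Transform: cast strings to floats and round to one decimal place."""
    return [(r["sensor_id"], round(float(r["reading"]), 1)) for r in rows]

def load(rows: list[tuple]) -> sqlite3.Connection:
    """Load: write the cleaned rows into a queryable store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor_id TEXT, reading REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return conn

conn = load(transform(extract(RAW_CSV)))
count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

In a cloud setting the same three stages remain, but each is backed by a managed service: extraction from blob or database sources, transformation in a compute engine, and loading into a warehouse or lake.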
Participants explored tools within the Azure ecosystem, learning how to construct data pipelines that are scalable, efficient, and maintainable. The workshop provided exposure to Azure Data Factory and related services, enabling participants to understand data movement and transformation in a cloud-native environment.
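Azure Data Factory pipelines are authored as JSON definitions describing activities, their inputs, and their outputs. The fragment below, written here as a Python dict, is an illustrative sketch of a single copy activity; the pipeline, dataset, and activity names are hypothetical and not taken from the workshop.

```python
# A minimal, illustrative Azure Data Factory pipeline definition.
# ADF pipelines are authored as JSON; all names below are hypothetical.
pipeline = {
    "name": "CopySalesToWarehouse",
    "properties": {
        "activities": [
            {
                "name": "CopyFromBlobToSql",
                "type": "Copy",  # a copy activity moves data between datasets
                "inputs": [
                    {"referenceName": "SalesBlobDataset", "type": "DatasetReference"}
                ],
                "outputs": [
                    {"referenceName": "SalesSqlDataset", "type": "DatasetReference"}
                ],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},
                    "sink": {"type": "AzureSqlSink"},
                },
            }
        ]
    },
}
```

Authoring pipelines declaratively like this is what makes them maintainable: the definition can be versioned, reviewed, and deployed across environments without changing the data-movement logic.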
Introduction to Azure Databricks
The session introduced Azure Databricks as a platform for large-scale data processing. Participants learned how Databricks integrates with Azure to support advanced analytics and data engineering tasks. Key components of distributed computing, cluster management, and parallel processing were explained with practical demonstrations.
This segment strengthened participants' understanding of techniques for processing large volumes of data in a reliable, automated manner.
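The split-apply-combine idea behind distributed processing can be sketched locally. Databricks runs Spark jobs across a cluster of machines; as a stand-in, the example below partitions a dataset and processes the partitions with a thread pool, which is illustrative only and not the Spark API itself.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk: list[int]) -> int:
    """Work applied independently to each partition (the 'map' side)."""
    return sum(x * x for x in chunk)

data = list(range(1, 101))
# Split the data into four partitions, as a cluster would across workers.
partitions = [data[i:i + 25] for i in range(0, len(data), 25)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Each partition is processed independently, then results are combined
    # (the 'reduce' side).
    total = sum(pool.map(partial_sum, partitions))
```

In Spark the same pattern is expressed over distributed datasets, with the cluster manager handling partitioning, scheduling, and fault recovery automatically.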
Key Concepts in Modern Data Engineering
In addition to the technical walkthroughs, the workshop covered critical topics in data engineering such as data versioning, pipeline automation, monitoring, and error handling. These concepts were contextualised within real industry scenarios, helping participants to appreciate the importance of quality control and reliability in data infrastructure.
Discussion also included best practices in building robust data pipelines that align with industry standards, offering participants a comprehensive overview of the data engineering lifecycle.
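Error handling and monitoring in a pipeline often reduce to a common pattern: log each failure, retry with backoff, and surface the error only after retries are exhausted. The sketch below illustrates that pattern under assumed names (`run_with_retries`, `flaky_extract` are hypothetical, not part of any Azure SDK).

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def run_with_retries(step, retries: int = 3, backoff: float = 0.1):
    """Run one pipeline step, retrying on failure with linear backoff."""
    for attempt in range(1, retries + 1):
        try:
            return step()
        except Exception as exc:
            log.warning("attempt %d failed: %s", attempt, exc)
            if attempt == retries:
                raise  # exhausted retries: surface the error to monitoring
            time.sleep(backoff * attempt)

# Hypothetical flaky step: fails twice, then succeeds on the third call.
calls = {"n": 0}
def flaky_extract():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("source unavailable")
    return ["row1", "row2"]

rows = run_with_retries(flaky_extract)
```

Managed services such as Azure Data Factory provide retry policies and run monitoring as built-in configuration, but the underlying logic is the same as this sketch.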
Student Participation and Learning Outcomes
The workshop saw active participation from students of the MSc Data Science and BTech CSE AIML programmes. Participants engaged in guided exercises, working through real-time data scenarios using Azure services. This practical exposure enhanced their understanding of cloud-based data systems and prepared them for industry-level challenges in data science and engineering roles.
Through this initiative, Alliance University continued its commitment to integrating industry knowledge with academic learning, fostering technical competence and career readiness among students.
Conclusion
The workshop served as a platform for applied learning in the domain of data engineering, with a strong focus on cloud technologies and scalable data systems. The event reinforced the importance of practical skills in data workflows and highlighted the relevance of cloud-based tools in current industry practices.