Shafaq Siddiqi

Data Management Group
Institute of Interactive Systems and Data Science (ISDS)
Computer Science and Biomedical Engineering (CSBME)
Graz University of Technology (TU Graz)
Office: 8010 Graz, Sandgasse 34/II, Room SAL02072
shafaq.siddiqi@tugraz.at

Shafaq Siddiqi is a teaching assistant at the Institute of Interactive Systems and Data Science, Graz University of Technology, Austria. Her research focuses on exploiting data characteristics for cleaning large-scale data. She is currently exploring heterogeneous data preprocessing and the challenges of multi-modal data alignment. She also has expertise in Natural Language Processing (NLP), specifically in processing semantic graphs for entity resolution. She is a PMC member of Apache SystemDS, with contributions that include linear algebra-based cleaning operations and a framework for optimizing cleaning pipelines. Before joining TU Graz, she served as a Lecturer in the Department of Computer Science at Sukkur IBA University (SIBAU), Pakistan.

Publications

2024
  • David Cemernek, Shafaq Siddiqi, Roman Kern: Effects of Class Imbalance Countermeasures on Interpretability, IEEE Access 2024.
  • Shafaq Siddiqi, Roman Kern, Matthias Boehm: SAGA: A Scalable Framework for Optimizing Data Cleaning Pipelines for Machine Learning Applications, SIGMOD 2024, Best Paper Honorable Mention.
2023
  • Shafaq Siddiqi, Faiza Qureshi, Stefanie Lindstaedt, Roman Kern: Detecting Outliers in Non-IID Data: A Systematic Literature Review, IEEE Access 2023.
2020
  • Matthias Boehm, Iulian Antonov, Sebastian Baunsgaard, Mark Dokter, Robert Ginthör, Kevin Innerebner, Florijan Klezin, Stefanie Lindstaedt, Arnab Phani, Benjamin Rath, Berthold Reinwald, Shafaq Siddiqi, Sebastian Benjamin Wrede: SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle, CIDR 2020.
2019
  • Shafaq Siddiqui, M. Abdul Rehman, Sher Muhammad Doudpota, Ahmad Waqas: Ontology Driven Feature Engineering for Opinion Mining, IEEE Access 2019.
2018
  • Ubaidullah Alias Kashif, Zulfiqar Ali Memon, Shafaq Siddiqui, Abdul Rasheed Balouch, Rakhi Batra: Architectural Design of Trusted Platform for IaaS Cloud Computing, IJCAC 2018.
2016
  • Shafaq Siddiqi, Zulfiqar Ali Memon: Internet Addiction Impacts on Time Management That Results in Poor Academic Performance, FIT 2016.

Teaching

Winter 2024
Winter 2023
Winter 2022

Experience

Teaching Assistant, ISDS, CSBME, TU Graz, Graz, Austria Sep 2019 - Present
  • Conducting research and development (R&D) in data management, covering the latest trends and toolsets for data cleaning, transformation, integration, and modelling in the context of big data, IoT, data warehouses, and data lakes
  • Developed a framework for the automatic generation and optimization of data cleaning pipelines for downstream applications, e.g., financial forecasting, regression analysis, and classification
  • Teaching the Data Integration and Large-Scale Analysis (DIA) course
Lecturer (Computer Science), SIBAU, Sukkur, Pakistan Aug 2017 - Aug 2019
  • Taught computer science major courses in undergrad programs
  • Supervised projects specific to Databases, Data Warehousing, Machine Learning & Android applications
Instructor (Computer Science), IBA-IET, Khairpur, Pakistan Nov 2015 - Jul 2017
  • Taught computer science major courses in undergrad and associate degree programs
  • Acted as Academic Coordinator and performed administrative duties

Education

Ph.D. (Computer Science), ISDS, CSBME, TU Graz, Graz, Austria Mar 2020 - Jun 2024
  • Thesis Title: ML-based Data Preprocessing for Large-scale Data
MS (Computer Science), SIBAU, Sukkur, Pakistan Jan 2016 - Dec 2018
  • Thesis Title: Ontology Driven Feature Engineering for Opinion Mining
BS (Computer Science), SIBAU, Sukkur, Pakistan Aug 2011 - May 2015
  • Thesis Title: Sentiment Analysis for Movie Reviews