Shafaq Siddiqi

Data Management Group
Institute of Interactive Systems and Data Science (ISDS)
Computer Science and Biomedical Engineering (CSBME)
Graz University of Technology (TU Graz)
Office: 8010 Graz, Sandgasse 34/II, Room SAL02004
shafaq.siddiqi@tugraz.at

Shafaq Siddiqi is pursuing her Ph.D. in computer science under the supervision of Prof. Stefanie Lindstaedt, Dr. Roman Kern, and Prof. Matthias Boehm (external) at the Graz University of Technology, Austria. Her research focuses on exploiting data characteristics for cleaning large-scale data. She is currently exploring heterogeneous data preprocessing and the challenges of multi-modal data alignment. She also has expertise in Natural Language Processing (NLP), specifically in processing semantic graphs for entity resolution. She is a committer and PMC member of Apache SystemDS, where her contributions include linear algebra-based cleaning operations and a framework for optimizing data cleaning pipelines. Before joining TU Graz, she served as a Lecturer in the Department of Computer Science at Sukkur IBA University (SIBAU), Pakistan.

Publications

2024
  • Shafaq Siddiqi, Roman Kern, Matthias Boehm: SAGA: A Scalable Framework for Optimizing Data Cleaning Pipelines for Machine Learning Applications. SIGMOD 2024.
2023
  • S. Siddiqi, F. Qureshi, S. Lindstaedt and R. Kern: "Detecting Outliers in Non-IID Data: A Systematic Literature Review", IEEE Access, 2023, doi: 10.1109/ACCESS.2023.3294096.
2020
  • Matthias Boehm, Iulian Antonov, Sebastian Baunsgaard, Mark Dokter, Robert Ginthör, Kevin Innerebner, Florijan Klezin, Stefanie Lindstaedt, Arnab Phani, Benjamin Rath, Berthold Reinwald, Shafaq Siddiqi, Sebastian Benjamin Wrede: SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle. CIDR 2020.
2019
  • Shafaq Siddiqui, M. Abdul Rehman, Sher Muhammad Doudpota, Ahmad Waqas: Ontology Driven Feature Engineering for Opinion Mining, (2019), IEEE Access 7, 67392-67401.
2018
  • Ubaidullah Alias Kashif, Zulfiqar Ali Memon, Shafaq Siddiqui, Abdul Rasheed Balouch, Rakhi Batra: Architectural Design of Trusted Platform for IaaS Cloud Computing, (2018), IJCAC 8(2), 47-65.
2016
  • Shafaq Siddiqi, Zulfiqar Ali Memon: Internet Addiction Impacts on Time Management That Results in Poor Academic Performance, FIT 2016, 63-68.

Teaching

Winter 2022
Winter 2023

Experience

Ph.D. Teaching Assistant, ISDS, CSBME, TU Graz, Graz, Austria Sep 2019 - Present
  • Researching the latest trends and toolsets in data management, covering data cleaning, transformation, integration, and modelling in the context of big data, IoT, data warehouses, and data lakes
  • Developed a framework for the automatic generation and optimization of data cleaning pipelines for downstream applications, e.g., financial forecasting, regression analysis, and classification
  • Teaching the Data Integration and Large-Scale Analysis (DIA) course
Lecturer (Computer Science), SIBAU, Sukkur, Pakistan Aug 2017 - Aug 2019
  • Taught computer science major courses in undergraduate programs
  • Supervised projects in databases, data warehousing, machine learning, and Android applications
Instructor (Computer Science), IBA-IET, Khairpur, Pakistan Nov 2015 - Jul 2017
  • Taught computer science major courses in undergraduate and associate degree programs
  • Acted as Academic Coordinator and performed administrative duties

Education

Ph.D. (Computer Science), ISDS, CSBME, TU Graz, Graz, Austria Spring 2020 - Present
  • Thesis Title: Data Preprocessing for Heterogeneous Large Scale Data
MS (Computer Science), SIBAU, Sukkur, Pakistan Jan 2016 - Dec 2018
  • Thesis Title: Ontology Driven Feature Engineering for Opinion Mining
BS (Computer Science), SIBAU, Sukkur, Pakistan Aug 2011 - May 2015
  • Thesis Title: Sentiment Analysis for Movie Reviews