Shafaq Siddiqi

Data Management Group
Institute of Interactive Systems and Data Science (ISDS)
Computer Science and Biomedical Engineering (CSBME)
Graz University of Technology (TU Graz)
Office: 8010 Graz, Inffeldgasse 13/V, 5th Floor

Shafaq Siddiqi is pursuing her Ph.D. in computer science under the supervision of Prof. Matthias Boehm at the Graz University of Technology, Austria. Her research focuses on exploiting data characteristics to clean large-scale data. She is currently exploring heterogeneous data preprocessing and the challenges of multi-modal data alignment. She also has expertise in Natural Language Processing (NLP), specifically in processing semantic graphs for entity resolution. She is also a committer and PMC member of Apache SystemDS, with contributions to linear-algebra-based cleaning operations and a framework for optimizing cleaning pipelines. Before joining TU Graz, she served as a Lecturer in the Department of Computer Science at Sukkur IBA University (SIBAU), Pakistan.


  • Matthias Boehm, Iulian Antonov, Sebastian Baunsgaard, Mark Dokter, Robert Ginthör, Kevin Innerebner, Florijan Klezin, Stefanie Lindstaedt, Arnab Phani, Benjamin Rath, Berthold Reinwald, Shafaq Siddiqi, Sebastian Benjamin Wrede: SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle, CIDR 2020.
  • Shafaq Siddiqui, M. Abdul Rehman, Sher Muhammad Doudpota, Ahmad Waqas: Ontology Driven Feature Engineering for Opinion Mining, (2019), IEEE Access 7, 67392-67401.
  • Ubaidullah Alias Kashif, Zulfiqar Ali Memon, Shafaq Siddiqui, Abdul Rasheed Balouch, Rakhi Batra: Architectural Design of Trusted Platform for IaaS Cloud Computing, (2018), IJCAC 8(2), 47-65.
  • Shafaq Siddiqi, Zulfiqar Ali Memon: Internet Addiction Impacts on Time Management That Results in Poor Academic Performance, FIT 2016, 63-68.


Winter 2022


Ph.D. Teaching Assistant, ISDS, CSBME, TU Graz, Graz, Austria Sep 2019 - Present
  • Researching the latest trends and toolsets in data management, covering data cleaning, transformation, integration, and modelling in the context of big data, IoT, data warehouses, and data lakes
  • Developed a framework for automatic generation and optimization of data cleaning pipelines for downstream applications, e.g., financial forecasting, regression analysis, and classification
  • Teaching the Data Integration and Large-Scale Analysis (DIA) course
Lecturer (Computer Science), SIBAU, Sukkur, Pakistan Aug 2017 - Aug 2019
  • Taught computer science major courses in undergraduate programs
  • Supervised projects on databases, data warehousing, machine learning, and Android applications
Instructor (Computer Science), IBA-IET, Khairpur, Pakistan Nov 2015 - Jul 2017
  • Taught computer science major courses in undergraduate and associate degree programs
  • Acted as Academic Coordinator and performed administrative duties


Ph.D. (Computer Science), ISDS, CSBME, TU Graz, Graz, Austria Spring 2020 - Present
  • Thesis Title: Data Preprocessing for Heterogeneous Large Scale Data
MS (Computer Science), SIBAU, Sukkur, Pakistan     Jan 2016 - Dec 2018
  • Thesis Title: Ontology Driven Feature Engineering for Opinion Mining
BS (Computer Science), SIBAU, Sukkur, Pakistan     Aug 2011 - May 2015
  • Thesis Title: Sentiment Analysis for Movie Reviews