Publications

You can also find our articles on our Google Scholar profile.

2024

  1. Kondo: Efficient Provenance-driven Data Debloating

    Modi, A., Tikmany, R., Malik, T., Komondoor, R., Gehani, A. and D'Souza, D.
    40th IEEE International Conference on Data Engineering (ICDE)
    PDF

2023

  1. Reproducible eScience: The Data Containerization Challenge

    Malik, T.
    IEEE eScience
    PDF
  2. Efficient Differencing of System-level Provenance Graphs

    Nakamura, Y., Kanj, I. and Malik, T.
    32nd ACM International Conference on Information and Knowledge Management (CIKM)
    PDF
  3. Towards Shareable and Reproducible Cloud Computing Experiments

    Malik, T. and Khan, S.
    IEEE CloudSummit
    PDF
  4. Querying Container Provenance

    Modi, A., Reyad, M., Gehani, A., and Malik, T.
    WWW '23 Companion: Companion Proceedings of the ACM Web Conference
    PDF
  5. IOSPReD: I/O Specialized Packaging of Reduced Datasets and Data-Intensive Applications for Efficient Reproducibility

    Niddodi, C., Gehani, A., Malik, T., Mohan, S., and Rilee, M.
    IEEE Access
    PDF

2022

  1. CHEX: Multiversion Replay with Ordered Checkpoints

    Manne, N. N. Satpati, S. Malik, T. Bagchi, A. Gehani, A. Chaudhary, A.
    Proceedings of the VLDB Endowment
  2. Provenance-based Workflow Diagnostics Using Program Specification

    Nakamura, Y. Malik, T. Kanj, I. Gehani, A.
    29th IEEE International Conference on High Performance Computing, Data, and Analytics
  3. Reproducible Notebook Containers using Application Virtualization

    Ahmad, R. Manne, N. Malik, T.
    18th IEEE International Conference on eScience

2021

  1. Artifact Description/Artifact Evaluation: A Reproducibility Bane or a Boon

    Malik, T.
    Proceedings of the 4th International Workshop on Practical Reproducible Evaluation of Computer Systems
  2. On Lowering Merge Costs of an LSM Tree

    That, D. H. T. Gharehdaghi, M. Rasin, A. Malik, T.
    Proceedings of the 33rd International Conference on Scientific and Statistical Database Management
  3. LDI: Learned Distribution Index for Column Stores

    That, D. T. Gharehdaghi, M. Rasin, A. Malik, T.
    2021 IEEE International Conference on Big Data (Big Data)
  4. Reproducibility Practice in High-Performance Computing: Community Survey Results

    Plale, B. A. Malik, T. Pouchard, L. C.
    Computing in Science & Engineering
  5. An Approach for Open and Reproducible Hydrological Modeling using Sciunit and HydroShare

    Choi, Y. Goodall, J. Ahmad, R. Malik, T. Tarboton, D.
    EGU General Assembly Conference Abstracts

2020

  1. Efficient provenance alignment in reproduced executions

    Nakamura, Y. Malik, T. Gehani, A.
    12th International Workshop on Theory and Practice of Provenance (TaPP 2020)
    PDF
  2. Content-defined Merkle Trees for Efficient Container Delivery

    Nakamura, Y. Ahmad, R. Malik, T.
    28th IEEE International Conference on High Performance Computing, Data, & Analytics
    PDF
  3. A taxonomy for reproducible and replicable research in environmental modelling

    Essawy, B. T. Goodall, J. L. Voce, D. Morsy, M. M. Sadler, J. M. Choi, Y. D. Tarboton, D. G. Malik, T.
    Environmental Modelling & Software
    PDF
  4. PROV-CRT: Provenance Support for Container Runtimes

    Ahmad, R. Nakamura, Y. Manne, N. N. Malik, T.
    12th International Workshop on Theory and Practice of Provenance (TaPP 2020)
  5. Documenting computing environments for reproducible experiments

    Chuah, J. Deeds, M. Malik, T. Choi, Y. Goodall, J. L.
    Parallel Computing: Technology Trends
  6. DF-toolkit: interacting with low-level database storage

    Wagner, J. Rasin, A. Heart, K. Malik, T. Grier, J.
    Proceedings of the VLDB Endowment
  7. ODSA: Open Database Storage Access

    Wagner, J. Rasin, A. Malik, T. Grier, J.
    Extending Database Technology (EDBT)
  8. MiDas: Containerizing Data-Intensive Applications with I/O Specialization

    Niddodi, C. Gehani, A. Malik, T. Navas, J. A. Mohan, S.
    Proceedings of the 3rd International Workshop on Practical Reproducible Evaluation of Computer Systems

2019

  1. Report on the first international workshop on incremental re-computation: Provenance and beyond

    Missier, P. Malik, T. Cala, J.
    ACM SIGMOD Record
    PDF
  2. PLI+: Efficient Clustering of Cloud Databases

    That, D. H. T. Wagner, J. Rasin, A. Malik, T.
    Distributed and Parallel Databases
    PDF
  3. SciInc: A Container Runtime for Incremental Recomputation

    Youngdahl, A. Ton-That, D. Malik, T.
    2019 15th International Conference on eScience (eScience)

2018

  1. Leveraging Scientific Cyberinfrastructures to Achieve Computational Hydrologic Model Reproducibility

    Sadler, J. Essawy, B. Goodall, J. Voce, D. Choi, Y. Morsy, M. Yuan, Z. Malik, T.
    AGU Fall Meeting Abstracts
  2. Improving Reproducibility of Distributed Computational Experiments

    Pham, Q. Malik, T. That, D. H. T. Youngdahl, A.
    Proceedings of the First International Workshop on Practical Reproducible Evaluation of Computer Systems
    PDF
  3. Achieving Reproducible Computational Hydrologic Models by Integrating Scientific Cyberinfrastructures

    Essawy, B. T. Goodall, J. L. Morsy, M. M. Zell, W. Sadler, J. Malik, T. Yuan, Z. Voce, D.
    9th International Congress on Environmental Modelling and Software
  4. Integrating scientific cyberinfrastructures to improve reproducibility in computational hydrology: Example for HydroShare and GeoTrust

    Essawy, B. T. Goodall, J. L. Zell, W. Voce, D. Morsy, M. M. Sadler, J. Yuan, Z. Malik, T.
    Environmental Modelling & Software
    PDF
  5. Detecting database file tampering through page carving

    Wagner, J. Rasin, A. Heart, K. Malik, T. Furst, J. Grier, J.
    21st International Conference on Extending Database Technology
    PDF
  6. Using Provenance for Generating Automatic Citations

    Malik, T. Rasin, A. Youngdahl, A.
    10th USENIX Workshop on the Theory and Practice of Provenance (TaPP 2018)
    PDF
  7. Utilizing provenance in reusable research objects

    Yuan, Z. That, D. H. T. Kothari, S. Fils, G. Malik, T.
    Informatics
  8. Where Provenance in Database Storage

    Rasin, A. Malik, T. Wagner, J. Kim, C.
    International Provenance and Annotation Workshop

2017

  1. Cyberinfrastructure to Support Collaborative and Reproducible Computational Hydrologic Modeling

    Goodall, J. L. Castronova, A. M. Bandaragoda, C. Morsy, M. M. Sadler, J. M. Essawy, B. Tarboton, D. G. Malik, T. Nijssen, B. Clark, M. P. Liu, Y. Wang, S.
    AGU Fall Meeting Abstracts
  2. GeoTrust Hub: A Platform For Sharing And Reproducing Geoscience Applications

    Malik, T. Tarboton, D. G. Goodall, J. L. Choi, E. Bhatt, A. Peckham, S. D. Foster, I. That, D. T. Essawy, B. Yuan, Z. Dash, P. Fils, G. Gan, T. Fadugba, O. I. Saxena, A. Valentic, T. A.
    AGU Fall Meeting Abstracts
  3. Sciunits: Reusable Research Objects

    That, D. H. T. Fils, G. Yuan, Z. Malik, T.
    2017 IEEE 13th International Conference on e-Science (e-Science)
  4. Database forensic analysis with DBCarver

    Wagner, J. Rasin, A. Malik, T. Heart, K. Jehle, H. Grier, J.
    CIDR 2017, 8th Biennial Conference on Innovative Data Systems Research
  5. PLI: Augmenting live databases with custom clustered indexes

    Wagner, J. Rasin, A. That, D. H. T. Malik, T.
    Proceedings of the 29th International Conference on Scientific and Statistical Database Management

2016

  1. Ontology-based urban data exploration

    Balasubramani, B. S. Shivaprabhu, V. R. Krishnamurthy, S. Cruz, I. F. Malik, T.
    Proceedings of the 2nd ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics
    PDF
  2. Challenges with Maintaining Legacy Software to Achieve Reproducible Computational Analyses: An Example for Hydrologic Modeling Data Processing Pipelines

    Essawy, B. T. Goodall, J. L. Malik, T. Xu, H. Conway, M. Gil, Y.
    iEMSs Conference
  3. Interactive provenance summaries for reproducible science

    Li, X. Xu, X. Malik, T.
    2016 IEEE 12th International Conference on e-Science (e-Science)

2015

  1. GEN: a database interface generator for HPC programs

    Pham, Q. Malik, T.
    Proceedings of the 27th International Conference on Scientific and Statistical Database Management
    PDF
  2. Sharing and reproducing database applications

    Pham, Q. Thaler, S. Malik, T. Foster, I. Glavic, B.
    Proceedings of the VLDB Endowment
    PDF
  3. Personalized, Shareable Geoscience Dataspaces For Simplifying Data Management and Improving Reproducibility

    Malik, T. Foster, I. Goodall, J. L. Peckham, S. D. Baker, J. B. Gurnis, M.
    AGU Fall Meeting Abstracts
  4. PDACS: a portal for data analysis services for cosmological simulations

    Madduri, R. Rodriguez, A. Uram, T. Heitmann, K. Malik, T. Sehrish, S. Chard, R. Cholia, S. Paterno, M. Kowalkowski, J. Habib, S.
    Computing in Science & Engineering
  5. An invariant framework for conducting reproducible computational science

    Meng, H. Kommineni, R. Pham, Q. Gardner, R. Malik, T. Thain, D.
    Journal of Computational Science
  6. LDV: Light-weight database virtualization

    Pham, Q. Malik, T. Glavic, B. Foster, I.
    2015 IEEE 31st International Conference on Data Engineering

2014

  1. SOLE: towards descriptive and interactive publications

    Malik, T. Pham, Q. Foster, I. T. Leisch, F. Peng, R.
    Implementing reproducible research
    PDF
  2. Benchmarking cloud-based tagging services

    Malik, T. Chard, K. Foster, I.
    2014 IEEE 30th International Conference on Data Engineering Workshops
    PDF
  3. Auditing and maintaining provenance in software packages

    Pham, Q. Malik, T. Foster, I.
    International Provenance and Annotation Workshop
    PDF
  4. GeoBase: indexing NetCDF files for large-scale data analysis

    Malik, T.
    Big data management, technologies, and applications
    PDF
  5. Plenario: An Open Data Discovery and Exploration Platform for Urban Science

    Catlett, C. Malik, T. Goldstein, B. Giuffrida, J. Shao, Y. Panella, A. Eder, D. Zanten, E. v. Mitchum, R. Thaler, S. Foster, I. T.
    IEEE Data Eng. Bull.
  6. GeoDataspaces: Simplifying Data Management Tasks with Globus

    Malik, T. Chard, K. Tchoua, R. B. Foster, I.
    AGU Fall Meeting Abstracts

2013

  1. Sketching distributed data provenance

    Malik, T. Gehani, A. Tariq, D. Zaffar, F.
    Data Provenance and Data Management in eScience
    PDF
  2. Proactive Support for Large-Scale Data Exploration

    Hereld, M. Malik, T. Vishwanath, V.
    2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum
    PDF
  3. Distributed data provenance for large-scale data-intensive computing

    Zhao, D. Shou, C. Malik, T. Raicu, I.
    2013 IEEE International Conference on Cluster Computing (CLUSTER)
    PDF
  4. Using provenance for repeatability

    Pham, Q. Malik, T. Foster, I.
    5th USENIX Workshop on the Theory and Practice of Provenance (TaPP 13)
  5. Lens: a faceted browser for research networking platforms

    Whaling, R. Malik, T. Foster, I.
    2013 IEEE 9th International Conference on e-Science
  6. Towards a provenance-aware distributed filesystem

    Shou, C. Zhao, D. Malik, T. Raicu, I.
    5th Workshop on the Theory and Practice of Provenance (TaPP)

2012

  1. Addressing data access needs of the long-tail distribution of geoscientists

    Malik, T. Foster, I.
    2012 IEEE International Geoscience and Remote Sensing Symposium
    PDF
  2. Wagging the long tail of earth science: Why we need an earth science data web, and how to build it

    Foster, I. Katz, D. S. Malik, T. Fox, P.
    PDF
  3. SOLE: linking research papers with science objects

    Pham, Q. Malik, T. Foster, I. Lauro, R. D. Montella, R.
    International Provenance and Annotation Workshop
    PDF

2011

  1. Policy-based integration of provenance metadata

    Gehani, A. Tariq, D. Baig, B. Malik, T.
    2011 IEEE International Symposium on Policies for Distributed Systems and Networks
    PDF
  2. Improving the efficiency of subset queries on raster images

    Malik, T. Best, N. Elliott, J. Madduri, R. Foster, I.
    Proceedings of the ACM SIGSPATIAL Second International Workshop on High Performance and Distributed Geographic Information Systems
    PDF

2010

  1. JAWS: Job-aware workload scheduling for the exploration of turbulence simulations

    Wang, X. Perlman, E. Burns, R. Malik, T. Budavári, T. Meneveau, C. Szalay, A.
    SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
    PDF
  2. Tracking and sketching distributed data provenance

    Malik, T. Nistor, L. Gehani, A.
    2010 IEEE Sixth International Conference on e-Science
    PDF
  3. Efficient querying of distributed provenance stores

    Gehani, A. Kim, M. Malik, T.
    Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
    PDF
  4. Providing scalable data services in ubiquitous networks

    Malik, T. Prasad, R. Patil, S. Chaudhary, A. Venkatasubramanian, V.
    International Conference on Database Systems for Advanced Applications
  5. RNEDE: Resilient network design environment

    Venkatasubramanian, V. Malik, T. Giridhar, A. Villez, K. Prasad, R. Shukla, A. Rieger, C. Daum, K. McQueen, M.
    2010 3rd International Symposium on Resilient Control Systems
  6. A Dynamic Data Middleware cache for Rapidly-growing Scientific Repositories

    Malik, T. Wang, X. Little, P. Chaudhary, A. Thakar, A.
    ACM/IFIP/USENIX International Conference on Distributed Systems Platforms and Open Distributed Processing

2009

  1. Adaptive physical design for curated archives

    Malik, T. Wang, X. Dash, D. Chaudhary, A. Ailamaki, A. Burns, R.
    International Conference on Scientific and Statistical Database Management
    PDF
  2. Liferaft: Data-driven, batch processing for the exploration of scientific databases

    Wang, X. Burns, R. Malik, T.
    Conference on Innovative Data Systems Research (CIDR)

2008

  1. Rule-based classification systems for informatics

    Krishnamurthy, B. Malik, T. Stamatis, S. Venkatasubramanian, V. Caruthers, J.
    2008 IEEE Fourth International Conference on eScience
    PDF
  2. Large scale data management for the sciences

    Malik, T.
  3. Workload-Aware histograms for remote applications

    Malik, T. Burns, R.
    International Conference on Data Warehousing and Knowledge Discovery
    PDF
  4. Automated physical design in database caches

    Malik, T. Wang, X. Burns, R. Dash, D. Ailamaki, A.
    2008 IEEE 24th International Conference on Data Engineering Workshop
    PDF

2007

  1. A workload-driven unit of cache replacement for mid-tier database caching

    Wang, X. Malik, T. Burns, R. Papadomanolakis, S. Ailamaki, A.
    International Conference on Database Systems for Advanced Applications
    PDF
  2. A Black-Box Approach to Query Cardinality Estimation

    Malik, T. Burns, R. C. Chawla, N. V.
    CIDR
    PDF

2006

  1. Estimating query result sizes for proxy caching in scientific database federations

    Malik, T. Burns, R. Chawla, N. V. Szalay, A.
    SC'06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing
    PDF

2005

  1. Practical passive lossy link inference

    Batsakis, A. Malik, T. Terzis, A.
    International Workshop on Passive and Active Network Measurement
    PDF
  2. Bypass caching: Making scientific databases good network citizens

    Malik, T. Burns, R. Chaudhary, A.
    21st International Conference on Data Engineering (ICDE'05)
    PDF

2002

  1. Web services for the virtual observatory

    Szalay, A. S. Budavári, T. Malik, T. Gray, J. Thakar, A. R.
    Virtual Observatories
  2. The SDSS SkyServer - Public Access to the Sloan Digital Sky Server Data

    Szalay, A. S. Gray, J. Thakar, A. R. Kunszt, P. Z. Malik, T. Raddick, J. Stoughton, C.
    ACM Special Interest Group on Management of Data (SIGMOD)
  3. SkyQuery: A WebService approach to federate databases

    Malik, T. Szalay, A. S. Budavári, T. Thakar, A. R.
    arXiv preprint cs/0211023