Ahmad Abdelfattah
Research Assistant Professor, Innovative Computing Laboratory, University of Tennessee
Verified email at icl.utk.edu
Title
Cited by
Year
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic
A Abdelfattah, H Anzt, EG Boman, E Carson, T Cojean, J Dongarra, A Fox, ...
The International Journal of High Performance Computing Applications 35 (4 …, 2021
Cited by: 167; Year: 2021
Performance, design, and autotuning of batched GEMM for GPUs
A Abdelfattah, A Haidar, S Tomov, J Dongarra
High Performance Computing: 31st International Conference, ISC High …, 2016
Cited by: 136; Year: 2016
High-performance tensor contractions for GPUs
A Abdelfattah, M Baboulin, V Dobrev, J Dongarra, C Earl, J Falcou, ...
Procedia Computer Science 80, 108-118, 2016
Cited by: 76; Year: 2016
High-performance matrix-matrix multiplications of very small matrices
I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ...
Euro-Par 2016: Parallel Processing: 22nd International Conference on …, 2016
Cited by: 68; Year: 2016
Parallel programming models for dense linear algebra on heterogeneous systems
J Dongarra, M Abalenkovs, A Abdelfattah, M Gates, A Haidar, J Kurzak, ...
Supercomputing frontiers and innovations 2 (4), 67-86, 2015
Cited by: 62; Year: 2015
Efficient exascale discretizations: High-order finite element methods
T Kolev, P Fischer, M Min, J Dongarra, J Brown, V Dobrev, T Warburton, ...
The International Journal of High Performance Computing Applications 35 (6 …, 2021
Cited by: 56; Year: 2021
KBLAS: An optimized library for dense matrix-vector multiplication on GPU accelerators
A Abdelfattah, D Keyes, H Ltaief
ACM Transactions on Mathematical Software (TOMS) 42 (3), 1-31, 2016
Cited by: 53; Year: 2016
The design of fast and energy-efficient linear solvers: On the potential of half-precision arithmetic and iterative refinement techniques
A Haidar, A Abdelfattah, M Zounon, P Wu, S Pranesh, S Tomov, ...
International conference on computational science, 586-600, 2018
Cited by: 50; Year: 2018
Fast batched matrix multiplication for small sizes using half-precision arithmetic on GPUs
A Abdelfattah, S Tomov, J Dongarra
2019 IEEE international parallel and distributed processing symposium (IPDPS …, 2019
Cited by: 44; Year: 2019
With extreme computing, the rules have changed
J Dongarra, S Tomov, P Luszczek, J Kurzak, M Gates, I Yamazaki, H Anzt, ...
Computing in Science & Engineering 19 (3), 52-62, 2017
Cited by: 44; Year: 2017
A novel fast and accurate pseudo-analytical simulation approach for MOAO
É Gendron, A Charara, A Abdelfattah, D Gratadour, D Keyes, H Ltaief, ...
Adaptive Optics Systems IV 9148, 2148-2160, 2014
Cited by: 37; Year: 2014
Design, optimization, and benchmarking of dense linear algebra algorithms on AMD GPUs
C Brown, A Abdelfattah, S Tomov, J Dongarra
2020 IEEE High Performance Extreme Computing Conference (HPEC), 1-7, 2020
Cited by: 27; Year: 2020
GPU algorithms for efficient exascale discretizations
A Abdelfattah, V Barra, N Beams, R Bleile, J Brown, JS Camier, R Carson, ...
Parallel Computing 108, 102841, 2021
Cited by: 26; Year: 2021
A set of batched basic linear algebra subprograms and LAPACK routines
A Abdelfattah, T Costa, J Dongarra, M Gates, A Haidar, S Hammarling, ...
ACM Transactions on Mathematical Software (TOMS) 47 (3), 1-23, 2021
Cited by: 26; Year: 2021
A guide for achieving high performance with very small matrices on GPU: a case study of batched LU and Cholesky factorizations
A Haidar, A Abdelfattah, M Zounon, S Tomov, J Dongarra
IEEE Transactions on Parallel and Distributed Systems 29 (5), 973-984, 2017
Cited by: 26; Year: 2017
Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs
A Abdelfattah, A Haidar, S Tomov, J Dongarra
Proceedings of the International Conference on Supercomputing, 1-10, 2017
Cited by: 25; Year: 2017
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices
I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ...
Parallel Computing 81, 1-21, 2019
Cited by: 24; Year: 2019
Fast Cholesky factorization on GPUs for batch and native modes in MAGMA
A Abdelfattah, A Haidar, S Tomov, J Dongarra
Journal of Computational Science 20, 85-93, 2017
Cited by: 24; Year: 2017
Evaluating the performance of NVIDIA’s A100 Ampere GPU for sparse and batched computations
H Anzt, YM Tsai, A Abdelfattah, T Cojean, J Dongarra
2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High …, 2020
Cited by: 22; Year: 2020
C++ API for BLAS and LAPACK
M Gates, P Luszczek, A Abdelfattah, J Kurzak, J Dongarra, K Arturov, ...
SLATE Working Notes, 2017
Cited by: 22*; Year: 2017