
dc.contributor.author: Holm, Håvard Heitlo
dc.contributor.author: Brodtkorb, André R.
dc.contributor.author: Sætra, Martin Lilleeng
dc.date.accessioned: 2023-08-17T11:45:26Z
dc.date.available: 2023-08-17T11:45:26Z
dc.date.created: 2020-01-18T07:30:12Z
dc.date.issued: 2020
dc.identifier.citation: Advances in Parallel Computing. 2020, 36 593-604. [en_US]
dc.identifier.issn: 0927-5452
dc.identifier.uri: https://hdl.handle.net/11250/3084583
dc.description.abstract: In this work, we examine the performance, energy efficiency, and usability when using Python for developing high-performance computing codes running on the graphics processing unit (GPU). We investigate the portability of performance and energy efficiency between Compute Unified Device Architecture (CUDA) and Open Compute Language (OpenCL); between GPU generations; and between low-end, mid-range, and high-end GPUs. Our findings showed that the impact of using Python is negligible for our applications, and furthermore, CUDA and OpenCL applications tuned to an equivalent level can in many cases obtain the same computational performance. Our experiments showed that performance in general varies more between different GPUs than between using CUDA and OpenCL. We also show that tuning for performance is a good way of tuning for energy efficiency, but that specific tuning is needed to obtain optimal energy efficiency. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IOS Press [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.title: Performance and Energy Efficiency of CUDA and OpenCL for GPU Computing using Python [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: publishedVersion [en_US]
dc.rights.holder: © 2020 The authors and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0). [en_US]
dc.source.pagenumber: 593-604 [en_US]
dc.source.volume: 36 [en_US]
dc.source.journal: Advances in Parallel Computing [en_US]
dc.identifier.doi: 10.3390/computation8010004
dc.identifier.cristin: 1776313
dc.relation.project: Norges forskningsråd: 250935 [en_US]
dc.relation.project: Norges forskningsråd: 250935 (GPU Ocean) [en_US]
dc.relation.project: Notur/NorStore: NN9550K [en_US]
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1



Except where otherwise noted, this item is licensed under Attribution 4.0 International.