The Rise of Heterogeneous Accelerated Computing: Don't Focus Only on Compute Chips

Treesa

August 22, 2023, 17:36

Original title: Why SYCL: Elephants in the SYCL Room

By James Reinders and Michael Wong

Excerpted from: https://www.hpcwire.com/2022/02/03/why-sycl-elephants-in-the-sycl-room/

Commentary — In the second of a series of guest posts on heterogeneous computing, James Reinders, who returned to Intel last year after a short “retirement,” follows up on his piece about how SYCL will contribute to a heterogeneous future for C++. He is joined by Michael Wong, of Codeplay Software Ltd., who is also the current SYCL committee chair. Together, they offer their responses to what might be called the ‘Elephants in the SYCL Room.’

The case for C++ programming, with SYCL bringing in full heterogeneous support, has been well articulated by people close to the SYCL specification, including in a recent article, “Considering a Heterogeneous Future for C++,” and in numerous other resources listed on sycl.tech. SYCL is a Khronos standard that introduces support for fully heterogeneous data parallelism to C++. While SYCL is not a cure-all, it is a solution to one aspect of a larger problem: How do we adequately enable full heterogeneous programming given the emerging explosion in hardware diversity?

In this article, we offer our perspective on key questions about SYCL, based on decades of work in this domain. These important questions are asked by software developers looking to understand whether SYCL matters to them. Let’s face it: at some point, every major project has Elephants in the Room.[1] Successful projects address their elephants openly.

Elephant 1: Aren’t GPUs enough? Do other accelerators really matter?

Valid questions exist about which accelerators will stay, and which will be a passing fad. For decades, different accelerators have come and gone while CPUs persist. Today, GPUs are present in the vast majority of computer systems. Writing our applications to leverage GPUs makes a lot of sense given their near ubiquity.

As a result, one of the first elephant questions is whether we really need to generalize, i.e., do we need to be multiarchitecture and multivendor?

Experts expect “dedicated or semi-dedicated hardware accelerators” to be a must-have feature for computing in this decade. They include researchers led by Prof. Masaaki Kondo, in “White Paper on Next-Generation Advanced Computing Infrastructure,” and Hennessy & Patterson, in their paper “A New Golden Age for Computer Architecture.”

As long as we are talking about dedicated accelerators, why stop at GPUs? Optimizing for different types of accelerators is a great objective, but we don’t want to write different code for different types of accelerators. We believe that the industry will benefit from a standardized language that everyone can contribute to and collaborate on, that is not locked into a particular vendor, and that can evolve organically based on the requirements of its members and the public.

SYCL takes an interesting approach that allows us to use common code when we want and specialize when we want. In this way, SYCL embraces accelerators in general, leaving it to us, the developers, to decide when to write common cross-architecture code, and when we feel it is sufficiently advantageous to specialize code.
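
The idea of using common code where we want and specialized code where we want can be sketched in plain standard C++. This is only an illustration under stated assumptions: it does not use the SYCL API, the names `DeviceKind`, `scale_generic`, `scale_unrolled4`, and `scale` are hypothetical, and the four-way unrolling merely stands in for whatever tuning a particular architecture might reward.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device categories for this sketch (not a SYCL type).
enum class DeviceKind { cpu, gpu, fpga };

// Common cross-architecture code: correct on every device.
inline void scale_generic(std::vector<float>& data, float factor) {
    for (float& value : data) value *= factor;
}

// Specialized variant: handles four elements per step, standing in for a
// version tuned to one architecture (an assumption made for illustration).
inline void scale_unrolled4(std::vector<float>& data, float factor) {
    const std::size_t n = data.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        data[i] *= factor;
        data[i + 1] *= factor;
        data[i + 2] *= factor;
        data[i + 3] *= factor;
    }
    assert(n - i < 4);                     // at most three elements remain
    for (; i < n; ++i) data[i] *= factor;  // remainder loop
}

// The developer decides where specialization pays off; everything else
// falls back to the common code path.
inline void scale(DeviceKind kind, std::vector<float>& data, float factor) {
    if (kind == DeviceKind::gpu) {
        scale_unrolled4(data, factor);
    } else {
        scale_generic(data, factor);
    }
}
```

Both paths must produce the same result; only the way the work is carried out differs, which is what lets the specialized path be added or removed without touching callers.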

Its underlying programming model, SPMD (single program, multiple data), has been shown to be usable across many architectures. SPMD is how most programmers using Nvidia CUDA, OpenCL, or SYCL think: they write code from the perspective of operating on one work-item and expect it to run concurrently on most hardware, such that multiple work-items fill vector hardware lanes.
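
To make the SPMD mindset concrete, here is a minimal sketch in plain standard C++, not SYCL. The kernel `saxpy_item` is written from the viewpoint of a single work-item, and the hypothetical helper `for_each_work_item` stands in for a parallel launch. It runs the work-items one after another so the example stays portable; a real runtime would be free to run them concurrently or map them onto vector lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kernel written from the point of view of ONE work-item: it only handles
// the element at index i.
inline void saxpy_item(std::size_t i, float a,
                       const std::vector<float>& x, std::vector<float>& y) {
    y[i] = a * x[i] + y[i];
}

// Hypothetical stand-in for a parallel launch: invoke the kernel once per
// work-item. This sketch runs them sequentially; because the work-items are
// independent, a real runtime could execute them in any order or in parallel.
template <typename Kernel>
void for_each_work_item(std::size_t count, Kernel kernel) {
    for (std::size_t i = 0; i < count; ++i) kernel(i);
}

// Launch one work-item per element of y.
inline void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for_each_work_item(y.size(),
                       [&](std::size_t i) { saxpy_item(i, a, x, y); });
}
```

The key design point is that `saxpy_item` never loops over the data itself; how many work-items exist and where they run is decided entirely by the launch.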

SYCL offers a large degree of portability across vendors (e.g., many different sources of GPUs) as well as architecture (e.g., GPUs, FPGAs, ASICs).

Elephant 2: Why not just use Nvidia CUDA?

A vibrant GPU ecosystem is emerging thanks to competition from multiple GPU vendors. This is part of a trend toward more and more competition for accelerators in general. The installed base of CUDA applications that make use of Nvidia GPUs is poised to adapt over time to an open, multivendor, multiarchitecture software approach created to serve all vendors, not just Nvidia.

While CUDA has earned a strong following given its value proposition and the strength of Nvidia GPUs in the ecosystem, there are increasing concerns regarding the lock-in that use of CUDA creates. Such concerns stem from the proprietary focus highlighted by these factors:

1. The definition of CUDA, along with its implementation and evolution, is managed by Nvidia and evolves specifically to serve Nvidia GPU product designs. Details of new features in CUDA are generally shielded from public view until Nvidia has both hardware and software to support them. As discussed more fully below, this control stifles innovation from other vendors.
2. The licensing for CUDA tools and libraries from Nvidia specifically states that they must be used to “develop applications only for use in systems with Nvidia GPUs.” Even “open source” from Nvidia includes licensing language restricting key parts in the same manner.

Nvidia CUDA can claim credit for bringing accelerated computing to the masses using Nvidia GPUs. With the explosion of competition in the accelerator market, it could appear that CUDA has become a walled garden in an increasingly open and transparent world. The desire for an open, multivendor, multiarchitecture alternative to CUDA is not going away.

Elephant 3: Why not just use AMD HIP?

AMD Heterogeneous-Computing Interface for Portability (HIP) is a C++ dialect. AMD tools include a “HIPify tool” to help transform CUDA code into HIP. AMD states that “HIP code can run on AMD hardware (through the HCC compiler) or Nvidia hardware (through the NVCC compiler) with no performance loss compared with the original CUDA code.”

HIP is a “follow CUDA” strategy: AMD develops an update to HIP as quickly as possible after Nvidia has released an update to its CUDA platform. The arguments in favor of HIP rest on the virtue of reusing a large CUDA codebase on AMD GPUs. Unfortunately, given the opaqueness of CUDA, no one can follow it closely, promptly, or accurately enough. This also leaves AMD no opportunity to expose unique AMD hardware innovation without forcing CUDA developers to change their code with #ifdefs for AMD GPUs.

While AMD has created value with HIP for those that seek AMD GPUs as an alternative to Nvidia GPUs, it is not hard to want more. Imagine having a solution that can keep pace with the feature innovation and performance of CUDA!

We believe that innovation will flourish the most in an open field rather than in the shadows of a walled garden.

[Editor’s note: There is a SYCL implementation called hipSYCL that sits on top of HIP and targets AMD GPUs running ROCm and Nvidia GPUs.]

Elephant 4: Why not just use OpenCL?

OpenCL provides an open multivendor alternative, but at a lower layer of the software stack than SYCL or CUDA. SYCL grew out of a desire to build on the benefits of OpenCL’s open, multivendor, multiarchitecture approach by providing a standard C++ interface for heterogeneous parallel architectures. SYCL implementations often use OpenCL as their backend, but as of SYCL 2020 they also have the flexibility to use other backends under the hood. SYCL delivers on the promise of OpenCL in a higher-productivity form through its C++ abstractions.

Elephant 5: Can’t we just use C++?

Let’s start with the assumption that we want to program heterogeneous machines, we value portability, and we do not want to pay a penalty in performance for portability.

We might answer “yes” – C++ is enough when you have SYCL support too. After all, C++ was built to be extended by template libraries like SYCL. SYCL adds no new keywords, but it does benefit from SYCL-aware C++ compilers to help with cross-compilation, fat binaries, and remote memories. Those are simply things C++ compilers have not historically made easy.

SYCL also offers a solution today, within standard C++, to address programming for full heterogeneous computing built on top of ISO C++. This includes device enumeration (info), defining work (kernels), submitting and coordinating work across devices (queue), and managing remote memories.
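
The four ingredients listed above (device enumeration, kernels, a queue, and remote memory) can be mimicked with a toy model in plain standard C++. This is a sketch for intuition only, not the SYCL API: in SYCL itself these roles are played by classes such as `sycl::device` and `sycl::queue`, whereas `ToyDevice`, `ToyQueue`, and `select_device` below are hypothetical names, and the "device memory" is just a separate vector that the host can only reach through explicit copies.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an accelerator that owns its own memory.
struct ToyDevice {
    std::string name;
    std::vector<float> memory;  // "remote" memory, separate from host data
};

// Device enumeration: return the device with the preferred name, or fall
// back to the first device in the list.
inline ToyDevice& select_device(std::vector<ToyDevice>& devices,
                                const std::string& preferred) {
    assert(!devices.empty());
    for (ToyDevice& device : devices) {
        if (device.name == preferred) return device;
    }
    return devices.front();
}

// Toy in-order queue: commands are recorded when submitted and executed,
// in submission order, when wait() is called.
class ToyQueue {
public:
    explicit ToyQueue(ToyDevice& device) : device_(device) {}

    // Host -> device copy. The host vector must outlive the call to wait().
    void copy_to_device(const std::vector<float>& host) {
        commands_.push_back([this, &host] { device_.memory = host; });
    }

    // Run the kernel once per element held in device memory.
    void parallel_for(std::function<void(float&)> kernel) {
        commands_.push_back([this, kernel = std::move(kernel)] {
            for (float& value : device_.memory) kernel(value);
        });
    }

    // Device -> host copy. The host vector must outlive the call to wait().
    void copy_to_host(std::vector<float>& host) {
        commands_.push_back([this, &host] { host = device_.memory; });
    }

    // Execute every recorded command in order, then clear the queue.
    void wait() {
        for (auto& command : commands_) command();
        commands_.clear();
    }

    std::size_t pending() const { return commands_.size(); }

private:
    ToyDevice& device_;
    std::vector<std::function<void()>> commands_;
};
```

A caller would pick a device with `select_device`, construct a `ToyQueue` on it, submit a copy in, a kernel, and a copy out, and then call `wait()`. Nothing touches the data until `wait()` runs, which loosely mirrors how submitting work and observing its results are separate steps in a queue-based model.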

That brings us to “No” – the C++ standard does not define support for heterogeneous systems with disjoint (non-coherent) memories. Some think it will add that one day, and there is effort in that direction, but even those involved believe the current direction will take at least 10 years, and it is limited by the need for C++ to maintain backward compatibility with millions of lines of existing code. In fact, one of us (MW) has written papers urging C++ in that direction. The response from WG21 (ISO C++), understandable given the backward compatibility concerns, has been to start with parallel algorithms and executors and to add forward progress guarantees, instead of making radical surgical changes to the memory and addressing model. Therefore, if you are programming heterogeneous machines, C++ alone is not likely to be enough. Some are trying to move in that direction, and that is the beauty of a competitive industry: we can see what will work out in the best interest of the market and consumers. However, what works today is “C++ plus SYCL” or “C++ plus CUDA” or “C++ plus OpenCL.”

The purpose of adding SYCL support to our C++ compilers and runtimes is to give C++ the full heterogeneous support that it does not offer today without SYCL. It is also a way to show how C++ can support heterogeneity in the future, as ISO standards tend to standardize established best practice. We will show one such example below.

Elephant 6: Can SYCL queues make it into ISO C++?

Queues are how SYCL assigns work to heterogeneous devices, including handing off data within complex memory systems (not necessarily unified and coherent).

It is easy to speculate on whether a queue class belongs in C++ long-term, but such speculation is premature.

Proposals for C++23 have included various constructs to direct execution to specific devices, including “std::execution” in P2300. We know C++23 will continue to rely on a unified global memory address space and will not support disjoint remote memories (complex memory systems).

It is easy to get caught up on syntax. Eventually, if C++ expands to include full heterogeneous support, the concepts embodied in the SYCL queue will be needed. Until then, SYCL fills this void. Some important capabilities, such as parallel directives and message passing, have remained independent standards (OpenMP and MPI). While it is possible C++ will not grow to include full heterogeneous support, we believe C++ will eventually add such support incrementally.

C++ aims to standardize established best practice instead of inventing new and unproven features. SYCL is therefore an important stepping stone, one of the many feeders of “established best practice” into the intentionally slower-moving C++ standardization process.

As C++23 settles and C++26 is considered, the future of C++ for heterogeneous computing, including its syntax, will begin to take shape, but a full solution will likely not emerge for another 5-10 years.

SYCL offers a solution today, within standard C++, to address programming for full heterogeneous computing. This includes device enumeration (info), defining work (kernels), submitting work to devices (queue), and managing remote memories.

Elephant 7: Who is behind SYCL? Is it really Open in the true sense of the word?

We believe that open, international standards and Open Source Software (OSS) projects are good for everyone. When individuals from Intel and Codeplay get involved, we have found that they work hard to help develop and promote such standards and OSS – from WiFi, USB, PCIe to OpenMP, MPI, Fortran, C, C++, OpenCL, and SYCL.

Apple was the original force behind OpenCL, which began as a set of fairly low-level C interfaces. SYCL originally grew out of efforts within OpenCL to consider higher-level interfaces, specifically using C++. After multiple years of very open debates, SYCL was born. Codeplay has been instrumental in SYCL from the very beginning. Intel’s interest in SYCL grew after it both entered the FPGA market and announced that the Intel Xe architecture would include GPUs for compute. Intel is proud to be an active member of the SYCL committee and an active contributor to implementations that support SYCL. SYCL is a community effort, and the homes of both authors of this article (Intel and Codeplay) are enthusiastic participants along with many others.

Elephant 8: I see a herd of elephants – why should I believe in SYCL?

If you have not yet needed to program an application for multiple heterogeneous machines, you may not yet have felt the pain that explains why we are so excited about the prospects for SYCL. Questioning the need is quite logical.

There are many use cases for heterogeneous programming. In our CppCon 2021 tutorial, we taught programmers from large companies, small companies, and national labs how to offload high-throughput workloads to specialized accelerators.

Based on many experiences like that, we have every reason to be confident that interest in SYCL will continue to grow at a rapid pace because of the need for C++ programming for heterogeneous platforms.

If you believe in the power of diversity of hardware and want to harness the impending explosion in architectural diversity, then SYCL is worth a look. Not only is it an open, multivendor, multiarchitecture play, it is the key one for C++ programmers (as detailed in “Considering a Heterogeneous Future for C++”).

Open, Industry Standards are Critical to Enable High-Volume Markets

New technology often starts as proprietary developments, which may be sufficient to enable niche applications and markets. But, as these niche applications grow into technology ecosystems, so does the need for competition and industry standardization to enable widespread adoption. Accelerated computing, for many years only a niche capability, has certainly emerged with the status of “here to stay.” Multiple factors contributed to this, and they are not all going away (power wall, IPC wall, memory wall).

SYCL and related efforts, like oneAPI, were introduced to bring open, industry standards to the historically proprietary universe of accelerated computing.

The biggest question is: how many influencers are eager to promote a move to standards, vs. how many are locked up by proprietary interests?

As the Cambrian explosion of novel computer architectures unfolds, the case for open, multivendor, multiarchitecture standards only grows stronger.

SYCL is an open standard that invites feedback and contributions from everyone to the standard and the open source projects to implement it. The shared goal by everyone involved is to unambiguously ensure paths to high performance for all accelerators in this exciting new golden age for computer architecture.

About the Authors

James Reinders believes the full benefits of the evolution to full heterogeneous computing will be best realized with an open, multivendor, multiarchitecture approach. Reinders rejoined Intel a year ago, specifically because he believes Intel can meaningfully help realize this open future. Reinders is an author (or co-author and/or editor) of ten technical books related to parallel programming; his latest book is about SYCL (it can be freely downloaded here).

Michael Wong is the Distinguished Engineer at Codeplay Software. He is a current Director and VP of ISOCPP Foundation, and a senior member of the C++ Standards Committee with more than 25 years of experience. He is a member of the C++ Directions Group. He chairs the WG21 SG19 Machine Learning and SG14 Games Development/Low Latency/Financials C++ groups and is the co-author of a number of C++/OpenMP/Transactional memory features including generalized attributes, user-defined literals, inheriting constructors, weakly ordered memory models, and explicit conversion operators. He has published numerous research papers and is the author of a book on C++11. He has been an invited speaker and keynote at numerous conferences. He is currently the editor of SG1 Concurrency TS and SG5 Transactional Memory TS. He is also the Chair of the SYCL standard and all Programming Languages for Standards Council of Canada. Previously, he was CEO of OpenMP involved with taking OpenMP toward Accelerator support and the Technical Strategy Architect responsible for moving IBM’s compilers to Clang/LLVM after leading IBM’s XL C++ compiler team.

[1] Elephants in the Room can be defined as important questions that are obvious, but no one mentions them because they make at least some persons uncomfortable.

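Since you have read this far, let us add a few words of our own.

From the perspective of (domestic Chinese) chip companies, there is neither the desire nor the willingness to consider that users may need to write applications for multiple heterogeneous machines. Yet this is what the market needs, and revolutionary ideas of this kind will only come from third parties.
I know that Codeplay was wholly acquired by Intel this year. But does China have the soil in which such a company can survive? Companies doing foundational software R&D, such as 澎峰科技 (PerfXLab) and 一流科技 (OneFlow), are among the few sparks China has seen in recent years. If even they cannot survive, what hope is there for China's computing industry? I also hope investors will not distort these small but excellent software companies and will instead help them, so that everyone can succeed together.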