Support for heterogeneous acceleration architecture #3811

HPFilter · 2022-03-19T08:27:15Z

Is your feature request related to a problem? Please describe.

FATE's security depends on several privacy-preserving computing techniques, and the corresponding cryptographic operations are time-consuming. The bottleneck is the arithmetic on ciphertexts, which are usually large integers. It is hard for a CPU to achieve high performance without changing the underlying algorithm, so it is worthwhile to support other types of computing devices for these calculations.

Describe the solution you'd like

This proposal aims to let FATE support a heterogeneous acceleration architecture, including the APIs, data structures, and other components needed to combine FATE efficiently with hardware devices.

Unlike CPUs, hardware devices such as FPGAs and GPUs are designed around SIMD-style architectures. Their high degree of computing parallelism lets them achieve considerable throughput, so they are promising for accelerating the compute-intensive calculations in FATE. In this proposal, using such devices to assist FATE is called heterogeneous acceleration.

To leverage hardware-based accelerators successfully, a three-layer architecture is generally needed.

The lowermost layer is a library that implements the cryptographic operations by managing and calling the devices. It must be extensible, since both the set of operations and the variety of devices are expected to grow. To interact efficiently with the hardware, the library is usually written in C/C++, so an effective cross-language binding is also required to make it accessible from FATE.
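To make the extensibility requirement concrete, here is a minimal sketch of how such a library could expose one operation (batched modular exponentiation) behind a device registry. All names (`register_backend`, `powmod`, the `"cpu"` key) are hypothetical and not part of FATE; the CPU path is a pure-Python stand-in for what would normally be a C/C++ kernel reached through a binding.

```python
from typing import Callable, Dict, List

# Hypothetical kernel signature: (bases, exponents, modulus) -> results.
Kernel = Callable[[List[int], List[int], int], List[int]]

# Hypothetical registry mapping a device name to its batched kernel.
_BACKENDS: Dict[str, Kernel] = {}


def register_backend(name: str) -> Callable[[Kernel], Kernel]:
    """Decorator that registers a kernel under a device name."""
    def decorator(fn: Kernel) -> Kernel:
        _BACKENDS[name] = fn
        return fn
    return decorator


@register_backend("cpu")
def cpu_powmod(bases: List[int], exponents: List[int], modulus: int) -> List[int]:
    # Reference path in pure Python. A GPU or FPGA backend would instead
    # pass the whole batch to native code through a cross-language binding.
    return [pow(b, e, modulus) for b, e in zip(bases, exponents)]


def powmod(bases: List[int], exponents: List[int], modulus: int,
           device: str = "cpu") -> List[int]:
    """Dispatch a batched modular exponentiation to the chosen device."""
    if len(bases) != len(exponents):
        raise ValueError("bases and exponents must have the same length")
    if device not in _BACKENDS:
        raise ValueError(f"no backend registered for device {device!r}")
    return _BACKENDS[device](bases, exponents, modulus)
```

Supporting a new device would then only mean registering another kernel under a new name; callers of `powmod` would not change.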

The middle layer acts as the middleware of the architecture. It defines the data structures used to store and transfer the data for cryptographic operations. As mentioned above, the high throughput of hardware devices comes from high parallelism, which means that operating on a single item of data at a time wastes most of the device's capacity. To maximize device utilization and minimize the overhead of cross-device interaction, data should be stored in blocks and processed by the device in parallel. A memory layout that is both transfer-friendly and compute-friendly is the key point in the design of the data structure.
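As one illustration of a transfer-friendly layout, the sketch below packs a batch of large integers into a single contiguous buffer of fixed-width little-endian fields, so the whole block can be copied to a device in one transfer. The function names and the fixed-width choice are assumptions made for this example, not a layout defined by FATE.

```python
from typing import List


def pack_block(values: List[int], width_bytes: int) -> bytes:
    """Pack non-negative integers into one contiguous little-endian buffer.

    Every value occupies exactly width_bytes, so element i starts at offset
    i * width_bytes. int.to_bytes raises OverflowError if a value is negative
    or does not fit in width_bytes.
    """
    return b"".join(v.to_bytes(width_bytes, "little") for v in values)


def unpack_block(buffer: bytes, width_bytes: int) -> List[int]:
    """Recover the list of integers from a buffer built by pack_block."""
    if width_bytes <= 0 or len(buffer) % width_bytes != 0:
        raise ValueError("buffer length is not a multiple of width_bytes")
    return [
        int.from_bytes(buffer[i:i + width_bytes], "little")
        for i in range(0, len(buffer), width_bytes)
    ]
```

A fixed width trades some padding for constant-stride indexing, which is what lets each device thread locate its own element without a lookup table.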

The uppermost layer consists of computing APIs similar to the current APIs in FATE. These APIs overload the cryptographic operations for the data structure defined in the middle layer, so they can be called without much modification to FATE. Besides the calculations themselves, a minimal but necessary data format conversion is required before the operations run, in order to construct the well-defined data structure.
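The overloading idea can be sketched as follows, assuming Paillier as the cryptosystem (where adding plaintexts corresponds to multiplying ciphertexts modulo n squared). `ToyPaillierKey` and `CipherBlock` are hypothetical names, and the tiny primes are for illustration only and offer no security. In a real backend, the list comprehension in `__add__` is where a batched device call would go.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToyPaillierKey:
    """Toy Paillier key pair with generator g = n + 1 (illustration only)."""
    p: int
    q: int

    @property
    def n(self) -> int:
        return self.p * self.q

    @property
    def n_sq(self) -> int:
        return self.n * self.n

    def encrypt(self, m: int) -> int:
        # Draw a random r that is coprime with n.
        while True:
            r = random.randrange(1, self.n)
            if math.gcd(r, self.n) == 1:
                break
        # c = (1 + n)^m * r^n mod n^2, using (1 + n)^m = 1 + m*n mod n^2.
        return (1 + m * self.n) * pow(r, self.n, self.n_sq) % self.n_sq

    def decrypt(self, c: int) -> int:
        lam = (self.p - 1) * (self.q - 1) // math.gcd(self.p - 1, self.q - 1)
        mu = pow(lam, -1, self.n)  # modular inverse, Python 3.8+
        return (pow(c, lam, self.n_sq) - 1) // self.n * mu % self.n


class CipherBlock:
    """A block of ciphertexts that supports element-wise homomorphic addition."""

    def __init__(self, ciphertexts: List[int], n_sq: int) -> None:
        self.ciphertexts = list(ciphertexts)
        self.n_sq = n_sq

    def __add__(self, other: "CipherBlock") -> "CipherBlock":
        if not isinstance(other, CipherBlock):
            return NotImplemented
        if self.n_sq != other.n_sq or len(self.ciphertexts) != len(other.ciphertexts):
            raise ValueError("blocks must share a key and have equal length")
        # Homomorphic addition: multiply ciphertexts modulo n^2.
        summed = [a * b % self.n_sq
                  for a, b in zip(self.ciphertexts, other.ciphertexts)]
        return CipherBlock(summed, self.n_sq)
```

Usage under these assumptions: with `key = ToyPaillierKey(293, 433)`, encrypting `[1, 2, 3]` and `[10, 20, 30]` into two blocks `x` and `y` and decrypting each element of `x + y` recovers `[11, 22, 33]`, while the calling code only ever writes `x + y`.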

Furthermore, a few minor changes are required to fit the architecture above. For example, additional training parameters are needed so that the user can specify the hardware configuration, including the device type and the number of devices.
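A possible shape for such parameters is sketched below. The class name, field names, and the list of accepted device types are all hypothetical and only illustrate the two settings mentioned above (type and number).

```python
from dataclasses import dataclass

# Hypothetical set of device types, chosen for this example only.
SUPPORTED_DEVICES = ("cpu", "gpu", "fpga")


@dataclass(frozen=True)
class DeviceParam:
    """Hypothetical training parameter describing the accelerator setup."""
    device_type: str = "cpu"
    device_count: int = 1

    def __post_init__(self) -> None:
        # Validate eagerly so a bad configuration fails before training starts.
        if self.device_type not in SUPPORTED_DEVICES:
            raise ValueError(f"unsupported device type: {self.device_type!r}")
        if self.device_count < 1:
            raise ValueError("device_count must be at least 1")
```

Defaulting to a single CPU would keep existing job configurations working unchanged.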

luhang-HPU · 2022-05-25T12:51:20Z

Is it possible to let FATE leverage an FPGA accelerator? What is the API, and which part of the source code handles this?

sagewe · 2022-05-25T15:21:04Z

> Is it possible to let FATE leverage an FPGA accelerator? What is the API, and which part of the source code handles this?

We are working on this, but the current implementation is at a very early stage.
We plan to provide a preliminary implementation of the CPU/GPU backend in the next release.
FPGA backend support is not currently in our plans, but we will leave an interface that is friendly enough for users who want to implement it themselves.

github-actions · 2024-07-09T06:11:18Z

This issue has been marked as stale because it has been open for 365 days with no activity. If this issue is still relevant or if there is new information, please feel free to update or reopen it.

github-actions · 2024-07-10T06:27:40Z

This issue was closed because it has been inactive for 1 day since being marked as stale. If this issue is still relevant or if there is new information, please feel free to update or reopen it.

zchunhai mentioned this issue Mar 25, 2022

Hardware Accleration Support FederatedAI/FATE-Community#33

Merged

dylan-fan assigned dylan-fan and sagewe Mar 29, 2022

dylan-fan added the arch Architecture related label Mar 29, 2022

sagewe linked a pull request Jul 20, 2022 that will close this issue

feat: implement paillier tensor block for GPU & FPGA #4123

Open

github-actions bot added the stale label Jul 9, 2024

github-actions bot closed this as completed Jul 10, 2024
