Introduction
This book contains guides and reference information about Fletcher platform support for the Open Programmable Acceleration Engine (OPAE).
- Fletcher is a framework for integrating FPGA accelerators with Apache Arrow.
- OPAE is a software framework for managing and accessing programmable accelerators (FPGAs).
- Apache Arrow is a cross-language development platform for in-memory analytics.
Setup
This section describes how to set up a development environment to start using Fletcher and OPAE.
Support
Support is limited. For now, only tested operating systems and devices are listed.
Operating systems
- CentOS 7.7
Devices
- Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA
- Not working
- Local memory support
- Networking interface
- Not working
Host system setup
This subsection describes the required setup steps for the host system.
Docker
The development environment is based on a Docker image provided by this project. It includes all the required tools and components to:
- Generate Fletcher projects
  - fletchgen
  - vhdmmio
  - pyarrow
- Build Fletcher projects
  - C++11 compiler
  - Apache Arrow 1.0+
- Hardware/software co-simulation
  - Modelsim
  - OPAE ASE
- Generate bitstreams
  - Quartus
  - Updated Platform Interface Manager
  - PACSign
- Update bitstream on FPGA
  - fpgaconf
- Run on hardware
  - Fletcher runtime
  - Fletcher OPAE platform support
  - OPAE library
Install the latest stable version of Docker.
Driver
If you have access to a supported device and want to run on hardware, start by installing the Intel FPGA driver.
sudo yum install -y https://github.com/OPAE/opae-sdk/releases/download/1.4.0-1/opae-intel-fpga-driver-2.0.4-2.x86_64.rpm
Validate that the driver installed successfully and is loaded.
$ lsmod | grep fpga
intel_fpga_pac_hssi 24389 0
intel_fpga_fme 87452 0
intel_fpga_afu 36165 0
ifpga_sec_mgr 13757 1 intel_fpga_fme
fpga_mgr_mod 14812 1 intel_fpga_fme
intel_fpga_pci 26500 2 intel_fpga_afu,intel_fpga_fme
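The manual check above can also be scripted. The following is a minimal sketch, not part of this project: the helper names are hypothetical, and the set of expected modules is simply copied from the example output above, so it may need adjusting for other driver versions.

```python
import subprocess

# Module names copied from the example lsmod output above (an assumption:
# other driver versions may load a different set of modules).
EXPECTED_MODULES = {
    "intel_fpga_pci",
    "intel_fpga_fme",
    "intel_fpga_afu",
    "fpga_mgr_mod",
}


def loaded_modules(lsmod_output):
    """Return the module names found in the first column of lsmod output."""
    names = set()
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def missing_modules(lsmod_output, expected=EXPECTED_MODULES):
    """Return the expected modules that do not appear in the lsmod output."""
    return set(expected) - loaded_modules(lsmod_output)


def check_host():
    """Run lsmod on this host and return the set of missing modules."""
    result = subprocess.run(["lsmod"], capture_output=True, text=True, check=True)
    return missing_modules(result.stdout)
```

Calling `check_host()` on the host returns an empty set when all expected modules are loaded.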
Updating the FIM and BMC firmware
TODO
Development environment setup
This subsection describes how to build the development environment image for this project.
Get the Dockerfile
Download the Dockerfile or clone the repository.
git clone https://github.com/teratide/fletcher-opae
cd fletcher-opae/
Build the image.
docker build -t ias:1.2.1 - < Dockerfile
That's it.
Sum example
This section explains how to use the sum example.
Prepare
Project structure
Like most Fletcher projects, this example project contains software (host application) and hardware (accelerator functional unit) sources.
Software
The host application consists of a single C++ source file that interacts with the accelerator. A CMake file is included to build the application. Build instructions are available in the Simulate and Hardware sections.
Hardware
The hardware directory contains a number of important files.
- sv
  - ofs_plat_afu.sv: the platform-specific top-level that wraps the top-level generated by Fletcher.
- vhdl
  - Sum.vhd: the sum kernel, taken (without modification) from the Fletcher project.
- generate-input.py: an example Python script that generates an Arrow recordbatch file with some numbers that the accelerator will add.
- sources.txt: a list of sources that the tools use to discover the required files.
- sum.json: contains information about the accelerator that is used by the tools. The accelerator-type-uuid is used by the host application for discovery.
- sum.mmio.yml: a custom vhdmmio input file. Needed because Fletchgen currently does not support generating compatible MMIO files for this platform.
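To illustrate how the accelerator-type-uuid can be read from the accelerator description, here is a minimal sketch. It assumes the usual OPAE AFU JSON layout, with the UUID stored under afu-image / accelerator-clusters; the function name and the placeholder UUID are hypothetical, so check sum.json for the real value.

```python
import json


def accelerator_uuids(afu_json_text):
    """Collect every accelerator-type-uuid listed under accelerator-clusters."""
    description = json.loads(afu_json_text)
    clusters = description.get("afu-image", {}).get("accelerator-clusters", [])
    return [
        cluster["accelerator-type-uuid"]
        for cluster in clusters
        if "accelerator-type-uuid" in cluster
    ]


# Hypothetical file content: the UUID is a placeholder, not the one in sum.json.
EXAMPLE = """
{
  "version": 1,
  "afu-image": {
    "accelerator-clusters": [
      {
        "name": "sum",
        "total-contexts": 1,
        "accelerator-type-uuid": "00000000-0000-0000-0000-000000000000"
      }
    ]
  }
}
"""

print(accelerator_uuids(EXAMPLE))
```

The host application needs the same UUID to find the accelerator, so a mismatch between the JSON file and the host code is a likely cause of a failed discovery.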
The first step is to use fletchgen to generate the required hardware to wrap the sum kernel.
Start a new container from the root of the example.
cd fletcher-opae/examples/sum/
docker run -it --rm -v `pwd`:/src ias:1.2.1
Generate an input recordbatch file using the provided generate-input.py Python script.
python3 generate-input.py
This script generates the example.rb file.
Then run Fletchgen on the mounted source folder.
cd /src/hw
fletchgen -n Sum -r example.rb -l vhdl --mmio64 --mmio-offset 64 --axi
Because Fletchgen currently does not support generating MMIO files for this platform, run vhdmmio on the provided MMIO YAML file.
vhdmmio -V vhdl -P vhdl sum.mmio.yml
You can now exit the container and inspect the generated files.
Simulate
This subsection shows how to run hardware/software co-simulation using OPAE ASE.
Start a new container for simulation:
cd fletcher-opae/examples/sum
docker run -it --rm --name ias -e DISPLAY -v `pwd`:/src:ro ias:1.2.1
Start simulation
Set up the simulation environment and build the simulator.
afu_sim_setup -s /src/hw/sources.txt /sim
cd /sim
make
Start the simulation.
make sim
Start host application
Start another shell in the running container.
docker exec -it ias bash
Build the host application.
mkdir -p /build
cd /build
cmake3 /src/sw
make
Check if the simulator is ready and run your host application.
export ASE_WORKDIR=/sim/work
./sum /src/hw/example.rb
The host application should output -6.
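Checking whether the simulator is ready can be automated. The sketch below assumes that OPAE ASE creates a file named .ase_ready.pid in the ASE work directory once it accepts connections; both the file name and the helper function are assumptions to verify against your ASE version.

```python
import os
import time


def wait_for_ase(workdir, timeout_s=60.0, poll_s=0.5):
    """Poll workdir for the ASE ready file; return True if it shows up in time."""
    # Assumed marker file written by the simulator when it is ready.
    ready_file = os.path.join(workdir, ".ase_ready.pid")
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(ready_file):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

For the setup above, `wait_for_ase("/sim/work")` would be called before launching the host application.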
You can inspect the waveform.
cd /sim
make wave
Hardware
This subsection shows how to synthesize the hardware design, flash the bitstream to the FPGA, and run your application using the accelerator.
Synthesis
Start by running a new container.
cd fletcher-opae/examples/sum
docker run -it --rm --name ias --net=host -v `pwd`:/src:ro ias:1.2.1
Create the synthesis environment and generate the bitstream.
afu_synth_setup -s /src/hw/sources.txt /synth
cd /synth
${OPAE_PLATFORM_ROOT}/bin/run.sh
To flash the resulting bitstream, it must first be run through PACSign. The bitstream is not signed in this case, but the step is still required.
PACSign PR -y -t UPDATE -H openssl_manager -i sum.gbs -o sum_unsigned.gbs
Start a new shell and copy the resulting unsigned bitstream to your local machine.
cd fletcher-opae/examples/sum
docker cp ias:/synth/sum_unsigned.gbs .
Flash the bitstream
To flash the bitstream, start a new privileged container that can access the FPGA.
cd fletcher-opae/examples/sum
docker run -it --rm --privileged -v `pwd`:/src:ro ias:1.2.1
Flash the bitstream using fpgaconf.
fpgaconf sum_unsigned.gbs
Run the host application
Start a new container with access to the device.
cd fletcher-opae/examples/sum
docker run -it --rm --device /dev/intel-fpga-port.0 -v `pwd`:/src:ro ias:1.2.1
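Before starting the container, it can help to confirm that the port device node actually exists on the host. The following sketch lists matching nodes and builds the corresponding --device arguments; the helper names are hypothetical and the /dev/intel-fpga-port.* naming is taken from the command above.

```python
import glob
import os


def fpga_port_nodes(dev_dir="/dev"):
    """Return the sorted intel-fpga-port.* device nodes found in dev_dir."""
    return sorted(glob.glob(os.path.join(dev_dir, "intel-fpga-port.*")))


def docker_device_args(dev_dir="/dev"):
    """Build one '--device <node>' argument pair for every port node found."""
    args = []
    for node in fpga_port_nodes(dev_dir):
        args += ["--device", node]
    return args
```

An empty list from `docker_device_args()` suggests that the driver is not loaded or that no supported device is present.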
Build the host application. It's important to use a release build to disable simulation mode.
mkdir -p /build
cd /build
cmake3 -DCMAKE_BUILD_TYPE=release /src/sw
make
Run the host application, using the accelerator.
./sum /src/hw/example.rb
The result should be -6.
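A software reference model is a simple way to cross-check the accelerator result. The sketch below assumes a 64-bit signed accumulator, and the input values are hypothetical, chosen only so that they add up to -6; the actual contents of example.rb are defined by generate-input.py.

```python
def wrap_int64(value):
    """Wrap an integer to the signed 64-bit range, as a hardware adder would."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def reference_sum(numbers):
    """Accumulate the numbers with 64-bit two's complement wraparound."""
    total = 0
    for number in numbers:
        total = wrap_int64(total + number)
    return total


# Hypothetical input values (not read from example.rb).
print(reference_sum([1, -3, 3, -7]))  # prints -6
```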
Primmap example
This section explains how to use the primmap example.
This example reads numbers from an input recordbatch, adds one to each value, and writes the results to an output recordbatch in host memory.
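The expected behavior can be expressed as a small software model, which is useful for checking the output recordbatch. This is a sketch with hypothetical helper names and input values; it works on plain Python lists and ignores the element width used by the kernel.

```python
def primmap_reference(values):
    """Return a new list with one added to every input value."""
    return [value + 1 for value in values]


def first_mismatch(actual, expected):
    """Return the index of the first differing element, or None if both match."""
    if len(actual) != len(expected):
        return min(len(actual), len(expected))
    for index, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            return index
    return None


# Hypothetical input values (not read from the example's input recordbatch).
inputs = [1, -3, 3, -7]
print(primmap_reference(inputs))  # prints [2, -2, 4, -6]
```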
Prepare
Project structure
Like most Fletcher projects, this example project contains software (host application) and hardware (accelerator functional unit) sources.
Software
The host application consists of a single C++ source file that interacts with the accelerator. A CMake file is included to build the application. Build instructions are available in the Simulate and Hardware sections.
Hardware
The hardware directory contains a number of important files.
- sv
  - ofs_plat_afu.sv: the platform-specific top-level that wraps the top-level generated by Fletcher.
- vhdl
  - primmap.vhd: the primmap kernel. It adds one to all elements of the input batch and writes the result to the output batch.
- generate.py: an example Python script that generates an Arrow recordbatch file with some numbers for the accelerator to increment, together with the output schema.
- sources.txt: a list of sources that the tools use to discover the required files.
- primmap.json: contains information about the accelerator that is used by the tools. The accelerator-type-uuid is used by the host application for discovery.
- primmap.mmio.yml: a custom vhdmmio input file. Needed because Fletchgen currently does not support generating compatible MMIO files for this platform.
- primmap.ext.yml: a set of custom signals that are added to the design by fletchgen.
The first step is to use fletchgen to generate the required hardware to wrap the primmap kernel.
Start a new container from the root of the example.
cd fletcher-opae/examples/primmap/
docker run -it --rm -v `pwd`:/src ias:1.2.1
Generate an input recordbatch file and the output schema using the provided generate.py Python script.
python3 generate.py
This script generates the in.rb and out.as files.
Then run Fletchgen on the mounted source folder.
cd /src/hw
fletchgen -n primmap -r in.rb -i out.as -l vhdl --mmio64 --mmio-offset 64 -e primmap.ext.yml --axi
Because Fletchgen currently does not support generating MMIO files for this platform, run vhdmmio on the provided MMIO YAML file.
vhdmmio -V vhdl -P vhdl primmap.mmio.yml
You can now exit the container and inspect the generated files.
Use
To simulate or generate hardware for this example, please refer to the Simulate and Hardware sections of the Sum example. The steps are mostly the same (except for some different filenames).