# fpgadiag #

## SYNOPSIS ##
```console
fpgadiag [-m | --mode=] <mode> [-t | --target=] <target> [options]
```

## DESCRIPTION ##
Includes several tests to diagnose, test, and report on the FPGA hardware.

```<mode>``` chooses which test to run. ```<target>``` specifies the platform that runs the test. ```<target>``` can be either ```fpga``` or ```ase```, where ```ase``` is the abbreviation for Accelerator Simulation Environment.

```<mode>``` selects from the following tests:

**lpbk1**

This test runs a loopback test on the number of cache lines specified with the ```BEGIN``` option. ```fpgadiag``` sets up source and destination buffers in main memory. The FPGA then performs a ```memcpy``` from the source buffer to the destination buffer, one cache line at a time. A cache line is 64 bytes.

When `BEGIN = END`, the test performs one iteration. When `END = BEGIN + x`, the test performs further iterations, each copying one more cache line than the last: the first iteration copies `BEGIN` cache lines, the second copies `BEGIN+1` cache lines, the third copies `BEGIN+2` cache lines, and so on up to `END`. The latency is shown as the number of clock cycles.

When you specify `MULTI-CL`, the test copies `MULTI-CL` cache lines at a time. The `WR-FENCE` option chooses the virtual channel on which the WrFence occurs. If you specify continuous mode with `--cont`, the program iterates until the timeout specified in `TIMEOUT` expires.

**read**

This test performs reads. Use it to measure read bandwidth.

**write**

This test performs writes. Use it to measure write bandwidth.

**trput**

This test measures both read and write bandwidth by performing 50% read and 50% write tests.

**sw**

This is a send-and-respond (ping-pong) test. One side sends data and waits for a response.

Each test requires a particular AF. Before running a test, make sure the required AF is properly configured on the platform.

* The lpbk1 test requires the nlb mode 0 AF.
* The trput test requires the nlb mode 3 AF.
* The sw test requires the nlb mode 7 AF.
  This AF is only available for the integrated FPGA platform. You cannot run it on the PCIe accelerator card (PAC).

**fpgalpbk**

This test enables or disables FPGA loopback.

**fpgastats**

This test reads the FPGA MAC statistics.

**mactest**

This test compares the MAC addresses read from the MAC ROM with the MAC addresses read from the host side.

## OPTIONS ##

### Common options ###

`--help, -h`

Print help information and exit.

`--target=, -t`

This switch specifies fpga (hardware) or ase (simulation). The default=fpga.

`--mode=, -m`

The test to run. The valid values are `lpbk1`, `read`, `write`, `trput`, and `sw`, as well as `fpgalpbk`, `fpgastats`, and `mactest`.

`--config=, -c`

A configuration file in JSON format that specifies options for a test. If an option is specified both in the configuration file and on the command line, the value in the configuration file takes precedence.

`--dsm-timeout-usec`

Timeout in microseconds for test completion. The test fails if it does not complete within the specified timeout. The default=1000000.

`--socket-id=, -s`

Socket ID encoded in the FPGA Interface Manager (FIM). The default=0.

`--bus=, -B`

Bus number of the PCIe device. The default=0.

`--device=, -D`

Device number of the PCIe device. The default=0.

`--function=, -F`

Function number of the PCIe device. The default=0.

`--freq=, -T`

Clock frequency (in Hz) used for bandwidth calculation. The default=400000000 Hz (400 MHz).

```eval_rst
.. note::
    This frequency is used only when the software cannot infer the frequency from the accelerator.
```

`--suppress-hdr, -S`

Suppress column headers for text output. The default=off.

`--csv, -V`

Comma separated value format. The default=off.

`--suppress-stats`

Suppress statistics output at the end of the test. The default=off.

### **lpbk1** test options ###

`--guid=, -g`

AFU ID to enumerate. The default=D8424DC4-A4A3-C413-F89E-433683F9040B.

`--begin=B, -b`

1 <= B <= 65535. The default=1, B = number of cache lines.

`--end=E, -e`

1 <= E <= 65535. The default=B, B and E designate number of cache lines.

`--multi-cl=M, -u`

M can equal 1, 2, or 4.
The default=1.

`--cont, -L`

Continuous mode. The default=off.

`--timeout-usec=, --timeout-msec=, --timeout-sec=, --timeout-min=, --timeout-hour=`

Timeout for `--cont` mode. The default for all options is 0.

`--cache-policy=, -p`

Can be wrline-I, wrline-M, or wrpush-I. The default=wrline-M.

`--cache-hint=, -i`

Can be rdline-I or rdline-S. The default=rdline-I.

`--read-vc=, -r`

Can be auto, vl0, vh0, vh1, random. The default=auto.

`--write-vc=, -w`

Can be auto, vl0, vh0, vh1, random. The default=auto.

`--wrfence-vc=, -f`

Can be auto, vl0, vh0, vh1. The default=auto.

### **read** test options ###

`--guid=, -g`

AFU ID to enumerate. The default=F7DF405C-BD7A-CF72-22F1-44B0B93ACD18.

`--begin=B, -b`

1 <= B <= 65535. The default=1, B = number of cache lines.

`--end=E, -e`

1 <= E <= 65535. The default=B, B and E designate number of cache lines.

`--multi-cl=M, -u`

M can equal 1, 2, or 4. The default=1.

`--strided-access=S, -a`

1 <= S <= 64. The default=1.

`--cont, -L`

Continuous mode. The default=off.

`--timeout-usec=, --timeout-msec=, --timeout-sec=, --timeout-min=, --timeout-hour=`

Timeout for `--cont` mode. The default for all options is 0.

`--cache-hint=, -i`

Can be rdline-I or rdline-S. The default=rdline-I.

`--warm-fpga-cache, -H`

Try to prime the cache with hits. The default=off.

`--cool-fpga-cache, -M`

Try to prime the cache with misses. The default=off.

`--cool-cpu-cache, -C`

Try to prime the CPU cache with misses. The default=off.

`--read-vc=, -r`

Can be auto, vl0, vh0, vh1, random. The default=auto.

### **write** test options ###

`--guid=, -g`

AFU ID to enumerate. The default=F7DF405C-BD7A-CF72-22F1-44B0B93ACD18.

`--begin=B, -b`

1 <= B <= 65535. The default=1, B = number of cache lines.

`--end=E, -e`

1 <= E <= 65535. The default=B, B and E designate number of cache lines.

`--multi-cl=M, -u`

M can equal 1, 2, or 4. The default=1.

`--strided-access=S, -a`

1 <= S <= 64. The default=1.

`--cont, -L`

Continuous mode. The default=off.

`--timeout-usec=, --timeout-msec=, --timeout-sec=, --timeout-min=, --timeout-hour=`

Timeout for `--cont` mode.
The default for all options is 0.

`--cache-policy=, -p`

Can be wrline-I, wrline-M, or wrpush-I. The default=wrline-M.

`--warm-fpga-cache, -H`

Try to prime the cache with hits. The default=off.

`--cool-fpga-cache, -M`

Try to prime the cache with misses. The default=off.

`--cool-cpu-cache, -C`

Try to prime the CPU cache with misses. The default=off.

`--write-vc=, -w`

Can be auto, vl0, vh0, vh1, random. The default=auto.

`--wrfence-vc=, -f`

Can be auto, vl0, vh0, vh1, random. The default=`WRITE-VC`.

`--alt-wr-pattern, -l`

Alternate Write Pattern. The default=off.

### **trput** test options ###

`--guid=, -g`

AFU ID to enumerate. The default=F7DF405C-BD7A-CF72-22F1-44B0B93ACD18.

`--begin=B, -b`

1 <= B <= 65535. The default=1, B = number of cache lines.

`--end=E, -e`

1 <= E <= 65535. The default=B, B and E designate number of cache lines.

`--multi-cl=M, -u`

M can equal 1, 2, or 4. The default=1.

`--strided-access=S, -a`

1 <= S <= 64. The default=1.

`--cont, -L`

Continuous mode. The default=off.

`--timeout-usec=, --timeout-msec=, --timeout-sec=, --timeout-min=, --timeout-hour=`

Timeout for `--cont` mode. The default for all options is 0.

`--cache-policy=, -p`

Can be wrline-I, wrline-M, or wrpush-I. The default=wrline-M.

`--cache-hint=, -i`

Can be rdline-I or rdline-S. The default=rdline-I.

`--read-vc=, -r`

Can be auto, vl0, vh0, vh1, random. The default=auto.

`--write-vc=, -w`

Can be auto, vl0, vh0, vh1, random. The default=auto.

`--wrfence-vc=, -f`

Can be auto, vl0, vh0, vh1. The default=`WRITE-VC`.

### **sw** test options ###

`--guid=, -g`

AFU ID to enumerate. The default=7BAF4DEA-A57C-E91E-168A-455D9BDA88A3.

`--begin=B, -b`

1 <= B <= 65535. The default=1, B = number of cache lines.

`--end=E, -e`

1 <= E <= 65535. The default=B, B and E designate number of cache lines.

`--cache-policy=, -p`

Can be wrline-I, wrline-M, or wrpush-I. The default=wrline-M.

`--cache-hint=, -i`

Can be rdline-I or rdline-S. The default=rdline-I.


`--read-vc=, -r`

Can be auto, vl0, vh0, vh1, random. The default=auto.

`--write-vc=, -w`

Can be auto, vl0, vh0, vh1, random. The default=auto.

`--wrfence-vc=, -f`

Can be auto, vl0, vh0, vh1. The default=`WRITE-VC`.

`--notice=, -N`

Can be poll or csr-write. The default=poll.

### **Enable FPGA N3000 Ethernet group VFIO mdev** ###

The FPGA DFL driver does not support any ioctls to read or write Ethernet group information and registers. Users can read and write Ethernet group registers by enabling VFIO mdev. Unbind the dfl_eth_group driver and bind the vfio-mdev-dfl driver to the Ethernet group dfl-device; userspace can then take full control of Ethernet group feature id 10. The Ethernet group VFIO mdev must be enabled before running the fpgalpbk and mactest tools.

#### **Steps to enable/create vfio mdev** ####

Unbind eth group feature id 10:

    echo dfl-fme.0.8 > /sys/bus/dfl/drivers/dfl-eth-group/unbind
    echo dfl-fme.0.7 > /sys/bus/dfl/drivers/dfl-eth-group/unbind

Bind to vfio-mdev-dfl:

    echo vfio-mdev-dfl > /sys/bus/dfl/devices/dfl-fme.0.7/driver_override
    echo vfio-mdev-dfl > /sys/bus/dfl/devices/dfl-fme.0.8/driver_override

Load the vfio drivers:

    modprobe vfio_pci
    modprobe vfio_iommu_type1
    modprobe vfio_mdev
    modprobe vfio_mdev_dfl

Trigger mdev:

    echo dfl-fme.0.7 >/sys/bus/dfl/drivers_probe
    echo dfl-fme.0.8 >/sys/bus/dfl/drivers_probe
    echo 83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 > /sys/bus/dfl/devices/dfl-fme.0.7/mdev_supported_types/vfio-mdev-dfl-1/create
    echo 83b8f4f2-509f-382f-3c1e-e6bfe0fa1002 > /sys/bus/dfl/devices/dfl-fme.0.8/mdev_supported_types/vfio-mdev-dfl-1/create

Linux kernel messages after enabling mdev:

    i40e 0000:b3:00.0 eth1: NIC Link is Down
    i40e 0000:b1:00.1 eth0: NIC Link is Down
    vfio-mdev-dfl dfl-fme.2.7: MDEV: Registered
    vfio-mdev-dfl dfl-fme.2.8: MDEV: Registered
    vfio_mdev 83b8f4f2-509f-382f-3c1e-e6bfe0fa1005: Adding to iommu group 140
    vfio_mdev 83b8f4f2-509f-382f-3c1e-e6bfe0fa1005: MDEV: group_id = 140
    vfio_mdev 83b8f4f2-509f-382f-3c1e-e6bfe0fa1006: Adding to iommu group 141
    vfio_mdev 83b8f4f2-509f-382f-3c1e-e6bfe0fa1006: MDEV: group_id = 141

#### **Remove vfio mdev** ####

    echo 1 | sudo tee /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1002/remove
    echo 1 | sudo tee /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
    rmmod vfio_mdev_dfl
    modprobe dfl_eth_group
    echo dfl-fme.0.7 >/sys/bus/dfl/drivers_probe
    echo dfl-fme.0.8 >/sys/bus/dfl/drivers_probe
    echo dfl-eth-group > /sys/bus/dfl/devices/dfl-fme.0.7/driver_override
    echo dfl-eth-group > /sys/bus/dfl/devices/dfl-fme.0.8/driver_override

### **fpgalpbk** test options ###

`--enable`

Enable FPGA PHY loopback.

`--disable`

Disable FPGA PHY loopback.

`--direction`

Can be local, remote.

`--type`

Can be serial, precdr, postcdr.

`--side`

Can be line, host.

`--port`

0 <= port <= 7. The default is all ports.

### **mactest** test options ###

`--offset`

Read MAC addresses from an offset. The default=0.

## EXAMPLES ##

This command starts a `lpbk1` test for the FPGA on bus `0x5e`. The test copies 57535, 57536, 57537 ... up to 65535 cache lines, one line at a time. The test prints output in the comma separated values (CSV) format with the header suppressed.
```console
./fpgadiag --mode=lpbk1 --target=fpga -V --bus=0x5e --begin=57535 --end=65535 --cache-hint=rdline-I --cache-policy=wrpush-I --multi-cl=1 --write-vc=vl0 --read-vc=vh1 --wrfence-vc=auto
```

This command starts a `read` test on the FPGA located on bus `0xbe`. The test reads 2045 cache lines in the continuous mode with a 15-second timeout period. The reads use a strided pattern with a 10-byte stride length.
```console
./fpgadiag --mode=read --target=fpga -V --bus=0xbe --begin=2045 --cont --timeout-sec=15 --cache-hint=rdline-I --multi-cl=1 -a=10 --read-vc=auto --wrfence-vc=auto
```

This command starts a `sw` test on the FPGA located on bus `0xbe`. The test signals completion using a CSR write.
```console
./fpgadiag --mode=sw --target=fpga -V --bus=0xbe --begin=4 --end=8192 --cache-hint=rdline-I --cache-policy=wrline-I --notice=csr-write --write-vc=vl0 --wrfence-vc=auto --read-vc=random
```

This command enables `fpgalpbk` loopback on the FPGA located on bus `0xbe`.
```console
./fpgadiag -m fpgalpbk --bus 0xbe --enable --direction local --type postcdr --side host
```

This command shows `fpgastats` for the FPGA located on bus `0xbe`.
```console
./fpgadiag -m fpgastats --bus 0xbe
```

## TROUBLESHOOTING ##

When a test fails to run or gives errors, check the following:

* Is the Intel FPGA driver properly installed? See [Installation Guide](/fpga-doc/docs/install_guide/installation_guide.html) for driver installation instructions.
* Are FPGA port permissions set properly? Check the permission bits of the port, for example, `/dev/intel-fpga-port-0`. You need READ and WRITE permissions to run `fpgadiag` tests.
* Are hugepages properly configured on the system? See [Installation Guide](/fpga-doc/docs/install_guide/installation_guide.html) for hugepage configuration steps. In particular, `fpgadiag` requires a few 1 GB pages.
* Is the required AFU loaded? See [DESCRIPTION](#description) for information about which AFU each test requires.
* Are the `--begin` and `--end` values set properly? `--end` must not be smaller than `--begin`. Also, `--begin` must be a multiple of the `--multi-cl` value.
* The `--warm-fpga-cache` and `--cool-fpga-cache` options in the `read` and `write` tests are mutually exclusive.
* The timeout options are only meaningful in continuous mode (with the `--cont` option).

## Revision History ##

| Date | Intel Acceleration Stack Version | Changes Made |
|:------|----------------------------|:--------------|
|2018.05.21| DCP 1.1 Beta (works with Quartus Prime Pro 17.1.1) | fpgadiag now reports the correct values for bandwidth. |
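
The `--begin`, `--end`, and `--multi-cl` constraints from the option descriptions and the TROUBLESHOOTING checklist can be checked before a test is launched. The following is a minimal sketch in POSIX shell and is not part of `fpgadiag`: the function names `check_nlb_args` and `cachelines_to_bytes` are hypothetical helpers introduced here for illustration. It assumes that `--begin` equal to `--end` is acceptable, as the `lpbk1` description states, and it uses the 64-byte cache line size given there.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; they are NOT part of fpgadiag.

# check_nlb_args BEGIN END MULTI_CL
# Returns 0 if the values satisfy the documented constraints; otherwise
# prints a reason to stderr and returns 1.
check_nlb_args() {
    begin=$1
    end=$2
    multi_cl=$3

    # --begin and --end must be decimal integers in the range 1..65535.
    for v in "$begin" "$end"; do
        case "$v" in
            ''|*[!0-9]*) echo "not a decimal integer: '$v'" >&2; return 1 ;;
        esac
        if [ "$v" -lt 1 ] || [ "$v" -gt 65535 ]; then
            echo "value $v is outside 1..65535" >&2
            return 1
        fi
    done

    # --multi-cl accepts only 1, 2, or 4.
    case "$multi_cl" in
        1|2|4) ;;
        *) echo "--multi-cl must be 1, 2, or 4" >&2; return 1 ;;
    esac

    # Assumption: BEGIN = END is valid (one iteration); only END < BEGIN is rejected.
    if [ "$end" -lt "$begin" ]; then
        echo "--end ($end) is smaller than --begin ($begin)" >&2
        return 1
    fi

    # --begin must be a multiple of --multi-cl.
    if [ $((begin % multi_cl)) -ne 0 ]; then
        echo "--begin ($begin) is not a multiple of --multi-cl ($multi_cl)" >&2
        return 1
    fi

    return 0
}

# cachelines_to_bytes N
# Prints the buffer size in bytes for N cache lines (64 bytes each).
cachelines_to_bytes() {
    echo $(( $1 * 64 ))
}
```

A script could call `check_nlb_args 4 8192 1 && ./fpgadiag --mode=sw --begin=4 --end=8192 ...` so that invalid combinations are reported before the hardware is touched. `cachelines_to_bytes 2045` gives the transfer size, in bytes, of the `read` example above.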