1. The PCI NIC supports two modes

  • Packet Mode: behaves like a regular NIC
  • Coprocessor Mode: the host sends requests and OCTEON returns responses

2. PEM

  • RC mode: type 1 configuration space header
  • EP mode: type 0 configuration space header

  • PCIe 2.0

  • port 0: RC or EP
  • ports 1/2: RC only

2.1. In EP mode

  • no I/O space
  • BAR0: CN71xx SLI CSRs; 16 KB
    the SLI_WIN* registers enable indirect access to all other CPU CSRs (or memory?)
  • BAR1: 64 MB, indirect access to L2/DRAM

EP: PCIe BAR1 (64 MB window)

38-bit L2/DRAM address referenced by the PCIe read/write operations in BAR1 space:
37                         22|21                           0
------------------------------------------------------------
+ PEM0_BAR1_INDEXn[ADDR_IDX] | EntryOff(from BAR1 address) +
------------------------------------------------------------
|                ^           |\                             \
|                |           |  \                             \
|                |           |    \                             \
|                |           |      \                             \
|                |           |        v                             v
|                |           |  pcie address from X86_64(lower 26bit)
|                |           |  25  22 21                           0
|                |           |  -------------------------------------
|                |           |  +Entry|        EntryOff(4M)         +
|                |           |  -------------------------------------
|                |           |     |
|              +-+-----------+-----+   
v              v             v 
PEM0_BAR1_INDEXn(0..15) CSR
19                          4 3  2     1   0
------------------------------------------------
+        ADDR_IDX            |CA|END_SWP|ADDR_V+
------------------------------------------------

64 MB window = 16 blocks * 4 MB (see the sketch below)
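A minimal sketch of the translation above (pure arithmetic; the function name is made up for illustration): bits 25:22 of the 26-bit BAR1 offset select one of the 16 PEM0_BAR1_INDEXn entries, bits 21:0 are the offset inside a 4 MB block, and the selected entry's ADDR_IDX supplies bits 37:22 of the L2/DRAM address.

```c
#include <stdint.h>

/* Hypothetical helper: translate the low 26 bits of a PCIe BAR1 access
 * into a 38-bit L2/DRAM address, given the 16 PEM0_BAR1_INDEXn[ADDR_IDX]
 * values that software programmed earlier. */
static uint64_t bar1_to_dram(uint32_t bar1_off26, const uint16_t addr_idx[16])
{
    uint32_t entry     = (bar1_off26 >> 22) & 0xF;   /* bits 25:22 pick the 4 MB block  */
    uint32_t entry_off = bar1_off26 & 0x3FFFFF;      /* bits 21:0, offset inside block  */

    /* ADDR_IDX becomes bits 37:22 of the L2/DRAM address */
    return ((uint64_t)addr_idx[entry] << 22) | entry_off;
}
```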
  • BAR2: direct access to L2/DRAM; 512 GB / 2 TB capability;
    PEM0_BAR_CTL[BAR2_ENB]: disabled by default -- then how does it end up enabled by default? (a sketch of enabling it follows this list)
    check on the host with `cat /proc/iomem`

  • ROM BAR: maps to the boot bus; 64 KB
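On the BAR2_ENB question above, a hedged sketch of how SE or boot code might set the bit, assuming the Octeon executive's cvmx_read_csr()/cvmx_write_csr() and the CVMX_PEMX_BAR_CTL definitions from cvmx-pemx-defs.h (the field name should be verified against your SDK headers):

```c
#include "cvmx.h"
#include "cvmx-pemx-defs.h"

/* Sketch: enable direct L2/DRAM access through PCIe BAR2 on PEM0.
 * Field name assumed from cvmx-pemx-defs.h; double-check for your chip. */
static void pem0_enable_bar2(void)
{
    cvmx_pemx_bar_ctl_t bar_ctl;

    bar_ctl.u64 = cvmx_read_csr(CVMX_PEMX_BAR_CTL(0));
    bar_ctl.s.bar2_enb = 1;                 /* PEM0_BAR_CTL[BAR2_ENB] = 1 */
    cvmx_write_csr(CVMX_PEMX_BAR_CTL(0), bar_ctl.u64);
}
```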

2.2. In RC mode

  • RC mode: the base address registers in the type 1 header are not used.
  • RC mode: an EP device accesses OCTEON through the internal BAR0/BAR1/BAR2.
  • internal BAR0: 16 KB, -- "just like in EP mode";
    so does PEM0_P2N_BAR0_START make no sense? does it only expose the SLI CSRs to the device?
  • internal BAR1: 64 MB, -- "just like in EP mode", but the base can be changed? -- PEM0_P2N_BAR1_START?
  • internal BAR2: 2 TB, -- "just like in EP mode", but the base can be changed? -- PEM0_P2N_BAR2_START?

3. DPI & SLI

  • DPI handles DMA
  • SLI handles the I/O bus
  • Why can DMA copy x86 memory to OCTEON? How is x86 memory seen in the other direction?
  • How are user-space interrupts handled? UIO? VFIO?
  • How is a DMA-able buffer allocated, especially from user space? (a minimal allocation sketch follows the note below)
    Reference:

Though the topic is about DMA memory used for device operation, the approach is the same if a memory-mapped region is shared by multiple processes on different virtual mappings.
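One common answer to the DMA-able buffer question, as a minimal sketch assuming a Linux kernel module (the names example_alloc/example_mmap and the 1 MB size are made up; the OCTEON driver may do this differently): allocate a coherent buffer with dma_alloc_coherent() and expose it to user space through an mmap handler.

```c
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/dma-mapping.h>

#define DMA_BUF_SIZE (1 << 20)          /* 1 MB, arbitrary for the example    */

static void          *cpu_addr;         /* kernel virtual address             */
static dma_addr_t     dma_handle;       /* bus address to hand to the device  */
static struct device *dma_dev;          /* e.g. &pdev->dev from the PCI probe */

/* Allocate a physically contiguous, coherent buffer the device can DMA to. */
static int example_alloc(struct device *dev)
{
    dma_dev  = dev;
    cpu_addr = dma_alloc_coherent(dev, DMA_BUF_SIZE, &dma_handle, GFP_KERNEL);
    return cpu_addr ? 0 : -ENOMEM;
}

/* mmap handler: let a user process map the same buffer into its address space. */
static int example_mmap(struct file *filp, struct vm_area_struct *vma)
{
    return dma_mmap_coherent(dma_dev, vma, cpu_addr, dma_handle, DMA_BUF_SIZE);
}
```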



# DPI (using the 61xx as the example below)
![](img/octeon_pci_NIC_20220922231139.png)  

* 32 instruction input rings: 8 rings per port
* DPI input ports as seen by PIP/IPD: 32, 33, 34, 35

## input
![](img/octeon_pci_NIC_20220922231201.png)  

Here DPTR is either:
1. Direct mode: used together with DPTR1/2/3, pointing directly at the data
2. Indirect mode: pointing at a list; each component looks as follows, 64-bit aligned  
![](img/octeon_pci_NIC_20220922231234.png)  

* DPI builds the packet according to the instruction  
![](img/octeon_pci_NIC_20220922231434.png)  

* the 4 DPTR formats  
![](img/octeon_pci_NIC_20220922231502.png)  
![](img/octeon_pci_NIC_20220922231625.png)  
![](img/octeon_pci_NIC_20220922231535.png)  
![](img/octeon_pci_NIC_20220922231643.png)  
![](img/octeon_pci_NIC_20220922231700.png)  

## output
* 32 output rings: 8 rings per port
* 4 PKO output ports: 32, 33, 34, 35
* each entry in the ring:
![](img/octeon_pci_NIC_20220922231731.png)  

For each ring:

When SLI_PKT_DPADDR[DPTR<n>]=1, the two pointers use the DPTR0 format above.

When SLI_PKT_DPADDR[DPTR<n>]=0, the two pointers use the DPTR1 format above.

* output has two modes  
When SLI_PKT_IPTR[IPTR<n>]=1: Info-Pointer Mode  
When the packet is smaller than one buffer it is simple: DPI writes the packet where the buffer pointer points and writes the size into the packet-length field of the info pointer.  
When the packet is too large, DPI writes it into successive buffer pointers but uses only the first corresponding info pointer, which records the size of the whole packet.  
Info format:  
![](img/octeon_pci_NIC_20220922231908.png)  

Note: the number of info bytes is set by SLI_PKT(0..31)_OUT_SIZE[ISIZE], at most 120;  
SLI_PKT(0..31)_OUT_SIZE[BSIZE] sets the buffer size, at most 65536.

When SLI_PKT_IPTR[IPTR<n>]=0: Buffer-Pointer-Only Mode  
The packet size is written first, immediately followed by the data.
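A small illustrative sketch (the names and structures are made up, not the driver's) of how a host receive path could interpret one output-ring buffer in each of the two modes described above: in buffer-pointer-only mode the 64-bit packet length sits at the start of the data buffer, while in info-pointer mode it is read from the separate info buffer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one output-ring entry after DPI has filled it:
 * the data buffer and, in info-pointer mode, the separate info buffer. */
struct droq_entry {
    uint8_t *buf;      /* data buffer (BSIZE bytes)          */
    uint8_t *info;     /* info buffer (ISIZE bytes), or NULL */
};

/* Return the payload start and length for one received packet
 * (endian swapping configured via SLI is ignored here). */
static uint8_t *get_packet(const struct droq_entry *e, int info_ptr_mode,
                           uint64_t *len_out)
{
    uint64_t len;

    if (info_ptr_mode) {
        /* Info-pointer mode: the length lives in the info buffer,
         * the data starts at the beginning of the data buffer. */
        memcpy(&len, e->info, sizeof(len));
        *len_out = len;
        return e->buf;
    }

    /* Buffer-pointer-only mode: the length is written first, data follows. */
    memcpy(&len, e->buf, sizeof(len));
    *len_out = len;
    return e->buf + sizeof(len);
}
```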

## DPI DMA
* 8 command queues feeding 6 DMA engines
## DMA command words
![](img/octeon_pci_NIC_20220922232120.png)  
![](img/octeon_pci_NIC_20220922232141.png)  

The HDR specifies the type of DMA transfer.
![](img/octeon_pci_NIC_20220922232206.png)  

Pointer format:  
the size field is only 12 bits, so each PTR can cover at most 4 KB  
![](img/octeon_pci_NIC_20220922232303.png)  

## DMA command queue
* It is a linked list in RAM. Software writes at the head of the list; hardware reads from the tail.  
![](img/octeon_pci_NIC_20220922232325.png)  
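A hedged sketch (data layout and names are illustrative, not the SDK's) of such a producer/consumer command queue: software appends 64-bit command words at the write head; when a chunk fills up, its last word points to the next chunk, and a doorbell tells the hardware, which reads from the tail, how much more is available.

```c
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_WORDS 128   /* arbitrary chunk size for the example */

/* One chunk of the command queue; the last slot holds the pointer
 * (a bus address in real hardware) to the next chunk. */
struct cmdq_chunk {
    uint64_t words[CHUNK_WORDS];
};

struct cmdq {
    struct cmdq_chunk *head_chunk;   /* chunk software is currently filling */
    unsigned int       head_idx;     /* next free word in head_chunk        */
};

/* Append one command word; chain a new chunk when the current one is full. */
static void cmdq_push(struct cmdq *q, uint64_t word)
{
    if (q->head_idx == CHUNK_WORDS - 1) {
        struct cmdq_chunk *next = calloc(1, sizeof(*next));
        /* Last word of the full chunk links to the next chunk. */
        q->head_chunk->words[q->head_idx] = (uint64_t)(uintptr_t)next;
        q->head_chunk = next;
        q->head_idx = 0;
    }
    q->head_chunk->words[q->head_idx++] = word;
    /* A real driver would ring the doorbell CSR here so hardware knows
     * more command words have been appended. */
}
```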

## DMA engine 0..5
* DPI_DMA_CONTROL[PKT_EN]=1 ==> DPI uses engine 5; in the other cases the 8 queues can be mapped onto the 6 engines arbitrarily
* In those other cases, what does this DMA machinery have to do with DPI at all? Probably it is just for general-purpose DMA, i.e. moving data between cores and host memory in a generic way?

# PCI-BASE
components/driver/OCT-PCI-BASE-GUIDE-2.3.pdf
## packages
* OCTEON-PCI-BASE: x86 PCI driver plus the OCTEON SE side
* OCTEON-PCI-NIC: x86 network device; on top of OCTEON-PCI-BASE
* OCTEON-PCI-CNTQ: x86 app; on top of OCTEON-PCI-BASE

## overview
![](img/octeon_pci_NIC_20220922232710.png)  
![](img/octeon_pci_NIC_20220922232727.png)  

## input rings
![](img/octeon_pci_NIC_20220922232756.png)  

## output rings
![](img/octeon_pci_NIC_20220922232810.png)  

## dma
![](img/octeon_pci_NIC_20220922232824.png)  

## dir tree
![](img/octeon_pci_NIC_20220922232859.png)  

## host driver
* Initialization of various OCTEON PCI blocks, including input/output rings and DMA engines.
* Receiving requests from user- and kernel-space applications and forwarding them to OCTEON.
* Managing the different request types while they wait to be fetched by OCTEON or until a response arrives from OCTEON.
* Receiving packets from OCTEON via the output rings and dispatching them to kernel applications.
* Managing various hooks that other kernel modules (like the NIC) can call to change the default packet-processing behavior.

### device configuration
See `OCTEON_config.h` / `OCTEON_main.c`

### input ring initialization
See `request_manager.c` - `OCTEON_init_instr_queue()`  
Default configuration:  
* 32-byte instruction type; 1024 entries for ring 0, 128 entries for the other rings  
* for all input rings: 64-bit endian swapping is enabled; relaxed ordering and NoSnoop operations are disabled

### output ring initialization
See `OCTEON_droq.c` - `OCTEON_init_droq()`
Default configuration:
* 1024 entries for ring 0, 128 entries for the other rings
* info-pointer mode
* a pre-allocated buffer (1024 bytes) for each entry in the descriptor ring
* 100 ms timer interrupt

### dma initialization
Completion notification options:
* Issue a work-queue entry to the POW
* Write 0 to a memory location

### other initialization
* request list: requests coming from applications
* poll list: processed by a kernel thread
* dispatch: opcode handlers for packets arriving on the output rings

### linux ko
`components/driver/host/driver/linux/octeon_linux.c`

## input ring processing
![](img/octeon_pci_NIC_20220922233859.png)  
![](img/octeon_pci_NIC_20220922233919.png)  

### Sending a request
![](img/octeon_pci_NIC_20220922234031.png)  
* `OCTEON_send_request()` from kernel space
* `ioctl` from user space

#### Request attributes
![](img/octeon_pci_NIC_20220922234105.png)  
![](img/octeon_pci_NIC_20220922234155.png)  

#### The four DMA modes supported
![](img/octeon_pci_NIC_20220922234221.png)  
Note: even though multi-buffer mode is supported for applications, when data is copied from user space to kernel space the driver coalesces it into a single buffer (a sketch follows below).
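A minimal sketch of that gather-and-copy step, assuming standard Linux copy_from_user(); the structure and function names are illustrative, not the driver's actual ones.

```c
#include <linux/types.h>
#include <linux/uaccess.h>
#include <linux/slab.h>

/* Hypothetical descriptor for a user request made up of several buffers. */
struct user_req {
    void __user **bufs;   /* user-space buffer pointers */
    u32          *lens;   /* length of each buffer      */
    int           nbufs;
};

/* Coalesce all user buffers into one kernel buffer before building the
 * DMA instruction, mirroring the behavior described above. */
static void *gather_user_buffers(const struct user_req *req, u32 *total_out)
{
    u32 total = 0, off = 0;
    void *kbuf;
    int i;

    for (i = 0; i < req->nbufs; i++)
        total += req->lens[i];

    kbuf = kmalloc(total, GFP_KERNEL);
    if (!kbuf)
        return NULL;

    for (i = 0; i < req->nbufs; i++) {
        if (copy_from_user(kbuf + off, req->bufs[i], req->lens[i])) {
            kfree(kbuf);
            return NULL;
        }
        off += req->lens[i];
    }

    *total_out = total;
    return kbuf;
}
```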

#### Software send interface
`OCTEON_soft_request()`; the structure it uses, `OCTEON_soft_request_t`, is defined in `components/driver/host/include/cavium_defs.h`

#### Response format
![](img/octeon_pci_NIC_20220922234425.png)  
![](img/octeon_pci_NIC_20220922234536.png)  
![](img/octeon_pci_NIC_20220922234554.png)  

## OCTEON device file
`/dev/OCTEON_device`; the user API library (liboctapi.a) depends on it.
The API is declared in `components/driver/host/api/octeon_user.h`.

## output ring processing
* 100 ms interrupt
* rings are processed in order, starting from ring 0

![](img/octeon_pci_NIC_20220922234715.png)  

A callback can be invoked according to the opcode.  
A fast path can be set up per ring; register the fast-path handler with `OCTEON_register_droq_ops()`.  
The default is the slow path: every arriving packet is dispatched by its opcode, and if no handler is registered the driver simply reclaims the buffer.  
When the output ring is refilled, new buffer info can be written.

## What the SE side does
For example, DMA initialization:

`cvm_drv_init()`, `cvm_drv_local_init()`, `cvm_drv_setup_app_mode()`, `cvm_drv_start()`

Packets arriving from PCIe go into WQEs, just like packets arriving on other interfaces. Is there a 24-byte header in front of the data?

When OCTEON sends a packet to the x86 host, it works just like sending to an SGMII port: software writes a command to the PKO, and the hardware then

  1. fetches the next packet buffer from the output ring
  2. splits the packet coming from the PKO so it fits into one or more buffers
  3. writes the info pointer, e.g. with the packet length
  4. raises an interrupt (based on packet count or a time slice)

DMA does not require OCTEON and x86 to use the same number of buffers; for a transfer, only the total byte counts on the two sides must match (see the sketch below).
The core driver fills in DMA command words to perform DMA transfers in both directions.
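An illustrative helper (names made up) for the constraint just mentioned: a DMA command may carry different numbers of local (OCTEON) and host (x86) pointers, but the byte totals of the two pointer lists must be equal.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical pointer descriptor: address plus length in bytes. */
struct dma_ptr {
    uint64_t addr;
    uint32_t len;
};

/* Return 1 if the local and host pointer lists move the same total number
 * of bytes, which is what the hardware requires; the list lengths may differ. */
static int dma_lists_match(const struct dma_ptr *local, size_t nlocal,
                           const struct dma_ptr *host, size_t nhost)
{
    uint64_t lbytes = 0, hbytes = 0;
    size_t i;

    for (i = 0; i < nlocal; i++)
        lbytes += local[i].len;
    for (i = 0; i < nhost; i++)
        hbytes += host[i].len;

    return lbytes == hbytes;
}
```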

3.1. Test programs

  • oct_req: a user-space program
  • req_resp: a kernel-space program
  • droq_test: a kernel-space program that registers a receive handler on an output ring
  • cvmcs: the SE application, in applications/pci-core-app
  • oct_dbg: a menu-driven debug tool
  • oct_stats: a tool that displays the input, output, and DMA queue statistics
