DMA CACHE一致性问题解决方案

2021-12-08 10:01:15 阅读：217 来源： 互联网

标签：DMA cache tx CACHE queue bp 一致性 ring size

static int macb_alloc_consistent(struct macb *bp)
{
    struct macb_queue *queue;
    unsigned int q;
    int size;

    for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
        size = TX_RING_BYTES(bp) + bp->tx_bd_rd_prefetch;
        queue->tx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
                            &queue->tx_ring_dma,
                            GFP_KERNEL);
        if (!queue->tx_ring)
            goto out_err;
        netdev_dbg(bp->dev,
               "Allocated TX ring for queue %u of %d bytes at %08lx (mapped %p)\n",
               q, size, (unsigned long)queue->tx_ring_dma,
               queue->tx_ring);

        size = bp->tx_ring_size * sizeof(struct macb_tx_skb);
        queue->tx_skb = kmalloc(size, GFP_KERNEL);
        if (!queue->tx_skb)
            goto out_err;

        size = RX_RING_BYTES(bp) + bp->rx_bd_rd_prefetch;
        queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
                         &queue->rx_ring_dma, GFP_KERNEL);
        if (!queue->rx_ring)
            goto out_err;
        netdev_dbg(bp->dev,
               "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
               size, (unsigned long)queue->rx_ring_dma, queue->rx_ring);
    }
    if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
        goto out_err;

    return 0;

out_err:
    macb_free_consistent(bp);
    return -ENOMEM;
}

dma_alloc_coherent 在 arm 平台上会禁止页表项中的 C （Cacheable）域以及 B (Bufferable)域。
而 dma_alloc_writecombine 只禁止 C （Cacheable）域.

C 代表是否使用高速缓冲存储器（cacheline），而 B 代表是否使用写缓冲区。

这样，dma_alloc_writecombine 分配出来的内存不使用缓存，但是会使用写缓冲区。而 dma_alloc_coherent 则二者都不使用。
C B 位的具体含义
0 0 无cache，无写缓冲；任何对memory的读写都反映到总线上。对 memory 的操作过程中CPU需要等待。
0 1 无cache，有写缓冲；读操作直接反映到总线上；写操作，CPU将数据写入到写缓冲后继续运行，由写缓冲进行写回操作。
1 0 有cache，写通模式；读操作首先考虑cache hit；写操作时直接将数据写入写缓冲，如果同时出现cache hit，那么也更新cache。
1 1 有cache，写回模式；读操作首先考虑cache hit；写操作也首先考虑cache hit。

效率最高的写回，其次写通，再次写缓冲，最次非CACHE一致性操作。

其实，写缓冲也是一种非常简单得CACHE，为何这么说呢。

我们知道，DDR是以突发读写的，一次读写总线上实际会传输一个burst的长度，这个长度一般等于一个cache line的长度。

cache line是32bytes。即使读1个字节数据，也会传输32字节，放弃31字节。

写缓冲是以CACHE LINE进行的，所以写效率会高很多。

DMA 导致的 CACHE 一致性问题解决方案

标签：DMA,cache,tx,CACHE,queue,bp,一致性,ring,size
来源： https://www.cnblogs.com/dream397/p/15660063.html

本站声明： 1. iCode9 技术分享网（下文简称本站）提供的所有内容，仅供技术学习、探讨和分享；
2. 关于本站的所有留言、评论、转载及引用，纯属内容发起人的个人观点，与本站观点和立场无关；
3. 关于本站的所有言论和文字，纯属内容发起人的个人观点，与本站观点和立场无关；
4. 本站文章均是网友提供，不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属；如您发现该文章侵犯了您的权益，可联系我们第一时间进行删除；
5. 本站为非盈利性的个人网站，所有内容不会用来进行牟利，也不会利用任何形式的广告来间接获益，纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

ICode9

DMA CACHE一致性问题解决方案