<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7638.1">
<TITLE>RE: Create permanent mapping from PCI bus to region of physical memory</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2>>> The peripheral is an FPGA. No, there is no internal processor;<BR>
>> everything is coded in Verilog.<BR>
>><BR>
>> Scatter/gather isn't a viable option because of this.<BR>
<BR>
>Er, why not, it's an FPGA, everything is possible :)<BR>
<BR>
Not if you're tight on routing resources and/or design time ;-)<BR>
<BR>
>So you have a PCI core, since you are planning to write to<BR>
>the host memory space, it's a Bus Master PCI core.<BR>
>Whose FPGA, Altera, Xilinx, or someone else?<BR>
<BR>
Xilinx Virtex-4, no internal PPC. The vendor really isn't<BR>
germane to the problem at hand, however.<BR>
<BR>
>> Additionally non-contiguous memory would reduce bandwidth<BR>
>> and increase FPGA design complexity.<BR>
<BR>
>Not necessarily. If the target is using bus master DMA to<BR>
>write to the host memory, then you can hit pretty close<BR>
>to the bandwidth of the PCI bus. If you are DMAing in<BR>
>big blocks, the overhead of a block change isn't too much.<BR>
>I did tests with the 440EP using a DMA controller on an<BR>
>adapter board and found that the PCI bridge in the 440EP<BR>
>was the limiting factor, i.e., for a 33MHz 32-bit bus<BR>
>with a potential for 132MB/s, the *best* you can do is<BR>
>about 40MB/s since the bridge only accepts data in cache<BR>
>line sizes before sending a retry to the target. I can<BR>
>send you those results.<BR>
<BR>
Dave, I appreciate your input but scatter/gather just isn't<BR>
an option here for a variety of reasons. Bandwidth/complexity/<BR>
latency/time to design/time to debug/FPGA density all<BR>
factor into this decision. BTW, we're talking 66MHz/64-bit PCI-X;<BR>
33MHz/32-bit PCI isn't nearly fast enough.<BR>
<BR>
>Randomly accessible from where; the host or an I/O interface<BR>
>at the FPGA. The pages can be made to appear contiguous to<BR>
>a host processor user-space process using the nopage callback<BR>
>of the VMA.<BR>
<BR>
From the point of view of the FPGA.<BR>
<BR>
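[Since the nopage mechanism came up, here is a minimal 2.6-era sketch of what that callback looks like, assuming a physically contiguous block reserved at boot; FPGA_BUF_PHYS and the function names are hypothetical, not from the thread:]<BR>

```c
/* Sketch only: a 2.6-era nopage handler that maps a boot-time-reserved
 * physical region into a process's address space on demand.
 * FPGA_BUF_PHYS is an assumed address, not a real one. */
#include <linux/mm.h>

#define FPGA_BUF_PHYS 0x0f000000UL      /* assumed start of reserved block */

static struct page *fpga_vma_nopage(struct vm_area_struct *vma,
                                    unsigned long address, int *type)
{
        unsigned long offset = address - vma->vm_start;
        struct page *page = pfn_to_page((FPGA_BUF_PHYS + offset) >> PAGE_SHIFT);

        get_page(page);                 /* hold a reference for the mapping */
        if (type)
                *type = VM_FAULT_MINOR;
        return page;
}

static struct vm_operations_struct fpga_vm_ops = {
        .nopage = fpga_vma_nopage,
};
```

[For a region that really is physically contiguous, remap_pfn_range() in the driver's mmap() is the simpler route; nopage earns its keep when the backing pages are scattered, which is Dave's point.]<BR>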
>> I realize this isn't a standard linux request but having<BR>
>> fixed, linear memory is quite common in embedded apps. There<BR>
>> should be a way to create this mapping in the 440GX's hardware<BR>
>> and I'm just looking for a system call (if there is one) to<BR>
>> implement it.<BR>
<BR>
>Alas, this is one of the concessions one must make if you<BR>
>want to use a processor that enables the MMU.<BR>
<BR>
Nope, I disagree. The MMU isn't involved here at all. I'm<BR>
talking about setting up the PIM (PCI Inbound Map) via<BR>
a system call as opposed to just writing it directly.<BR>
<BR>
>However,<BR>
>I don't see any fundamental limitation in the design<BR>
>that would preclude a little extra work on the FPGA.<BR>
>But, it does require additional Verilog to support<BR>
>the flexibility. The long-term advantage is that you<BR>
>don't have to provide a hack (eg. reserve a block of<BR>
>high-memory under Linux).<BR>
<BR>
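[For completeness, the "reserve a block of high-memory" hack Dave mentions usually amounts to a boot argument plus an ioremap. A sketch, with every address an assumption (16 MB held back on a hypothetical 256 MB board):]<BR>

```c
/* Sketch only -- addresses are assumptions, not from the thread.
 * Boot with "mem=240M" so Linux never touches the top 16 MB, then
 * have the driver claim and map that region: */
#include <linux/ioport.h>
#include <linux/io.h>

#define FPGA_BUF_PHYS 0x0f000000UL      /* 240 MB mark (assumed) */
#define FPGA_BUF_SIZE 0x01000000UL      /* 16 MB */

static void __iomem *fpga_buf;

static int fpga_buf_init(void)
{
        if (!request_mem_region(FPGA_BUF_PHYS, FPGA_BUF_SIZE, "fpga_buf"))
                return -EBUSY;
        fpga_buf = ioremap(FPGA_BUF_PHYS, FPGA_BUF_SIZE);
        if (!fpga_buf) {
                release_mem_region(FPGA_BUF_PHYS, FPGA_BUF_SIZE);
                return -ENOMEM;
        }
        return 0;
}
```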
I really can't go into detail about my application for a<BR>
variety of reasons (both technical and legal) but suffice<BR>
it to say that 16 MB is just the starting point.<BR>
<BR>
I guess I'll dig a little deeper into the source to see<BR>
where the kernel does the PIM mapping. Re-architecting<BR>
our app at this stage just isn't a practical consideration.<BR>
<BR>
Thanks again,<BR>
<BR>
Marc<BR>
<BR>
</FONT>
</P>
</BODY>
</HTML>