BS in Deutsche Bank job ads regarding TCP/IP network latency (HFT)?

Btw, for the discriminating S&M practitioner:

OpenFabrics Alliance.
Oracle's take.
EPAM's pitch.
An academic paper.
State machine replication for high availability using RDMA.
Another pitch, this time from Arista networks.
Yada, yada, yada...

Code:
Package: librdmacm-dev           
Version: 22.1-1
State: not installed
Multi-Arch: same
Priority: optional
Section: libdevel
Maintainer: Benjamin Drung <benjamin.drung@cloud.ionos.com>
Architecture: amd64
Uncompressed Size: 325 k
Depends: libibverbs-dev, librdmacm1 (= 22.1-1)
Description: Development files for the librdmacm library
librdmacm is a library that allows applications to set up reliable connected and unreliable datagram
transfers when using RDMA adapters. It provides a transport-neutral interface in the sense that the same code
can be used for both InfiniBand and iWARP adapters.  The interface is based on sockets, but adapted for queue
pair (QP) based semantics: communication must use a specific RDMA device, and data transfers are
message-based.
librdmacm only provides communication management (connection setup and tear-down) and works in conjunction
with the verbs interface provided by libibverbs, which provides the interface used to actually transfer data.
This package is needed to compile programs against librdmacm1. It contains the header files and static
libraries (optionally) needed for compiling.
Homepage: https://github.com/linux-rdma/rdma-core
Tags: devel::library, role::devel-lib
Code:
Package: libibverbs-dev
Version: 22.1-1
State: not installed
Multi-Arch: same
Priority: optional
Section: libdevel
Maintainer: Benjamin Drung <benjamin.drung@cloud.ionos.com>
Architecture: amd64
Uncompressed Size: 1,397 k
Depends: ibverbs-providers (= 22.1-1), libibverbs1 (= 22.1-1), libnl-3-dev, libnl-route-3-dev
Description: Development files for the libibverbs library
libibverbs is a library that allows userspace processes to use RDMA "verbs" as described in the InfiniBand
Architecture Specification and the RDMA Protocol Verbs Specification.  iWARP ethernet NICs support RDMA over
hardware-offloaded TCP/IP, while InfiniBand is a high-throughput, low-latency networking technology.
InfiniBand host channel adapters (HCAs) and iWARP NICs commonly support direct hardware access from userspace
(kernel bypass), and libibverbs supports this when available.
This package is needed to compile programs against libibverbs1. It contains the header files and static
libraries (optionally) needed for compiling.
Homepage: https://github.com/linux-rdma/rdma-core
Tags: devel::library, role::devel-lib
 
In many of their job ads they say "You will use top-notch hardware and network solutions as well as efficient software techniques - those include direct access to network cards to bypass the standard TCP/IP stack ...". An example of such a job ad: https://careers.db.com/professionals/search-roles/#/professional/job/38866

I wonder by what percentage one can reduce network latency by bypassing the standard TCP/IP stack in this way. I personally have big doubts about this; I rather think they are just using marketing buzzwords of the past, as IMO it makes no practical sense to bypass the standard TCP/IP stack. Or does it?
Yes, it does make sense.
Been there, done that.
It is not unusual for HFTs to do such things.
Some even try to keep their fiber cables as straight and short as possible, because a shorter path means less propagation delay.
HFTs are working at the sub-microsecond level, and there are 1,000,000 microseconds in one second.
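
To put a rough number on the cable-length point, here is a back-of-the-envelope sketch. It assumes a refractive index of about 1.47 for the fiber, which is a typical textbook figure and not something measured here; the function name is just for illustration.

```python
# Rough one-way propagation delay of light in optical fiber.
C_VACUUM_M_PER_S = 299_792_458   # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47    # assumption: typical value for silica fiber


def fiber_delay_ns(length_m: float) -> float:
    """One-way propagation delay in nanoseconds for a fiber of the given length."""
    speed_in_fiber = C_VACUUM_M_PER_S / FIBER_REFRACTIVE_INDEX
    return length_m / speed_in_fiber * 1e9


for meters in (1, 10, 100):
    print(f"{meters:>4} m of fiber: about {fiber_delay_ns(meters):.1f} ns one way")
```

Under that assumption every metre of fiber costs roughly 5 ns, so a few tens of metres of slack is already a noticeable fraction of a microsecond.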
 
Is this beyond (or independent of) the link speed?
Why not simply upgrade from, say, a current 1 Gbit/s link to a new 2.5, 10, 25, 40, 50, or even 100+ Gbit/s link instead of fiddling with bypassing the TCP/IP stack?
 
Bandwidth is not speed; it is the amount of data you can transfer per unit of time.
If you already have enough bandwidth, adding more will not make anything meaningfully faster.
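
One way to see this is to work out the only part of the latency that the link rate does change: the time needed to clock a frame's bits onto the wire (serialization delay). The sketch below is plain arithmetic; the frame sizes are illustrative assumptions, and it deliberately ignores propagation delay and time spent in the host's network stack, neither of which depends on the link rate.

```python
def serialization_delay_us(frame_bytes: int, link_gbit_per_s: float) -> float:
    """Time in microseconds to put one frame of the given size on the wire."""
    bits = frame_bytes * 8
    return bits / (link_gbit_per_s * 1e9) * 1e6


# Assumed sizes: a small order-sized message and a full-size Ethernet payload.
for frame_bytes in (100, 1500):
    for gbps in (1, 10, 25, 100):
        delay = serialization_delay_us(frame_bytes, gbps)
        print(f"{frame_bytes:>5} bytes at {gbps:>3} Gbit/s: {delay:.3f} us")
```

For small messages the serialization delay is already well under a microsecond at 10 Gbit/s, so going to a faster link shaves off very little; whatever the software stack adds is untouched.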

Besides, HFT firms usually have colocated infrastructure, so they have direct links to the destination (exchanges, market data sources) via fiber cross-connects.
 
- those include direct access to network cards to bypass the standard TCP/IP stack ...".

I guess the question is how much latency the existing TCP/IP stack adds, i.e. the one that comes with the OS, including the network card driver and so on.

When you read from an already-connected TCP socket, what is the latency from the moment a byte of data first arrives at the network card until it is received in your app? It has to go through several OS layers.

I don't know the answer myself, as this is not my area of expertise. Maybe it is 100 nanoseconds, maybe it is 100 microseconds.

Can this latency be noticeably reduced by direct access to the network card?

I am guessing it is large enough to make a difference.
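
A crude way to get a feel for the order of magnitude is to time one-byte round trips over a loopback TCP connection. This is only a sketch under loud assumptions: loopback never touches a real network card or its driver, and the numbers include Python interpreter and thread-scheduling overhead, so treat the result as a loose upper bound on the kernel socket path rather than a NIC-to-app measurement. The function names are made up for this example.

```python
import socket
import statistics
import threading
import time


def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def measure_rtt_us(rounds: int = 1000) -> list:
    """Return round-trip times in microseconds over a loopback TCP connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
    worker.start()

    samples = []
    with socket.create_connection(server.getsockname()) as client:
        # Disable Nagle's algorithm so single bytes are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(rounds):
            start = time.perf_counter_ns()
            client.sendall(b"x")
            client.recv(1)
            samples.append((time.perf_counter_ns() - start) / 1000)
    worker.join()  # the echo thread exits once the client has closed
    server.close()
    return samples


if __name__ == "__main__":
    rtts = measure_rtt_us()
    print(f"median loopback round trip: {statistics.median(rtts):.1f} us")
```

Whatever figure this prints on a given machine, it is the kind of per-message software cost that a userspace (kernel-bypass) driver is meant to cut down.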
 
fiddling with bypassing the TCP/IP stack

A lot of the time it's just a matter of experience and perspective. At some point you just get used to doing things without the networking protocol (for example), and then you gain a few nanoseconds and CPU cycles without thinking about it.

Have you heard of "linguistic determinism"?

In a nutshell, you abstract it and get used to it.

I am guessing it is large enough to make a difference..
I really like the RDMA interconnect for high availability clustering. Your apps can still use whatever they want/need if it makes sense. Spoiler alert: they got a 2x improvement.
 
This is common verbiage for saying that you bypass the kernel's TCP/IP stack by using a userspace driver instead. Doing so significantly improves latency, noticeably increases the maximum packet rate, and is common practice in this space.

The performance limitations with the kernel network stack are well-documented. See for example:
- https://dl.acm.org/doi/abs/10.1145/3297156.3297242
- https://www.cse.iitb.ac.in/~mythili/os/anno_slides/network_stack_kernel_bypass_slides.pdf
- https://blog.cloudflare.com/kernel-bypass/
- https://access.redhat.com/sites/def...olarflare_openonload_performance_brief_10.pdf
 
But how is it applied in practice for HFT between a client/bot and the exchange/broker?
Does the exchange/broker side need to use such a bypass technique too, or does it just use the standard TCP/IP stack, so that a userspace TCP/IP stack on the client side alone suffices?
Or is it intended only for one's own market-making venue, i.e. an internal/unofficial/non-public exchange or dark pool?
And which US exchanges support or offer this technique? Any links to official/corporate info?
 