128kbps
just joined
Topic Author
Posts: 6
Joined: Fri Nov 10, 2023 7:55 pm

CHR does not correctly balance the use of vCores

Mon Nov 20, 2023 1:57 pm

I am testing on a DELL OEM-R 720xd server with a "NetXtreme II BCM57800 1/10 Gigabit Ethernet" C-NIC.
Note 1: All server firmware is updated to the latest available version.
Note 2: DELL C-NIC designation: "QLogic 57800S Quad Port 2, 1GB x 2, 10Gb rNDC SFP+/DA"
Note 3: The server is legacy; I have no control over the hardware.

The server has two E5-2650 v2 CPUs and 128GB of DDR3 ECC RAM.
I installed Proxmox 8.x and am testing CHR (7.12) on it.
Note 4: Hyper-threading is disabled.
I pass the network card directly to the CHR through "PCI Passthrough".
Note 5: I added the C-NIC to "/etc/modprobe.d/[*].conf" with its vendor and device ID, and I also added "bnx2x" to the blacklist so that Proxmox does not claim the C-NIC and it becomes available for "PCI Passthrough".

So far everything works fine, in fact the "PCI Passthrough" is done correctly, I have no errors and the C-NIC interfaces are seen in CHR.

The problem is that when I check the balance of the vcores from the “Profile” tool I see that there is always one that shoots up between 80 to 98% and the rest remain at an equal average between them.

Questions:

a- Is this behavior normal, in CHR?
I ask this question because in my CCR1072-1G-8S+ the balancing of the cores is equitable between all of them.

b- Can this be a bottleneck for my CHR?
I ask this because the “Networking” percentage associated with that vCore also skyrockets, unlike the rest of the vCores, and perhaps this is the cause of the packet loss I see in the Bandwidth Test.

Thank you for your time and dedication in answering this question.
Cordially.

128kbps
 
mkx
Forum Guru
Posts: 11740
Joined: Thu Mar 03, 2016 10:23 pm

Re: CHR does not correctly balance the use of vCores

Mon Nov 20, 2023 10:25 pm

128kbps wrote:
The problem is that when I check the balance of the vcores from the “Profile” tool I see that there is always one that shoots up between 80 to 98% and the rest remain at an equal average between them.

What kind of workload is going on when you see one vCPU load rise towards 100%? If you're, by any chance, running bandwidth test (the ROS own tool), then what you see is inconclusive. Bandwidth test is known to cause huge load (on any platform).
 
128kbps

Re: CHR does not correctly balance the use of vCores

Wed Nov 22, 2023 12:05 am

mkx wrote:
What kind of workload is going on when you see one vCPU load rise towards 100%? If you're, by any chance, running bandwidth test (the ROS own tool), then what you see is inconclusive. Bandwidth test is known to cause huge load (on any platform).

Dear mkx, thank you for your response and sorry for my delay in responding, a lot of work.

I attach an image with the "Bandwidth Test" tests.
As you say, the consumption of this tool is excessive and "if it passes the test, it will surely not have problems in production XD"

My question refers to the fact that, in this case, the "cpu 6" shoots up to more than 90% while the others maintain prudent consumption.

Doing the same test in reverse, the CCR maintains its cores in similar ranges.

I know that Tile and x86 are, so to speak, "oil and water"; that's why I'm asking whether the behavior I see on CHR is normal or whether I need to check some extra configuration.

Test data:
1- Everything happens in a laboratory.
2- CCR and CHR have a direct fiber connection with MikroTik SFP+ modules (S+85DLC03D).
3- I connect via telnet to the CCR1072-1G-8S+.
4- I run the bandwidth-test from the CCR, making the CCR send 7Gbps of traffic and the CHR return 700Mbps.

To finish you will see that I have "Rx Errors" and "lost packets", that is what my second question was about...
Could it be that the connection is degraded by having a CPU that consumes too much?
As if it were a bottleneck.

Cordially.
 
mkx

Re: CHR does not correctly balance the use of vCores

Wed Nov 22, 2023 9:25 am

It's a pretty well-known fact that the ROS internal bandwidth-test tool is pretty CPU-heavy (bound to a single CPU), and its results are hardly representative of the device that is actually running it.

If you really want to assess the performance of your setup, you have to use external test probes (such as a couple of performance machines, running iperf3).
 
128kbps

Re: CHR does not correctly balance the use of vCores

Wed Nov 22, 2023 4:25 pm

Dear mkx, thank you again for your response and above all for the information.

I'm already building an iPerf3 client-server scheme to be able to test the performance of CHR on its interfaces.
As you may have noticed, I am new to this world and any additional information about this tool, based on your experience, will be appreciated.
I'm going to leave this post open to present the results of the tests in iPerf3 and make a closing that will help other newbies like me in the future.

Cordially.
 
mkx

Re: CHR does not correctly balance the use of vCores

Thu Nov 23, 2023 9:18 am

OK, so here goes another experience: ROS will use a single core to deal with packets belonging to the same connection (either a real TCP connection or an "apparent" UDP connection). The reason is to avoid out-of-order packet delivery (which upsets some TCP stacks). On devices with a larger number of slower cores (primarily CCR1xxx), this will cause a bottleneck for single-threaded connection throughput.
However, if there are multiple connections (they can be between the same client-server pair), the load will spread more evenly across all available CPU cores ... and that's often the real-life scenario as well: a number of concurrent users connecting to a number of internet servers.
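
To make the point above concrete, here is a minimal sketch. It is not the actual ROS algorithm (that is not documented in this thread); it simply assumes that a core is picked from a toy XOR hash of the connection 5-tuple. The core count, the client address 10.0.1.10 and the source ports are made-up values for illustration.

```python
import ipaddress
from collections import Counter

NUM_CORES = 8  # assumed vCPU count, for illustration only


def core_for_flow(src_ip, src_port, dst_ip, dst_port, proto, num_cores=NUM_CORES):
    """Map a connection 5-tuple to a core with a toy XOR hash (hypothetical)."""
    mixed = (
        int(ipaddress.ip_address(src_ip))
        ^ int(ipaddress.ip_address(dst_ip))
        ^ src_port
        ^ dst_port
        ^ proto
    )
    return mixed % num_cores


def core_load(flows, packets_per_flow=1000):
    """Count packets per core when every flow sends the same number of packets."""
    load = Counter()
    for flow in flows:
        load[core_for_flow(*flow)] += packets_per_flow
    return load


TCP = 6  # IP protocol number for TCP
# 10.0.1.10 is an assumed client address; 5201 is the iperf3 default port.
single = [("10.0.1.10", 50000, "10.0.0.10", 5201, TCP)]
parallel = [("10.0.1.10", 50000 + i, "10.0.0.10", 5201, TCP) for i in range(50)]

print("1 connection  :", sorted(core_load(single).items()))
print("50 connections:", sorted(core_load(parallel).items()))
```

With one connection, all packets land on a single core no matter how many cores exist. With 50 connections that differ only in source port, the same mapping spreads the load over every core.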

So when doing iperf3 tests, always conduct both single-threaded and multi-threaded tests (using the "--parallel N" command-line parameter ... and set N to 8 or more).
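
A small helper along these lines can generate the single-stream and multi-stream command lines so that both variants always get run. This is only a sketch: the helper name `iperf3_cmd` and the server address 192.0.2.10 are placeholders, and only the standard iperf3 client options `-c`, `-f`, `-t`, `-u`, `-P` (short for `--parallel`) and `-b` are used.

```python
import shlex


def iperf3_cmd(server, seconds=60, parallel=1, udp=False, bandwidth=None):
    """Build an iperf3 client command line as a list of arguments."""
    cmd = ["iperf3", "-c", server, "-f", "g", "-t", str(seconds)]
    if udp:
        cmd.append("-u")  # UDP instead of the default TCP
    if parallel > 1:
        cmd += ["-P", str(parallel)]  # number of parallel streams
    if bandwidth:
        cmd += ["-b", bandwidth]  # target bitrate per stream, e.g. "20M"
    return cmd


SERVER = "192.0.2.10"  # placeholder address, replace with your iperf3 server

# One single-stream run plus two multi-stream runs (N >= 8).
for streams in (1, 8, 16):
    print(shlex.join(iperf3_cmd(SERVER, parallel=streams)))
```

To actually run a test instead of printing it, the returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where iperf3 is installed.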

Another hint: TCP throughput is bounded by round-trip delay, and sometimes throughput gets lower than expected (even when considering RTT and the rest of the unknowns). So just to measure raw speed, it's sometimes good to perform UDP tests. If the results are considerably higher than the TCP results, then it's worth investigating the reason. Sometimes it simply boils down to timing issues which can't be remedied by router settings (but it's good to know if there are any such issues present on the network).
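
The RTT dependency can be put into numbers with the usual window-per-round-trip estimate. This is a rough sketch: it ignores loss, slow start and protocol overhead, and the 64 KiB window and 1 ms RTT are assumed example values, not measurements from this thread.

```python
def max_tcp_throughput_gbps(window_bytes, rtt_ms):
    """Upper bound for one TCP stream: at most one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e9


def window_needed_bytes(target_gbps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to sustain the target rate."""
    return target_gbps * 1e9 * (rtt_ms / 1000) / 8


# Assumed example values: 64 KiB window, 1 ms round-trip time.
print(f"{max_tcp_throughput_gbps(64 * 1024, 1.0):.3f} Gbit/s max with a 64 KiB window")
print(f"{window_needed_bytes(10, 1.0):,.0f} bytes in flight needed for 10 Gbit/s")
```

If a single TCP stream stays far below what UDP achieves on the same path, comparing the measured rate against this bound is a quick way to see whether window size and RTT already explain the gap.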
 
128kbps

Re: CHR does not correctly balance the use of vCores

Fri Nov 24, 2023 12:39 am

Dear mkx, thank you for your response and also for the shared knowledge, it is very helpful to me.

I attach an image with the layout of the current laboratory.
As you will see, everything is done within the same Proxmox server.
I'm waiting for my new NIC "10 Gb Hp 560sfp Dual Port Sfp+ (Intel X520)" so I can take the lab to something closer to reality.
I attach my results because CHR behavior improved and CPU consumption decreased drastically, even in these conditions where all interfaces are virtualized within Proxmox in three bridges.

I decided to make CHR route traffic from one network to another to make it more interesting.

I attach the results of the tests in TCP:

-- TCP traffic -- 1 connection
--COMMAND:
iperf3 -c 10.0.0.10 -f g -t 60
Client - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate         Retr
[  5] 0.00-60.00 sec    15.1 GBytes  2.16 Gbits/sec  11297    sender
[  5] 0.00-60.04 sec    15.1 GBytes  2.16 Gbits/sec           receiver
Server - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate
[  5] 0.00-60.04 sec    15.1 GBytes  2.16 Gbits/sec           receiver
--"Tx Queue Drops" in "ether6" = 330

-- TCP traffic -- 50 connections
--COMMAND:
iperf3 -c 10.0.0.10 -f g -t 60 -P 50
Client - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate         Retr
[SUM] 0.00-60.00 sec    54.5 GBytes  7.80 Gbits/sec  394634   sender
[SUM] 0.00-60.00 sec    54.5 GBytes  7.80 Gbits/sec           receiver
Server - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate
[SUM] 0.00-60.00 sec    54.5 GBytes  7.80 Gbits/sec           receiver
--"Tx Queue Drops" in "ether6" = 57439

I attach the results of the tests in UDP:

-- UDP traffic -- 50 connections -- 20M bandwidth per stream
--COMMAND:
iperf3 -c 10.0.0.10 -f g -t 60 -u -P 50 -b 20M
Client - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate         Jitter    Lost/Total Datagrams
[SUM] 0.00-60.00 sec    6.98 GBytes  1.00 Gbits/sec  0.000 ms  0/5179550 (0%)          sender
[SUM] 0.00-60.03 sec    6.98 GBytes  1.00 Gbits/sec  0.177 ms  0/5179525 (0%)          receiver
Server - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate         Jitter    Lost/Total Datagrams
[SUM] 0.00-60.03 sec    6.98 GBytes  999 Mbits/sec   0.177 ms  0/5179525 (0%)          receiver

-- UDP traffic -- 50 connections -- 40M bandwidth per stream
--COMMAND:
iperf3 -c 10.0.0.10 -f g -t 60 -u -P 50 -b 40M
Client - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate         Jitter    Lost/Total Datagrams
[SUM] 0.00-60.00 sec    12.5 GBytes  1.79 Gbits/sec  0.000 ms  0/9261200 (0%)          sender
[SUM] 0.00-60.03 sec    12.3 GBytes  1.76 Gbits/sec  0.309 ms  119344/9260418 (1.3%)   receiver
Server - - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval          Transfer     Bitrate         Jitter    Lost/Total Datagrams
[SUM] 0.00-60.03 sec    12.3 GBytes  1.76 Gbits/sec  0.309 ms  119344/9260418 (1.3%)   receiver
--"Tx Queue Drops" in "ether6" = 51938
--"Rx Drops" in "ether7" = 54611

I'm not sure why UDP starts to cause problems above 1.5Gbps.
Could it be the Linux bridges?
Could I be doing something wrong in the iperf3 parameter configuration?

Well, that's what I have for now. When the NIC arrives, I'm going to make the traffic travel over real physical interfaces and so get closer to a real production environment.

Cordially.
 
mkx

Re: CHR does not correctly balance the use of vCores

Fri Nov 24, 2023 7:17 am

When a UDP iperf3 test shows the transmitter falling below the configured total bandwidth, this usually means a bottleneck on the transmitter itself - that's the only place where UDP throughput is throttled without packets being dropped.
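
As a quick check of that reasoning against the UDP numbers posted earlier in this thread, the configured total is the per-stream "-b" value times the number of streams. The sketch below uses hypothetical helper names and takes its inputs from the two UDP runs above.

```python
def parse_rate_mbps(rate):
    """Convert an iperf3 -b value such as '20M' or '1G' to Mbit/s."""
    units = {"K": 1e-3, "M": 1.0, "G": 1e3}
    return float(rate[:-1]) * units[rate[-1].upper()]


def udp_summary(streams, per_stream_rate, sent_gbps, lost, total):
    """Compare the configured total rate with what the sender actually achieved."""
    offered_gbps = streams * parse_rate_mbps(per_stream_rate) / 1000
    return {
        "offered_gbps": offered_gbps,
        "sender_shortfall_gbps": round(offered_gbps - sent_gbps, 2),
        "loss_percent": round(100 * lost / total, 2),
    }


# Inputs copied from the two UDP runs (sender bitrate, receiver lost/total).
print(udp_summary(50, "20M", 1.00, 0, 5179525))
print(udp_summary(50, "40M", 1.79, 119344, 9260418))
```

In the 20M run the sender reaches the full 1.0 Gbit/s it was asked for. In the 40M run it was asked for 2.0 Gbit/s but only sent 1.79 Gbit/s, so the sender itself is already short before any packet is dropped on the path.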
