Discussion:
Correctly calculating overheads on unknown connections
Dave Taht
2014-09-20 17:55:27 UTC
Permalink
We'd had a very long thread on cerowrt-devel and in the end Sebastian
(I think) had developed some scripts to exhaustively (it took hours)
derive the right encapsulation frame size on a link. I can't find the
relevant link right now, ccing that list...
Hi,
I am looking to figure out the most foolproof way to calculate stab
overheads for ADSL/VDSL connections.
ppp0      Link encap:Point-to-Point Protocol
          inet addr:81.149.38.69  P-t-P:81.139.160.1  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1492  Metric:1
          RX packets:17368223 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12040295 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:17420109286 (16.2 GiB)  TX bytes:3611007028 (3.3 GiB)
I am setting a longer txqueuelen as I am not currently using any fair
queuing (bufferbloat issues with sfq).
Whatever the txqueuelen is on ppp, there is likely some other buffer after
it - the default can hurt with e.g. htb, as if you don't add qdiscs to the
classes it takes (last time I looked) its qlen from that.
Sfq was only ever meant for bulk, so should really be in addition to
some classification to separate interactive - I don't really get the
Hmm? sfq separates bulk from interactive pretty nicely. It tends to do
bad things to bulk as it doesn't manage queue length.

A little bit of prioritization or deprioritization for some traffic is
helpful, but most traffic is hard to classify.
bufferbloat bit, you could make the default 128 limit lower if you wanted.
htb + fq_codel, if available, is the right thing here....

http://www.bufferbloat.net/projects/cerowrt/wiki/Wondershaper_Must_Die
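For what it's worth, a minimal sketch of that combination on the ppp interface (the rate, the overhead of 28 and "linklayer atm" are placeholder assumptions for an ADSL uplink; for VDSL/PTM drop the atm setting, as discussed further down):

TC=/sbin/tc
# egress shaping on ppp0: htb provides the rate limit, fq_codel the queue management
$TC qdisc add dev ppp0 root handle 1: stab overhead 28 linklayer atm htb default 10
$TC class add dev ppp0 parent 1: classid 1:10 htb rate 850kbit ceil 850kbit
$TC qdisc add dev ppp0 parent 1:10 fq_codel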
The connection is a BT Infinity FTTC VDSL connection synced at
80mbit/20mbit. The modem is connected directly to the ethernet port
on a server running a slightly tweaked HFSC setup that you folks
helped me set up in July - back when I was on ADSL. I am still
running pppoe I believe from my server.
I have had a similar connection since May 2013 and I still haven't got
round to reading up on everything yet :-)
I have extra geek score for using mini jumbos = running pppoe with an MTU
of 1500, which works for me on Plusnet. You need a recent pppd for this and
a NIC that works with MTU >= 1508.
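A sketch of what that setup typically involves (the interface name and peers file path are assumptions, and the pppd must have RFC 4638 / baby-jumbo support):

# the NIC (and any switch in the path) must carry 1508-byte frames: 1500 payload + 8 PPPoE
ip link set dev eth1 mtu 1508
# then in the pppd peers file, e.g. /etc/ppp/peers/provider:
#   plugin rp-pppoe.so eth1
#   mtu 1500
#   mru 1500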
As for overheads, initial searching indicated that it's not easy, or maybe
not even truly possible, in the way it is for ADSL.
The largest ping packet that I can fit out onto the wire is 1464

# ping -c 2 -s 1464 -M do google.com
PING google.com (31.55.166.216) 1464(1492) bytes of data.
1472 bytes from 31.55.166.216: icmp_seq=1 ttl=58 time=11.7 ms
1472 bytes from 31.55.166.216: icmp_seq=2 ttl=58 time=11.9 ms

# ping -c 2 -s 1465 -M do google.com
PING google.com (31.55.166.212) 1465(1493) bytes of data.
From host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1 Frag needed and DF set (mtu = 1492)
From host81-149-38-69.in-addr.btopenworld.com (81.149.38.69) icmp_seq=1 Frag needed and DF set (mtu = 1492)
You can't work out your overheads like this.
On slow uplink ADSL it was possible with ping to infer the fixed part,
but you needed to send loads of pings increasing in size and plot the
best time for each to make a stepped graph.
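Something along these lines (a rough sketch of the idea, not Sebastian's ping_sweeper script mentioned at the top of the thread; the target host and payload range are assumptions, and sub-200 ms ping intervals need root):

TARGET=8.8.8.8                       # any nearby, responsive host
for size in $(seq 16 116); do        # spans a couple of 48-byte ATM cells
    best=$(ping -c 20 -i 0.01 -s "$size" "$TARGET" | \
           awk -F'=' '/^(rtt|round-trip)/ {print $2}' | cut -d/ -f1)
    echo "$size $best"
done > sweep.dat                     # plot size vs best RTT: ATM shows 48-byte steps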
Based on this I believe overhead should be set to 28, however with 28
set as my overhead and hfsc ls m2 20000kbit ul m2 20000kbit I seem
to be losing about 1.5mbit of upload...
Even if you could do things perfectly I would back off a few kbit just
to be safe. Timers may be different or there may be OAM/Reporting data
going up, albeit rarely.
http://www.thinkbroadband.com/speedtest/results.html?id=141116089424883990118
http://www.thinkbroadband.com/speedtest/results.html?id=141116216621093133034
Am I calculating overhead incorrectly?
VDSL doesn't use ATM; I think the PTM it uses is 64/65 encoding - so don't
specify atm with stab. Unfortunately stab doesn't do 64/65.
As for the fixed part - I am not sure, but roughly, starting with IP as
that's what tc sees on ppp (as opposed to IP + 14 on eth):
IP
+8 for PPPoE
+14 for ethertype and MACs
+4 because the Openreach modem uses a VLAN
+2 CRC ??
+ "a few" for 64/65
That's it for fixed - of course 64/65 adds another one for every 64. TBH
I didn't get the precise detail from the spec and, not having looked
recently, I can't remember.
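Adding up the countable items above (a quick sketch; the 64/65 term is left out and the 2-byte CRC is the guess marked with "??"):

echo $(( 8 + 14 + 4 + 2 ))   # PPPoE + ethertype/MACs + VLAN + CRC? = 28 bytes on top of IP

which is where the figure of 28 used earlier comes from.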
BT SIN 498 does give some of this info and a couple of examples of
throughput for different frame sizes - but it's rounded to kbit, which
means I couldn't work out to the byte what the overheads were.
Worse still, VDSL can use link layer retransmits and the SIN says that
though currently (2013) not enabled, they would be in due course. I have
no clue how these work.
--
Dave Täht

https://www.bufferbloat.net/projects/make-wifi-fast
Sebastian Moeller
2014-09-21 18:35:06 UTC
Permalink
Hi Dave, hi Andy,
Sebastian Moeller
2014-09-22 09:05:26 UTC
Permalink
Hi Bill,
On my Linux boxes ping has a -A option for adaptive ping, effectively sending out a new ping as soon as the reply to the last one is received, instead of having to wait a fixed period of time between pings.
Interesting, the current version of ping_sweeper sends a ping every 10ms; with the typical RTT on an ADSL link > 10ms I am not sure how much "-A" speeds things up (except that your method works with any uplink speed, while with fixed intervals one needs to tweak the ping interval). One other reason for the ping interval was that some routers/BRAS/DSLAMs are rumored to rate limit ICMP processing, so I wanted to be able to control the rate to work around such limitations.
I modified ping_sweeper to use that last December when I was still on a DSL link and was able to find the overhead with only a few minutes of collecting data. (The connection was 6Mbps down, 512kbps up.)
So, the slower the link is the fewer packets you need, as the per-ATM-cell time increase gets larger and hence easier to detect in the noise. xDSL uses a symbol rate of 4KHz, so there should be a quantization of 0.25 ms that will make detection on fast uplinks trickier (in my experience it works up to 2600Kbps, the fastest I had available)...
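(For illustration, the per-cell step being looked for on the 512kbps uplink mentioned above - the rate is just that example:)

# one extra 53-byte ATM cell on a 512 kbit/s uplink adds roughly
awk 'BEGIN { printf "%.2f ms per cell\n", (53*8) / 512000 * 1000 }'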
There was a bit of noise in the data from other traffic in the house, but the stair-step shape of the plot was unmistakeable and the octave script had no trouble identifying the per-packet overhead.
Great to hear that it worked out.

Best Regards
Sebastian
Hi Dave, hi Andy,
Post by Dave Taht
We'd had a very long thread on cerowrt-devel and in the end sebastian
(I think) had developed some scripts to exhaustively (it took hours)
derive the right encapsulation frame size on a link. I can't find the
relevant link right now, ccing that list…
I am certainly not the first to have looked at ATM encapsulation effects on DSL-links, e.g. Jesper Dangaard Brouer wrote a thesis about this topic (see http://www.adsl-optimizer.dk) and together with Russell Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/) I believe they taught the linux kernel how to account for encapsulation. What you need to tell the kernel is whether or not you have ATM encapsulation (ATM is weird in that each IP packet gets chopped into 48 byte cells, with the last partially full cell padded) and the per packet overhead on your link. You can either get this information from your ISP and/or from the DSL-modem's information page, but both are not guaranteed to be available/useful.

So I set out to empirically deduce this information from measurements on my own link. I naively started out with using ICMP echo requests as probes (as I could easily generate probe packets of different sizes with the linux/macosx ping binary); as it turned out, this works well enough, at least for relatively slow ADSL-links. So ping_sweeper6.sh (attached) is the program I use (on an otherwise idle link, typically over night) to collect ~1000 repetitions of time stamped ping packets spanning two (potential) ATM cells. I then use tc_stab_parameter_guide.m (a matlab/octave program) to read in the output of the ping_sweeper script and process the data. In short, if the link runs ATM encapsulation the plot of the data needs to look like a stair with 48 byte step width; if it is just smoothly increasing the carrier is not ATM. For ATM links, and only ATM links, the script also tries to figure out the per packet overhead, which always worked well for me.

(My home link recently got a silent upgrade where the encapsulation changed from 40 bytes to 44 bytes (probably due to the introduction of VLAN tags), which caused some disturbances in link capacity measurements I was running at the time; so I ran my code again and lo and behold the overhead had increased, which was what caused the issues with the measurements - after taking the real overhead into account the disturbances went away. But I guess I digress ;) )
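(For readers without octave, the core of the post-processing can be sketched in shell, assuming a file with one "payload-size best-RTT" pair per line like the sweep.dat sketched earlier in this thread; this only illustrates the idea and is not tc_stab_parameter_guide.m:)

# flag payload sizes where the best RTT jumps, i.e. candidate ATM cell boundaries;
# the 0.3 ms threshold is an assumed value that depends on the uplink rate
awk 'NR > 1 && ($2 - prev) > thresh { print "cell boundary between payload sizes", prev_size, "and", $1 }
     { prev = $2; prev_size = $1 }' thresh=0.3 sweep.dat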
Best Regards
Sebastian
Sebastian Moeller
2014-09-22 10:20:20 UTC
Permalink
Hi Andy,
Post by Sebastian Moeller
Hi Dave, hi Andy,
Post by Dave Taht
We'd had a very long thread on cerowrt-devel and in the end
sebastian (I think) had developed some scripts to exhaustively (it
took hours) derive the right encapsulation frame size on a link. I
can't find the relevant link right now, ccing that list…
I am certainly not the first to have looked at ATM encapsulation
effects on DSL-links, e.g. Jesper Dangaard Brouer wrote a thesis
about this topic (see http://www.adsl-optimizer.dk) and together with
Russel Stuart (http://ace-host.stuart.id.au/russell/files/tc/tc-atm/)
One note about Russell's handy list of ADSL overheads: these do not include VLAN tags, so all the shown combinations can be 4 bytes larger if a VLAN tag is added, as is quite common nowadays for double- and triple-play connections. Fun fact: for a rather run-of-the-mill encapsulation, LLC/SNAP over AAL5 with VLAN tags, we now have three independent methods to multiplex different "connections" over the line (and that does not count PPPoE), all provisioned out of the bandwidth the end user pays for, and in the best case only one of them is functional, but I digress...
Post by Sebastian Moeller
I believe they taught the linux kernel about how to account for
encapsulation. What you need to tell the kernel is whether or not you
have ATM encapsulation (ATM is weird in that each ip Packet gets
chopped into 48 byte cells, with the last partially full cell padded)
and the per packet overhead on your link. You can either get this
information from your ISP and/or from the DSL-modem’s information
page, but both are not guaranteed to be available/useful. So I set
out to empirically deduce this information from measurements on my
own link. I naively started out with using ICMP echo requests as
probes (as I easily could generate probe packets with different sizes
with the linux/macosx ping binary), as it turned out, this works well
enough, at least for relatively slow ADSL-links. So ping_sweeper6.sh
(attached) is the program I use (on an otherwise idle link, typically
over night) to collect ~1000 repetitions of time stamped ping packets
spanning two (potential) ATM cells. I then use
tc_stab_parameter_guide.m (a matlab/octave program) to read in the
output of the ping_sweeper script and process the data. In short if
the link runs ATM encapsulation the plot of the data needs to look
like a stair with 48 byte step width, if it is just smoothly
increasing the carrier is not ATM. For ATM links and only ATM links,
the script also tries to figure out the per packet overhead which
always worked well for me. (My home-link got recently a silent
upgrade where the encapsulation changed from 40 bytes to 44 bytes
(probably due to the introduction of VLAN tags), which caused some
disturbances in link capacity measurements I was running at the time;
so I ran my code again and lo and behold the overhead had increased,
which caused the issues with the measurements, as after taking the
real overhead into account the disturbances went away, but I guess I
digress ;) )
Sounds like a handy script, though I am not so sure it would help for
vdsl 64/65 (if that is actually used!).
No, currently my script will tell you whether you have ATM cell encapsulation on your link or not (as far as I know VDSL2 means PTM (64/65) and ADSL[1,2,2+] means ATM; not sure about VDSL1, but I think VDSL2 is not prohibited from using ATM, nor is ADSL stuck on ATM). If, and only if, ATM is used will the script help to deduce the per packet overhead. I am still waiting for the upgrade to VDSL2 on my home link; once I have that available I will see whether I can figure out information about the per packet overhead or not. All I know is that my current approach will not work, because it relies on the ATM quantization.
I don't think there is any
padding (but may be wrong!).
No, according to the standard the 64/65 encapsulation is "continuous", so it is not padded out or reset for each packet.
As for the history, Yea Jesper got his stuff in - but didn't allow
negative overheads so I still used to have to patch tc to workaround that.
True, but the "stab" work for tc got this right. Also note that the stab option does not automatically add the known overhead to the packet as indicated by the outdated man-page, so the ability to specify negative overheads is basically not needed or useful. And yes, the kernel needs to be fixed, one of these days. Speaking of kernel code, Jesper "recently" hoisted HTB's link layer adjustment method into the present (basically getting rid of the tables and allowing for GRO packets), something that stab also needs to have fixed...
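(For reference, the in-HTB variant that Jesper's newer work enables looks roughly like this - a sketch with assumed rate and overhead values, not a configuration from this thread:)

# link layer accounting done by htb itself, without stab
tc qdisc add dev ppp0 root handle 1: htb default 10
tc class add dev ppp0 parent 1: classid 1:10 htb rate 512kbit ceil 512kbit \
    overhead 40 linklayer atm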
Before his work there was some user space code by IIRC Dan Singletary
which I used for a while and later Ed Wildgoose analysed the kernel code
and posted patches for htb and tc on the original lartc list which I
used for some time before Jespers code got in.
Interesting piece of history; all that happened before I cared, heck, even Jesper's thesis was out by then.


Best Regards
Sebastian
Alan Goodman
2014-09-21 21:40:14 UTC
Permalink
Hi Billy,

Please can you share your modified script?

Alan
On my Linux boxes ping has a -A option for adaptive ping, effectively
sending out a new ping as soon as the reply to the last one is received,
instead of having to wait a fixed period of time between pings. I
modified ping_sweeper to use that last December when I was still on a
DSL link and was able to find the overhead with only a few minutes of
collecting data. (The connection was 6Mbps down, 512kbps up.) There was
a bit of noise in the data from other traffic in the house, but the
stair-step shape of the plot was unmistakeable and the octave script had
no trouble identifying the per-packet overhead.
Andy Furniss
2014-09-22 10:01:34 UTC
Permalink
Sounds like a handy script, though I am not so sure it would help for
vdsl 64/65 (if that is actually used!). I don't think there is any
padding (but may be wrong!).

As for the history, yeah Jesper got his stuff in - but it didn't allow
negative overheads so I still used to have to patch tc to work around that.

Before his work there was some user space code by IIRC Dan Singletary
which I used for a while, and later Ed Wildgoose analysed the kernel code
and posted patches for htb and tc on the original lartc list which I
used for some time before Jesper's code got in.
Alan Goodman
2014-09-22 13:09:31 UTC
Permalink
Hello all once again,

I tried running the attached ping sweeper yesterday evening as is and
didn't get particularly plausible-looking results. I therefore decided
to increase the upper limit of the size of ping packets sent and let the
script run over night while the connection was quiet.

Here is a screen shot of the resulting graph which does appear to have a
stepped appearance, but perhaps not as expected?
http://imgur.com/RjmT8Qh

This test was run on a BT Infinity VDSL/FTTC connection with the modem
plugged directly into a CentOS 6 machine which is doing PPPoE. The
connection is synced at 80mbit down and 20mbit up. BT restricts
downstream speed to 77.44Mbps of IP traffic.

I can run the test on a slower BT connection over the weekend if anyone
is interested in the results?

Alan
Sebastian Moeller
2014-09-22 19:52:39 UTC
Permalink
Hi Alan,
Post by Alan Goodman
Hello all once again,
I tried running the attached ping sweeper yesterday evening as is and didnt get particularly plausible looking results.
I concur, that does not look like ATM. I somehow like how I hedged my estimate of the ATM quantization by reporting it likely only if the residuals of the stair fit are smaller than the residuals of the linear fit ;) But that clearly is not an ATM carrier...
Post by Alan Goodman
I therefore decided to increase the upper limit of the size of ping packets sent and let the script run over night while the connection was quiet.
I guess that's not a bad idea, but in this case the simplistic heuristic of just comparing cumulative residuals is not good enough. (Note though that I do not have sufficient data sets to find a better statistical test.)
Post by Alan Goodman
Here is a screen shot of the resulting graph which does appear to have a stepped appearance, but perhaps not as expected?
http://imgur.com/RjmT8Qh
No, that is not the result to expect from an ATM carrier. Attached you will find example plots from a real ATM quantized link (2558 Kbps upload, 16402 Kbps download); notice how well the red line follows the green stair function in f2? Your example basically shows no stair function in the data, but for FTTC or VDSL2 that is to be expected, as they finally got rid of the ATM carrier (which had overstayed its welcome once the telco backbones switched away from ATM as well...)
Alan Goodman
2014-09-22 23:02:57 UTC
Permalink
This test was run on a BT Infinity VDSL/FTTC connection with the modem plugged directly into a CentOS 6 machine which is doing PPPoE. The connection is synced at 80mbit down and 20mbit up. BT restricts downstream speed to 77.44Mbps of IP traffic.
Thank you very much, this is the first data set on a VDSL line I have seen, and clearly my hypothesis that overhead detection on PTM carriers will not work with the current code is nicely demonstrated. I need to ponder this a bit more and I might not be able to find a nice solution for those links...
You're welcome. If you need any more data feel free to drop me a line.
I can run the test on a slower BT connection over the week end if anyone is interested in the results?
The other connection is actually ADSL2, we probably know what the
results there will be... I think I shall run the test on a really slow
ADSL connection later in the year to double check my overheads though.
It seems like a very useful tool.

Also thanks for providing some example plots of how it should look.
That will allow me to better interpret results in future.
1) Speed: It might be that your line is fast enough to hide the ATM quantization below another quantization (like the 4KHz symbol rate of the individual carriers) or too many concurrent carriers ;)
Would it be useful if I limited my upload speed with say hfsc to 1mbit
and re-ran the test?

Given the above comments in Sebastian's very useful emails, how would it
be best to shape these FTTC connections at present? Without overhead
set, or something else?

Alan
Sebastian Moeller
2014-09-23 09:32:49 UTC
Permalink
Hi Alan,
Post by Alan Goodman
This test was run on a BT Infinity VDSL/FTTC connection with the modem plugged directly into a CentOS 6 machine which is doing PPPoE. The connection is synced at 80mbit down and 20mbit up. BT restricts downstream speed to 77.44Mbps of IP traffic.
Thank you very much, this is the first data set on a VDSL line I have seen, and clearly my hypothesis that overhead detection on PTM carriers will not work with the current code is nicely demonstrated. I need to ponder this a bit more and I might not be able to find a nice solution for those links...
You're welcome. If you need any more data feel free to drop me a line.
Thanks for the offer, I might take you up on it ;) (next month I hope to upgrade to VDSL2 so I have an easier time trying new methods...)
Post by Alan Goodman
I can run the test on a slower BT connection over the week end if anyone is interested in the results?
The other connection is actually ADSL2, we probably know what the results there will be…
I assume that this will work reasonably well; for all ADSL lines I tested, 1000 samples per ping size and a range from 16 to 116 worked out well enough to detect quantization and overhead.
Post by Alan Goodman
I think I shall run the test on a really slow ADSL connection later in the year to double check my overheads though.
I think it is a decent idea to re-check the encapsulation used occasionally; in my case the ISP added VLAN tags (which I neither need nor want), increasing the overhead from 40 bytes to 44 bytes. If that had not caused irregularities in the netperf-wrapper tests I run, I would probably not have noticed. (If the link is fully saturated the wrong overhead has a strong effect on the link's latency, but with moderate load that is somewhat hidden and so easy to overlook.)
Post by Alan Goodman
It seems like a very useful tool.
Glad you like it ;) (I think the idea and method is sound, but those I lifted from Jesper’s thesis, my implementation however is messy)
Post by Alan Goodman
Also thanks for providing some example plots of how it should look. That will allow me to better interpret results in future.
1) Speed: It might be that your line is fast enough to hide the ATM quantization below another quantization (like the 4KHz symbol rate of the individual carriers) or too many concurrent carriers ;)
Would it be useful if I limited my upload speed with say hfsc to 1mbit and re-ran the test?
First I would try against different hosts; the fact that there is no linear increase of the RTT with increasing packet size is a sign that something is messing with our probe packets, and hence the whole thing gets iffy.

BUT, I strongly assume your VDSL link is using packet transfer mode (PTM) not ATM and so all my code can show you is that there is no quantization, and since overhead detection currently requires ATM cell quantization the reported numbers are just not useful. The reason to still report these is that I have not determined a proper statistical test to classify the link carrier.
Note (I might have explained that earlier, but I am not sure whether that was in this thread): the code tries to find the packet sizes at which the RTT increases, or in other words the boundaries of the ATM cells. Once this is done it uses all the information it has about pre-payload overhead (ICMP header, IP header…) and finds out how many bytes are missing to fully fill the first (two) ATM cells (these cells are not really shown in the plots); it then reports the previously unknown pre-IP bytes as the overhead that needs to be accounted for.
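(In other words, a sketch of the final arithmetic using the 20-byte IPv4 and 8-byte ICMP headers as the known pre-payload part; the numbers below are a hypothetical pppoa/vc-mux style link where a 10-byte payload exactly fills one cell:)

s=10; k=1   # s: largest ICMP payload that still fits into k ATM cells (assumed values)
echo "per-packet overhead = $(( 48*k - (s + 20 + 8) )) bytes"   # -> 10 here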
Post by Alan Goodman
Given the above comments in Sebastian’s very useful emails how would it be best to shape these FTTC connections at present?
So
Post by Alan Goodman
Without overhead set or something else?
I would just go and account for all the overheads I could deduce, so I would guess: 8 bytes PPPoE, 4 bytes VLAN tag, 14 bytes ethernet header (note that for tc's stab method one needs to include the ethernet headers in the specified overhead in spite of the man page); I am uncertain about the 4-byte ethernet frame check sequence (it was typically not included on ATM links). So in total 26 bytes; I would specify those. For PTM, getting the overhead wrong is not as bad as with ATM, so just try to make a good approximation.
A trickier question is how to select the shaping rate. In theory all xDSL-modems report some sort of line rate, but unfortunately the standards contain quite a lot of slightly different rates the modem manufacturer might decide to report, so I guess the best one can do is guess, then iterate over measure-and-refine cycles to figure out the "optimal" shaping rates. Rich Brown's betterspeedtest.sh or netperf-wrapper's RRUL test (see http://www.bufferbloat.net/projects/cerowrt/wiki/Quick_Test_for_Bufferbloat/ ) are decent ways for the measure part…
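(Putting both guesses together, a sketch of what that could look like when shaping egress on the ppp interface, where tc sees bare IP; the 19000kbit figure is just an arbitrary starting point below the 20mbit sync and 26 is the overhead guessed above - both to be refined by measurement:)

tc qdisc add dev ppp0 root handle 1: stab overhead 26 hfsc default 1   # no 'linklayer atm' on PTM
tc class add dev ppp0 parent 1: classid 1:1 hfsc ls m2 19000kbit ul m2 19000kbit
tc qdisc add dev ppp0 parent 1:1 fq_codel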

Best Regards
Sebastian Moeller
Post by Alan Goodman
Alan
Andy Furniss
2014-09-23 15:10:00 UTC
Permalink
Post by Sebastian Moeller
I would just go and account for all overheads I could deduce, so I
would guess: 8 bytes PPPoE, 4 byte VLAN tags, 14 bytes ethernet
header (note for tc’s stab method one needs to include the ethernet
headers in the specified overhead in spite of the man page)
I don't think the man page is wrong - it includes eth in the pppoe example.

There is a difference between shaping on ppp and shaping on eth which
needs to be and is noted.

FWIW I tried a few pings on my VDSL2 and don't think I'll be any use for
results.

I do get an increase with larger packets but it's more than it should be
:-(.

The trouble is that my ISP does DPI/Ellacoya QoS for my ingress and I
guess this affects things a bit too much for the sub-millisecond accuracy
needed on a 20/80 line.

At least I don't have to bother so much about ingress shaping (not that
I would @80mbit so much anyway).

Ping and game traffic comes in tos marked 0x0a and gets prio on their
egress which is set slightly lower than my sync profile speed.

Additionally it's probably not the best time to test as they had a
recent outage which caused an imbalance on their gateways which seems to
still persist.
Sebastian Moeller
2014-09-23 17:47:07 UTC
Permalink
Hi Andy,
Post by Andy Furniss
Post by Sebastian Moeller
I would just go and account for all overheads I could deduce, so I
would guess: 8 bytes PPPoE, 4 byte VLAN tags, 14 bytes ethernet
header (note for tc’s stab method one needs to include the ethernet
headers in the specified overhead in spite of the man page)
I don't think the man page is wrong - it includes eth in the pppoe example.
I am not sure we are talking about the same man page then. From openSUSE 13.1 "man tc-stab":
When size table is consulted, and you're shaping traffic for the sake of another modem/router, ethernet header (without padding) will already be added to initial packet's length. You should compensate for that by subtracting 14 from the above overheads in such case. If you're shaping directly on the router (for example, with speedtouch usb modem) using ppp daemon, you're using raw ip interface without underlying layer2, so nothing will be added.

For more thorough explanations, please see [1] and [2].

BUT if you look at the kernel code, stab does not automatically include the ethernet overhead, so the "subtract 14" in the above is actually wrong. See http://lxr.free-electrons.com/source/net/sched/sch_api.c#L538 where "pkt_len = skb->len + stab->szopts.overhead;" is used instead of "qdisc_skb_cb(skb)->pkt_len", which is filled properly in http://lxr.free-electrons.com/source/net/core/dev.c#L2705 . At least to me this clearly looks like the ethernet overhead is not pre-added when using stab, but I could be wrong.
And on an ADSL link you can see this quite well: with the proper overhead values sqm-scripts still controls the latency under netperf-wrapper's RRUL test nicely even if the shaping rate equals the line rate, while with the overhead too small, latency goes down the drain ;)
Post by Andy Furniss
There is a difference between shaping on ppp and shaping on eth which
needs to be and is noted.
Again I am not sure about the validity of the information in the man page...
Post by Andy Furniss
FWIW I tried a few pings on my VDSL2 and don't think I'll be any use for
results.
Well for the overhead calculation my script absolutely requires ATM cell quantization, with PTM as usual on VDSL2 it has no chance of working at all; the “signal” it is searching for simply does not exist with a PTM carrier ;)
Post by Andy Furniss
I do get an increase with larger packets but it's more than it should be
:-(.
If it is nicely linear that would be great.
Post by Andy Furniss
The trouble is that my ISP does DPI/Ellacoya Qos for my ingress and I
guess this affects things a bit too much for sub milisecond accuracy
needed on a 20/80 line.
Okay, so one issue is that with 80/20 you would expect the RTT difference from adding a single ATM cell to your packet to be:
((53*8) / (80000 * 1000) + (53*8) / (20000 * 1000)) * 1000 = 0.0265 milliseconds

With ping typically only reporting milliseconds with 1 decimal point this means that even if you had an ATM carrier you would be in for a long measurement train… but BT VDSL runs on PTM, so even with weeks of measurement time all that would show you is that there is no ATM quantization ;)
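(The same arithmetic as a reusable one-liner; the down/up rates in kbit/s are the ones from this example and would need adjusting for other lines:)

awk -v down=80000 -v up=20000 \
    'BEGIN { printf "%.4f ms per extra cell\n", ((53*8)/(down*1000) + (53*8)/(up*1000)) * 1000 }'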
Post by Andy Furniss
At least I don't have to bother so much about ingress shaping (not that I would @80mbit so much anyway).
I would a) love to have your connection, and b) would still try to shape ingress; but currently not many affordable home routers can actually reliably shape an 80/20 connection...
Post by Andy Furniss
Ping and game traffic comes in tos marked 0x0a and gets prio on their
egress which is set slightly lower than my sync profile speed.
Yeah, it seems excessively hard to calculate the net rate on VDSL links, as a number of encapsulation details are well hidden from the end user (think DTU size…), so simply aiming lower and performing a few tests seems like the best approach. A bit of a pity, since on ATM we really could account for everything (and for that reason I saw great latency results even when shaping my line to 100% of the reported line rate). I am quite curious how tricky this is going to be on VDSL...
Post by Andy Furniss
Additionally it's probably not the best time to test as they had a
recent outage which caused in-balance on their gateways which seems to
still persist.
Andy Furniss
2014-09-23 19:05:52 UTC
Permalink
Post by Sebastian Moeller
Hi Andy,
BUT if you look at the kernel code, stab does not automatically
include the ethernet overhead, so the subtract 14 in the above is
actually wrong. See
http://lxr.free-electrons.com/source/net/sched/sch_api.c#L538 where
“pkt_len = skb->len + stab->szopts.overhead; is used instead of using
“qdisc_skb_cb(skb)->pkt_len” that as filled properly in
http://lxr.free-electrons.com/source/net/core/dev.c#L2705 . At least
to me this clearly looks like the ethernet overhead is not pre-added
when using stab, but I could be wrong. And on an ADSL link you can
see this quite well, with the proper overhead values sqm-scripts
still controls the latency under netperf-wrapper’s RRUL test nicely
even if the shaping rate equals the line rate, with the overhead to
small latency goes down the drain ;)
I guess skb->len varies depending on the interface.

Anyway here's a quick test on my desktop PC running a git kernel and tc.

I used to remotely shape a pppoa/vc-mux DSL line, so I know that for me

ping -s 10 .... = one cell and -s 11 = 2 cells - the overhead on IP was 10.

Paste time -

ph4[/mnt/sda8/Qos/stab-tests]# cat stab-hfsc
#set -x
TC=/sbin/tc

$TC qdisc del dev eth0 root &>/dev/null

if [ "$1" = "stop" ]
then
exit
fi

$TC qdisc add dev eth0 root handle 1: stab overhead -4 linklayer atm hfsc default ffff
$TC class add dev eth0 parent 1: classid 1:1 hfsc sc rate 1kbit ul rate 1kbit
$TC qdisc add dev eth0 parent 1:1 pfifo limit 200
$TC class add dev eth0 parent 1:0 classid 1:ffff hfsc sc rate 80mbit ul rate 80mbit

$TC filter add dev eth0 parent 1: protocol ip prio 1 \
u32 match ip protocol 1 0xff classid 1:1

ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
ph4[/mnt/sda8/Qos/stab-tests]# ping -s 10 -c 1 noki
PING noki.andys.lan (192.168.0.1) 10(38) bytes of data.
18 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64

--- noki.andys.lan ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms

ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
qdisc hfsc 1: root refcnt 2 default ffff
Sent 106 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc pfifo 8005: parent 1:1 limit 200p
Sent 53 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
ph4[/mnt/sda8/Qos/stab-tests]# ping -s 11 -c 1 noki
PING noki.andys.lan (192.168.0.1) 11(39) bytes of data.
19 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64

--- noki.andys.lan ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms

ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
qdisc hfsc 1: root refcnt 2 default ffff
Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc pfifo 8006: parent 1:1 limit 200p
Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
ph4[/mnt/sda8/Qos/stab-tests]#

So it seems that overhead -4 is the correct thing to do.

I also tested backlogged (-i 0.2) with -s 10 and 11 and tcpdump showed
the correct deltas -

ph4[/mnt/sda8/Qos/stab-tests]# tcpdump -nnttti eth0 icmp and dst host noki

snip

00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 92, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 93, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 94, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 95, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 96, length 18
00:00:00.424001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 97, length 18
00:00:00.423999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 98, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 1, length 19
00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 2, length 19
00:00:00.848001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 3, length 19
00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 4, length 19
00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 5, length 19
00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 6, length 19
00:00:00.848002 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 7, length 19
Sebastian Moeller
2014-09-23 22:16:20 UTC
Permalink
Hi Andy,
Post by Andy Furniss
Post by Sebastian Moeller
Hi Andy,
BUT if you look at the kernel code, stab does not automatically
include the ethernet overhead, so the subtract 14 in the above is
actually wrong. See
http://lxr.free-electrons.com/source/net/sched/sch_api.c#L538 where
“pkt_len = skb->len + stab->szopts.overhead; is used instead of using
“qdisc_skb_cb(skb)->pkt_len” that as filled properly in
http://lxr.free-electrons.com/source/net/core/dev.c#L2705 . At least
to me this clearly looks like the ethernet overhead is not pre-added
when using stab, but I could be wrong. And on an ADSL link you can
see this quite well, with the proper overhead values sqm-scripts
still controls the latency under netperf-wrapper’s RRUL test nicely
even if the shaping rate equals the line rate, with the overhead to
small latency goes down the drain ;)
I guess skb->len varies depending on the interface.
Anyway here's a quick test on my desktop PC running a git kernel and tc.
I used to shape remotely pppoa/vc mux dsl so know that for me
ping -s 10 .... = one cell and -s 11 = 2 cells - overhead on IP was 10.
Paste time -
ph4[/mnt/sda8/Qos/stab-tests]# cat stab-hfsc
#set -x
TC=/sbin/tc
$TC qdisc del dev eth0 root &>/dev/null
if [ "$1" = "stop" ]
then
exit
fi
$TC qdisc add dev eth0 root handle 1: stab overhead -4 linklayer atm hfsc default ffff
$TC class add dev eth0 parent 1: classid 1:1 hfsc sc rate 1kbit ul rate 1kbit
$TC qdisc add dev eth0 parent 1:1 pfifo limit 200
$TC class add dev eth0 parent 1:0 classid 1:ffff hfsc sc rate 80mbit ul rate 80mbit
$TC filter add dev eth0 parent 1: protocol ip prio 1 \
u32 match ip protocol 1 0xff classid 1:1
ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
ph4[/mnt/sda8/Qos/stab-tests]# ping -s 10 -c 1 noki
PING noki.andys.lan (192.168.0.1) 10(38) bytes of data.
18 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64
--- noki.andys.lan ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
qdisc hfsc 1: root refcnt 2 default ffff
Sent 106 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc pfifo 8005: parent 1:1 limit 200p
Sent 53 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
ph4[/mnt/sda8/Qos/stab-tests]# ./stab-hfsc
ph4[/mnt/sda8/Qos/stab-tests]# ping -s 11 -c 1 noki
PING noki.andys.lan (192.168.0.1) 11(39) bytes of data.
19 bytes from noki.andys.lan (192.168.0.1): icmp_req=1 ttl=64
--- noki.andys.lan ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
ph4[/mnt/sda8/Qos/stab-tests]# tc -s qdisc ls dev eth0
qdisc hfsc 1: root refcnt 2 default ffff
Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc pfifo 8006: parent 1:1 limit 200p
Sent 106 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
ph4[/mnt/sda8/Qos/stab-tests]#
Thanks for sharing your test case; I can repeat these results exactly on my machines (I also tried htb instead of hfsc for fun: same result, as to be expected, see below).
Looking back at http://lxr.free-electrons.com/ident?i=qdisc_pkt_len_init (line 2731):

qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;

I begin to realize this function is not responsible for adding a single wire packet's ethernet header, but for figuring out how many on-the-wire packets a GSO packet will be chopped into, and adding the header overhead for the additional wire packets; I had completely overlooked the (gso_segs - 1) part, oops.

@cerowrt-devel: everyone using link layer ATM, you might want to try to reduce the per packet overhead by 14… (but please test)

So I stand corrected, you are right, tc's stab automatically adds the ethernet header. So I am off to repeat my netperf-wrapper tests right now with an overhead of 30 instead of 44; again, these tests confirm your observation. Interestingly, it seems netperf-wrapper's RRUL test really is suited to figuring out the overhead: while shaping to 100% of line rate (on ADSL2+, where the line rate is the net line rate after FEC), specifying too small an overhead makes the ICMP latency plot show larger deviations from the expected unloaded RTT plus 10ms. Too large an overhead, however, just decreases the goodput a bit while leaving the latency well under control.
Post by Andy Furniss
So it seems that overhead -4 is the correct thing to do.
And thanks to your help I fully agree.
Post by Andy Furniss
I also tested backlogged (-i 0.2) with -s 10 and 11 and tcpdump showed the correct deltas -
ph4[/mnt/sda8/Qos/stab-tests]# tcpdump -nnttti eth0 icmp and dst host noki
snip
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 92, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 93, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 94, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 95, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 96, length 18
00:00:00.424001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 97, length 18
00:00:00.423999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1345, seq 98, length 18
00:00:00.424000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 1, length 19
00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 2, length 19
00:00:00.848001 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 3, length 19
00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 4, length 19
00:00:00.848000 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 5, length 19
00:00:00.847999 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 6, length 19
00:00:00.848002 IP 192.168.0.3 > 192.168.0.1: ICMP echo request, id 1347, seq 7, length 19
I really appreciate your test script, thanks for taking the time.

Best Regards
Sebastian
Andy Furniss
2014-09-24 09:17:52 UTC
Permalink
Post by Sebastian Moeller
Thanks for sharing your test case; I can repeat these results
exactly on my machines (I also tried htb instead hfsc for fun: same
result as to be expected see below). Looking back at
http://lxr.free-electrons.com/ident?i=qdisc_pkt_len_init (line
qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len ;
I begin to realize this function is not responsible for adding
single wire packet’s ethernet header, but for figuring out in how
many on-the-wire packets to chop down a GSO packet , and add the
header overhead for the additional wire packets, I had completely
looked over the (gso-segs - 1) part, oops.
Glad it helped - I know from trying, and giving up, how hard/error prone
reading kernel code can be :-)
Post by Sebastian Moeller
@cerowrt-devel: everyone using link layer ATM you might want to try
to reduce the the per packet overhead by 14… (but please test)
Maybe you mean overhead calculated by a script?

Just to be clear, I expect that wrt would be shaping on ppp, so you
don't need to take the 14 off if that's the case.
Post by Sebastian Moeller
So I stand corrected, you are right, tic’s stab automatically adds
the ethernet header. So I am off to repeat my netperf-wrapper tests
right now again with overhead of 30 instead of 44, again these tests
confirm your observation. Interestingly, it seems netperf-wrapper’s
RRUL test really is suited to figure out the overhead: while shaping
to 100% of line rate (on ADSL2+ where line rate rate is the net line
rate (after FEC)) specifying too small an overhead the ICMP latency
plot shows larger deviations from the expected unload RTT plus 10ms.
Too large an overhead however just decreases the good put bait while
leaving the latency well under control.
I wouldn't word it like "stab adds ...". This is nothing to do with stab
really - it's just that the only length stab knows is skb->len, and that means
different things on different interfaces because of how the kernel works.

(I haven't retested all this, but I doubt it's changed)

On ppp skb->len = ip len

On eth skb->len = ip len + 14

On vlan skb->len = ip len + 18
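(A sketch of the practical consequence of those three cases for the stab overhead value, assuming the roughly 26 bytes of true wire overhead guessed earlier in the thread - the number is a placeholder for whatever a given link really uses:)

WIRE_OVERHEAD=26   # assumed: PPPoE 8 + ethernet header 14 + VLAN 4
echo "shaping on ppp:  stab overhead $WIRE_OVERHEAD"
echo "shaping on eth:  stab overhead $(( WIRE_OVERHEAD - 14 ))"
echo "shaping on vlan: stab overhead $(( WIRE_OVERHEAD - 18 ))"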

If you ran my script on various interfaces without stab I expect you
would still be able to see the difference - everyone who does any tc on
eth gets shaping with ip+14 sized packets.
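To make that concrete, here is a rough sketch - the 40 byte from-IP
overhead below is only an assumed PPPoE-over-ATM figure, so substitute
whatever your table or script gives you, and the 900kbit rate is just
illustrative. The same link needs two different stab overhead values
depending on which interface carries the qdisc:

# tc qdisc add dev ppp0 root handle 1: stab linklayer atm overhead 40 htb default 1
# tc class add dev ppp0 parent 1: classid 1:1 htb rate 900kbit

versus, when shaping on the ethernet side, where skb->len already
carries the 14 byte MAC header:

# tc qdisc add dev eth0 root handle 1: stab linklayer atm overhead 26 htb default 1
# tc class add dev eth0 parent 1: classid 1:1 htb rate 900kbit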

Even without tc involved I think you could see the difference looking at
ip -s ls xxxx type stats on different interfaces.
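For instance (untested, and whether a given driver's byte counters
include link headers may vary), send a burst of identical pings and
compare the TX byte deltas on ppp0 and the eth underneath it:

# ip -s link show dev ppp0
# ip -s link show dev eth0
# ping -q -c 100 -s 1000 8.8.8.8
# ip -s link show dev ppp0
# ip -s link show dev eth0

Dividing each TX byte delta by 100 should show a constant per-packet
difference between the two interfaces for the same traffic (8.8.8.8 is
just a stand-in for any host routed over ppp0).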
Sebastian Moeller
2014-09-24 16:23:37 UTC
Permalink
Hi Andy,
Post by Andy Furniss
Post by Sebastian Moeller
Thanks for sharing your test case; I can repeat these results
exactly on my machines (I also tried htb instead of hfsc for fun:
same result, as expected; see below). Looking back at
http://lxr.free-electrons.com/ident?i=qdisc_pkt_len_init (the line
qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;)
I begin to realize this function is not responsible for adding a
single wire packet’s ethernet header, but for figuring out how many
on-the-wire packets a GSO packet gets chopped into, and for adding
the header overhead for those additional wire packets; I had
completely overlooked the (gso_segs - 1) part, oops.
Glad it helped - I know from trying, and giving up, how hard/error prone
reading kernel code can be :-)
Especially when all one knows about C is basically from reading K&R with almost no hands-on coding experience ;)
Post by Andy Furniss
Post by Sebastian Moeller
@cerowrt-devel: everyone using link layer ATM, you might want to try
to reduce the per-packet overhead by 14… (but please test)
Maybe you mean overhead calculated by a script?
Well, in cerowrt’s SQM-scripts we expose the stab options so users can take link layer and overhead into account. If you naively determine the overhead, either with the help of the scripts I posted earlier or by looking it up in a table (if the encapsulation options are known), you will end up not handling the kernel’s auto-added overhead well. Currently the SQM scripts do not expose PPP devices, only ge00 (ethernet), so -14 seems currently the best recommendation, in combination with “please test”. What I am curious about after your message is what happens if the kernel terminates a pppoe connection but is connected to a “modem” via ethernet - what does the kernel do then? And thanks to your pointers I now have an idea of how to test that ;)
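To put a number on it (purely as an example): if the encapsulation table says 40 bytes of overhead counted from the IP header, the kernel already accounts for 14 of those on ge00, so the value to enter there would be 40 - 14 = 26.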
Post by Andy Furniss
Just to be clear, I expect that wrt would be shaping on ppp, so you
don't need to take 14 if that's the case.
Good to know.
Post by Andy Furniss
Post by Sebastian Moeller
So I stand corrected, you are right, tic’s stab automatically adds
the ethernet header. So I am off to repeat my netperf-wrapper tests
right now again with overhead of 30 instead of 44, again these tests
confirm your observation. Interestingly, it seems netperf-wrapper’s
RRUL test really is suited to figure out the overhead: while shaping
to 100% of line rate (on ADSL2+ where line rate rate is the net line
rate (after FEC)) specifying too small an overhead the ICMP latency
plot shows larger deviations from the expected unload RTT plus 10ms.
Too large an overhead however just decreases the good put bait while
leaving the latency well under control.
I wouldn't word it as "stab adds ...". This has nothing to do with stab
really - it's just that the only length stab knows is skb->len, and
that means different things on different interfaces because of how the
kernel works.
(I haven't retested all this, but I doubt it's changed)
On ppp skb->len = ip len
On eth skb->len = ip len + 14
On vlan skb->len = ip len + 18
So this is the information I actually wanted to find, and I somehow thought qdisc_pkt_len_init() was the place. Do you by chance have a pointer to where this assignment is handled?
Post by Andy Furniss
If you ran my script on various interfaces without stab I expect you
would still be able to see the difference - everyone who does any tc on
eth gets shaping with ip+14 sized packets.
Even without tc involved I think you could see the difference looking at
ip -s ls xxxx type stats on different interfaces.
Thanks again, & Best Regards
Sebastian
Andy Furniss
2014-09-24 22:48:48 UTC
Permalink
Post by Sebastian Moeller
Post by Andy Furniss
Maybe you mean overhead calculated by a script?
Well, in cerowrt’s SQM-scripts we expose the stab options so users can
take link layer and overhead into account. If you naively determine
the overhead, either with the help of the scripts I posted earlier or
by looking it up in a table (if the encapsulation options are known),
you will end up not handling the kernel’s auto-added overhead well.
Currently the SQM scripts do not expose PPP devices, only ge00
(ethernet), so -14 seems currently the best recommendation, in
combination with “please test”.
Oh, OK - I know nothing about wrt.
Post by Sebastian Moeller
What I am curious about after your message is what happens if the
kernel terminates a pppoe connection but is connected to a “modem”
via ethernet - what does the kernel do then? And thanks to your
pointers I now have an idea of how to test that ;)
Well I can't say I know - testing is always best.

I think we are "seeing" skbs just as they enter an interface - so what
form they take depends on the particular interface they have just been
made for.

It's possible to have multiple pppoes/vlans on an eth and use the eth
normally at the same time. What you see, I suppose, depends on where
you are "attached". I guess shaping a pppoe on the eth rather than on
the actual ppp is doable with a bit of filtering - in which case you
may need to allow for the +14 of MACs/ethertype, and for the 8 bytes
of PPPoE/PPP header that are already in the payload - a totally
untested theory :-)
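If anyone wants to try that theory, a very rough sketch (untested, and
assuming an ATM-carried link plus the commonly quoted 40 byte from-IP
overhead - drop linklayer atm and adjust numbers to taste): 0x8864 is
the PPPoE session ethertype, and the stab overhead becomes 40 minus
the 14 + 8 = 22 bytes already inside skb->len on the eth:

# tc qdisc add dev eth0 root handle 1: stab linklayer atm overhead 18 htb
# tc class add dev eth0 parent 1: classid 1:1 htb rate 900kbit
# tc filter add dev eth0 parent 1: protocol 0x8864 prio 1 u32 match u32 0 0 flowid 1:1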
Post by Sebastian Moeller
Post by Andy Furniss
On ppp skb->len = ip len
On eth skb->len = ip len + 14
On vlan skb->len = ip len + 18
So this is the information I actually wanted to find, and I somehow
thought qdisc_pkt_len_init() was the place. Do you by chance have a
pointer to where this assignment is handled?
No, sorry I don't know the code.

Andy Furniss
2014-09-20 22:29:25 UTC
Permalink
Post by Dave Taht
We'd had a very long thread on cerowrt-devel and in the end
sebastian (I think) had developed some scripts to exaustively (it
took hours) derive the right encapsulation frame size on a link. I
can't find the relevant link right now, ccing that list...
Thanks, that sounds cool.
Post by Dave Taht
Sfq was only ever meant for bulk, so should really be in addition
to some classification to separate interactive - I don't really get
the
Hmm? sfq separates bulk from interactive pretty nicely. It tends to
do bad things to bulk as it doesn't manage queue length.
Well I come at this from years of qos stuck on 288 then 448 kbit up atm
dsl before the days of fq_codel.

Since it got jhash, sfq does at least manage to avoid collisions, but
it's still a total non-starter for use alone on a slow link, because an
interactive packet may wait for many bulk packets to dequeue before its
turn.

Of course sfq is cleverer now than it used to be - headdrop and RED,
the latter of which I've not used.

I agree it's bloaty for bulk, but that's not quite so critical if your
real buffer is not bloaty and you can get classification to work.
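Something like this is the sort of thing I mean - an untested sketch,
with rates picked for my old 448kbit line and classification done only
on the TOS bits: htb to keep the real buffer out of play, a small
headdrop sfq for bulk, and interactive pulled out into its own class:

# tc qdisc add dev ppp0 root handle 1: htb default 2
# tc class add dev ppp0 parent 1: classid 1:1 htb rate 128kbit ceil 448kbit prio 0
# tc class add dev ppp0 parent 1: classid 1:2 htb rate 320kbit ceil 448kbit prio 1
# tc qdisc add dev ppp0 parent 1:2 handle 2: sfq limit 64 headdrop
# tc filter add dev ppp0 parent 1: protocol ip prio 1 u32 match ip tos 0xb8 0xfc flowid 1:1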
Post by Dave Taht
A little bit of prioritization or deprioritization for some traffic
is helpful, but most traffic is hard to classify.
bufferbloat bit, you could make the default 128 limit lower if you wanted.
htb + fq_codel, if available, is the right thing here....
Yea, though on a link like my old one I still think classification
would just win. I should really test, but IME a slow link can be hard
to simulate (I have 20mbit up now) - the results tend to look a bit
better because the serialization latency of a genuinely slow bitrate
isn't there.