[ovs-dev] [PATCH v2] ovn-northd: logical router icmp response should not care about inport

Flaviof flavio at flaviof.com
Thu May 26 20:05:52 UTC 2016


On Thu, May 26, 2016 at 4:00 PM, Darrell Ball <dlu998 at gmail.com> wrote:

>
>
> On Wed, May 25, 2016 at 9:40 PM, Flavio Fernandes <flavio at flaviof.com>
> wrote:
>
>> When responding to icmp echo requests (aka ping) packets, the logical
>> router should not restrict responses based on the inport.
>>
>> Example diagram:
>>
>> vm: IP1.1 (subnet1)
>> logical_router: IP1.2 (subnet1) and IP2.2 (subnet2)
>>
>>    vm -------[subnet1]------- logical_router -------[subnet2]
>>    <IP1.1>                <IP1.2>        <IP2.2>
>>
>> vm should be able to ping <IP1.2>, even though it is an address
>> of a subnet that can only be reached through L3 routing.
>>
>> Reference to the mailing list thread:
>> http://openvswitch.org/pipermail/discuss/2016-May/021172.html
>>
>> ---
>> Changes v1->v2:
>>   - Add unit test.
>>
>> To be discussed:
>>
>> One caveat here is that ttl should be decremented before vm
>> reaches <IP2.2> and that logic is not invoked until later
>> in the pipeline. Further work may be necessary in order
>> to make ip.ttl part of the match in the logical table, so
>> pings from the non-local subnet are only responded if ttl > 1.
>> As far as I can tell, the match on logical flows currently does
>> not handle ip.ttl.
>>
>> In short, instead of simply removing the inport match in the icmp rule,
>> we could add a second rule that would do something like:
>>
>>   "inport == %s && (ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") && ... <=
>> priority 90
>>   "ip.ttl > 1 && (ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") && ... <=
>> priority 90-X
>>
>
>
> These pings are destined to one of a router's own IP addresses; i.e. local
> to the router.
>
> The router is not forwarding the original ICMP request since an ICMP reply
> is a separate packet from the ICMP request.
>
> The below is an excerpt from RFC 1812 and it also references RFC 791.
>
> 4.2.2.9 Time to Live: RFC 791 Section 3.2
> .
> .
> Note in particular that a router MUST NOT check the TTL of a packet
> except when forwarding it.
> .
> .
> A router MUST NOT discard a datagram just because it was received
> with a TTL equal to zero or one; if it is to the router and otherwise
> valid, the router must attempt to receive it.
>
>
That 100% puts away my concern. Thanks Darrell!
I'll submit a revised patch that incorporates this.

-- flaviof


>
>
>>
>> Signed-off-by: Flavio Fernandes <flavio at flaviof.com>
>> ---
>>  ovn/northd/ovn-northd.c |   8 ++-
>>  tests/ovn.at            | 173
>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 178 insertions(+), 3 deletions(-)
>>
>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> index 44e9430..68decbf 100644
>> --- a/ovn/northd/ovn-northd.c
>> +++ b/ovn/northd/ovn-northd.c
>> @@ -1892,11 +1892,13 @@ build_lrouter_flows(struct hmap *datapaths,
>> struct hmap *ports,
>>          free(match);
>>
>>          /* ICMP echo reply.  These flows reply to ICMP echo requests
>> -         * received for the router's IP address. */
>> +         * received for the router's IP address. Since packets only
>> +         * get here as part of the logical router datapath, the inport
>> +         * (i.e. the incoming locally attached net) does not matter. */
>>          match = xasprintf(
>> -            "inport == %s && (ip4.dst == "IP_FMT" || ip4.dst ==
>> "IP_FMT") && "
>> +            "(ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") && "
>>              "icmp4.type == 8 && icmp4.code == 0",
>> -            op->json_key, IP_ARGS(op->ip), IP_ARGS(op->bcast));
>> +            IP_ARGS(op->ip), IP_ARGS(op->bcast));
>>          char *actions = xasprintf(
>>              "ip4.dst = ip4.src; "
>>              "ip4.src = "IP_FMT"; "
>> diff --git a/tests/ovn.at b/tests/ovn.at
>> index e6ac1d7..8cd3677 100644
>> --- a/tests/ovn.at
>> +++ b/tests/ovn.at
>> @@ -2611,3 +2611,176 @@ OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>
>>  AT_CLEANUP
>> +
>> +AT_SETUP([ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR])
>> +AT_KEYWORDS([router-icmp-reply])
>> +AT_SKIP_IF([test $HAVE_PYTHON = no])
>> +ovn_start
>> +
>> +# Logical network:
>> +# One LR - R1 has switch ls1 (191.168.1.0/24) connected to it,
>> +# and has switch ls2 (172.16.1.0/24) connected to it.
>> +
>> +ovn-nbctl create Logical_Router name=R1
>> +
>> +ovn-nbctl lswitch-add ls1
>> +ovn-nbctl lswitch-add ls2
>> +
>> +# Connect ls1 to R1
>> +ovn-nbctl -- --id=@lrp create Logical_Router_port name=ls1 \
>> +network=192.168.1.1/24 mac=\"00:00:00:01:02:f1\" -- add Logical_Router
>> R1 \
>> +ports @lrp -- lport-add ls1 rp-ls1
>> +
>> +ovn-nbctl set Logical_port rp-ls1 type=router options:router-port=ls1 \
>> +addresses=\"00:00:00:01:02:f1\"
>> +
>> +# Connect ls2 to R1
>> +ovn-nbctl -- --id=@lrp create Logical_Router_port name=ls2 \
>> +network=172.16.1.1/24 mac=\"00:00:00:01:02:f2\" -- add Logical_Router
>> R1 \
>> +ports @lrp -- lport-add ls2 rp-ls2
>> +
>> +ovn-nbctl set Logical_port rp-ls2 type=router options:router-port=ls2 \
>> +addresses=\"00:00:00:01:02:f2\"
>> +
>> +# Create logical port ls1-lp1 in ls1
>> +ovn-nbctl lport-add ls1 ls1-lp1 \
>> +-- lport-set-addresses ls1-lp1 "00:00:00:01:02:03 192.168.1.2"
>> +
>> +# Create logical port ls2-lp1 in ls2
>> +ovn-nbctl lport-add ls2 ls2-lp1 \
>> +-- lport-set-addresses ls2-lp1 "00:00:00:01:02:04 172.16.1.2"
>> +
>> +# Create one hypervisor and create OVS ports corresponding to logical
>> ports.
>> +net_add n1
>> +
>> +sim_add hv1
>> +as hv1
>> +ovs-vsctl add-br br-phys
>> +ovn_attach n1 br-phys 192.168.0.1
>> +ovs-vsctl -- add-port br-int vif1 -- \
>> +    set interface vif1 external-ids:iface-id=ls1-lp1 \
>> +    options:tx_pcap=hv1/vif1-tx.pcap \
>> +    options:rxq_pcap=hv1/vif1-rx.pcap \
>> +    ofport-request=1
>> +
>> +ovs-vsctl -- add-port br-int vif2 -- \
>> +    set interface vif2 external-ids:iface-id=ls2-lp1 \
>> +    options:tx_pcap=hv1/vif2-tx.pcap \
>> +    options:rxq_pcap=hv1/vif2-rx.pcap \
>> +    ofport-request=1
>> +
>> +
>> +# Allow some time for ovn-northd and ovn-controller to catch up.
>> +# XXX This should be more systematic.
>> +sleep 1
>> +
>> +
>> +ip_to_hex() {
>> +    printf "%02x%02x%02x%02x" "$@"
>> +}
>> +trim_zeros() {
>> +    sed 's/\(00\)\{1,\}$//'
>> +}
>> +for i in 1 2; do
>> +    : > vif$i.expected
>> +done
>> +# test_ipv4_icmp_request INPORT ETH_SRC ETH_DST IPV4_SRC IPV4_DST
>> IP_CHKSUM ICMP_CHKSUM [EXP_IP_CHKSUM EXP_ICMP_CHKSUM]
>> +#
>> +# Causes a packet to be received on INPORT.  The packet is an ICMPv4
>> +# request with ETH_SRC, ETH_DST, IPV4_SRC and IPV4_DST as specified.  If
>> EXP_IP_CHKSUM
>> +# is provided, then it should be the ip checksum of the packet responded;
>> +# otherwise no reply is expected.
>> +# In the absence of an ip checksum calculation helpers, we will rely on
>> caller to provide the checksums for the ip
>> +# and the icmp headers.
>> +# XXX This should be more systematic.
>> +#
>> +# INPORT is an lport number, e.g. 11 for vif11.
>> +# ETH_SRC and ETH_DST are each 12 hex digits.
>> +# IPV4_SRC and IPV4_DST are each 8 hex digits.
>> +# IP_CHSUM and ICMP_CHKSUM are each 4 hex digits.
>> +# EXP_IP_CHSUM and EXP_ICMP_CHKSUM are each 4 hex digits.
>> +test_ipv4_icmp_request() {
>> +    local inport=$1 eth_src=$2 eth_dst=$3 ipv4_src=$4 ipv4_dst=$5
>> ip_chksum=$6 icmp_chksum=$7
>> +    local exp_ip_chksum=$8 exp_icmp_chksum=$9
>> +    shift; shift; shift; shift; shift; shift; shift
>> +    shift; shift
>> +
>> +    local ip_ttl=0a
>> +    local icmp_id=5fbf
>> +    local icmp_seq=0001
>> +    local icmp_data=$(seq 1 56 | xargs printf "%02x")
>> +    local icmp_type_code_request=0800
>> +    local
>> icmp_payload=${icmp_type_code_request}${icmp_chksum}${icmp_id}${icmp_seq}${icmp_data}
>> +    local
>> packet=${eth_dst}${eth_src}08004500005400004000${ip_ttl}01${ip_chksum}${ipv4_src}${ipv4_dst}${icmp_payload}
>> +
>> +    as hv1 ovs-appctl netdev-dummy/receive vif$inport $packet
>> +    if test X$exp_ip_chksum != X; then
>> +        # Expect to receive the reply, if any. In same port where packet
>> was sent.
>> +        # Note: src and dst are expected to be reversed.
>> +        local icmp_type_code_response=0000
>> +        local reply_icmp_ttl=fe
>> +        local
>> reply_icmp_payload=${icmp_type_code_response}${exp_icmp_chksum}${icmp_id}${icmp_seq}${icmp_data}
>> +        local
>> reply=${eth_src}${eth_dst}08004500005400004000${reply_icmp_ttl}01${exp_ip_chksum}${ipv4_dst}${ipv4_src}${reply_icmp_payload}
>> +        echo $reply >> vif$inport.expected
>> +    fi
>> +}
>> +
>> +# send ping packet to router's ip addresses, from each of the 2 logical
>> ports.
>> +rtr_l1_ip=$(ip_to_hex 192 168 1 1)
>> +rtr_l2_ip=$(ip_to_hex 172 16 1 1)
>> +l1_ip=$(ip_to_hex 192 168 1 2)
>> +l2_ip=$(ip_to_hex 172 16 1 2)
>> +
>> +# ping router ip address that is on same subnet as the logical port
>> +test_ipv4_icmp_request 1 000000010203 0000000102f1 $l1_ip $rtr_l1_ip
>> 0000 8510 0bff 8d10
>> +test_ipv4_icmp_request 2 000000010204 0000000102f2 $l2_ip $rtr_l2_ip
>> 0000 8510 0bff 8d10
>> +
>> +# ping router ip address that is on the other side of the logical ports
>> +test_ipv4_icmp_request 1 000000010203 0000000102f1 $l1_ip $rtr_l2_ip
>> 0000 8510 0bff 8d10
>> +test_ipv4_icmp_request 2 000000010204 0000000102f2 $l2_ip $rtr_l1_ip
>> 0000 8510 0bff 8d10
>> +
>> +echo "---------NB dump-----"
>> +ovn-nbctl show
>> +echo "---------------------"
>> +ovn-nbctl list logical_router
>> +echo "---------------------"
>> +ovn-nbctl list logical_router_port
>> +echo "---------------------"
>> +
>> +echo "---------SB dump-----"
>> +ovn-sbctl list datapath_binding
>> +echo "---------------------"
>> +ovn-sbctl list logical_flow
>> +echo "---------------------"
>> +
>> +echo "------ hv1 dump ----------"
>> +as hv1 ovs-ofctl dump-flows br-int
>> +
>> +# check received packets against expected
>> +for inport in 1 2; do
>> +    file=hv1/vif${inport}-tx.pcap
>> +    echo $file
>> +    $PYTHON "$top_srcdir/utilities/ovs-pcap.in" $file | trim_zeros >
>> received.packets
>> +    cat vif$inport.expected | trim_zeros > expout
>> +    AT_CHECK([cat received.packets], [0], [expout])
>> +done
>> +
>> +as hv1
>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>> +OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> +
>> +as ovn-sb
>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> +
>> +as ovn-nb
>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> +
>> +as northd
>> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
>> +
>> +as main
>> +OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> +
>> +AT_CLEANUP
>> --
>> 1.9.1
>>
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>>
>
>



More information about the dev mailing list