[ovs-dev] ovn: SFC Patch V3

Mickey Spiegel mickeys.dev at gmail.com
Thu Jun 8 17:14:11 UTC 2017


A couple more issues that either have not come up in a while or have not
come up at all so far:

1. A port can have multiple MAC addresses. Right now you are only using the
first MAC address on a port (traffic_port->nbsp->addresses[0]). Rules need
to be installed for all of the MAC addresses on the port.
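
Something along these lines could work (a rough sketch against the v3
code, untested): loop over the whole addresses column, skip entries such
as "unknown" that do not begin with a MAC, and emit the same flows for
each parsed address.

    /* Sketch: build classifier flows for every MAC on the port, not
     * just addresses[0]. */
    for (size_t a = 0; a < traffic_port->nbsp->n_addresses; a++) {
        struct eth_addr port_ea;
        if (!ovs_scan(traffic_port->nbsp->addresses[a],
                      ETH_ADDR_SCAN_FMT, ETH_ADDR_SCAN_ARGS(port_ea))) {
            continue;   /* Not a MAC, e.g. "unknown". */
        }
        /* Emit the same lcc_match/lcc_action flows for port_ea that the
         * v3 code emits for addresses[0]. */
    }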

2. The VNF ports are associated with the logical switch's datapath. Any
rules in non-SFC-specific tables (e.g. ingress stages 1-9, egress stages
1-9) will be applied over and over again at each SFC hop. This hurts
performance. Beyond the overhead of the redundant lookups, if there are
stateful ACL rules then conntrack recirculation will occur on every hop.
Possibly even worse, if any of the VNFs change any of the fields in the
ACL match condition, then the packet could fail the ACL and be dropped.
One option to avoid all of this would be to insert a pipeline stage at the
beginning of the ingress and egress pipelines (we already have one for SFC
in the egress pipeline, which can be reused for this purpose as well),
skipping most if not all other pipeline stages on VNF ports. For
intermediate SFC hops in the ingress pipeline, I guess the rules could be
put in this first table rather than the current table 10. For the egress
pipeline, I guess just output directly?
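
To make that concrete, a sketch (untested; S_SWITCH_IN_SFC_INPUT is a
made-up stage name for the new first ingress table): one flow per VNF
port that jumps straight to the chaining table, on top of the usual
priority-0 "next;" default.

    /* Sketch: packets arriving from a VNF port skip ingress stages 1-9
     * and resume at the SFC chaining table. */
    char *match = xasprintf("inport == %s", vnf_port->json_key);
    char *action = xasprintf("next(pipeline=ingress, table=%d);",
                             ovn_stage_get_table(S_SWITCH_IN_CHAIN));
    ovn_lflow_add(lflows, od, S_SWITCH_IN_SFC_INPUT, 100, match, action);
    free(match);
    free(action);

Here vnf_port stands for each VNF port's struct ovn_port, i.e.
input_port_array[j] and output_port_array[j] in the v3 code.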

Mickey


On Wed, May 10, 2017 at 3:49 PM, Mickey Spiegel <mickeys.dev at gmail.com>
wrote:

> Three issues before diving in:
>
>
> 1. Placement of S_SWITCH_IN_CHAIN
>
> For some reason I thought S_SWITCH_IN_CHAIN was after all the stateful
> processing, but now I see that it is table 7. That breaks ACLs and other
> stateful processing, since REGBIT_CONNTRACK_COMMIT is set in
> S_SWITCH_IN_ACL and matched in S_SWITCH_IN_STATEFUL.
>
> S_SWITCH_IN_CHAIN should instead be table 10. The comments below are
> written assuming this change.
>
>
> 2. Ingress pipeline needs to be expanded to more than 16 tables
>
> DNS went in since the v3 patch and used up the last of the 16 ingress
> tables. If you rebase, you will see that there is no space in the ingress
> pipeline for the addition of S_SWITCH_IN_CHAIN. Someone (not me) needs to
> expand the ingress pipeline to more than 16 stages before you can proceed.
>
>
> 3. While writing this response, I paid a little more attention to the
> "exit-lport" direction and noticed a couple of significant issues.
>
> a. Suppose a packet goes from VM A on port 1 to VM B on port 4, with a
> logical port chain classifier on port 1 in the "entry-lport" direction and
> a logical port chain classifier on port 4 in the "exit-lport" direction.
> The packet will only go down one of the service chains, and since the
> priorities are equal, I cannot even tell you which one. Logically I would
> think that the packet should go down both service chains, first the port 1
> "entry-lport" service chain and then the port 4 "exit-lport" service
> chain.
>
> b. This is done in the ingress pipeline, not the egress pipeline, and is
> based on matching eth.dst. This assumes that the forwarding decision will
> be based on eth.dst, since you are immediately going down the service
> chain, skipping the other ingress pipeline stages, and at the end you go
> directly to the egress pipeline with outport based on eth.dst. That is
> quite restrictive for a generic forwarding architecture like OVN. I would
> think that the right thing to do would be to move the classifier to the
> egress pipeline, but then I do not know how to avoid loops. When a packet
> comes to the egress pipeline on the hypervisor where the VM resides, there
> is no way to tell whether the packet has already gone down the service
> chain or not. I guess you could put an S_SWITCH_IN_EGRESS_CHAIN ingress
> pipeline stage right after L2_LKUP instead, and match on outport in
> addition to eth.dst, but it feels a bit unclean.
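>
> To make that last idea concrete, the classifier flow in such a stage might
> look roughly like this (untested; S_SWITCH_IN_EGRESS_CHAIN is a made-up
> stage name):
>
>     lcc_match = xasprintf(
>             "eth.dst == "ETH_ADDR_FMT" && outport == %s",
>             ETH_ADDR_ARGS(traffic_logical_port_ea), traffic_port->json_key);
>     lcc_action = xasprintf("outport = %s; output;",
>                            input_port_array[0]->json_key);
>     ovn_lflow_add(lflows, od, S_SWITCH_IN_EGRESS_CHAIN, 100,
>                   lcc_match, lcc_action);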
>
> On Tue, May 9, 2017 at 4:33 PM, John McDowall <jmcdowall at paloaltonetworks.
> com> wrote:
>
>> Mickey,
>>
>>
>>
>> Thanks for the review. I need some help understanding a couple of things:
>>
>>
>>
>> 1) Regarding the proposed change: I could see the previous logic, where we
>> inserted the flow back in the ingress pipeline just after the IN_CHAIN
>> stage. The changes you suggest seem to imply that the action is still to
>> insert after the IN_CHAIN stage, but in the egress (OUT) pipeline. I am
>> missing something here – can you give me some more info?
>>
> Assume you have port 1 to a VM on HV1, port 2 as the input port to a VNF
> on HV2, and port 3 as the output port from that same VNF on HV2. The
> service chain is just that one VNF, with direction "entry-lport".
>
> The packet progresses as follows:
>
> HV1, ingress pipeline, inport 1
> Tables 1 to 9
> Table 10 (S_SWITCH_IN_CHAIN) hits priority 100 flow setting outport = 2
> and then going to output immediately
>
> Table 32 sends the packet through a Geneve tunnel to HV2
>
> HV2, egress pipeline, outport 2
> Tables 48 to 65
>
> After packet gets delivered to the VNF, it comes back
>
> HV2, ingress pipeline, inport 3
> Tables 1 to 9
>
> Table 10 (S_SWITCH_IN_CHAIN)
>
> With the code in v3, you are doing a clone with inport = outport (which is
> not yet set!),
> then progressing to ingress Table 11, still on HV2
>
> Eventually you hit S_SWITCH_IN_L2_LKUP and go directly to the destination
> from HV2. If the destination is in the outside world, this just broke
> upstream L2 learning.
>
> My suggestion is instead Table 10 hits a priority 150 flow setting outport
> = 1 and then going to output immediately.
>
> Table 32 sends the packet through a Geneve tunnel to HV1 based on outport
> == 1, and the packet ends up in the egress pipeline
>
> HV1, egress pipeline, outport 1
>
> New Table 0 (S_SWITCH_OUT_SFC_LOOPBACK) hits the entry, doing a clone with
> inport = outport and jumping to the ingress pipeline Table 11
>
> HV1, ingress pipeline, inport 1
> Tables 11 to 15 (continuation from where things left off at the beginning,
> when the packet was redirected down the service chain)
> ...
>
> Since this code does not use NSH, all metadata accumulated in Tables 1 to
> 9 is lost while going around the service chain, before proceeding through
> Tables 11 to 15. At the moment this works, but this becomes a restriction
> on future features, if they are to work with SFC. The alternative is to go
> back through all of Tables 1 through 15 instead, with
> REGBIT_CHAIN_LOOPBACK as in the v2 patch to avoid hitting any entries
> in Table 10 (S_SWITCH_IN_CHAIN). With this alternative, all packets go
> through Tables 1 to 9 on HV1 twice. We have to make a judgement call which
> alternative is least ugly, or move to NSH and stuff the metadata in the NSH
> header.
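>
> For reference, a rough sketch of that v2-style alternative (the register
> bit assignment here is illustrative):
>
>     #define REGBIT_CHAIN_LOOPBACK "reg0[4]"
>
>     /* Guard so the second pass through the ingress pipeline skips all
>      * classifier entries in S_SWITCH_IN_CHAIN: */
>     ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, 200,
>                   REGBIT_CHAIN_LOOPBACK" == 1", "next;");
>
> with the loopback clone setting flags.loopback = 1 and
> REGBIT_CHAIN_LOOPBACK = 1 before doing next(pipeline=ingress, table=0),
> as in the commented-out v2 code further down.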
>
>>
>>
>> +    for (int ii = 0; ii < MFF_N_LOG_REGS; ii++) {
>> +            ds_put_format(&actions, "reg%d = 0; ", ii);
>> +    }
>> +    ds_put_format(&actions, "next(pipeline=ingress, table=%d); };",
>> +                            ovn_stage_get_table(S_SWITCH_IN_CHAIN) + 1);
>> +    ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, ingress_inner_priority,
>> +                        lcc_match, ds_cstr(&actions));
>>
>>
>>
>> Replace the line above by:
>>
>>
>>
>>     ovn_lflow_add(lflows, od, S_SWITCH_OUT_SFC_LOOPBACK, 100,
>>                         lcc_match, ds_cstr(&actions));
>>
>>
>>
>>
>>
>>
>>
>> 2) I can try and put some checks in for loop avoidance. Can you think of
>> scenarios that would cause this? A badly configured port-pair could
>> perhaps cause it (if the egress eth of the port-pair was configured as
>> the ingress eth). Any other scenarios that come to mind?
>>
>
> I cannot think of a good reason why a packet would end up on the egress
> pipeline on HV1 with outport == 1.
> After thinking about it more, I think it is OK if flags.loopback is left
> as 0. If this case is ever triggered and the second time around through the
> ingress pipeline still sets outport = 1, then the Table 34 loopback check
> will detect that outport == inport and drop the packet.
>
> Mickey
>
>
>>
>> Regards
>>
>>
>>
>> John
>>
>> *From: *Mickey Spiegel <mickeys.dev at gmail.com>
>> *Date: *Monday, April 24, 2017 at 6:39 PM
>> *To: *John McDowall <jmcdowall at paloaltonetworks.com>
>> *Cc: *"ovs-dev at openvswitch.org" <ovs-dev at openvswitch.org>
>> *Subject: *Re: [ovs-dev] ovn: SFC Patch V3
>>
>>
>>
>>
>>
>> On Mon, Apr 24, 2017 at 12:56 PM, <jmcdowall at paloaltonetworks.com> wrote:
>>
>> From: John McDowall <jmcdowall at paloaltonetworks.com>
>>
>>
>> Fixed changes from Mickey's last review.
>>
>> Changes
>>
>> 1) Fixed re-circulation rules
>>
>>
>>
>> Still a few modifications required. See comments inline. I just typed
>> some stuff out, have not run, built, or tested anything.
>>
>>
>>
>> 2) Fixed match statement - match is only applied to the beginning of the
>>    chain in each direction.
>> 3) Fixed array length of chain of VNFs. I have tested this with up to
>>    three VNFs in a chain and it looks like it works in both directions.
>>
>> Areas to review
>>
>> 1) The logic now supports hair-pinning the flow back to the original
>>    source to ensure that the MAC learning problem is addressed. I tested
>>    this using ovn-trace - any better testing that I should do?
>>
>> Current todo list
>>
>> 1) I have standalone tests; need to add tests to the ovs/ovn framework.
>> 2) Load-balancing support for port-pair-groups.
>> 3) Publish more detailed examples.
>> 4) Submit suggestions to change and shorten the CLI names.
>>
>> Simple example using ovn-trace
>>
>> #!/bin/sh
>> #
>> clear
>> ovn-nbctl ls-add swt1
>>
>> ovn-nbctl lsp-add swt1 swt1-appc
>> ovn-nbctl lsp-add swt1 swt1-apps
>> ovn-nbctl lsp-add swt1 swt1-vnfp1
>> ovn-nbctl lsp-add swt1 swt1-vnfp2
>>
>> ovn-nbctl lsp-set-addresses swt1-appc "00:00:00:00:00:01 192.168.33.1"
>> ovn-nbctl lsp-set-addresses swt1-apps "00:00:00:00:00:02 192.168.33.2"
>> ovn-nbctl lsp-set-addresses swt1-vnfp1 00:00:00:00:00:03
>> ovn-nbctl lsp-set-addresses swt1-vnfp2 00:00:00:00:00:04
>> #
>> # Configure Service chain
>> #
>> ovn-nbctl lsp-pair-add swt1 swt1-vnfp1 swt1-vnfp2 pp1
>> ovn-nbctl lsp-chain-add swt1 pc1
>> ovn-nbctl lsp-pair-group-add pc1 ppg1
>> ovn-nbctl lsp-pair-group-add-port-pair ppg1 pp1
>> ovn-nbctl lsp-chain-classifier-add swt1 pc1 swt1-appc "entry-lport"
>> "bi-directional" pcc1
>> #
>> ovn-sbctl dump-flows
>> #
>> # Run trace command
>> printf "\n---------Flow 1 -------------\n\n"
>> ovn-trace --detailed  swt1 'inport == "swt1-appc" && eth.src ==
>> 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
>> printf "\n---------Flow 2 -------------\n\n"
>> ovn-trace --detailed  swt1 'inport == "swt1-vnfp1" && eth.src ==
>> 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
>> printf "\n---------Flow 3 -------------\n\n"
>> ovn-trace --detailed  swt1 'inport == "swt1-apps" && eth.dst ==
>> 00:00:00:00:00:01 && eth.src == 00:00:00:00:00:02'
>> printf "\n---------Flow 4 -------------\n\n"
>> ovn-trace --detailed  swt1 'inport == "swt1-vnfp2" && eth.dst ==
>> 00:00:00:00:00:01 && eth.src == 00:00:00:00:00:02'
>> #
>> # Cleanup
>> #
>> ovn-nbctl lsp-chain-classifier-del pcc1
>> ovn-nbctl lsp-pair-group-del ppg1
>> ovn-nbctl lsp-chain-del pc1
>> ovn-nbctl lsp-pair-del pp1
>> ovn-nbctl ls-del swt1
>>
>> Reported at: https://mail.openvswitch.org/pipermail/ovs-discuss/2016-March/040381.html
>> Reported at: https://mail.openvswitch.org/pipermail/ovs-discuss/2016-May/041359.html
>>
>> Signed-off-by: John McDowall <jmcdowall at paloaltonetworks.com>
>> Signed-off-by: Flavio Fernandes <flavio at flaviof.com>
>> Co-authored-by: Flavio Fernandes <flavio at flaviof.com>
>> ---
>>  ovn/northd/ovn-northd.8.xml   |   69 ++-
>>  ovn/northd/ovn-northd.c       |  382 ++++++++++++-
>>  ovn/ovn-architecture.7.xml    |   91 ++++
>>  ovn/ovn-nb.ovsschema          |   87 ++-
>>  ovn/ovn-nb.xml                |  188 ++++++-
>>  ovn/utilities/ovn-nbctl.8.xml |  231 ++++++++
>>  ovn/utilities/ovn-nbctl.c     | 1208 +++++++++++++++++++++++++++++++++++++++++
>>  7 files changed, 2227 insertions(+), 29 deletions(-)
>>
>>
>>
>> <snip>
>>
>>
>>
>>
>> diff --git ovn/northd/ovn-northd.c ovn/northd/ovn-northd.c
>> index d0a5ba2..090f768 100644
>> --- ovn/northd/ovn-northd.c
>> +++ ovn/northd/ovn-northd.c
>> @@ -106,13 +106,14 @@ enum ovn_stage {
>>      PIPELINE_STAGE(SWITCH, IN,  PRE_LB,         4, "ls_in_pre_lb")       \
>>      PIPELINE_STAGE(SWITCH, IN,  PRE_STATEFUL,   5, "ls_in_pre_stateful") \
>>      PIPELINE_STAGE(SWITCH, IN,  ACL,            6, "ls_in_acl")          \
>> -    PIPELINE_STAGE(SWITCH, IN,  QOS_MARK,       7, "ls_in_qos_mark")     \
>> -    PIPELINE_STAGE(SWITCH, IN,  LB,             8, "ls_in_lb")           \
>> -    PIPELINE_STAGE(SWITCH, IN,  STATEFUL,       9, "ls_in_stateful")     \
>> -    PIPELINE_STAGE(SWITCH, IN,  ARP_ND_RSP,    10, "ls_in_arp_rsp")      \
>> -    PIPELINE_STAGE(SWITCH, IN,  DHCP_OPTIONS,  11, "ls_in_dhcp_options") \
>> -    PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 12, "ls_in_dhcp_response")\
>> -    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       13, "ls_in_l2_lkup")      \
>> +    PIPELINE_STAGE(SWITCH, IN,  CHAIN,          7, "ls_in_chain")        \
>> +    PIPELINE_STAGE(SWITCH, IN,  QOS_MARK,       8, "ls_in_qos_mark")     \
>> +    PIPELINE_STAGE(SWITCH, IN,  LB,             9, "ls_in_lb")           \
>> +    PIPELINE_STAGE(SWITCH, IN,  STATEFUL,      10, "ls_in_stateful")     \
>> +    PIPELINE_STAGE(SWITCH, IN,  ARP_ND_RSP,    11, "ls_in_arp_rsp")      \
>> +    PIPELINE_STAGE(SWITCH, IN,  DHCP_OPTIONS,  12, "ls_in_dhcp_options") \
>> +    PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 13, "ls_in_dhcp_response")\
>> +    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       14, "ls_in_l2_lkup")      \
>>                                                                           \
>>      /* Logical switch egress stages. */                                 \
>>
>>
>>
>> You need a stage in the egress pipeline to do SFC egress loopback:
>>
>>
>>
>>     PIPELINE_STAGE(SWITCH, OUT, SFC_LOOPBACK, 0, "ls_out_sfc_loopback")  \
>>
>>     PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       1, "ls_out_pre_lb")     \
>>
>>     ...
>>
>>
>>
>>      PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")     \
>> @@ -160,7 +161,6 @@ enum ovn_stage {
>>  #define REGBIT_CONNTRACK_COMMIT "reg0[1]"
>>  #define REGBIT_CONNTRACK_NAT    "reg0[2]"
>>  #define REGBIT_DHCP_OPTS_RESULT "reg0[3]"
>> -
>>
>>
>>
>> Whitespace change should be removed.
>>
>>
>>
>>  /* Register definitions for switches and routers. */
>>  #define REGBIT_NAT_REDIRECT     "reg9[0]"
>>  /* Indicate that this packet has been recirculated using egress
>> @@ -3014,6 +3014,348 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
>>      }
>>  }
>>
>> +static int
>> +cmp_port_pair_groups(const void *ppg1_, const void *ppg2_)
>> +{
>> +    const struct nbrec_logical_port_pair_group *const *ppg1p = ppg1_;
>> +    const struct nbrec_logical_port_pair_group *const *ppg2p = ppg2_;
>> +    const struct nbrec_logical_port_pair_group *ppg1 = *ppg1p;
>> +    const struct nbrec_logical_port_pair_group *ppg2 = *ppg2p;
>> +
>> +
>> +    return ( (int)ppg1->sortkey - (int)ppg2->sortkey);
>> +}
>> +
>> +static void
>> +build_chain(struct ovn_datapath *od, struct hmap *lflows, struct hmap *ports)
>> +{
>> +   /*
>> +    * TODO Items
>> +    *     * IPV6 support
>> +    *     * Load balancing support
>> +    *     * Bidirectional parameter support
>> +    *     * Support modes of VNF (BitW, L2, L3)
>> +    *     * Remove port-security on VNF Ports (if set by CMS)
>> +    *     * Add some state to allow match that does not require 'inport'
>> +    *     * Support visiting the same VNF more than once
>> +    *     * Unit tests!
>> +    */
>> +    const uint16_t ingress_inner_priority = 150;
>> +    const uint16_t ingress_outer_priority = 100;
>> +    const uint16_t egress_inner_priority = 150;
>> +    const uint16_t egress_outer_priority = 100;
>> +
>> +    struct ovn_port **input_port_array = NULL;
>> +    struct ovn_port **output_port_array = NULL;
>> +
>> +    struct ovn_port *dst_port = NULL;
>> +    struct ovn_port *src_port = NULL;
>> +
>> +    struct nbrec_logical_port_chain *lpc;
>> +    struct nbrec_logical_port_pair_group *lppg;
>> +    struct nbrec_logical_port_pair *lpp;
>> +    struct nbrec_logical_port_chain_classifier *lcc;
>> +
>> +    char *lcc_match = NULL;
>> +    char *lcc_action = NULL;
>> +    struct ovn_port *traffic_port;
>> +    unsigned int chain_direction = 2;
>> +    unsigned int chain_path = 2;
>> +    char * chain_match = NULL;
>> +
>> +    /* Ingress table ls_in_chain: default to passing through to the next table
>> +     * (priority 0)
>> +     */
>> +    ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, 0, "1", "next;");
>>
>>
>>
>> At some point also need:
>>
>>
>>
>>     ovn_lflow_add(lflows, od, S_SWITCH_OUT_SFC_LOOPBACK, 0, "1", "next;");
>>
>>
>>
>> +
>> +    /* No port chains defined therefore, no further work to do here. */
>> +    if (!od->nbs->port_chain_classifiers) {
>> +        return;
>> +    }
>> +    /* Iterate through all the port-chains defined for this datapath. */
>> +    for (size_t i = 0; i < od->nbs->n_port_chain_classifiers; i++) {
>> +        lcc = od->nbs->port_chain_classifiers[i];
>> +        /* Get the parameters from the classifier */
>> +        lpc = lcc->chain;
>> +        //traffic_port = lcc->port;
>> +        traffic_port =  ovn_port_find(ports,lcc->port->name);
>> +        if (traffic_port == NULL) {
>> +            static struct vlog_rate_limit rl =
>> +                        VLOG_RATE_LIMIT_INIT(1, 1);
>> +            VLOG_WARN_RL(&rl,
>> +                        "Traffic port %s does not exist\n",
>> +                              lcc->port->name);
>> +            break;
>> +        }
>> +        /*
>> +        * Check chain has one or more groups
>> +        */
>> +        if (lpc->n_port_pair_groups == 0) {
>> +            static struct vlog_rate_limit rl =
>> +                        VLOG_RATE_LIMIT_INIT(1, 1);
>> +            VLOG_WARN_RL(&rl,
>> +                "SFC Chain: %s used in classifier: %s does not "
>> +                 "have any port pair groups\n",
>> +                              lcc->chain->name, lcc->name);
>> +            break;
>> +
>> +        }
>> +        /* TODO Check port exists. */
>> +        struct eth_addr traffic_logical_port_ea;
>> +        ovs_be32 traffic_logical_port_ip;
>> +        ovs_scan(traffic_port->nbsp->addresses[0],
>> +                    ETH_ADDR_SCAN_FMT" "IP_SCAN_FMT,
>> +                    ETH_ADDR_SCAN_ARGS(traffic_logical_port_ea),
>> +                    IP_SCAN_ARGS(&traffic_logical_port_ip));
>> +        /* Set the port to use as source or destination. */
>> +        if (strcmp(lcc->direction,"entry-lport")==0) {
>> +            chain_path = 0;
>> +        } else {
>> +            chain_path = 1;
>> +        }
>> +        /* Set the direction of the port-chain. */
>> +        if (strcmp(lcc->path,"uni-directional") == 0) {
>> +            chain_direction = 0;
>> +        } else {
>> +            chain_direction = 1;
>> +        }
>> +        /* Set the match parameters. */
>> +        chain_match = lcc->match;
>> +        /*
>> +         * Allocate space for port-pairs + 1. The Extra 1 represents the
>> +         * final hop to reach desired destination.
>> +        * TODO: We are going to allocate enough space to hold all the hops:
>> +         *  1 x portGroups + 1. This needs
>> +         *  to enhanced to: SUM(port pairs of each port group) + 1
>> +         */
>> +        input_port_array = xmalloc(sizeof *src_port *
>> +                                   lpc->n_port_pair_groups);
>> +        output_port_array = xmalloc(sizeof *dst_port *
>> +                                  (lpc->n_port_pair_groups));
>> +        /* Copy port groups from chain and sort them according to sortkey.*/
>> +        struct nbrec_logical_port_pair_group **port_pair_groups =
>> +                                 xmemdup(lpc->port_pair_groups,
>> +                          sizeof *port_pair_groups * lpc->n_port_pair_groups);
>> +        if (lpc->n_port_pair_groups > 1) {
>> +            qsort(port_pair_groups, lpc->n_port_pair_groups,
>> +              sizeof *port_pair_groups, cmp_port_pair_groups);
>> +        }
>> +        /* For each port-pair-group in a port chain pull out the port-pairs.*/
>> +        for (size_t j = 0; j < lpc->n_port_pair_groups; j++) {
>> +            lppg = port_pair_groups[j];
>> +            for (size_t k = 0; k < lppg->n_port_pairs; k++) {
>> +                 /* TODO: Need to add load balancing logic when LB becomes
>> +                 * available. Until LB is available just take the first
>> +                 * PP in the PPG. */
>> +                if (k > 0) {
>> +                    static struct vlog_rate_limit rl =
>> +                        VLOG_RATE_LIMIT_INIT(1, 1);
>> +                    VLOG_WARN_RL(&rl,
>> +                        "Currently lacking support for more than \
>> +                            one port-pair %"PRIuSIZE"\n",
>> +                              lppg->n_port_pairs);
>> +                    break;
>> +                }
>> +                lpp = lppg->port_pairs[k];
>> +                input_port_array[j] = lpp->inport ? ovn_port_find(ports,
>> +                                       lpp->inport->name) : NULL;
>> +                output_port_array[j] = lpp->outport ? ovn_port_find(ports,
>> +                                        lpp->outport->name) : NULL;
>> +            }
>> +        /* At this point we need to determine the final hop port to add to
>> +         * the chain. This defines the complete path for packets through
>> +         * the chain. */
>> +        }
>>
>>
>>
>> I find the change in indentation below confusing.
>>
>>
>>
>> +    /*
>> +    * Insert the lowest priority rule: dest is src-logical-port. These are
>> +    * the entry points into the chain in either direction. The match
>> +    * statement is used to filter the entry port to provide higher
>> +    * granularity of filtering.
>> +    */
>> +    if (chain_path == 1) { /* Path starting from entry port */
>> +        if (strcmp(chain_match,"")!=0) {
>> +           lcc_match =  xasprintf(
>> +            "eth.src == "ETH_ADDR_FMT" && %s",
>> +             ETH_ADDR_ARGS(traffic_logical_port_ea), chain_match);
>> +        } else {
>> +           lcc_match =  xasprintf(
>> +            "eth.src == "ETH_ADDR_FMT,
>> +             ETH_ADDR_ARGS(traffic_logical_port_ea));
>> +        }
>> +           lcc_action = xasprintf("outport = %s; output;",
>> +                                input_port_array[0]->json_key);
>> +    } else {  /* Path going to the entry port */
>> +        if (strcmp(chain_match,"")!=0) {
>> +           lcc_match =  xasprintf(
>> +            "eth.dst == "ETH_ADDR_FMT" && %s",
>> +             ETH_ADDR_ARGS(traffic_logical_port_ea), chain_match);
>> +        } else {
>> +           lcc_match =  xasprintf(
>> +            "eth.dst == "ETH_ADDR_FMT,
>> +             ETH_ADDR_ARGS(traffic_logical_port_ea));
>> +        }
>> +           lcc_action = xasprintf("outport = %s; output;",
>> +                    output_port_array[lpc->n_port_pair_groups-1]->json_key);
>> +    }
>> +    ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, ingress_outer_priority,
>> +                       lcc_match, lcc_action);
>> +    free(lcc_match);
>> +    free(lcc_action);
>> +    /*
>> +    * For the last VNF, send the flow back to the original chassis and
>> +    * exit from there.
>> +    */
>> +    if (chain_path == 1) { /* Path starting from entry port */
>> +           lcc_match = xasprintf(
>> +                    "eth.src == "ETH_ADDR_FMT" && inport == %s",
>> +                     ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                     output_port_array[lpc->n_port_pair_groups-1]->json_key);
>> +    } else { /* Path starting from destination port. */
>> +            lcc_match = xasprintf(
>> +                    "eth.dst == "ETH_ADDR_FMT" && inport == %s",
>> +                     ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                     input_port_array[0]->json_key);
>> +    }
>>
>>
>>
>>     lcc_action = xasprintf("outport = %s; output;", traffic_port->json_key);
>>
>>     ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, ingress_inner_priority,
>>                         lcc_match, lcc_action);
>>     free(lcc_match);
>>     free(lcc_action);
>>
>>
>>
>> Temporarily ignoring cases where there are multiple
>> port_chain_classifiers with the same traffic_port. You should do something
>> to avoid duplicate flows in this case.
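>>
>> One possible approach, sketched here (untested, assuming sset.h from the
>> OVS tree is available in ovn-northd.c):
>>
>>     /* At the top of build_chain(): remember which traffic ports already
>>      * have a loopback flow. */
>>     struct sset loopback_ports = SSET_INITIALIZER(&loopback_ports);
>>
>>     /* When emitting the loopback flow; sset_add() returns NULL if the
>>      * name is already present, so each port gets the flow only once. */
>>     if (sset_add(&loopback_ports, traffic_port->nbsp->name)) {
>>         ovn_lflow_add(lflows, od, S_SWITCH_OUT_SFC_LOOPBACK, 100,
>>                       lcc_match, ds_cstr(&actions));
>>     }
>>
>>     /* Before returning from build_chain(): */
>>     sset_destroy(&loopback_ports);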
>>
>>
>>
>>     /* SFC egress loopback on traffic_port. */
>>
>>
>>
>>     lcc_match = xasprintf(
>>                     "eth.src == "ETH_ADDR_FMT" && outport == %s",
>>                     ETH_ADDR_ARGS(traffic_logical_port_ea),
>>                     traffic_port->json_key);
>>
>>
>>
>> +             //lcc_action = xasprintf("next;");
>> +             //lcc_action = xasprintf("flags.loopback = 1; "
>> +             //    REGBIT_CHAIN_LOOPBACK" = 1;"
>> +             //   "next(pipeline=ingress, table=0);");
>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>> +    ds_put_format(&actions, "clone { ct_clear; "
>> +                              "inport = outport; outport = \"\"; "
>> +                              "flags = 0; flags.loopback = 1; ");
>>
>>
>>
>> Probably should not set flags.loopback, see comments
>>
>> below.
>>
>>
>>
>> +    for (int ii = 0; ii < MFF_N_LOG_REGS; ii++) {
>> +            ds_put_format(&actions, "reg%d = 0; ", ii);
>> +    }
>> +    ds_put_format(&actions, "next(pipeline=ingress, table=%d); };",
>> +                            ovn_stage_get_table(S_SWITCH_IN_CHAIN) + 1);
>> +    ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, ingress_inner_priority,
>> +                        lcc_match, ds_cstr(&actions));
>>
>>
>>
>> Replace the line above by:
>>
>>
>>
>>     ovn_lflow_add(lflows, od, S_SWITCH_OUT_SFC_LOOPBACK, 100,
>>                         lcc_match, ds_cstr(&actions));
>>
>>
>>
>> +    free(lcc_match);
>> +        //free(lcc_action);
>> +    ds_destroy(&actions);
>> +
>>
>>
>>
>> Some thought needs to be put behind loop avoidance, for example when
>> eth.dst == eth.src == traffic_logical_port_ea. This may be an issue since
>> we do not have any definitive method to distinguish packets returning from
>> chains from packets that were switched to traffic_port according to normal
>> L2 processing.
>>
>> One option is to add higher priority (> 100) flows. This is messy, since
>> there is already a fair amount of code in S_SWITCH_IN_L2_LKUP that writes
>> flows to send traffic to this port. This needs some thought to determine
>> the match conditions. The action is simple, just "next;".
>>
>> Perhaps more promising is not to set flags.loopback to 1. This would cause
>> any packets with eth.src == traffic_logical_port_ea that are switched to
>> traffic_port to go back through traffic_port's ingress pipeline, then be
>> dropped at the end of that pipeline if outport == traffic_port. However,
>> if anything after S_SWITCH_IN_CHAIN sets flags.loopback, that might not be
>> enough.
>>
>> There seems to be another issue: if anything before S_SWITCH_IN_CHAIN sets
>> flags.loopback, that will not be remembered when the packet comes back, so
>> the packet will not be processed properly.
>>
>> Right now it looks like flags.loopback is only set for a logical switch in
>> tables 11, 12, and 13, which are after S_SWITCH_IN_CHAIN. So, at the
>> moment, remembering flags.loopback is not an issue. These flows generate
>> ARP and DHCP responses, in all cases changing eth.src, so the packets will
>> not loop indefinitely even though these rules set flags.loopback. There
>> are already flows to avoid generating ARP responses for requests on the
>> port that owns the address, which is the dangerous case for SFC loops.
>>
>> In summary, it looks like leaving flags.loopback at 0 should work for now.
>> The question is what would happen if more cases are added in the future
>> that set flags.loopback in a logical switch.
>>
>>
>>
>> Mickey
>>
>>
>>
>> +    /*
>> +    * Loop over all VNFs and create flow rules for each
>> +    * Only get here when there is more than one VNF in the chain.
>> +    */
>> +    for (size_t j = 1; j < lpc->n_port_pair_groups; j++) {
>> +
>> +        /* Apply inner rules flows */
>> +        if (chain_path == 1) { /* Path starting from entry port */
>> +            lcc_match = xasprintf(
>> +                    "eth.src == "ETH_ADDR_FMT" && inport == %s",
>> +                                 ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                                 output_port_array[j-1]->json_key);
>> +            lcc_action = xasprintf("outport = %s; output;",
>> +                                input_port_array[j]->json_key);
>> +        } else { /* Path going to entry port. */
>> +            lcc_match = xasprintf(
>> +                    "eth.dst == "ETH_ADDR_FMT" && inport == %s",
>> +                    ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                    input_port_array[lpc->n_port_pair_groups-j]->json_key);
>> +             lcc_action = xasprintf("outport = %s; output;",
>> +                  output_port_array[lpc->n_port_pair_groups-(j+1)]->json_key);
>> +        }
>> +        ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, ingress_inner_priority,
>> +                        lcc_match, lcc_action);
>> +        free(lcc_match);
>> +        free(lcc_action);
>> +    }
>> +    /* bi-directional chain (Response)*/
>> +    if (chain_direction == 1) {
>> +        /*
>> +         * Insert the lowest priority rule: dest is src-logical-port
>> +         */
>> +        if (chain_path == 1) { /* Path from source port. */
>> +            if (strcmp(chain_match,"")!=0) {
>> +                 lcc_match =  xasprintf("eth.dst == "ETH_ADDR_FMT" && %s",
>> +                        ETH_ADDR_ARGS(traffic_logical_port_ea), chain_match);
>> +            } else { /* Path to source port */
>> +
>> +                 lcc_match =  xasprintf("eth.dst == "ETH_ADDR_FMT,
>> +                                    ETH_ADDR_ARGS(traffic_logical_port_ea));
>> +            }
>> +            lcc_action = xasprintf("outport = %s; output;",
>> +                    output_port_array[lpc->n_port_pair_groups-1]->json_key);
>> +        } else { /* Path from destination port. */
>> +           if (strcmp(chain_match,"")!=0) {
>> +                lcc_match =  xasprintf("eth.src == "ETH_ADDR_FMT" && %s",
>> +                        ETH_ADDR_ARGS(traffic_logical_port_ea), chain_match);
>> +            } else {
>> +                lcc_match =  xasprintf("eth.src == "ETH_ADDR_FMT,
>> +                        ETH_ADDR_ARGS(traffic_logical_port_ea));
>> +            }
>> +            lcc_action = xasprintf("outport = %s; output;",
>> +                    input_port_array[0]->json_key);
>> +        }
>> +        ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN, egress_outer_priority,
>> +                       lcc_match, lcc_action);
>> +        free(lcc_match);
>> +        free(lcc_action);
>> +        /*
>> +        * End Entry Flow to classification chain entry point.
>> +        */
>> +        /* Apply last rule to exit from chain */
>> +        if (chain_path == 1) { /* Path from source port. */
>> +             lcc_match = xasprintf(
>> +                    "eth.dst == "ETH_ADDR_FMT" && inport == %s",
>> +                                 ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                                 input_port_array[0]->json_key);
>> +
>> +        } else { /* Path to source port. */
>> +             lcc_match = xasprintf(
>> +                "eth.src == "ETH_ADDR_FMT" && inport == %s",
>> +                    ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                    output_port_array[lpc->n_port_pair_groups-1]->json_key);
>> +        }
>> +        struct ds actions = DS_EMPTY_INITIALIZER;
>> +        ds_put_format(&actions,
>> +                              "clone { ct_clear; "
>> +                              "inport = outport; outport = \"\"; "
>> +                              "flags = 0; flags.loopback = 1; ");
>> +        for (int ii = 0; ii < MFF_N_LOG_REGS; ii++) {
>> +                    ds_put_format(&actions, "reg%d = 0; ", ii);
>> +        }
>> +        ds_put_format(&actions, "next(pipeline=ingress, table=%d); };",
>> +                              ovn_stage_get_table(S_SWITCH_IN_CHAIN) + 1);
>> +        ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN,
>> +                        egress_inner_priority, lcc_match, ds_cstr(&actions));
>> +        ds_destroy(&actions);
>> +        free(lcc_match);
>> +
>> +        for (int j = 1; j< lpc->n_port_pair_groups; j++) {
>> +
>> +            /* Completed first catch all rule for this port-chain. */
>> +
>> +            /* Apply inner rules flows */
>> +            if (chain_path == 1) { /* Path from source port. */
>> +                lcc_match = xasprintf(
>> +                    "eth.dst == "ETH_ADDR_FMT" && inport == %s",
>> +                    ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                    input_port_array[lpc->n_port_pair_groups-j]->json_key);
>> +                 lcc_action = xasprintf("outport = %s; output;",
>> +                  output_port_array[lpc->n_port_pair_groups-(j+1)]->json_key);
>> +
>> +            } else { /* Path to source port. */
>> +                lcc_match = xasprintf(
>> +                    "eth.src == "ETH_ADDR_FMT" && inport == %s",
>> +                        ETH_ADDR_ARGS(traffic_logical_port_ea),
>> +                        output_port_array[j-1]->json_key);
>> +                 lcc_action = xasprintf("outport = %s; output;",
>> +                        input_port_array[j]->json_key);
>> +            }
>> +            ovn_lflow_add(lflows, od, S_SWITCH_IN_CHAIN,
>> +                          egress_inner_priority, lcc_match, lcc_action);
>> +            free(lcc_match);
>> +            free(lcc_action);
>> +        }
>> +    }
>> +    free(input_port_array);
>> +    free(output_port_array);
>> +    free(port_pair_groups);
>> +    }
>> +}
>>  static void
>>  build_qos(struct ovn_datapath *od, struct hmap *lflows) {
>>      ovn_lflow_add(lflows, od, S_SWITCH_IN_QOS_MARK, 0, "1", "next;");
>> @@ -3142,7 +3484,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>      struct ds actions = DS_EMPTY_INITIALIZER;
>>
>>      /* Build pre-ACL and ACL tables for both ingress and egress.
>> -     * Ingress tables 3 through 9.  Egress tables 0 through 6. */
>> +     * Ingress tables 3 through 10.  Egress tables 0 through 6. */
>>      struct ovn_datapath *od;
>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>          if (!od->nbs) {
>> @@ -3153,6 +3495,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          build_pre_lb(od, lflows);
>>          build_pre_stateful(od, lflows);
>>          build_acls(od, lflows);
>> +        build_chain(od, lflows, ports);
>>          build_qos(od, lflows);
>>          build_lb(od, lflows);
>>          build_stateful(od, lflows);
>> @@ -3225,9 +3568,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          ovn_lflow_add(lflows, od, S_SWITCH_IN_PORT_SEC_IP, 0, "1", "next;");
>>      }
>>
>> -    /* Ingress table 10: ARP/ND responder, skip requests coming from localnet
>> -     * and vtep ports. (priority 100); see ovn-northd.8.xml for the
>> -     * rationale. */
>> +    /* Ingress table 11: ARP/ND responder, skip requests coming from localnet
>> +     * ports. (priority 100). */
>> +
>>      HMAP_FOR_EACH (op, key_node, ports) {
>>          if (!op->nbsp) {
>>              continue;
>> @@ -3242,7 +3585,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          }
>>      }
>>
>> -    /* Ingress table 10: ARP/ND responder, reply for known IPs.
>> +    /* Ingress table 11: ARP/ND responder, reply for known IPs.
>>       * (priority 50). */
>>      HMAP_FOR_EACH (op, key_node, ports) {
>>          if (!op->nbsp) {
>> @@ -3335,7 +3678,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          }
>>      }
>>
>> -    /* Ingress table 10: ARP/ND responder, by default goto next.
>> +    /* Ingress table 11: ARP/ND responder, by default goto next.
>>       * (priority 0)*/
>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>          if (!od->nbs) {
>> @@ -3345,7 +3688,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          ovn_lflow_add(lflows, od, S_SWITCH_IN_ARP_ND_RSP, 0, "1", "next;");
>>      }
>>
>> -    /* Logical switch ingress table 11 and 12: DHCP options and response
>> +    /* Logical switch ingress table 12 and 13: DHCP options and response
>>           * priority 100 flows. */
>>      HMAP_FOR_EACH (op, key_node, ports) {
>>          if (!op->nbsp) {
>> @@ -3449,7 +3792,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          }
>>      }
>>
>> -    /* Ingress table 11 and 12: DHCP options and response, by default goto next.
>> +    /* Ingress table 12 and 13: DHCP options and response, by default goto next.
>>       * (priority 0). */
>>
>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>> @@ -3461,7 +3804,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
>>      }
>>
>> -    /* Ingress table 13: Destination lookup, broadcast and multicast handling
>> +
>> +    /* Ingress table 14: Destination lookup, broadcast and multicast handling
>>       * (priority 100). */
>>      HMAP_FOR_EACH (op, key_node, ports) {
>>          if (!op->nbsp) {
>> @@ -3481,7 +3825,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>                        "outport = \""MC_FLOOD"\"; output;");
>>      }
>>
>> -    /* Ingress table 13: Destination lookup, unicast handling (priority 50), */
>> +    /* Ingress table 14: Destination lookup, unicast handling (priority 50), */
>>      HMAP_FOR_EACH (op, key_node, ports) {
>>          if (!op->nbsp) {
>>              continue;
>> @@ -3581,7 +3925,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>>          }
>>      }
>>
>> -    /* Ingress table 13: Destination lookup for unknown MACs (priority 0). */
>> +    /* Ingress table 14: Destination lookup for unknown MACs (priority 0). */
>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>          if (!od->nbs) {
>>              continue;
>>
>>
>>
>> <snip>
>>
>>
>>
>
>

