[ovs-dev] [PATCH v5 4/5] ovn: Implement basic ARP support for L3 logical routers.
Ben Pfaff
blp at ovn.org
Sat Mar 12 22:06:31 UTC 2016
On Thu, Mar 10, 2016 at 05:55:46PM -0800, Justin Pettit wrote:
>
> > On Feb 19, 2016, at 4:40 PM, Ben Pfaff <blp at ovn.org> wrote:
> >
> >
> > --- a/ovn/TODO
> > +++ b/ovn/TODO
> > @@ -4,18 +4,11 @@
> >
> > ** New OVN logical actions
> >
> >
> > +*** rate_limit
> >
> > TCP/IP stacks typically limit the rate at which ARPs are sent, e.g. to
> > one per second for a given target. We might need to do this too.
> >
> > ....
> > +*** Ratelimiting.
> >
> > +From casual observation, Linux appears to generate at most one ARP per
> > +second per destination.
>
> It seems like this is saying similar things in two spots.
OK, I removed the item from the logical action section and added a
sentence later:
This might be supported by adding a new OVN logical action for
rate-limiting.
> > *** Renewal and expiration.
> >
> > Something needs to make sure that bindings remain valid and expire
> > those that become stale.
>
> As we discussed in-person, I wonder if we can introduce the concept of
> time into the database. Either the database itself could have
> time-based limits for columns or we could provide the ability to
> define queries based on relative. For example, ovn-northd could
> regularly send a delete query that matches any row with an age older
> than five seconds.
I have some doubts about this approach, but I don't have a design that
I'm happy with either. I am not confident at all that centralizing ARP
resolution into the database is the best approach.
For now, I added a sentence here: "One way to do this might be to add
some support for time to the database server itself."
> > +/* Adds a flow to table */
> > +static void
> > +add_neighbor_flows(struct controller_ctx *ctx,
> > + const struct lport_index *lports, struct hmap *flow_table)
>
> I think the comment describing this function didn't get finished.
Oops. Revised:
/* Adds an OpenFlow flow to 'flow_table' for each MAC binding in the OVN
* southbound database, using 'lports' to resolve logical port names to
* numbers. */
> > diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> > index 70088f6..2261475 100644
> > --- a/ovn/controller/pinctrl.c
> > +++ b/ovn/controller/pinctrl.c
> >
> > @@ -137,6 +161,10 @@ pinctrl_handle_arp(const struct flow *ip_flow, struct ofpbuf *userdata)
> > goto exit;
> > }
> >
> > + struct ds s = DS_EMPTY_INITIALIZER;
> > + ofpacts_format(ofpacts.data, ofpacts.size, &s);
> > + ds_destroy(&s);
>
> Is this doing anything useful?
No. Deleted.
> > @@ -183,21 +212,24 @@ process_packet_in(const struct ofp_header *msg)
> > struct flow headers;
> > flow_extract(&packet, &headers);
> >
> > - const struct flow *md = &pin.flow_metadata.flow;
> > switch (ntohl(ah->opcode)) {
> > ...
> > default:
> > - VLOG_WARN_RL(&rl, "unrecognized packet-in command %#"PRIx32,
> > - md->regs[0]);
> > + VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32, ah->opcode);
>
> Does "ah->opcode" need an ntohl() around it?
Yes. Added. (Too bad sparse doesn't report these.)
> > +static void
> > +pinctrl_handle_put_arp(const struct lport_index *lports,
> > + const struct flow *md, const struct flow *headers)
> > +{
> > + if (n_put_arps >= ARRAY_SIZE(put_arps)) {
> > + return;
> > + }
>
> I don't think we've defined any coverage counters for OVN yet, but I
> wonder if we should start doing that. These sorts of silent errors
> might be good candidates.
OK, I added one here, "pinctrl_drop_put_arp".
> > ...
> > + struct put_arp *pa = &put_arps[n_put_arps++];
> > + pa->timestamp = time_msec();
> > + pa->logical_port = xstrdup(pb->logical_port);
> > + pa->ip = htonl(md->regs[0]);
> > + pa->mac = headers->dl_src;
>
> This code just uses and array to store the requests, but I worry that
> if there are too many ARP replies that this queue could overrun. What
> about using a hash table instead? In the case of multiple identical
> put_arps, that approach would also lessen the impact of the fairly
> expensive loop in run_put_arps() that looks for duplicates.
OK, I changed this to a hash table.
It was not difficult, but it was a large change. I think it's OK, but
I'd still appreciate it if you'd take another look at the commit I
applied just to be sure.
> > +static void
> > +put_load(uint64_t value, enum mf_field_id dst, int ofs, int n_bits,
> > + struct ofpbuf *ofpacts)
> > +{
> > + struct ofpact_set_field *sf = ofpact_put_SET_FIELD(ofpacts);
> > + sf->field = mf_from_id(dst);
> > + sf->flow_has_vlan = false;
> > +
> > + ovs_be64 n_value = htonll(value);
> > + bitwise_copy(&n_value, 8, 0, &sf->value, sf->field->n_bytes, ofs, n_bits);
> > + bitwise_one(&sf->mask, sf->field->n_bytes, ofs, n_bits);
> > +}
>
> We discussed this off-line, but there are three put_load() functions
> in the "ovn" directory--two of which have identical implementations.
> Similarly there's an emit_resubmit() in "ovn/lib/actions.c" and a
> put_resubmit() in "ovn/controller/physical.c" with identical
> implementations but slightly different arguments. I wonder if it
> would make sense to start putting these into a library.
It's about time.
> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> > index 1b2912e..4685143 100644
> > --- a/ovn/northd/ovn-northd.8.xml
> > +++ b/ovn/northd/ovn-northd.8.xml
> >
> >
> >
> > @@ -602,64 +617,73 @@ icmp4 {
> > <ul>
> > <li>
> > <p>
> > - Known MAC bindings. For each IP address <var>A</var> whose host is
> > - known to have Ethernet address <var>HE</var> and reside on router
> > - port <var>P</var> with Ethernet address <var>PE</var>, a priority-200
> > - flow with match <code>reg0 == <var>A</var></code> has the following
> > - actions:
> > + Static MAC bindings. MAC bindings can be known statically based on
> > + data in the <code>OVN_Northbound</code> database. For router ports
> > + connected to logical switches, MAC bindings can be known statically
> > + from the <code>addresses</code> column in the
> > + <code>Logical_Port</code> table. For router ports connected to other
> > + logical routers, MAC bindings can be known statically from the
>
> I wonder if it would be clearer to say "For routed ports" instead of
> "For router ports" in these two sentence, since they're different from
> "logical router ports" but sound similar.
I don't understand the distinction you're making. These sentences talk
about how a logical router can be aware of static MAC-IP bindings on its
logical router ports. What do you mean?
> > + Dynamic MAC bindings. This flows resolves MAC-to-IP bindings that
> > + have become known dynamically through ARP. (The next table will
> > + issue an ARP request for cases where the binding is not yet known.)
> > + </p>
>
> It might be nice to point out that the packet that is being resolved
> is effectively dropped.
Yes, but the next table, for sending ARP requests, is the right place to
do that.
> > + Unknown MAC address. A priority-100 flow with match <code>eth.dst ==
> > + 00:00:00:00:00:00</code> has the following actions:
> > </p>
> >
> > <pre>
> > +rate_limit(outport, ip4.dst);
>
> It feels weird to have an action listed here that we don't support,
> but it's nice to have a reference where it's needed. Maybe leave it
> here, but then add a note at the bottom that it still needs to be
> implemented?
This was just overlooked. I deleted it for now. I don't think
rate-limiting is in fact the biggest issue here.
> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > index e6271cf..d511b4d 100644
> > --- a/ovn/northd/ovn-northd.c
> > +++ b/ovn/northd/ovn-northd.c
> >
> > @@ -389,17 +391,18 @@ join_datapaths(struct northd_context *ctx, struct hmap *datapaths,
> > ...
> > + /* Set the gateway port to NULL. If there is a gateway, it will get
> > + * filled in as we go through the ports later. */
> > + od->gateway_port = NULL;
>
> I think this requires that build_datapaths() be called before
> build_ports(). The code does that, but do you think it's worth
> mentioning that in comments describing the functions?
OK, I added function-level comments for build_datapaths() and
build_ports().
> Thanks for working on this--it's pretty cool!
>
> Acked-by: Justin Pettit <jpettit at ovn.org>
Thanks.
I applied this to master, with the following incremental folded in.
--8<--------------------------cut here-------------------------->8--
diff --git a/ovn/TODO b/ovn/TODO
index 07d0680..0a6225d 100644
--- a/ovn/TODO
+++ b/ovn/TODO
@@ -4,11 +4,6 @@
** New OVN logical actions
-*** rate_limit
-
-TCP/IP stacks typically limit the rate at which ARPs are sent, e.g. to
-one per second for a given target. We might need to do this too.
-
*** icmp4 { action... }
Generates an ICMPv4 packet based on the current IPv4 packet and
@@ -61,6 +56,9 @@ using ARP.
From casual observation, Linux appears to generate at most one ARP per
second per destination.
+This might be supported by adding a new OVN logical action for
+rate-limiting.
+
*** Tracking queries
It's probably best to only record in the database responses to queries
@@ -73,6 +71,9 @@ into the database.
Something needs to make sure that bindings remain valid and expire
those that become stale.
+One way to do this might be to add some support for time to the
+database server itself.
+
*** Table size limiting.
The table of MAC bindings must not be allowed to grow unreasonably
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index b5d00f7..48bb9c8 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -363,7 +363,9 @@ put_load(const uint8_t *data, size_t len,
bitwise_one(&sf->mask, sf->field->n_bytes, ofs, n_bits);
}
-/* Adds a flow to table */
+/* Adds an OpenFlow flow to 'flow_table' for each MAC binding in the OVN
+ * southbound database, using 'lports' to resolve logical port names to
+ * numbers. */
static void
add_neighbor_flows(struct controller_ctx *ctx,
const struct lport_index *lports, struct hmap *flow_table)
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 5406ba0..3fcab99 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -17,6 +17,7 @@
#include "pinctrl.h"
+#include "coverage.h"
#include "dirs.h"
#include "dp-packet.h"
#include "lport.h"
@@ -44,18 +45,23 @@ static struct rconn *swconn;
* rconn_get_connection_seqno(rconn), 'swconn' has reconnected. */
static unsigned int conn_seq_no;
-static void pinctrl_handle_put_arp(const struct lport_index *,
- const struct flow *md,
+static void pinctrl_handle_put_arp(const struct flow *md,
const struct flow *headers);
-static void flush_put_arps(void);
-static void run_put_arps(struct controller_ctx *);
+static void init_put_arps(void);
+static void destroy_put_arps(void);
+static void run_put_arps(struct controller_ctx *,
+ const struct lport_index *lports);
static void wait_put_arps(struct controller_ctx *);
+static void flush_put_arps(void);
+
+COVERAGE_DEFINE(pinctrl_drop_put_arp);
void
pinctrl_init(void)
{
swconn = rconn_create(5, 0, DSCP_DEFAULT, 1 << OFP13_VERSION);
conn_seq_no = 0;
+ init_put_arps();
}
static ovs_be32
@@ -169,10 +175,6 @@ pinctrl_handle_arp(const struct flow *ip_flow, const struct match *md,
goto exit;
}
- struct ds s = DS_EMPTY_INITIALIZER;
- ofpacts_format(ofpacts.data, ofpacts.size, &s);
- ds_destroy(&s);
-
struct ofputil_packet_out po = {
.packet = dp_packet_data(&packet),
.packet_len = dp_packet_size(&packet),
@@ -190,8 +192,7 @@ exit:
}
static void
-process_packet_in(const struct lport_index *lports,
- const struct ofp_header *msg)
+process_packet_in(const struct ofp_header *msg)
{
static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
@@ -226,18 +227,18 @@ process_packet_in(const struct lport_index *lports,
break;
case ACTION_OPCODE_PUT_ARP:
- pinctrl_handle_put_arp(lports, &pin.flow_metadata.flow, &headers);
+ pinctrl_handle_put_arp(&pin.flow_metadata.flow, &headers);
break;
default:
- VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32, ah->opcode);
+ VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32,
+ ntohl(ah->opcode));
break;
}
}
static void
-pinctrl_recv(const struct lport_index *lports,
- const struct ofp_header *oh, enum ofptype type)
+pinctrl_recv(const struct ofp_header *oh, enum ofptype type)
{
if (type == OFPTYPE_ECHO_REQUEST) {
queue_msg(make_echo_reply(oh));
@@ -250,7 +251,7 @@ pinctrl_recv(const struct lport_index *lports,
config.miss_send_len = UINT16_MAX;
set_switch_config(swconn, &config);
} else if (type == OFPTYPE_PACKET_IN) {
- process_packet_in(lports, oh);
+ process_packet_in(oh);
} else if (type != OFPTYPE_ECHO_REPLY && type != OFPTYPE_BARRIER_REPLY) {
if (VLOG_IS_DBG_ENABLED()) {
static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(30, 300);
@@ -300,12 +301,12 @@ pinctrl_run(struct controller_ctx *ctx, const struct lport_index *lports,
enum ofptype type;
ofptype_decode(&type, oh);
- pinctrl_recv(lports, oh, type);
+ pinctrl_recv(oh, type);
ofpbuf_delete(msg);
}
}
- run_put_arps(ctx);
+ run_put_arps(ctx, lports);
}
void
@@ -320,7 +321,7 @@ void
pinctrl_destroy(void)
{
rconn_destroy(swconn);
- flush_put_arps();
+ destroy_put_arps();
}
/* Implementation of the "put_arp" OVN action. This action sends a packet to
@@ -333,103 +334,153 @@ pinctrl_destroy(void)
* we buffer up a few put_arps (but we don't keep them longer than 1 second)
* and apply them whenever a database transaction is available. */
-/* Buffered "put_arp" operations. */
+/* Buffered "put_arp" operation. */
struct put_arp {
+ struct hmap_node hmap_node; /* In 'put_arps'. */
+
long long int timestamp; /* In milliseconds. */
- char *logical_port;
+
+ /* Key. */
+ uint32_t dp_key;
+ uint32_t port_key;
ovs_be32 ip;
+
+ /* Value. */
struct eth_addr mac;
};
-static struct put_arp put_arps[1024];
-static size_t n_put_arps;
+
+/* Contains "struct put_arp"s. */
+static struct hmap put_arps;
static void
-pinctrl_handle_put_arp(const struct lport_index *lports,
- const struct flow *md, const struct flow *headers)
+init_put_arps(void)
{
- if (n_put_arps >= ARRAY_SIZE(put_arps)) {
- return;
+ hmap_init(&put_arps);
+}
+
+static void
+destroy_put_arps(void)
+{
+ flush_put_arps();
+ hmap_destroy(&put_arps);
+}
+
+static struct put_arp *
+pinctrl_find_put_arp(uint32_t dp_key, uint32_t port_key, ovs_be32 ip,
+ uint32_t hash)
+{
+ struct put_arp *pa;
+ HMAP_FOR_EACH_WITH_HASH (pa, hmap_node, hash, &put_arps) {
+ if (pa->dp_key == dp_key
+ && pa->port_key == port_key
+ && pa->ip == ip) {
+ return pa;
+ }
}
+ return NULL;
+}
- /* Convert logical datapath and logical port key into lport. */
+static void
+pinctrl_handle_put_arp(const struct flow *md, const struct flow *headers)
+{
uint32_t dp_key = ntohll(md->metadata);
uint32_t port_key = md->regs[MFF_LOG_INPORT - MFF_REG0];
- const struct sbrec_port_binding *pb
- = lport_lookup_by_key(lports, dp_key, port_key);
- if (!pb) {
- static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+ ovs_be32 ip = htonl(md->regs[0]);
+ uint32_t hash = hash_3words(dp_key, port_key, (OVS_FORCE uint32_t) ip);
+ struct put_arp *pa = pinctrl_find_put_arp(dp_key, port_key, ip, hash);
+ if (!pa) {
+ if (hmap_count(&put_arps) >= 1000) {
+ COVERAGE_INC(pinctrl_drop_put_arp);
+ return;
+ }
- VLOG_WARN_RL(&rl, "unknown logical port with datapath %"PRIu32" and "
- "port %"PRIu32, dp_key, port_key);
- return;
+ pa = xmalloc(sizeof *pa);
+ hmap_insert(&put_arps, &pa->hmap_node, hash);
+ pa->dp_key = dp_key;
+ pa->port_key = port_key;
+ pa->ip = ip;
}
-
- struct put_arp *pa = &put_arps[n_put_arps++];
pa->timestamp = time_msec();
- pa->logical_port = xstrdup(pb->logical_port);
- pa->ip = htonl(md->regs[0]);
pa->mac = headers->dl_src;
}
static void
-flush_put_arps(void)
+run_put_arp(struct controller_ctx *ctx, const struct lport_index *lports,
+ const struct put_arp *pa)
{
- for (struct put_arp *pa = put_arps; pa < &put_arps[n_put_arps]; pa++) {
- free(pa->logical_port);
+ if (time_msec() > pa->timestamp + 1000) {
+ return;
}
- n_put_arps = 0;
-}
-static void
-run_put_arps(struct controller_ctx *ctx)
-{
- if (!ctx->ovnsb_idl_txn) {
+ /* Convert logical datapath and logical port key into lport. */
+ const struct sbrec_port_binding *pb
+ = lport_lookup_by_key(lports, pa->dp_key, pa->port_key);
+ if (!pb) {
+ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+
+ VLOG_WARN_RL(&rl, "unknown logical port with datapath %"PRIu32" "
+ "and port %"PRIu32, pa->dp_key, pa->port_key);
return;
}
- for (const struct put_arp *pa = put_arps; pa < &put_arps[n_put_arps];
- pa++) {
- if (time_msec() > pa->timestamp + 1000) {
- continue;
- }
+ /* Convert arguments to string form for database. */
+ char ip_string[INET_ADDRSTRLEN + 1];
+ snprintf(ip_string, sizeof ip_string, IP_FMT, IP_ARGS(pa->ip));
+
+ char mac_string[ETH_ADDR_STRLEN + 1];
+ snprintf(mac_string, sizeof mac_string,
+ ETH_ADDR_FMT, ETH_ADDR_ARGS(pa->mac));
- /* Convert arguments to string form for database. */
- char ip_string[INET_ADDRSTRLEN + 1];
- snprintf(ip_string, sizeof ip_string, IP_FMT, IP_ARGS(pa->ip));
- char mac_string[ETH_ADDR_STRLEN + 1];
- snprintf(mac_string, sizeof mac_string,
- ETH_ADDR_FMT, ETH_ADDR_ARGS(pa->mac));
-
- /* Check for and update an existing IP-MAC binding for this logical
- * port.
- *
- * XXX This is not very efficient. */
- const struct sbrec_mac_binding *b;
- SBREC_MAC_BINDING_FOR_EACH (b, ctx->ovnsb_idl) {
- if (!strcmp(b->logical_port, pa->logical_port)
- && !strcmp(b->ip, ip_string)) {
- if (strcmp(b->mac, mac_string)) {
- sbrec_mac_binding_set_mac(b, mac_string);
- }
- goto next;
+ /* Check for and update an existing IP-MAC binding for this logical
+ * port.
+ *
+ * XXX This is not very efficient. */
+ const struct sbrec_mac_binding *b;
+ SBREC_MAC_BINDING_FOR_EACH (b, ctx->ovnsb_idl) {
+ if (!strcmp(b->logical_port, pb->logical_port)
+ && !strcmp(b->ip, ip_string)) {
+ if (strcmp(b->mac, mac_string)) {
+ sbrec_mac_binding_set_mac(b, mac_string);
}
+ return;
}
+ }
+
+ /* Add new IP-MAC binding for this logical port. */
+ b = sbrec_mac_binding_insert(ctx->ovnsb_idl_txn);
+ sbrec_mac_binding_set_logical_port(b, pb->logical_port);
+ sbrec_mac_binding_set_ip(b, ip_string);
+ sbrec_mac_binding_set_mac(b, mac_string);
+}
- /* Add new IP-MAC binding for this logical port. */
- b = sbrec_mac_binding_insert(ctx->ovnsb_idl_txn);
- sbrec_mac_binding_set_logical_port(b, pa->logical_port);
- sbrec_mac_binding_set_ip(b, ip_string);
- sbrec_mac_binding_set_mac(b, mac_string);
- next:;
+static void
+run_put_arps(struct controller_ctx *ctx, const struct lport_index *lports)
+{
+ if (!ctx->ovnsb_idl_txn) {
+ return;
}
+ const struct put_arp *pa;
+ HMAP_FOR_EACH (pa, hmap_node, &put_arps) {
+ run_put_arp(ctx, lports, pa);
+ }
flush_put_arps();
}
static void
wait_put_arps(struct controller_ctx *ctx)
{
- if (ctx->ovnsb_idl_txn && n_put_arps) {
+ if (ctx->ovnsb_idl_txn && !hmap_is_empty(&put_arps)) {
poll_immediate_wake();
}
}
+
+static void
+flush_put_arps(void)
+{
+ struct put_arp *pa, *next;
+ HMAP_FOR_EACH_SAFE (pa, next, hmap_node, &put_arps) {
+ hmap_remove(&put_arps, &pa->hmap_node);
+ free(pa);
+ }
+}
diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 0e95828..79fe153 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -685,7 +685,6 @@ icmp4 {
</p>
<pre>
-rate_limit(outport, ip4.dst);
arp {
eth.dst = ff:ff:ff:ff:ff:ff;
arp.spa = reg1;
@@ -698,6 +697,10 @@ arp {
(Ingress table 2 initialized <code>reg1</code> with the IP address
owned by <code>outport</code>.)
</p>
+
+ <p>
+ The IP packet that triggers the ARP request is dropped.
+ </p>
</li>
<li>
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 3beb39a..d707965 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -414,6 +414,11 @@ ovn_datapath_allocate_key(struct hmap *dp_tnlids)
return allocate_tnlid(dp_tnlids, "datapath", (1u << 24) - 1, &hint);
}
+/* Updates the southbound Datapath_Binding table so that it contains the
+ * logical switches and routers specified by the northbound database.
+ *
+ * Initializes 'datapaths' to contain a "struct ovn_datapath" for every logical
+ * switch and router. */
static void
build_datapaths(struct northd_context *ctx, struct hmap *datapaths)
{
@@ -701,6 +706,12 @@ ovn_port_update_sbrec(const struct ovn_port *op)
}
}
+/* Updates the southbound Port_Binding table so that it contains the logical
+ * ports specified by the northbound database.
+ *
+ * Initializes 'ports' to contain a "struct ovn_port" for every logical port,
+ * using the "struct ovn_datapath"s in 'datapaths' to look up logical
+ * datapaths. */
static void
build_ports(struct northd_context *ctx, struct hmap *datapaths,
struct hmap *ports)
More information about the dev
mailing list