[ovs-dev] [PATCH v11 09/14] dp-packet: Add support for data "linearization".
Eelco Chaudron
echaudro at redhat.com
Thu Oct 11 14:50:42 UTC 2018
Hi Tiago,
It has been a while since I reviewed this patchset, I guess before my
summer holiday ;)
I picked this patch out of the set as it does not have my acked-by, and
I assume all the others that do have it did not change enough to warrant
another review. If they did, let me know.
See inline comments...
Cheers,
Eelco
On 10 Oct 2018, at 18:22, Tiago Lam wrote:
> Previous commits have added support to the dp_packet API to handle
> multi-segmented packets, where data is not stored contiguously in
> memory. However, in some cases it is unavoidable that data must be
> provided contiguously. Examples of such cases are when computing
> checksums over the entire packet data, or when write()'ing to a file
> descriptor (for a tap interface, for example). For such cases, the
> dp_packet API has been extended to provide a way to transform a
> multi-segmented DPBUF_DPDK packet into a DPBUF_MALLOC system packet
> (at the expense of a copy of memory). If the packet's data is already
> stored contiguously in memory then there's no need to convert the
> packet.
>
> Thus, the main use cases that were assuming that a dp_packet's data is
> always held contiguously in memory were changed to make use of the new
> "linear functions" in the dp_packet API when there's a need to
> traverse the entire packet's data. Per the example above, when the
> packet's data needs to be written with write() to the tap's file
> descriptor, or when the conntrack module needs to verify a packet's
> checksum, the data is now linearized.
>
> Additionally, the layer functions, such as dp_packet_l3() and
> variants, have been modified to check if there's enough data in the
> packet before returning a pointer to the data (and callers have been
> modified accordingly). This check is needed to guarantee that a caller
> doesn't read beyond the available memory.
I think the last point might be a problem, as dp_packet_l2_5(),
dp_packet_l3() and dp_packet_l4() might now start returning NULL.
Some of the changes you made to the code referencing these functions
would crash if they return NULL.
Can I assume you verified that all these calls will be guarded by
dp_packet_linearize() so we do not access a NULL pointer?
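To make the concern concrete, here is a minimal sketch of a size-checked layer accessor and a caller that tests the result before dereferencing it. Everything named toy_* is a hypothetical stand-in invented for illustration (it is not the OVS dp_packet API), and the 8-byte header length is just an assumed example value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet; NOT the OVS struct dp_packet. */
struct toy_packet {
    const uint8_t *data;    /* Start of the first (contiguous) segment. */
    size_t first_seg_len;   /* Bytes available in that segment. */
    uint16_t l4_ofs;        /* UINT16_MAX when the L4 offset is not set. */
};

/* Assumed header length, for illustration only. */
enum { TOY_ICMP_HEADER_LEN = 8 };

/* Same idea as a size-checked layer accessor: returns NULL unless 'size'
 * bytes starting at the L4 offset are available contiguously. */
static inline const void *
toy_l4(const struct toy_packet *p, size_t size)
{
    assert(p != NULL);

    if (p->l4_ofs == UINT16_MAX || p->l4_ofs + size > p->first_seg_len) {
        return NULL;
    }
    return p->data + p->l4_ofs;
}

/* A caller that checks the accessor's result instead of dereferencing it
 * blindly.  Returns the first header byte, or -1 if the header is not
 * accessible. */
static inline int
toy_icmp_type(const struct toy_packet *p)
{
    const uint8_t *icmp = toy_l4(p, TOY_ICMP_HEADER_LEN);

    return icmp ? icmp[0] : -1;
}
```

Callers written like icmp4_valid_new() in the patch skip the equivalent of the `icmp ?` test, which is why they depend on an earlier guarantee that the header is reachable.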
In addition, why have you decided not to call dp_packet_linearize()
inside these functions (I think I have an idea)?
This might allow skipping the:
    if (!dp_packet_is_linear(pkt)) {
        dp_packet_linearize(pkt);
    }
Maybe the above can be changed to just dp_packet_linearize(), with the
dp_packet_is_linear() check made part of that call. They are both inline
functions already.
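A rough sketch of that suggestion, using a hypothetical toy_pkt type rather than the real dp_packet (all toy_* names and the segment layout are assumptions made for illustration): the linearize helper performs the is-linear test itself, so callers can invoke it unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical segment chain standing in for a multi-segment mbuf. */
struct toy_seg {
    const void *data;
    size_t len;
    struct toy_seg *next;
};

/* Hypothetical packet; NOT the OVS struct dp_packet. */
struct toy_pkt {
    struct toy_seg *segs;   /* Segment chain holding the original data. */
    void *linear;           /* Owned contiguous copy, once linearized. */
    size_t size;            /* Total number of data bytes. */
};

static inline bool
toy_pkt_is_linear(const struct toy_pkt *p)
{
    return p->linear || !p->segs || !p->segs->next;
}

/* Idempotent: returns early when there is nothing to copy, so the caller
 * does not need its own is-linear check.  Returns false only if the
 * allocation fails. */
static inline bool
toy_pkt_linearize(struct toy_pkt *p)
{
    if (toy_pkt_is_linear(p)) {
        return true;
    }

    char *buf = malloc(p->size);
    if (!buf) {
        return false;
    }

    size_t ofs = 0;
    for (const struct toy_seg *s = p->segs; s; s = s->next) {
        assert(ofs + s->len <= p->size);
        memcpy(buf + ofs, s->data, s->len);
        ofs += s->len;
    }
    p->linear = buf;
    return true;
}
```

With this shape each call site shrinks to a single toy_pkt_linearize(pkt) line, and the early return keeps the already-linear case cheap.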
> Signed-off-by: Tiago Lam <tiago.lam at intel.com>
> ---
> lib/bfd.c | 3 +-
> lib/cfm.c | 5 +-
> lib/conntrack-icmp.c | 4 +-
> lib/conntrack-private.h | 4 +-
> lib/conntrack-tcp.c | 6 +-
> lib/conntrack.c | 109 ++++++++++------
> lib/crc32c.c | 35 +++--
> lib/crc32c.h | 2 +
> lib/dp-packet.c | 18 +++
> lib/dp-packet.h | 288
> +++++++++++++++++++++++++++++-------------
> lib/dpif-netdev.c | 5 +
> lib/dpif-netlink.c | 5 +
> lib/dpif.c | 9 ++
> lib/flow.c | 44 ++++---
> lib/lacp.c | 3 +-
> lib/mcast-snooping.c | 8 +-
> lib/netdev-bsd.c | 5 +
> lib/netdev-dummy.c | 13 +-
> lib/netdev-linux.c | 13 +-
> lib/netdev-native-tnl.c | 38 +++---
> lib/odp-execute.c | 28 ++--
> lib/ofp-print.c | 10 +-
> lib/ovs-lldp.c | 3 +-
> lib/packets.c | 151 ++++++++++++++++------
> lib/packets.h | 7 +
> lib/pcap-file.c | 2 +-
> ofproto/ofproto-dpif-upcall.c | 20 ++-
> ofproto/ofproto-dpif-xlate.c | 42 ++++--
> ovn/controller/pinctrl.c | 29 +++--
> tests/test-conntrack.c | 2 +-
> tests/test-rstp.c | 8 +-
> tests/test-stp.c | 8 +-
> 32 files changed, 637 insertions(+), 290 deletions(-)
>
> diff --git a/lib/bfd.c b/lib/bfd.c
> index cc8c685..12e076a 100644
> --- a/lib/bfd.c
> +++ b/lib/bfd.c
> @@ -721,7 +721,8 @@ bfd_process_packet(struct bfd *bfd, const struct
> flow *flow,
> if (!msg) {
> VLOG_INFO_RL(&rl, "%s: Received too-short BFD control message
> (only "
> "%"PRIdPTR" bytes long, at least %d required).",
> - bfd->name, (uint8_t *) dp_packet_tail(p) - l7,
> + bfd->name, dp_packet_size(p) -
> + (l7 - (uint8_t *) dp_packet_data(p)),
> BFD_PACKET_LEN);
> goto out;
> }
> diff --git a/lib/cfm.c b/lib/cfm.c
> index 71d2c02..83baf2a 100644
> --- a/lib/cfm.c
> +++ b/lib/cfm.c
> @@ -584,7 +584,7 @@ cfm_compose_ccm(struct cfm *cfm, struct dp_packet
> *packet,
>
> atomic_read_relaxed(&cfm->extended, &extended);
>
> - ccm = dp_packet_l3(packet);
> + ccm = dp_packet_l3(packet, sizeof(*ccm));
> ccm->mdlevel_version = 0;
> ccm->opcode = CCM_OPCODE;
> ccm->tlv_offset = 70;
> @@ -759,8 +759,7 @@ cfm_process_heartbeat(struct cfm *cfm, const
> struct dp_packet *p)
> atomic_read_relaxed(&cfm->extended, &extended);
>
> eth = dp_packet_eth(p);
> - ccm = dp_packet_at(p, (uint8_t *)dp_packet_l3(p) - (uint8_t
> *)dp_packet_data(p),
> - CCM_ACCEPT_LEN);
> + ccm = dp_packet_l3(p, CCM_ACCEPT_LEN);
>
> if (!ccm) {
> VLOG_INFO_RL(&rl, "%s: Received an unparseable 802.1ag CCM
> heartbeat.",
> diff --git a/lib/conntrack-icmp.c b/lib/conntrack-icmp.c
> index 40fd1d8..0575d0e 100644
> --- a/lib/conntrack-icmp.c
> +++ b/lib/conntrack-icmp.c
> @@ -63,7 +63,7 @@ icmp_conn_update(struct conn *conn_, struct
> conntrack_bucket *ctb,
> static bool
> icmp4_valid_new(struct dp_packet *pkt)
> {
> - struct icmp_header *icmp = dp_packet_l4(pkt);
> + struct icmp_header *icmp = dp_packet_l4(pkt, sizeof *icmp);
>
> return icmp->icmp_type == ICMP4_ECHO_REQUEST
> || icmp->icmp_type == ICMP4_INFOREQUEST
> @@ -73,7 +73,7 @@ icmp4_valid_new(struct dp_packet *pkt)
> static bool
> icmp6_valid_new(struct dp_packet *pkt)
> {
> - struct icmp6_header *icmp6 = dp_packet_l4(pkt);
> + struct icmp6_header *icmp6 = dp_packet_l4(pkt, sizeof *icmp6);
>
> return icmp6->icmp6_type == ICMP6_ECHO_REQUEST;
> }
> diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
> index a344801..2191aa6 100644
> --- a/lib/conntrack-private.h
> +++ b/lib/conntrack-private.h
> @@ -159,8 +159,8 @@ tcp_payload_length(struct dp_packet *pkt)
> {
> const char *tcp_payload = dp_packet_get_tcp_payload(pkt);
> if (tcp_payload) {
> - return ((char *) dp_packet_tail(pkt) -
> dp_packet_l2_pad_size(pkt)
> - - tcp_payload);
> + return dp_packet_l4_size(pkt) -
> + (tcp_payload - (char *) dp_packet_l4(pkt, 0));
> } else {
> return 0;
> }
> diff --git a/lib/conntrack-tcp.c b/lib/conntrack-tcp.c
> index 86d313d..5450971 100644
> --- a/lib/conntrack-tcp.c
> +++ b/lib/conntrack-tcp.c
> @@ -149,7 +149,7 @@ tcp_conn_update(struct conn *conn_, struct
> conntrack_bucket *ctb,
> struct dp_packet *pkt, bool reply, long long now)
> {
> struct conn_tcp *conn = conn_tcp_cast(conn_);
> - struct tcp_header *tcp = dp_packet_l4(pkt);
> + struct tcp_header *tcp = dp_packet_l4(pkt, sizeof *tcp);
> /* The peer that sent 'pkt' */
> struct tcp_peer *src = &conn->peer[reply ? 1 : 0];
> /* The peer that should receive 'pkt' */
> @@ -394,7 +394,7 @@ tcp_conn_update(struct conn *conn_, struct
> conntrack_bucket *ctb,
> static bool
> tcp_valid_new(struct dp_packet *pkt)
> {
> - struct tcp_header *tcp = dp_packet_l4(pkt);
> + struct tcp_header *tcp = dp_packet_l4(pkt, sizeof *tcp);
> uint16_t tcp_flags = TCP_FLAGS(tcp->tcp_ctl);
>
> if (tcp_invalid_flags(tcp_flags)) {
> @@ -416,7 +416,7 @@ tcp_new_conn(struct conntrack_bucket *ctb, struct
> dp_packet *pkt,
> long long now)
> {
> struct conn_tcp* newconn = NULL;
> - struct tcp_header *tcp = dp_packet_l4(pkt);
> + struct tcp_header *tcp = dp_packet_l4(pkt, sizeof *tcp);
> struct tcp_peer *src, *dst;
> uint16_t tcp_flags = TCP_FLAGS(tcp->tcp_ctl);
>
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 974f985..f024186 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -450,10 +450,10 @@ get_ip_proto(const struct dp_packet *pkt)
> uint8_t ip_proto;
> struct eth_header *l2 = dp_packet_eth(pkt);
> if (l2->eth_type == htons(ETH_TYPE_IPV6)) {
> - struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt);
> + struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt, sizeof
> *nh6);
> ip_proto = nh6->ip6_ctlun.ip6_un1.ip6_un1_nxt;
> } else {
> - struct ip_header *l3_hdr = dp_packet_l3(pkt);
> + struct ip_header *l3_hdr = dp_packet_l3(pkt, sizeof *l3_hdr);
> ip_proto = l3_hdr->ip_proto;
> }
>
> @@ -476,8 +476,8 @@ get_alg_ctl_type(const struct dp_packet *pkt,
> ovs_be16 tp_src, ovs_be16 tp_dst,
> enum { CT_IPPORT_FTP = 21 };
> enum { CT_IPPORT_TFTP = 69 };
> uint8_t ip_proto = get_ip_proto(pkt);
> - struct udp_header *uh = dp_packet_l4(pkt);
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct udp_header *uh = dp_packet_l4(pkt, sizeof *uh);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> ovs_be16 ftp_src_port = htons(CT_IPPORT_FTP);
> ovs_be16 ftp_dst_port = htons(CT_IPPORT_FTP);
> ovs_be16 tftp_dst_port = htons(CT_IPPORT_TFTP);
> @@ -530,18 +530,18 @@ pat_packet(struct dp_packet *pkt, const struct
> conn *conn)
> {
> if (conn->nat_info->nat_action & NAT_ACTION_SRC) {
> if (conn->key.nw_proto == IPPROTO_TCP) {
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> packet_set_tcp_port(pkt, conn->rev_key.dst.port,
> th->tcp_dst);
> } else if (conn->key.nw_proto == IPPROTO_UDP) {
> - struct udp_header *uh = dp_packet_l4(pkt);
> + struct udp_header *uh = dp_packet_l4(pkt, sizeof *uh);
> packet_set_udp_port(pkt, conn->rev_key.dst.port,
> uh->udp_dst);
> }
> } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
> if (conn->key.nw_proto == IPPROTO_TCP) {
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> packet_set_tcp_port(pkt, th->tcp_src,
> conn->rev_key.src.port);
> } else if (conn->key.nw_proto == IPPROTO_UDP) {
> - struct udp_header *uh = dp_packet_l4(pkt);
> + struct udp_header *uh = dp_packet_l4(pkt, sizeof *uh);
> packet_set_udp_port(pkt, uh->udp_src,
> conn->rev_key.src.port);
> }
> }
> @@ -553,11 +553,11 @@ nat_packet(struct dp_packet *pkt, const struct
> conn *conn, bool related)
> if (conn->nat_info->nat_action & NAT_ACTION_SRC) {
> pkt->md.ct_state |= CS_SRC_NAT;
> if (conn->key.dl_type == htons(ETH_TYPE_IP)) {
> - struct ip_header *nh = dp_packet_l3(pkt);
> + struct ip_header *nh = dp_packet_l3(pkt, sizeof *nh);
> packet_set_ipv4_addr(pkt, &nh->ip_src,
> conn->rev_key.dst.addr.ipv4_aligned);
> } else {
> - struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt);
> + struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt,
> sizeof *nh6);
> packet_set_ipv6_addr(pkt, conn->key.nw_proto,
> nh6->ip6_src.be32,
> &conn->rev_key.dst.addr.ipv6_aligned,
> @@ -569,11 +569,11 @@ nat_packet(struct dp_packet *pkt, const struct
> conn *conn, bool related)
> } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
> pkt->md.ct_state |= CS_DST_NAT;
> if (conn->key.dl_type == htons(ETH_TYPE_IP)) {
> - struct ip_header *nh = dp_packet_l3(pkt);
> + struct ip_header *nh = dp_packet_l3(pkt, sizeof *nh);
> packet_set_ipv4_addr(pkt, &nh->ip_dst,
> conn->rev_key.src.addr.ipv4_aligned);
> } else {
> - struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt);
> + struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt,
> sizeof *nh6);
> packet_set_ipv6_addr(pkt, conn->key.nw_proto,
> nh6->ip6_dst.be32,
> &conn->rev_key.src.addr.ipv6_aligned,
> @@ -590,18 +590,18 @@ un_pat_packet(struct dp_packet *pkt, const
> struct conn *conn)
> {
> if (conn->nat_info->nat_action & NAT_ACTION_SRC) {
> if (conn->key.nw_proto == IPPROTO_TCP) {
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> packet_set_tcp_port(pkt, th->tcp_src,
> conn->key.src.port);
> } else if (conn->key.nw_proto == IPPROTO_UDP) {
> - struct udp_header *uh = dp_packet_l4(pkt);
> + struct udp_header *uh = dp_packet_l4(pkt, sizeof *uh);
> packet_set_udp_port(pkt, uh->udp_src,
> conn->key.src.port);
> }
> } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
> if (conn->key.nw_proto == IPPROTO_TCP) {
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> packet_set_tcp_port(pkt, conn->key.dst.port,
> th->tcp_dst);
> } else if (conn->key.nw_proto == IPPROTO_UDP) {
> - struct udp_header *uh = dp_packet_l4(pkt);
> + struct udp_header *uh = dp_packet_l4(pkt, sizeof *uh);
> packet_set_udp_port(pkt, conn->key.dst.port,
> uh->udp_dst);
> }
> }
> @@ -612,21 +612,21 @@ reverse_pat_packet(struct dp_packet *pkt, const
> struct conn *conn)
> {
> if (conn->nat_info->nat_action & NAT_ACTION_SRC) {
> if (conn->key.nw_proto == IPPROTO_TCP) {
> - struct tcp_header *th_in = dp_packet_l4(pkt);
> + struct tcp_header *th_in = dp_packet_l4(pkt, sizeof
> *th_in);
> packet_set_tcp_port(pkt, conn->key.src.port,
> th_in->tcp_dst);
> } else if (conn->key.nw_proto == IPPROTO_UDP) {
> - struct udp_header *uh_in = dp_packet_l4(pkt);
> + struct udp_header *uh_in = dp_packet_l4(pkt, sizeof
> *uh_in);
> packet_set_udp_port(pkt, conn->key.src.port,
> uh_in->udp_dst);
> }
> } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
> if (conn->key.nw_proto == IPPROTO_TCP) {
> - struct tcp_header *th_in = dp_packet_l4(pkt);
> + struct tcp_header *th_in = dp_packet_l4(pkt, sizeof
> *th_in);
> packet_set_tcp_port(pkt, th_in->tcp_src,
> conn->key.dst.port);
> } else if (conn->key.nw_proto == IPPROTO_UDP) {
> - struct udp_header *uh_in = dp_packet_l4(pkt);
> + struct udp_header *uh_in = dp_packet_l4(pkt, sizeof
> *uh_in);
> packet_set_udp_port(pkt, uh_in->udp_src,
> conn->key.dst.port);
> }
> @@ -636,16 +636,26 @@ reverse_pat_packet(struct dp_packet *pkt, const
> struct conn *conn)
> static void
> reverse_nat_packet(struct dp_packet *pkt, const struct conn *conn)
> {
> - char *tail = dp_packet_tail(pkt);
> - char pad = dp_packet_l2_pad_size(pkt);
> + char *tail;
> + char pad;
> struct conn_key inner_key;
> const char *inner_l4 = NULL;
> - uint16_t orig_l3_ofs = pkt->l3_ofs;
> - uint16_t orig_l4_ofs = pkt->l4_ofs;
> + uint16_t orig_l3_ofs;
> + uint16_t orig_l4_ofs;
> +
> + /* We need the whole packet to parse the packet below */
> + if (!dp_packet_is_linear(pkt)) {
> + dp_packet_linearize(pkt);
> + }
> +
> + tail = dp_packet_tail(pkt);
> + pad = dp_packet_l2_pad_size(pkt);
> + orig_l3_ofs = pkt->l3_ofs;
> + orig_l4_ofs = pkt->l4_ofs;
>
> if (conn->key.dl_type == htons(ETH_TYPE_IP)) {
> - struct ip_header *nh = dp_packet_l3(pkt);
> - struct icmp_header *icmp = dp_packet_l4(pkt);
> + struct ip_header *nh = dp_packet_l3(pkt, sizeof *nh);
> + struct icmp_header *icmp = dp_packet_l4(pkt, sizeof *icmp);
> struct ip_header *inner_l3 = (struct ip_header *) (icmp + 1);
> extract_l3_ipv4(&inner_key, inner_l3, tail - ((char
> *)inner_l3) - pad,
> &inner_l4, false);
> @@ -664,8 +674,8 @@ reverse_nat_packet(struct dp_packet *pkt, const
> struct conn *conn)
> icmp->icmp_csum = 0;
> icmp->icmp_csum = csum(icmp, tail - (char *) icmp - pad);
> } else {
> - struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt);
> - struct icmp6_error_header *icmp6 = dp_packet_l4(pkt);
> + struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt, sizeof
> *nh6);
> + struct icmp6_error_header *icmp6 = dp_packet_l4(pkt, sizeof
> *icmp6);
> struct ovs_16aligned_ip6_hdr *inner_l3_6 =
> (struct ovs_16aligned_ip6_hdr *) (icmp6 + 1);
> extract_l3_ipv6(&inner_key, inner_l3_6,
> @@ -702,11 +712,11 @@ un_nat_packet(struct dp_packet *pkt, const
> struct conn *conn,
> if (conn->nat_info->nat_action & NAT_ACTION_SRC) {
> pkt->md.ct_state |= CS_DST_NAT;
> if (conn->key.dl_type == htons(ETH_TYPE_IP)) {
> - struct ip_header *nh = dp_packet_l3(pkt);
> + struct ip_header *nh = dp_packet_l3(pkt, sizeof *nh);
> packet_set_ipv4_addr(pkt, &nh->ip_dst,
> conn->key.src.addr.ipv4_aligned);
> } else {
> - struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt);
> + struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt,
> sizeof *nh6);
> packet_set_ipv6_addr(pkt, conn->key.nw_proto,
> nh6->ip6_dst.be32,
> &conn->key.src.addr.ipv6_aligned,
> true);
> @@ -720,11 +730,11 @@ un_nat_packet(struct dp_packet *pkt, const
> struct conn *conn,
> } else if (conn->nat_info->nat_action & NAT_ACTION_DST) {
> pkt->md.ct_state |= CS_SRC_NAT;
> if (conn->key.dl_type == htons(ETH_TYPE_IP)) {
> - struct ip_header *nh = dp_packet_l3(pkt);
> + struct ip_header *nh = dp_packet_l3(pkt, sizeof *nh);
> packet_set_ipv4_addr(pkt, &nh->ip_src,
> conn->key.dst.addr.ipv4_aligned);
> } else {
> - struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt);
> + struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt,
> sizeof *nh6);
> packet_set_ipv6_addr(pkt, conn->key.nw_proto,
> nh6->ip6_src.be32,
> &conn->key.dst.addr.ipv6_aligned,
> true);
> @@ -1320,6 +1330,7 @@ conntrack_execute(struct conntrack *ct, struct
> dp_packet_batch *pkt_batch,
> write_ct_md(packet, zone, NULL, NULL, NULL);
> continue;
> }
> +
> process_one(ct, packet, &ctx, zone, force, commit, now,
> setmark,
> setlabel, nat_action_info, tp_src, tp_dst,
> helper);
> }
> @@ -1901,9 +1912,18 @@ static bool
> conn_key_extract(struct conntrack *ct, struct dp_packet *pkt,
> ovs_be16 dl_type,
> struct conn_lookup_ctx *ctx, uint16_t zone)
> {
> - const struct eth_header *l2 = dp_packet_eth(pkt);
> - const struct ip_header *l3 = dp_packet_l3(pkt);
> - const char *l4 = dp_packet_l4(pkt);
> + const struct eth_header *l2;
> + const struct ip_header *l3;
> + const char *l4;
> +
> + /* We need the whole packet to parse the packet below */
> + if (!dp_packet_is_linear(pkt)) {
> + dp_packet_linearize(pkt);
> + }
> +
> + l2 = dp_packet_eth(pkt);
> + l3 = dp_packet_l3(pkt, sizeof *l3);
> + l4 = dp_packet_l4(pkt, sizeof *l4);
>
> memset(ctx, 0, sizeof *ctx);
>
> @@ -2846,7 +2866,7 @@ terminate_number_str(char *str, uint8_t
> max_digits)
> static void
> get_ftp_ctl_msg(struct dp_packet *pkt, char *ftp_msg)
> {
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> char *tcp_hdr = (char *) th;
> uint32_t tcp_payload_len = tcp_payload_length(pkt);
> size_t tcp_payload_of_interest = MIN(tcp_payload_len,
> @@ -2888,7 +2908,7 @@ process_ftp_ctl_v4(struct conntrack *ct,
> char **ftp_data_v4_start,
> size_t *addr_offset_from_ftp_data_start)
> {
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> size_t tcp_hdr_len = TCP_OFFSET(th->tcp_ctl) * 4;
> char *tcp_hdr = (char *) th;
> *ftp_data_v4_start = tcp_hdr + tcp_hdr_len;
> @@ -3034,7 +3054,7 @@ process_ftp_ctl_v6(struct conntrack *ct,
> size_t *addr_offset_from_ftp_data_start,
> size_t *addr_size, enum ct_alg_mode *mode)
> {
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
> size_t tcp_hdr_len = TCP_OFFSET(th->tcp_ctl) * 4;
> char *tcp_hdr = (char *) th;
> char ftp_msg[LARGEST_FTP_MSG_OF_INTEREST + 1] = {0};
> @@ -3167,7 +3187,7 @@ handle_ftp_ctl(struct conntrack *ct, const
> struct conn_lookup_ctx *ctx,
> const struct conn *conn_for_expectation,
> long long now, enum ftp_ctl_pkt ftp_ctl, bool nat)
> {
> - struct ip_header *l3_hdr = dp_packet_l3(pkt);
> + struct ip_header *l3_hdr;
> ovs_be32 v4_addr_rep = 0;
> struct ct_addr v6_addr_rep;
> size_t addr_offset_from_ftp_data_start;
> @@ -3176,6 +3196,13 @@ handle_ftp_ctl(struct conntrack *ct, const
> struct conn_lookup_ctx *ctx,
> bool do_seq_skew_adj = true;
> enum ct_alg_mode mode = CT_FTP_MODE_ACTIVE;
>
> + /* We need the whole packet to parse the packet below */
> + if (!dp_packet_is_linear(pkt)) {
> + dp_packet_linearize(pkt);
> + }
> +
> + l3_hdr = dp_packet_l3(pkt, sizeof *l3_hdr);
> +
> if (detect_ftp_ctl_type(ctx, pkt) != ftp_ctl) {
> return;
> }
> @@ -3184,7 +3211,7 @@ handle_ftp_ctl(struct conntrack *ct, const
> struct conn_lookup_ctx *ctx,
> do_seq_skew_adj = false;
> }
>
> - struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt);
> + struct ovs_16aligned_ip6_hdr *nh6 = dp_packet_l3(pkt, sizeof
> *nh6);
> int64_t seq_skew = 0;
>
> if (ftp_ctl == CT_FTP_CTL_OTHER) {
> @@ -3240,7 +3267,7 @@ handle_ftp_ctl(struct conntrack *ct, const
> struct conn_lookup_ctx *ctx,
> OVS_NOT_REACHED();
> }
>
> - struct tcp_header *th = dp_packet_l4(pkt);
> + struct tcp_header *th = dp_packet_l4(pkt, sizeof *th);
>
> if (do_seq_skew_adj && seq_skew != 0) {
> if (ctx->reply != conn_for_expectation->seq_skew_dir) {
> diff --git a/lib/crc32c.c b/lib/crc32c.c
> index e8dd6ee..dd5bb9a 100644
> --- a/lib/crc32c.c
> +++ b/lib/crc32c.c
> @@ -132,28 +132,39 @@ static const uint32_t crc32Table[256] = {
> 0xBE2DA0A5L, 0x4C4623A6L, 0x5F16D052L, 0xAD7D5351L
> };
>
> -/*
> - * Compute a CRC32c checksum as per the SCTP requirements in RFC4960.
> This
> - * includes beginning with a checksum of all ones, and returning the
> negated
> - * CRC. Unlike the RFC, we return the checksum in network byte-order.
> - */
> -ovs_be32
> -crc32c(const uint8_t *data, size_t size)
> +uint32_t
> +crc32c_continue(uint32_t partial, const uint8_t *data, size_t size)
> {
> - uint32_t crc = 0xffffffffL;
> -
> while (size--) {
> - crc = crc32Table[(crc ^ *data++) & 0xff] ^ (crc >> 8);
> + partial = crc32Table[(partial ^ *data++) & 0xff] ^ (partial
> >> 8);
> }
>
> + return partial;
> +}
> +
> +ovs_be32
> +crc32c_finish(uint32_t partial)
> +{
> /* The result of this CRC calculation provides us a value in the
> reverse
> * byte-order as compared with our architecture. On big-endian
> systems,
> * this is opposite to our return type. So, to return a
> big-endian
> * value, we must swap the byte-order. */
> #if defined(WORDS_BIGENDIAN)
> - crc = uint32_byteswap(crc);
> + partial = uint32_byteswap(partial);
> #endif
>
> /* Our value is in network byte-order. OVS_FORCE keeps sparse
> happy. */
> - return (OVS_FORCE ovs_be32) ~crc;
> + return (OVS_FORCE ovs_be32) ~partial;
> +}
> +
> +/*
> + * Compute a CRC32c checksum as per the SCTP requirements in RFC4960.
> This
> + * includes beginning with a checksum of all ones, and returning the
> negated
> + * CRC. Unlike the RFC, we return the checksum in network byte-order.
> + */
> +ovs_be32
> +crc32c(const uint8_t *data, size_t size)
> +{
> + uint32_t crc = 0xffffffffL;
> + return crc32c_finish(crc32c_continue(crc, data, size));
> }
> diff --git a/lib/crc32c.h b/lib/crc32c.h
> index 92c7d7f..17c8190 100644
> --- a/lib/crc32c.h
> +++ b/lib/crc32c.h
> @@ -20,6 +20,8 @@
>
> #include "openvswitch/types.h"
>
> +uint32_t crc32c_continue(uint32_t partial, const uint8_t *data,
> size_t size);
> +ovs_be32 crc32c_finish(uint32_t partial);
> ovs_be32 crc32c(const uint8_t *data, size_t);
>
> #endif /* crc32c.h */
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index 1b9503c..31473fe 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -107,6 +107,9 @@ void
> dp_packet_init_dpdk(struct dp_packet *b)
> {
> b->source = DPBUF_DPDK;
> +#ifdef DPDK_NETDEV
> + b->mstate = NULL;
> +#endif
> }
>
> /* Initializes 'b' as an empty dp_packet with an initial capacity of
> 'size'
> @@ -124,6 +127,21 @@ dp_packet_uninit(struct dp_packet *b)
> if (b) {
> if (b->source == DPBUF_MALLOC) {
> free(dp_packet_base(b));
> +
> +#ifdef DPDK_NETDEV
> + /* Packet has been "linearized" */
> + if (b->mstate) {
> + b->source = DPBUF_DPDK;
> + b->mbuf.buf_addr = b->mstate->addr;
> + b->mbuf.buf_len = b->mstate->len;
> + b->mbuf.data_off = b->mstate->off;
> +
> + free(b->mstate);
> + b->mstate = NULL;
> +
> + free_dpdk_buf((struct dp_packet *) b);
> + }
> +#endif
> } else if (b->source == DPBUF_DPDK) {
> #ifdef DPDK_NETDEV
> /* If this dp_packet was allocated by DPDK it must have
> been
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> index bae1882..e412d39 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -27,7 +27,6 @@
>
> #include "netdev-dpdk.h"
> #include "openvswitch/list.h"
> -#include "packets.h"
> #include "util.h"
> #include "flow.h"
>
> @@ -46,6 +45,16 @@ enum OVS_PACKED_ENUM dp_packet_source {
>
> #define DP_PACKET_CONTEXT_SIZE 64
>
> +#ifdef DPDK_NETDEV
> +/* Struct to save data for when a DPBUF_DPDK packet is converted to
> + * DPBUF_MALLOC. */
> +struct mbuf_state {
> + void *addr;
> + uint16_t len;
> + uint16_t off;
> +};
> +#endif
> +
> /* Buffer for holding packet data. A dp_packet is automatically
> reallocated
> * as necessary if it grows too large for the available memory.
> * By default the packet type is set to Ethernet (PT_ETH).
> @@ -53,6 +62,7 @@ enum OVS_PACKED_ENUM dp_packet_source {
> struct dp_packet {
> #ifdef DPDK_NETDEV
> struct rte_mbuf mbuf; /* DPDK mbuf */
> + struct mbuf_state *mstate; /* Used when packet has been
> "linearized" */
> #else
> void *base_; /* First byte of allocated space. */
> uint16_t allocated_; /* Number of bytes allocated. */
> @@ -85,26 +95,27 @@ static inline void dp_packet_set_data(struct
> dp_packet *, void *);
> static inline void *dp_packet_base(const struct dp_packet *);
> static inline void dp_packet_set_base(struct dp_packet *, void *);
>
> +static inline bool dp_packet_is_linear(const struct dp_packet *);
> +static inline void dp_packet_linearize(struct dp_packet *);
> +
> static inline uint32_t dp_packet_size(const struct dp_packet *);
> static inline void dp_packet_set_size(struct dp_packet *, uint32_t);
>
> static inline uint16_t dp_packet_get_allocated(const struct dp_packet
> *);
> static inline void dp_packet_set_allocated(struct dp_packet *,
> uint16_t);
>
> -static inline void
> -dp_packet_copy_mbuf_flags(struct dp_packet *dst, const struct
> dp_packet *src);
> -
> void *dp_packet_resize_l2(struct dp_packet *, int increment);
> void *dp_packet_resize_l2_5(struct dp_packet *, int increment);
> static inline void *dp_packet_eth(const struct dp_packet *);
> static inline void dp_packet_reset_offsets(struct dp_packet *);
> static inline uint8_t dp_packet_l2_pad_size(const struct dp_packet
> *);
> static inline void dp_packet_set_l2_pad_size(struct dp_packet *,
> uint8_t);
> -static inline void *dp_packet_l2_5(const struct dp_packet *);
> +static inline void *dp_packet_l2_5(const struct dp_packet *, uint16_t
> size);
> static inline void dp_packet_set_l2_5(struct dp_packet *, void *);
> -static inline void *dp_packet_l3(const struct dp_packet *);
> +static inline void *dp_packet_l3(const struct dp_packet *, uint16_t
> size);
> static inline void dp_packet_set_l3(struct dp_packet *, void *);
> -static inline void *dp_packet_l4(const struct dp_packet *);
> +static inline size_t dp_packet_l3_size(const struct dp_packet *);
> +static inline void *dp_packet_l4(const struct dp_packet *, uint16_t
> size);
> static inline void dp_packet_set_l4(struct dp_packet *, void *);
> static inline size_t dp_packet_l4_size(const struct dp_packet *);
> static inline const void *dp_packet_get_tcp_payload(const struct
> dp_packet *);
> @@ -122,9 +133,6 @@ void dp_packet_init_dpdk(struct dp_packet *);
> void dp_packet_init(struct dp_packet *, size_t);
> void dp_packet_uninit(struct dp_packet *);
>
> -void dp_packet_copy_mbuf_flags(struct dp_packet *dst,
> - const struct dp_packet *src);
> -
> struct dp_packet *dp_packet_new(size_t);
> struct dp_packet *dp_packet_new_with_headroom(size_t, size_t
> headroom);
> struct dp_packet *dp_packet_clone(const struct dp_packet *);
> @@ -149,6 +157,8 @@ dp_packet_mbuf_from_offset(const struct dp_packet
> *b, size_t *offset);
> void
> dp_packet_mbuf_write(struct rte_mbuf *mbuf, int16_t ofs, uint32_t
> len,
> const void *data);
> +static inline void
> +dp_packet_copy_mbuf_flags(struct dp_packet *dst, const struct
> dp_packet *src);
> #endif
> static inline void *dp_packet_tail(const struct dp_packet *);
> static inline void *dp_packet_end(const struct dp_packet *);
> @@ -179,21 +189,22 @@ void *dp_packet_steal_data(struct dp_packet *);
> static inline bool dp_packet_equal(const struct dp_packet *,
> const struct dp_packet *);
>
> +static inline ssize_t
> +dp_packet_read_data(const struct dp_packet *b, size_t offset, size_t
> size,
> + void **ptr, void *buf);
> +
> +
>
> /* Frees memory that 'b' points to, as well as 'b' itself. */
> static inline void
> dp_packet_delete(struct dp_packet *b)
> {
> if (b) {
> - if (b->source == DPBUF_DPDK) {
> - /* If this dp_packet was allocated by DPDK it must have
> been
> - * created as a dp_packet */
> - free_dpdk_buf((struct dp_packet*) b);
> - return;
> - }
> -
> dp_packet_uninit(b);
> - free(b);
> +
> + if (b->source != DPBUF_DPDK) {
> + free(b);
> + }
> }
> }
>
> @@ -209,7 +220,9 @@ dp_packet_copy_common_members(struct dp_packet
> *new_b,
> }
>
> /* If 'b' contains at least 'offset + size' bytes of data, returns a
> pointer to
> - * byte 'offset'. Otherwise, returns a null pointer. */
> + * byte 'offset'. Otherwise, returns a null pointer. For DPDK
> packets, this
> + * means the 'offset' + 'size' must fall within the same mbuf (not
> necessarily
> + * the first mbuf), otherwise null is returned */
> static inline void *
> dp_packet_at(const struct dp_packet *b, size_t offset, size_t size)
> {
> @@ -219,18 +232,15 @@ dp_packet_at(const struct dp_packet *b, size_t
> offset, size_t size)
>
> #ifdef DPDK_NETDEV
> if (b->source == DPBUF_DPDK) {
> - struct rte_mbuf *buf = CONST_CAST(struct rte_mbuf *,
> &b->mbuf);
> -
> - while (buf && offset > buf->data_len) {
> - offset -= buf->data_len;
> + const struct rte_mbuf *mbuf = dp_packet_mbuf_from_offset(b,
> &offset);
>
> - buf = buf->next;
> + if (!mbuf || offset + size > mbuf->data_len) {
> + return NULL;
> }
>
> - return buf ? rte_pktmbuf_mtod_offset(buf, char *, offset) :
> NULL;
> + return rte_pktmbuf_mtod_offset(mbuf, char *, offset);
> }
> #endif
> -
> return (char *) dp_packet_data(b) + offset;
> }
>
> @@ -329,20 +339,24 @@ dp_packet_pull(struct dp_packet *b, size_t size)
> return data;
> }
>
> -#ifdef DPDK_NETDEV
> /* Similar to dp_packet_try_pull() but doesn't actually pull any
> data, only
> - * checks if it could and returns true or false accordingly.
> - *
> - * Valid for dp_packets carrying mbufs only. */
> + * checks if it could and returns 'true' or 'false', accordingly. For
> DPDK
> + * packets, 'true' is only returned in case the 'offset' + 'size'
> falls within
> + * the first mbuf, otherwise 'false' is returned */
> static inline bool
> -dp_packet_mbuf_may_pull(const struct dp_packet *b, size_t size) {
> - if (size > b->mbuf.data_len) {
> +dp_packet_may_pull(const struct dp_packet *b, uint16_t offset, size_t
> size)
> +{
> + if (offset == UINT16_MAX) {
> + return false;
> + }
> +#ifdef DPDK_NETDEV
> + /* Offset needs to be within the first mbuf */
> + if (offset + size > b->mbuf.data_len) {
> return false;
> }
> -
> - return true;
> -}
> #endif
> + return (offset + size > dp_packet_size(b)) ? false : true;
> +}
>
> /* If 'b' has at least 'size' bytes of data, removes that many bytes
> from the
> * head end of 'b' and returns the first byte removed. Otherwise,
> returns a
> @@ -351,7 +365,7 @@ static inline void *
> dp_packet_try_pull(struct dp_packet *b, size_t size)
> {
> #ifdef DPDK_NETDEV
> - if (!dp_packet_mbuf_may_pull(b, size)) {
> + if (!dp_packet_may_pull(b, 0, size)) {
> return NULL;
> }
> #endif
> @@ -400,17 +414,13 @@ dp_packet_set_l2_pad_size(struct dp_packet *b,
> uint8_t pad_size)
> }
>
> static inline void *
> -dp_packet_l2_5(const struct dp_packet *b)
> +dp_packet_l2_5(const struct dp_packet *b, uint16_t size)
> {
> -#ifdef DPDK_NETDEV
> - if (!dp_packet_mbuf_may_pull(b, b->l2_5_ofs)) {
> + if (!dp_packet_may_pull(b, b->l2_5_ofs, size)) {
> return NULL;
> }
> -#endif
>
> - return b->l2_5_ofs != UINT16_MAX
> - ? (char *) dp_packet_data(b) + b->l2_5_ofs
> - : NULL;
> + return (char *) dp_packet_data(b) + b->l2_5_ofs;
> }
>
> static inline void
> @@ -422,17 +432,13 @@ dp_packet_set_l2_5(struct dp_packet *b, void
> *l2_5)
> }
>
> static inline void *
> -dp_packet_l3(const struct dp_packet *b)
> +dp_packet_l3(const struct dp_packet *b, uint16_t size)
> {
> -#ifdef DPDK_NETDEV
> - if (!dp_packet_mbuf_may_pull(b, b->l3_ofs)) {
> + if (!dp_packet_may_pull(b, b->l3_ofs, size)) {
> return NULL;
> }
> -#endif
>
> - return b->l3_ofs != UINT16_MAX
> - ? (char *) dp_packet_data(b) + b->l3_ofs
> - : NULL;
> + return (char *) dp_packet_data(b) + b->l3_ofs;
> }
>
> static inline void
> @@ -441,18 +447,34 @@ dp_packet_set_l3(struct dp_packet *b, void *l3)
> b->l3_ofs = l3 ? (char *) l3 - (char *) dp_packet_data(b) :
> UINT16_MAX;
> }
>
> +/* Returns the size of the l3 header. Caller must make sure both
> l3_ofs and
> + * l4_ofs are set*/
> +static inline size_t
> +dp_packet_l3h_size(const struct dp_packet *b)
> +{
> + return b->l4_ofs - b->l3_ofs;
> +}
> +
> +static inline size_t
> +dp_packet_l3_size(const struct dp_packet *b)
> +{
> + if (!dp_packet_may_pull(b, b->l3_ofs, 0)) {
> + return 0;
> + }
> +
> + size_t l3_size = dp_packet_size(b) - b->l3_ofs;
> +
> + return l3_size - dp_packet_l2_pad_size(b);
> +}
> +
> static inline void *
> -dp_packet_l4(const struct dp_packet *b)
> +dp_packet_l4(const struct dp_packet *b, uint16_t size)
> {
> -#ifdef DPDK_NETDEV
> - if (!dp_packet_mbuf_may_pull(b, b->l4_ofs)) {
> + if (!dp_packet_may_pull(b, b->l4_ofs, size)) {
> return NULL;
> }
> -#endif
>
> - return b->l4_ofs != UINT16_MAX
> - ? (char *) dp_packet_data(b) + b->l4_ofs
> - : NULL;
> + return (char *) dp_packet_data(b) + b->l4_ofs;
> }
>
> static inline void
> @@ -464,31 +486,13 @@ dp_packet_set_l4(struct dp_packet *b, void *l4)
> static inline size_t
> dp_packet_l4_size(const struct dp_packet *b)
> {
> -#ifdef DPDK_NETDEV
> - if (b->source == DPBUF_DPDK) {
> - if (!dp_packet_mbuf_may_pull(b, b->l4_ofs)) {
> - return 0;
> - }
> -
> - struct rte_mbuf *mbuf = CONST_CAST(struct rte_mbuf *,
> &b->mbuf);
> - size_t l4_size = mbuf->data_len - b->l4_ofs;
> -
> - mbuf = mbuf->next;
> - while (mbuf) {
> - l4_size += mbuf->data_len;
> -
> - mbuf = mbuf->next;
> - }
> + if (!dp_packet_may_pull(b, b->l4_ofs, 0)) {
> + return 0;
> + }
>
> - l4_size -= dp_packet_l2_pad_size(b);
> + size_t l4_size = dp_packet_size(b) - b->l4_ofs;
>
> - return l4_size;
> - }
> -#endif
> - return b->l4_ofs != UINT16_MAX
> - ? (const char *)dp_packet_tail(b) - (const char
> *)dp_packet_l4(b)
> - - dp_packet_l2_pad_size(b)
> - : 0;
> + return l4_size - dp_packet_l2_pad_size(b);
> }
>
> static inline const void *
> @@ -497,11 +501,12 @@ dp_packet_get_tcp_payload(const struct dp_packet
> *b)
> size_t l4_size = dp_packet_l4_size(b);
>
> if (OVS_LIKELY(l4_size >= TCP_HEADER_LEN)) {
> - struct tcp_header *tcp = dp_packet_l4(b);
> + struct tcp_header *tcp = dp_packet_l4(b, sizeof *tcp);
> int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4;
>
> if (OVS_LIKELY(tcp_len >= TCP_HEADER_LEN && tcp_len <=
> l4_size)) {
> - return (const char *)tcp + tcp_len;
> + tcp = dp_packet_at(b, b->l4_ofs, tcp_len);
> + return (tcp == NULL) ? NULL : tcp + tcp_len;
> }
> }
> return NULL;
> @@ -511,28 +516,31 @@ static inline const void *
> dp_packet_get_udp_payload(const struct dp_packet *b)
> {
> return OVS_LIKELY(dp_packet_l4_size(b) >= UDP_HEADER_LEN)
> - ? (const char *)dp_packet_l4(b) + UDP_HEADER_LEN : NULL;
> + ? (const char *) dp_packet_l4(b, UDP_HEADER_LEN) +
> UDP_HEADER_LEN
> + : NULL;
> }
>
> static inline const void *
> dp_packet_get_sctp_payload(const struct dp_packet *b)
> {
> return OVS_LIKELY(dp_packet_l4_size(b) >= SCTP_HEADER_LEN)
> - ? (const char *)dp_packet_l4(b) + SCTP_HEADER_LEN : NULL;
> + ? (const char *) dp_packet_l4(b, SCTP_HEADER_LEN) +
> SCTP_HEADER_LEN
> + : NULL;
> }
>
> static inline const void *
> dp_packet_get_icmp_payload(const struct dp_packet *b)
> {
> return OVS_LIKELY(dp_packet_l4_size(b) >= ICMP_HEADER_LEN)
> - ? (const char *)dp_packet_l4(b) + ICMP_HEADER_LEN : NULL;
> + ? (const char *) dp_packet_l4(b, ICMP_HEADER_LEN) +
> ICMP_HEADER_LEN
> + : NULL;
> }
>
> static inline const void *
> dp_packet_get_nd_payload(const struct dp_packet *b)
> {
> return OVS_LIKELY(dp_packet_l4_size(b) >= ND_MSG_LEN)
> - ? (const char *)dp_packet_l4(b) + ND_MSG_LEN : NULL;
> + ? (const char *)dp_packet_l4(b, ND_MSG_LEN) + ND_MSG_LEN :
> NULL;
> }
>
> #ifdef DPDK_NETDEV
> @@ -706,13 +714,24 @@ static inline void
> dp_packet_copy_mbuf_flags(struct dp_packet *dst, const struct
> dp_packet *src)
> {
> ovs_assert(dst != NULL && src != NULL);
> - struct rte_mbuf *buf_dst = &(dst->mbuf);
> - struct rte_mbuf buf_src = src->mbuf;
> + struct rte_mbuf *buf_dst = &dst->mbuf;
> + const struct rte_mbuf *buf_src = &src->mbuf;
>
> buf_dst->ol_flags = buf_src->ol_flags;
> buf_dst->packet_type = buf_src->packet_type;
> buf_dst->tx_offload = buf_src->tx_offload;
> }
> +
> +static inline void
> +dp_packet_copy_from_offset(const struct dp_packet *b, size_t offset,
> + size_t size, void *buf) {
> + if (dp_packet_is_linear(b)) {
> + memcpy(buf, (char *)dp_packet_data(b) + offset, size);
> + } else {
> + const struct rte_mbuf *mbuf = dp_packet_mbuf_from_offset(b,
> &offset);
> + rte_pktmbuf_read(mbuf, offset, size, buf);
> + }
> +}
> #else
> static inline bool
> dp_packet_equal(const struct dp_packet *a, const struct dp_packet *b)
> @@ -768,6 +787,12 @@ dp_packet_set_allocated(struct dp_packet *b,
> uint16_t s)
> {
> b->allocated_ = s;
> }
> +
> +static inline void
> +dp_packet_copy_from_offset(const struct dp_packet *b, size_t offset,
> + size_t size, void *buf) {
> + memcpy(buf, (char *)dp_packet_data(b) + offset, size);
> +}
> #endif
>
> static inline void
> @@ -811,6 +836,92 @@ dp_packet_data(const struct dp_packet *b)
> ? (char *) dp_packet_base(b) + __packet_data(b) : NULL;
> }
>
> +static inline bool
> +dp_packet_is_linear(const struct dp_packet *b OVS_UNUSED)
> +{
> +#ifdef DPDK_NETDEV
> + if (b->source == DPBUF_DPDK) {
> + return rte_pktmbuf_is_contiguous(&b->mbuf);
> + }
> +#endif
> + return true;
> +}
> +
> +/* Linearizes the data on packet 'b', by copying the data into
> system's memory.
> + * After this the packet is effectively a DPBUF_MALLOC packet. The
> caller is
> + * responsible for ensuring 'b' needs linearization, by calling
> + * dp_packet_is_linear().
> + *
> + * This is an expensive operation which should only be performed as a
> last
> + * resort, when multi-segments are under use but data must be
> accessed
> + * linearly. */
> +static inline void
> +dp_packet_linearize(struct dp_packet *b OVS_UNUSED)
> +{
> +#ifdef DPDK_NETDEV
> + struct rte_mbuf *mbuf = CONST_CAST(struct rte_mbuf *, &b->mbuf);
> + struct dp_packet *pkt = CONST_CAST(struct dp_packet *, b);
> + uint32_t pkt_len = dp_packet_size(pkt);
> + struct mbuf_state *mstate = NULL;
> + void *dst = xmalloc(pkt_len);
> +
> + /* Copy packet's data to system's memory */
> + if (!rte_pktmbuf_read(mbuf, 0, pkt_len, dst)) {
> + return;
> + }
> +
> + /* Free all mbufs except for the first */
> + dp_packet_clear(pkt);
> +
> + /* Save mbuf's buf_addr to restore later */
> + mstate = xmalloc(sizeof(*mstate));
> + mstate->addr = pkt->mbuf.buf_addr;
> + mstate->len = pkt->mbuf.buf_len;
> + mstate->off = pkt->mbuf.data_off;
> + pkt->mstate = mstate;
> +
> +    /* Transform DPBUF_DPDK packet into a DPBUF_MALLOC packet */
> + pkt->source = DPBUF_MALLOC;
> + pkt->mbuf.buf_addr = dst;
> + pkt->mbuf.buf_len = pkt_len;
> + pkt->mbuf.data_off = 0;
> + dp_packet_set_size(pkt, pkt_len);
> +#endif
> +}
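
For readers following along, the core of the linearization step is a gather-copy of the segment chain into one contiguous buffer. A minimal sketch of just that part, assuming a made-up `struct seg` that loosely models a chained mbuf (`seg_chain_len()` and `seg_chain_linearize()` are hypothetical names, not OVS or DPDK API, and the mbuf bookkeeping from the patch is left out):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical segment type; loosely models a chained rte_mbuf. */
struct seg {
    const unsigned char *data;
    size_t len;
    const struct seg *next;
};

/* Returns the sum of all segment lengths in the chain. */
static size_t
seg_chain_len(const struct seg *s)
{
    size_t total = 0;

    for (; s; s = s->next) {
        total += s->len;
    }
    return total;
}

/* Copies every segment into one freshly allocated contiguous buffer.
 * Returns NULL for an empty chain or on allocation failure; otherwise the
 * caller owns the returned buffer and must free() it. */
static unsigned char *
seg_chain_linearize(const struct seg *s, size_t *out_len)
{
    size_t total = seg_chain_len(s);
    unsigned char *dst = total ? malloc(total) : NULL;
    size_t off = 0;

    *out_len = 0;
    if (!dst) {
        return NULL;
    }
    for (; s; s = s->next) {
        if (s->len) {
            memcpy(dst + off, s->data, s->len);
            off += s->len;
        }
    }
    assert(off == total);
    *out_len = total;
    return dst;
}
```

The sketch keeps ownership simple: the copy is a plain malloc'ed buffer, which is roughly what turning the packet into a DPBUF_MALLOC one amounts to.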
> +
> +/* Reads 'size' bytes from 'offset' in 'b', linearly, to 'ptr', if
> 'buf' is
> + * NULL. Otherwise, if a 'buf' is provided, it must have 'size'
> bytes, and the
> + * data will be copied there, iff it is found to be non-linear. */
> +static inline ssize_t
> +dp_packet_read_data(const struct dp_packet *b, size_t offset, size_t
> size,
> + void **ptr, void *buf) {
> + /* Zero copy */
> + if ((*ptr = dp_packet_at(b, offset, size)) != NULL) {
> + return 0;
> + }
> +
> + /* Copy available linear data */
> + if (buf == NULL) {
> +#ifdef DPDK_NETDEV
> + size_t mofs = offset;
> + const struct rte_mbuf *mbuf = dp_packet_mbuf_from_offset(b,
> &mofs);
> + *ptr = dp_packet_at(b, offset, mbuf->data_len - mofs);
> +
> + return size - (mbuf->data_len - mofs);
> +#else
> + /* Non-DPDK dp_packets should always hit the above condition
> */
> + ovs_assert(1);
> +#endif
> + }
> +
> + /* Copy all data */
> +
> + *ptr = buf;
> + dp_packet_copy_from_offset(b, offset, size, buf);
> +
> + return 0;
> +}
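
The zero-copy versus copy behaviour described in the comment above can be sketched on its own. This is only an illustration under my own assumptions: `struct seg` and `seg_chain_read()` are made-up names (not OVS or DPDK API), and the partial-read case with a NULL 'buf' from the patch is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical segment type; loosely models a chained rte_mbuf. */
struct seg {
    const unsigned char *data;
    size_t len;
    const struct seg *next;
};

/* Returns a pointer to 'size' bytes located 'offset' bytes into the chain.
 * If the range sits inside a single segment, the pointer aims into that
 * segment (zero copy).  Otherwise the bytes are gathered into 'buf', which
 * must hold at least 'size' bytes, and 'buf' is returned.  Returns NULL if
 * the chain does not contain the requested range. */
static const void *
seg_chain_read(const struct seg *s, size_t offset, size_t size, void *buf)
{
    size_t copied = 0;

    /* Skip whole segments that lie before 'offset'. */
    while (s && offset >= s->len) {
        offset -= s->len;
        s = s->next;
    }
    if (!s) {
        return NULL;
    }
    if (offset + size <= s->len) {
        return s->data + offset;            /* Zero copy. */
    }

    assert(buf != NULL);
    for (; s && copied < size; s = s->next) {
        size_t n = s->len - offset;

        if (n > size - copied) {
            n = size - copied;
        }
        memcpy((unsigned char *) buf + copied, s->data + offset, n);
        copied += n;
        offset = 0;
    }
    return copied == size ? buf : NULL;
}
```

Callers then only pay for a copy when the requested range actually straddles a segment boundary.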
> +
> static inline void
> dp_packet_set_data(struct dp_packet *b, void *data)
> {
> @@ -888,6 +999,7 @@ dp_packet_mbuf_init(struct dp_packet *p
> OVS_UNUSED)
> mbuf->ol_flags = mbuf->tx_offload = mbuf->packet_type = 0;
> mbuf->nb_segs = 1;
> mbuf->next = NULL;
> + p->mstate = NULL;
> #endif
> }
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 5df4129..6f96412 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -5708,6 +5708,11 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread
> *pmd, struct dp_packet *packet_,
> .support = dp_netdev_support,
> };
>
> + /* Gather the whole data for printing the packet (if debug
> enabled) */
> + if (!dp_packet_is_linear(packet_)) {
> + dp_packet_linearize(packet_);
> + }
> +
> ofpbuf_init(&key, 0);
> odp_flow_key_from_flow(&odp_parms, &key);
> packet_str = ofp_dp_packet_to_string(packet_);
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index ac3d2ed..caad7db 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -1812,6 +1812,11 @@ dpif_netlink_operate__(struct dpif_netlink
> *dpif,
> }
> n_ops = i;
> } else {
> + /* We will need to pass the whole to encode the
> message */
> + if (!dp_packet_is_linear(op->execute.packet)) {
> + dp_packet_linearize(op->execute.packet);
> + }
> +
> dpif_netlink_encode_execute(dpif->dp_ifindex,
> &op->execute,
> &aux->request);
> }
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 4697a4d..514fae5 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -1243,6 +1243,7 @@ dpif_execute_helper_cb(void *aux_, struct
> dp_packet_batch *packets_,
> execute.probe = false;
> execute.mtu = 0;
> aux->error = dpif_execute(aux->dpif, &execute);
> +
> log_execute_message(aux->dpif, &this_module, &execute,
> true, aux->error);
>
> @@ -1395,6 +1396,7 @@ dpif_operate(struct dpif *dpif, struct dpif_op
> **ops, size_t n_ops)
>
> case DPIF_OP_EXECUTE:
> COVERAGE_INC(dpif_execute);
> +
> log_execute_message(dpif, &this_module,
> &op->execute,
> false, error);
> break;
> @@ -1822,6 +1824,13 @@ log_execute_message(const struct dpif *dpif,
> uint64_t stub[1024 / 8];
> struct ofpbuf md = OFPBUF_STUB_INITIALIZER(stub);
>
> + /* We will need the whole data for logging */
> + struct dp_packet *p = CONST_CAST(struct dp_packet *,
> + execute->packet);
> + if (!dp_packet_is_linear(p)) {
> + dp_packet_linearize(p);
> + }
> +
> packet =
> ofp_packet_to_string(dp_packet_data(execute->packet),
> dp_packet_size(execute->packet),
> execute->packet->packet_type);
> diff --git a/lib/flow.c b/lib/flow.c
> index 47b01fc..32af24a 100644
> --- a/lib/flow.c
> +++ b/lib/flow.c
> @@ -3004,38 +3004,40 @@ static void
> flow_compose_l4_csum(struct dp_packet *p, const struct flow *flow,
> uint32_t pseudo_hdr_csum)
> {
> - size_t l4_len = (char *) dp_packet_tail(p) - (char *)
> dp_packet_l4(p);
> + //size_t l4_len = (char *) dp_packet_tail(p) - (char *)
> dp_packet_l4(p);
I guess this commented-out line can be removed.
> + size_t l4_len = dp_packet_l4_size(p);
>
> if (!(flow->nw_frag & FLOW_NW_FRAG_ANY)
> || !(flow->nw_frag & FLOW_NW_FRAG_LATER)) {
> if (flow->nw_proto == IPPROTO_TCP) {
> - struct tcp_header *tcp = dp_packet_l4(p);
> + struct tcp_header *tcp = dp_packet_l4(p, sizeof *tcp);
>
> tcp->tcp_csum = 0;
> - tcp->tcp_csum =
> csum_finish(csum_continue(pseudo_hdr_csum,
> - tcp, l4_len));
> + tcp->tcp_csum = csum_finish(
> + packet_csum_continue(p, pseudo_hdr_csum, p->l4_ofs,
> l4_len));
Are we sure we need to use p->l4_ofs here and not the header pointer
(tcp here, udp below)? They are probably the same, but using the pointer
might be clearer.
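
For context on why an offset might be passed at all: with multi-segment packets a header pointer is only usable within the segment it points into, so a checksum over the whole L4 data has to walk the chain. A minimal sketch of that idea, assuming a made-up `struct seg` chain and a hypothetical `seg_chain_csum()` helper (neither is OVS or DPDK API, and this is not the patch's packet_csum_continue()):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment type; loosely models a chained rte_mbuf. */
struct seg {
    const uint8_t *data;
    size_t len;
    const struct seg *next;
};

/* Ones' complement sum (in the style of RFC 1071) over 'len' bytes that
 * start 'offset' bytes into the chain.  Bytes are paired into big-endian
 * 16-bit words across segment boundaries.  Returns the folded 16-bit sum,
 * not yet inverted.  The chain must contain the whole range. */
static uint16_t
seg_chain_csum(const struct seg *s, size_t offset, size_t len)
{
    uint64_t sum = 0;
    size_t pos = 0;     /* Index of the next byte within the summed range. */

    for (; s && len; s = s->next) {
        if (offset >= s->len) {
            offset -= s->len;   /* Segment lies entirely before the range. */
            continue;
        }
        for (size_t i = offset; i < s->len && len; i++, len--, pos++) {
            /* Even positions are the high byte of a word, odd the low. */
            sum += (pos & 1) ? s->data[i] : (uint64_t) s->data[i] << 8;
        }
        offset = 0;
    }
    assert(len == 0);

    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}
```

Tracking the byte position across segments is what keeps the result independent of where the chain happens to be split, including splits at odd byte boundaries.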
> } else if (flow->nw_proto == IPPROTO_UDP) {
> - struct udp_header *udp = dp_packet_l4(p);
> + struct udp_header *udp = dp_packet_l4(p, sizeof *udp);
>
> udp->udp_csum = 0;
> - udp->udp_csum =
> csum_finish(csum_continue(pseudo_hdr_csum,
> - udp, l4_len));
> + udp->udp_csum = csum_finish(
> + packet_csum_continue(p, pseudo_hdr_csum, p->l4_ofs,
> l4_len));
> } else if (flow->nw_proto == IPPROTO_ICMP) {
> - struct icmp_header *icmp = dp_packet_l4(p);
> + struct icmp_header *icmp = dp_packet_l4(p, sizeof *icmp);
>
> icmp->icmp_csum = 0;
> - icmp->icmp_csum = csum(icmp, l4_len);
> + icmp->icmp_csum = packet_csum(p, p->l4_ofs, l4_len);
> } else if (flow->nw_proto == IPPROTO_IGMP) {
> - struct igmp_header *igmp = dp_packet_l4(p);
> + struct igmp_header *igmp = dp_packet_l4(p, sizeof *igmp);
>
> igmp->igmp_csum = 0;
> - igmp->igmp_csum = csum(igmp, l4_len);
> + igmp->igmp_csum = packet_csum(p, p->l4_ofs, l4_len);
> } else if (flow->nw_proto == IPPROTO_ICMPV6) {
> - struct icmp6_hdr *icmp = dp_packet_l4(p);
> + struct icmp6_hdr *icmp = dp_packet_l4(p, sizeof *icmp);
>
> icmp->icmp6_cksum = 0;
> icmp->icmp6_cksum = (OVS_FORCE uint16_t)
> - csum_finish(csum_continue(pseudo_hdr_csum, icmp,
> l4_len));
> + csum_finish(packet_csum_continue(p, pseudo_hdr_csum,
> p->l4_ofs,
> + l4_len));
> }
> }
> }
> @@ -3061,18 +3063,18 @@ packet_expand(struct dp_packet *p, const
> struct flow *flow, size_t size)
> eth->eth_type = htons(dp_packet_size(p));
> } else if (dl_type_is_ip_any(flow->dl_type)) {
> uint32_t pseudo_hdr_csum;
> - size_t l4_len = (char *) dp_packet_tail(p) - (char *)
> dp_packet_l4(p);
> + size_t l4_len = dp_packet_l4_size(p);
>
> if (flow->dl_type == htons(ETH_TYPE_IP)) {
> - struct ip_header *ip = dp_packet_l3(p);
> + struct ip_header *ip = dp_packet_l3(p, sizeof *ip);
>
> - ip->ip_tot_len = htons(p->l4_ofs - p->l3_ofs + l4_len);
> + ip->ip_tot_len = htons(dp_packet_l3_size(p));
> ip->ip_csum = 0;
> ip->ip_csum = csum(ip, sizeof *ip);
>
> pseudo_hdr_csum = packet_csum_pseudoheader(ip);
> } else { /* ETH_TYPE_IPV6 */
> - struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(p);
> + struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(p, sizeof
> *nh);
>
> nh->ip6_plen = htons(l4_len);
> pseudo_hdr_csum = packet_csum_pseudoheader6(nh);
> @@ -3081,7 +3083,7 @@ packet_expand(struct dp_packet *p, const struct
> flow *flow, size_t size)
> if ((!(flow->nw_frag & FLOW_NW_FRAG_ANY)
> || !(flow->nw_frag & FLOW_NW_FRAG_LATER))
> && flow->nw_proto == IPPROTO_UDP) {
> - struct udp_header *udp = dp_packet_l4(p);
> + struct udp_header *udp = dp_packet_l4(p, sizeof *udp);
>
> udp->udp_len = htons(l4_len + extra_size);
> }
> @@ -3149,8 +3151,8 @@ flow_compose(struct dp_packet *p, const struct
> flow *flow,
>
> l4_len = flow_compose_l4(p, flow, l7, l7_len);
>
> - ip = dp_packet_l3(p);
> - ip->ip_tot_len = htons(p->l4_ofs - p->l3_ofs + l4_len);
> + ip = dp_packet_l3(p, sizeof *ip);
> + ip->ip_tot_len = htons(dp_packet_l3_size(p));
> /* Checksum has already been zeroed by put_zeros call. */
> ip->ip_csum = csum(ip, sizeof *ip);
>
> @@ -3172,7 +3174,7 @@ flow_compose(struct dp_packet *p, const struct
> flow *flow,
>
> l4_len = flow_compose_l4(p, flow, l7, l7_len);
>
> - nh = dp_packet_l3(p);
> + nh = dp_packet_l3(p, sizeof *nh);
> nh->ip6_plen = htons(l4_len);
>
> pseudo_hdr_csum = packet_csum_pseudoheader6(nh);
> diff --git a/lib/lacp.c b/lib/lacp.c
> index d6b36aa..ec92202 100644
> --- a/lib/lacp.c
> +++ b/lib/lacp.c
> @@ -190,8 +190,7 @@ parse_lacp_packet(const struct dp_packet *p)
> {
> const struct lacp_pdu *pdu;
>
> - pdu = dp_packet_at(p, (uint8_t *)dp_packet_l3(p) - (uint8_t
> *)dp_packet_data(p),
> - LACP_PDU_LEN);
> + pdu = dp_packet_l3(p, LACP_PDU_LEN);
>
> if (pdu && pdu->subtype == 1
> && pdu->actor_type == 1 && pdu->actor_len == 20
> diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
> index 6730301..af0cadb 100644
> --- a/lib/mcast-snooping.c
> +++ b/lib/mcast-snooping.c
> @@ -450,11 +450,11 @@ mcast_snooping_add_report(struct mcast_snooping
> *ms,
> int count = 0;
> int ngrp;
>
> - offset = (char *) dp_packet_l4(p) - (char *) dp_packet_data(p);
> - igmpv3 = dp_packet_at(p, offset, IGMPV3_HEADER_LEN);
> + igmpv3 = dp_packet_l4(p, IGMPV3_HEADER_LEN);
> if (!igmpv3) {
> return 0;
> }
> + offset = (char *) igmpv3 - (char *) dp_packet_data(p);
> ngrp = ntohs(igmpv3->ngrp);
> offset += IGMPV3_HEADER_LEN;
> while (ngrp--) {
> @@ -502,11 +502,11 @@ mcast_snooping_add_mld(struct mcast_snooping
> *ms,
> int ngrp;
> bool ret;
>
> - offset = (char *) dp_packet_l4(p) - (char *) dp_packet_data(p);
> - mld = dp_packet_at(p, offset, MLD_HEADER_LEN);
> + mld = dp_packet_l4(p, MLD_HEADER_LEN);
> if (!mld) {
> return 0;
> }
> + offset = (char *) mld - (char *) dp_packet_data(p);
> ngrp = ntohs(mld->ngrp);
> offset += MLD_HEADER_LEN;
> addr = dp_packet_at(p, offset, sizeof(struct in6_addr));
> diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
> index a153aa2..a545e7e 100644
> --- a/lib/netdev-bsd.c
> +++ b/lib/netdev-bsd.c
> @@ -701,6 +701,11 @@ netdev_bsd_send(struct netdev *netdev_, int qid
> OVS_UNUSED,
> }
>
> for (i = 0; i < batch->count; i++) {
> + /* We need the whole data to send the packet on the device */
> + if (!dp_packet_is_linear(batch->packets[i])) {
> + dp_packet_linearize(batch->packets[i]);
> + }
> +
> const void *data = dp_packet_data(batch->packets[i]);
> size_t size = dp_packet_size(batch->packets[i]);
>
> diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
> index 6d0b2e2..df9d1dd 100644
> --- a/lib/netdev-dummy.c
> +++ b/lib/netdev-dummy.c
> @@ -233,7 +233,13 @@ dummy_packet_stream_run(struct netdev_dummy *dev,
> struct dummy_packet_stream *s)
>
> ASSIGN_CONTAINER(txbuf_node, ovs_list_front(&s->txq),
> list_node);
> txbuf = txbuf_node->pkt;
> - retval = stream_send(s->stream, dp_packet_data(txbuf),
> dp_packet_size(txbuf));
> +
> + if (!dp_packet_is_linear(txbuf)) {
> + dp_packet_linearize(txbuf);
> + }
> +
> + retval = stream_send(s->stream, dp_packet_data(txbuf),
> + dp_packet_size(txbuf));
>
> if (retval > 0) {
> dp_packet_pull(txbuf, retval);
> @@ -1088,6 +1094,11 @@ netdev_dummy_send(struct netdev *netdev, int
> qid OVS_UNUSED,
>
> struct dp_packet *packet;
> DP_PACKET_BATCH_FOR_EACH(i, packet, batch) {
> + /* We need the whole data to send the packet on the device */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> const void *buffer = dp_packet_data(packet);
> size_t size = dp_packet_size(packet);
>
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index f86dcd0..fa79b2a 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -1379,6 +1379,11 @@ netdev_linux_sock_batch_send(int sock, int
> ifindex,
>
> struct dp_packet *packet;
> DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
> + /* We need the whole data to send the packet on the device */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> iov[i].iov_base = dp_packet_data(packet);
> iov[i].iov_len = dp_packet_size(packet);
> mmsg[i].msg_hdr = (struct msghdr) { .msg_name = &sll,
> @@ -1432,8 +1437,14 @@ netdev_linux_tap_batch_send(struct netdev
> *netdev_,
> ssize_t retval;
> int error;
>
> + /* We need the whole data to send the packet on the device */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> do {
> - retval = write(netdev->tap_fd, dp_packet_data(packet),
> size);
> + retval = write(netdev->tap_fd, dp_packet_data(packet),
> + size);
> error = retval < 0 ? errno : 0;
> } while (error == EINTR);
>
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index 56baaa2..22eac35 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -65,13 +65,13 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet
> *packet, struct flow_tnl *tnl,
> void *nh;
> struct ip_header *ip;
> struct ovs_16aligned_ip6_hdr *ip6;
> - void *l4;
> + char *l4;
> int l3_size;
>
> - nh = dp_packet_l3(packet);
> + nh = dp_packet_l3(packet, sizeof *ip);
> ip = nh;
> ip6 = nh;
> - l4 = dp_packet_l4(packet);
> + l4 = dp_packet_l4(packet, sizeof *l4);
>
> if (!nh || !l4) {
> return NULL;
> @@ -79,15 +79,15 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet
> *packet, struct flow_tnl *tnl,
>
> *hlen = sizeof(struct eth_header);
>
> - l3_size = dp_packet_size(packet) -
> - ((char *)nh - (char *)dp_packet_data(packet));
> + l3_size = dp_packet_l3_size(packet);
>
> if (IP_VER(ip->ip_ihl_ver) == 4) {
>
> ovs_be32 ip_src, ip_dst;
>
> if (OVS_UNLIKELY(!dp_packet_ip_checksum_valid(packet))) {
> - if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
> + if (packet_csum(packet, packet->l3_ofs,
> + IP_IHL(ip->ip_ihl_ver) * 4)) {
> VLOG_WARN_RL(&err_rl, "ip packet has invalid
> checksum");
> return NULL;
> }
> @@ -191,15 +191,17 @@ udp_extract_tnl_md(struct dp_packet *packet,
> struct flow_tnl *tnl,
> if (OVS_UNLIKELY(!dp_packet_l4_checksum_valid(packet))) {
> uint32_t csum;
> if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) {
> - csum =
> packet_csum_pseudoheader6(dp_packet_l3(packet));
> + csum = packet_csum_pseudoheader6(
> + dp_packet_l3(packet,
> + sizeof(struct ip6_hdr)));
> } else {
> - csum =
> packet_csum_pseudoheader(dp_packet_l3(packet));
> + csum = packet_csum_pseudoheader(
> + dp_packet_l3(packet,
> + sizeof(struct ip_header)));
> }
>
> - csum = csum_continue(csum, udp, dp_packet_size(packet) -
> - ((const unsigned char *)udp -
> - (const unsigned char
> *)dp_packet_eth(packet)
> - ));
> + csum = packet_csum_continue(packet, csum, packet->l4_ofs,
> + dp_packet_l4_size(packet));
> if (csum_finish(csum)) {
> return NULL;
> }
> @@ -236,7 +238,7 @@ netdev_tnl_push_udp_header(const struct netdev
> *netdev OVS_UNUSED,
> csum =
> packet_csum_pseudoheader(netdev_tnl_ip_hdr(dp_packet_data(packet)));
> }
>
> - csum = csum_continue(csum, udp, ip_tot_size);
> + csum = packet_csum_continue(packet, csum, packet->l4_ofs,
> ip_tot_size);
> udp->udp_csum = csum_finish(csum);
>
> if (!udp->udp_csum) {
> @@ -373,9 +375,8 @@ parse_gre_header(struct dp_packet *packet,
> if (greh->flags & htons(GRE_CSUM)) {
> ovs_be16 pkt_csum;
>
> - pkt_csum = csum(greh, dp_packet_size(packet) -
> - ((const unsigned char *)greh -
> - (const unsigned char
> *)dp_packet_eth(packet)));
> + pkt_csum = packet_csum(packet, packet->l4_ofs,
> + dp_packet_l4_size(packet));
> if (pkt_csum) {
> return -EINVAL;
> }
> @@ -448,8 +449,9 @@ netdev_gre_push_header(const struct netdev
> *netdev,
> greh = netdev_tnl_push_ip_header(packet, data->header,
> data->header_len, &ip_tot_size);
>
> if (greh->flags & htons(GRE_CSUM)) {
> - ovs_be16 *csum_opt = (ovs_be16 *) (greh + 1);
> - *csum_opt = csum(greh, ip_tot_size);
> + greh = dp_packet_l4(packet, sizeof *greh);
> + ovs_be16 *csum_opt = (ovs_be16 *) greh;
> + *csum_opt = packet_csum(packet, packet->l4_ofs, ip_tot_size);
> }
>
> if (greh->flags & htons(GRE_SEQ)) {
> diff --git a/lib/odp-execute.c b/lib/odp-execute.c
> index 5831d1f..8fdc4f9 100644
> --- a/lib/odp-execute.c
> +++ b/lib/odp-execute.c
> @@ -70,7 +70,7 @@ static void
> odp_set_ipv4(struct dp_packet *packet, const struct ovs_key_ipv4
> *key,
> const struct ovs_key_ipv4 *mask)
> {
> - struct ip_header *nh = dp_packet_l3(packet);
> + struct ip_header *nh = dp_packet_l3(packet, sizeof(*nh));
> ovs_be32 ip_src_nh;
> ovs_be32 ip_dst_nh;
> ovs_be32 new_ip_src;
> @@ -140,7 +140,7 @@ static void
> odp_set_ipv6(struct dp_packet *packet, const struct ovs_key_ipv6
> *key,
> const struct ovs_key_ipv6 *mask)
> {
> - struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(packet);
> + struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(packet, sizeof
> *nh);
> struct in6_addr sbuf, dbuf;
> uint8_t old_tc = ntohl(get_16aligned_be32(&nh->ip6_flow)) >> 20;
> ovs_be32 old_fl = get_16aligned_be32(&nh->ip6_flow) &
> htonl(0xfffff);
> @@ -160,7 +160,7 @@ static void
> odp_set_tcp(struct dp_packet *packet, const struct ovs_key_tcp *key,
> const struct ovs_key_tcp *mask)
> {
> - struct tcp_header *th = dp_packet_l4(packet);
> + struct tcp_header *th = dp_packet_l4(packet, sizeof *th);
>
> if (OVS_LIKELY(th && dp_packet_get_tcp_payload(packet))) {
> packet_set_tcp_port(packet,
> @@ -173,7 +173,7 @@ static void
> odp_set_udp(struct dp_packet *packet, const struct ovs_key_udp *key,
> const struct ovs_key_udp *mask)
> {
> - struct udp_header *uh = dp_packet_l4(packet);
> + struct udp_header *uh = dp_packet_l4(packet, sizeof *uh);
>
> if (OVS_LIKELY(uh && dp_packet_get_udp_payload(packet))) {
> packet_set_udp_port(packet,
> @@ -186,7 +186,7 @@ static void
> odp_set_sctp(struct dp_packet *packet, const struct ovs_key_sctp
> *key,
> const struct ovs_key_sctp *mask)
> {
> - struct sctp_header *sh = dp_packet_l4(packet);
> + struct sctp_header *sh = dp_packet_l4(packet, sizeof *sh);
>
> if (OVS_LIKELY(sh && dp_packet_get_sctp_payload(packet))) {
> packet_set_sctp_port(packet,
> @@ -205,7 +205,7 @@ static void
> set_arp(struct dp_packet *packet, const struct ovs_key_arp *key,
> const struct ovs_key_arp *mask)
> {
> - struct arp_eth_header *arp = dp_packet_l3(packet);
> + struct arp_eth_header *arp = dp_packet_l3(packet, sizeof *arp);
>
> if (!mask) {
> arp->ar_op = key->arp_op;
> @@ -231,8 +231,16 @@ static void
> odp_set_nd(struct dp_packet *packet, const struct ovs_key_nd *key,
> const struct ovs_key_nd *mask)
> {
> - const struct ovs_nd_msg *ns = dp_packet_l4(packet);
> - const struct ovs_nd_lla_opt *lla_opt =
> dp_packet_get_nd_payload(packet);
> + const struct ovs_nd_msg *ns;
> + const struct ovs_nd_lla_opt *lla_opt;
> +
> +    /* To process neighbor discovery options, we need the whole
> packet */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> + ns = dp_packet_l4(packet, sizeof *ns);
> + lla_opt = dp_packet_get_nd_payload(packet);
>
> if (OVS_LIKELY(ns && lla_opt)) {
> int bytes_remain = dp_packet_l4_size(packet) - sizeof(*ns);
> @@ -275,7 +283,7 @@ static void
> odp_set_nsh(struct dp_packet *packet, const struct nlattr *a, bool
> has_mask)
> {
> struct ovs_key_nsh key, mask;
> - struct nsh_hdr *nsh = dp_packet_l3(packet);
> + struct nsh_hdr *nsh = dp_packet_l3(packet, sizeof *nsh);
> uint8_t mdtype = nsh_md_type(nsh);
> ovs_be32 path_hdr;
>
> @@ -522,7 +530,7 @@ odp_execute_masked_set_action(struct dp_packet
> *packet,
> break;
>
> case OVS_KEY_ATTR_MPLS:
> - mh = dp_packet_l2_5(packet);
> + mh = dp_packet_l2_5(packet, sizeof *mh);
> if (mh) {
> put_16aligned_be32(&mh->mpls_lse, nl_attr_get_be32(a)
> | (get_16aligned_be32(&mh->mpls_lse)
> diff --git a/lib/ofp-print.c b/lib/ofp-print.c
> index e05a969..37db260 100644
> --- a/lib/ofp-print.c
> +++ b/lib/ofp-print.c
> @@ -84,21 +84,21 @@ ofp_packet_to_string(const void *data, size_t len,
> ovs_be32 packet_type)
> l4_size = dp_packet_l4_size(&buf);
>
> if (flow.nw_proto == IPPROTO_TCP && l4_size >= TCP_HEADER_LEN) {
> - struct tcp_header *th = dp_packet_l4(&buf);
> + struct tcp_header *th = dp_packet_l4(&buf, sizeof *th);
> ds_put_format(&ds, " tcp_csum:%"PRIx16, ntohs(th->tcp_csum));
> } else if (flow.nw_proto == IPPROTO_UDP && l4_size >=
> UDP_HEADER_LEN) {
> - struct udp_header *uh = dp_packet_l4(&buf);
> + struct udp_header *uh = dp_packet_l4(&buf, sizeof *uh);
> ds_put_format(&ds, " udp_csum:%"PRIx16, ntohs(uh->udp_csum));
> } else if (flow.nw_proto == IPPROTO_SCTP && l4_size >=
> SCTP_HEADER_LEN) {
> - struct sctp_header *sh = dp_packet_l4(&buf);
> + struct sctp_header *sh = dp_packet_l4(&buf, sizeof *sh);
> ds_put_format(&ds, " sctp_csum:%"PRIx32,
> ntohl(get_16aligned_be32(&sh->sctp_csum)));
> } else if (flow.nw_proto == IPPROTO_ICMP && l4_size >=
> ICMP_HEADER_LEN) {
> - struct icmp_header *icmph = dp_packet_l4(&buf);
> + struct icmp_header *icmph = dp_packet_l4(&buf, sizeof
> *icmph);
> ds_put_format(&ds, " icmp_csum:%"PRIx16,
> ntohs(icmph->icmp_csum));
> } else if (flow.nw_proto == IPPROTO_ICMPV6 && l4_size >=
> ICMP6_HEADER_LEN) {
> - struct icmp6_header *icmp6h = dp_packet_l4(&buf);
> + struct icmp6_header *icmp6h = dp_packet_l4(&buf, sizeof
> *icmp6h);
> ds_put_format(&ds, " icmp6_csum:%"PRIx16,
> ntohs(icmp6h->icmp6_cksum));
> }
> diff --git a/lib/ovs-lldp.c b/lib/ovs-lldp.c
The changes to this file are style-only, which got me confused ;)
> index 05c1dd4..39d677a 100644
> --- a/lib/ovs-lldp.c
> +++ b/lib/ovs-lldp.c
> @@ -668,7 +668,8 @@ lldp_process_packet(struct lldp *lldp, const
> struct dp_packet *p)
> {
> if (lldp) {
> lldpd_recv(lldp->lldpd, lldpd_first_hardware(lldp->lldpd),
> - (char *) dp_packet_data(p), dp_packet_size(p));
> + (char *) dp_packet_data(p),
> + dp_packet_size(p));
> }
> }
>
> diff --git a/lib/packets.c b/lib/packets.c
> index 38bfb60..62f6901 100644
> --- a/lib/packets.c
> +++ b/lib/packets.c
> @@ -260,8 +260,8 @@ push_eth(struct dp_packet *packet, const struct
> eth_addr *dst,
> void
> pop_eth(struct dp_packet *packet)
> {
> - char *l2_5 = dp_packet_l2_5(packet);
> - char *l3 = dp_packet_l3(packet);
> + char *l2_5 = dp_packet_l2_5(packet, sizeof *l2_5);
> + char *l3 = dp_packet_l3(packet, sizeof *l3);
> ovs_be16 ethertype;
> int increment;
>
> @@ -292,10 +292,12 @@ set_ethertype(struct dp_packet *packet, ovs_be16
> eth_type)
>
> if (eth_type_vlan(eh->eth_type)) {
> ovs_be16 *p;
> - char *l2_5 = dp_packet_l2_5(packet);
> + char *l2_5 = dp_packet_l2_5(packet, sizeof *l2_5);
>
> p = ALIGNED_CAST(ovs_be16 *,
> - (l2_5 ? l2_5 : (char *)dp_packet_l3(packet))
> - 2);
> + (l2_5 ?
> + l2_5 :
> + (char *)dp_packet_l3(packet, sizeof *l2_5))
> - 2);
> *p = eth_type;
> } else {
> eh->eth_type = eth_type;
> @@ -359,7 +361,7 @@ set_mpls_lse(struct dp_packet *packet, ovs_be32
> mpls_lse)
> {
> /* Packet type should be MPLS to set label stack entry. */
> if (is_mpls(packet)) {
> - struct mpls_hdr *mh = dp_packet_l2_5(packet);
> + struct mpls_hdr *mh = dp_packet_l2_5(packet, sizeof *mh);
>
> /* Update mpls label stack entry. */
> put_16aligned_be32(&mh->mpls_lse, mpls_lse);
> @@ -401,7 +403,7 @@ void
> pop_mpls(struct dp_packet *packet, ovs_be16 ethtype)
> {
> if (is_mpls(packet)) {
> - struct mpls_hdr *mh = dp_packet_l2_5(packet);
> + struct mpls_hdr *mh = dp_packet_l2_5(packet, sizeof *mh);
> size_t len = packet->l2_5_ofs;
>
> set_ethertype(packet, ethtype);
> @@ -449,7 +451,7 @@ push_nsh(struct dp_packet *packet, const struct
> nsh_hdr *nsh_hdr_src)
> bool
> pop_nsh(struct dp_packet *packet)
> {
> - struct nsh_hdr *nsh = (struct nsh_hdr *) dp_packet_l3(packet);
> + struct nsh_hdr *nsh = (struct nsh_hdr *) dp_packet_l3(packet,
> sizeof *nsh);
> size_t length;
> uint32_t next_pt;
>
> @@ -975,16 +977,16 @@ void
> packet_set_ipv4_addr(struct dp_packet *packet,
> ovs_16aligned_be32 *addr, ovs_be32 new_addr)
> {
> - struct ip_header *nh = dp_packet_l3(packet);
> + struct ip_header *nh = dp_packet_l3(packet, sizeof *nh);
> ovs_be32 old_addr = get_16aligned_be32(addr);
> size_t l4_size = dp_packet_l4_size(packet);
>
> if (nh->ip_proto == IPPROTO_TCP && l4_size >= TCP_HEADER_LEN) {
> - struct tcp_header *th = dp_packet_l4(packet);
> + struct tcp_header *th = dp_packet_l4(packet, sizeof *th);
>
> th->tcp_csum = recalc_csum32(th->tcp_csum, old_addr,
> new_addr);
> } else if (nh->ip_proto == IPPROTO_UDP && l4_size >=
> UDP_HEADER_LEN ) {
> - struct udp_header *uh = dp_packet_l4(packet);
> + struct udp_header *uh = dp_packet_l4(packet, sizeof *uh);
>
> if (uh->udp_csum) {
> uh->udp_csum = recalc_csum32(uh->udp_csum, old_addr,
> new_addr);
> @@ -1007,12 +1009,20 @@ packet_rh_present(struct dp_packet *packet,
> uint8_t *nexthdr)
> const struct ovs_16aligned_ip6_hdr *nh;
> size_t len;
> size_t remaining;
> - uint8_t *data = dp_packet_l3(packet);
> + uint8_t *data;
>
> - remaining = packet->l4_ofs - packet->l3_ofs;
> + remaining = dp_packet_l3h_size(packet);
> if (remaining < sizeof *nh) {
> return false;
> }
> +
> + /* We will need the whole data for processing the headers below
> */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> + data = dp_packet_l3(packet, sizeof *nh);
> +
> nh = ALIGNED_CAST(struct ovs_16aligned_ip6_hdr *, data);
> data += sizeof *nh;
> remaining -= sizeof *nh;
> @@ -1088,11 +1098,11 @@ packet_update_csum128(struct dp_packet
> *packet, uint8_t proto,
> size_t l4_size = dp_packet_l4_size(packet);
>
> if (proto == IPPROTO_TCP && l4_size >= TCP_HEADER_LEN) {
> - struct tcp_header *th = dp_packet_l4(packet);
> + struct tcp_header *th = dp_packet_l4(packet, sizeof *th);
>
> th->tcp_csum = recalc_csum128(th->tcp_csum, addr, new_addr);
> } else if (proto == IPPROTO_UDP && l4_size >= UDP_HEADER_LEN) {
> - struct udp_header *uh = dp_packet_l4(packet);
> + struct udp_header *uh = dp_packet_l4(packet, sizeof *uh);
>
> if (uh->udp_csum) {
> uh->udp_csum = recalc_csum128(uh->udp_csum, addr,
> new_addr);
> @@ -1102,7 +1112,7 @@ packet_update_csum128(struct dp_packet *packet,
> uint8_t proto,
> }
> } else if (proto == IPPROTO_ICMPV6 &&
> l4_size >= sizeof(struct icmp6_header)) {
> - struct icmp6_header *icmp = dp_packet_l4(packet);
> + struct icmp6_header *icmp = dp_packet_l4(packet, sizeof
> *icmp);
>
> icmp->icmp6_cksum = recalc_csum128(icmp->icmp6_cksum, addr,
> new_addr);
> }
> @@ -1144,7 +1154,7 @@ void
> packet_set_ipv4(struct dp_packet *packet, ovs_be32 src, ovs_be32 dst,
> uint8_t tos, uint8_t ttl)
> {
> - struct ip_header *nh = dp_packet_l3(packet);
> + struct ip_header *nh = dp_packet_l3(packet, sizeof *nh);
>
> if (get_16aligned_be32(&nh->ip_src) != src) {
> packet_set_ipv4_addr(packet, &nh->ip_src, src);
> @@ -1180,7 +1190,7 @@ packet_set_ipv6(struct dp_packet *packet, const
> struct in6_addr *src,
> const struct in6_addr *dst, uint8_t key_tc, ovs_be32
> key_fl,
> uint8_t key_hl)
> {
> - struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(packet);
> + struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(packet, sizeof
> *nh);
> uint8_t proto = 0;
> bool rh_present;
>
> @@ -1215,7 +1225,7 @@ packet_set_port(ovs_be16 *port, ovs_be16
> new_port, ovs_be16 *csum)
> void
> packet_set_tcp_port(struct dp_packet *packet, ovs_be16 src, ovs_be16
> dst)
> {
> - struct tcp_header *th = dp_packet_l4(packet);
> + struct tcp_header *th = dp_packet_l4(packet, sizeof *th);
>
> packet_set_port(&th->tcp_src, src, &th->tcp_csum);
> packet_set_port(&th->tcp_dst, dst, &th->tcp_csum);
> @@ -1227,7 +1237,7 @@ packet_set_tcp_port(struct dp_packet *packet,
> ovs_be16 src, ovs_be16 dst)
> void
> packet_set_udp_port(struct dp_packet *packet, ovs_be16 src, ovs_be16
> dst)
> {
> - struct udp_header *uh = dp_packet_l4(packet);
> + struct udp_header *uh = dp_packet_l4(packet, sizeof *uh);
>
> if (uh->udp_csum) {
> packet_set_port(&uh->udp_src, src, &uh->udp_csum);
> @@ -1248,18 +1258,18 @@ packet_set_udp_port(struct dp_packet *packet,
> ovs_be16 src, ovs_be16 dst)
> void
> packet_set_sctp_port(struct dp_packet *packet, ovs_be16 src, ovs_be16
> dst)
> {
> - struct sctp_header *sh = dp_packet_l4(packet);
> + struct sctp_header *sh = dp_packet_l4(packet, sizeof *sh);
> ovs_be32 old_csum, old_correct_csum, new_csum;
> uint16_t tp_len = dp_packet_l4_size(packet);
>
> old_csum = get_16aligned_be32(&sh->sctp_csum);
> put_16aligned_be32(&sh->sctp_csum, 0);
> - old_correct_csum = crc32c((void *)sh, tp_len);
> + old_correct_csum = packet_crc32c(packet, packet->l4_ofs, tp_len);
>
> sh->sctp_src = src;
> sh->sctp_dst = dst;
>
> - new_csum = crc32c((void *)sh, tp_len);
> + new_csum = packet_crc32c(packet, packet->l4_ofs, tp_len);
> put_16aligned_be32(&sh->sctp_csum, old_csum ^ old_correct_csum ^
> new_csum);
> }
>
> @@ -1269,7 +1279,7 @@ packet_set_sctp_port(struct dp_packet *packet,
> ovs_be16 src, ovs_be16 dst)
> void
> packet_set_icmp(struct dp_packet *packet, uint8_t type, uint8_t code)
> {
> - struct icmp_header *ih = dp_packet_l4(packet);
> + struct icmp_header *ih = dp_packet_l4(packet, sizeof(*ih));
> ovs_be16 orig_tc = htons(ih->icmp_type << 8 | ih->icmp_code);
> ovs_be16 new_tc = htons(type << 8 | code);
>
> @@ -1293,7 +1303,12 @@ packet_set_nd(struct dp_packet *packet, const
> struct in6_addr *target,
> return;
> }
>
> - ns = dp_packet_l4(packet);
> + /* To process neighbor discovery options, we need the whole
> packet */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> + ns = dp_packet_l4(packet, sizeof *ns);
> opt = &ns->options[0];
> bytes_remain -= sizeof(*ns);
>
> @@ -1431,7 +1446,7 @@ compose_arp(struct dp_packet *b, uint16_t
> arp_op,
> eth->eth_dst = broadcast ? eth_addr_broadcast : arp_tha;
> eth->eth_src = arp_sha;
>
> - struct arp_eth_header *arp = dp_packet_l3(b);
> + struct arp_eth_header *arp = dp_packet_l3(b, sizeof *arp);
> arp->ar_op = htons(arp_op);
> arp->ar_sha = arp_sha;
> arp->ar_tha = arp_tha;
> @@ -1475,7 +1490,7 @@ compose_ipv6(struct dp_packet *packet, uint8_t
> proto,
> struct ip6_hdr *nh;
> void *data;
>
> - nh = dp_packet_l3(packet);
> + nh = dp_packet_l3(packet, sizeof *nh);
> nh->ip6_vfc = 0x60;
> nh->ip6_nxt = proto;
> nh->ip6_plen = htons(size);
> @@ -1514,9 +1529,10 @@ compose_nd_ns(struct dp_packet *b, const struct
> eth_addr eth_src,
> packet_set_nd(b, ipv6_dst, eth_src, eth_addr_zero);
>
> ns->icmph.icmp6_cksum = 0;
> - icmp_csum = packet_csum_pseudoheader6(dp_packet_l3(b));
> - ns->icmph.icmp6_cksum = csum_finish(
> - csum_continue(icmp_csum, ns, ND_MSG_LEN + ND_LLA_OPT_LEN));
> + icmp_csum = packet_csum_pseudoheader6(
> + dp_packet_l3(b, sizeof(struct ip6_hdr)));
> + ns->icmph.icmp6_cksum = csum_finish(packet_csum_continue(
> + b, icmp_csum, b->l4_ofs, ND_MSG_LEN + ND_LLA_OPT_LEN));
> }
>
> /* Compose an IPv6 Neighbor Discovery Neighbor Advertisement message.
> */
> @@ -1545,9 +1561,10 @@ compose_nd_na(struct dp_packet *b,
> packet_set_nd(b, ipv6_src, eth_addr_zero, eth_src);
>
> na->icmph.icmp6_cksum = 0;
> - icmp_csum = packet_csum_pseudoheader6(dp_packet_l3(b));
> - na->icmph.icmp6_cksum = csum_finish(csum_continue(
> - icmp_csum, na, ND_MSG_LEN + ND_LLA_OPT_LEN));
> + icmp_csum = packet_csum_pseudoheader6(
> + dp_packet_l3(b, sizeof(struct ip6_hdr)));
> + na->icmph.icmp6_cksum = csum_finish(packet_csum_continue(
> + b, icmp_csum, b->l4_ofs, ND_MSG_LEN + ND_LLA_OPT_LEN));
> }
>
> /* Compose an IPv6 Neighbor Discovery Router Advertisement message
> with
> @@ -1596,9 +1613,10 @@ compose_nd_ra(struct dp_packet *b,
> }
>
> ra->icmph.icmp6_cksum = 0;
> - uint32_t icmp_csum = packet_csum_pseudoheader6(dp_packet_l3(b));
> - ra->icmph.icmp6_cksum = csum_finish(csum_continue(
> - icmp_csum, ra, RA_MSG_LEN + ND_LLA_OPT_LEN + mtu_opt_len));
> + uint32_t icmp_csum = packet_csum_pseudoheader6(
> + dp_packet_l3(b, sizeof(struct ip6_hdr)));
> + ra->icmph.icmp6_cksum = csum_finish(packet_csum_continue(
> + b, icmp_csum, b->l4_ofs, RA_MSG_LEN + ND_LLA_OPT_LEN +
> mtu_opt_len));
> }
>
> /* Append an IPv6 Neighbor Discovery Prefix Information option to a
> @@ -1610,10 +1628,10 @@ packet_put_ra_prefix_opt(struct dp_packet *b,
> const ovs_be128 prefix)
> {
> size_t prev_l4_size = dp_packet_l4_size(b);
> - struct ip6_hdr *nh = dp_packet_l3(b);
> + struct ip6_hdr *nh = dp_packet_l3(b, sizeof *nh);
> nh->ip6_plen = htons(prev_l4_size + ND_PREFIX_OPT_LEN);
>
> - struct ovs_ra_msg *ra = dp_packet_l4(b);
> + struct ovs_ra_msg *ra = dp_packet_l4(b, sizeof *ra);
> struct ovs_nd_prefix_opt *prefix_opt =
> dp_packet_put_uninit(b, sizeof *prefix_opt);
> prefix_opt->type = ND_OPT_PREFIX_INFORMATION;
> @@ -1626,9 +1644,10 @@ packet_put_ra_prefix_opt(struct dp_packet *b,
> memcpy(prefix_opt->prefix.be32, prefix.be32,
> sizeof(ovs_be32[4]));
>
> ra->icmph.icmp6_cksum = 0;
> - uint32_t icmp_csum = packet_csum_pseudoheader6(dp_packet_l3(b));
> - ra->icmph.icmp6_cksum = csum_finish(csum_continue(
> - icmp_csum, ra, prev_l4_size + ND_PREFIX_OPT_LEN));
> + uint32_t icmp_csum = packet_csum_pseudoheader6(
> + dp_packet_l3(b, sizeof(struct ip6_hdr)));
> + ra->icmph.icmp6_cksum = csum_finish(packet_csum_continue(
> + b, icmp_csum, b->l4_ofs, prev_l4_size + ND_PREFIX_OPT_LEN));
> }
>
> uint32_t
> @@ -1680,16 +1699,64 @@ packet_csum_upperlayer6(const struct
> ovs_16aligned_ip6_hdr *ip6,
> }
> #endif
>
> +uint32_t
> +packet_csum_continue(const struct dp_packet *b, uint32_t partial,
> + uint16_t offset, size_t n)
> +{
> + char *ptr = NULL;
> + size_t rem = 0;
> + size_t size = 0;
> +
> + while (n > 1) {
> + rem = dp_packet_read_data(b, offset, n, (void *)&ptr, NULL);
> +
> + size = n - rem;
> + partial = csum_continue(partial, ptr, size);
> +
> + offset += size;
> + n = rem;
> + }
> +
> + return partial;
> +}
> +
> +ovs_be16
> +packet_csum(const struct dp_packet *b, uint16_t offset, size_t n)
> +{
> + return csum_finish(packet_csum_continue(b, 0, offset, n));
> +}
> +
> +ovs_be32
> +packet_crc32c(const struct dp_packet *b, uint16_t offset, size_t n)
> +{
> + char *ptr = NULL;
> + size_t rem = 0;
> + size_t size = 0;
> + uint32_t partial = 0xffffffffL;
> +
> + while (n > 1) {
> + rem = dp_packet_read_data(b, offset, n, (void *)&ptr, NULL);
> +
> + size = n - rem;
> + partial = crc32c_continue(partial, (uint8_t *) ptr, size);
> +
> + offset += size;
> + n = rem;
> + }
> +
> + return crc32c_finish(partial);
> +}
> +
> void
> IP_ECN_set_ce(struct dp_packet *pkt, bool is_ipv6)
> {
> if (is_ipv6) {
> - ovs_16aligned_be32 *ip6 = dp_packet_l3(pkt);
> + ovs_16aligned_be32 *ip6 = dp_packet_l3(pkt, sizeof *ip6);
>
> put_16aligned_be32(ip6, get_16aligned_be32(ip6) |
> htonl(IP_ECN_CE << 20));
> } else {
> - struct ip_header *nh = dp_packet_l3(pkt);
> + struct ip_header *nh = dp_packet_l3(pkt, sizeof *nh);
> uint8_t tos = nh->ip_tos;
>
> tos |= IP_ECN_CE;
> diff --git a/lib/packets.h b/lib/packets.h
> index 09a0ac3..9e7f5a1 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -1573,6 +1573,13 @@ void packet_put_ra_prefix_opt(struct dp_packet
> *,
> ovs_be32 preferred_lifetime,
> const ovs_be128 router_prefix);
> uint32_t packet_csum_pseudoheader(const struct ip_header *);
> +uint32_t
> +packet_csum_continue(const struct dp_packet *b, uint32_t partial,
> + uint16_t offset, size_t n);
> +ovs_be16
> +packet_csum(const struct dp_packet *b, uint16_t offset, size_t n);
> +ovs_be32
> +packet_crc32c(const struct dp_packet *b, uint16_t offset, size_t n);
> void IP_ECN_set_ce(struct dp_packet *pkt, bool is_ipv6);
>
> #define DNS_HEADER_LEN 12
> diff --git a/lib/pcap-file.c b/lib/pcap-file.c
> index 81a094c..7024931 100644
> --- a/lib/pcap-file.c
> +++ b/lib/pcap-file.c
> @@ -355,7 +355,7 @@ tcp_reader_run(struct tcp_reader *r, const struct
> flow *flow,
> || !l7) {
> return NULL;
> }
> - tcp = dp_packet_l4(packet);
> + tcp = dp_packet_l4(packet, sizeof *tcp);
> flags = TCP_FLAGS(tcp->tcp_ctl);
> l7_length = (char *) dp_packet_tail(packet) - l7;
> seq = ntohl(get_16aligned_be32(&tcp->tcp_seq));
> diff --git a/ofproto/ofproto-dpif-upcall.c
> b/ofproto/ofproto-dpif-upcall.c
> index 0cc964a..220b8c0 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -1381,12 +1381,18 @@ process_upcall(struct udpif *udpif, struct
> upcall *upcall,
> case SFLOW_UPCALL:
> if (upcall->sflow) {
> struct dpif_sflow_actions sflow_actions;
> + struct dp_packet *p = CONST_CAST(struct dp_packet *,
> packet);
>
> memset(&sflow_actions, 0, sizeof sflow_actions);
>
> actions_len = dpif_read_actions(udpif, upcall, flow,
> upcall->type,
> &sflow_actions);
> - dpif_sflow_received(upcall->sflow, packet, flow,
> + /* Gather the whole data */
> + if (!dp_packet_is_linear(p)) {
> + dp_packet_linearize(p);
> + }
> +
> + dpif_sflow_received(upcall->sflow, p, flow,
> flow->in_port.odp_port,
> &upcall->cookie,
> actions_len > 0 ? &sflow_actions :
> NULL);
> }
> @@ -1447,6 +1453,12 @@ process_upcall(struct udpif *udpif, struct
> upcall *upcall,
>
> const struct frozen_state *state = &recirc_node->state;
>
> + /* Gather the whole data */
> + struct dp_packet *p = CONST_CAST(struct dp_packet *,
> packet);
> + if (!dp_packet_is_linear(p)) {
> + dp_packet_linearize(p);
> + }
> +
> struct ofproto_async_msg *am = xmalloc(sizeof *am);
> *am = (struct ofproto_async_msg) {
> .controller_id = cookie->controller.controller_id,
> @@ -1454,9 +1466,9 @@ process_upcall(struct udpif *udpif, struct
> upcall *upcall,
> .pin = {
> .up = {
> .base = {
> - .packet = xmemdup(dp_packet_data(packet),
> -
> dp_packet_size(packet)),
> - .packet_len = dp_packet_size(packet),
> + .packet = xmemdup(dp_packet_data(p),
> + dp_packet_size(p)),
> + .packet_len = dp_packet_size(p),
> .reason = cookie->controller.reason,
> .table_id = state->table_id,
> .cookie = get_32aligned_be64(
> diff --git a/ofproto/ofproto-dpif-xlate.c
> b/ofproto/ofproto-dpif-xlate.c
> index 84cce81..efad792 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -1735,7 +1735,8 @@ stp_process_packet(const struct xport *xport,
> const struct dp_packet *packet)
> }
>
> if (dp_packet_try_pull(&payload, ETH_HEADER_LEN +
> LLC_HEADER_LEN)) {
> - stp_received_bpdu(sp, dp_packet_data(&payload),
> dp_packet_size(&payload));
> + stp_received_bpdu(sp, dp_packet_data(&payload),
> + dp_packet_size(&payload));
> }
> }
>
> @@ -1786,7 +1787,8 @@ rstp_process_packet(const struct xport *xport,
> const struct dp_packet *packet)
> }
>
> if (dp_packet_try_pull(&payload, ETH_HEADER_LEN +
> LLC_HEADER_LEN)) {
> - rstp_port_received_bpdu(xport->rstp_port,
> dp_packet_data(&payload),
> + rstp_port_received_bpdu(xport->rstp_port,
> + dp_packet_data(&payload),
> dp_packet_size(&payload));
> }
> }
> @@ -2556,11 +2558,10 @@ update_mcast_snooping_table4__(const struct
> xlate_ctx *ctx,
> {
> const struct igmp_header *igmp;
> int count;
> - size_t offset;
> ovs_be32 ip4 = flow->igmp_group_ip4;
>
> - offset = (char *) dp_packet_l4(packet) - (char *)
> dp_packet_data(packet);
> - igmp = dp_packet_at(packet, offset, IGMP_HEADER_LEN);
> + igmp = dp_packet_l4(packet, IGMP_HEADER_LEN);
> +
Are we sure this fits in one mbuf? What if the IP packet has a lot of
options?
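
To make the concern concrete: csum(igmp, dp_packet_l4_size(packet))
reads the whole L4 payload through a single pointer, which is only
valid if all of those bytes are contiguous. Below is a minimal,
standalone sketch (not OVS code) of an Internet checksum computed over
several segments, including a segment boundary that falls on an odd
byte. The struct seg type and the csum_segments() name are hypothetical
and only serve the illustration; the patch itself uses
packet_csum_continue() for this purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one segment of a multi-segment packet. */
struct seg {
    const uint8_t *data;
    size_t len;
};

/* Treats 'n_segs' segments as one logical byte stream of big-endian
 * 16-bit words and returns the complemented, folded Internet checksum
 * (RFC 1071) in host byte order. */
static uint16_t
csum_segments(const struct seg *segs, size_t n_segs)
{
    uint32_t sum = 0;
    uint8_t high = 0;
    int have_high = 0;  /* Set when a high byte is waiting for its pair. */

    for (size_t i = 0; i < n_segs; i++) {
        assert(segs[i].data || !segs[i].len);
        for (size_t j = 0; j < segs[i].len; j++) {
            uint8_t byte = segs[i].data[j];

            if (have_high) {
                /* Complete a word, possibly spanning two segments. */
                sum += ((uint32_t) high << 8) | byte;
                sum = (sum & 0xffff) + (sum >> 16);
                have_high = 0;
            } else {
                high = byte;
                have_high = 1;
            }
        }
    }
    if (have_high) {
        /* Odd total length: pad the last byte with a zero low byte. */
        sum += (uint32_t) high << 8;
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}
```

The pending-byte handling is the part that is easy to get wrong: a
per-segment checksum that pads each odd-length segment separately would
give a different result than a checksum over the linearized data.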
> if (!igmp || csum(igmp, dp_packet_l4_size(packet)) != 0) {
> xlate_report_debug(ctx, OFT_DETAIL,
> "multicast snooping received bad IGMP "
> @@ -2616,13 +2617,11 @@ update_mcast_snooping_table6__(const struct
> xlate_ctx *ctx,
> {
> const struct mld_header *mld;
> int count;
> - size_t offset;
>
> - offset = (char *) dp_packet_l4(packet) - (char *)
> dp_packet_data(packet);
> - mld = dp_packet_at(packet, offset, MLD_HEADER_LEN);
> + mld = dp_packet_l4(packet, MLD_HEADER_LEN);
>
> if (!mld ||
> - packet_csum_upperlayer6(dp_packet_l3(packet),
> + packet_csum_upperlayer6(dp_packet_l3(packet, sizeof(struct
> ip6_hdr)),
Will this work? I am asking because l3 might not start at the end of
ip6_hdr; there could be other next_headers before the l3 data.
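
As an illustration of how extension headers push the upper-layer data
past the fixed 40-byte IPv6 header, here is a small standalone sketch
(not OVS code) that walks the next-header chain. The function name
ip6_find_upper_layer() is hypothetical. It is deliberately simplified:
only hop-by-hop (0), routing (43), destination options (60) and
fragment (44) headers are skipped, AH and other headers are treated as
the upper layer, and fragment offsets are not inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IP6_HDR_LEN 40

/* Walks the extension headers that follow the fixed IPv6 header at
 * 'ip6' ('len' contiguous bytes).  On success stores the offset of the
 * upper-layer header in '*ofs_out' and its protocol in '*proto' and
 * returns true.  Returns false if the headers do not fit in 'len'. */
static bool
ip6_find_upper_layer(const uint8_t *ip6, size_t len,
                     size_t *ofs_out, uint8_t *proto)
{
    assert(ip6 && ofs_out && proto);
    if (len < IP6_HDR_LEN) {
        return false;
    }

    uint8_t next = ip6[6];      /* Next Header field of the fixed header. */
    size_t ofs = IP6_HDR_LEN;   /* Invariant: ofs <= len. */

    for (;;) {
        size_t ext_len;

        switch (next) {
        case 0:   /* Hop-by-hop options. */
        case 43:  /* Routing. */
        case 60:  /* Destination options. */
            if (len - ofs < 2) {
                return false;
            }
            /* Length is in 8-byte units, not counting the first 8. */
            ext_len = ((size_t) ip6[ofs + 1] + 1) * 8;
            break;
        case 44:  /* Fragment header has a fixed size. */
            ext_len = 8;
            break;
        default:
            *ofs_out = ofs;
            *proto = next;
            return true;
        }
        if (len - ofs < ext_len) {
            return false;
        }
        next = ip6[ofs];
        ofs += ext_len;
    }
}
```

With a single 8-byte hop-by-hop header in front of an ICMPv6 message,
the upper layer starts at offset 48 rather than 40, so a caller that
only asks for sizeof(struct ip6_hdr) contiguous bytes has no guarantee
about anything beyond the fixed header.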
> mld, IPPROTO_ICMPV6,
> dp_packet_l4_size(packet)) != 0) {
> xlate_report_debug(ctx, OFT_DETAIL, "multicast snooping
> received "
> @@ -2925,6 +2924,13 @@ xlate_normal(struct xlate_ctx *ctx)
> && is_ip_any(flow)) {
> struct mcast_snooping *ms = ctx->xbridge->ms;
> struct mcast_group *grp = NULL;
> + struct dp_packet *p = CONST_CAST(struct dp_packet *,
> + ctx->xin->packet);
> +
> + /* We will need the whole data for processing the packet
> below */
> + if (p && !dp_packet_is_linear(p)) {
> + dp_packet_linearize(p);
> + }
>
> if (is_igmp(flow, wc)) {
> /*
> @@ -3213,7 +3219,8 @@ process_special(struct xlate_ctx *ctx, const
> struct xport *xport)
> const struct flow *flow = &ctx->xin->flow;
> struct flow_wildcards *wc = ctx->wc;
> const struct xbridge *xbridge = ctx->xbridge;
> - const struct dp_packet *packet = ctx->xin->packet;
> + struct dp_packet *packet = CONST_CAST(struct dp_packet *,
> + ctx->xin->packet);
> enum slow_path_reason slow;
>
> if (!xport) {
> @@ -3225,6 +3232,11 @@ process_special(struct xlate_ctx *ctx, const
> struct xport *xport)
> slow = SLOW_CFM;
> } else if (xport->bfd && bfd_should_process_flow(xport->bfd,
> flow, wc)) {
> if (packet) {
> + /* Gather the whole data for further processing */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> bfd_process_packet(xport->bfd, flow, packet);
> /* If POLL received, immediately sends FINAL back. */
> if (bfd_should_send_packet(xport->bfd)) {
> @@ -3241,6 +3253,11 @@ process_special(struct xlate_ctx *ctx, const
> struct xport *xport)
> } else if ((xbridge->stp || xbridge->rstp) &&
> stp_should_process_flow(flow, wc)) {
> if (packet) {
> + /* Gather the whole data for further processing */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> xbridge->stp
> ? stp_process_packet(xport, packet)
> : rstp_process_packet(xport, packet);
> @@ -3248,6 +3265,11 @@ process_special(struct xlate_ctx *ctx, const
> struct xport *xport)
> slow = SLOW_STP;
> } else if (xport->lldp && lldp_should_process_flow(xport->lldp,
> flow)) {
> if (packet) {
> + /* Gather the whole data for further processing */
> + if (!dp_packet_is_linear(packet)) {
> + dp_packet_linearize(packet);
> + }
> +
> lldp_process_packet(xport->lldp, packet);
> }
> slow = SLOW_LLDP;
> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> index 69a8119..76ba9c0 100644
> --- a/ovn/controller/pinctrl.c
> +++ b/ovn/controller/pinctrl.c
> @@ -392,7 +392,7 @@ pinctrl_handle_arp(const struct flow *ip_flow,
> struct dp_packet *pkt_in,
> eth->eth_dst = ip_flow->dl_dst;
> eth->eth_src = ip_flow->dl_src;
>
> - struct arp_eth_header *arp = dp_packet_l3(&packet);
> + struct arp_eth_header *arp = dp_packet_l3(&packet, sizeof(*arp));
For all changes in this file we assume that the packet will fit in the
first mbuf. Are we sure this is always the case?
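
To spell out what the assumption means for callers, here is a
standalone sketch (not the patch's implementation) of a size-checked L3
accessor on a toy packet whose first segment is the only directly
addressable one. struct toy_pkt and toy_pkt_l3() are hypothetical
names. The point is that such an accessor returns NULL when the
requested bytes are not all in the first segment, so a caller that
dereferences the result unconditionally relies on the data fitting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified packet: only the first segment is directly
 * addressable through a plain pointer. */
struct toy_pkt {
    uint8_t *first_seg;
    size_t first_len;   /* Bytes held in the first segment. */
    size_t total_len;   /* Bytes across all segments. */
    size_t l3_ofs;      /* L3 offset from packet start, SIZE_MAX if unset. */
};

/* Returns a pointer to the L3 header if 'size' bytes starting there are
 * contiguous in the first segment, otherwise NULL. */
static void *
toy_pkt_l3(const struct toy_pkt *p, size_t size)
{
    assert(p && p->first_len <= p->total_len);
    if (p->l3_ofs == SIZE_MAX
        || p->l3_ofs > p->first_len
        || size > p->first_len - p->l3_ofs) {
        return NULL;
    }
    return p->first_seg + p->l3_ofs;
}
```

A caller would then either check the result for NULL or linearize the
packet first, depending on whether it needs only a header or the whole
payload.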
> arp->ar_op = htons(ARP_OP_REQUEST);
> arp->ar_sha = ip_flow->dl_src;
> put_16aligned_be32(&arp->ar_spa, ip_flow->nw_src);
> @@ -470,9 +470,10 @@ pinctrl_handle_icmp(const struct flow *ip_flow,
> struct dp_packet *pkt_in,
> ih->icmp6_base.icmp6_cksum = 0;
>
> uint8_t *data = dp_packet_put_zeros(&packet, sizeof *nh);
> - memcpy(data, dp_packet_l3(pkt_in), sizeof(*nh));
> + memcpy(data, dp_packet_l3(pkt_in, sizeof(*nh)), sizeof(*nh));
>
> - icmpv6_csum =
> packet_csum_pseudoheader6(dp_packet_l3(&packet));
> + icmpv6_csum = packet_csum_pseudoheader6(
> + dp_packet_l3(&packet, sizeof(struct
> ovs_16aligned_ip6_hdr)));
> ih->icmp6_base.icmp6_cksum = csum_finish(
> csum_continue(icmpv6_csum, ih,
> sizeof(*nh) + ICMP6_ERROR_HEADER_LEN));
> @@ -534,7 +535,7 @@ pinctrl_handle_tcp_reset(const struct flow
> *ip_flow, struct dp_packet *pkt_in,
> }
>
> struct tcp_header *th = dp_packet_put_zeros(&packet, sizeof *th);
> - struct tcp_header *tcp_in = dp_packet_l4(pkt_in);
> + struct tcp_header *tcp_in = dp_packet_l4(pkt_in,
> sizeof(*tcp_in));
> dp_packet_set_l4(&packet, th);
> th->tcp_ctl = TCP_CTL(TCP_RST, 5);
> if (ip_flow->tcp_flags & htons(TCP_ACK)) {
> @@ -713,7 +714,7 @@ pinctrl_handle_put_dhcp_opts(
>
> udp->udp_len = htons(new_l4_size);
>
> - struct ip_header *out_ip = dp_packet_l3(&pkt_out);
> + struct ip_header *out_ip = dp_packet_l3(&pkt_out,
> sizeof(*out_ip));
> out_ip->ip_tot_len = htons(pkt_out.l4_ofs - pkt_out.l3_ofs +
> new_l4_size);
> udp->udp_csum = 0;
> /* Checksum needs to be initialized to zero. */
> @@ -888,7 +889,7 @@ pinctrl_handle_put_dhcpv6_opts(
> goto exit;
> }
>
> - struct udp_header *in_udp = dp_packet_l4(pkt_in);
> + struct udp_header *in_udp = dp_packet_l4(pkt_in,
> sizeof(*in_udp));
> const uint8_t *in_dhcpv6_data =
> dp_packet_get_udp_payload(pkt_in);
> if (!in_udp || !in_dhcpv6_data) {
> VLOG_WARN_RL(&rl, "truncated dhcpv6 packet");
> @@ -1003,11 +1004,13 @@ pinctrl_handle_put_dhcpv6_opts(
> out_udp->udp_len = htons(new_l4_size);
> out_udp->udp_csum = 0;
>
> - struct ovs_16aligned_ip6_hdr *out_ip6 = dp_packet_l3(&pkt_out);
> + struct ovs_16aligned_ip6_hdr *out_ip6 = dp_packet_l3(&pkt_out,
> + sizeof
> *out_ip6);
> out_ip6->ip6_ctlun.ip6_un1.ip6_un1_plen = out_udp->udp_len;
>
> uint32_t csum;
> - csum = packet_csum_pseudoheader6(dp_packet_l3(&pkt_out));
> + csum = packet_csum_pseudoheader6(dp_packet_l3(&pkt_out,
> + sizeof(struct ovs_16aligned_ip6_hdr)));
> csum = csum_continue(csum, out_udp, dp_packet_size(&pkt_out) -
> ((const unsigned char *)out_udp -
> (const unsigned char
> *)dp_packet_eth(&pkt_out)));
> @@ -1095,7 +1098,7 @@ pinctrl_handle_dns_lookup(
> goto exit;
> }
>
> - struct udp_header *in_udp = dp_packet_l4(pkt_in);
> + struct udp_header *in_udp = dp_packet_l4(pkt_in, sizeof *in_udp);
> size_t udp_len = ntohs(in_udp->udp_len);
> size_t l4_len = dp_packet_l4_size(pkt_in);
> uint8_t *end = (uint8_t *)in_udp + MIN(udp_len, l4_len);
> @@ -1260,14 +1263,14 @@ pinctrl_handle_dns_lookup(
>
> struct eth_header *eth = dp_packet_data(&pkt_out);
> if (eth->eth_type == htons(ETH_TYPE_IP)) {
> - struct ip_header *out_ip = dp_packet_l3(&pkt_out);
> + struct ip_header *out_ip = dp_packet_l3(&pkt_out,
> sizeof(*out_ip));
> out_ip->ip_tot_len = htons(pkt_out.l4_ofs - pkt_out.l3_ofs
> + new_l4_size);
> /* Checksum needs to be initialized to zero. */
> out_ip->ip_csum = 0;
> out_ip->ip_csum = csum(out_ip, sizeof *out_ip);
> } else {
> - struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(&pkt_out);
> + struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(&pkt_out,
> sizeof(*nh));
> nh->ip6_plen = htons(new_l4_size);
>
> /* IPv6 needs UDP checksum calculated */
> @@ -2677,9 +2680,9 @@ pinctrl_handle_put_nd_ra_opts(
> dp_packet_put(&pkt_out, userdata->data, userdata->size);
>
> /* Set the IPv6 payload length and calculate the ICMPv6 checksum.
> */
> - struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(&pkt_out);
> + struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(&pkt_out,
> sizeof(*nh));
> nh->ip6_plen = htons(userdata->size);
> - struct ovs_ra_msg *ra = dp_packet_l4(&pkt_out);
> + struct ovs_ra_msg *ra = dp_packet_l4(&pkt_out, sizeof *ra);
> ra->icmph.icmp6_cksum = 0;
> uint32_t icmp_csum = packet_csum_pseudoheader6(nh);
> ra->icmph.icmp6_cksum = csum_finish(csum_continue(
> diff --git a/tests/test-conntrack.c b/tests/test-conntrack.c
> index 24d0bb4..e3cb75d 100644
> --- a/tests/test-conntrack.c
> +++ b/tests/test-conntrack.c
> @@ -46,7 +46,7 @@ prepare_packets(size_t n, bool change, unsigned tid,
> ovs_be16 *dl_type)
> dp_packet_put_hex(pkt, payload, NULL);
> flow_extract(pkt, &flow);
>
> - udp = dp_packet_l4(pkt);
> + udp = dp_packet_l4(pkt, sizeof *udp);
> udp->udp_src = htons(ntohs(udp->udp_src) + tid);
>
> if (change) {
> diff --git a/tests/test-rstp.c b/tests/test-rstp.c
> index 01aeaf8..f6adce8 100644
> --- a/tests/test-rstp.c
> +++ b/tests/test-rstp.c
> @@ -86,8 +86,12 @@ send_bpdu(struct dp_packet *pkt, void *port_, void
> *b_)
> assert(port_no < b->n_ports);
> lan = b->ports[port_no];
> if (lan) {
> - const void *data = dp_packet_l3(pkt);
> - size_t size = (char *) dp_packet_tail(pkt) - (char *) data;
> + if (!dp_packet_is_linear(pkt)) {
> + dp_packet_linearize(pkt);
> + }
> +
> + const char *data = dp_packet_l3(pkt, sizeof *data);
There is no NULL check on the value returned from dp_packet_l3(). I
guess it might be OK here, as it would assert in dp_packet_linearize().
> + size_t size = dp_packet_size(pkt) - pkt->l3_ofs;
> int i;
>
> for (i = 0; i < lan->n_conns; i++) {
> diff --git a/tests/test-stp.c b/tests/test-stp.c
> index c85c99d..5ed2377 100644
> --- a/tests/test-stp.c
> +++ b/tests/test-stp.c
> @@ -94,8 +94,12 @@ send_bpdu(struct dp_packet *pkt, int port_no, void
> *b_)
> assert(port_no < b->n_ports);
> lan = b->ports[port_no];
> if (lan) {
> - const void *data = dp_packet_l3(pkt);
> - size_t size = (char *) dp_packet_tail(pkt) - (char *) data;
> + if (!dp_packet_is_linear(pkt)) {
> + dp_packet_linearize(pkt);
> + }
> +
> + const char *data = dp_packet_l3(pkt, sizeof *data);
> + size_t size = dp_packet_size(pkt) - pkt->l3_ofs;
> int i;
>
> for (i = 0; i < lan->n_conns; i++) {
> --
> 2.7.4