[ovs-dev] [PATCH v2] conntrack : Use Rx checksum offload feature on DPDK ports for conntrack.

Chandran, Sugesh sugesh.chandran at intel.com
Mon Jun 26 07:51:20 UTC 2017


Hi Darrel,
Please find my answers below.

Regards
_Sugesh

> -----Original Message-----
> From: Darrell Ball [mailto:dball at vmware.com]
> Sent: Saturday, June 24, 2017 10:39 PM
> To: Chandran, Sugesh <sugesh.chandran at intel.com>; dev at openvswitch.org
> Subject: Re: [PATCH v2] conntrack : Use Rx checksum offload feature on
> DPDK ports for conntrack.
> 
> Hi Sugesh
> 
> I have a few questions:
> 
> 1) There is no checking for bad hwol determined checksums If the dpdk
> documentation is correct, then this is indicated by both good and bad flags
> set for both L3 and L4.
[Sugesh] I don’t think that’s the case. At least in my testing, the GOOD flag is set
only when the checksum is good.
If the checksum is BAD, the BAD flag is set, not both.
> 2) If hwol can reliably indicate when a checksum is known to be bad, then we
> should stop right there and say the packet is bad, which we don’t do here.
> We are checking if it is not known to be good and if not, do checksum verify
> in software.
[Sugesh] The reason for implementing this way is, reduce number of validations
So less cycles for validating the flag.  Are you suggesting to validating the bad checksum
flag even for the good case? 
> I added a dp-packet API to check for bad, assuming I understood the rte
> documentation and the documentation is accurate.
> Please confirm if a bad checksum can reliably be determined by rte hwol.
> 3) Now, if the checksum is not bad according to hwol, we can proceed to
> check if it is known to be good by hwol. It seems that the existing dp-packet
> API does not definitely check for good, since if the rte documentation is
> correct, the good flag is set in both good and bad checksum cases, but in the
> bad case, the bad flag is also set.  I updated the API according to my reading
> of the rte documentation.
[Sugesh] 
the flag explanation in the DPDK code is shown as below.


/**
 * Mask of bits used to determine the status of RX IP checksum.
 * - PKT_RX_IP_CKSUM_UNKNOWN: no information about the RX IP checksum
 * - PKT_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong
 * - PKT_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid
 * - PKT_RX_IP_CKSUM_NONE: the IP checksum is not correct in the packet
 *   data, but the integrity of the IP header is verified.
 */

/**
 * Mask of bits used to determine the status of RX L4 checksum.
 * - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum
 * - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
 * - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
 * - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
 *   data, but the integrity of the L4 data is verified.
 */

The BAD and GOOD is mutual exclusive. So will that be enough to validate
only one.??

> Please verify if the documentation is correct and my reading is correct.
> 4) If the checksum is not known to bad or good by hwol, then it is
> undetermined and must be checked by OVS.
> 
> I have some code below, which is an incremental from yours, that describes
> ‘1-4’ above.
> Let me know if my read of the rte documentation is correct and the
> documentation itself is accurate. The diff includes a bit of coding standard
> changes.
> 
> diff --git a/lib/conntrack.c b/lib/conntrack.c index 38084e9..6f40df2 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -1459,8 +1459,6 @@ conn_key_extract(struct conntrack *ct, struct
> dp_packet *pkt, ovs_be16 dl_type,
>      const char *l4 = dp_packet_l4(pkt);
>      const char *tail = dp_packet_tail(pkt);
>      bool ok;
> -    bool valid_l4_csum = 0;
> -    bool  valid_l3_csum = 0;
> 
>      memset(ctx, 0, sizeof *ctx);
> 
> @@ -1504,23 +1502,32 @@ conn_key_extract(struct conntrack *ct, struct
> dp_packet *pkt, ovs_be16 dl_type,
>       */
>      ctx->key.dl_type = dl_type;
>      if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
> -        valid_l3_csum = dp_packet_ip_checksum_valid(pkt);
> -        /* Validate the checksum only when the csum is invalid */
> -        ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
> -                             !valid_l3_csum);
> +        bool  hwol_bad_l3_csum = dp_packet_ip_checksum_bad(pkt);
> +        if (hwol_bad_l3_csum) {
> +            ok = false;
> +        } else {
> +            bool  hwol_good_l3_csum = dp_packet_ip_checksum_valid(pkt);
> +            /* Validate the checksum only when hwol is not supported. */
> +            ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
> +                                 !hwol_good_l3_csum);
> +        }
>      } else if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>          ok = extract_l3_ipv6(&ctx->key, l3, tail - (char *) l3, NULL);
>      } else {
>          ok = false;
>      }
> 
> +
>      if (ok) {
> -        valid_l4_csum = dp_packet_l4_checksum_valid(pkt);
> -        /* Validate the ckecksum only when the checksum is not valid/unset */
> -        if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
> -            !valid_l4_csum)) {
> -            ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
> -            return true;
> +        bool hwol_bad_l4_csum = dp_packet_l4_checksum_bad(pkt);
> +        if (!hwol_bad_l4_csum) {
> +            bool  hwol_good_l4_csum = dp_packet_l4_checksum_valid(pkt);
> +            /* Validate the checksum only when hwol is not supported. */
> +            if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
> +                           !hwol_good_l4_csum)) {
> +                ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
> +                return true;
> +            }
>          }
>      }
> 
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h index d2549b1..c5a7f1d 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -612,7 +612,19 @@ static inline bool
>  dp_packet_ip_checksum_valid(struct dp_packet *p)  {  #ifdef
> DPDK_NETDEV
> -    return p->mbuf.ol_flags & PKT_RX_IP_CKSUM_GOOD;
> +    return (p->mbuf.ol_flags & PKT_RX_IP_CKSUM_MASK) ==
> +            PKT_RX_IP_CKSUM_GOOD;
> +#else
> +    return 0 && p;
> +#endif
> +}
> +
> +static inline bool
> +dp_packet_ip_checksum_bad(struct dp_packet *p) { #ifdef DPDK_NETDEV
> +    return (p->mbuf.ol_flags & PKT_RX_IP_CKSUM_MASK) ==
> +            PKT_RX_IP_CKSUM_MASK;
>  #else
>      return 0 && p;
>  #endif
> @@ -622,7 +634,19 @@ static inline bool
>  dp_packet_l4_checksum_valid(struct dp_packet *p)  {  #ifdef
> DPDK_NETDEV
> -    return p->mbuf.ol_flags & PKT_RX_L4_CKSUM_GOOD;
> +    return (p->mbuf.ol_flags & PKT_RX_L4_CKSUM_MASK) ==
> +            PKT_RX_L4_CKSUM_GOOD;
> +#else
> +    return 0 && p;
> +#endif
> +}
> +
> +static inline bool
> +dp_packet_l4_checksum_bad(struct dp_packet *p) { #ifdef DPDK_NETDEV
> +    return (p->mbuf.ol_flags & PKT_RX_L4_CKSUM_MASK) ==
> +            PKT_RX_L4_CKSUM_MASK;
>  #else
>      return 0 && p;
>  #endif
> 
> 
> 
> ////////////////////////////////////////////////////////////////////
> 
> 
> On 6/16/17, 4:02 AM, "Sugesh Chandran" <sugesh.chandran at intel.com>
> wrote:
> 
>     Avoiding checksum validation in conntrack module if it is already verified
>     in DPDK physical NIC ports.
> 
>     Signed-off-by: Sugesh Chandran <sugesh.chandran at intel.com>
>     Acked-by: Antonio Fischetti <antonio.fischetti at intel.com>
>     Tested-by: Antonio Fischetti <antonio.fischetti at intel.com>
> 
>     ----
>     v1->v2
>      - Rebased on master
>      - Added acked-by and tested-by tags in commit message
>     ---
>      lib/conntrack.c | 60 ++++++++++++++++++++++++++++++++++++---------
> ------------
>      1 file changed, 38 insertions(+), 22 deletions(-)
> 
>     diff --git a/lib/conntrack.c b/lib/conntrack.c
>     index 90b154a..38084e9 100644
>     --- a/lib/conntrack.c
>     +++ b/lib/conntrack.c
>     @@ -1119,10 +1119,13 @@ extract_l3_ipv6(struct conn_key *key, const
> void *data, size_t size,
> 
>      static inline bool
>      checksum_valid(const struct conn_key *key, const void *data, size_t size,
>     -               const void *l3)
>     +               const void *l3, bool validate_checksum)
>      {
>          uint32_t csum = 0;
> 
>     +    if (!validate_checksum) {
>     +       return true;
>     +    }
>          if (key->dl_type == htons(ETH_TYPE_IP)) {
>              csum = packet_csum_pseudoheader(l3);
>          } else if (key->dl_type == htons(ETH_TYPE_IPV6)) {
>     @@ -1138,7 +1141,7 @@ checksum_valid(const struct conn_key *key,
> const void *data, size_t size,
> 
>      static inline bool
>      check_l4_tcp(const struct conn_key *key, const void *data, size_t size,
>     -             const void *l3)
>     +             const void *l3, bool validate_checksum)
>      {
>          const struct tcp_header *tcp = data;
>          if (size < sizeof *tcp) {
>     @@ -1150,12 +1153,12 @@ check_l4_tcp(const struct conn_key *key,
> const void *data, size_t size,
>              return false;
>          }
> 
>     -    return checksum_valid(key, data, size, l3);
>     +    return checksum_valid(key, data, size, l3, validate_checksum);
>      }
> 
>      static inline bool
>      check_l4_udp(const struct conn_key *key, const void *data, size_t size,
>     -             const void *l3)
>     +             const void *l3, bool validate_checksum)
>      {
>          const struct udp_header *udp = data;
>          if (size < sizeof *udp) {
>     @@ -1169,20 +1172,23 @@ check_l4_udp(const struct conn_key *key,
> const void *data, size_t size,
> 
>          /* Validation must be skipped if checksum is 0 on IPv4 packets */
>          return (udp->udp_csum == 0 && key->dl_type == htons(ETH_TYPE_IP))
>     -           || checksum_valid(key, data, size, l3);
>     +           || checksum_valid(key, data, size, l3, validate_checksum);
>      }
> 
>      static inline bool
>     -check_l4_icmp(const void *data, size_t size)
>     +check_l4_icmp(const void *data, size_t size, bool validate_checksum)
>      {
>     -    return csum(data, size) == 0;
>     +    if (validate_checksum) {
>     +        return csum(data, size) == 0;
>     +    }
>     +    return true;
>      }
> 
>      static inline bool
>      check_l4_icmp6(const struct conn_key *key, const void *data, size_t size,
>     -               const void *l3)
>     +               const void *l3, bool validate_checksum)
>      {
>     -    return checksum_valid(key, data, size, l3);
>     +    return checksum_valid(key, data, size, l3, validate_checksum);
>      }
> 
>      static inline bool
>     @@ -1218,7 +1224,8 @@ extract_l4_udp(struct conn_key *key, const void
> *data, size_t size)
>      }
> 
>      static inline bool extract_l4(struct conn_key *key, const void *data,
>     -                              size_t size, bool *related, const void *l3);
>     +                              size_t size, bool *related, const void *l3,
>     +                              bool validate_checksum);
> 
>      static uint8_t
>      reverse_icmp_type(uint8_t type)
>     @@ -1306,7 +1313,7 @@ extract_l4_icmp(struct conn_key *key, const void
> *data, size_t size,
>              key->dst = inner_key.dst;
>              key->nw_proto = inner_key.nw_proto;
> 
>     -        ok = extract_l4(key, l4, tail - l4, NULL, l3);
>     +        ok = extract_l4(key, l4, tail - l4, NULL, l3, false);
>              if (ok) {
>                  conn_key_reverse(key);
>                  *related = true;
>     @@ -1396,7 +1403,7 @@ extract_l4_icmp6(struct conn_key *key, const
> void *data, size_t size,
>              key->dst = inner_key.dst;
>              key->nw_proto = inner_key.nw_proto;
> 
>     -        ok = extract_l4(key, l4, tail - l4, NULL, l3);
>     +        ok = extract_l4(key, l4, tail - l4, NULL, l3, false);
>              if (ok) {
>                  conn_key_reverse(key);
>                  *related = true;
>     @@ -1421,22 +1428,23 @@ extract_l4_icmp6(struct conn_key *key, const
> void *data, size_t size,
>       * in an ICMP error.  In this case, we skip checksum and length validation.
> */
>      static inline bool
>      extract_l4(struct conn_key *key, const void *data, size_t size, bool
> *related,
>     -           const void *l3)
>     +           const void *l3, bool validate_checksum)
>      {
>          if (key->nw_proto == IPPROTO_TCP) {
>     -        return (!related || check_l4_tcp(key, data, size, l3))
>     -               && extract_l4_tcp(key, data, size);
>     +        return (!related || check_l4_tcp(key, data, size, l3,
>     +                validate_checksum)) && extract_l4_tcp(key, data, size);
>          } else if (key->nw_proto == IPPROTO_UDP) {
>     -        return (!related || check_l4_udp(key, data, size, l3))
>     -               && extract_l4_udp(key, data, size);
>     +        return (!related || check_l4_udp(key, data, size, l3,
>     +                validate_checksum)) && extract_l4_udp(key, data, size);
>          } else if (key->dl_type == htons(ETH_TYPE_IP)
>                     && key->nw_proto == IPPROTO_ICMP) {
>     -        return (!related || check_l4_icmp(data, size))
>     +        return (!related || check_l4_icmp(data, size, validate_checksum))
>                     && extract_l4_icmp(key, data, size, related);
>          } else if (key->dl_type == htons(ETH_TYPE_IPV6)
>                     && key->nw_proto == IPPROTO_ICMPV6) {
>     -        return (!related || check_l4_icmp6(key, data, size, l3))
>     -               && extract_l4_icmp6(key, data, size, related);
>     +        return (!related || check_l4_icmp6(key, data, size, l3,
>     +                validate_checksum)) && extract_l4_icmp6(key, data, size,
>     +                related);
>          } else {
>              return false;
>          }
>     @@ -1451,6 +1459,8 @@ conn_key_extract(struct conntrack *ct, struct
> dp_packet *pkt, ovs_be16 dl_type,
>          const char *l4 = dp_packet_l4(pkt);
>          const char *tail = dp_packet_tail(pkt);
>          bool ok;
>     +    bool valid_l4_csum = 0;
>     +    bool  valid_l3_csum = 0;
> 
>          memset(ctx, 0, sizeof *ctx);
> 
>     @@ -1494,7 +1504,10 @@ conn_key_extract(struct conntrack *ct, struct
> dp_packet *pkt, ovs_be16 dl_type,
>           */
>          ctx->key.dl_type = dl_type;
>          if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
>     -        ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL, true);
>     +        valid_l3_csum = dp_packet_ip_checksum_valid(pkt);
>     +        /* Validate the checksum only when the csum is invalid */
>     +        ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
>     +                             !valid_l3_csum);
>          } else if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>              ok = extract_l3_ipv6(&ctx->key, l3, tail - (char *) l3, NULL);
>          } else {
>     @@ -1502,7 +1515,10 @@ conn_key_extract(struct conntrack *ct, struct
> dp_packet *pkt, ovs_be16 dl_type,
>          }
> 
>          if (ok) {
>     -        if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3)) {
>     +        valid_l4_csum = dp_packet_l4_checksum_valid(pkt);
>     +        /* Validate the ckecksum only when the checksum is not valid/unset
> */
>     +        if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
>     +            !valid_l4_csum)) {
>                  ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
>                  return true;
>              }
>     --
>     2.7.4
> 
> 



More information about the dev mailing list