[ovs-dev] [PATCH v2] conntrack : Use Rx checksum offload feature on DPDK ports for conntrack.

Chandran, Sugesh sugesh.chandran at intel.com
Tue Jun 27 10:47:30 UTC 2017


Hi Darrel,

Regards
_Sugesh


> -----Original Message-----
> From: Darrell Ball [mailto:dball at vmware.com]
> Sent: Tuesday, June 27, 2017 8:15 AM
> To: Chandran, Sugesh <sugesh.chandran at intel.com>
> Cc: dev at openvswitch.org
> Subject: Re: [PATCH v2] conntrack : Use Rx checksum offload feature on
> DPDK ports for conntrack.
> 
> Hi Sugesh
> 
> On 6/26/17, 12:51 AM, "Chandran, Sugesh" <sugesh.chandran at intel.com>
> wrote:
> 
>     Hi Darrel,
>     Please find my answers below.
> 
>     Regards
>     _Sugesh
> 
>     > -----Original Message-----
>     > From: Darrell Ball [mailto:dball at vmware.com]
>     > Sent: Saturday, June 24, 2017 10:39 PM
>     > To: Chandran, Sugesh <sugesh.chandran at intel.com>;
> dev at openvswitch.org
>     > Subject: Re: [PATCH v2] conntrack : Use Rx checksum offload feature on
>     > DPDK ports for conntrack.
>     >
>     > Hi Sugesh
>     >
>     > I have a few questions:
>     >
>     > 1) There is no checking for bad hwol determined checksums If the dpdk
>     > documentation is correct, then this is indicated by both good and bad
> flags
>     > set for both L3 and L4.
>     [Sugesh] I don’t think that’s the case. At least in my testing, the GOOD flag
> is set
>     only when the checksum is good.
>     If the checksum is BAD, the BAD flag is set, not both.
> 
> Below is the full context I was referring to earlier:
> What is your interpretation of:
> 1) PKT_RX_IP_CKSUM_NONE, which equals PKT_RX_IP_CKSUM_MASK
> Packet is good but checksum needs to recalculated and updated in packet ?
> 2) PKT_RX_L4_CKSUM_NONE which PKT_RX_L4_CKSUM_MASK Packet is
> good but checksum needs to recalculated and updated in packet ?
> 3) The deprecated labellings of PKT_RX_IP_CKSUM_BAD and
> PKT_RX_L4_CKSUM_BAD  ?
> 
> **
>  * Deprecated.
>  * Checking this flag alone is deprecated: check the 2 bits of
>  * PKT_RX_IP_CKSUM_MASK.
>  * This flag was set when the IP checksum of a packet was detected as
>  * wrong by the hardware.
>  */
> #define PKT_RX_IP_CKSUM_BAD  (1ULL << 4)
> 
> /**
>  * Mask of bits used to determine the status of RX IP checksum.
>  * - PKT_RX_IP_CKSUM_UNKNOWN: no information about the RX IP
> checksum
>  * - PKT_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong
>  * - PKT_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid
>  * - PKT_RX_IP_CKSUM_NONE: the IP checksum is not correct in the packet
>  *   data, but the integrity of the IP header is verified.
>  */
> #define PKT_RX_IP_CKSUM_MASK ((1ULL << 4) | (1ULL << 7))
> 
> #define PKT_RX_IP_CKSUM_UNKNOWN 0
> #define PKT_RX_IP_CKSUM_BAD     (1ULL << 4)
> #define PKT_RX_IP_CKSUM_GOOD    (1ULL << 7)
> #define PKT_RX_IP_CKSUM_NONE    ((1ULL << 4) | (1ULL << 7))
> 
> 
> **
>  * Deprecated.
>  * Checking this flag alone is deprecated: check the 2 bits of
>  * PKT_RX_L4_CKSUM_MASK.
>  * This flag was set when the L4 checksum of a packet was detected as
>  * wrong by the hardware.
>  */
> #define PKT_RX_L4_CKSUM_BAD  (1ULL << 3)
> 
> /**
>  * Mask of bits used to determine the status of RX L4 checksum.
>  * - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4
> checksum
>  * - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
>  * - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
>  * - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
>  *   data, but the integrity of the L4 data is verified.
>  */
> #define PKT_RX_L4_CKSUM_MASK ((1ULL << 3) | (1ULL << 8))
> 
> #define PKT_RX_L4_CKSUM_UNKNOWN 0
> #define PKT_RX_L4_CKSUM_BAD     (1ULL << 3)
> #define PKT_RX_L4_CKSUM_GOOD    (1ULL << 8)
> #define PKT_RX_L4_CKSUM_NONE    ((1ULL << 3) | (1ULL << 8))
> 
> 
> 
>     > 2) If hwol can reliably indicate when a checksum is known to be bad, then
> we
>     > should stop right there and say the packet is bad, which we don’t do
> here.
>     > We are checking if it is not known to be good and if not, do checksum
> verify
>     > in software.
>     [Sugesh] The reason for implementing this way is, reduce number of
> validations
>     So less cycles for validating the flag.  Are you suggesting to validating the
> bad checksum
>     flag even for the good case?
> 
> No, assuming we can determine the checksum is bad reliably from hwol, we
> could check for bad only if not good. High probability of bad checksums might
> be an exploit attempt.
> 
[Sugesh] Ah, Now I got it. You are right,
Yes we must use the flags with the mask to validate it.
> 
>     > I added a dp-packet API to check for bad, assuming I understood the rte
>     > documentation and the documentation is accurate.
>     > Please confirm if a bad checksum can reliably be determined by rte hwol.
>     > 3) Now, if the checksum is not bad according to hwol, we can proceed to
>     > check if it is known to be good by hwol. It seems that the existing dp-
> packet
>     > API does not definitely check for good, since if the rte documentation is
>     > correct, the good flag is set in both good and bad checksum cases, but in
> the
>     > bad case, the bad flag is also set.  I updated the API according to my
> reading
>     > of the rte documentation.
>     [Sugesh]
>     the flag explanation in the DPDK code is shown as below.
> 
> 
>     /**
>      * Mask of bits used to determine the status of RX IP checksum.
>      * - PKT_RX_IP_CKSUM_UNKNOWN: no information about the RX IP
> checksum
>      * - PKT_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong
>      * - PKT_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid
>      * - PKT_RX_IP_CKSUM_NONE: the IP checksum is not correct in the
> packet
>      *   data, but the integrity of the IP header is verified.
>      */
> 
>     /**
>      * Mask of bits used to determine the status of RX L4 checksum.
>      * - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4
> checksum
>      * - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
>      * - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
>      * - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the
> packet
>      *   data, but the integrity of the L4 data is verified.
>      */
> 
>     The BAD and GOOD is mutual exclusive. So will that be enough to validate
>     only one.??
> 
> Assuming we can determine the checksum is bad reliably from hwol, we
> could check for bad only if not good. High probability of bad checksums might
> be an exploit attempt.
> 
> 
>     > Please verify if the documentation is correct and my reading is correct.
>     > 4) If the checksum is not known to bad or good by hwol, then it is
>     > undetermined and must be checked by OVS.
>     >
>     > I have some code below, which is an incremental from yours, that
> describes
>     > ‘1-4’ above.
>     > Let me know if my read of the rte documentation is correct and the
>     > documentation itself is accurate. The diff includes a bit of coding standard
>     > changes.
>     >
>     > diff --git a/lib/conntrack.c b/lib/conntrack.c index 38084e9..6f40df2
> 100644
>     > --- a/lib/conntrack.c
>     > +++ b/lib/conntrack.c
>     > @@ -1459,8 +1459,6 @@ conn_key_extract(struct conntrack *ct, struct
>     > dp_packet *pkt, ovs_be16 dl_type,
>     >      const char *l4 = dp_packet_l4(pkt);
>     >      const char *tail = dp_packet_tail(pkt);
>     >      bool ok;
>     > -    bool valid_l4_csum = 0;
>     > -    bool  valid_l3_csum = 0;
>     >
>     >      memset(ctx, 0, sizeof *ctx);
>     >
>     > @@ -1504,23 +1502,32 @@ conn_key_extract(struct conntrack *ct, struct
>     > dp_packet *pkt, ovs_be16 dl_type,
>     >       */
>     >      ctx->key.dl_type = dl_type;
>     >      if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
>     > -        valid_l3_csum = dp_packet_ip_checksum_valid(pkt);
>     > -        /* Validate the checksum only when the csum is invalid */
>     > -        ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
>     > -                             !valid_l3_csum);
>     > +        bool  hwol_bad_l3_csum = dp_packet_ip_checksum_bad(pkt);
[Sugesh] Will add these changes in the next version of patch.

>     > +        if (hwol_bad_l3_csum) {
>     > +            ok = false;
>     > +        } else {
>     > +            bool  hwol_good_l3_csum = dp_packet_ip_checksum_valid(pkt);
>     > +            /* Validate the checksum only when hwol is not supported. */
>     > +            ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
>     > +                                 !hwol_good_l3_csum);
>     > +        }
>     >      } else if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>     >          ok = extract_l3_ipv6(&ctx->key, l3, tail - (char *) l3, NULL);
>     >      } else {
>     >          ok = false;
>     >      }
>     >
>     > +
>     >      if (ok) {
>     > -        valid_l4_csum = dp_packet_l4_checksum_valid(pkt);
>     > -        /* Validate the ckecksum only when the checksum is not valid/unset
> */
>     > -        if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
>     > -            !valid_l4_csum)) {
>     > -            ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
>     > -            return true;
>     > +        bool hwol_bad_l4_csum = dp_packet_l4_checksum_bad(pkt);
>     > +        if (!hwol_bad_l4_csum) {
>     > +            bool  hwol_good_l4_csum = dp_packet_l4_checksum_valid(pkt);
>     > +            /* Validate the checksum only when hwol is not supported. */
>     > +            if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
>     > +                           !hwol_good_l4_csum)) {
>     > +                ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
>     > +                return true;
>     > +            }
>     >          }
>     >      }
>     >
>     > diff --git a/lib/dp-packet.h b/lib/dp-packet.h index d2549b1..c5a7f1d
> 100644
>     > --- a/lib/dp-packet.h
>     > +++ b/lib/dp-packet.h
>     > @@ -612,7 +612,19 @@ static inline bool
>     >  dp_packet_ip_checksum_valid(struct dp_packet *p)  {  #ifdef
>     > DPDK_NETDEV
>     > -    return p->mbuf.ol_flags & PKT_RX_IP_CKSUM_GOOD;
>     > +    return (p->mbuf.ol_flags & PKT_RX_IP_CKSUM_MASK) ==
>     > +            PKT_RX_IP_CKSUM_GOOD;
>     > +#else
>     > +    return 0 && p;
>     > +#endif
>     > +}
>     > +
>     > +static inline bool
>     > +dp_packet_ip_checksum_bad(struct dp_packet *p) { #ifdef
> DPDK_NETDEV
>     > +    return (p->mbuf.ol_flags & PKT_RX_IP_CKSUM_MASK) ==
>     > +            PKT_RX_IP_CKSUM_MASK;
>     >  #else
>     >      return 0 && p;
>     >  #endif
>     > @@ -622,7 +634,19 @@ static inline bool
>     >  dp_packet_l4_checksum_valid(struct dp_packet *p)  {  #ifdef
>     > DPDK_NETDEV
>     > -    return p->mbuf.ol_flags & PKT_RX_L4_CKSUM_GOOD;
>     > +    return (p->mbuf.ol_flags & PKT_RX_L4_CKSUM_MASK) ==
>     > +            PKT_RX_L4_CKSUM_GOOD;
>     > +#else
>     > +    return 0 && p;
>     > +#endif
>     > +}
>     > +
>     > +static inline bool
>     > +dp_packet_l4_checksum_bad(struct dp_packet *p) { #ifdef
> DPDK_NETDEV
>     > +    return (p->mbuf.ol_flags & PKT_RX_L4_CKSUM_MASK) ==
>     > +            PKT_RX_L4_CKSUM_MASK;
>     >  #else
>     >      return 0 && p;
>     >  #endif
>     >
>     >
>     >
>     > ////////////////////////////////////////////////////////////////////
>     >
>     >
>     > On 6/16/17, 4:02 AM, "Sugesh Chandran" <sugesh.chandran at intel.com>
>     > wrote:
>     >
>     >     Avoiding checksum validation in conntrack module if it is already
> verified
>     >     in DPDK physical NIC ports.
>     >
>     >     Signed-off-by: Sugesh Chandran <sugesh.chandran at intel.com>
>     >     Acked-by: Antonio Fischetti <antonio.fischetti at intel.com>
>     >     Tested-by: Antonio Fischetti <antonio.fischetti at intel.com>
>     >
>     >     ----
>     >     v1->v2
>     >      - Rebased on master
>     >      - Added acked-by and tested-by tags in commit message
>     >     ---
>     >      lib/conntrack.c | 60 ++++++++++++++++++++++++++++++++++++----
> -----
>     > ------------
>     >      1 file changed, 38 insertions(+), 22 deletions(-)
>     >
>     >     diff --git a/lib/conntrack.c b/lib/conntrack.c
>     >     index 90b154a..38084e9 100644
>     >     --- a/lib/conntrack.c
>     >     +++ b/lib/conntrack.c
>     >     @@ -1119,10 +1119,13 @@ extract_l3_ipv6(struct conn_key *key,
> const
>     > void *data, size_t size,
>     >
>     >      static inline bool
>     >      checksum_valid(const struct conn_key *key, const void *data, size_t
> size,
>     >     -               const void *l3)
>     >     +               const void *l3, bool validate_checksum)
>     >      {
>     >          uint32_t csum = 0;
>     >
>     >     +    if (!validate_checksum) {
>     >     +       return true;
>     >     +    }
>     >          if (key->dl_type == htons(ETH_TYPE_IP)) {
>     >              csum = packet_csum_pseudoheader(l3);
>     >          } else if (key->dl_type == htons(ETH_TYPE_IPV6)) {
>     >     @@ -1138,7 +1141,7 @@ checksum_valid(const struct conn_key *key,
>     > const void *data, size_t size,
>     >
>     >      static inline bool
>     >      check_l4_tcp(const struct conn_key *key, const void *data, size_t size,
>     >     -             const void *l3)
>     >     +             const void *l3, bool validate_checksum)
>     >      {
>     >          const struct tcp_header *tcp = data;
>     >          if (size < sizeof *tcp) {
>     >     @@ -1150,12 +1153,12 @@ check_l4_tcp(const struct conn_key *key,
>     > const void *data, size_t size,
>     >              return false;
>     >          }
>     >
>     >     -    return checksum_valid(key, data, size, l3);
>     >     +    return checksum_valid(key, data, size, l3, validate_checksum);
>     >      }
>     >
>     >      static inline bool
>     >      check_l4_udp(const struct conn_key *key, const void *data, size_t
> size,
>     >     -             const void *l3)
>     >     +             const void *l3, bool validate_checksum)
>     >      {
>     >          const struct udp_header *udp = data;
>     >          if (size < sizeof *udp) {
>     >     @@ -1169,20 +1172,23 @@ check_l4_udp(const struct conn_key *key,
>     > const void *data, size_t size,
>     >
>     >          /* Validation must be skipped if checksum is 0 on IPv4 packets */
>     >          return (udp->udp_csum == 0 && key->dl_type ==
> htons(ETH_TYPE_IP))
>     >     -           || checksum_valid(key, data, size, l3);
>     >     +           || checksum_valid(key, data, size, l3, validate_checksum);
>     >      }
>     >
>     >      static inline bool
>     >     -check_l4_icmp(const void *data, size_t size)
>     >     +check_l4_icmp(const void *data, size_t size, bool validate_checksum)
>     >      {
>     >     -    return csum(data, size) == 0;
>     >     +    if (validate_checksum) {
>     >     +        return csum(data, size) == 0;
>     >     +    }
>     >     +    return true;
>     >      }
>     >
>     >      static inline bool
>     >      check_l4_icmp6(const struct conn_key *key, const void *data, size_t
> size,
>     >     -               const void *l3)
>     >     +               const void *l3, bool validate_checksum)
>     >      {
>     >     -    return checksum_valid(key, data, size, l3);
>     >     +    return checksum_valid(key, data, size, l3, validate_checksum);
>     >      }
>     >
>     >      static inline bool
>     >     @@ -1218,7 +1224,8 @@ extract_l4_udp(struct conn_key *key, const
> void
>     > *data, size_t size)
>     >      }
>     >
>     >      static inline bool extract_l4(struct conn_key *key, const void *data,
>     >     -                              size_t size, bool *related, const void *l3);
>     >     +                              size_t size, bool *related, const void *l3,
>     >     +                              bool validate_checksum);
>     >
>     >      static uint8_t
>     >      reverse_icmp_type(uint8_t type)
>     >     @@ -1306,7 +1313,7 @@ extract_l4_icmp(struct conn_key *key, const
> void
>     > *data, size_t size,
>     >              key->dst = inner_key.dst;
>     >              key->nw_proto = inner_key.nw_proto;
>     >
>     >     -        ok = extract_l4(key, l4, tail - l4, NULL, l3);
>     >     +        ok = extract_l4(key, l4, tail - l4, NULL, l3, false);
>     >              if (ok) {
>     >                  conn_key_reverse(key);
>     >                  *related = true;
>     >     @@ -1396,7 +1403,7 @@ extract_l4_icmp6(struct conn_key *key,
> const
>     > void *data, size_t size,
>     >              key->dst = inner_key.dst;
>     >              key->nw_proto = inner_key.nw_proto;
>     >
>     >     -        ok = extract_l4(key, l4, tail - l4, NULL, l3);
>     >     +        ok = extract_l4(key, l4, tail - l4, NULL, l3, false);
>     >              if (ok) {
>     >                  conn_key_reverse(key);
>     >                  *related = true;
>     >     @@ -1421,22 +1428,23 @@ extract_l4_icmp6(struct conn_key *key,
> const
>     > void *data, size_t size,
>     >       * in an ICMP error.  In this case, we skip checksum and length
> validation.
>     > */
>     >      static inline bool
>     >      extract_l4(struct conn_key *key, const void *data, size_t size, bool
>     > *related,
>     >     -           const void *l3)
>     >     +           const void *l3, bool validate_checksum)
>     >      {
>     >          if (key->nw_proto == IPPROTO_TCP) {
>     >     -        return (!related || check_l4_tcp(key, data, size, l3))
>     >     -               && extract_l4_tcp(key, data, size);
>     >     +        return (!related || check_l4_tcp(key, data, size, l3,
>     >     +                validate_checksum)) && extract_l4_tcp(key, data, size);
>     >          } else if (key->nw_proto == IPPROTO_UDP) {
>     >     -        return (!related || check_l4_udp(key, data, size, l3))
>     >     -               && extract_l4_udp(key, data, size);
>     >     +        return (!related || check_l4_udp(key, data, size, l3,
>     >     +                validate_checksum)) && extract_l4_udp(key, data, size);
>     >          } else if (key->dl_type == htons(ETH_TYPE_IP)
>     >                     && key->nw_proto == IPPROTO_ICMP) {
>     >     -        return (!related || check_l4_icmp(data, size))
>     >     +        return (!related || check_l4_icmp(data, size, validate_checksum))
>     >                     && extract_l4_icmp(key, data, size, related);
>     >          } else if (key->dl_type == htons(ETH_TYPE_IPV6)
>     >                     && key->nw_proto == IPPROTO_ICMPV6) {
>     >     -        return (!related || check_l4_icmp6(key, data, size, l3))
>     >     -               && extract_l4_icmp6(key, data, size, related);
>     >     +        return (!related || check_l4_icmp6(key, data, size, l3,
>     >     +                validate_checksum)) && extract_l4_icmp6(key, data, size,
>     >     +                related);
>     >          } else {
>     >              return false;
>     >          }
>     >     @@ -1451,6 +1459,8 @@ conn_key_extract(struct conntrack *ct, struct
>     > dp_packet *pkt, ovs_be16 dl_type,
>     >          const char *l4 = dp_packet_l4(pkt);
>     >          const char *tail = dp_packet_tail(pkt);
>     >          bool ok;
>     >     +    bool valid_l4_csum = 0;
>     >     +    bool  valid_l3_csum = 0;
>     >
>     >          memset(ctx, 0, sizeof *ctx);
>     >
>     >     @@ -1494,7 +1504,10 @@ conn_key_extract(struct conntrack *ct,
> struct
>     > dp_packet *pkt, ovs_be16 dl_type,
>     >           */
>     >          ctx->key.dl_type = dl_type;
>     >          if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
>     >     -        ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL, true);
>     >     +        valid_l3_csum = dp_packet_ip_checksum_valid(pkt);
>     >     +        /* Validate the checksum only when the csum is invalid */
>     >     +        ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
>     >     +                             !valid_l3_csum);
>     >          } else if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>     >              ok = extract_l3_ipv6(&ctx->key, l3, tail - (char *) l3, NULL);
>     >          } else {
>     >     @@ -1502,7 +1515,10 @@ conn_key_extract(struct conntrack *ct,
> struct
>     > dp_packet *pkt, ovs_be16 dl_type,
>     >          }
>     >
>     >          if (ok) {
>     >     -        if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3)) {
>     >     +        valid_l4_csum = dp_packet_l4_checksum_valid(pkt);
>     >     +        /* Validate the ckecksum only when the checksum is not
> valid/unset
>     > */
>     >     +        if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
>     >     +            !valid_l4_csum)) {
>     >                  ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
>     >                  return true;
>     >              }
>     >     --
>     >     2.7.4
>     >
>     >
> 
> 



More information about the dev mailing list