[ovs-dev] [PATCH v2] conntrack : Use Rx checksum offload feature on DPDK ports for conntrack.
Chandran, Sugesh
sugesh.chandran at intel.com
Tue Jun 27 10:47:30 UTC 2017
Hi Darrel,
Regards
_Sugesh
> -----Original Message-----
> From: Darrell Ball [mailto:dball at vmware.com]
> Sent: Tuesday, June 27, 2017 8:15 AM
> To: Chandran, Sugesh <sugesh.chandran at intel.com>
> Cc: dev at openvswitch.org
> Subject: Re: [PATCH v2] conntrack : Use Rx checksum offload feature on
> DPDK ports for conntrack.
>
> Hi Sugesh
>
> On 6/26/17, 12:51 AM, "Chandran, Sugesh" <sugesh.chandran at intel.com>
> wrote:
>
> Hi Darrel,
> Please find my answers below.
>
> Regards
> _Sugesh
>
> > -----Original Message-----
> > From: Darrell Ball [mailto:dball at vmware.com]
> > Sent: Saturday, June 24, 2017 10:39 PM
> > To: Chandran, Sugesh <sugesh.chandran at intel.com>;
> dev at openvswitch.org
> > Subject: Re: [PATCH v2] conntrack : Use Rx checksum offload feature on
> > DPDK ports for conntrack.
> >
> > Hi Sugesh
> >
> > I have a few questions:
> >
> > 1) There is no checking for bad hwol determined checksums If the dpdk
> > documentation is correct, then this is indicated by both good and bad
> flags
> > set for both L3 and L4.
> [Sugesh] I don’t think that’s the case. At least in my testing, the GOOD flag
> is set
> only when the checksum is good.
> If the checksum is BAD, the BAD flag is set, not both.
>
> Below is the full context I was referring to earlier:
> What is your interpretation of:
> 1) PKT_RX_IP_CKSUM_NONE, which equals PKT_RX_IP_CKSUM_MASK
> Packet is good but checksum needs to recalculated and updated in packet ?
> 2) PKT_RX_L4_CKSUM_NONE which PKT_RX_L4_CKSUM_MASK Packet is
> good but checksum needs to recalculated and updated in packet ?
> 3) The deprecated labellings of PKT_RX_IP_CKSUM_BAD and
> PKT_RX_L4_CKSUM_BAD ?
>
> **
> * Deprecated.
> * Checking this flag alone is deprecated: check the 2 bits of
> * PKT_RX_IP_CKSUM_MASK.
> * This flag was set when the IP checksum of a packet was detected as
> * wrong by the hardware.
> */
> #define PKT_RX_IP_CKSUM_BAD (1ULL << 4)
>
> /**
> * Mask of bits used to determine the status of RX IP checksum.
> * - PKT_RX_IP_CKSUM_UNKNOWN: no information about the RX IP
> checksum
> * - PKT_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong
> * - PKT_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid
> * - PKT_RX_IP_CKSUM_NONE: the IP checksum is not correct in the packet
> * data, but the integrity of the IP header is verified.
> */
> #define PKT_RX_IP_CKSUM_MASK ((1ULL << 4) | (1ULL << 7))
>
> #define PKT_RX_IP_CKSUM_UNKNOWN 0
> #define PKT_RX_IP_CKSUM_BAD (1ULL << 4)
> #define PKT_RX_IP_CKSUM_GOOD (1ULL << 7)
> #define PKT_RX_IP_CKSUM_NONE ((1ULL << 4) | (1ULL << 7))
>
>
> **
> * Deprecated.
> * Checking this flag alone is deprecated: check the 2 bits of
> * PKT_RX_L4_CKSUM_MASK.
> * This flag was set when the L4 checksum of a packet was detected as
> * wrong by the hardware.
> */
> #define PKT_RX_L4_CKSUM_BAD (1ULL << 3)
>
> /**
> * Mask of bits used to determine the status of RX L4 checksum.
> * - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4
> checksum
> * - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
> * - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
> * - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
> * data, but the integrity of the L4 data is verified.
> */
> #define PKT_RX_L4_CKSUM_MASK ((1ULL << 3) | (1ULL << 8))
>
> #define PKT_RX_L4_CKSUM_UNKNOWN 0
> #define PKT_RX_L4_CKSUM_BAD (1ULL << 3)
> #define PKT_RX_L4_CKSUM_GOOD (1ULL << 8)
> #define PKT_RX_L4_CKSUM_NONE ((1ULL << 3) | (1ULL << 8))
>
>
>
> > 2) If hwol can reliably indicate when a checksum is known to be bad, then
> we
> > should stop right there and say the packet is bad, which we don’t do
> here.
> > We are checking if it is not known to be good and if not, do checksum
> verify
> > in software.
> [Sugesh] The reason for implementing this way is, reduce number of
> validations
> So less cycles for validating the flag. Are you suggesting to validating the
> bad checksum
> flag even for the good case?
>
> No, assuming we can determine the checksum is bad reliably from hwol, we
> could check for bad only if not good. High probability of bad checksums might
> be an exploit attempt.
>
[Sugesh] Ah, Now I got it. You are right,
Yes we must use the flags with the mask to validate it.
>
> > I added a dp-packet API to check for bad, assuming I understood the rte
> > documentation and the documentation is accurate.
> > Please confirm if a bad checksum can reliably be determined by rte hwol.
> > 3) Now, if the checksum is not bad according to hwol, we can proceed to
> > check if it is known to be good by hwol. It seems that the existing dp-
> packet
> > API does not definitely check for good, since if the rte documentation is
> > correct, the good flag is set in both good and bad checksum cases, but in
> the
> > bad case, the bad flag is also set. I updated the API according to my
> reading
> > of the rte documentation.
> [Sugesh]
> the flag explanation in the DPDK code is shown as below.
>
>
> /**
> * Mask of bits used to determine the status of RX IP checksum.
> * - PKT_RX_IP_CKSUM_UNKNOWN: no information about the RX IP
> checksum
> * - PKT_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong
> * - PKT_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid
> * - PKT_RX_IP_CKSUM_NONE: the IP checksum is not correct in the
> packet
> * data, but the integrity of the IP header is verified.
> */
>
> /**
> * Mask of bits used to determine the status of RX L4 checksum.
> * - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4
> checksum
> * - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
> * - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
> * - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the
> packet
> * data, but the integrity of the L4 data is verified.
> */
>
> The BAD and GOOD is mutual exclusive. So will that be enough to validate
> only one.??
>
> Assuming we can determine the checksum is bad reliably from hwol, we
> could check for bad only if not good. High probability of bad checksums might
> be an exploit attempt.
>
>
> > Please verify if the documentation is correct and my reading is correct.
> > 4) If the checksum is not known to bad or good by hwol, then it is
> > undetermined and must be checked by OVS.
> >
> > I have some code below, which is an incremental from yours, that
> describes
> > ‘1-4’ above.
> > Let me know if my read of the rte documentation is correct and the
> > documentation itself is accurate. The diff includes a bit of coding standard
> > changes.
> >
> > diff --git a/lib/conntrack.c b/lib/conntrack.c index 38084e9..6f40df2
> 100644
> > --- a/lib/conntrack.c
> > +++ b/lib/conntrack.c
> > @@ -1459,8 +1459,6 @@ conn_key_extract(struct conntrack *ct, struct
> > dp_packet *pkt, ovs_be16 dl_type,
> > const char *l4 = dp_packet_l4(pkt);
> > const char *tail = dp_packet_tail(pkt);
> > bool ok;
> > - bool valid_l4_csum = 0;
> > - bool valid_l3_csum = 0;
> >
> > memset(ctx, 0, sizeof *ctx);
> >
> > @@ -1504,23 +1502,32 @@ conn_key_extract(struct conntrack *ct, struct
> > dp_packet *pkt, ovs_be16 dl_type,
> > */
> > ctx->key.dl_type = dl_type;
> > if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
> > - valid_l3_csum = dp_packet_ip_checksum_valid(pkt);
> > - /* Validate the checksum only when the csum is invalid */
> > - ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
> > - !valid_l3_csum);
> > + bool hwol_bad_l3_csum = dp_packet_ip_checksum_bad(pkt);
[Sugesh] Will add these changes in the next version of patch.
> > + if (hwol_bad_l3_csum) {
> > + ok = false;
> > + } else {
> > + bool hwol_good_l3_csum = dp_packet_ip_checksum_valid(pkt);
> > + /* Validate the checksum only when hwol is not supported. */
> > + ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
> > + !hwol_good_l3_csum);
> > + }
> > } else if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
> > ok = extract_l3_ipv6(&ctx->key, l3, tail - (char *) l3, NULL);
> > } else {
> > ok = false;
> > }
> >
> > +
> > if (ok) {
> > - valid_l4_csum = dp_packet_l4_checksum_valid(pkt);
> > - /* Validate the ckecksum only when the checksum is not valid/unset
> */
> > - if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
> > - !valid_l4_csum)) {
> > - ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
> > - return true;
> > + bool hwol_bad_l4_csum = dp_packet_l4_checksum_bad(pkt);
> > + if (!hwol_bad_l4_csum) {
> > + bool hwol_good_l4_csum = dp_packet_l4_checksum_valid(pkt);
> > + /* Validate the checksum only when hwol is not supported. */
> > + if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
> > + !hwol_good_l4_csum)) {
> > + ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
> > + return true;
> > + }
> > }
> > }
> >
> > diff --git a/lib/dp-packet.h b/lib/dp-packet.h index d2549b1..c5a7f1d
> 100644
> > --- a/lib/dp-packet.h
> > +++ b/lib/dp-packet.h
> > @@ -612,7 +612,19 @@ static inline bool
> > dp_packet_ip_checksum_valid(struct dp_packet *p) { #ifdef
> > DPDK_NETDEV
> > - return p->mbuf.ol_flags & PKT_RX_IP_CKSUM_GOOD;
> > + return (p->mbuf.ol_flags & PKT_RX_IP_CKSUM_MASK) ==
> > + PKT_RX_IP_CKSUM_GOOD;
> > +#else
> > + return 0 && p;
> > +#endif
> > +}
> > +
> > +static inline bool
> > +dp_packet_ip_checksum_bad(struct dp_packet *p) { #ifdef
> DPDK_NETDEV
> > + return (p->mbuf.ol_flags & PKT_RX_IP_CKSUM_MASK) ==
> > + PKT_RX_IP_CKSUM_MASK;
> > #else
> > return 0 && p;
> > #endif
> > @@ -622,7 +634,19 @@ static inline bool
> > dp_packet_l4_checksum_valid(struct dp_packet *p) { #ifdef
> > DPDK_NETDEV
> > - return p->mbuf.ol_flags & PKT_RX_L4_CKSUM_GOOD;
> > + return (p->mbuf.ol_flags & PKT_RX_L4_CKSUM_MASK) ==
> > + PKT_RX_L4_CKSUM_GOOD;
> > +#else
> > + return 0 && p;
> > +#endif
> > +}
> > +
> > +static inline bool
> > +dp_packet_l4_checksum_bad(struct dp_packet *p) { #ifdef
> DPDK_NETDEV
> > + return (p->mbuf.ol_flags & PKT_RX_L4_CKSUM_MASK) ==
> > + PKT_RX_L4_CKSUM_MASK;
> > #else
> > return 0 && p;
> > #endif
> >
> >
> >
> > ////////////////////////////////////////////////////////////////////
> >
> >
> > On 6/16/17, 4:02 AM, "Sugesh Chandran" <sugesh.chandran at intel.com>
> > wrote:
> >
> > Avoiding checksum validation in conntrack module if it is already
> verified
> > in DPDK physical NIC ports.
> >
> > Signed-off-by: Sugesh Chandran <sugesh.chandran at intel.com>
> > Acked-by: Antonio Fischetti <antonio.fischetti at intel.com>
> > Tested-by: Antonio Fischetti <antonio.fischetti at intel.com>
> >
> > ----
> > v1->v2
> > - Rebased on master
> > - Added acked-by and tested-by tags in commit message
> > ---
> > lib/conntrack.c | 60 ++++++++++++++++++++++++++++++++++++----
> -----
> > ------------
> > 1 file changed, 38 insertions(+), 22 deletions(-)
> >
> > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > index 90b154a..38084e9 100644
> > --- a/lib/conntrack.c
> > +++ b/lib/conntrack.c
> > @@ -1119,10 +1119,13 @@ extract_l3_ipv6(struct conn_key *key,
> const
> > void *data, size_t size,
> >
> > static inline bool
> > checksum_valid(const struct conn_key *key, const void *data, size_t
> size,
> > - const void *l3)
> > + const void *l3, bool validate_checksum)
> > {
> > uint32_t csum = 0;
> >
> > + if (!validate_checksum) {
> > + return true;
> > + }
> > if (key->dl_type == htons(ETH_TYPE_IP)) {
> > csum = packet_csum_pseudoheader(l3);
> > } else if (key->dl_type == htons(ETH_TYPE_IPV6)) {
> > @@ -1138,7 +1141,7 @@ checksum_valid(const struct conn_key *key,
> > const void *data, size_t size,
> >
> > static inline bool
> > check_l4_tcp(const struct conn_key *key, const void *data, size_t size,
> > - const void *l3)
> > + const void *l3, bool validate_checksum)
> > {
> > const struct tcp_header *tcp = data;
> > if (size < sizeof *tcp) {
> > @@ -1150,12 +1153,12 @@ check_l4_tcp(const struct conn_key *key,
> > const void *data, size_t size,
> > return false;
> > }
> >
> > - return checksum_valid(key, data, size, l3);
> > + return checksum_valid(key, data, size, l3, validate_checksum);
> > }
> >
> > static inline bool
> > check_l4_udp(const struct conn_key *key, const void *data, size_t
> size,
> > - const void *l3)
> > + const void *l3, bool validate_checksum)
> > {
> > const struct udp_header *udp = data;
> > if (size < sizeof *udp) {
> > @@ -1169,20 +1172,23 @@ check_l4_udp(const struct conn_key *key,
> > const void *data, size_t size,
> >
> > /* Validation must be skipped if checksum is 0 on IPv4 packets */
> > return (udp->udp_csum == 0 && key->dl_type ==
> htons(ETH_TYPE_IP))
> > - || checksum_valid(key, data, size, l3);
> > + || checksum_valid(key, data, size, l3, validate_checksum);
> > }
> >
> > static inline bool
> > -check_l4_icmp(const void *data, size_t size)
> > +check_l4_icmp(const void *data, size_t size, bool validate_checksum)
> > {
> > - return csum(data, size) == 0;
> > + if (validate_checksum) {
> > + return csum(data, size) == 0;
> > + }
> > + return true;
> > }
> >
> > static inline bool
> > check_l4_icmp6(const struct conn_key *key, const void *data, size_t
> size,
> > - const void *l3)
> > + const void *l3, bool validate_checksum)
> > {
> > - return checksum_valid(key, data, size, l3);
> > + return checksum_valid(key, data, size, l3, validate_checksum);
> > }
> >
> > static inline bool
> > @@ -1218,7 +1224,8 @@ extract_l4_udp(struct conn_key *key, const
> void
> > *data, size_t size)
> > }
> >
> > static inline bool extract_l4(struct conn_key *key, const void *data,
> > - size_t size, bool *related, const void *l3);
> > + size_t size, bool *related, const void *l3,
> > + bool validate_checksum);
> >
> > static uint8_t
> > reverse_icmp_type(uint8_t type)
> > @@ -1306,7 +1313,7 @@ extract_l4_icmp(struct conn_key *key, const
> void
> > *data, size_t size,
> > key->dst = inner_key.dst;
> > key->nw_proto = inner_key.nw_proto;
> >
> > - ok = extract_l4(key, l4, tail - l4, NULL, l3);
> > + ok = extract_l4(key, l4, tail - l4, NULL, l3, false);
> > if (ok) {
> > conn_key_reverse(key);
> > *related = true;
> > @@ -1396,7 +1403,7 @@ extract_l4_icmp6(struct conn_key *key,
> const
> > void *data, size_t size,
> > key->dst = inner_key.dst;
> > key->nw_proto = inner_key.nw_proto;
> >
> > - ok = extract_l4(key, l4, tail - l4, NULL, l3);
> > + ok = extract_l4(key, l4, tail - l4, NULL, l3, false);
> > if (ok) {
> > conn_key_reverse(key);
> > *related = true;
> > @@ -1421,22 +1428,23 @@ extract_l4_icmp6(struct conn_key *key,
> const
> > void *data, size_t size,
> > * in an ICMP error. In this case, we skip checksum and length
> validation.
> > */
> > static inline bool
> > extract_l4(struct conn_key *key, const void *data, size_t size, bool
> > *related,
> > - const void *l3)
> > + const void *l3, bool validate_checksum)
> > {
> > if (key->nw_proto == IPPROTO_TCP) {
> > - return (!related || check_l4_tcp(key, data, size, l3))
> > - && extract_l4_tcp(key, data, size);
> > + return (!related || check_l4_tcp(key, data, size, l3,
> > + validate_checksum)) && extract_l4_tcp(key, data, size);
> > } else if (key->nw_proto == IPPROTO_UDP) {
> > - return (!related || check_l4_udp(key, data, size, l3))
> > - && extract_l4_udp(key, data, size);
> > + return (!related || check_l4_udp(key, data, size, l3,
> > + validate_checksum)) && extract_l4_udp(key, data, size);
> > } else if (key->dl_type == htons(ETH_TYPE_IP)
> > && key->nw_proto == IPPROTO_ICMP) {
> > - return (!related || check_l4_icmp(data, size))
> > + return (!related || check_l4_icmp(data, size, validate_checksum))
> > && extract_l4_icmp(key, data, size, related);
> > } else if (key->dl_type == htons(ETH_TYPE_IPV6)
> > && key->nw_proto == IPPROTO_ICMPV6) {
> > - return (!related || check_l4_icmp6(key, data, size, l3))
> > - && extract_l4_icmp6(key, data, size, related);
> > + return (!related || check_l4_icmp6(key, data, size, l3,
> > + validate_checksum)) && extract_l4_icmp6(key, data, size,
> > + related);
> > } else {
> > return false;
> > }
> > @@ -1451,6 +1459,8 @@ conn_key_extract(struct conntrack *ct, struct
> > dp_packet *pkt, ovs_be16 dl_type,
> > const char *l4 = dp_packet_l4(pkt);
> > const char *tail = dp_packet_tail(pkt);
> > bool ok;
> > + bool valid_l4_csum = 0;
> > + bool valid_l3_csum = 0;
> >
> > memset(ctx, 0, sizeof *ctx);
> >
> > @@ -1494,7 +1504,10 @@ conn_key_extract(struct conntrack *ct,
> struct
> > dp_packet *pkt, ovs_be16 dl_type,
> > */
> > ctx->key.dl_type = dl_type;
> > if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
> > - ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL, true);
> > + valid_l3_csum = dp_packet_ip_checksum_valid(pkt);
> > + /* Validate the checksum only when the csum is invalid */
> > + ok = extract_l3_ipv4(&ctx->key, l3, tail - (char *) l3, NULL,
> > + !valid_l3_csum);
> > } else if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
> > ok = extract_l3_ipv6(&ctx->key, l3, tail - (char *) l3, NULL);
> > } else {
> > @@ -1502,7 +1515,10 @@ conn_key_extract(struct conntrack *ct,
> struct
> > dp_packet *pkt, ovs_be16 dl_type,
> > }
> >
> > if (ok) {
> > - if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3)) {
> > + valid_l4_csum = dp_packet_l4_checksum_valid(pkt);
> > + /* Validate the ckecksum only when the checksum is not
> valid/unset
> > */
> > + if (extract_l4(&ctx->key, l4, tail - l4, &ctx->related, l3,
> > + !valid_l4_csum)) {
> > ctx->hash = conn_key_hash(&ctx->key, ct->hash_basis);
> > return true;
> > }
> > --
> > 2.7.4
> >
> >
>
>
More information about the dev
mailing list