summaryrefslogtreecommitdiff
path: root/include/net/vxlan.h
AgeCommit message (Collapse)Author
2015-03-13vxlan: fix wrong usage of VXLAN_VID_MASKAlexey Kodanev
commit dfd8645ea1bd9127 wrongly assumes that VXLAN_VDI_MASK includes eight lower order reserved bits of VNI field that are using for remote checksum offload. Right now, when VNI number greater then 0xffff, vxlan_udp_encap_recv() will always return with 'bad_flag' error, reducing the usable vni range from 0..16777215 to 0..65535. Also, it doesn't really check whether RCO bits processed or not. Fix it by adding new VNI mask which has all 32 bits of VNI field: 24 bits for id and 8 bits for other usage. Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11vxlan: Use checksum partial with remote checksum offloadTom Herbert
Change remote checksum handling to set checksum partial as default behavior. Added an iflink parameter to configure not using checksum partial (calling csum_partial to update checksum). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-24vxlan: Eliminate dependency on UDP socket in transmit pathTom Herbert
In the vxlan transmit path there is no need to reference the socket for a tunnel which is needed for the receive side. We do, however, need the vxlan_dev flags. This patch eliminate references to the socket in the transmit path, and changes VXLAN_F_UNSHAREABLE to be VXLAN_F_RCV_FLAGS. This mask is used to store the flags applicable to receive (GBP, CSUM6_RX, and REMCSUM_RX) in the vxlan_sock flags. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15vxlan: Only bind to sockets with compatible flags enabledThomas Graf
A VXLAN net_device looking for an appropriate socket may only consider a socket which has a matching set of flags/extensions enabled. If incompatible flags are enabled, return a conflict to have the caller create a distinct socket with distinct port. The OVS VXLAN port is kept unaware of extensions at this point. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15vxlan: Group Policy extensionThomas Graf
Implements supports for the Group Policy VXLAN extension [0] to provide a lightweight and simple security label mechanism across network peers based on VXLAN. The security context and associated metadata is mapped to/from skb->mark. This allows further mapping to a SELinux context using SECMARK, to implement ACLs directly with nftables, iptables, OVS, tc, etc. The group membership is defined by the lower 16 bits of skb->mark, the upper 16 bits are used for flags. SELinux allows to manage label to secure local resources. However, distributed applications require ACLs to implemented across hosts. This is typically achieved by matching on L2-L4 fields to identify the original sending host and process on the receiver. On top of that, netlabel and specifically CIPSO [1] allow to map security contexts to universal labels. However, netlabel and CIPSO are relatively complex. This patch provides a lightweight alternative for overlay network environments with a trusted underlay. No additional control protocol is required. Host 1: Host 2: Group A Group B Group B Group A +-----+ +-------------+ +-------+ +-----+ | lxc | | SELinux CTX | | httpd | | VM | +--+--+ +--+----------+ +---+---+ +--+--+ \---+---/ \----+---/ | | +---+---+ +---+---+ | vxlan | | vxlan | +---+---+ +---+---+ +------------------------------+ Backwards compatibility: A VXLAN-GBP socket can receive standard VXLAN frames and will assign the default group 0x0000 to such frames. A Linux VXLAN socket will drop VXLAN-GBP frames. The extension is therefore disabled by default and needs to be specifically enabled: ip link add [...] type vxlan [...] gbp In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket must run on a separate port number. Examples: iptables: host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200 host2# iptables -I INPUT -m mark --mark 0x200 -j DROP OVS: # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL' # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop' [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy [1] http://lwn.net/Articles/204905/ Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14vxlan: Remote checksum offloadTom Herbert
Add support for remote checksum offload in VXLAN. This uses a reserved bit to indicate that RCO is being done, and uses the low order reserved eight bits of the VNI to hold the start and offset values in a compressed manner. Start is encoded in the low order seven bits of VNI. This is start >> 1 so that the checksum start offset is 0-254 using even values only. Checksum offset (transport checksum field) is indicated in the high order bit in the low order byte of the VNI. If the bit is set, the checksum field is for UDP (so offset = start + 6), else checksum field is for TCP (so offset = start + 16). Only TCP and UDP are supported in this implementation. Remote checksum offload for VXLAN is described in: https://tools.ietf.org/html/draft-herbert-vxlan-rco-00 Tested by running 200 TCP_STREAM connections with VXLAN (over IPv4). With UDP checksums and Remote Checksum Offload IPv4 Client 11.84% CPU utilization Server 12.96% CPU utilization 9197 Mbps IPv6 Client 12.46% CPU utilization Server 14.48% CPU utilization 8963 Mbps With UDP checksums, no remote checksum offload IPv4 Client 15.67% CPU utilization Server 14.83% CPU utilization 9094 Mbps IPv6 Client 16.21% CPU utilization Server 14.32% CPU utilization 9058 Mbps No UDP checksums IPv4 Client 15.03% CPU utilization Server 23.09% CPU utilization 9089 Mbps IPv6 Client 16.18% CPU utilization Server 26.57% CPU utilization 8954 Mbps Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12vxlan: Improve support for header flagsTom Herbert
This patch cleans up the header flags of VXLAN in anticipation of defining some new ones: - Move header related definitions from vxlan.c to vxlan.h - Change VXLAN_FLAGS to be VXLAN_HF_VNI (only currently defined flag) - Move check for unknown flags to after we find vxlan_sock, this assumes that some flags may be processed based on tunnel configuration - Add a comment about why the stack treating unknown set flags as an error instead of ignoring them Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-26net: Generalize ndo_gso_check to ndo_features_checkJesse Gross
GSO isn't the only offload feature with restrictions that potentially can't be expressed with the current features mechanism. Checksum is another although it's a general issue that could in theory apply to anything. Even if it may be possible to implement these restrictions in other ways, it can result in duplicate code or inefficient per-packet behavior. This generalizes ndo_gso_check so that drivers can remove any features that don't make sense for a given packet, similar to netif_skb_features(). It also converts existing driver restrictions to the new format, completing the work that was done to support tunnel protocols since the issues apply to checksums as well. By actually removing features from the set that are used to do offloading, it solves another problem with the existing interface. In these cases, GSO would run with the original set of features and not do anything because it appears that segmentation is not required. CC: Tom Herbert <therbert@google.com> CC: Joe Stringer <joestringer@nicira.com> CC: Eric Dumazet <edumazet@google.com> CC: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Tom Herbert <therbert@google.com> Fixes: 04ffcb255f22 ("net: Add ndo_gso_check") Tested-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-18vxlan: Inline vxlan_gso_check().Joe Stringer
Suggested-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14net: Add vxlan_gso_check() helperJoe Stringer
Most NICs that report NETIF_F_GSO_UDP_TUNNEL support VXLAN, and not other UDP-based encapsulation protocols where the format and size of the header differs. This patch implements a generic ndo_gso_check() for VXLAN which will only advertise GSO support when the skb looks like it contains VXLAN (or no UDP tunnelling at all). Implementation shamelessly stolen from Tom Herbert: http://thread.gmane.org/gmane.linux.network/332428/focus=333111 Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07vxlan: Call udp_flow_src_portTom Herbert
In vxlan and OVS vport-vxlan call common function to get source port for a UDP tunnel. Removed vxlan_src_port since the functionality is now in udp_flow_src_port. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04vxlan: Add support for UDP checksums (v4 sending, v6 zero csums)Tom Herbert
Added VXLAN link configuration for sending UDP checksums, and allowing TX and RX of UDP6 checksums. Also, call common iptunnel_handle_offloads and added GSO support for checksums. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24vxlan: add x-netns supportNicolas Dichtel
This patch allows to switch the netns when packet is encapsulated or decapsulated. The vxlan socket is openned into the i/o netns, ie into the netns where encapsulated packets are received. The socket lookup is done into this netns to find the corresponding vxlan tunnel. After decapsulation, the packet is injecting into the corresponding interface which may stand to another netns. When one of the two netns is removed, the tunnel is destroyed. Configuration example: ip netns add netns1 ip netns exec netns1 ip link set lo up ip link add vxlan10 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0 ip link set vxlan10 netns netns1 ip netns exec netns1 ip addr add 192.168.0.249/24 broadcast 192.168.0.255 dev vxlan10 ip netns exec netns1 ip link set vxlan10 up Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-21net: Add GRO support for vxlan trafficOr Gerlitz
Add GRO handlers for vxlann, by using the UDP GRO infrastructure. For single TCP session that goes through vxlan tunneling I got nice improvement from 6.8Gbs to 11.5Gbs --> UDP/VXLAN GRO disabled $ netperf -H 192.168.52.147 -c -C $ netperf -t TCP_STREAM -H 192.168.52.147 -c -C MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 65536 65536 10.00 6799.75 12.54 24.79 0.604 1.195 --> UDP/VXLAN GRO enabled $ netperf -t TCP_STREAM -H 192.168.52.147 -c -C MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 65536 65536 10.00 11562.72 24.90 20.34 0.706 0.577 Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29vxlan: Have the NIC drivers do less work for offloadsJoseph Gasparakis
This patch removes the burden from the NIC drivers to check if the vxlan driver is enabled in the kernel and also makes available the vxlan headrooms to them. Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-09-05vxlan: Notify drivers for listening UDP port changesJoseph Gasparakis
This patch adds two more ndo ops: ndo_add_rx_vxlan_port() and ndo_del_rx_vxlan_port(). Drivers can get notifications through the above functions about changes of the UDP listening port of VXLAN. Also, when physical ports come up, now they can call vxlan_get_rx_port() in order to obtain the port number(s) of the existing VXLAN interface in case they already up before them. This information about the listening UDP port would be used for VXLAN related offloads. A big thank you to John Fastabend (john.r.fastabend@intel.com) for his input and his suggestions on this patch set. CC: John Fastabend <john.r.fastabend@intel.com> CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-04vxlan: remove net arg from vxlan[6]_xmit_skb()Nicolas Dichtel
This argument is not used, let's remove it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-31vxlan: add ipv6 supportCong Wang
This patch adds IPv6 support to vxlan device, as the new version RFC already mentions it: http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-03 Cc: David Stevens <dlstevens@us.ibm.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20vxlan: Factor out vxlan send api.Pravin B Shelar
Following patch allows more code sharing between vxlan and ovs-vxlan. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20vxlan: Extend vxlan handlers for openvswitch.Pravin B Shelar
Following patch adds data field to vxlan socket and export vxlan handler api. vh->data is required to store private data per vxlan handler. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>