Commit Graph

681 Commits

Author SHA1 Message Date
takt
d71301a47b Attach RTA_ENCAP_TYPE to children, not rtAttrs, when using Multipath 2019-11-13 10:34:27 -08:00
Daniel Borkmann
cbc6cb49af link, veth: fix stack corruption from retrieving peer index
For 4.20 and newer kernels VethPeerIndex() causes a stack corruption as
the kernel is copying more data to golang user space than originally
expected. This is due to a recent kernel commit where it extends veth
driver's ethtool stats for XDP:

  https://git.kernel.org/torvalds/c/d397b9682c1c808344dd93b43de8750fa4d9f581

The VethPeerIndex()'s logic is utterly wrong to assume ethtool stats are
never extended in the driver. Unfortunately there is no other way around
in golang than to add serialize/deserialize helpers to have a dynamically
sized ethtoolStats with a uint64 data array that has the size of the previous
result from the ETHTOOL_GSSET_INFO query. This ensures we don't run into
a buffer overflow triggered by kernel's copy_to_user() in ETHTOOL_GSTATS
query (ethtool_get_stats() in kernel). Now, for the deserialize operation
we really only care about the peer's ifindex which is always stored in
the first uint64.
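
A minimal sketch of the idea described above, not the code from this commit: the names `stats`, `serialize` and `peerIndex` are hypothetical, and little-endian byte order is assumed for illustration (real code would use the host's native order and pass the buffer to the ethtool ioctl). The buffer is sized from the counter count reported earlier, and only the first uint64 is read back.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ethtoolGStats is the ETHTOOL_GSTATS command number from linux/ethtool.h.
const ethtoolGStats = 0x1d

// stats is a hypothetical, dynamically sized stand-in for ethtool_stats:
// a fixed header (cmd, n_stats) followed by n_stats uint64 counters.
type stats struct {
	cmd    uint32
	nStats uint32
	data   []uint64
}

// serialize lays out the header plus room for every counter, so the
// kernel's copy_to_user() cannot write past the end of the buffer.
func (s *stats) serialize() []byte {
	buf := make([]byte, 8+8*len(s.data))
	binary.LittleEndian.PutUint32(buf[0:4], s.cmd)
	binary.LittleEndian.PutUint32(buf[4:8], s.nStats)
	for i, v := range s.data {
		binary.LittleEndian.PutUint64(buf[8+8*i:], v)
	}
	return buf
}

// peerIndex deserializes only the first counter, which for veth holds
// the peer's ifindex.
func peerIndex(buf []byte) (int, error) {
	if len(buf) < 16 {
		return 0, fmt.Errorf("short buffer: %d bytes", len(buf))
	}
	return int(binary.LittleEndian.Uint64(buf[8:16])), nil
}

func main() {
	n := uint32(5) // pretend the ETHTOOL_GSSET_INFO query reported 5 counters
	s := &stats{cmd: ethtoolGStats, nStats: n, data: make([]uint64, n)}
	buf := s.serialize()
	// Simulate the kernel filling in the first counter.
	binary.LittleEndian.PutUint64(buf[8:16], 42)
	idx, err := peerIndex(buf)
	fmt.Println(len(buf), idx, err)
}
```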

Fixes: 54ad9e3a4c ("Two new functions: LinkSetBondSlave and VethPeerIndex")
Reported-by: Jean Raby <jean@raby.sh>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: phob0s <git@phob0s.pl>
2019-11-13 10:31:03 -08:00
Daniel Borkmann
b9fd9670a1 link, veth: remove useless call to retrieve ethtool strings
It's not needed for retrieving the veth peer ifindex, and we already
get the set count via earlier ETHTOOL_GSSET_INFO call. Both are copying
veth_get_sset_count() up to user space in veth case (which is the only
user of this anyway).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-11-13 10:31:03 -08:00
J. Brandt Buckley
aad0baef28 Don't auto-set broadcast unless subnet larger than /31
Since [vishvananda/netlink#248](https://github.com/vishvananda/netlink/pull/248), adding an address automatically sets the broadcast if the broadcast address was not specified. This is undesirable when adding an IP with a prefixlen of /31 or /32. (Additional details in the issues linked below.)

This changes the behavior so that the broadcast is only automatically set if the subnet is a /30 or larger, that is, if the prefix length is 30 or less.

Issue reported in:

- https://github.com/vishvananda/netlink/issues/329
- https://github.com/vishvananda/netlink/issues/471

See also:

- [RFC 3021](http://tools.ietf.org/html/rfc3021)

Alternatives to this PR:

A. https://github.com/vishvananda/netlink/issues/472 - Adds `AddrAddWithoutCalculatedBroadcast`.
B. 9a85a619d2 - Breaking change to make auto-setting the broadcast address an opt-in feature.
C. already works - Suppress setting the broadcast when addr's broadcast address is set to `0.0.0.0`. (This works today, but I'm not sure the behavior can be relied upon as a public API.)
2019-11-13 10:28:39 -08:00
Tobias Klauser
e934999cd7 Add support for Go modules
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2019-11-07 09:00:16 -08:00
Takushi Fujiwara
2ec5bdc52b Change LinkSetMaster's master argument type. (*Bridge -> Link)
LinkSetMaster also works with Bond devices,
so this PR changes the type of the master argument to Link.
2019-09-30 07:54:47 -07:00
Takushi Fujiwara
ac5f4df047 Add support for parsing IFLA_BOND_ARP_IP_TARGET 2019-09-24 13:57:46 -07:00
Oleg Senin
6b3a223c53 Add ip6tnl support 2019-09-24 13:56:29 -07:00
Ihar Hrachyshka
07ace697be Introduce constants for known VF link states 2019-09-24 13:55:40 -07:00
Sam Gwydir
205d80393d Support setting link state for SR-IOV VFs 2019-09-24 13:55:40 -07:00
Takushi Fujiwara
205a160d2e Add bond slave information
This PR refers to PR@lebauce and adds some changes.
- Added some tests to retrieve bond slave information.
- Link.BondSlave is changed to LinkSlave interface.
- BondSlaveState.String() returns UPPER case. (same as iproute2)
- BondSlaveMiiStatus.String() returns UPPER case. (same as iproute2)
2019-09-16 08:52:39 -07:00
Laurent Bernaille
e906d22624 Add support for output-mark 2019-09-16 08:26:04 -07:00
Tobias Klauser
36d367fd76 Remove unused *_PROTO constants
These are unused since commit 941b4de9e1

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2019-09-13 09:58:27 -07:00
Juan-Luis de Sousa-Valadas Castaño
07130f38b9 Fix parsing of IFLA_GRE_COLLECT_METADATA 2019-09-12 09:05:07 -07:00
Nicolas Belouin
a1c9a648f7 neighSubscribeAt: AF_BRIDGE entries not listed when listExisting is true
When subscribing to neigh updates, the updates for all neigh
protocol families are received. However when listExisting is set,
the request is made with AF_UNSPEC family, this request does not
include AF_BRIDGE entries.

This patch adds a second request for AF_BRIDGE entries.

Add a test for an existing AF_BRIDGE entry and make expectNeighUpdate
take a slice of expected updates.

The test creates a VXLAN interface, as its AF_BRIDGE entries
look a lot like the usual ones.

Also add support for latest (2014+) neighbour attributes

NDA_MASTER was added back in 2014; it indicates whether a neigh
entry is linked to a master interface and gives the index of this interface.

The other entries, namely NDA_LINK_NETNSID and NDA_SRC_VNI were
added later and will need extra handling.

Signed-off-by: Nicolas Belouin <nicolas.belouin@gandi.net>
2019-08-23 11:29:04 -07:00
GopiKrishna Kodali
941b4de9e1 Read connection marking information from CT flow TLV 2019-08-23 11:20:03 -07:00
Takushi Fujiwara
254c8a89c5 Replace values defined in unix package.
- Replace the following values with their unix.* equivalents:
  AF_MPLS, RTA_NEWDST, RTA_ENCAP_TYPE, RTA_ENCAP
2019-08-23 11:17:48 -07:00
Naiming Shen
e825b754c0 Add Timestamp, Timeout to conntrack
Signed-off-by: Naiming Shen <naiming@zededa.com>
2019-08-12 12:01:13 -07:00
Adrian Chiris
46ae81cf70 Add support for IPoIB interfaces
- Add a new Link type, IPoIB, that exposes the following IPoIB attributes:
    * IFLA_IPOIB_PKEY
    * IFLA_IPOIB_MODE
    * IFLA_IPOIB_UMCAST
- Support Deserialize for IPoIB link attributes in LinkDeserialize()
- Support IPoIB attributes in LinkAdd()
2019-08-12 04:46:40 -07:00
Thomas Bucher
b4e9f47a11 Update netlink_unspecified.go
AddrReplace was missing, could not compile on OSX
2019-07-26 00:49:17 +02:00
Adrian Chiris
28720742a4 Add support for IFLA_VF_RATE
Today the netlink package supports Get/Set of a VF's max TX rate
via the IFLA_VF_TX_RATE netlink attribute.

This patch adds support for Get/Set of a VF's min and max TX rate
via the IFLA_VF_RATE netlink attribute.

- Add support to set min/max tx rate for VF via IFLA_VF_RATE
- Added IFLA_VF_RATE min/max tx rate attributes to netlink.VfInfo
  including parsing support in netlink.parseVfInfo()

NOTE: According to [1], IFLA_VF_RATE takes precedence over
      IFLA_VF_TX_RATE. Dealing with the coexistence of these
      netlink attributes is left for the user to handle.

[1] https://lists.openwall.net/netdev/2014/05/22/42
2019-07-25 03:38:53 +02:00
bingshen.wbs
14bd2e6fd2 support ipvlan flag
Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>
2019-07-25 03:37:08 +02:00
Daniel Borkmann
b1e9859792 netlink: enforce similar pid checks as in iproute2
iproute2's own netlink library asserts that the sockaddr sender pid
has to be the one of the kernel [0]. It also doesn't bail out on pid
mismatch but only skips the message instead. We've seen cases where
the latter had a pid 0; in such case we should skip to the next nl
message instead of hard bail out.
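
One reading of the checks described above, as a hedged sketch rather than the code from this commit: the `msg` type and `filter` function are hypothetical simplifications. A datagram whose sender is not the kernel (port ID 0) is rejected, while a message whose header pid does not match ours is skipped.

```go
package main

import "fmt"

// msg is a hypothetical, simplified view of a received netlink message.
type msg struct {
	senderPid uint32 // nl_pid of the sockaddr_nl the datagram came from
	hdrPid    uint32 // nlmsg_pid from the message header
	payload   string
}

// filter returns the payloads addressed to ourPid. It fails if a message
// did not come from the kernel and silently skips messages whose header
// pid does not match, instead of bailing out.
func filter(msgs []msg, ourPid uint32) ([]string, error) {
	var out []string
	for _, m := range msgs {
		if m.senderPid != 0 { // the kernel's port ID is 0
			return nil, fmt.Errorf("wrong sender pid %d, expected 0", m.senderPid)
		}
		if m.hdrPid != ourPid {
			continue // e.g. a header pid of 0: skip to the next message
		}
		out = append(out, m.payload)
	}
	return out, nil
}

func main() {
	msgs := []msg{
		{senderPid: 0, hdrPid: 1234, payload: "link eth0"},
		{senderPid: 0, hdrPid: 0, payload: "stray message"},
		{senderPid: 0, hdrPid: 1234, payload: "link eth1"},
	}
	fmt.Println(filter(msgs, 1234))
}
```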

  [0] https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/lib/libnetlink.c
      rtnl_dump_filter_l(), __rtnl_talk_iov()

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-18 17:26:53 -07:00
Przemyslaw Lal
43af4161ea Add support for setting VLAN QoS for VF links
This change adds support for setting VLAN QoS (priority) field for the
SR-IOV Virtual Function links.
2019-07-15 07:57:55 -07:00
Anton Aksola
df01369528 add support for link groups (IFLA_GROUP) 2019-07-13 20:34:27 -07:00
Farid Zakaria
2e4a68ee6c Add support for additional TC BPF filter attributes
In order to support BPF_SYSCALL `PROG_GET_FD_BY_ID`, the ID of the
eBPF program must be available.

Add the additional enumerations and handle them when parsing the BPF
filter.
2019-07-01 11:37:39 -07:00
Lorenz Bauer
a8241965b5 Allow replacing filters
Add a function FilterReplace, which mirrors the behaviour of
QdiscReplace, etc. This makes it possible to swap out filters
with a single netlink message.
2019-06-24 06:52:36 -07:00
Martynas Pumputis
99a56c251a veth: Set peer hardware addr when creating
This commit extends the LinkAdd function for Veth by allowing the
peer hardware addr to be specified.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
2019-06-18 07:33:17 -07:00
Parav Pandit
123a384710 Add an API to change net namespace of RDMA device
Add an API to change net namespace of RDMA device similar to

$ rdma dev set [DEV] netns NSNAME

Signed-off-by: Parav Pandit <parav@mellanox.com>
2019-06-13 22:19:36 -07:00
Sargun Dhillon
d50d15ce3f Set Link TX / RX Queues on Deserialization
This deserializes the tx queue, and rx queue count on link
deserialization. We already supported it on serialization.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2019-06-12 19:02:44 -07:00
eriknordmark
3a1f6536f6 Make AddrSubscribe more robust against kernel reporting errors 2019-06-10 08:41:22 -07:00
Parav Pandit
0f040b9e2c Add an API to set RDMA subsystem network namespace mode
Add an API to change the RDMA subsystem network namespace mode as either
shared or exclusive similar to

$ rdma system set netns { shared | exclusive }

Signed-off-by: Parav Pandit <parav@mellanox.com>
2019-06-07 21:21:07 -07:00
Parav Pandit
4666477197 Add an API to query RDMA subsystem net namespace mode
The RDMA subsystem can run in shared or exclusive mode with regard
to sharing RDMA devices among multiple network namespaces.

Add an API to query the kernel's mode, similar to the iproute2 command
$ rdma system show netns

Signed-off-by: Parav Pandit <parav@mellanox.com>
2019-06-07 21:21:07 -07:00
yandong.yan
c8c507c80e fix: fix ip rule goto bug 2019-06-03 19:20:42 -07:00
Archana Shinde
db99c040b9 tuntap: Return TunTapLink instead of GenericLink
For tuntap interfaces, return a TunTap Interface instead of
a Generic link when retrieving the interface.
Use netlink extended attributes to populate the Link attributes
for the tuntap link.
For older tun drivers that do not provide these
attributes, use sysfs to retrieve them.

This commit also adds Owner and Group attributes for the TunTap
Link.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2019-06-03 12:01:21 -07:00
Manohar Castelino
e99361632b Fix unit test failure: TestNeighAddDelLLIPAddr
TestNeighAddDelLLIPAddr was failing because the neighbour table
was not properly populated when using an ipip tunnel.
This matches the behaviour in the latest kernel when using
the ip command.

Switch the tunnel type to a gre point-to-multipoint tunnel.
The neighbour table gets properly populated in this case.

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2019-06-03 08:12:13 -07:00
Krister Johansen
fb5fbae935 Mirred and connmark clobber their ActionAttrs
Encountered this in a local test.  It turns out that in parseActions
mirred has a bug where it parses the action attributes but then on the
very next line overwrites this hard work by assigning an empty
ActionAttrs struct on top.  I copy pasta'd this into connmark.  Fix both
instances and amend the unit tests to catch this going forward.

Signed-off-by: Krister Johansen <krister.johansen@oracle.com>
2019-05-31 09:24:53 -07:00
Vishvananda Abrams
1187dc9297 Fix tests 2019-05-29 19:32:31 -07:00
Krister Johansen
00009fb860 Add support for TC_ACT_CONNMARK
Implements the connmark action described in tc-connmark(8)

Signed-off-by: Krister Johansen <krister.johansen@oracle.com>
2019-05-22 08:35:24 -07:00
soyking
fafc1e7b60 support vlan protocol 2019-05-03 14:23:34 -07:00
Parav Pandit
fd97bf4e47 Add command to set devlink device switchdev mode
A devlink device currently has legacy and switchdev modes.
Add an API to set the device mode of a discovered devlink device.

Signed-off-by: Parav Pandit <parav@mellanox.com>
2019-05-01 11:37:24 -07:00
Parav Pandit
bcb80b237c Add devlink command to get specific device by name
Add a command to get information about a specific devlink device
referenced by device name (bus, device).

Remove unused setupDevlinkKModule().

Signed-off-by: Parav Pandit <parav@mellanox.com>
2019-05-01 11:37:24 -07:00
CodeLingo Bot
f504738125 Fix function comments based on best practices from Effective Go
Signed-off-by: CodeLingo Bot <bot@codelingo.io>
2019-03-19 09:31:22 -07:00
Andrei Vagin
e281812e70 Fix typos
Signed-off-by: Andrei Vagin <avagin@google.com>
2019-03-19 08:22:03 -07:00
Andrei Vagin
adb577d4a4 Add support for IFLA_GSO_*
IFLA_GSO_MAX_SIZE - maximum GSO segment size
IFLA_GSO_MAX_SEGS - maximum number of GSO segments

Signed-off-by: Andrei Vagin <avagin@google.com>
2019-03-17 17:31:49 -07:00
Andrei Vagin
aa950f24b9 travis: run tests with Go 1.12.x
Signed-off-by: Andrei Vagin <avagin@google.com>
2019-03-17 17:31:49 -07:00
Andrei Vagin
b64d7bc44d travis: specify go_import_path
This allows travis to be enabled for forks.

Signed-off-by: Andrei Vagin <avagin@google.com>
2019-03-17 17:31:49 -07:00
Iskander Sharipov
b9cafe4a85 remove redundant type assertions in type switch
Use type switch var to get properly-typed value
inside case clauses.
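
A small illustration of the pattern, using hypothetical types (`rect`, `circle`) that are not taken from this repository: binding the type switch variable gives each case clause a value of the matched type, so no further assertion such as `s.(rect)` is needed.

```go
package main

import "fmt"

type rect struct{ w, h float64 }
type circle struct{ r float64 }

// describe binds v in the type switch; inside each case clause v
// already has the concrete type named by that clause.
func describe(s interface{}) string {
	switch v := s.(type) {
	case rect:
		return fmt.Sprintf("rect %gx%g", v.w, v.h)
	case circle:
		return fmt.Sprintf("circle r=%g", v.r)
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(rect{w: 2, h: 3}))
	fmt.Println(describe(circle{r: 1.5}))
	fmt.Println(describe(42))
}
```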

Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>
2019-02-06 11:24:39 -08:00
Matt Ellison
1e2e7ab670 Add Support for Virtual XFRM Interfaces
XFRM interfaces are available in Linux Kernel 4.19+

When an IF_ID is applied to an XFRM policy and state, the corresponding
traffic will be sent through the virtual interface with the same IF_ID.
2019-01-05 11:40:40 -08:00
Matt Ellison
48a75e0e38 Fix Race Condition in TestXfrmMonitorExpire 2019-01-04 09:44:57 -08:00