Wiki source code of Open vSwitch

Last modified by Sebastian Marsching on 2022/05/29 14:05

{{toc/}}

# Setting up Open vSwitch on Ubuntu 12.04 LTS

I basically followed an [existing tutorial for Ubuntu 12.04 LTS](http://blog.allanglesit.com/2012/03/linux-kvm-ubuntu-12-04-with-openvswitch/), but I also used some ideas from a [tutorial for Ubuntu 12.10](http://blog.allanglesit.com/2012/10/linux-kvm-ubuntu-12-10-with-openvswitch/).

First we install two packages needed for Open vSwitch (the other packages are pulled in automatically, because they are dependencies):

```bash
aptitude install openvswitch-brcompat openvswitch-controller
```

Next we add the `brcompat` module to `/etc/modules`. I am not entirely sure whether this is really necessary, but I found that it helped to keep the traditional bridge module from being loaded first.

We also have to enable the bridge compatibility layer by setting `BRCOMPAT=yes` in `/etc/default/openvswitch-switch`.
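The two configuration changes above can also be scripted. The following is only a minimal sketch: the helper names `ensure_line` and `set_brcompat` are made up for this example, and it assumes GNU `sed` and that an existing `BRCOMPAT=` setting (possibly commented out) sits on a line of its own.

```bash
#!/bin/bash

# Append a line to a file unless the file already contains exactly that line.
ensure_line() {
    local line="$1" file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >>"$file"
}

# Set BRCOMPAT=yes in a defaults file: rewrite an existing (possibly
# commented-out) BRCOMPAT line, or append one if there is none.
set_brcompat() {
    local file="$1"
    if grep -qE '^[#[:space:]]*BRCOMPAT=' "$file" 2>/dev/null; then
        sed -i -E 's/^[#[:space:]]*BRCOMPAT=.*/BRCOMPAT=yes/' "$file"
    else
        echo 'BRCOMPAT=yes' >>"$file"
    fi
}

# Usage (as root):
#   ensure_line brcompat /etc/modules
#   set_brcompat /etc/default/openvswitch-switch
```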

At this point it is a good idea to reboot, because the `brcompat` module cannot be loaded while the `bridge` module is loaded. We could probably just unload the module, but rebooting is easier and also confirms that the setup will still work after the next reboot.
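After the reboot it is worth checking which of the two modules actually got loaded. A minimal sketch, assuming the module names used above appear verbatim in `/proc/modules`; the helper name `module_loaded` is made up for this example.

```bash
#!/bin/bash

# Succeed if the named kernel module is listed in the module list
# (defaults to /proc/modules, where each line starts with the module name).
module_loaded() {
    local name="$1" list="${2:-/proc/modules}"
    grep -q "^${name} " "$list"
}

# Usage:
#   module_loaded brcompat || echo "brcompat is not loaded" >&2
#   module_loaded bridge && echo "the traditional bridge module is loaded" >&2
```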

Now everything should be ready for configuration. In this example we create a bridge `ovsbr0` that is connected to the physical interface `eth0`. This interface carries one untagged VLAN and several tagged VLANs (we show just one as an example). The untagged VLAN is also the one used for the management interface of the server. For each of the VLANs we create a bridge that provides the traffic from this VLAN untagged, so that we can bind a virtual machine's interface to each VLAN individually.

First we create the main bridge and the bridges for the individual VLANs:

```bash
ovs-vsctl add-br ovsbr0
ovs-vsctl add-br ovsbr0v1 ovsbr0 1 # Create a bridge for VLAN 1.
ovs-vsctl add-br ovsbr0v2 ovsbr0 2 # Create a bridge for VLAN 2.
```

Now, assuming that we want to use VLAN 1 for the management interface of the server, we add a port with this VLAN ID:

```bash
ovs-vsctl add-port ovsbr0 ovsbr0p1
ovs-vsctl set port ovsbr0p1 tag=1
ovs-vsctl set interface ovsbr0p1 type=internal
ovs-vsctl set interface ovsbr0p1 mac="00\:01\:02\:03\:04\:05"
```

Please note that some of the attributes are set in the _port_ table, while others are set in the _interface_ table. The MAC address shown here should be replaced by a proper random MAC address. The page about KVM describes [[how to generate a random MAC address|doc:Linux.KVM.WebHome|anchor="HGeneratingarandomMACaddress"]]. You do not have to set a MAC address explicitly, but if you do not, the MAC address will change after each reboot, which is typically not desirable for the network interface of a server.
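As a quick alternative to the method on the linked page, a random MAC address can be generated directly in the shell. This is a sketch that is not taken from that page; the function name `gen_mac` is made up, and it assumes `od` and `/dev/urandom` are available.

```bash
#!/bin/bash

# Print a random MAC address suitable for a virtual interface.
gen_mac() {
    # Read six random bytes as two-digit hex values into $1..$6.
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # First octet: set the locally administered bit (0x02) and clear the
    # multicast bit (0x01), so the result is a valid unicast address.
    printf '%02x:%s:%s:%s:%s:%s\n' "$(( (0x$1 | 0x02) & 0xfe ))" "$2" "$3" "$4" "$5" "$6"
}

gen_mac
```

Remember that the colons have to be escaped with backslashes when the address is passed to `ovs-vsctl set interface`, as in the command above.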

Now we change the network configuration in `/etc/network/interfaces`. We have to make sure that each virtual interface is brought up, even if we only use it as a bridge. We do this by bringing it up but disabling any IP configuration on it:

    auto eth0
    iface eth0 inet manual
        up ifconfig $IFACE 0.0.0.0 up
        up echo 1 >/proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
        down ifconfig $IFACE down

    auto ovsbr0
    iface ovsbr0 inet manual
        up ifconfig $IFACE 0.0.0.0 up
        up echo 1 >/proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
        down ifconfig $IFACE down

    auto ovsbr0p1
    # This is just a place-holder. Replace it with the proper configuration for the
    # management interface. Typically this is the configuration you had for eth0
    # before.
    iface ovsbr0p1 inet dhcp
    iface ovsbr0p1 inet6 auto

    auto ovsbr0v1
    iface ovsbr0v1 inet manual
        up ifconfig $IFACE 0.0.0.0 up
        up echo 1 >/proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
        down ifconfig $IFACE down

    auto ovsbr0v2
    iface ovsbr0v2 inet manual
        up ifconfig $IFACE 0.0.0.0 up
        up echo 1 >/proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
        down ifconfig $IFACE down

For some reason, the system does not detect that the network has already been configured when Open vSwitch is used, and thus delays startup. Therefore we modify `/etc/init/failsafe.conf` so that it does not wait for the network configuration to finish. You can do this by applying the following patch:

```patch
--- failsafe.conf.dpkg-dist 2013-01-18 22:17:33.000000000 +0100
+++ failsafe.conf 2013-01-18 22:18:08.000000000 +0100
@@ -29,10 +29,10 @@
  # the end of this script to avoid letting the system spin forever
  # waiting on it to start.
  $PLYMOUTH message --text="Waiting for network configuration..." || :
- sleep 40
+ sleep 1

- $PLYMOUTH message --text="Waiting up to 60 more seconds for network configuration..." || :
- sleep 59
+ $PLYMOUTH message --text="Waiting one more second for network configuration..." || :
+ sleep 1
  $PLYMOUTH message --text="Booting system without full network configuration..." || :

  # give user 1 second to see this message since plymouth will go
```

The last steps have to be performed directly at the server's console, because they will interrupt the network connection. We add `eth0` to the bridge and configure it for the right VLAN mode (VLAN 1 is untagged, all other VLANs are tagged):

```bash
ovs-vsctl add-port ovsbr0 eth0
ovs-vsctl set port eth0 tag=1
ovs-vsctl set port eth0 vlan_mode=native-untagged
```

That's it. After rebooting the server again, the network should be working and you can specify the bridges `ovsbr0v1` and `ovsbr0v2` in virtual-machine configurations.
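To check the result, the bridge layout can be printed with the query commands of `ovs-vsctl`. A minimal sketch, assuming it is run as root on the configured server; the function name `show_ovs_layout` is made up for this example. For the setup above, `ovsbr0v1` would be expected to report VLAN 1 with parent `ovsbr0`.

```bash
#!/bin/bash

# Print the switch overview and, for each bridge, its VLAN, parent and ports.
show_ovs_layout() {
    command -v ovs-vsctl >/dev/null 2>&1 || {
        echo "ovs-vsctl not found" >&2
        return 127
    }
    ovs-vsctl show || return
    local br
    for br in $(ovs-vsctl list-br); do
        echo "$br: vlan=$(ovs-vsctl br-to-vlan "$br") parent=$(ovs-vsctl br-to-parent "$br")"
        ovs-vsctl list-ports "$br" | sed 's/^/    port /'
    done
}

# Usage (as root):
#   show_ovs_layout
```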

# Setting up Open vSwitch on Ubuntu 14.04 LTS

The instructions are nearly the same as for Ubuntu 12.04 LTS, so I only mention the differences.

Instead of installing `openvswitch-brcompat` and `openvswitch-controller`, install `openvswitch-switch`. You also do not have to enable the `brcompat` module.

You also do not have to make the changes to `failsafe.conf`. The system will boot fine without those changes.

# {{id name="fail-over-interface"/}}Using Open vSwitch for a high-availability / fail-over interface

A simple HA setup for an IP address can easily be created using [Pacemaker](http://clusterlabs.org/) and the `ocf:heartbeat:IPaddr2` and `ocf:heartbeat:IPv6addr` scripts. However, this kind of setup has one weakness: during fail-over, the MAC address changes, because the IP address is now associated with a different computer and thus a different NIC. This can cause problems with stale entries in ARP tables. Linux systems typically deal with this correctly (they see the unsolicited ARP message and update their caches), but some other operating systems or dedicated network equipment might cause trouble. For example, I had problems with the ARP cache of a Netgear GSM7328v2 switch, which could only be resolved by waiting a long time or by manually clearing the ARP cache. Neither option is viable for an HA setup, where fail-over has to happen automatically and within seconds.

Therefore, it is desirable to keep the MAC address and transfer it together with the IP address. However, Linux does not allow more than one MAC address on a single interface and (to my knowledge) does not allow explicit configuration of MAC addresses on bridges. Luckily, the latter limitation does not apply to Open vSwitch bridges: each interface associated with a port of a bridge can be explicitly configured with a MAC address. This way, we can dynamically add or remove a port that carries the MAC address for which we want the fail-over setup.

We use the following commands to create the OVS bridge and add the NIC as a port. In our example, the bridge has the name `ovsbr0` and the NIC has the name `eth0`. We assume that no tagged VLANs are used.

```bash
ovs-vsctl add-br ovsbr0
ovs-vsctl add-port ovsbr0 eth0
```

In `/etc/network/interfaces` we create the following configuration:

```
auto eth0
iface eth0 inet manual
    up ip link set dev $IFACE up
    up sysctl -q -w net.ipv6.conf.$IFACE.disable_ipv6=1
    down ip link set dev $IFACE down

auto ovsbr0
iface ovsbr0 inet manual
    up ip link set dev $IFACE up
    up sysctl -q -w net.ipv6.conf.$IFACE.disable_ipv6=1
    down ip link set dev $IFACE down
```

We have to bring up the interfaces, because otherwise the bridge will not work. At the same time, we disable IPv6 on them so that they are not automatically assigned IPv6 addresses.

In order to manage an OVS bridge port with Pacemaker, we need a corresponding resource agent script. The following script does the job and should be saved as `$OCF_ROOT/resource.d/marsching/OVSPort` (on most systems, `$OCF_ROOT` is `/usr/lib/ocf`):

```bash
#!/bin/bash

# OVS bridge port script for Pacemaker - Copyright 2014 Sebastian Marsching
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
. ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs

# Avoid localization issues
unset LC_ALL; export LC_ALL
unset LANGUAGE; export LANGUAGE
LC_ALL=C; export LC_ALL
LC_MESSAGES=C; export LC_MESSAGES

meta_data() {
  cat <<EOF

<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="OVSPort">
<version>1.0</version>
<longdesc lang="en">
This script manages an OpenVSwitch port.
It adds a port to a bridge or removes it
respectively and configures the
corresponding network interface.
The network interface has to be already
configured in /etc/network/interfaces,
because this script relies on the ifup
and ifdown tools.
</longdesc>
<shortdesc lang="en">Adds/removes an OVS port and configures the corresponding interface.</shortdesc>
<parameters>
<parameter name="bridge" unique="1" required="1">
<longdesc lang="en">
The name of the OVS bridge that the port is added to.
</longdesc>
<shortdesc lang="en">Bridge name</shortdesc>
<content type="string" default="" ></content>
</parameter>
<parameter name="interface" unique="1" required="1">
<longdesc lang="en">
The name of the port and the corresponding interface.
</longdesc>
<shortdesc lang="en">Interface/port name</shortdesc>
<content type="string" default="" ></content>
</parameter>
<parameter name="mac" unique="1" required="0">
<longdesc lang="en">
The MAC address of the interface. If not specified,
a random MAC address is used.
</longdesc>
<shortdesc lang="en">Interface MAC address</shortdesc>
<content type="string" default="" ></content>
</parameter>
</parameters>
<actions>
<action name="start" timeout="20s" ></action>
<action name="stop" timeout="20s" ></action>
<action name="monitor" depth="0" timeout="20s" interval="15s" ></action>
<action name="validate-all" timeout="20s" ></action>
<action name="meta-data" timeout="5s" ></action>
</actions>
</resource-agent>
EOF
  exit $OCF_SUCCESS
}

usage() {
  echo "usage: $0 {start|stop|status|monitor|validate-all|meta-data}" >&2
}

check_is_root() {
  if ocf_is_root ; then
    :
  else
    echo "ERROR: This action requires root privileges." >&2
    exit $OCF_ERR_PERM
  fi
}

# Check that the interface is configured and that the bridge exists.
ovs_validate_all() {
  check_is_root
  if ifquery "$OCF_RESKEY_interface" >/dev/null 2>&1 ; then
    if ovs-vsctl br-exists "$OCF_RESKEY_bridge" ; then
      return $OCF_SUCCESS
    else
      echo "ERROR: Bridge \"$OCF_RESKEY_bridge\" not found." >&2
      return $OCF_ERR_CONFIGURED
    fi
  else
    echo "ERROR: No configuration for interface \"$OCF_RESKEY_interface\" found." >&2
    return $OCF_ERR_CONFIGURED
  fi
}

# The resource counts as running if the interface exists.
ovs_status_internal() {
  if ifconfig "$OCF_RESKEY_interface" >/dev/null 2>&1 ; then
    return $OCF_SUCCESS
  else
    return $OCF_NOT_RUNNING
  fi
}

ovs_monitor() {
  check_is_root
  if ovs_status_internal ; then
    echo "Interface \"$OCF_RESKEY_interface\" is up."
    return $OCF_SUCCESS
  else
    local rc=$?
    echo "Interface \"$OCF_RESKEY_interface\" is down."
    return $rc
  fi
}

# Add the port to the bridge (optionally with a fixed MAC address) and
# bring up the interface. If that fails, retry with "ifup --force".
ovs_start() {
  check_is_root
  ovs_add_port_command="ovs-vsctl add-port $OCF_RESKEY_bridge $OCF_RESKEY_interface -- set interface $OCF_RESKEY_interface type=internal"
  if [ -n "$OCF_RESKEY_mac" ]; then
    # Escape the colons in the MAC address for ovs-vsctl.
    mac="`echo $OCF_RESKEY_mac|sed -e "s/:/\\\\\\\\:/g"`"
    ovs_add_port_command="$ovs_add_port_command -- set interface $OCF_RESKEY_interface mac=$mac"
  fi
  if $ovs_add_port_command >/dev/null 2>&1 && ifup "$OCF_RESKEY_interface" >/dev/null 2>&1 && ovs_status_internal; then
    echo "Interface \"$OCF_RESKEY_interface\" is up."
    return $OCF_SUCCESS
  else
    if ifup --force "$OCF_RESKEY_interface" >/dev/null 2>&1 && ovs_status_internal; then
      echo "Interface \"$OCF_RESKEY_interface\" is up."
      return $OCF_SUCCESS
    else
      echo "ERROR: Could not bring up interface \"$OCF_RESKEY_interface\"." >&2
      return $OCF_ERR_GENERIC
    fi
  fi
}

# Bring down the interface and remove the port from the bridge.
ovs_stop() {
  check_is_root
  ifdown "$OCF_RESKEY_interface" >/dev/null 2>&1
  if ovs_status_internal; then
    ifdown --force "$OCF_RESKEY_interface" >/dev/null 2>&1
    ovs-vsctl del-port "$OCF_RESKEY_interface"
    if ovs_status_internal; then
      echo "ERROR: Could not bring down interface \"$OCF_RESKEY_interface\"." >&2
      return $OCF_ERR_GENERIC
    else
      echo "Interface \"$OCF_RESKEY_interface\" is down."
      return $OCF_SUCCESS
    fi
  else
    echo "Interface \"$OCF_RESKEY_interface\" is down."
    return $OCF_SUCCESS
  fi
}

case $1 in
  meta-data)
    meta_data
    ;;
  start)
    ovs_validate_all && ovs_start
    ;;
  stop)
    ovs_stop
    ;;
  # "status" is advertised in the usage message, so treat it like "monitor".
  status|monitor)
    ovs_monitor
    ;;
  validate-all)
    ovs_validate_all
    ;;
  *)
    usage
    exit $OCF_ERR_UNIMPLEMENTED
    ;;
esac

exit $?
```

We can now define a resource using `crm`:

    primitive iface-ovsbr0p1 ocf:marsching:OVSPort \
        params bridge="ovsbr0" interface="ovsbr0p1" mac="02:00:00:00:00:01" \
        op monitor interval="15s"

However, in order for this to work, we also need a configuration for this interface in `/etc/network/interfaces`:

    iface ovsbr0p1 inet static
        address 192.0.2.1
        netmask 255.255.255.0
    iface ovsbr0p1 inet6 static
        address 2001:db8::1
        netmask 64
        accept_ra 0
        autoconf 0

Please note that there is no `auto` line for this interface, because it is brought up and down by the resource management script.

If there are other interfaces on the same bridge (with IP addresses in the same subnet), some additional configuration is needed. We have to use source-based routing to make sure that the right interface (and thus the right MAC address) is used in ARP replies (see the [discussion on Server Fault](http://serverfault.com/questions/247472/arp-replies-contain-wrong-mac-address)). In addition to that, the [arp_filter](https://lwn.net/Articles/45386/#arp_filter) flag might need to be set on the affected interfaces (it was not needed in my case).

    iface ovsbr0p1 inet static
        address 192.0.2.1
        netmask 255.255.255.0
        post-up ip route add 192.0.2.0/24 table $IFACE dev $IFACE
        post-up ip route add default table $IFACE via 192.0.2.254 dev $IFACE
        post-up ip rule add from 192.0.2.1/32 pref 1000 table $IFACE
        pre-down ip rule del from 192.0.2.1/32 pref 1000 table $IFACE
    iface ovsbr0p1 inet6 static
        address 2001:db8::1
        netmask 64
        accept_ra 0
        autoconf 0

The exact routes depend on the actual environment. Typically, you will want the same routes as in the "normal" routing table, just with a different source device. This configuration assumes that the name of the interface is also a valid alias for a routing table ID. These aliases are configured in `/etc/iproute2/rt_tables`. You could also use a numeric identifier in the commands, but I find it easier to manage the numeric IDs in one central file and use aliases everywhere else.
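Registering such an alias means adding a line of the form `<numeric ID> <name>` to `/etc/iproute2/rt_tables`. The following is a minimal sketch; the helper name `add_rt_table` and the table ID 100 are arbitrary choices for this example, and any ID that is not already in use on the system would do.

```bash
#!/bin/bash

# Register an alias for a routing table ID unless the alias already exists.
# Entries in rt_tables have the form "<numeric ID> <name>".
add_rt_table() {
    local id="$1" name="$2" file="${3:-/etc/iproute2/rt_tables}"
    if grep -qE "^[0-9]+[[:space:]]+${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        return 0
    fi
    echo "$id $name" >>"$file"
}

# Usage (as root); the table ID 100 is a hypothetical choice:
#   add_rt_table 100 ovsbr0p1
#   ip route show table ovsbr0p1
```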

For IPv6, source-based routing does not seem to be needed. The NDP implementation seems to use the MAC address of the interface to which the IP address is actually assigned.