  1. ONOS
  2. ONOS-4515

Cluster Device Role States out of Sync


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels:
    • Environment:

      3 node cluster

    • Story Points:
      3
    • Epic Link:
    • Sprint:
      Goldeneye Black Sprint #4

      Description

      The device roles in ONOS are out of sync between the nodes in the cluster. One node sees five devices with the role NONE, while the other two nodes see the same five devices with the role STANDBY.

      This bug was produced by starting the Mininet topology first, then starting ONOS, and finally connecting the switches with the sh ovs-vsctl set-controller command at the Mininet prompt.

      The topology that was started contains seven switches, but each ONOS node saw only the same five switches and reported that there were two SCC(s). All ONOS nodes are in the READY state, and all of the TCP connections exist on the Mininet machine.

      When this problem occurs, karaf.log fills very rapidly with the role mismatch warnings shown below and quickly grows to a large size.

      The ONOS karaf.log from the node seeing the problem:

      2016-05-10 15:22:21,345 | INFO  | ew I/O worker #1 | OFChannelHandler | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | New switch connection from /10.128.50.10:51746
      2016-05-10 15:22:21,362 | INFO  | ew I/O worker #1 | OFChannelHandler | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | Sending OF_13 Hello to /10.128.50.10:51746
      2016-05-10 15:22:21,388 | INFO  | ew I/O worker #1 | OFChannelHandler | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | Received port desc reply for switch at [/10.128.50.10:51746 DPID[00:00:00:00:00:00:00:01]]
      2016-05-10 15:22:21,407 | INFO  | ew I/O worker #1 | OFChannelHandler | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | Received switch description reply OFDescStatsReplyVer13(xid=4294967289, flags=[], mfrDesc=Nicira, Inc., hwDesc=Open vSwitch, swDesc=2.3.1, serialNum=None, dpDesc=None) from switch at /10.128.50.10:51746
      2016-05-10 15:22:21,414 | INFO  | ew I/O worker #1 | Controller | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | OpenFlow handshaker found for device 1: org.onosproject.driver.handshaker.NiciraSwitchHandshaker [? DPID[00:00:00:00:00:00:00:01]]
      2016-05-10 15:22:21,414 | INFO  | ew I/O worker #1 | ntrollerImpl$OpenFlowSwitchAgent | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | Added switch 00:00:00:00:00:00:00:01
      2016-05-10 15:22:21,532 | INFO  | event-dispatch-0 | FlowObjectiveManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Driver ovs bound to device of:0000000000000001 ... initializing driver
      2016-05-10 15:22:21,549 | INFO  | nos-topo-build-1 | TopologyManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Topology DefaultTopology{time=710344648978180, creationTime=1462918941546, computeCost=535068, clusters=1, devices=1, links=0} changed
      2016-05-10 15:22:21,569 | INFO  | ew I/O worker #1 | DeviceManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Local role is STANDBY for of:0000000000000001
      2016-05-10 15:22:21,585 | INFO  | ew I/O worker #1 | DeviceManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Device of:0000000000000001 connected
      2016-05-10 15:22:21,587 | INFO  | ew I/O worker #1 | PortStatsCollector | 180 - org.onosproject.onos-of-provider-device - 1.6.0.SNAPSHOT | Starting Port Stats collection thread for 00:00:00:00:00:00:00:01
      2016-05-10 15:22:21,592 | INFO  | ew I/O worker #1 | GroupStatsCollector | 183 - org.onosproject.onos-of-provider-group - 1.6.0.SNAPSHOT | Starting Group Stats collection thread for 00:00:00:00:00:00:00:01
      2016-05-10 15:22:21,593 | INFO  | ew I/O worker #1 | MeterStatsCollector | 184 - org.onosproject.onos-of-provider-meter - 1.6.0.SNAPSHOT | Starting Meter Stats collection thread for 00:00:00:00:00:00:00:01
      2016-05-10 15:22:21,594 | INFO  | ew I/O worker #1 | OFChannelHandler | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | Processing 0 pending port status messages for 00:00:00:00:00:00:00:01
      2016-05-10 15:22:21,597 | WARN  | ew I/O worker #1 | DeviceManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE
      2016-05-10 15:22:21,597 | INFO  | ew I/O worker #1 | ntrollerImpl$OpenFlowSwitchAgent | 179 - org.onosproject.onos-of-ctl - 1.6.0.SNAPSHOT | Transitioned switch 00:00:00:00:00:00:00:01 to EQUAL
      2016-05-10 15:22:21,794 | WARN  | ew I/O worker #1 | DeviceManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE
      2016-05-10 15:22:21,868 | WARN  | ew I/O worker #1 | DeviceManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE
      2016-05-10 15:22:22,043 | WARN  | ew I/O worker #1 | DeviceManager | 129 - org.onosproject.onos-core-net - 1.6.0.SNAPSHOT | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE
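      To gauge how quickly the warnings accumulate, a short script can tally them per device. The following is only a sketch: it assumes each log entry sits on a single line (as in the raw karaf.log, not a terminal-wrapped copy), and the function name count_role_mismatches is made up for illustration. The pattern matches the DeviceManager message text exactly as it appears in the excerpt above.

```python
import re
from collections import Counter

# Matches the DeviceManager warning text as it appears in the excerpt.
MISMATCH = re.compile(
    r"Role mismatch on (?P<device>of:[0-9a-fA-F]{16})\. "
    r"set to (?P<local>\w+), but store demands (?P<store>\w+)"
)


def count_role_mismatches(lines):
    """Return a Counter keyed by (device, local role, role demanded by the store)."""
    counts = Counter()
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            counts[(match["device"], match["local"], match["store"])] += 1
    return counts


# Message fields copied from the excerpt: one INFO entry and the four WARN entries.
sample = [
    "INFO  | DeviceManager | Device of:0000000000000001 connected",
    "WARN  | DeviceManager | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE",
    "WARN  | DeviceManager | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE",
    "WARN  | DeviceManager | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE",
    "WARN  | DeviceManager | Role mismatch on of:0000000000000001. set to STANDBY, but store demands NONE",
]

counts = count_role_mismatches(sample)
for (device, local, store), n in counts.items():
    print(f"{device}: {n} mismatches (local={local}, store={store})")
```

      To run it against the attached log instead of the sample, pass an open file object, for example count_role_mismatches(open("karaf.log")).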
      

      The mininet log:

      *** Creating network
      *** Adding hosts:
      h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20 h21 h22 h23 h24
      *** Adding switches:
      s1 s2 s3 s4 s5 s6 s7
      *** Adding links:
      (s1, s2) (s1, s3) (s1, s4) (s1, s5) (s2, s3) (s2, s5) (s2, s6) (s3, s4) (s3, s6) (s4, s7) (s5, h1) (s5, h2) (s5, h3) (s5, h4) (s5, h5) (s5, h6) (s5, h7) (s5, h8) (s6, h9) (s6, h10) (s6, h11) (s6, h12) (s6, h13) (s6, h14) (s6, h15) (s6, h16) (s7, h17) (s7, h18) (s7, h19) (s7, h20) (s7, h21) (s7, h22) (s7, h23) (s7, h24)
      *** Configuring hosts
      h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20 h21 h22 h23 h24
      *** Starting controller
      
      *** Starting 7 switches
      s1 s2 s3 s4 s5 s6 s7 ...
      *** Starting CLI:
      mininet> sh ovs-vsctl set-controller s1 tcp:10.128.50.11:6653 tcp:10.128.50.12:6653 tcp:10.128.50.13:6653
      mininet> sh ovs-vsctl set-controller s2 tcp:10.128.50.11:6653 tcp:10.128.50.12:6653 tcp:10.128.50.13:6653
      mininet> sh ovs-vsctl set-controller s3 tcp:10.128.50.11:6653 tcp:10.128.50.12:6653 tcp:10.128.50.13:6653
      mininet> sh ovs-vsctl set-controller s4 tcp:10.128.50.11:6653 tcp:10.128.50.12:6653 tcp:10.128.50.13:6653
      mininet> sh ovs-vsctl set-controller s5 tcp:10.128.50.11:6653 tcp:10.128.50.12:6653 tcp:10.128.50.13:6653
      mininet> sh ovs-vsctl set-controller s6 tcp:10.128.50.11:6653 tcp:10.128.50.12:6653 tcp:10.128.50.13:6653
      mininet> sh ovs-vsctl set-controller s7 tcp:10.128.50.11:6653 tcp:10.128.50.12:6653 tcp:10.128.50.13:6653
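      The seven set-controller commands above differ only in the switch name, so they can be generated rather than typed. This is a minimal sketch, not part of the attached newFuncTopo.py; the helper name set_controller_commands is hypothetical, while the switch names, controller addresses, and port are the ones shown in the log above.

```python
def set_controller_commands(switches, controllers, port=6653):
    """Build one 'ovs-vsctl set-controller' command per switch.

    Every switch is pointed at every controller, as in the Mininet log.
    """
    targets = " ".join(f"tcp:{ip}:{port}" for ip in controllers)
    return [f"ovs-vsctl set-controller {switch} {targets}" for switch in switches]


# Values taken from the Mininet log in this report.
switches = [f"s{i}" for i in range(1, 8)]  # s1 .. s7
controllers = ["10.128.50.11", "10.128.50.12", "10.128.50.13"]

commands = set_controller_commands(switches, controllers)
for command in commands:
    print(command)
```

      Each printed line can be run from a shell on the Mininet machine, or prefixed with sh at the Mininet prompt as was done here.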
      

      The TCP connections:

      tcp        0      0 10.128.50.10:51766      10.128.50.11:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:33196      10.128.50.12:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:33188      10.128.50.12:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:55077      10.128.50.13:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:51753      10.128.50.11:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:33198      10.128.50.12:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:33184      10.128.50.12:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:55081      10.128.50.13:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:51759      10.128.50.11:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:55071      10.128.50.13:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:55085      10.128.50.13:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:51757      10.128.50.11:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:55072      10.128.50.13:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:33180      10.128.50.12:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:55087      10.128.50.13:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:55079      10.128.50.13:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:51746      10.128.50.11:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:33185      10.128.50.12:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:33193      10.128.50.12:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:51750      10.128.50.11:6653       ESTABLISHED
      tcp        0      0 10.128.50.10:51761      10.128.50.11:6653       ESTABLISHED
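      With seven switches and three controllers, 21 established connections are expected, seven per controller. A small script can confirm that from netstat-style output. This is a sketch under the assumption that the columns are laid out as in the listing above (protocol, receive queue, send queue, local address, remote address, state); the function name connections_per_controller is hypothetical.

```python
from collections import Counter


def connections_per_controller(netstat_lines, port=6653):
    """Count ESTABLISHED connections per remote controller address."""
    counts = Counter()
    for line in netstat_lines:
        fields = line.split()
        # Skip anything that is not a complete, established connection row.
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        host, _, remote_port = fields[4].rpartition(":")
        if remote_port == str(port):
            counts[host] += 1
    return counts


# First four rows of the listing in this report.
sample = [
    "tcp        0      0 10.128.50.10:51766      10.128.50.11:6653       ESTABLISHED",
    "tcp        0      0 10.128.50.10:33196      10.128.50.12:6653       ESTABLISHED",
    "tcp        0      0 10.128.50.10:33188      10.128.50.12:6653       ESTABLISHED",
    "tcp        0      0 10.128.50.10:55077      10.128.50.13:6653       ESTABLISHED",
]

per_controller = connections_per_controller(sample)
for host, n in sorted(per_controller.items()):
    print(host, n)
```

      Applied to the full listing above, the rows split into seven connections for each of 10.128.50.11, 10.128.50.12, and 10.128.50.13, which matches the statement that all switch connections exist even though ONOS reports only five devices.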
      

        Attachments

        1. karaf.log
          7.76 MB
        2. newFuncTopo.py
          6 kB

            People

            Assignee:
            ximara Jeremy Songster
            Reporter:
            ximara Jeremy Songster
            Votes:
            0
            Watchers:
            2
