ONOS-3825

Cluster States out of sync

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels: None
    • Story Points: 3

      Description

      While running push-test-intents on a linear 5-switch topology, the command failed at some point:

      onos> push-test-intents -i of:0000000000000001/1 of:0000000000000005/1 500 6000
      Failure: 500 intents not installed

      An exception was caught at that point, and the cluster state became "inconsistent".

      onos> nodes
      id=10.128.8.11, address=10.128.8.11:9876, state=ACTIVE, updated=25m ago *
      id=10.128.8.12, address=10.128.8.12:9876, state=INACTIVE, updated=2m ago
      id=10.128.8.13, address=10.128.8.13:9876, state=ACTIVE, updated=25m ago
      onos> roles
      of:0000000000000001: master=10.128.8.12, standbys=[ 10.128.8.13 ]
      of:0000000000000002: master=10.128.8.12, standbys=[ 10.128.8.13 ]
      of:0000000000000003: master=10.128.8.12, standbys=[ 10.128.8.13 ]
      of:0000000000000004: master=10.128.8.13, standbys=[ 10.128.8.12 ]
      of:0000000000000005: master=10.128.8.12, standbys=[ 10.128.8.13 ]
      onos> leaders
      ------------------------------------------------------------------------
      Topic                      | Leader      | Epoch | Elected |
      ------------------------------------------------------------------------
      intent-partition-13        | 10.128.8.11 | 54    | 26m ago |
      intent-partition-11        | 10.128.8.11 | 43    | 26m ago |
      intent-partition-12        | 10.128.8.11 | 49    | 26m ago |
      intent-partition-10        | 10.128.8.11 | 38    | 26m ago |
      intent-partition-0         | 10.128.8.11 | 2     | 26m ago |
      intent-partition-5         | 10.128.8.12 | 68    | 25m ago |
      intent-partition-8         | 10.128.8.12 | 65    | 25m ago |
      intent-partition-9         | 10.128.8.12 | 66    | 25m ago |
      device:of:0000000000000001 | 10.128.8.12 | 103   | 23m ago |
      device:of:0000000000000002 | 10.128.8.12 | 106   | 23m ago |
      device:of:0000000000000003 | 10.128.8.12 | 109   | 23m ago |
      device:of:0000000000000005 | 10.128.8.12 | 112   | 23m ago |
      intent-partition-3         | 10.128.8.12 | 13    | 26m ago |
      intent-partition-1         | 10.128.8.12 | 6     | 26m ago |
      intent-partition-6         | 10.128.8.13 | 78    | 25m ago |
      intent-partition-7         | 10.128.8.13 | 77    | 25m ago |
      intent-partition-4         | 10.128.8.13 | 63    | 25m ago |
      device:of:0000000000000004 | 10.128.8.13 | 100   | 24m ago |
      intent-partition-2         | 10.128.8.13 | 12    | 26m ago |
      ------------------------------------------------------------------------
      onos> partitions
      ----------------------------------------------------------
      Name  Term  Members
      ----------------------------------------------------------
      p0    2     onos://10.128.8.11:9876
                  onos://10.128.8.12:9876
                  onos://10.128.8.13:9876
      ----------------------------------------------------------
      p1    3     onos://10.128.8.11:9876 *
                  onos://10.128.8.12:9876
                  onos://10.128.8.13:9876
      ----------------------------------------------------------
      p2    3     onos://10.128.8.11:9876 *
                  onos://10.128.8.12:9876
                  onos://10.128.8.13:9876
      ----------------------------------------------------------
      p3    4     onos://10.128.8.11:9876
                  onos://10.128.8.12:9876
                  onos://10.128.8.13:9876 *
      ----------------------------------------------------------
      onos> nodes
      id=10.128.8.11, address=10.128.8.11:9876, state=ACTIVE, updated=26m ago *
      id=10.128.8.12, address=10.128.8.12:9876, state=INACTIVE, updated=3m ago
      id=10.128.8.13, address=10.128.8.13:9876, state=ACTIVE, updated=26m ago
      onos>
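
      One way to make the "inconsistent" state concrete is to cross-check the `nodes` and `roles` output above: node 10.128.8.12 is reported INACTIVE, yet it is still listed as master for several devices. The following is a minimal sketch of such a check, not part of ONOS. The function name `inactive_masters` and the regular expressions are assumptions based only on the output format shown in this report.

```python
import re

# Patterns assumed from the CLI output format shown in this report.
NODE_RE = re.compile(r"id=(?P<id>[^,]+),.*?state=(?P<state>\w+)")
ROLE_RE = re.compile(r"(?P<device>of:[0-9a-f]+): master=(?P<master>[^,]+),")


def inactive_masters(nodes_output, roles_output):
    """Return (device, master) pairs whose master node is not ACTIVE."""
    states = {}
    for line in nodes_output.splitlines():
        match = NODE_RE.search(line)
        if match:
            states[match.group("id")] = match.group("state")

    stale = []
    for line in roles_output.splitlines():
        match = ROLE_RE.search(line)
        # A master that is unknown or not ACTIVE is reported as stale.
        if match and states.get(match.group("master")) != "ACTIVE":
            stale.append((match.group("device"), match.group("master")))
    return stale


# Output copied from the `nodes` and `roles` commands in this report.
NODES = """\
id=10.128.8.11, address=10.128.8.11:9876, state=ACTIVE, updated=25m ago *
id=10.128.8.12, address=10.128.8.12:9876, state=INACTIVE, updated=2m ago
id=10.128.8.13, address=10.128.8.13:9876, state=ACTIVE, updated=25m ago
"""
ROLES = """\
of:0000000000000001: master=10.128.8.12, standbys=[ 10.128.8.13 ]
of:0000000000000002: master=10.128.8.12, standbys=[ 10.128.8.13 ]
of:0000000000000003: master=10.128.8.12, standbys=[ 10.128.8.13 ]
of:0000000000000004: master=10.128.8.13, standbys=[ 10.128.8.12 ]
of:0000000000000005: master=10.128.8.12, standbys=[ 10.128.8.13 ]
"""

for device, master in inactive_masters(NODES, ROLES):
    print(f"{device} is mastered by non-ACTIVE node {master}")
```

      On the output in this report, the check flags devices 1, 2, 3 and 5, all mastered by 10.128.8.12. Only device 4, mastered by 10.128.8.13, passes.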

      The exception is:

      2016-01-26 14:56:19,936 | WARN | event-dispatch-0 | ListenerRegistry | 72 - org.onosproject.onos-api - 1.5.0.SNAPSHOT | Exception encountered while processing event ClusterEvent{time=2016-01-26T14:56:16.118, type=INSTANCE_DEACTIVATED, subject=DefaultControllerNode{id=10.128.8.12, ip=10.128.8.12, tcpPort=9876}}
      org.onosproject.store.service.ConsistentMapException: java.lang.IllegalStateException: Not the leader
      at org.onosproject.store.primitives.impl.DefaultConsistentMap.complete(DefaultConsistentMap.java:184)[117:org.onosproject.onos-core-primitives:1.5.0.SNAPSHOT]
      at org.onosproject.store.primitives.impl.DefaultConsistentMap.entrySet(DefaultConsistentMap.java:139)[117:org.onosproject.onos-core-primitives:1.5.0.SNAPSHOT]
      at org.onosproject.store.primitives.impl.ConsistentMapBackedJavaMap.entrySet(ConsistentMapBackedJavaMap.java:145)[117:org.onosproject.onos-core-primitives:1.5.0.SNAPSHOT]
      at org.onosproject.store.primitives.impl.ConsistentMapBackedJavaMap.forEach(ConsistentMapBackedJavaMap.java:153)[117:org.onosproject.onos-core-primitives:1.5.0.SNAPSHOT]
      at org.onosproject.store.primitives.impl.MutexExecutionManager$InternalClusterEventListener.event(MutexExecutionManager.java:192)[117:org.onosproject.onos-core-primitives:1.5.0.SNAPSHOT]
      at org.onosproject.store.primitives.impl.MutexExecutionManager$InternalClusterEventListener.event(MutexExecutionManager.java:184)[117:org.onosproject.onos-core-primitives:1.5.0.SNAPSHOT]
      at org.onosproject.event.ListenerRegistry.process(ListenerRegistry.java:66)[72:org.onosproject.onos-api:1.5.0.SNAPSHOT]
      at org.onosproject.event.impl.CoreEventDispatcher$DispatchLoop.process(CoreEventDispatcher.java:141)[114:org.onosproject.onos-core-net:1.5.0.SNAPSHOT]
      at org.onosproject.event.impl.CoreEventDispatcher$DispatchLoop.run(CoreEventDispatcher.java:124)[114:org.onosproject.onos-core-net:1.5.0.SNAPSHOT]
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_25]
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_25]
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_25]
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_25]
      at java.lang.Thread.run(Thread.java:745)[:1.8.0_25]
      Caused by: java.lang.IllegalStateException: Not the leader
      at net.kuujo.copycat.raft.PassiveState.query(PassiveState.java:267)[68:org.onosproject.onlab-thirdparty:1.5.0.SNAPSHOT]
      at net.kuujo.copycat.raft.RaftContext$$Lambda$174/2136752224.apply(Unknown Source)[68:org.onosproject.onlab-thirdparty:1.5.0.SNAPSHOT]
      at net.kuujo.copycat.raft.RaftContext.lambda$wrapCall$21(RaftContext.java:562)[68:org.onosproject.onlab-thirdparty:1.5.0.SNAPSHOT]
      at net.kuujo.copycat.raft.RaftContext$$Lambda$109/955164094.run(Unknown Source)[68:org.onosproject.onlab-thirdparty:1.5.0.SNAPSHOT]
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_25]
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_25]
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)[:1.8.0_25]
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)[:1.8.0_25]
      ... 3 more
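
      The trace shows the exception being raised inside a cluster event listener (MutexExecutionManager$InternalClusterEventListener) and logged as a warning by ListenerRegistry rather than propagating further. The toy model below sketches that pattern only; it is not ONOS code. It assumes the dispatcher catches exceptions per listener, and `lock_cleanup_listener` is a hypothetical stand-in for a listener whose map read fails with "Not the leader".

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("ListenerRegistry")


class NotLeaderError(RuntimeError):
    """Stand-in for the 'Not the leader' IllegalStateException in the trace."""


class ListenerRegistry:
    """Toy dispatcher: a failing listener is logged, the others still run."""

    def __init__(self):
        self.listeners = []

    def add_listener(self, listener):
        self.listeners.append(listener)

    def process(self, event):
        failures = 0
        for listener in self.listeners:
            try:
                listener(event)
            except Exception:
                # Log and continue, so the listener's work for this event is skipped.
                log.warning("Exception encountered while processing event %s",
                            event, exc_info=True)
                failures += 1
        return failures


def lock_cleanup_listener(event):
    # Hypothetical: models a listener whose map read fails on a non-leader.
    raise NotLeaderError("Not the leader")


seen = []
registry = ListenerRegistry()
registry.add_listener(lock_cleanup_listener)
registry.add_listener(seen.append)
failed = registry.process("INSTANCE_DEACTIVATED 10.128.8.12")
print(failed, seen)
```

      In this model the failing listener never completes its handling of the deactivation event, which is one plausible way for per-node state to be left stale after a node goes INACTIVE.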

            People

            Assignee: Madan Jampani (madan)
            Reporter: suibin
