Forum Discussion

Oleksandr_Malof
Oct 26, 2018
Solved

DSC is not coming up for LTM VEs after initial setup

Hi everyone. It looks like I am facing the issue described here: https://devcentral.f5.com/questions/devices-are-not-synchronizing-configuration-to-each-other-59604. I was trying to configure DSC for a pair of LTM VEs, and DSC did not come up (the Standby device shows the status "Awaiting Initial Sync. Device has not synced with group"). I have tried multiple TMOS versions: 13.1.0.2, 13.1.0.6, and 13.1.1.2. In all cases I got the same message in the logs on the Standby unit: "...warning mcpd[4418]: 01071aea:4: CMI heartbeat timer expired..". I am using VMware Fusion 8.0.0 on OS X 10.11.6. The comments in the post above state that this was not fixed in the most recent version, 14.0.0.2, either. Could you tell me whether DSC actually works for VE versions 13 and 14 at the moment? Thank you!

 


9 Replies

  • I have also run into this caveat. I am running 13.1.1.4 on VMware Workstation 15 on Fedora 29 Linux kernel 5.0.4

     

    I managed to resolve the problem by setting the following db variables on each of the cluster members:

     

    tmsh modify /sys db tm.tcplargereceiveoffload value disable

     

    tmsh modify /sys db tm.tcpsegmentationoffload value disable

     

    To be sure, I had to reset the device group trust and reconfigure DSC. Now everything is fine.

     

    It appears VMware has issues with offloading mechanisms. In 12.1.x everything was running fine.
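As a minimal sketch of the workaround above, the two `tmsh modify /sys db` commands can be wrapped in a small script that is run on each cluster member. The function names and the `APPLY` switch are hypothetical helpers introduced here for illustration; only the two db variables and the `tmsh modify` syntax come from the reply. By default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: disable TCP large receive offload and segmentation offload
# on a BIG-IP VE, using the two db variables named in this thread.
# Dry run by default; set APPLY=1 to execute (requires tmsh on the PATH).

OFFLOAD_VARS="tm.tcplargereceiveoffload tm.tcpsegmentationoffload"

# Print one tmsh command per db variable.
offload_cmds() {
    for var in $OFFLOAD_VARS; do
        echo "tmsh modify /sys db $var value disable"
    done
}

# Run each command when APPLY=1, otherwise only show it.
apply_offload_fix() {
    offload_cmds | while IFS= read -r cmd; do
        if [ "${APPLY:-0}" = "1" ]; then
            sh -c "$cmd"
        else
            echo "would run: $cmd"
        fi
    done
}

apply_offload_fix
```

Running it without `APPLY=1` first makes it easy to confirm the exact commands before changing a device. Whether a trust reset is also needed afterwards seems to vary; reports in this thread differ.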

     

    • Abreey1

      Krystian Baniak, thank you for your help. It worked awesomely. I was trying to find out why my two F5 devices could not sync. You made my day, thank you!

    • bader1

      You made my day, thank you for your help!

    • aproctorED

      Krystian - this is genius, thank you. Worked for me (Windows 10, VMware Workstation 14, BIG-IP 13.1.3). I was seeing exactly the same messages Oleksandr mentioned and your fix solved the issue. I did not need to reset trust; I just made your recommended changes and was able to sync straight away. Thanks again, I had been banging my head against a wall!!

       

      Out of interest, if you are able to share how you came upon the solution, I would be interested to hear the story. I had no idea those db variables even existed and would never have known to look for them.

  • UPDATE: I have tested the same configuration with TMOS v12.1.3.7 and everything went fine. The cluster formed successfully and the configuration synced. I have also checked v14.0.0.2: the configuration is not synchronized between devices. What I have spotted is that the output of "tmsh run /cm watch-devicegroup-device" and the other "tmsh run /cm watch *" commands states "Incompatible Version", even though both appliances run the same TMOS software version. So it seems that currently neither v13.x nor v14.x is able to form a DSC cluster and sync configuration successfully.

     

    Below are logs from the standby virtual appliance:

     

    Oct 28 16:47:19 bigiptest1 notice mcpd[4508]: 01070430:5: end_transaction message timeout on connection 0x5c204548 (user %cmi-mcpd-peer-10.3.0.201)
    Oct 28 16:47:19 bigiptest1 notice mcpd[4508]: 01070418:5: connection 0x5c204548 (user %cmi-mcpd-peer-10.3.0.201) was closed with active requests
    Oct 28 16:47:19 bigiptest1 notice mcpd[4508]: 0107143c:5: Connection to CMI peer 10.3.0.201 has been removed
    Oct 28 16:47:19 bigiptest1 notice mcpd[4508]: 01071432:5: CMI peer connection established to 10.3.0.201 port 6699 after 0 retries
    Oct 28 16:52:22 bigiptest1 notice mcpd[4508]: 01070430:5: end_transaction message timeout on connection 0x5c204548 (user %cmi-mcpd-peer-10.3.0.201)
    Oct 28 16:52:22 bigiptest1 notice mcpd[4508]: 01070418:5: connection 0x5c204548 (user %cmi-mcpd-peer-10.3.0.201) was closed with active requests
    Oct 28 16:52:22 bigiptest1 notice mcpd[4508]: 0107143c:5: Connection to CMI peer 10.3.0.201 has been removed
    Oct 28 16:52:22 bigiptest1 notice mcpd[4508]: 01071432:5: CMI peer connection established to 10.3.0.201 port 6699 after 0 retries
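The log excerpt above shows a repeating cycle: a transaction timeout, the peer connection being removed, and an immediate reconnect. A rough way to spot that loop in a longer log is to count the three message IDs involved. The sketch below assumes log text arrives on stdin; the function name `cmi_flap_summary` is hypothetical, and the sample input is two lines copied from the excerpt.

```shell
#!/bin/sh
# Sketch: count the mcpd message IDs that make up the CMI reconnect loop.
# Reads log text on stdin and prints one summary line.

cmi_flap_summary() {
    awk '
        /mcpd\[[0-9]+\]: 01070430:/ { timeouts++ }     # end_transaction message timeout
        /mcpd\[[0-9]+\]: 0107143c:/ { removed++ }      # connection to CMI peer removed
        /mcpd\[[0-9]+\]: 01071432:/ { established++ }  # CMI peer connection established
        END {
            printf "timeouts=%d removed=%d established=%d\n", timeouts, removed, established
        }
    '
}

# Example input: two lines taken from the excerpt in this thread.
# On a device you would pipe in the relevant log file instead
# (its path is an assumption and is not stated in the thread).
cmi_flap_summary <<'EOF'
Oct 28 16:47:19 bigiptest1 notice mcpd[4508]: 01070430:5: end_transaction message timeout on connection 0x5c204548 (user %cmi-mcpd-peer-10.3.0.201)
Oct 28 16:47:19 bigiptest1 notice mcpd[4508]: 01071432:5: CMI peer connection established to 10.3.0.201 port 6699 after 0 retries
EOF
```

Roughly equal, steadily growing counts for all three IDs would be consistent with the sync connection flapping rather than failing once.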

     

  • Hey Oleksandr

     

    What type of license are you using? The free 90-day trial? I saw this problem when we first released our 201 book and one reader ran into it. I could replicate it, and when I reported it to F5 I got the following answer:

     

    "According to what I'm hearing internally, there should be NO DIFFERENCES between the 90-day, 30-day, and Lab licenses (other than the obvious). Furthermore, I was told that they should all support HA regardless of version. I have informed them that this is not what you are experiencing and asked them to check it out."

     

    But I haven't heard anything since then. So if you're using a 90-day trial license and want to lab with HA, I suggest you use version 12 instead.

     

  • The answer to my initial question: yes, it is possible to configure DSC with VE v13, at least with a LAB license. However, this requires 8 GB of RAM and 4 CPUs per VE. I am not sure whether this can be done with Trial or Evaluation licenses, as I have no way to test it.

     

    • Philip_Jonsson

      Hey Oleksandr

       

      So you solved the problem by simply increasing the RAM and CPU count? Or did you move from your laptop to the ESXi environment?

       

      I have run HA environments on even less than 4 GB per BIG-IP and it has worked flawlessly. As I mentioned before, the only time HA would not work was when I used an evaluation license. As soon as I used a 45-day license or a purchased VE Lab license, it worked perfectly.

       

    • Oleksandr_Malof

      Hi Philip

       

      I moved from VMware Fusion to an ESXi environment (from my laptop to an external LAB environment). The ESXi environment made it possible to allocate more resources to each VM. I agree this might be an issue with the Evaluation license. As far as I know, the Trial license is for 90 days and the Evaluation license is for 30 days. Could you tell me which license is the 45-day one?

       

      Thank you!