DX NetOps

  • 1.  Why would the DA data source UUID change?

    Posted Oct 24, 2013 09:41 PM
    I have now had two incidents where CAPC would no longer sync with the DA. The first time this happened, support provided the following information:
    DA Status: Synchronization Failure on IM2.1/IM2.2. The customer had good DA syncing until something happened, and after that began experiencing these failures. Analysis of the DM logs showed the following error message occurring consistently with each attempted DA sync:

    An error occurred during a sync request with data source Data Aggregator@DA_Host: additional info:
    enum.datasourceerror.DS_PRODUCT_ID_CHANGED. The following stack trace shows the context of the sync request:
    com.ca.im.portal.api.services.interfaces.datasource.DataSourceOp$Exception: enum.datasourceerror.DS_PRODUCT_ID_CHANGED

    When the DA is added as a data source to PC, it generates a random UUID and passes that UUID to PC. Both DA and PC store this UUID in their respective databases. At the start of each sync cycle, PC queries each data source and checks that the UUIDs match, a sort of handshake if you will. If the UUIDs do not match, the sync fails. A reinstall of the DA without first removing it as a data source will cause this to happen.
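
    The handshake described above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (DataSource, Portal, register, sync) are hypothetical and are not the actual DA or PC code. Only the DS_PRODUCT_ID_CHANGED error name is taken from the log above.

```python
import uuid


class DataSource:
    """Hypothetical stand-in for the DA side of the handshake."""

    def __init__(self):
        self.connection_uid = None

    def register(self):
        # A fresh random UUID is generated whenever the source registers.
        self.connection_uid = str(uuid.uuid4())
        return self.connection_uid


class Portal:
    """Hypothetical stand-in for the PC side of the handshake."""

    def __init__(self):
        self.stored_uid = None

    def add_data_source(self, source):
        # PC stores the UUID that the data source handed over.
        self.stored_uid = source.register()

    def sync(self, source):
        # At the start of each sync, compare the stored and reported UUIDs.
        if source.connection_uid != self.stored_uid:
            raise RuntimeError("DS_PRODUCT_ID_CHANGED")
        return "synced"
```

    In this sketch, anything that makes the data source generate a new UUID after it was added (here, calling register() a second time) makes the next sync() fail with the same error name as in the log.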

    It has just occurred again. In neither case was the DA reinstalled, so that explanation does not apply in my case.

    What else would cause the DA Data Source UUID to change?

    I do have a solution to fix it, but it takes 30-60 minutes of downtime, assuming you detect it at the exact moment it occurs.

    In this last situation the scenario was as follows:
    - everything was working fine yesterday
    - logged in today and found views had query failures
    - checked the Data Sources tab and found CAPC had lost communication with DA Data Source
    - checked the DR and it was working fine
    - checked the DA and it was working fine (I could even go into the DA GUI from CAPC link)
    - restarted the DA
    - went back to Data Sources and now found Test said it was communicating but there was a sync failure
    - tried doing a full resync but still sync failure
    - went to http://dahostname:8581/rest/dataaggregator and noted down the UUID
    - logged on to the CAPC host and opened the netqosportal database in mysql
    - typed: select SourceGUID from data_sources2 where sourceType=262144;
    - the SourceGUID was different from one noted earlier
    - followed the solution to fix both the DA and DC
    - all working again

    So once again what causes the GUID to change?


  • 2.  RE: Why would the DA data source UUID change?

    Posted Oct 28, 2013 08:14 PM

    Hello Andrew,

    Either one of the tasks below can cause this:

    (1) Rename the default tenant;

    (2) Reboot the DR without stopping the DA first.

    Both of the above cases are being addressed in a code fix.



  • 3.  RE: Why would the DA data source UUID change?

    Posted Oct 28, 2013 11:12 PM

    Neither of those applies in my case. I don't have any tenants other than the default tenant, and the DR has been running since August. The only thing I restarted recently was the DA, and that was after I noticed CAPC wasn't communicating with it.

    So I guess the Gremlins did it. :)



  • 4.  RE: Why would the DA data source UUID change?
    Best Answer

    Broadcom Employee
    Posted Oct 28, 2013 11:14 PM

    Andrew:

     

    This is an ongoing issue we are working on with engineering. We hope to have a handle on this one soon.

     

    Regards,

    Joe



  • 5.  RE: Why would the DA data source UUID change?

    Posted Nov 30, 2013 09:39 PM

    I have this exact issue after a Discovery Profile from the Spectrum integration causes the Data Aggregator to crash. The cause of the crash is consistent (the discovery profile), but the broken link between Performance Center and Data Aggregator seems to happen at random.



  • 6.  RE: Why would the DA data source UUID change?

    Posted Jan 12, 2014 10:25 PM

    Hi Andrew1,

    Could I find out how you resolved this issue? We are encountering the same problem (DS_PRODUCT_ID_CHANGED) with our PC<->DA synchronization as well.



  • 7.  RE: Why would the DA data source UUID change?

    Posted Jan 13, 2014 05:14 PM

    Note this solution is for fixing the Data Aggregator data source. It was tested on 2.2.2 with just one data collector. It's just my opinion, but still put in a support ticket even if you follow these instructions, as it gives CA an indication of how frequent the problem is. Also, this is at your own risk, but it was provided by support and has been done twice now on our production system.

     To check if the UUID is wrong:

    • In a web browser go to: http://YourDAName:8581/rest/dataaggregator
      where "YourDAName" is the hostname of your data aggregator server
    • Note down the value of "NpcConnectionUID". It is about 40% of the way down the page
    • Log on to your CAPC server as root or a sudo user
    • Go into mysql
    • Inside mysql type

    use netqosportal;

    select SourceGUID from data_sources2 where sourceType=262144;

    • if the NpcConnectionUID and SourceGUID are different, inside mysql type

    update data_sources2 set SourceGUID='{NpcConnectionUID that you wrote down}' where sourceType=262144;

    • In the CAPC webpage go to Admin / Data Sources
    • Select the Data Aggregator data source
    • Click the Resync button
    • Tick the Full Resync checkbox and click Resync
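
    The comparison in the check above could be scripted along these lines. This is a rough sketch, not a supported tool. The function names are hypothetical, and it assumes the page shows a standard 36-character UUID shortly after the text "NpcConnectionUID" and that the two values must match exactly. The real page layout may differ, so check the extracted value by eye before changing the database.

```python
import re
import urllib.request

# Assumption: the UID follows the text "NpcConnectionUID" and is a standard UUID.
_UID_PATTERN = re.compile(
    r"NpcConnectionUID\W+"
    r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
)


def fetch_da_page(da_host):
    """Download the DA REST page used in the manual steps (port 8581)."""
    url = f"http://{da_host}:8581/rest/dataaggregator"
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_npc_connection_uid(page_text):
    """Return the first UUID found after 'NpcConnectionUID', or None."""
    match = _UID_PATTERN.search(page_text)
    return match.group(1) if match else None


def uids_differ(npc_connection_uid, source_guid):
    """True when the DA value and the value from mysql do not match."""
    return npc_connection_uid.strip() != source_guid.strip()
```

    A typical use would be to call extract_npc_connection_uid(fetch_da_page("YourDAName")) and pass the result to uids_differ together with the SourceGUID returned by the select statement.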

     

    In my case this also affected the Data Collector. To check this, after you have fixed the Data Aggregator:

    • In the Data Aggregator webpage, go to System Status / Data Aggregator
    • In my case this was showing 0 (zero) Data Collectors

    Fixing this requires a reinstall of the DC to reestablish its connection to the DA. Note this is NOT an uninstall and reinstall. It is just a reinstall using the same DC ID that it currently has. Think of it as a “fix broken connection” rather than a reinstall. To do this:

    • Log on to the Data Collector server as root or a sudo user
    • Make sure the DC’s install.bin file is in /tmp
    • Make sure you have root privileges before doing the rest

    IMPORTANT STEP

    • Get the DCM_ID of the currently installed Data Collector

    cat /opt/DataCollector/apache-karaf-2.3.0/etc/com.ca.im.dm.core.collector.cfg

    • Look for the line shown below.  You need everything to the right of the equals:

    collector-manager-id=YourDataCollectorname:f9fd6b2e-adc3-401a-83d1-46854e9a073a

    • Type the following command to ensure you use the correct DCM_ID

    export DCM_ID=YourDataCollectorname:f9fd6b2e-adc3-401a-83d1-46854e9a073a

    • Check that it is set correctly

    echo $DCM_ID

    • Run the Data Collector install

    ./install.bin

    and follow the normal install procedure.
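
    To avoid copy and paste mistakes in the DCM_ID step above, the value could be pulled out of the cfg file with a small helper. This is a minimal sketch: the function name is hypothetical, and it assumes the file uses plain key=value lines like the collector-manager-id line shown above.

```python
def read_collector_manager_id(cfg_text):
    """Return everything to the right of 'collector-manager-id=', or None."""
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip comment lines and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "collector-manager-id":
            return value.strip()
    return None


def export_command(cfg_text):
    """Build the export line for the shell, or None if the key is missing."""
    dcm_id = read_collector_manager_id(cfg_text)
    return None if dcm_id is None else f"export DCM_ID={dcm_id}"
```

    Reading /opt/DataCollector/apache-karaf-2.3.0/etc/com.ca.im.dm.core.collector.cfg and passing its contents to export_command would give the export line to run before ./install.bin. Still compare it against the file by eye.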



  • 8.  RE: Why would the DA data source UUID change?

    Posted Jan 13, 2014 08:23 PM

    Hi Andrew,

    Yup, the UID had changed on my DA. I executed the steps provided by support (which were the same as yours) and it fixed the issue. I'll be checking my Data Collector as well.

    Thanks a lot for the steps and the heads-up on the Data Collector issue.

     

    Regards,

    Murali



  • 9.  RE: Why would the DA data source UUID change?

    Posted Jan 13, 2014 08:54 PM

    Hi Andrew,

    Just to let you know, I checked my Data Collector and it seems to be running fine. I'm on 2.3 for my PC, DA, DR and DC. We did not make any changes to any of our servers or the applications, so I'm not sure what caused the issue. In any case I logged a ticket with CA. Thanks for your help.

    Regards,

    Murali