Publish APM Data to ElasticSearch (and now Splunk!)

Idea created by tnoonan on Jul 3, 2015

    In every large CA APM environment I have worked in, a LOT of agents, and thus a lot of agent data, eventually leads to performance issues with the cluster.  My current organization has multiple parties trying to re-surface our performance data, through static Cognos reports, third-party dashboarding technologies, and capacity management reporting.  Every one of these requests takes a toll on the collector architecture, especially when one of those third parties comes in and tries to query, say, 30 days' worth of data.

     

    While researching solutions, I came across the ELK stack from Elastic.co, specifically the ElasticSearch component of the stack.  It's a document-based data store that is super fast and super scalable.  So my thought was to publish data, in real time, to ElasticSearch.  That document repository could then be used at will by the third parties wanting historical performance data.  They can beat it up as much as they like, all the while our EM cluster is calmly pushing out 1-minute data to this data store, which reduces the hit we take when a large query comes through.  I'm not trying to boil the ocean and send "every" metric available.  We should only be going after the KPIs that are important to the organization, such as:

     

    • CPU, Memory, Workload for Capacity management solutions
    • CEM RTTM data
    • Frontend/Backend high level data
    • whatever is important to your organization

     

    I've attached a zip file that contains my first pass at an integration that does just this: it publishes CA APM data to an ElasticSearch node via simple REST calls.  The zip file contains the following:

     

    1. Readme document providing the nuts and bolts of the integration
    2. Jar file to be included on the classpath of the MOM and all collectors
    3. JavaScript example files for publishing certain metrics
    4. ElasticSearch mapping file that corresponds to this integration
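
    To give a feel for what "simple REST calls" means here, below is a minimal Python sketch of turning one metric data point into a JSON document and preparing the POST that indexes it.  This is not the code in the jar.  The field names, the "metric" document type and the localhost URL are my assumptions for illustration, and the /index/type URL form matches the older ElasticSearch releases that were current when this was written.

```python
import json
import urllib.request
from datetime import datetime, timezone


def metric_document(agent, metric, value, minimum, maximum, timestamp):
    """Build one JSON-serialisable document for a single metric data point.

    The field names are illustrative assumptions, not the integration's schema.
    """
    return {
        "agent": agent,
        "metric": metric,
        "value": value,
        "min": minimum,
        "max": maximum,
        "@timestamp": timestamp.astimezone(timezone.utc).isoformat(),
    }


def build_index_request(base_url, index, doc_type, document):
    """Prepare (but do not send) an HTTP POST that indexes one document."""
    url = f"{base_url.rstrip('/')}/{index}/{doc_type}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

    Passing the prepared request to urllib.request.urlopen() would actually send it; the sketch stops short of that so it can be read and tried without a running node.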

     

    --------- Update on 7/27/2015  ------------------

    v3.0 of extension to publish data to ElasticSearch

    1. Changed the ElasticSearch mapping to a template that supports daily creation of indices.  Prior releases created only a single index, "app_performance", which made it difficult to purge data.  Now a new index is created each day, based on the current time.

    2. The new template mapping stores much less data in the document: only value, min and max.  All other elements are still searchable, but the metric's values are the only thing I can think of that is important to re-display.
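
    As a rough illustration of the two changes above, here is a Python sketch of how a per-day index name and a matching template body could be built.  The "prefix-YYYY.MM.DD" naming, the "metric" type name and the use of _source includes to keep only value, min and max are my assumptions about one way to get this behavior; the shipped mapping file may differ.  The "template" key and per-type mappings follow the older ElasticSearch template format, and newer releases use "index_patterns" instead.

```python
from datetime import datetime, timezone


def daily_index_name(prefix, when):
    """Return the index for the UTC day containing `when` (prefix-YYYY.MM.DD)."""
    return f"{prefix}-{when.astimezone(timezone.utc):%Y.%m.%d}"


def index_template(prefix, doc_type="metric"):
    """Build a template body that applies to every daily index for `prefix`.

    Only value, min and max are kept in the stored document (_source);
    other fields are still indexed, so they remain searchable.
    """
    return {
        "template": f"{prefix}-*",  # legacy key; assumption, see note above
        "mappings": {
            doc_type: {
                "_source": {"includes": ["value", "min", "max"]},
                "properties": {
                    "@timestamp": {"type": "date"},
                    "value": {"type": "long"},
                    "min": {"type": "long"},
                    "max": {"type": "long"},
                },
            }
        },
    }
```

    With one index per day, purging old data becomes a matter of deleting whole indices rather than deleting documents out of a single large one.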

    ----------------------------------------------------------- end update------

     

    --------- Update on 9/7/2015  ------------------

    v3.1 of extension to publish data to ElasticSearch

    1. Replaced the one-to-one relationship between each metric pulled and a JSON POST request to ElasticSearch.  The extension now pulls metrics in bulk and then sends a single POST request to ES.  This doesn't speed up pulling data from SmartStor, but the separate thread spawned by the process no longer has to manage potentially hundreds or thousands of separate connections to the ES data store.
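
    For anyone unfamiliar with the ElasticSearch bulk API, the single request body is newline-delimited JSON: an action line followed by the document itself, repeated per document, with a trailing newline.  The Python sketch below builds such a body.  The function name, index name and "metric" type are assumptions for illustration, and the "_type" entry applies to the older releases that still had mapping types.

```python
import json


def bulk_body(index, doc_type, documents):
    """Build a _bulk request body that indexes every document in one POST.

    Each document contributes two lines: the action and the source.
    """
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    lines = []
    for doc in documents:
        lines.append(action)
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

    The resulting string would be POSTed once to the node's /_bulk endpoint, so a batch of 1000 metrics costs one connection instead of 1000.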

    ----------------------------------------------------------- end update------

    --------- Update on 3/4/2017  ------------------

    I know this is really not the right venue to officially submit this integration to the community...but whatever...use it...don't use it...BUT I've added functionality to do the same kind of data spooling to Splunk that I was already doing for Elastic.  Setup is exactly the same: deploy the jar file and JavaScript files and BAM!  You've got data going to another data source.  For one of my main customers we are sending around 12k metrics every 5 minutes to ElasticSearch to keep SLA data for an extended period of time.  I've got 18+ months and counting of Elastic indexes with general availability data, including CEM and ADA.  You can send off really anything that reports to the Investigator tree.
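
    The post above doesn't say which Splunk input the extension uses.  Purely as an illustration, and assuming Splunk's HTTP Event Collector (HEC) is the target, the same metric documents could be wrapped and batched as in the Python sketch below.  The function names, the "apm:metric" sourcetype, the URL and the token are all hypothetical.

```python
import json
import urllib.request


def hec_event(document, epoch_seconds, sourcetype="apm:metric"):
    """Wrap one metric document in the HEC event envelope."""
    return {"time": epoch_seconds, "sourcetype": sourcetype, "event": document}


def build_hec_request(base_url, token, events):
    """Prepare (but do not send) one POST carrying a batch of HEC events.

    Assumes HEC is the input in use; the jar may send data differently.
    """
    body = "\n".join(json.dumps(event) for event in events).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/services/collector/event",
        data=body,
        method="POST",
        headers={"Authorization": f"Splunk {token}"},
    )
```

    As with the bulk request to ElasticSearch, batching several events into one POST keeps the number of connections low.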

    ----------------------------------------------------------- end update------
