"On the 4th day of 8.0 the CA Team gave to me,
FOUR opaque data types processed"
We have reached the 4th day of DevTest 8.0 and this gift is simply magic. I gave a high-level description of Opaque Data Processing (ODP) in my blog before CA World, covering how it removes the need for a subject matter expert on new or unique protocols by using byte-level matching to find relationships for request matching and response generation.
So today, let's get a bit more in depth on how Opaque Data Processing (ODP) works and what to use it for.
Service Virtualization Today and Its Challenges:
It all begins with how Service Virtualization works and something I like to call the DPH Challenge. Service virtualization can be summed up in 3 steps: Capture, Process and Model.
- Capture is the step where you gather the requests and matching responses in the virtual service. Transactions can be captured in a few ways, such as supplying a pre-defined specification (WADL/WSDL), reading a set of log files or packet captures, or even recording the transactions occurring on a live system.
- Process is the key to the puzzle. It's what makes service virtualization something more than just stubbing and mocking. In the process step, the protocol of the transactions is detected and CA Service Virtualization uses a Data Protocol Handler (DPH) to process the syntax and semantics of that data. Headers are identified, operations and arguments are split out, de-identification is applied and business rules are identified (through magic strings/dates).
- Model is the final stage where the virtual service gets its routing steps. This can be simple or complex, but we'll leave that detail to yesterday's blog on execution modes.
Opaque Data Processing (ODP) addresses the "Process" step and solves "The DPH Challenge". You see, the Data Protocol Handler is the brains of the operation. Without it, the matching of requests and responses would be completely linear: one to one. If there isn't an exactly matching request in the virtual service, you do not get a response. In order to build a DPH that can "translate" the messages to find the best match, you must have a subject matter expert who can determine the syntax, semantics, and synchronization of the transactions. In many cases, with legacy technology or proprietary protocols, this Subject Matter Expert (SME) has left the company or simply doesn't exist.
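To make the "completely linear" case concrete, here is a minimal Python sketch of matching without a DPH. The recorded transactions and the `exact_match_response` helper are made up for illustration; they are not taken from CA Service Virtualization.

```python
# Hypothetical recorded transactions: request bytes -> response bytes.
recorded = {
    b"GET acct=1001": b"OK acct=1001 balance=50",
    b"GET acct=1002": b"OK acct=1002 balance=75",
}


def exact_match_response(request: bytes):
    """Return a recorded response only if the request matches byte for byte."""
    return recorded.get(request)


# A request seen during recording gets its response back.
print(exact_match_response(b"GET acct=1001"))
# A request that differs by a single byte gets nothing (None).
print(exact_match_response(b"GET acct=1003"))
```

One changed byte is enough to miss, which is exactly the gap a DPH (or ODP) is meant to close.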
ODP addresses the DPH Challenge by using patented algorithms to automatically find the relationships inside nearly any data source, radically reducing the time required to create virtual services. By using the Needleman-Wunsch sequence alignment algorithm (best known from genome sequencing), ODP can discover byte-level patterns in messages and match those messages to the closest ones it has already seen. If we want to get technical, it calculates a distance function between the incoming request and each recorded request to find which one most closely matches. It also finds the bytes that appear in both the original request and its response, and carries the corresponding bytes from the newly seen request into the response it sends. This is all the magic string functionality without having to parse the request.
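For the curious, the closest-match idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the scoring values (match +1, mismatch -1, gap -1), the `distance` definition and the sample requests are hypothetical, and ODP's actual distance function is not described here. The sketch also covers only request matching, not the step that copies bytes into the response.

```python
def nw_score(a: bytes, b: bytes, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch global alignment score of two byte strings (linear gap cost)."""
    # prev[j] holds the best score for aligning the bytes of `a` consumed so far with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def distance(a: bytes, b: bytes) -> int:
    """Assumed distance: 0 for identical messages, larger for less similar ones."""
    return max(len(a), len(b)) - nw_score(a, b)


def closest_request(incoming: bytes, recorded_requests):
    """Pick the recorded request with the smallest distance to the incoming one."""
    return min(recorded_requests, key=lambda r: distance(incoming, r))


# Hypothetical recorded transactions: request bytes -> response bytes.
recorded = {
    b"GET acct=1001": b"OK balance=50",
    b"PUT acct=1001 amt=5": b"OK updated",
}
best = closest_request(b"GET acct=1003", recorded)
print(best, recorded[best])
```

Note that the table-filling loop is quadratic in message length for every comparison, which hints at why matching against a large data set can get slow.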
After this initial discovery, we found two challenges within ODP: (1) the matching can be slow on a large data set, and (2) the matching may not be completely accurate, because it does not take into account which parts of a message vary and which do not. For example, in a DPH, we would identify the "Operation" first and use that to find an initial match. With ODP, we lose that speed and accuracy; instead, every byte is equal. To increase the accuracy and speed of ODP, Entropy Weighting was born. This is a technique that we coupled with ODP to identify the bytes in a message that may be a header or an operation. It strips out the headers because of their high level of similarity with other messages, and it identifies the operation as a set of bytes found 20-30 percent of the time (for example). Then, during matching, the first thing it tries is to match on these bytes (instead of the full string of bytes at once). With this implementation we've found the following accuracy statistics on some already known protocols pushed through ODP:
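As an aside on the technique itself (not the accuracy figures), here is a rough Python sketch of the entropy idea. Everything in it is an assumption on my part: the fixed-position toy messages, the 1.0-bit threshold, the "header"/"operation"/"payload" labels and the helper names are hypothetical, and the real Entropy Weighting implementation is not described in this post.

```python
import math
from collections import Counter


def position_entropy(messages):
    """Shannon entropy, in bits, of the byte values observed at each position."""
    total = len(messages)
    length = min(len(m) for m in messages)
    entropies = []
    for pos in range(length):
        counts = Counter(m[pos] for m in messages)
        entropies.append(sum((c / total) * math.log2(total / c) for c in counts.values()))
    return entropies


def label_positions(entropies, op_max_bits=1.0):
    """Constant bytes look like headers, low-variety bytes like operations, the rest like payload."""
    labels = []
    for h in entropies:
        if h == 0.0:
            labels.append("header")
        elif h <= op_max_bits:
            labels.append("operation")
        else:
            labels.append("payload")
    return labels


def mismatches(a, b):
    """Count positions where two equal-length messages differ."""
    return sum(x != y for x, y in zip(a, b))


def match_operation_first(incoming, recorded, labels):
    """Narrow candidates to the same operation bytes, then pick the closest one."""
    op_positions = [i for i, label in enumerate(labels) if label == "operation"]

    def op_bytes(message):
        return bytes(message[i] for i in op_positions)

    same_op = [m for m in recorded if op_bytes(m) == op_bytes(incoming)]
    return min(same_op or recorded, key=lambda m: mismatches(m, incoming))


# Toy messages: 3-byte constant header, 1-byte operation (R or W), 4-byte payload.
recorded = [b"HDRR1001", b"HDRR2417", b"HDRW3852", b"HDRW4061"]
labels = label_positions(position_entropy(recorded))
incoming = b"HDRW1001"

print(labels)
print(min(recorded, key=lambda m: mismatches(m, incoming)))  # every byte treated equally
print(match_operation_first(incoming, recorded, labels))     # operation bytes matched first
```

In this toy data, treating every byte equally picks a recorded "R" message for an incoming "W" request, while matching the operation byte first keeps the match within the "W" messages and searches a smaller candidate set.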
So at this point you're likely boggled by the complexity but excited by the true simplicity of this innovative replacement for the DPH. You're likely asking yourself: "So where do I use this?", "How do I access it?" and "Why do I even need DPHs in the future?"
Let's start with this: ODP provides a turnkey solution to virtualize protocols otherwise not supported by CA Service Virtualization. It's best used when you need reasonable data and you do not need to "force" specific behavior (like negative test scenarios).
ODP doesn't work with encrypted messages (you'd have to decrypt first for us to find the sequence of bytes). It doesn't detect stateful transactions, nor does it support magic dates. The content of the service image that is captured in the ODP process is often not human readable, and therefore you cannot easily edit that content unless you can translate EBCDIC off the top of your head.
Given these circumstances, we feel there is a clear use for DPH and a clear use for ODP, and we will be keeping both technologies alive so they can complement each other in matching requests and generating their corresponding responses. Ideally, the future would allow us to take the matching behavior learned by ODP and apply it to a more accurate, easier-to-read DPH with a few clicks!
The Simple Stuff:
Using ODP is a breeze, though it is not in the new DevTest Portal just yet. Instead, you can access ODP under the "Transport Protocol" area in the 1st stage of recording a live virtual service in the DevTest Workstation (picture below):
ODP is an exciting project to be a part of. Not only was it initially described as a "dream" by our team before becoming a reality, it was also developed by an up-and-coming PhD candidate, Miao Du, from Swinburne University of Technology, Australia. I just wanted to give a big thank you to the full team that contributed to defining the problem we wanted to solve, finding a solution (in the most unlikely area) and then getting this innovation into our product. I hear they have more to come in the future as well!
Now that's my final gift to you from the CA Service Virtualization team. Tomorrow I will hand the blogging reins over to the CA Continuous Application Insight team for the 5th day of DevTest.
The 12 Days of DevTest Blog Series