Enhance Encode/Decode Data Assertion to add HTML encode/decode

Idea created by Mark.ODonohue Employee on Mar 26, 2018

    When constructing an HTML/XML/JSON document using input taken directly from the user, it is essential to escape HTML special characters to avoid injection attacks.


    For example, if I am building an XML document to send to a backend server via:


    Then if ${item.current} comes directly from user-entered data, it may contain characters that will break the HTML/XML structure. An example of the encoding needed: "bread" & "butter" becomes &quot;bread&quot; &amp; &quot;butter&quot;.
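
    As a rough illustration of the encoding described above, here is a minimal Java sketch. It is not the gateway's implementation: the class name HtmlEncodeSketch and the <item> wrapper element are hypothetical stand-ins chosen for the example, and the apostrophe is written as &#39; on the assumption that the output may also be read by older HTML parsers.

```java
// Sketch only: HtmlEncodeSketch and the <item> element are hypothetical names,
// not part of the Encode/Decode Data Assertion.
final class HtmlEncodeSketch {

    /** Replaces the five XML/HTML special characters with entity references. */
    static String encode(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break; // numeric form instead of &apos;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Pretend this value arrived straight from the user.
        String userInput = "\"bread\" & \"butter\"";
        // Encode before splicing the value into the (hypothetical) XML element.
        System.out.println("<item>" + encode(userInput) + "</item>");
    }
}
```

    Run as is, this prints <item>&quot;bread&quot; &amp; &quot;butter&quot;</item>, so the user's quotes and ampersand can no longer alter the document structure.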


    The encoding we need is similar to that provided by most web platforms. Here is an example, escapeXml from Apache Commons Lang (StringEscapeUtils):



    public static void escapeXml(Writer writer, String str) throws IOException


    Escapes the characters in a String using XML entities.
    For example: "bread" & "butter" => &quot;bread&quot; &amp; &quot;butter&quot;.

    Supports only the five basic XML entities (gt, lt, quot, amp, apos). Does not support DTDs or external entities.
    Note that unicode characters greater than 0x7f are currently escaped to their numerical \\u equivalent. This may change in future releases.


    The Encode/Decode Data Assertion already has many encoding schemes and would seem the right place to include an HTML encode and an HTML decode. Adding these would be relatively simple and would make safe handling of user-entered data easier.





    Cheers - Mark


    The mapping for the HTML/XML special characters is generally:

    >   &gt;

    <   &lt;

    "    &quot;

    &    &amp;

    '     this one is a little trickier, since &apos; is only recognised in XML and later HTML versions, so &#39; is often safer.


    Plus special handling for Unicode characters greater than 0x7f.
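
    To make the mapping concrete, here is a hedged Java sketch of an encode/decode pair that follows the table above. The class HtmlCodecSketch and its method names are hypothetical. It assumes characters above 0x7f are written as decimal numeric character references (&#NNN;), which is one common choice and not necessarily what the assertion would do, and the decoder deliberately understands only the five named entities plus decimal references (hexadecimal &#x...; references are left untouched).

```java
// Sketch only: HtmlCodecSketch is a hypothetical name. Non-ASCII handling
// via decimal numeric references is an assumption made for illustration.
final class HtmlCodecSketch {

    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        // Walk by code point so characters outside the BMP are not split.
        input.codePoints().forEach(cp -> {
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:
                    if (cp > 0x7f) {
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.appendCodePoint(cp);
                    }
            }
        });
        return out.toString();
    }

    static String decode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        // Single left-to-right pass, so "&amp;lt;" decodes to "&lt;", not "<".
        while (i < input.length()) {
            char c = input.charAt(i);
            int semi = (c == '&') ? input.indexOf(';', i) : -1;
            String decoded = (semi > i) ? lookup(input.substring(i + 1, semi)) : null;
            if (decoded != null) {
                out.append(decoded);
                i = semi + 1;
            } else {
                out.append(c); // not a recognised reference: copy through unchanged
                i++;
            }
        }
        return out.toString();
    }

    /** Returns the character for an entity name or decimal reference, or null. */
    private static String lookup(String name) {
        switch (name) {
            case "amp":  return "&";
            case "lt":   return "<";
            case "gt":   return ">";
            case "quot": return "\"";
            case "apos": return "'";
            default:
                if (name.startsWith("#")) {
                    try {
                        int cp = Integer.parseInt(name.substring(1));
                        if (Character.isValidCodePoint(cp)) {
                            return new String(Character.toChars(cp));
                        }
                    } catch (NumberFormatException e) {
                        // not a decimal reference; treated as unrecognised below
                    }
                }
                return null;
        }
    }

    public static void main(String[] args) {
        String raw = "caf\u00e9 & \"cr\u00e8me\"";
        String encoded = encode(raw);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(raw));
    }
}
```

    Decoding in a single pass matters: running separate replace-all steps for each entity can double-decode input such as &amp;lt; into <, which would reopen the hole the encoding was meant to close.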