Escaping URIs with encode-for-uri(), iri-to-uri(), and escape-html-uri()

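The function encode-for-uri() escapes characters that can cause problems within a single path segment of a URL address. As described here, all characters are escaped other than the lower-case letters a-z, the upper-case letters A-Z, the digits 0-9, NUMBER SIGN "#", HYPHEN-MINUS "-", LOW LINE "_", FULL STOP ".", EXCLAMATION MARK "!", TILDE "~", ASTERISK "*", APOSTROPHE "'", LEFT PARENTHESIS "(", and RIGHT PARENTHESIS ")". The final XPath 2.0 Functions and Operators Recommendation is stricter: it leaves only the letters, the digits, "-", "_", ".", and "~" unescaped, so a processor conforming to it also escapes "#", "!", "*", "'", "(", and ")", and its result for the first element in the output below would differ accordingly.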
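As a minimal sketch of the typical use of encode-for-uri(), the stylesheet below appends an escaped value to a fixed address. The variable name and the sample strings are hypothetical and not part of this tutorial's example; the sample avoids the punctuation characters on which processor versions may differ. Any XML document can serve as input.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical value that must end up inside one path segment -->
    <xsl:variable name="segment" select="'100% organic'"/>
    <!-- Only the segment is escaped; the fixed prefix is left alone -->
    <xsl:value-of select="concat('http://www.example.com/', encode-for-uri($segment))"/>
  </xsl:template>
</xsl:stylesheet>
```

The result should be http://www.example.com/100%25%20organic: the percent sign becomes %25 and the space becomes %20.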

The function iri-to-uri() should be used for strings intended as complete URL addresses, because it leaves the characters that delimit the parts of an address untouched. All characters are escaped other than the lower-case letters a-z, the upper-case letters A-Z, the digits 0-9, NUMBER SIGN "#", HYPHEN-MINUS "-", LOW LINE "_", FULL STOP ".", EXCLAMATION MARK "!", TILDE "~", ASTERISK "*", APOSTROPHE "'", LEFT PARENTHESIS "(", RIGHT PARENTHESIS ")", SEMICOLON ";", SOLIDUS "/", QUESTION MARK "?", COLON ":", COMMERCIAL AT "@", AMPERSAND "&", EQUALS SIGN "=", PLUS SIGN "+", DOLLAR SIGN "$", COMMA ",", LEFT SQUARE BRACKET "[", RIGHT SQUARE BRACKET "]", and PERCENT SIGN "%".
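Because iri-to-uri() keeps "%" and the delimiters, it can be applied to a whole address, even one that is already partly escaped. The sketch below contrasts it with encode-for-uri() on the same string; the address is a made-up sample, and any XML document can serve as input.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical address: already contains %20, plus a raw space and e-acute (&#233;) -->
    <xsl:variable name="address"
                  select="'http://www.example.com/Los%20Angeles/caf&#233;?q=a b'"/>
    <!-- Delimiters and the existing %20 are kept; only the space and e-acute are escaped -->
    <xsl:value-of select="iri-to-uri($address)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Delimiters are escaped too, and the existing % is escaped a second time -->
    <xsl:value-of select="encode-for-uri($address)"/>
  </xsl:template>
</xsl:stylesheet>
```

The first line of the result should be http://www.example.com/Los%20Angeles/caf%C3%A9?q=a%20b, which is still a usable address. The second should be http%3A%2F%2Fwww.example.com%2FLos%2520Angeles%2Fcaf%C3%A9%3Fq%3Da%20b, which is only suitable as a single segment or parameter value.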

The function escape-html-uri() escapes all characters except the printable characters of the US-ASCII coded character set, specifically the octets ranging from 32 to 126 (decimal). The effect of the function is to escape a URI in the manner in which HTML user agents handle attribute values that expect URIs. Each character of the argument that has to be escaped is replaced by an escape sequence, which is formed by encoding the character as a sequence of octets in UTF-8 and then representing each of these octets in the form %HH, where HH is the hexadecimal representation of the octet. The function must always generate hexadecimal values using the upper-case letters A-F. At the time of writing (22 November 2005) this function was not implemented in Saxon 8.6, which is why the parts of the stylesheet below that call it are commented out.
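The difference from iri-to-uri() is easiest to see on a string that contains both a space and a non-ASCII character. The following is a minimal sketch that assumes a processor which implements escape-html-uri(); the sample string is hypothetical, and any XML document can serve as input.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical path containing a space (octet 32) and e-acute (&#233;, U+00E9) -->
    <xsl:variable name="path" select="'/caf&#233; menu/'"/>
    <!-- Space is printable US-ASCII and stays; e-acute becomes its two UTF-8 octets -->
    <xsl:value-of select="escape-html-uri($path)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- iri-to-uri() escapes the space as well -->
    <xsl:value-of select="iri-to-uri($path)"/>
  </xsl:template>
</xsl:stylesheet>
```

The first line of the result should be /caf%C3%A9 menu/ and the second /caf%C3%A9%20menu/. The character U+00E9 is encoded in UTF-8 as the octets C3 and A9, hence %C3%A9 in both.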

XSLT

      <xsl:stylesheet
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  version="2.0">
            <xsl:output method="xml"
                        indent="yes"
                        omit-xml-declaration="yes"/>

            <xsl:template match="/aaa">
                  <xxx>
                        <xsl:value-of select="encode-for-uri(bbb)"/>
                  </xxx>
                  <yyy>
                        <xsl:value-of select="iri-to-uri(bbb)"/>
                  </yyy>
                  <!-- Disabled: escape-html-uri() was not available in Saxon 8.6 -->
                  <!--zzz>
                        <xsl:value-of select="escape-html-uri(bbb)"/>
                  </zzz-->
                  <ppp>
                        <a href="{concat(iri-to-uri(ccc),encode-for-uri(ddd))}">
                              <xsl:value-of select="concat(ccc,ddd)"/>
                        </a>
                  </ppp>
                  <!-- Disabled for the same reason -->
                  <!--qqq>
                        <a href="{escape-html-uri(concat(ccc,ddd))}">
                              <xsl:value-of select="concat(ccc,ddd)"/>
                        </a>
                  </qqq-->
            </xsl:template>

      </xsl:stylesheet>
XML

      <aaa>
            <bbb>[Are you (john@my.home)!?]</bbb>
            <ccc>/a dir/</ccc>
            <ddd>~$ADD/RESS%</ddd>
      </aaa>
Output

      <xxx>%5BAre%20you%20(john%40my.home)!%3F%5D</xxx>
      <yyy>[Are%20you%20(john@my.home)!?]</yyy>
      <ppp>
            <a href="/a%20dir/~%24ADD%2FRESS%25">/a dir/~$ADD/RESS%</a>
      </ppp>

