The recent HTML form navigation work mildly gobsmacked us, reminding us that a character like “~” is not changed (i.e. encoded) by JavaScript’s “encodeURIComponent” function. We then figured that, because “~” can get you back to home directories on Linux and Unix, it could presumably appear in the non-argument part of a URL (i.e. the address of your web script). There are other such characters too, so we decided to write a practical tool for you to experiment with.
Let’s show you the PHP “proof of concept” web application code, which is pretty short, below …
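To see the “~” behaviour for yourself before reaching for the web application, here is a minimal JavaScript sketch. The candidate character list is our own choice for illustration, and “encodeStrict” is a hypothetical helper name, not something built into JavaScript.

```javascript
// Punctuation we suspect encodeURIComponent leaves alone (our own candidate list)
const candidates = "-_.!~*'()";

// Keep only the characters that really do come back unchanged
const untouched = [...candidates].filter((ch) => encodeURIComponent(ch) === ch);
console.log(untouched.join(''));

// Hypothetical helper: also percent-encode ! ' ( ) * and ~ after the built-in pass
function encodeStrict(text) {
  return encodeURIComponent(text).replace(/[!'()*~]/g, (ch) =>
    '%' + ch.charCodeAt(0).toString(16).toUpperCase());
}

console.log(encodeURIComponent('~/notes'));
console.log(encodeStrict('~/notes'));
```

Whether you want the stricter form depends on what sits at the receiving end, so treat “encodeStrict” as one option to experiment with rather than a recommendation.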
<?php
// encoding_decoding.php
// User experimentation with encoding and decoding systems
// RJM Programming
// July, 2021
$pfrom="The quick brown fox jumps over the lazy dog and tilde (~) followed closely behind, dragging along 0 Mostel!?";
$pto="";
$pvia="";
$blurb="Please choose an encode or decode option below ...";
if (isset($_POST['from']) && isset($_POST['via'])) {
// Decode the posted values, then treat any remaining '+' as a space
$pfrom=str_replace('+',' ',urldecode($_POST['from']));
$pvia=str_replace('+',' ',urldecode($_POST['via']));
if (strpos(urldecode($_POST['via']), "base64_encode") !== false) {
$pto=base64_encode($pfrom);
$blurb=$pvia;
} else if (strpos(urldecode($_POST['via']), "base64_decode") !== false) {
$pto=base64_decode($pfrom);
$blurb=$pvia;
} else if (strpos(urldecode($_POST['via']), "urlencode") !== false) {
$pto=urlencode($pfrom);
$blurb=$pvia;
} else if (strpos(urldecode($_POST['via']), "urldecode") !== false) {
$pto=urldecode($pfrom);
$blurb=$pvia;
}
}
echo "
<html>
<head>
<script type='text/javascript'>
function consider(sio) {
}
function maybeclient() {
document.getElementById('tvia').value=document.getElementById('tvia').value.toLowerCase().replace('uri','URI').replace('component','Component');
if (document.getElementById('tvia').value == '') {
return false;
} else if (document.getElementById('tvia').value.indexOf('base64') == 0) {
return true;
} else if (document.getElementById('tvia').value.indexOf('url') == 0) {
return true;
} else if (document.getElementById('tvia').value.indexOf('atob') == 0) {
try {
document.getElementById('tto').value=atob(document.getElementById('tfrom').value);
} catch(ecvd) { alert('You have some invalid characters for atob to handle.'); }
} else if (document.getElementById('tvia').value.indexOf('btoa') == 0) {
document.getElementById('tto').value=btoa(document.getElementById('tfrom').value);
} else if (document.getElementById('tvia').value.toUpperCase().indexOf('encodeURIComponent'.toUpperCase()) == 0) {
document.getElementById('tto').value=encodeURIComponent(document.getElementById('tfrom').value);
} else if (document.getElementById('tvia').value.toUpperCase().indexOf('decodeURIComponent'.toUpperCase()) == 0) {
document.getElementById('tto').value=decodeURIComponent(document.getElementById('tfrom').value);
} else if (document.getElementById('tvia').value.toUpperCase().indexOf('encodeURI'.toUpperCase()) == 0) {
document.getElementById('tto').value=encodeURI(document.getElementById('tfrom').value);
} else if (document.getElementById('tvia').value.toUpperCase().indexOf('decodeURI'.toUpperCase()) == 0) {
document.getElementById('tto').value=decodeURI(document.getElementById('tfrom').value);
}
return false;
}
</script>
</head>
<body>
<h1>User experimentation with encoding and decoding systems</h1>
<h3>RJM Programming - July, 2021</h3>
<form onsubmit='return maybeclient();' method=POST action=./encoding_decoding.php>
<table style=width:90%; border=60>
<tr><th>From ... </th><th> ... <input type=submit value=via> ... </th><th> ... To</th></tr>
<tr><td><textarea rows=30 style=width:100%; name=from id=tfrom>" . htmlspecialchars($pfrom, ENT_QUOTES | ENT_SUBSTITUTE) . "</textarea></td><td style=vertical-align:top;text-align:center;><select size=11 name=via id=tvia onchange=consider(this);>
<option value='" . htmlspecialchars(strtoupper($pvia), ENT_QUOTES | ENT_SUBSTITUTE) . "'>" . htmlspecialchars($blurb, ENT_QUOTES | ENT_SUBSTITUTE) . "</option>
<option value='base64_encode'>base64_encode</option>
<option value='base64_decode'>base64_decode</option>
<option value='btoa'>btoa</option>
<option value='atob'>atob</option>
<option value='urlencode'>urlencode</option>
<option value='urldecode'>urldecode</option>
<option value='encodeURIComponent'>encodeURIComponent</option>
<option value='decodeURIComponent'>decodeURIComponent</option>
<option value='encodeURI'>encodeURI</option>
<option value='decodeURI'>decodeURI</option>
</select></td><td><textarea readonly rows=30 style=width:100%; name=to id=tto>" . htmlspecialchars($pto, ENT_QUOTES | ENT_SUBSTITUTE) . "</textarea></td></tr>
<tr><td colspan=3><input id=ende value='Encode or Decode' type=submit></td></tr>
</table>
</form>
</body>
</html>";
?>
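The “atob” branch above sits inside a try/catch because “atob” throws when handed characters outside the base64 alphabet, and “btoa” can throw too, when given characters beyond the Latin-1 range (curly quotes, for instance). Here is a minimal sketch of that behaviour, assuming an environment where “atob” and “btoa” are global (browsers, or a recent Node.js). “safeAtob” and “safeBtoa” are hypothetical helper names of our own.

```javascript
// Hypothetical wrapper: return null rather than throwing on invalid base64
function safeAtob(text) {
  try {
    return atob(text);
  } catch (err) {
    return null;
  }
}

// Hypothetical wrapper: return null when the input has characters beyond Latin-1
function safeBtoa(text) {
  try {
    return btoa(text);
  } catch (err) {
    return null;
  }
}

console.log(safeBtoa('~'));             // fg==
console.log(safeAtob('fg=='));          // ~
console.log(safeAtob('%%%'));           // null, as % is not a base64 character
console.log(safeBtoa('\u201C~\u201D')); // null, as curly quotes are beyond Latin-1
```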
The space character is one to watch. A client side HTML and JavaScript form using “encodeURIComponent” turns any space into %20, but serverside PHP “urlencode” turns a space into the “+” character. This “+” is a legitimate character in data URIs, so we find we can only assume a “urldecode” of passed in characters if they do not contain …
- data URIs
- JavaScript code in the “head” element of a webpage
- some other real use of the “+” (e.g. a mathematical formula, sometimes a north latitude, sometimes a west longitude)
… in which case, coming back from a form to serverside PHP, we often use …
$varis = str_replace('+', ' ', urldecode($_POST['postedfieldname']));
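The same space and “+” story can be sketched on the JavaScript side. We use the standard “URLSearchParams” class to stand in for form style encoding, and “formDecode” is a hypothetical helper of our own that treats “+” as a space before decoding, much like the PHP line above.

```javascript
// Hypothetical helper: '+' becomes a space first, then percent escapes are decoded
function formDecode(text) {
  return decodeURIComponent(text.replace(/\+/g, ' '));
}

const phrase = 'lazy dog ~ 1+1';

// encodeURIComponent: space -> %20, '+' -> %2B, '~' left alone
const viaComponent = encodeURIComponent(phrase);

// Form style encoding: space -> '+', '+' -> %2B, '~' -> %7E
const viaForm = new URLSearchParams({ from: phrase }).toString();

console.log(viaComponent); // lazy%20dog%20~%201%2B1
console.log(viaForm);      // from=lazy+dog+%7E+1%2B1

// Both decode back to the original phrase
console.log(formDecode(viaComponent));
console.log(formDecode(viaForm.slice('from='.length)));

// The catch: a genuine, unencoded '+' is lost
console.log(formDecode('1+1')); // 1 1
```

That last line is why the exceptions listed above matter: any data that carries a real “+” needs to skip this kind of decoding.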