Yesterday’s Iframe Srcdoc Equals UTF-8 Issue Primer Tutorial’s …
- HTML iframe …
- [iframe].src=data:text/html;base64,[HTMLbase64Content] versus [iframe].srcdoc=atob([HTMLbase64Content]) nuance regarding …
- the display of UTF-8 data
… is just that … nuanced, and hard to explain in terms of “what is the issue” and “what are we trying to achieve”. Yesterday’s attempts may have helped with “what are we trying to achieve”, but we think we can improve on this, as well as on “what is the issue”, by fleshing out the processing and using a favourite reveal tool of ours, the details/summary element combination. We feel that when a user “reveals” a hidden piece of a puzzle, it registers more clearly, because they have been spared the whole complicated picture all at once. At the other extreme, as with yesterday’s work, where there is too little obvious explanation, the user may just pack up and leave without even trying to understand what you are getting at.
Another explanation “friend”, we think, is colour. We use the colours of a traffic light to direct the user’s eyes through the display to the “green” logical conclusion.
Over to you: try the changed fgc_utf_fix.php’s live run to see whether the “reveal” usage helps your understanding.
Previous relevant Iframe Srcdoc Equals UTF-8 Issue Primer Tutorial is shown below.
Do you remember how with Javascript document.querySelectorAll Client Pre-emptive Iframe Tutorial, recently, we said …
Why can’t we manage this new functionality in the one pass through the “onload” event logic? Well, any self-respecting webpage content will contain both apostrophe and double quote characters (let alone line feeds and carriage returns) ( but we can if we can get to a Javascript DOM statement like document.getElementById('ifsd').srcdoc=atob(('' + ioissrc).split(';base64,')[1]).replace('</bo' + 'dy>', ' <style> ' + selectorplusis + '</style> </bo' + 'dy>'); )
? Well, that is true, initializing an iframe’s srcdoc attribute at the same time as the iframe is created can be tricky for HTML data of any complexity. Recently, though, we realized that the …
document.getElementById('ifsd').srcdoc=atob(('' + ioissrc).split(';base64,')[1]).replace('</bo' + 'dy>', ' <style> ' + selectorplusis + '</style> </bo' + 'dy>');
… can be problematic, too, with UTF-8 (Unicode) data (perhaps to do with UTF-16 surrogate pairs (we are not sure)). Of course, this came to light during that recent web application “Testing out document.querySelectorAll” in the blog posting thread owning the blog post above, and with Javascript document.querySelectorAll Textarea Placeholder Tutorial’s penchant for using as an absolute URL (thanks Wikipedia) …
HTTP://www.wikipedia.org/wiki/Einstein
… we discovered it outputting strings like …
Kingdom of WÃ¼rttemberg
… rather than, the better …
Kingdom of Württemberg
… leading us down an “irrelevant PHP file_get_contents encoding problem garden path”. Today’s “proof of concept” fgc_utf_fix.php’s live run simplifies (and thus pares down) the methodologies of that “Testing out document.querySelectorAll” project, decoupling it and putting it back together. That, plus a good hour of calm logical reasoning, led us to deduce that it was not file_get_contents that was the problem but [iframe].srcdoc=[HTMLcontent] causing the issue when that [HTMLcontent] contains UTF-8 Unicode data. That makes sense. The atob() call in that statement hands back a string with one character per byte, so any UTF-8 character that takes more than one byte arrives as two or more mis-mapped characters.
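The mis-mapping can be reproduced without an iframe at all. What follows is a minimal Javascript sketch, assuming a runtime that offers atob, btoa and TextEncoder (modern browsers, or Node 16 and later). The helper name utf8ToBase64 is ours, purely for illustration, and is not taken from fgc_utf_fix.php.

```javascript
// Hypothetical helper: base64-encode the UTF-8 bytes of a string,
// much as a server-side base64 encoder would do with UTF-8 page content.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one character per byte
  }
  return btoa(binary);
}

const b64 = utf8ToBase64('Kingdom of Württemberg');

// atob() returns one character per byte, so the two bytes of "ü"
// (0xC3 0xBC) come back as the two characters "Ã" and "¼".
const viaAtob = atob(b64);
console.log(viaAtob);        // Kingdom of WÃ¼rttemberg
console.log(viaAtob.length); // 23, one more than the 22 characters we started with
```

Assigning a string like viaAtob to [iframe].srcdoc would then display the two-character version, which matches what we saw.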
But then we stumbled upon the excellent Function to fix ut8 special characters displayed as 2 characters (utf-8 interpreted as ISO-8859-1 or Windows-1252) and adapted its PHP code into a Javascript function equivalent that could help put “Humpty Dumpty back together again”. Cute, huh?!
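For comparison, here is a different way to do the same repair in Javascript. To be clear, this is not the function we adapted from that article. It is a swapped-in technique that re-reads the atob() output as UTF-8 bytes via TextDecoder, sketched under the assumption that the runtime supports TextDecoder and that the input really is one character per byte. The name binaryStringToUtf8 is hypothetical.

```javascript
// Hypothetical helper (not the code in fgc_utf_fix.php): re-read an
// atob()-style "binary string" as the UTF-8 bytes it really holds.
function binaryStringToUtf8(binary) {
  // Assumes every character code is in the range 0..255, as atob() output is.
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

// The garbled form: "ü" split into the characters U+00C3 and U+00BC.
const garbled = 'Kingdom of W\u00C3\u00BCrttemberg';
const repaired = binaryStringToUtf8(garbled);
console.log(repaired); // Kingdom of Württemberg
```

In a webpage, the idea would be to wrap the atob() call, along the lines of [iframe].srcdoc=binaryStringToUtf8(atob([HTMLbase64Content])), before any further string replacement.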
If this was interesting you may be interested in this too.