The last time we talked about 404.php was with the recent Search Engine Crawler Bot Traffic Detection Tutorial, but, perhaps we should reiterate, “what is the usual role of a 404.php file on an Apache/PHP/MySql web server?“ …
404 is the HTTP error number returned when a resource is unable to be located on the server. The PHP code at the top of the above file returns this code to ensure systems such as search engines donβt mistake the page for real content.
We’ve been extending its traditional roles here at our RJM Programming WordPress Blog you are reading. This extension involves background imaging a blog post’s image (that we’ve assigned to each blog post) when that blog post is referenced by any of the grouping of blog post lists such as …
- via tag
- via category
- via search string
- via month designation
We did this, at the end of the line of the logic, via PHP GD image code …
<?php
function createScaledImage($newWidth,$newHeight,$path,$datauri) { // thanks to https://stackoverflow.com/questions/16774521/scale-image-using-php-and-maintaining-aspect-ratio
global $ptitle;
$image_name=explode(DIRECTORY_SEPARATOR, $path)[-1 + sizeof(explode(DIRECTORY_SEPARATOR, $path))];
if (!$datauri) {
if (isset($_GET['random'])) {
if (substr(($_GET['random'] . ' '),0,1) == '0') {
$thist=' ' . time();
$thistminustwo=substr($thist,0,(strlen($thist) - 2));
if (!file_exists('./rjmlist.htm')) {
file_put_contents('./rjmlist.htm', '<body><pre>' . "\n" . $thist . '|' . $_GET['random'] . '|' . $ptitle . "\n</pre></body>");
} else {
$sofarc=file_get_contents('./rjmlist.htm');
$newones=explode($thistminustwo, $sofarc);
$lastiis=0;
if (strpos($sofarc, ' ') !== false) { $lastiis=explode('|',explode(' ', $sofarc)[1])[0]; }
if (sizeof($newones) == 1 && (time() - $lastiis) > 100) {
file_put_contents('./rjmlist.htm', '<body><pre>' . "\n" . $thist . '|' . $_GET['random'] . '|' . $ptitle . "\n</pre></body>");
} else {
$sofarcs=explode('|' . $_GET['random'] . '|', $sofarc);
if (sizeof($sofarcs) > 1) {
$sofarpref=' ' . explode(' ', $sofarcs[0])[-1 + sizeof(explode(' ', $sofarcs[0]))] . '|' . $_GET['random'] . '|' . explode("\n", $sofarcs[1])[0] . "\n";
$sofarc=str_replace($sofarpref, "", $sofarc);
}
file_put_contents('./rjmlist.htm', '<body><pre>' . "\n" . $thist . '|' . $_GET['random'] . '|' . $ptitle . "\n" . str_replace('<body><pre>' . "\n", '', $sofarc));
}
}
}
}
}
$mime = getimagesize($path);
if ($mime['mime']=='image/png') {
$src_img = imagecreatefrompng($path);
}
if ($mime['mime']=='image/jpg' || $mime['mime']=='image/jpeg' || $mime['mime']=='image/pjpeg') {
$src_img = imagecreatefromjpeg($path);
}
if ($mime['mime']=='image/gif') {
//echo 'data:image/gif;base64,' . base64_encode(file_get_contents($path));
# New code below backgrounds animated GIFs with animation …
header(‘Content-Type: image/gif’);
echo file_get_contents($path);
exit;
$src_img = imagecreatefromgif($path);
}
$old_x = imageSX($src_img);
$old_y = imageSY($src_img);
if ($old_x > $old_y) {
$thumb_w = $newWidth;
$thumb_h = $old_y/$old_x*$newWidth;
}
if ($old_x < $old_y) {
$thumb_w = $old_x/$old_y*$newHeight;
$thumb_h = $newHeight;
}
if ($old_x == $old_y) {
$thumb_w = $newWidth;
$thumb_h = $newHeight;
}
$dst_img = imagecreatetruecolor($thumb_w,$thumb_h);
imagecopyresampled($dst_img,$src_img,0,0,0,0,$thumb_w,$thumb_h,$old_x,$old_y);
// New save location
$new_thumb_loc = '/tmp/' . $image_name;
if (!$datauri) {
if($mime['mime']=='image/png') {
header('Content-Type: image/png');
imagepng($dst_img); //,$new_thumb_loc,8);
if (file_exists($new_thumb_loc)) {
unlink($new_thumb_loc);
}
imagedestroy($dst_img);
imagedestroy($src_img);
exit;
} else if ($mime['mime']=='image/jpg' || $mime['mime']=='image/jpeg' || $mime['mime']=='image/pjpeg') {
header('Content-Type: image/jpeg');
imagejpeg($dst_img); //,$new_thumb_loc,80);
if (file_exists($new_thumb_loc)) {
unlink($new_thumb_loc);
}
imagedestroy($dst_img);
imagedestroy($src_img);
exit;
} else if ($mime['mime']=='image/gif') {
header('Content-Type: image/gif');
imagegif($dst_img); //,$new_thumb_loc,80);
if (file_exists($new_thumb_loc)) {
unlink($new_thumb_loc);
}
imagedestroy($dst_img);
imagedestroy($src_img);
exit;
}
exit;
}
$result="";
if ($mime['mime']=='image/png') {
imagepng($dst_img,$new_thumb_loc,8);
$result = file_get_contents($new_thumb_loc);
}
if ($mime['mime']=='image/jpg' || $mime['mime']=='image/jpeg' || $mime['mime']=='image/pjpeg') {
imagejpeg($dst_img,$new_thumb_loc,80);
$result = file_get_contents($new_thumb_loc);
}
if ($mime['mime']=='image/gif') {
imagegif($dst_img,$new_thumb_loc);
$result = file_get_contents($new_thumb_loc);
}
imagedestroy($dst_img);
imagedestroy($src_img);
if (file_exists($new_thumb_loc)) {
unlink($new_thumb_loc);
}
return $result;
}
?>
… but the trouble was, involving PHP GD with animated GIFs lost those GIF images the chance to be animated … and we do not want a depressed animated GIF under any circumstances?!
So we intervened … with the …
<?php
if ($mime['mime']=='image/gif') {
//echo 'data:image/gif;base64,' . base64_encode(file_get_contents($path));
# New code below backgrounds animated GIFs with animation ...
header('Content-Type: image/gif');
echo file_get_contents($path);
exit;
$src_img = imagecreatefromgif($path);
}
?>
… intervention, as it slots in way above, in our 404.php code, an obvious “mood lifter”?!
Previous relevant Search Engine Crawler Bot Traffic Detection Tutorial is shown below.
We came across this good precis of What is the aim of a search engine crawling bot?
A web crawler, spider, or search engine bot downloads and indexes content from all over the Internet. The goal of such a bot is to learn what (almost) every webpage on the web is about, so that the information can be retrieved when it’s needed. They’re called “web crawlers” because crawling is the technical term for automatically accessing a website and obtaining data via a software program.
As you might imagine, if your website is crawled by a search engine, such as Google, your website may need to cater for short periods of more intense interest. Perhaps similar for feeds. Perhaps similar for hacking “denial of service” attacks, alas.
Can we identify the first of those sources of traffic? Well, we did some research and development and got to this excellent PHP advice.
Why, for RJM Programming, do we want to identify the first of those sources of traffic? Well, we think the recent WordPress Blog TwentyTen theme 404.php background image work is asking a lot of our web server, and if there is a burst of traffic, we think it might be adversely affecting the website. Given the aims of a search engine crawling bot, above, we don’t think denying the bot those background images is a huge problem. And so, can we change 404.php to lessen that impost on our web server? We think so, as per …
<?php
function bot_detected() { // thanks to https://stackoverflow.com/questions/677419/how-to-detect-search-engine-bots-with-php
return (
isset($_SERVER['HTTP_USER_AGENT'])
&& preg_match('/bot|crawl|slurp|spider|mediapartners/i', $_SERVER['HTTP_USER_AGENT'])
);
}
?>
… used below …
<?php
$uparts=explode("/", $_SERVER['REQUEST_URI']);
if (sizeof($uparts) >= 2) {
if (trim(explode('#',explode('?',$uparts[-1 + sizeof($uparts)])[0])[0]) == '') {
$ioff=-1;
}
if (1 == 1 || ('' . $_SERVER['QUERY_STRING']) == '') {
$usz=sizeof($uparts);
if (str_replace('?' . $_SERVER['QUERY_STRING'],'',trim($uparts[-1 + sizeof($uparts)])) == '') { $usz--; }
if ($usz == 3 && strpos($uparts[-1 + $usz], "%20") !== false || strpos($uparts[-1 + $usz], "+") !== false) { // fix /ITblog/Linux%20mailx%20Primer%20Tutorial/ 18/1/2022 RM
$oky=true;
if (substr(trim($uparts[$ioff - 1 + sizeof($uparts)]) . ' ',0,1) >= '0' && substr(trim($uparts[$ioff - 1 + sizeof($uparts)]) . ' ',0,1) <= '9') {
if (substr(trim($uparts[$ioff - 2 + sizeof($uparts)]) . ' ',0,1) >= '0' && substr(trim($uparts[$ioff - 2 + sizeof($uparts)]) . ' ',0,1) <= '9') {
$oky=false;
}
}
if ($oky) {
if (('' . $_SERVER['QUERY_STRING']) == '') {
header('Location: ' . str_replace('~``','/ITblog/',str_replace('/','',str_replace('/ITBLOG/','~``',str_replace('/itblog/','~``',str_replace('/ITblog/','~``',str_replace('--','-',str_replace('---','-',str_replace('+','-',str_replace('%20','-',$_SERVER['REQUEST_URI']))))))))));
} else {
header('Location: ' . explode('?',str_replace('~``','/ITblog/',str_replace('/','',str_replace('/ITBLOG/','~``',str_replace('/itblog/','~``',str_replace('/ITblog/','~``',str_replace('--','-',str_replace('---','-',str_replace('+','-',str_replace('%20','-',$_SERVER['REQUEST_URI']))))))))))[0] . '?' . $_SERVER['QUERY_STRING']);
}
exit;
}
}
}
if (str_replace("category","cat",strtolower($uparts[-2 + sizeof($uparts)])) == "cat" || strtolower($uparts[-2 + sizeof($uparts)]) == "category") {
$catsare=["","Not Categorised","Ajax","Android","Animation","Anything You Like","Code::Blocks","Colour Matching","Data Integration","Database","Delphi","Eclipse","eLearning","ESL","Event-Driven Programming","Games","GIMP","GUI","Hradware","Installers","iOS","Land Surveying","Moodle","Music Poll","NetBeans","Networking","News","ontop","OOP","Operating System","Photography","Projects","Signage Poll","Software","SpectroPhotometer","Tiki Wiki","Trips","Tutorials","Uncategorized","Visual Studio","Xcode"];
for ($ibh=1; $ibh<sizeof($catsare); $ibh++) {
if (explode("&",strtolower($uparts[-1 + sizeof($uparts)]))[0] == strtolower($catsare[$ibh])) {
if (strtolower($catsare[$ibh]) == "ontop") {
header('Location: https://www.rjmprogramming.com.au/ITblog/category/' . str_replace(" ","-",explode("&",strtolower($uparts[-1 + sizeof($uparts)]))[0])) . '#' . $ibh;
} else {
header('Location: https://www.rjmprogramming.com.au/ITblog/category/' . str_replace(" ","-",explode("&",strtolower($uparts[-1 + sizeof($uparts)]))[0])) . '#' . $ibh;
}
} else if (explode("&",strtolower($uparts[-1 + sizeof($uparts)]))[0] == ('' . $ibh)) {
if (strtolower($catsare[$ibh]) == "ontop") {
header('Location: https://www.rjmprogramming.com.au/ITblog/?cat=' . str_replace(" ","-",explode("&",strtolower($uparts[-1 + sizeof($uparts)]))[0])) . '#' . $ibh;
} else {
header('Location: https://www.rjmprogramming.com.au/ITblog/?cat=' . str_replace(" ","-",explode("&",strtolower($uparts[-1 + sizeof($uparts)]))[0])) . '#' . $ibh;
}
}
}
} else if (!bot_detected() && substr(trim($uparts[$ioff - 1 + sizeof($uparts)]) . ' ',0,1) >= '0' && substr(trim($uparts[$ioff - 1 + sizeof($uparts)]) . ' ',0,1) <= '9') {
if (substr(trim($uparts[$ioff - 2 + sizeof($uparts)]) . ' ',0,1) >= '0' && substr(trim($uparts[$ioff - 2 + sizeof($uparts)]) . ' ',0,1) <= '9') {
$uwidth=trim($uparts[$ioff - 2 + sizeof($uparts)]);
$uheight=trim(explode('#',explode('?',$uparts[$ioff - 1 + sizeof($uparts)])[0])[0]);
$imfnameafterdomainsep="random_background_fadeinout.jpg";
$ptitle="Random Background Webpage Fade Tutorial";
selectNewBlogPostingTutorialPicture();
$postingiurl=explode('ITblog' . DIRECTORY_SEPARATOR, dirname(__FILE__) . DIRECTORY_SEPARATOR)[0] . $imfnameafterdomainsep;
list($iwidth, $iheight, $itype, $iattr) = getimagesize($postingiurl);
$amime = getimagesize($postingiurl);
if ($ioff == 0) {
//header('Content-Type: image/jpeg');
echo "<html>" . $tonl . "<body" . $bonl . " style=\"background:linear-gradient(rgba(255,255,255,0.7),rgba(255,255,255,0.7)),Url('data:image/jpeg;base64," . base64_encode(createScaledImage($uwidth,$uheight,$postingiurl,true)) . "#" . str_replace('+','%20',urlencode($ptitle)) . "') 0px 30px no-repeat;background-size:contain;background-repeat:no-repeat;background-position:0px 30px;\"><pre>data:image/jpeg;base64," . base64_encode(createScaledImage($uwidth,$uheight,$postingiurl,true)) . "#" . str_replace('+','%20',urlencode($ptitle)) . "</pre><br><iframe id=preif style='display:none;width:100%;height:1200px;' src=''></iframe><br><img onclick=\"document.getElementsByTagName('pre')[0].click();\" id=moimg style='display:none;border-width: 28px;border-style: solid; border-image: linear-gradient(to right, lightblue, lightgreen) 1;' src='data:image/jpeg;base64," . base64_encode(createScaledImage($uwidth,$uheight,$postingiurl,true)) . "#" . str_replace('+','%20',urlencode($ptitle)) . "'></img></body></html>";
} else if (1 == 2) {
//header('Content-Type: image/jpeg');
echo '<img src="' . "data:image/jpeg;base64," . base64_encode(file_get_contents($postingiurl)) . "#" . str_replace('+','%20',urlencode($ptitle)) . '"></img>';
} else {
createScaledImage($uwidth,$uheight,$postingiurl,false); //imagecreatefromjpeg($postingiurl);
}
exit;
}
}
}
?>
… as a means of differentiating users “surfing the net” from “search engine crawling bot” web traffic to our website’s WordPress blog (affecting search and tag and category and month list and day list query URLs) you are reading.
If this was interesting you may be interested in this too.
If this was interesting you may be interested in this too.