Jul 04

Running dozens or hundreds of websites can be an expensive business: hosting alone can cost upwards of $250 a year for each site. SEOs will often look to host each website on a different IP range or in a totally different country. You may want to host country-specific sites in their target country, but not want to replicate your CMS or custom platform on every hosting server. A slightly darker reason for spreading your sites across different hosts would be to manage micro-sites that appear independent and use them to point link juice at your main website.

One experiment I made last year was a script to auto-upload and update WordPress blogs no matter where they were hosted. Most web hosts don’t offer SSH access though, and managing a large number of files via a PHP FTP class can get very messy. This also didn’t reduce costs, as each hosting account still required a MySQL database and all the prerequisites for running WP.

A better solution comes in the form of copying the way that web proxies work. You’ve probably come across web proxies when investigating duplicate/scraped content; they basically allow someone to view a web page without leaking their browser, location or IP details to the website owner or their analytics provider. They’re mostly used by school kids to circumvent website restrictions, the tin-foil hat brigade, and by those living in countries with repressive regimes. The part that we’re interested in though is how the script visits a URL, temporarily stores the HTML, images plus JS/CSS and then displays the page without ever leaving the proxy site. Any web page can be viewed on the proxy site’s domain, without the proxy user noticing any difference.

Using these principles, I’m going to show you how you can manage all of your websites on a single, good quality, hosting account, whilst making them appear to be hosted on different hosts.

Step 1
Find a really reliable web host with high uptime and powerful servers. I use VPS Link for my VPSs and LiquidWeb for my hosting accounts. The larger your planned network will be, the more you should invest in this single site hosting. This will be our “hub site”.

Step 2
Create a simple Content Management System that can load different templates and content based on the URL parameter passed to it (e.g. ?site=1). Alternatively you can use an existing CMS and adapt it to load data from different databases, depending on the variable sent to it. For WordPress you can make the following change to wp-config.php:

define('DB_NAME', 'database'); // The name of the database
define('DB_USER', 'username'); // Your MySQL username
define('DB_PASSWORD', 'password'); // …and password
define('DB_HOST', 'localhost'); // 99% chance you won’t need to change this value

To:

// Pick the database from the ?site= parameter; fall back to the default when it is missing
switch (isset($_GET['site']) ? $_GET['site'] : '') {
    case "1":
        define('DB_NAME', 'database1'); // The name of the database
        define('DB_USER', 'username1'); // Your MySQL username
        define('DB_PASSWORD', 'password1'); // …and password
        define('DB_HOST', 'localhost'); // 99% chance you won’t need to change this value
        break;
    case "2":
        define('DB_NAME', 'database2'); // The name of the database
        define('DB_USER', 'username2'); // Your MySQL username
        define('DB_PASSWORD', 'password2'); // …and password
        define('DB_HOST', 'localhost'); // 99% chance you won’t need to change this value
        break;
    case "3":
        define('DB_NAME', 'database3'); // The name of the database
        define('DB_USER', 'username3'); // Your MySQL username
        define('DB_PASSWORD', 'password3'); // …and password
        define('DB_HOST', 'localhost'); // 99% chance you won’t need to change this value
        break;
    default:
        define('DB_NAME', 'database'); // The name of the database
        define('DB_USER', 'username'); // Your MySQL username
        define('DB_PASSWORD', 'password'); // …and password
        define('DB_HOST', 'localhost'); // 99% chance you won’t need to change this value
        break;
}
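
The switch above gets long once you pass a handful of sites. As a rough sketch only, the same choice could be made from a lookup table. The helper name db_settings_for_site() and the array layout are my own assumptions for illustration, not part of WordPress:

```php
<?php
// Hypothetical helper: return the database settings for a site ID,
// or the default settings when the ID is not in the table.
function db_settings_for_site($siteId, array $sites, array $default)
{
    return isset($sites[$siteId]) ? $sites[$siteId] : $default;
}

// Placeholder credentials, matching the switch example above.
$default = array('name' => 'database', 'user' => 'username', 'password' => 'password', 'host' => 'localhost');
$sites = array(
    '1' => array('name' => 'database1', 'user' => 'username1', 'password' => 'password1', 'host' => 'localhost'),
    '2' => array('name' => 'database2', 'user' => 'username2', 'password' => 'password2', 'host' => 'localhost'),
    '3' => array('name' => 'database3', 'user' => 'username3', 'password' => 'password3', 'host' => 'localhost'),
);

// Only accept a plain string value for ?site=; anything else selects the default.
$requested = (isset($_GET['site']) && is_string($_GET['site'])) ? $_GET['site'] : '';
$settings = db_settings_for_site($requested, $sites, $default);

// In wp-config.php these values would then feed the usual constants, e.g.:
// define('DB_NAME', $settings['name']);
// define('DB_USER', $settings['user']);
// define('DB_PASSWORD', $settings['password']);
// define('DB_HOST', $settings['host']);
```

Adding a site then means adding one row to $sites rather than another case block.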

I’ll be releasing a free Open Source (GPL) application shortly which will manage this and your blogs for you using a simple browser-based GUI.

Step 3
Register as many free or cheap hosting accounts as you’d like; each will display a unique website. You’d be surprised how many companies offer free hosting in exchange for a simple advertisement or link on your site. Each host must support PHP and use Apache Web Server (this qualifies 99% of web hosts). No MySQL databases or special PHP functions are needed. These will host our “node sites”.

Step 4
Now this is where the magic happens. We’re going to upload just 2 files to each of our new node site hosting accounts. After this, you’ll never have to FTP into or touch the hosting account again thanks to our proxy technique. The files that we’ll be uploading are a .htaccess file and a PHP script. I’ve reduced this technique down to just needing a .htaccess file in the past, although this requires the mod_proxy module, which most free or cheap hosting companies will not support (Hat tip to Mike for pointing this out).

Create a simple text file in Notepad or equivalent named .htaccess and paste in the following:

Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/blog-app.php
RewriteRule ^(.*)$ blog-app.php?secret=monkey&u=$1 [PT,L,NE,QSA]

Once uploaded, this file will activate the mod_rewrite module in Apache (it’s the module that usually handles your 301 redirects) and point any requests (other than for /blog-app.php) to /blog-app.php?secret=monkey&u=$1 without changing the URL in the user’s browser. You’ll notice 2 variables in the URL we’re pointing to: the first (secret) is a security key that I’d recommend changing to a unique alphanumeric password, and the second (u) is where $1 gets replaced with the URL path requested. For example, http://www.example.com/dir/test.html gets internally pointed to http://www.example.com/blog-app.php?secret=monkey&u=dir/test.html.
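
If the rewrite is hard to picture, here is a small PHP sketch of the same mapping. Apache does this for you; the function rewrite_target() is purely a hypothetical illustration, it assumes the path starts with a slash, and it ignores any query string (which the QSA flag would append in the real rule):

```php
<?php
// Hypothetical illustration of what the RewriteCond/RewriteRule pair does.
function rewrite_target($requestPath, $secret)
{
    // The RewriteCond leaves requests for the proxy script itself untouched.
    if (strpos($requestPath, '/blog-app.php') === 0) {
        return $requestPath;
    }
    // In a .htaccess file the leading slash is not part of what (.*) captures,
    // so $1 is the requested path without it.
    $u = (string) substr($requestPath, 1);
    return '/blog-app.php?secret=' . $secret . '&u=' . $u;
}

$target = rewrite_target('/dir/test.html', 'monkey');
// $target is "/blog-app.php?secret=monkey&u=dir/test.html"
```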

Step 5
Now the next bit of code is a big’en! It’s work-in-progress and therefore a little messy, but it basically acts as our proxy. After checking that the request is genuine (using the secret variable), it will proceed to grab content from our hub site. So http://www.example.com/dir/test.html goes to http://www.example.com/blog-app.php?secret=monkey&u=dir/test.html, which requests content, images and files from http://www.hubsite.com/dir/test.html?site=2.
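
Before the full script, a minimal sketch of just the URL-building step may help. The function name hub_url() is hypothetical and the sketch leaves out the script’s special case for image files; it only shows how the site ID and any extra query parameters end up on the hub URL:

```php
<?php
// Hypothetical helper: build the hub-site URL for one incoming request.
// $get stands in for $_GET as the proxy script would see it.
function hub_url($proxyTo, $siteId, array $get)
{
    $u = (isset($get['u']) && is_string($get['u'])) ? $get['u'] : '';
    // The site ID always comes first.
    $query = array('site=' . urlencode($siteId));
    foreach ($get as $name => $value) {
        // Skip the proxy's own parameters and anything that is not a plain string.
        if ($name !== 'secret' && $name !== 'u' && $name !== 'site' && is_string($value)) {
            $query[] = urlencode((string) $name) . '=' . urlencode($value);
        }
    }
    return $proxyTo . $u . '?' . implode('&', $query);
}

$example = hub_url('http://www.hubsite.com/', '2', array('secret' => 'monkey', 'u' => 'dir/test.html'));
// $example is "http://www.hubsite.com/dir/test.html?site=2"
```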

Create a new plain text file called blog-app.php and paste in…

<?php
error_reporting(0);

// BEGIN VARIABLES
$secret = "monkey"; // Should be the same secret key as in the .htaccess
$proxyto = "http://www.hubsite.com/"; // Where to proxy to
$proxyfrom = "http://www.example.com/"; // Where to proxy from
$siteid = "2"; // This Website's ID
// END VARIABLES

if ($_GET['secret'] == $secret)
{
    // Build the query string for the hub site: the site ID first, then any
    // other parameters except the ones the proxy uses internally
    $q[] = "site=".$siteid;
    foreach($_GET as $name => $value)
    {
        if ($name != 'secret' && $name != 'u' && $name != 'site')
        {
            $q[] = urlencode($name)."=".urlencode($value);
        }
    }
    if (is_array($q)) { $q = "?".implode("&",$q); }
    $u = $_GET['u'];
    // Images and Flash files are requested without a query string
    if (strpos($u,".jpg") || strpos($u,".jpeg") || strpos($u,".gif") || strpos($u,".png") || strpos($u,".bmp") || strpos($u,".swf")) {
        $q = FALSE;
        $u = str_replace(" ", "%20", $u);
    }
    $url = $proxyto.$u.$q;
    $server = $proxyfrom;

class WebPage {

    var $URL;
    var $pageData;
    var $headers;
    var $static;
    var $currentServer;
    var $scriptName;
    var $varName;
    var $updatedPageData;
    var $relDir;
    var $fp;

    function WebPage($URL, $static, $currentServer, $scriptName, $varName, $relDir){
        $this->URL = $URL;
        $this->currentServer = $currentServer;
        $this->static = $static;
        $this->relDir = $relDir;
        $this->pageData = "";
        $this->varName = $varName;
        $this->scriptName = $scriptName;
    }
    // Open the hub URL and keep its stream metadata (which includes the response headers)
    function openLink(){
        global $error404; // so the 404 check at the end of the script can see a failed fetch
        if ($this->fp = fopen($this->URL, "rb")) {
            $fileheaders = stream_get_meta_data($this->fp);
            $this->headers = $fileheaders;
        }
        else {
            $error404 = TRUE;
        }
    }
    // Pass the response straight through to the browser (used for images)
    function getRawPageData(){
        fpassthru($this->fp);
        fclose($this->fp);
        return;
    }
    // Swap hub-site URLs for node-site URLs in the processed page
    function getPageData(){
        global $proxyto;
        global $proxyfrom;
        global $css;
        global $u;
        $updatedpd = str_replace($proxyto,$proxyfrom,$this->updatedPageData);
        $strpath = substr($u, 0, strrpos($u, "/"));
        $noslash = substr($proxyto, 0, -1);
        $strpath = substr($strpath, 0, strrpos($strpath, "/"))."/";
        $updatedpd = str_replace($proxyfrom."../","../",$updatedpd);
        $updatedpd = str_replace("url('../","url('".$proxyfrom.$strpath,$updatedpd);
        $updatedpd = str_replace("url(../","url(".$proxyfrom.$strpath,$updatedpd);
        $updatedpd = str_replace("url('/","url('".$proxyfrom,$updatedpd);
        $updatedpd = str_replace($proxyfrom.$proxyfrom,$proxyfrom,$updatedpd);
        $updatedpd = str_replace($noslash."\"",$proxyfrom."\"",$updatedpd);
        $updatedpd = str_replace($noslash."'",$proxyfrom."'",$updatedpd);
        return $updatedpd;
    }
    // Read the whole response, then rewrite the links found after each prefix/delimiter pair
    function processPageData(){
        $this->pageData = "";
        while( !feof( $this->fp)){
            $this->pageData .= fgets($this->fp, 4096);
        }
        fclose($this->fp);
        $delim[]='"';
        $delim[]="'";
        $delim[]="";
        $pre[]="src=";
        $pre[]="background=";
        $pre[]="href=";
        $pre[]="url\(";
        $pre[]="codebase=";
        $pre[]="url=";
        $pre[]="archive=";
        $this->redirect($pre,$delim);
    }
    // Note: eregi() and eregi_replace() were removed in PHP 7.0, so this class needs PHP 5 as written
    function fileName(){
        return eregi_replace("[#?].*$", "",
            eregi_replace("^.*/", "", $this->URL));
    }
    function redirect($prefixArray,$delimArray){
        $a = $this->pageData;
        $name = $this->currentServer;
        $fileDir = $this->relDir;
        foreach($prefixArray as $prefix){
            $start2 = stripslashes($prefix);
            $start = $prefix . "[ ]*";
            foreach($delimArray as $delim){
                if(eregi($prefix . "[ ]*" . $delim, $a) && ($delim == "" ? eregi($prefix . "[ ]*" . "[a-z,A-Z,/,0-9]", $a) : 1)){
                    // Links that must stay as they are get a temporary \a marker in place of the delimiter
                    $a = eregi_replace($start . $delim . "//",
                        $start2 . '\a' . "//",
                        $a);
                    $a = eregi_replace($start . $delim . "/",
                        $start2 . $delim . $name,
                        $a);
                    $a = eregi_replace($start . $delim . "http://",
                        $start2 . '\a' . "http://",
                        $a);
                    $a = eregi_replace($start . $delim . "mailto:",
                        $start2 . '\a' . "mailto:",
                        $a);
                    $a = eregi_replace($start . $delim . "#",
                        $start2 . '\a' . "#",
                        $a);
                    $a = eregi_replace($start . $delim . "javascript:",
                        $start2 . '\a' . "javascript:",
                        $a);
                    // Everything else is a relative link, so prefix it with the node site URL
                    $a = eregi_replace($start . $delim,
                        $start2 . $delim . $name,
                        $a);
                    // Put the delimiter back where the marker was
                    $a = eregi_replace($start . '[\]a',
                        $start2 . $delim,
                        $a);
                }
            }
        }
        $this->updatedPageData = str_replace("\a","",$a);
    }

}

function processHeaders($headers, $filename, $mime_dl, &$type, &$isDown, &$isHTML, &$isImage){

    global $url;
    // Re-open the hub URL and read the stream metadata, which includes the response headers
    $openfp = fopen($url, "rb");
    $headers = stream_get_meta_data($openfp);

    // Flag the response as an image or as text (HTML/CSS) based on what the headers mention
    foreach ($headers as $header) {
        foreach ($header as $headervalue) {
            if (strpos(strtolower($headervalue), "image")) {
                $isImage = true;
            }
            if (strpos(strtolower($headervalue), "text")) {
                $isHTML = true;
            }
            if (strpos(strtolower($headervalue), "css")) {
                $headers[] = "Content-Type: text/css";
                $isHTML = true;
                $css = 1;
            }
        }
    }
    return $headers;
}

    $page = new WebPage($url,true,$server,"blog-app.php","u",$relDir);
    $page->openLink();
    $head = processHeaders($headers,$file,$mime_dl,$type,$isDown,$isHtml,$isImage);
    foreach($head as $h) header($h);

    if($isHtml){
        // HTML and CSS get their links rewritten before being sent on
        $page->processPageData();
        echo $page->getPageData();

    }else{
        if($isImage)
            $page->getRawPageData();
    }

}
else
{
    // Wrong or missing secret key: pretend the page does not exist
    $error404 = TRUE;
}

if (!empty($error404)) {
    header("HTTP/1.0 404 Not Found");
    print "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">
<html><body>
<h1>Not Found</h1>
<p>The requested URL ".htmlspecialchars($_SERVER['PHP_SELF'])." was not found on this server.</p>
</body></html>
";
    exit;
}
?>

(Thanks to the PHP Proxy project for some of this code)
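
One caveat: the script leans on eregi() and eregi_replace(), which were removed in PHP 7.0, so as posted it needs PHP 5. As a rough sketch (not a drop-in replacement), here is how two of its core rewrites might look with functions that still exist. The function names are hypothetical, and the second one only handles quoted src, href and background attributes, not unquoted values or CSS url() references:

```php
<?php
// Hypothetical helper: swap absolute hub-site URLs for node-site URLs,
// much as getPageData() does with str_replace().
function swap_hub_urls($html, $proxyTo, $proxyFrom)
{
    $html = str_replace($proxyTo, $proxyFrom, $html);
    // Also catch links to the bare hub domain (no trailing slash) just before a closing quote.
    $noSlash = rtrim($proxyTo, '/');
    return str_replace(
        array($noSlash . '"', $noSlash . "'"),
        array($proxyFrom . '"', $proxyFrom . "'"),
        $html
    );
}

// Hypothetical helper: turn root-relative links (href="/page") into absolute
// node-site links, leaving protocol-relative links (src="//host/file") alone.
function absolutize_root_relative($html, $proxyFrom)
{
    return preg_replace_callback(
        '~\b(src|href|background)=(["\'])/(?!/)~i',
        function ($m) use ($proxyFrom) {
            // $proxyFrom ends with a slash, so it replaces the leading slash of the path.
            return $m[1] . '=' . $m[2] . $proxyFrom;
        },
        $html
    );
}

$swapped = swap_hub_urls(
    '<a href="http://www.hubsite.com">Home</a> <img src="http://www.hubsite.com/img/logo.png">',
    'http://www.hubsite.com/',
    'http://www.example.com/'
);
// $swapped now points both links at http://www.example.com/
```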

Step 6
Edit the above files as needed and then upload them to each of your node site hosting accounts. You’ll only need to change the $proxyfrom and $siteid variables for each site, where $siteid is a unique ID and $proxyfrom is the root URL of the domain used on that specific node site.

Hopefully this will be of some help to the advanced readers of my blog. If this might as well be Egyptian hieroglyphs to you, keep an eye out for the release of my Blog-App PHP application that puts all of this behind a simple user interface.


18 Responses to “Running Micro-Sites and Reducing Hosting Costs”

  1. Burgo Says:

    Ok… so this might be a bit out of my reach (for now), but can’t wait for your app release Rob… it couldn’t have come at a better time for me :)

  2. Yossarian Says:

    As a British SEO don’t you feel the need for a UK host? Or do you have a UK host on top of the 2 you mentioned?

    It is something we have been concerned about for the past few weeks. We really need a dedicated server and have been looking at theplanet.com as it seems to offer better value for money than UK hosts (the exchange rate obviously helps) but we have some clients that are UK orientated but with a .com. I know you can set your location in webmaster tools but I think I’d sooner have a UK IP and be safe than sorry.

    Good idea though. Depending on how many micro-sites there are you don’t always need free hosting. Some US hosts are stupidly cheap and it’s feasible to buy many ultra cheap accounts. Though I guess free is always better!

  3. evilgreenmonkey Says:

    I don’t sell anything on evilgreenmonkey.com and most of my readers are based in the US, so server location isn’t very important for this site. On my UK targeted sites I usually choose a .uk domain, host in the UK or both. You can of course set the country that your website’s targeted towards in Google’s Webmaster Tools if a .uk domain or UK hosting isn’t possible.

  4. DaveN Says:

    couldn’t you trace it back to the DB server thou, we used to do something very similar.. http://www.diablos.co.uk, they never stop ,, that’s banned you can still see the banned adsense account lol

    DaveN

  5. evilgreenmonkey Says:

    Hi Dave, the proxy script parses everything including JS, CSS, images and Flash, plus any absolute URLs get rewritten from the hub site URL to the node site URL. All rewriting is also internal, so no mention of the hub site should be found in the headers or the source code. The only flaw that I’ve found is when the hub site goes down or produces errors, although the error_reporting(0) should filter out code errors.

  6. Jez Says:

    You ever played with yacg (getyacg dot com) a more recent version of myGen…

    Also, what do you make of services like 20linksaday dot com ?

  7. Tony Spencer Says:

    We recently built the same thing and it’s sooooo nice to be able to manage all sites from one central CMS. I modeled the CMS on all the good things about Drupal such as blocks and left out all the bad things about Drupal.

    The end result is a system that is hosting 100+ sites with very robust features but without the headache of getting all those robust features working on 100+ hosting accounts. The hardest part was getting AJAX features to work over the proxy.

    Oh and the blocks can be dragged to new locations in the admin panel with AJAX. I’m like a kid in a candy store now.

  8. Gary Says:

    Hi Rob, thanks for this. have been waiting eagerly since the first mention on the WBF. Cheers!!

  9. evilgreenmonkey Says:

    @Jez: I don’t do anything with AdSense at the moment, so YACG wasn’t of much interest to me. I prefer spending a little more time creating half decent looking micro-sites which pass on link juice and can potentially generate sales from long tail search in future. That doesn’t mean that YACG isn’t any good though, and I’m sure that some of my friends have dabbled with it. Regarding 20linksaday, haven’t tried it myself although will sign up and take a look. Fantomaster is an evil genius, so it should be pretty good.

    @Tony: I’m not jealous at all! :(

    @Gary: No problem, I haven’t shared code in a while, so it’s about time that I put something out there :)

  10. Lyme Says:

    Trying to get this going, but having a hell of a time finding free hosts that support htaccess. Anyone?

  11. Jake Says:

    Maybe this is a stupid question, but I am still learning a lot of stuff so bear with me please. Once you get the sites set up across these free web hosts, how do you get them indexed in the SEs? It would seem that may take longer than setting up the initial sites, or is there a relatively quick way to get them indexed, a script or something?

    I have no problems getting sites indexed, usually within a day or two, but if you set up a hundred or so of these “runoff” sites, it would seem challenging to index all of them. And I figure they would be worthless if not in Google’s index?

    BTW, thank you for the information! :)

  12. eDigest Says:

    Hosting regional micro Websites in different countries is definitely better off for the SEO and marketing initiative. This challenge does worth the efforts for the Internet business, especially in combination with the robust e-commerce software to sell more and convert leads into customers easily.

  13. Gary Says:

    Hi Rob, I guess you are snowed under, but it never hurts to ask – any luck with the Open Source (GPL) application :-)

  14. Living the Dream with AutoBlogs - Page 2 - Black Hat Forum Says:

    [...] Re: Living the Dream with AutoBlogs Do you guys think this would work? SERPs still see the sites separate even if they are all redirected to one main domain right? http://www.evilgreenmonkey.com/blog/…o-hosting.html [...]

  15. Andrew Scherer Says:

    This is a great post even if it’s a little over my head. I’m going to spend the next few days trying to put this together, thanks for the great info.

  16. Glenn Says:

    Are there any compatibility issues with WordPress 2.7? I’ve been trying your code but am running into all sorts of problems.

  17. Glenn Says:

    Rob,

    Disregard my last comment I figured it out. For those of you using WordPress, I found that if you have background images load via CSS they will appear on the proxied site if you use absolute references instead of relative ones.

  18. groneg Says:

    i’m missing something here. what keeps $proxyto from getting indexed in the serps?
