Converting a Drupal Site to Straight HTML

10.19.2008

When Bad Hosts (Yahoo Small Business) Happen to Good Drupal Sites

Time and again, I work for a client who is stuck with a horrid server environment -- say -- Yahoo Small Business.* Surely you must think that a company like Yahoo -- who employs none other than Rasmus Lerdorf -- would offer a decent environment. That's where you'd be wrong.

It's so bad that Drupal 6 won't even allow itself to ATTEMPT an installation. And with good reason. Consider the facts:

It seems like it's not even worth the effort to try to get Drupal to run there. Since the site has to be live on Monday, I employed a crude method that gathers all URL aliases, plus a few more paths, and generates a full tree of directories and index.php files to mirror the pathauto links. Break the glass, and use this in case of emergency. (It's best to run this on a local machine, btw -- it takes a while, and for most sites you'll want to raise max_execution_time in php.ini to 120 seconds or more.)

<?php
// Dumps the HTML of every aliased page into a directory called "export"
// inside your files directory.
because_yahoo_hosting_sucks();

function because_yahoo_hosting_sucks() {
  $output = '';
  $count = 0;

  // Every URL alias (the pathauto links) becomes one exported page.
  $sql = db_query("SELECT dst FROM {url_alias}");
  while ($result = db_fetch_object($sql)) {
    // use absolute paths with drupal_http_request
    $response = drupal_http_request(url($result->dst, array('absolute' => TRUE)));
    // $response->data is the actual page
    $output .= drupal_to_html($result->dst, $response->data);
    $output .= "<br />";
    $count++;
  }

  // couple of pages left out of the aliased table
  $more = array(
    'contact',
    'case-studies',
  );
  foreach ($more as $url) {
    $response = drupal_http_request(url($url, array('absolute' => TRUE)));
    $output .= drupal_to_html($url, $response->data);
    $output .= "<br />";
    $count++;
  }

  $output .= "<br />";
  // be on the lookout for pages with incorrect paths, and debug at will...
  $output .= "wrote " . $count . " crappy html files because yahoo sucks...";
  print $output;
  exit();
}

function drupal_to_html($url, $page) {
  // only creates a directory if there's a file for the directory.
  // this can be fixed, but only bugged me in 1 or 2 pages.
  $path = file_create_path('export/' . $url);
  file_check_directory($path, FILE_CREATE_DIRECTORY);
  // we add index.php because of pathauto urls
  file_save_data($page, $path . '/index.php', FILE_EXISTS_REPLACE);
  return $path . '/index.php';
}

// now weep at your dreamweaver site
?>
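
The comment in drupal_to_html() admits that a directory only gets created when a page lives directly in it, so deeply nested aliases can fail when a parent directory is missing. One possible way around that, sketched here in plain PHP rather than with the Drupal file API, is to let mkdir() create the whole chain of parents. The function name export_page_to_tree() and the demo alias are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper, not part of Drupal: writes $html to
// <base_dir>/<alias>/index.php and creates any missing parent directories.
function export_page_to_tree($base_dir, $alias, $html) {
  // Normalize slashes so '/a/b' and 'a/b/' end up in the same directory.
  $dir = rtrim($base_dir, '/') . '/' . trim($alias, '/');
  // Passing true as the third argument makes mkdir() recursive.
  if (!is_dir($dir) && !mkdir($dir, 0775, true)) {
    return false;
  }
  $file = $dir . '/index.php';
  return file_put_contents($file, $html) !== false ? $file : false;
}

// Usage: export a three-level alias into a throwaway temp directory.
$base = sys_get_temp_dir() . '/export_demo_' . uniqid();
$html = '<html><body>demo</body></html>';
$written = export_page_to_tree($base, 'about/team/jobs', $html);
print $written . "\n";
```

Inside Drupal you would presumably pass the files directory as the base; that part is left out here so the sketch runs on its own.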

Tips

  • Clear the cache and turn on CSS and JS aggregation, then copy files/js and files/css to the same relative path they have on your Drupal install
  • lrn2use version control - it makes dealing with themes, files, and images much easier in this sad scenario

Notes

  • Yes, I thought of recommending they switch hosts... they need the site live on Monday, and have 80-100 email addresses on their Yahoo domain that can't stop working...
  • Yes, I know you can route mail to one IP, and web traffic to another IP. Guess what? They don't have their own IP.
  • And, they bought the domain through Yahoo, so they have no control over it anyhow. It's either all traffic to them, or all traffic somewhere else.

Comments

What would you suggest

I am new to Drupal so please bear with me. I am working on creating a site Slotsregeln that I would have to hand over to someone, but they do not have Drupal installed and want the site in HTML/CSS only. Is there any way to convert a Drupal site to HTML/CSS by extracting the info from its directories? I do not need the dynamic, database-driven information, just the layout/structure. Can that be done? Is there any documentation?

Drupal Development

It’s great to find such a well laid out article, thanks for the information.

ugh! yahoo hosting

i just ran into this problem trying to put a wordpress site onto someone's existing yahoo hosting. can't imagine trying to do drupal on there. so many limitations! it's not even super cheap is it?

No, ignorance is the only

No, ignorance is the only excuse for choosing it -- to be perfectly frank.

I have found interesting

I have found interesting sources and would like to give the benefit of my experience to you.
I am tuning my pc by the best software for free, with the file search engine BecoMon
May be you have your own experience and could give some useful sites too. Because this social site help me much.

Drupal Modules

There is already a Drupal Module which does nearly the same thing you coded above: http://drupal.org/project/html_export

Also, I have developed a custom module, Publish to FTP, which saves nodes as .html files and publishes them to your FTP server. Unfortunately, it is taking a while to get CVS access so it isn't up at drupal.org yet, but let me know if you are interested.

Actually, as it turns out, it

Actually, as it turns out, it doesn't do the same thing.... it names files like:
bobs-autoparts-what-makes-our-autoparts-special-dicks-guarantee

when the actual path is:
bobs-autoparts/what-makes-our-autoparts-special/dicks-guarantee

not a small difference.

But it DOES gather theme, module, and js files for you, which is VERY useful.
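
To make the difference concrete, here is a rough sketch of the two naming schemes side by side. It is not html_export's actual code, only an illustration of the behaviour described above; both function names are invented for this example.

```php
<?php
// Flat naming: every slash in the alias turns into a hyphen,
// so the directory structure of the URL is lost.
function flat_export_name($alias) {
  return str_replace('/', '-', trim($alias, '/'));
}

// Nested naming: keep the slashes and add index.php, so the
// exported tree answers at the same URL as the Drupal page did.
function nested_export_name($alias) {
  return trim($alias, '/') . '/index.php';
}

$alias = 'bobs-autoparts/what-makes-our-autoparts-special/dicks-guarantee';
print flat_export_name($alias) . "\n";
print nested_export_name($alias) . "\n";
```

Only the nested form keeps existing links and bookmarks working without rewrite rules.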

Why not wget?

Your approach will work only for pages that have aliases. What about pages that are not nodes? For example, panels, views and module-generated pages?

Why not use wget to mirror the entire site?

Well, I guess it is the Monday deadline. Next time, you have another tool in your bag.

Heh. Guess I need to overcome

Heh. Guess I need to overcome my fear of the command line! That's what I'll do next time! Oh well, at least it was an opportunity to show drupal_http_request and the file-writing functions in action.

Don't FEAR the command line!!!

The CLI is your friend... and so is wget!

Kudos for coming up with a solution in time using what you had. As we know, sometimes that's all that really matters ;-)

Justin

see also

http://drupal.org/node/27882

http://drupal.org/project/html_export

Are comments moderated here? I tried posting before, and maybe having only links in the body triggered a spam filter or something?

I think its a combination of

I think it's a combination of cache and Mollom that gives an illusion of moderation.

Patience of a saint.

Hi Nick,

That client is lucky to have you, and not me, as their developer.

I simply wouldn't put up with such a hosting environment at all - flat HTML or Drupal.

In this case, I would register a new URL for the company - newdomain.com or something.
Deploy the new site to this domain on a proper host.
Put a .htaccess file on the old domain, which does a 302 [temporary] redirect to the new domain.

This works as an immediate solution, redirecting web requests only, and allows time to plan a proper migration,
at which point you again use a redirect, this time a 301 [permanent], from newdomain.com to domain.com.

Regards
Alan

Yahoo doesn't allow .htaccess

Yahoo doesn't allow .htaccess files. They do nothing -- you can try, but it's in vain. Furthermore, it's not a small company, so I can't just *tell* a contact to *tell* their CEO that we are switching domains. If I truly had patience and virtue, I would have attempted to get Drupal 6 running on the ol' chestnut of a platform.

As an aside, we'll be switching hosts soon, but the key was a -- like now -- launch that didn't involve excuses like, "well ya see, drupal needs..."

What about a PHP Redirect?

a PHP redirect to the temporary domain so that things were running might be a workaround for the lack of .htaccess

eg:

<?php
header('Refresh: 0; URL=http://www.askapache.com');
exit;
?>
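
A variation on the same idea, offered as a sketch and not as something anyone in this thread actually deployed: send a Location header with an explicit 302 status and carry the requested path over to the temporary domain. The helper name redirect_target() and the host newdomain.example are placeholders. The script would also have to exist at every path you want redirected, since there are no rewrite rules to fall back on.

```php
<?php
// Hypothetical helper: build the URL on the temporary domain
// for whatever path was requested on the old one.
function redirect_target($new_base, $request_uri) {
  return rtrim($new_base, '/') . '/' . ltrim($request_uri, '/');
}

// Only send headers when this is a real web request, not a CLI run.
if (PHP_SAPI !== 'cli' && isset($_SERVER['REQUEST_URI'])) {
  $target = redirect_target('http://newdomain.example', $_SERVER['REQUEST_URI']);
  // The third argument of header() sets the status code; 302 marks the move as temporary.
  header('Location: ' . $target, true, 302);
  exit;
}
```

When the real migration is done, the same script with 301 in place of 302 would signal a permanent move.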

BTW - what is life and death about the launch? If your customer truly cared about their website they would delay the launch to ensure the website worked correctly first.

Let me guess - big media campaign, running monday, relies on the new website to be working... told you at the last possible moment?

Heh, actually, I'm not

Heh, actually, I'm not totally sure: all that mattered to me was that they wanted it on Monday, so we got it up on Monday. I presented the redirect as an option, but I guess they weren't keen on the idea, which is understandable. They've been good clients, so I wouldn't say they told us at the last second. Really it's more my fault for not finding out information about their host sooner.
