Drupal Articles

Stupid Simple Web Scraping with SimpleXML

The other day, I was tasked with building a data scraper. Having never built such a contraption, I naturally turned to the Internets for preexisting code. I was horrified with what I found.

The “free” PHP scripts (that’s “free” as in “free baby vomit”) were all infested with the worst sorts of newfangled regex and PHP 4-era DOM traversal.
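
For contrast, here is a minimal sketch of the kind of regex-free scraping the title hints at. It is not the code from the original post: the sample markup, the scrape_links() helper, and the XPath query are all made up for illustration. The idea is to let DOM's forgiving HTML parser clean up the page, hand the result to SimpleXML, and pull out what you need with XPath.

```php
<?php
// Hypothetical sample markup standing in for a page you have already fetched.
$html = <<<HTML
<html><body>
  <ul id="shows">
    <li><a href="/shows/backpack-picnic">Backpack Picnic</a></li>
    <li><a href="/shows/play-value">Play Value</a></li>
  </ul>
</body></html>
HTML;

// Hypothetical helper: returns an array of link text keyed by href
// for every <a> element matched by the given XPath query.
function scrape_links($html, $xpath) {
  $dom = new DOMDocument();

  // Real-world HTML is rarely valid XML, so parse it with DOM's lenient
  // HTML parser and keep libxml's warnings out of the output.
  $previous = libxml_use_internal_errors(true);
  $dom->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  // Hand the cleaned-up tree to SimpleXML and query it with XPath.
  $xml = simplexml_import_dom($dom);
  $links = array();
  foreach ($xml->xpath($xpath) as $a) {
    $links[(string) $a['href']] = trim((string) $a);
  }
  return $links;
}

$links = scrape_links($html, '//ul[@id="shows"]/li/a');
print_r($links);
```

In a real scraper the $html string would come from something like file_get_contents() on the target URL, and the XPath query would depend entirely on the markup of the page being scraped.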

WWYUIGD? Two Guidelines for Writing Hell's Best CSS

WWYUIGD (What would Yahoo User Interface Grids Do?) is defined as two things:

  1. Exaggerating a good idea to such ridiculous proportions that it becomes a bad idea.
  2. The dual guidelines that anyone can use to write hell's best CSS.

Guideline One: Meaningless numbers are great

The .yui-t(x) set of classes offers powerful control over sidebar widths and positions. So powerful, indeed, that the class names themselves offer neither a clue nor an understandable pattern.

  1. .yui-t1: 160px on left
  2. .yui-t2: 180px on left
  3. .yui-t3: 300px on left
  4. .yui-t4: 180px on right
  5. .yui-t5: 240px on right
  6. .yui-t6: 300px on right
  7. .yui-t7: One full width column

If Yahoo were full of sissies, they would have made this system semi-easy to remember by using this pattern:

Enabling/Installing New Modules via Update.php: The Complete Solution

In our last episode of enabling new modules via update.php, Steve McKenzie pointed me to a better method: module_enable(). A quick test found, however, that it didn't run the install files and didn't rebuild the module files cache. So after spending 5 minutes in system.module, I found all the missing pieces. The example update function below will install and enable the new module, as well as rebuild all the CSS, node type, and menu caches.

Enabling New Modules Via Update.php

UPDATE: There's a better way.

I work with 3 other developers, all of whom have their own local sandbox of our site. Since we're constantly adding new modules, I found a simple way to enable a new module via another module's .install file. That way, all we have to do is run update.php when we update our source tree.

Here's a simple example update function:

<?php

Drupal trick: Returning a themed menu tree with nothing more than the system path

I remember something a longtime teacher said: "Nick, if you make a suit out of a gorilla, the arms are too long." I forget why that was relevant to the topic of theming menu trees.

Moving on, here's a nice little function I wrote to return a themed menu tree by path.

<?php
// will return all menu items under "administration".
print theme('menu_tree_by_path','admin');

// will return links to all node submission forms
print theme('menu_tree_by_path','node/add');

// return the correct menu array by path
function menu_get_mid_by_path($path) {
// oddly, menu_get_item accepts a path, but returns the parent id.

7 jQuery Plugins That Made Our Lives Easier at ON Networks

We, the developers of ON Networks, released version 1.1 of our website this evening (it's built on Drupal, of course... if it weren't, then I wouldn't go sharing it with the planet, would I?). The notable improvements are Ajax comments, tooltips for episodes, and a global navigation.

WYMeditor (What You Mean is What You Get) Poised for World Dominance

Back at ON Networks, we just finished upgrading from tinyMCE* to WYMeditor to power the textareas on the backend.

WYMeditor's main concept is to leave details of the document's visual layout, and to concentrate on its structure and meaning, while trying to give the user as much comfort as possible (at least as WYSIWYG editors).

...The end-user defines content meaning, which will determine its aspect by the use of style sheets. The result is easy and quick maintenance of information.

The Key to jQuery Form Plugin + Drupal Form API

Today I bring you an incomplete, yet stunningly easy solution to a problem that's been making me want to set buildings on fire.

ON Networks Redesigns

Today, we, the people of ON Networks, launched version 1.0* of our website. It's built on Drupal 5.2, and more than ready for 6.0 (please god, grant me the API improvements in 6.0 right now).

Since this is the first time I’ve introduced my company on my blog, I ask that you forgive me for giving you “the pitch.”

We’re into online video.

We’re not idiots: we are not a “YouTube 2.0… with tagging, Ajax, and a bunch of fake content that is supposed to look like it came from real people” (which seems to describe the majority of online video startups).

Rather, we’ve decided that old media may actually have a few good ideas: say… professionally produced content, high production values, and video quality unmatched on the internet (not counting subscription and pay-per-view sites). Oh, and letting you watch our videos on your iPod, iPhone, TV, or computer screen, be it via RSS, email, or Apple TV subscriptions.

Here is Backpack Picnic (sketch comedy), and Play Value (an exploration of the history of video games): two shows that really illustrate what we are after.

Our business model comes from an analysis of the structural weakness of old media. One way you could put it is, “We’re what NBC, BBC, and Fox would be if they could start their entire business over again.”

Our traffic and content are growing exponentially, and the daily bandwidth required to pump out our videos is measured in terabytes (tigs, as we call them in the office).

Anyhow, enough about us:

Thank you, Drupal. Expect a donation very soon.

*Nobody at the company is actually using version numbers, so I’m calling it 1.0

R.I.P. PHP 4.x (2000-2008)

The PHP development team has sentenced PHP 4.x to death. After December 31st, 2007, there will be no more releases of PHP 4.4. On August 8th, 2008, they will discontinue even critical security updates.

I wonder if the Drupal community had anything to do with instigating this? I'm tempted to think so...

Praise Allah! PHP lives!
