
The easiest way to scrape details from a MySpace profile page with PHP (you won’t believe how simple it is)

It’s amazing how just a little optimization on the part of MySpace makes crawling their site so much easier. We’re going to scrape the user details (name, age, sex, etc.) from a profile, using the info in the page header, like so:

Set your MySpace URL:

$your_profile_url = 'http://www.myspace.com/waxjelly';

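Now grab the page using the “file()” function. (We want an array of lines, so we can loop over it and use “trim()” to clean each one up. Fetching a URL this way only works when allow_url_fopen is enabled in your PHP configuration.)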

$file = file($your_profile_url);

Now we set up an empty string and loop through the profile array, trimming each line and appending it to the string.

$profile = '';
for ($i = 0; $i < count($file); $i++) {
    $profile .= trim($file[$i]);
}
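If you prefer something more compact, the same trim-and-join step fits in a single expression. This is just a sketch: the helper name flatten_lines() is made up for illustration and isn’t part of the original script, and the sample array stands in for what “file()” would hand back.

```php
<?php
// Hypothetical helper (not in the original script): trims every line and
// joins the results into one string, just like the loop above.
function flatten_lines(array $lines): string
{
    return implode('', array_map('trim', $lines));
}

// Stand-in for the array that file() would return for a real page.
$sample = ["<html>\n", "  <head>\n", "\t<title>demo</title>\n"];

echo flatten_lines($sample), "\n"; // <html><head><title>demo</title>
```

With the real page you would call flatten_lines($file) instead of using the sample array.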

Now we use simple explode functions to do the rest of the work. What we’re looking for is the “<meta” description tag near the top of the page, which holds the basic details. (This got put in place when MySpace partnered with Google for search optimization. Thanks, guys.)

$det_arr = explode('<meta name="description" content="myspace profile - ', strtolower($profile));
$det_arr = explode('" />', $det_arr[1]);
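One thing the two explodes above don’t handle is a page where the marker text isn’t there at all: $det_arr[1] simply won’t exist. Here is a minimal sketch of a more defensive take on the same idea. The helper text_between() is a made-up name for illustration, and the sample markup is invented in the shape described above, not copied from a real profile.

```php
<?php
// Hypothetical helper: returns the text between $start and $end,
// or null when either marker is missing from $haystack.
function text_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);

    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

// Invented sample markup, already lowercased like strtolower($profile).
$html = '<meta name="description" content="myspace profile - tom, 31, male, , texas, us, hi" />';

echo text_between($html, 'content="myspace profile - ', '" />'), "\n"; // tom, 31, male, , texas, us, hi

// A page without the tag gives null instead of a notice.
var_dump(text_between('<title>x</title>', 'content="myspace profile - ', '" />')); // NULL
```

Checking for null before going any further lets the script say “no details found” rather than printing an array of blanks.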

We’ve got the whole string that we need, but now we need to separate it into an array we can manage. This is the prettiest part of this script. MySpace prints each element even if it’s blank, so if you leave your city blank, we’ll get a nice little “…Male, , Texas,…” string. Note the double commas. That means we can explode on the comma, and still get a consistently indexed array. (Index 3 will always be city, even if it’s blank. And index 4 will always be state, even if city is blank. Make sense?) Each value after the first keeps the space that followed its comma, so run it through “trim()” if that matters to you.

$details = explode(',', $det_arr[0]);
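To see the blank-field behaviour for yourself, here is a tiny example. The description string is made up for illustration (city left blank), and the array_map('trim', …) call is an addition of mine rather than something the original script does.

```php
<?php
// Made-up description string with the city left blank (note the ", ,").
$sample = 'tom, 31, male, , texas, us, hello';

// explode() keeps empty fields, so every position stays fixed;
// trim() strips the space that follows each comma.
$fields = array_map('trim', explode(',', $sample));

var_dump($fields[3] === '');      // bool(true): the blank city is still index 3
var_dump($fields[4] === 'texas'); // bool(true): state is still index 4
```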

Now that we have the array that we want, we simply assign its values to named keys so they’re easier to work with.

$det['name'] = $details[0];
$det['age'] = $details[1];
$det['sex'] = $details[2];
$det['city'] = $details[3];
$det['state'] = $details[4];
$det['country'] = $details[5];
$det['phrase'] = $details[6];

… and print the results.

print_r($det);
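The index-by-index assignment can also be done in one go with array_combine(). The sketch below assumes the seven-field order used above; parse_details() is a hypothetical name, and the limit argument to explode() is my own addition so that a phrase containing commas doesn’t spill into extra fields.

```php
<?php
// Hypothetical helper: maps the comma-separated description onto named keys.
function parse_details(string $description): array
{
    $keys = ['name', 'age', 'sex', 'city', 'state', 'country', 'phrase'];

    // The limit keeps any commas inside the phrase from creating extra fields.
    $values = array_map('trim', explode(',', $description, count($keys)));

    // Pad with empty strings so array_combine() always receives seven values.
    $values = array_pad($values, count($keys), '');

    return array_combine($keys, $values);
}

// Made-up input: blank city, and a phrase that itself contains a comma.
print_r(parse_details('tom, 31, male, , texas, us, hello, world'));
```

Here the phrase comes back whole as “hello, world”, and a short or empty description still produces all seven keys.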

That’s it! You can get a working version of the script here. Enjoy!

Your Friend and Mine,
Meshach


12 comments so far

  1. pcdconny March 19, 2007 1:41 pm

    Hi,
    nice tutorial, thanks ;-)
    I just translated it into German; you can find it here:
    http://www.php-developer-blog.de/50226711/myspacecomprofil_einfach_mit_php_auslesen.php

    Regards from Stuttgart
    Conny

  2. bryan March 19, 2007 4:53 pm

    Rock on Conny, rock-on!

  3. [...] the WaxJelly blog today comes a handy bit of code for anyone out there looking to scrape details from just about any MySpace page out there (quick [...]

  4. Lord skelet-o May 5, 2007 7:08 am

    How do I run the script?

  5. ShinZaiaku May 27, 2007 6:19 pm

    I’m not really sure how I’m going to use this one. I don’t have much reason for scraping MySpace profiles that I can think of.

  6. Ralph-E-Boy August 28, 2007 11:26 am

    Am I missing something? I made the page, then downloaded the working version to make sure, but I don’t get any info… just “Array ( [name] => [age] => [sex] => [city] => [state] => [country] => [phrase] => )” prints out in the browser and none of my info… any help would be appreciated. I’m working on one of those fancy flash layouts for my myspace page and this would put it over the top… Will gladly share when / if it gets done…

  7. Zaiaku August 29, 2007 2:30 pm

    Don’t you hate when you download something and can’t find it and have to download it again?

  8. Capt. Kirk August 29, 2007 8:23 pm

    “Am I missing something? I made the page, then downloaded the working version to make sure, but I don’t get any info”

    Looking at a MySpace profile, it seems the info is now stored in [title], not in [meta], so the script should be changed accordingly.

  9. Zaiaku August 31, 2007 1:16 pm

    I don’t think this version works any more. MySpace has been changing a lot of stuff on their site, and very often.

  10. Nate November 14, 2007 5:41 pm

    I want to use this…but I get the array printing and not the actual data…any help appreciated

    http://www.jenomedia.com/crawlmyspace/crawlmyspace.php

  11. Bucky January 29, 2008 1:10 am

    I’m trying to replace a block of text.

    .contactTable .whitetext12{VISIBILITY:HIDDEN;}

    Your Text

    Anyone know what the problem is?

  12. Bucky January 29, 2008 1:11 am

    I’m trying to replace a block of text.

    .contactTable .whitetext12{VISIBILITY:HIDDEN;}

    DIV

    Your Text

    Anyone know what the problem is?
