by pendulum » Mon Sep 17, 2012 8:37 am
I'm a PHP n00b, but I've been trying to create a script that fetches a URL, parses the response, and echoes the results. It works now (sort of), but I know there's a better way of doing this. Would anyone care to take a look and help me make it work better?
Code:
<?php
// Score Scraper
?>
<html>
<head>
<link rel="stylesheet" type="text/css" href="scraper.css">
</head>
<body>
<?php
// Report all PHP errors (pull in prod. vers.)
// error_reporting(E_ALL);
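//Data Source. Don't touch!
$data = file_get_contents('http://sports.espn.go.com/nfl/bottomline/scores');
// Stop early if the feed could not be fetched
if ($data === false) {
die("<p>Could not load the score feed.</p></body></html>");
}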
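// Creating data validation stamp from nfl_s_stamp
// (only the first 10-digit run is used, so a single preg_match is enough)
preg_match('/[0-9]{10}/', $data, $stamp);
// printing time stamp (left empty if no stamp was found)
echo "<p>Data Validation: ".(isset($stamp[0]) ? $stamp[0] : '')."</p>";
// Let's pull 1 game from $data and split it up into teams and scores.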
// We'll close php so we can add some lame HTML shtuff here:
?>
<div id="scoreTicker">
<hgroup>
<h1>Latest Scores</h1>
</hgroup>
<!-- Now we'll re-open PHP and do the good stuff -->
<?php
// here we'll pull each game from $data. Let's hope this still works next week
preg_match_all('/&\w*_left[0-9]*=\^?\w*%20\w*%20(%20|\w*\.?)%20\^?\w*(%20[A-Z][a-z]*|%20)?(%20[0-9]*%20|%20)?(%20[0-9]*%20)?(\(\w*\)|\(\d:\d*%20(AM|PM)?%20[A-Z]*\))&\w*_right[0-9]*_count\W[0-9]*&\w*\W\w*\W*\w*\W\w*\W\w*\W\w*\W\w*\W\w*\?\w*\W[0-9]{9,9}/', $data, $games);
foreach ($games[0] as $game) {
// pulling team one from $game
preg_match_all('/&\w*_left[0-9]*\W\^?\w*(%20[A-Z][a-z]*)?/', $game, $teamOne);
// pulling team two from $game
preg_match_all('/([0-9][0-9]%20%20|at)%20\^?[A-Z]*[a-z]*%20?[A-Z]*[a-z]*/', $game, $teamTwo);
// pulling score one from $game
preg_match_all('/%20[0-9][0-9]%20%20%20/', $game, $scoreOne);
// pulling score two from $game
preg_match_all('/%20[0-9][0-9]%20\(/', $game, $scoreTwo);
// pulling notes from $game
preg_match_all('/\(\w*\)|\(\d\D\d*%20(PM|AM)%20\w*\)/', $game, $notes);
// below we're going to print all the stuff we assigned above. Each array will have a preg_replace run on it to remove the ugly characters
// `?? ''` (PHP 7+) falls back to an empty string when a pattern found no match, e.g. a game with no score yet
echo "<div class=\"game\">
<div class=\"notes\">
".preg_replace('/%20/', ' ', $notes[0][0] ?? '')."
</div><!-- end notes -->
<div class=\"home\">";
echo "<table><tr><td class=\"team\">".preg_replace('/&\w*_left[0-9]*\W/', '', $teamOne[0][0] ?? '')."</td><td class=\"score\">".preg_replace('/%20/', '', $scoreOne[0][0] ?? '')."</td></tr></table>";
echo "</div><div class=\"away\">
<table><tr><td class=\"team\">".preg_replace('/at%20|\d*\W\d*\W\d*\^|^[0-9]*%/', '', $teamTwo[0][0] ?? '')."</td><td class=\"score\">".preg_replace('/%20|\(/', '', $scoreTwo[0][0] ?? '')."</td></tr></table></div></div>";
}
?>
</div><!-- end scoreTicker -->
</body>
</html>
by johnj » Mon Sep 17, 2012 10:59 pm
You need to post a sample URL. Only then can somebody suggest whether there is a better way to match and extract.
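Without a sample, only a rough sketch is possible. The patterns in the posted regexes (`&..._left1=`, `%20`, `&..._right1_count=`) suggest the feed is a URL-encoded query string. If that assumption holds, `parse_str()` can do the splitting and decoding, and each game line then needs only one small regex. The function names `parseGameLine` and `parseScoreFeed`, the array keys, and the exact line layouts are hypothetical, and the code assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: split one decoded game line into its parts.
// Assumed layouts (guessed from the regexes in the original post):
//   "^TeamOne 31   TeamTwo 21 (FINAL)"   game with scores
//   "TeamOne at TeamTwo (4:05 PM ET)"    game not yet started
// The optional "^" marker in front of a team name is dropped.
function parseGameLine(string $line): ?array
{
    $line = trim($line);

    // Game with scores: team, score, team, score, (notes)
    if (preg_match('/^\^?(.+?)\s+(\d+)\s+\^?(.+?)\s+(\d+)\s+\((.+)\)$/', $line, $m)) {
        return [
            'teamOne'  => $m[1],
            'scoreOne' => (int) $m[2],
            'teamTwo'  => $m[3],
            'scoreTwo' => (int) $m[4],
            'notes'    => $m[5],
        ];
    }

    // Upcoming game: team "at" team, (kickoff time)
    if (preg_match('/^\^?(.+?)\s+at\s+\^?(.+?)\s+\((.+)\)$/', $line, $m)) {
        return [
            'teamOne'  => $m[1],
            'scoreOne' => null,
            'teamTwo'  => $m[2],
            'scoreTwo' => null,
            'notes'    => $m[3],
        ];
    }

    return null; // layout not recognised
}

// Hypothetical helper: turn the raw feed into a list of games.
function parseScoreFeed(string $raw): array
{
    // parse_str() splits on "&" and URL-decodes every value (%20 becomes a space)
    parse_str(ltrim($raw, '&'), $fields);

    $games = [];
    foreach ($fields as $key => $value) {
        // Only keys ending in "_left<number>" are assumed to hold game lines
        if (is_string($value) && preg_match('/_left\d+$/', (string) $key)) {
            $game = parseGameLine($value);
            if ($game !== null) {
                $games[] = $game;
            }
        }
    }
    return $games;
}
```

With a made-up feed string such as `'&nfl_s_left1=^Alpha%20City%2031%20%20%20Beta%2021%20(FINAL)&nfl_s_right1_count=0'`, `parseScoreFeed()` should return one game with both teams, both scores and the note. When printing the values into HTML, passing them through `htmlspecialchars()` keeps unexpected characters in the feed from breaking the page.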