12-12-2005, 10:20 AM
|
|
WebProWorld Pro
|
|
Join Date: Aug 2004
Location: Maryland
Posts: 219
|
|
Parsing Another Website With PHP
My son, who plays a video game called SOCOM 3, came to me with an interesting question. They have a website at
Code:
http://socom3.scea.com/ <= Short URL
where their stats are stored in a database. You can go to the website and search for a player's stats:
Code:
http://socom3html-prod.svo.pdonline.scea.com:10070/SOCOM3_HTML/stats/Stats_CareerSearch_Submit.jsp?userName=INSANE.CASPER&gameMode=0
The only catch is that you have to be logged in to get to the page. He gave me his username and password for the site, and I've tried passing them in every way I could find.
My question is, how can I get to the page, and grab the info? Any suggestions?
|

12-12-2005, 02:20 PM
|
|
WebProWorld Pro
|
|
Join Date: Aug 2004
Location: Maryland
Posts: 219
|
|
Here's what I have so far
This is my code so far, but the login verification just won't seem to work.
Code:
<?php
$ch = curl_init();
$url = "http://socom3html-prod.svo.pdonline.scea.com:10070/";
$url = $url . "SOCOM3_HTML/stats/Stats_CareerSearch_Submit.jsp?userName=";
$stats_for = "INSANE.CASPER";
$url = $url . $stats_for . "&gameMode=0";
$user_name = "username";
$user_pass = "password";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'PHP scraper 0.01');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
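// CURLOPT_USERPWD expects "user:password"; it only helps if the site uses HTTP authentication
curl_setopt($ch, CURLOPT_USERPWD, $user_name . ":" . $user_pass);
// CURLOPT_POST is an on/off switch, not a port number (the port is already in $url),
// so it is left unset for this GET request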
curl_setopt($ch, CURLOPT_HEADER, 1);
$content = curl_exec($ch);
curl_close($ch);
// point root-relative links at the original server (ereg_replace() was removed in PHP 7)
$content = str_replace('"/', '"http://socom3html-prod.svo.pdonline.scea.com:10070/', $content);
echo $content;
?>
|

12-12-2005, 02:25 PM
|
|
WebProWorld Member
|
|
Join Date: Jul 2003
Location: Eastern US
Posts: 86
|
|
cURL (command line or via PHP). Simple as that :D. It's a bit difficult to find resources on it, so if you need help using it, let me know. It lets your server request a page exactly as if it were a regular user. You can set things like the cookie storage location, the user-agent, and GET and POST data (POST is what you would be looking for to log in). The result can be returned as a variable, allowing you to do whatever parsing you need on it. If you can't find a sample, let me know and I can work with you. cURL has been a life-saver for me, lol.
EDIT: You beat me to it, I will look at your code and see what I can see...
EDIT2: Try this and see how it works, obviously I am unable to try it myself :D. If there is something wrong with it and you can't figure it out, I will be back tonight and I can see what I can do again.
Code:
<?php
// placeholders: fill in the account credentials
$usernamehere = "";
$passwordhere = "";
// login and store cookie data
$cookielocation = "socomcookies.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookielocation);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookielocation);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$url = "https://socom3html-prod.svo.pdonline.scea.com:10079/SOCOM3_HTML/account/Account_Login_Submit.jsp";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
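// urlencode() keeps characters such as "&" or "=" in the values from breaking the POST body
curl_setopt($ch, CURLOPT_POSTFIELDS, "userName=" . urlencode($usernamehere) . "&passWord=" . urlencode($passwordhere));
$result = curl_exec($ch);
// close the handle so cURL writes the cookies to the jar file
curl_close($ch);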
// get actual data
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookielocation);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookielocation);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$url = "http://socom3html-prod.svo.pdonline.scea.com:10070/SOCOM3_HTML/stats/Stats_CareerSearch_Submit.jsp?userName=" . $usernamehere . "&gameMode=0";
curl_setopt($ch, CURLOPT_URL, $url);
$result = explode("\n", curl_exec($ch) );
curl_close($ch);
// can parse here, currently outputting each line
foreach( $result as $currline ) { echo $currline . "\n"; }
?>
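For the login step above, the POST body is just a URL-encoded string. A minimal sketch of building it safely is below. The helper name build_login_body is made up for illustration, the credentials are fake, and the field names userName and passWord are simply the ones used in the script above (whether the real login form expects exactly those is an assumption). It uses type declarations, so it needs PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the thread's script): builds the
// URL-encoded POST body for the login request.
function build_login_body(string $username, string $password): string
{
    // http_build_query() percent-encodes every value, so characters such as
    // "&", "=" or spaces in a password cannot break the field boundaries.
    return http_build_query([
        'userName' => $username,
        'passWord' => $password,
    ]);
}

// Example with made-up credentials.
echo build_login_body('INSANE.CASPER', 'p&ss=word 1'), "\n";
```

The returned string could be passed straight to CURLOPT_POSTFIELDS in place of the hand-built one.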
|

12-12-2005, 02:59 PM
|
|
WebProWorld Pro
|
|
Join Date: Aug 2004
Location: Maryland
Posts: 219
|
|
Thanks so much for your help, and your offer for more. I implemented the changes you made, and all the output I get is the numeral 1. That's it: "1".
|

12-13-2005, 05:45 AM
|
|
WebProWorld Member
|
|
Join Date: Jul 2003
Location: Eastern US
Posts: 86
|
|
Sorry about not getting back to you last night like I said I would, I got overloaded with schoolwork. I can't wait until next year when I can go off to college and have more free time, yay! lol Anyways...
It would seem as though something got set wrong. I would first check the cookie file to see if it has anything in it. It might look something like this:
Code:
scea.com FALSE /SOCOM3_HTML/account/Account_Login_Submit.jsp FALSE 0 username *Username*
scea.com FALSE /SOCOM3_HTML/account/Account_Login_Submit.jsp FALSE 0 password *random characters*
If you don't have anything at all there, obviously it's an issue with the first part. I would probably also try changing the code to:
Code:
curl_setopt($ch, CURLOPT_POSTFIELDS, "userName=" . $usernamehere . "&passWord=" . $passwordhere);
// display actual output of the first page
echo curl_exec($ch);
echo "<hr />\n<hr />\n";
Code:
curl_setopt($ch, CURLOPT_URL, $url);
// display actual output of the second page
echo curl_exec($ch);
// can parse here, currently outputting each line
//foreach( $result as $currline ) { echo $currline . "\n"; }
That way you can further narrow down where the problem might be. Let me know what you get with those changes. I will try to be a little more prompt this time :D.
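Checking the cookie file by eye works, but it can also be done in code. The sketch below assumes the jar is in the tab-separated Netscape format that cURL writes (seven fields per line, as in the sample above). The function name parse_cookie_jar and the sample cookie are made up for illustration.

```php
<?php
// Hypothetical helper: parses Netscape-format cookie jar text into an array.
// Each cookie line has 7 tab-separated fields:
// domain, include-subdomains flag, path, secure flag, expiry, name, value.
function parse_cookie_jar(string $text): array
{
    $cookies = [];
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        if (strpos($line, '#HttpOnly_') === 0) {
            // cURL prefixes HttpOnly cookies with "#HttpOnly_"; keep them.
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }
        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue; // not a cookie line
        }
        $cookies[] = [
            'domain'  => $fields[0],
            'path'    => $fields[2],
            'secure'  => $fields[3] === 'TRUE',
            'expires' => (int) $fields[4],
            'name'    => $fields[5],
            'value'   => $fields[6],
        ];
    }
    return $cookies;
}

// Made-up sample jar; a real one would come from
// file_get_contents('socomcookies.txt').
$sample = "# Netscape HTTP Cookie File\n"
        . "example.invalid\tFALSE\t/\tFALSE\t0\tsession\tabc123\n";
echo count(parse_cookie_jar($sample)), " cookie(s) found\n";
```

An empty result after the login request would point at the first half of the script, which is the same check as opening the file by hand.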
|

12-13-2005, 10:14 AM
|
|
WebProWorld Pro
|
|
Join Date: Aug 2004
Location: Maryland
Posts: 219
|
|
I added the HEADER option to both $ch handles, and the second one returned the following header.
The socomcookies.txt file I made is empty. The permission set is 777.
http://www.eastcoastassassins.com/get_stats.php
|

12-13-2005, 02:32 PM
|
|
WebProWorld Pro
|
|
Join Date: Aug 2004
Location: Maryland
Posts: 219
|
|
OK, I made small changes to the script we've worked on. This is what I have now:
Code:
<?php
$user_socom_name = "";
$user_socom_pass = "";
$socom_stats_for = "INSANE.CASPER";
// login and store cookie data
$cookielocation = "socomcookies.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookielocation);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookielocation);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$url = "https://socom3html-prod.svo.pdonline.scea.com:10079/SOCOM3_HTML/account/Account_Login_Submit.jsp";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
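// urlencode() keeps characters such as "&" or "=" in the values from breaking the POST body
curl_setopt($ch, CURLOPT_POSTFIELDS, "userName=" . urlencode($user_socom_name) . "&passWord=" . urlencode($user_socom_pass));
$result = curl_exec($ch);
// close the handle so cURL writes the cookies to the jar file
curl_close($ch);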
// get actual data
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookielocation);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookielocation);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$url = "http://socom3html-prod.svo.pdonline.scea.com:10070/SOCOM3_HTML/stats/Stats_CareerSearch_Submit.jsp?userName=" . $socom_stats_for . "&gameMode=0";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
$content = curl_exec($ch);
curl_close($ch);
// point root-relative links at the original server (ereg_replace() was removed in PHP 7)
$content = str_replace('"/', '"http://socom3html-prod.svo.pdonline.scea.com:10070/', $content);
echo $content;
?>
Now the socomcookies.txt file has something in it:
It still will not allow me to access the page we are aiming for, though. I know you're busy, but I just wanted to update you.
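A side note on the last step of the script, which rewrites every "/ in the page: that also touches any quoted text that happens to start with a slash. A narrower sketch is below. It only rewrites href, src and action attributes and leaves protocol-relative (//host) values alone. The function name absolutize_links is made up, and the sample HTML is invented for illustration.

```php
<?php
// Hypothetical helper: turns root-relative links into absolute ones.
function absolutize_links(string $html, string $base): string
{
    $base = rtrim($base, '/');
    // Match href/src/action attributes whose value starts with exactly one "/".
    $pattern = '~\b(href|src|action)=(["\'])/(?!/)~i';
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            // $m[1] is the attribute name, $m[2] the opening quote.
            return $m[1] . '=' . $m[2] . $base . '/';
        },
        $html
    );
    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Invented sample markup.
$page = '<a href="/SOCOM3_HTML/stats/x.jsp">stats</a><p>say "/hi"</p>';
echo absolutize_links($page, 'http://socom3html-prod.svo.pdonline.scea.com:10070/'), "\n";
```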
|

12-13-2005, 03:01 PM
|
|
WebProWorld Member
|
|
Join Date: Jul 2003
Location: Eastern US
Posts: 86
|
|
Wow, this is really stumping me. I only have a couple more things you can try:
- Add curl_setopt($ch, CURLOPT_VERBOSE, 1); to each of the executions (should show all of cURL's output and such)
- Change the follow location option to true (1) like this for both of them: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); That was one thing I just noticed, and it should take care of the redirect problem
- If those don't help, I would try posting on a PHP dedicated forum such as http://www.phpdn.net. Although I am experienced with it, sometimes it just takes a fresh look :D
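To see whether the second request is being redirected, the raw output from a request made with CURLOPT_HEADER set to 1 and CURLOPT_FOLLOWLOCATION set to 0 can be split into status, Location header and body. This is only a sketch for a single response. The function name inspect_response and the sample response are made up, and nothing here is known about what the real server sends.

```php
<?php
// Hypothetical helper: splits one raw HTTP response (headers + body, as
// returned by curl_exec() with CURLOPT_HEADER enabled) into its parts.
function inspect_response(string $raw): array
{
    // Headers and body are separated by the first blank line.
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $body = $parts[1] ?? '';

    // The status line looks like "HTTP/1.1 302 Found".
    $status = null;
    $statusLine = array_shift($headerLines);
    if (preg_match('~^HTTP/\S+\s+(\d{3})~', $statusLine, $m)) {
        $status = (int) $m[1];
    }

    // Pick out the Location header, if any (header names are case-insensitive).
    $location = null;
    foreach ($headerLines as $line) {
        if (stripos($line, 'Location:') === 0) {
            $location = trim(substr($line, strlen('Location:')));
        }
    }
    return ['status' => $status, 'location' => $location, 'body' => $body];
}

// Made-up sample response.
$info = inspect_response("HTTP/1.1 302 Found\r\nLocation: /login-page\r\n\r\n");
echo $info['status'], ' -> ', $info['location'], "\n";
```

If the stats request comes back as a redirect to a login page, that would suggest the session cookie from the first request is not being sent with the second one.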
Let me know what you get.
EDIT: lol, went to grab something to eat in the middle of typing this and saw that you responded again, let me check what you've got out.
EDIT2: That is very strange, it is accessing the first page fine and not the second... at this point, I would go ahead and post on the forum I suggested, since they are dedicated to PHP. Obviously you can refer to this thread so that you don't have to retype everything. Very sorry I couldn't be much of a help, but let me know what you find, it might be useful to me in case I ever run into a difficult site.
Now that I think about it, you could just go chew out the web design team for making such a difficult website... lol.
|

12-13-2005, 03:29 PM
|
|
WebProWorld Pro
|
|
Join Date: Aug 2004
Location: Maryland
Posts: 219
|
|
Please don't apologize, you've already taken the script leaps and bounds past where I had it.
I will post on the other forum you suggested. Thanks again for all of your help, and I'll post the fix - if I ever get it.
Also, if you think it would help, I'll get you a username and password to use. There's no financial information stored, just game stats. So the only thing that could be messed up is his score, but he'd probably hate me for life for that.
Let me know if you're interested.
|

12-13-2005, 08:29 PM
|
|
WebProWorld Member
|
|
Join Date: Jul 2003
Location: Eastern US
Posts: 86
|
|
Absolutely, I can't stand to leave a problem unfixed :D. I will send you a PM with contact info (although you could just send it that way, it's up to you). You can trust me with that information; as a teenage guy (almost an adult though, lol), I know what games mean to the people that play them 8).
|