How to screen scrape or grab some parts of a website?

I want to grab the traffic news from this website:
Code:
http://www.onemotoring.com.sg/publish/o ... _news.html


I was able to screen scrape the whole page. However, I only want to grab the traffic news, which is in the table. Is there any way I could do that?

Code in my prac1.aspx:
Code:
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Prac1.aspx.cs" Inherits="Prac1" %>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>Untitled Page</title>
</head>
<body>
    <form id="form1" runat="server">
    <div>
        Displaying a web page on your own page using Screen Scraping
        <br />
        <asp:Button ID="btnDisplay" runat="server" onclick="btnDisplay_Click"
            Text="Display webpage now" />
        <br />
        <br />
        <asp:Label ID="lblWebpage" runat="server"></asp:Label>
    </div>
    </form>
</body>
</html>


Code in my prac1.aspx.cs:
Code:
using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;
using System.Net; // namespace for WebClient
using System.Text;

public partial class Prac1 : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {

    }

    protected void btnDisplay_Click(object sender, EventArgs e)
    {
        // Download the raw bytes of the page and show them as UTF-8 text in the label.
        WebClient webClient = new WebClient();
        const string strUrl = "http://www.onemotoring.com.sg/publish/onemotoring/en/on_the_roads/traffic_news.html";
        byte[] reqHTML;
        reqHTML = webClient.DownloadData(strUrl);
        UTF8Encoding objUTF8 = new UTF8Encoding();
        lblWebpage.Text = objUTF8.GetString(reqHTML);
    }
}


Any help?
Thank you in advance, guys :))




Author:
Newbie
Posts: 1
Have thanks: 0 time

Can you get at it using the DOM?

Or maybe you will need to use a regular expression (PREG) on the relevant section.

I have done something similar with cURL, using pattern matching to grab the code I wanted.
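
Since the original code is C#, here is a rough sketch of the pattern-matching idea using System.Text.RegularExpressions. It is only an illustration under some assumptions: the class name TableScrapeSketch, its two helper methods and the sample markup are all made up, and I have not checked the real page's HTML, so the pattern may need adjusting (for example if the news table is not the first table on the page, or if tables are nested).

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original post.
public static class TableScrapeSketch
{
    private const RegexOptions Opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;

    // Returns the markup of the first <table>...</table> element, or "" if none is found.
    // The lazy match stops at the first </table>, so nested tables are not handled.
    public static string ExtractFirstTable(string html)
    {
        Match m = Regex.Match(html, @"<table\b[^>]*>.*?</table>", Opts);
        return m.Success ? m.Value : string.Empty;
    }

    // Splits table markup into rows, each row being a list of plain-text cells.
    public static List<List<string>> ExtractRows(string tableHtml)
    {
        var rows = new List<List<string>>();
        foreach (Match row in Regex.Matches(tableHtml, @"<tr\b[^>]*>(.*?)</tr>", Opts))
        {
            var cells = new List<string>();
            foreach (Match cell in Regex.Matches(row.Groups[1].Value, @"<t[dh]\b[^>]*>(.*?)</t[dh]>", Opts))
            {
                // Strip inner tags, decode entities such as &amp;, and trim whitespace.
                string text = Regex.Replace(cell.Groups[1].Value, "<[^>]+>", string.Empty);
                cells.Add(WebUtility.HtmlDecode(text).Trim());
            }
            rows.Add(cells);
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up sample markup standing in for the downloaded page.
        string html =
            "<html><body><h1>Traffic News</h1>" +
            "<table id=\"news\"><tr><th>Time</th><th>Message</th></tr>" +
            "<tr><td>15:40</td><td>Accident on <b>PIE</b> &amp; heavy traffic</td></tr></table>" +
            "</body></html>";

        string table = ExtractFirstTable(html);
        foreach (List<string> cells in ExtractRows(table))
        {
            Console.WriteLine(string.Join(" | ", cells));
        }
    }
}
```

In the btnDisplay_Click handler from the question, the string returned by ExtractFirstTable could be assigned to lblWebpage.Text in place of the full page. Regular expressions are fragile against markup changes, so a real HTML parser is probably the better choice if the page layout changes often.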


Author:
Newbie
Posts: 3
Have thanks: 0 time

Download biterscripting from
Code:
http://www.biterscripting.com

Start biterscripting and enter the following command.

The entire code below is just one command. Enter the whole command on one line.

Code:
script SS_WebPageToCSV.txt page("http://www.onemotoring.com.sg/publish/onemotoring/en/on_the_roads/traffic_news.html") number(11)


Try it now. This particular script seems to have been written just for you :-) It is open source. I did not write it, but I have been using it and other biter scripts.

Hope this helps. I am assuming you are getting this data only for your personal use and not to republish.

Randi


Author:
Newbie
Posts: 1
Have thanks: 0 time