php - Query google search engine? -


I am trying to query the Google search engine by date and process the first page of results. The query I am using returns results that are not in the date range I set; if I copy the same query into Google it honours the dates, but it does not from the PHP script. The script returns current (normal) results, as if the date parameter were not set. Part of the code I used is below. The query I am referring to is the one assigned to the $url variable in the snippet.

Query: https://www.google.com/search?q='.$query.'&source=lnt&tbs=cdr%3a1%2'.$startdate.$enddate.'&tbm=

$query = $_POST['query'];
$query = str_replace(" ", "+", $query);

// Fall back to a default range when no dates are submitted
if ($_POST['start_date'] == '') {
    $startday = '1';
    $startmonth = '11';
    $startyear = '2011';
}
if ($_POST['end_date'] == '') {
    $endday = '1';
    $endmonth = '11';
    $endyear = '2013';
}
$startdate = 'ccd_min%3a'.$startmonth.'%2f'.$startday.'%2f'.$startyear.'.%2';
$enddate = 'ccd_max%3a'.$endmonth.'%2f'.$endday.'%2f'.$endyear.'';

if ($_POST['query'] != '') {
    $url = 'https://www.google.com/search?q='.$query.'&source=lnt&tbs=cdr%3a1%2'.$startdate.$enddate.'&tbm=';
    echo $url .'<p>';
    $html = file_get_html($url);
    $searchresults = array();
    $linkobjs = $html->find('h3.r a');
    foreach ($linkobjs as $linkobj) {
        $link = trim($linkobj->href);

        // if it is not a direct link but a URL reference is found inside it, extract it
        if (!preg_match('/^https?/', $link) && preg_match('/q=(.+)&amp;sa=/U', $link, $matches) && preg_match('/^https?/', $matches[1])) {
            $link = $matches[1];
        } else if (!preg_match('/^https?/', $link)) { // skip if it is not a valid link
            continue;
        }
        array_push($searchresults, $link);
    }
}

Google serves a different HTML structure to clients without JavaScript enabled, and file_get_html($url) is such a client. Temporarily disable JavaScript in Chrome and inspect the page. That way you can be sure which div IDs, classes, etc. to use in your script.
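As a complement to inspecting the page in the browser, you can also look at the markup your script actually received. The following is only a sketch: summarizeLinkParents() is a hypothetical helper (not part of simple_html_dom), it uses PHP's bundled DOM extension, and the sample markup is a stand-in for whatever Google returns to you.

```php
<?php
// Hypothetical helper: count which tag.class combinations directly wrap
// <a> elements, so you can see which selector (e.g. "h3.r a") fits the
// markup that the script, not the browser, received.
function summarizeLinkParents(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely well-formed; collect warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $counts = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $parent = $anchor->parentNode;
        if (!$parent instanceof DOMElement) {
            continue;
        }
        $class = trim($parent->getAttribute('class'));
        $key = $parent->tagName;
        if ($class !== '') {
            // Turn class="a b" into ".a.b" so the key reads like a CSS selector.
            $key .= '.' . implode('.', preg_split('/\s+/', $class));
        }
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    arsort($counts); // most frequent wrapper first
    return $counts;
}

// Stand-in markup; in practice pass the string returned by file_get_contents($url).
$sample = '<div><h3 class="r"><a href="/url?q=http://example.com/&amp;sa=U">A</a></h3>'
        . '<h3 class="r"><a href="/url?q=http://example.org/&amp;sa=U">B</a></h3>'
        . '<p><a href="/preferences">Settings</a></p></div>';
print_r(summarizeLinkParents($sample));
```

If the most frequent wrapper is not the one your find() call targets, the selector is the first thing to adjust.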


Update based on comments:

Google doesn't allow searching by date range via a direct URL if JavaScript is disabled. However, you can still use the daterange Google operator to find pages indexed by Googlebot within a specified date range. The dates submitted must be in Julian date format, and fractions should be omitted for the operator to work properly.

Example: daterange:2452671-2452671 lisbon

The daterange operator requires at least one proper search term and can be combined with other operators.


gregoriantojd()

To convert a Gregorian date to a Julian date you can use the PHP function gregoriantojd( int $month , int $day , int $year ), e.g.:

$startdate = gregoriantojd(12, 28, 2011); // 2455924
$enddate = gregoriantojd(12, 28, 2014);   // 2457020
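If the dates arrive as strings (for example from a form), they have to be validated and converted before they go into the operator. The sketch below assumes a 'Y-m-d' input format; toJulianDay() and buildDateRangeOperator() are hypothetical helper names. Instead of gregoriantojd(), which needs the calendar extension, it derives the Julian day number from the Unix timestamp of UTC midnight, using the fact that 1 January 1970 is Julian day 2440588. For the two dates above it yields the same numbers.

```php
<?php
// Hypothetical helper: convert a 'Y-m-d' string to a whole Julian day number.
// Assumption: input is a Gregorian calendar date in exactly that format.
function toJulianDay(string $date): int
{
    $utc = new DateTimeZone('UTC');
    // The leading "!" resets the time fields, so we get midnight UTC.
    $parsed = DateTimeImmutable::createFromFormat('!Y-m-d', $date, $utc);
    // Reject unparsable input and silent roll-overs such as 2011-02-30.
    if ($parsed === false || $parsed->format('Y-m-d') !== $date) {
        throw new InvalidArgumentException("Invalid date: $date");
    }
    // Midnight UTC is an exact multiple of 86400 seconds, so intdiv() is exact.
    return intdiv($parsed->getTimestamp(), 86400) + 2440588;
}

// Hypothetical helper: build the daterange operator from two date strings.
function buildDateRangeOperator(string $start, string $end): string
{
    $from = toJulianDay($start);
    $to = toJulianDay($end);
    if ($from > $to) {
        throw new InvalidArgumentException('Start date is after end date');
    }
    return "daterange:$from-$to";
}

echo buildDateRangeOperator('2011-12-28', '2014-12-28'), "\n";
// daterange:2455924-2457020
```

Because the result is always an integer, the "no fractions" requirement of the operator is met automatically.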

Your search $url should look like this:

$url = "https://www.google.pt/search?q=lisbon+daterange:2455924-2457020&btnG=Search&num=100&gbv=1";
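Replacing spaces with "+" by hand breaks as soon as the search term contains characters such as "&" or "#". A possible alternative, sketched here, is to let http_build_query() do the encoding. buildSearchUrl() is a hypothetical helper; the parameter names simply mirror the URL above, and whether Google still honours them is an assumption.

```php
<?php
// Hypothetical helper: assemble the search URL with proper URL encoding.
// $dateRange is expected to be a ready-made operator such as
// "daterange:2455924-2457020".
function buildSearchUrl(string $terms, string $dateRange, int $numResults = 100): string
{
    $params = [
        'q'    => trim($terms) . ' ' . $dateRange,
        'btnG' => 'Search',
        'num'  => $numResults,
        'gbv'  => 1, // parameter taken from the URL in this answer
    ];
    // The separator is passed explicitly so the result does not depend on
    // the arg_separator.output ini setting.
    return 'https://www.google.com/search?' . http_build_query($params, '', '&');
}

echo buildSearchUrl('lisbon', 'daterange:2455924-2457020'), "\n";
// https://www.google.com/search?q=lisbon+daterange%3A2455924-2457020&btnG=Search&num=100&gbv=1
```

The colon of the operator is sent as %3A, which is the standard encoding of the same character.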

Final code:

include_once("simple_html_dom.php");

$startdate = gregoriantojd(12, 28, 2011); // 2455924
$enddate = gregoriantojd(12, 28, 2014);   // 2457020
$nresults = "100";
$query = "lisbon";

$url = "https://www.google.com/search?q=$query+daterange:$startdate-$enddate&btnG=Search&num=$nresults&gbv=1";

echo $url .'<p>';
$html = file_get_html($url);
$searchresults = array();
$linkobjs = $html->find('h3.r a');
foreach ($linkobjs as $linkobj) {
    $link = trim($linkobj->href);

    // if it is not a direct link but a URL reference is found inside it, extract it
    if (!preg_match('/^https?/', $link) && preg_match('/q=(.+)&amp;sa=/U', $link, $matches) && preg_match('/^https?/', $matches[1])) {
        $link = $matches[1];
    } else if (!preg_match('/^https?/', $link)) { // skip if it is not a valid link
        continue;
    }
    array_push($searchresults, $link);
}
print_r($searchresults);

/*
Array (
    [0] => http://www.cnn.com/2014/01/25/travel/lisbon-coolest-city/
    [1] => http://www.tripadvisor.com/tourism-g189158-lisbon_lisbon_district_central_portugal-vacations.html
    etc...
*/
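The regular expression in the loop depends on "&amp;sa=" following the target URL and leaves any percent-encoding in place. A more tolerant variant, sketched under the assumption that redirect links keep the "/url?q=..." shape, parses the query string instead. extractTargetUrl() is a hypothetical helper, not part of simple_html_dom.

```php
<?php
// Hypothetical helper: return the destination of a result link, or null
// when the href is neither a direct http(s) link nor a "/url?q=..." redirect.
function extractTargetUrl(string $href): ?string
{
    // The raw attribute value may still contain entities such as "&amp;".
    $href = html_entity_decode(trim($href), ENT_QUOTES | ENT_HTML5);

    if (preg_match('#^https?://#i', $href)) {
        return $href; // already a direct link
    }

    $queryString = parse_url($href, PHP_URL_QUERY);
    if (!is_string($queryString)) {
        return null; // no query string at all
    }

    // parse_str() splits the parameters and percent-decodes their values.
    parse_str($queryString, $params);
    $target = $params['q'] ?? null;

    if (is_string($target) && preg_match('#^https?://#i', $target)) {
        return $target;
    }
    return null;
}

var_dump(extractTargetUrl('/url?q=http://www.cnn.com/2014/01/25/travel/lisbon-coolest-city/&amp;sa=U'));
var_dump(extractTargetUrl('/preferences?hl=en'));
```

Inside the loop this would replace the if/else chain: call the helper, skip the link when it returns null, and push the result otherwise.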
