In this part of the tutorial (Building a simple scraping website), we finish up by displaying the articles we fetched in the previous part on the home page.
Series Topics:
- Building A Simple Scraping Website With PHP Laravel Part1: Beginning
- Building A Simple Scraping Website With PHP Laravel Part2: Dashboard and Crud
- Building A Simple Scraping Website With PHP Laravel Part3: Article Display
We will update our home controller and add the code required to display the latest articles and articles by a specific category.
Open app/Http/Controllers/HomeController.php and modify it like this:
<?php

namespace App\Http\Controllers;

use App\Article;
use App\Category;
use Illuminate\Http\Request;

class HomeController extends Controller
{
    public function index()
    {
        $articles = Article::orderBy('id', 'DESC')->paginate(20);

        return view('home')->withArticles($articles);
    }

    public function getArticleDetails($id)
    {
        $article = Article::find($id);

        return view('details')->withArticle($article);
    }

    public function getCategory($id)
    {
        $category = Category::find($id);

        $articles = Article::where('category_id', $id)->orderBy('id', 'DESC')->paginate(20);

        return view('category')->withArticles($articles)->withCategory($category);
    }
}
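The views link to `article-details/{id}` and `category/{id}`, so the three controller methods need matching routes. If the earlier parts of the series did not already define them, a minimal sketch for routes/web.php could look like the following. It assumes the string-based `Controller@method` syntax of the Laravel version this series appears to use; newer Laravel versions expect a different controller reference style.

```php
<?php

// routes/web.php (sketch; skip any route you already defined in an earlier part)

use Illuminate\Support\Facades\Route;

// Home page: latest articles, paginated
Route::get('/', 'HomeController@index');

// Single article page; matches url('article-details/' . $article->id) in the views
Route::get('article-details/{id}', 'HomeController@getArticleDetails');

// Articles of one category; matches url('category/' . $article->category_id) in the views
Route::get('category/{id}', 'HomeController@getCategory');
```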
resources/views/home.blade.php
@extends('layout')

@section('content')
    <div class="row">
        <div class="col-md-12">
            <h2>Articles</h2>

            @if(count($articles) > 0)
                @foreach($articles as $article)
                    <div class="row">
                        <div class="col-md-12">
                            @if(!empty($article->image))
                                <img src="{{ $article->image }}" class="pull-left img-responsive thumb margin10 img-thumbnail" width="200" />
                            @endif

                            <h4><a href="{{ url('article-details/' . $article->id) }}">{{ $article->title }}</a></h4>
                            <span class="label label-info"><a href="{{ url('category/'.$article->category_id) }}">{{$article->category->title}}</a></span>

                            @if(!empty($article->excerpt))
                                <article>
                                    <p>{!! $article->excerpt !!}</p>
                                </article>
                            @endif

                            <em>Source: </em><a class="label label-danger" href="{{ $article->source_link }}" target="_blank">{{ $article->website->title }}</a>
                            <a class="btn btn-warning pull-right" href="{{ url('article-details/' . $article->id) }}">READ MORE</a>
                        </div>
                    </div>
                    <hr/>
                @endforeach

                @if(count($articles) > 0)
                    <div class="pagination">
                        <?php echo $articles->render(); ?>
                    </div>
                @endif
            @else
                <i>No articles found</i>
            @endif
        </div>
    </div>
@endsection
resources/views/category.blade.php
@extends('layout')

@section('content')
    <div class="row">
        <div class="col-md-12">
            <h2>Category: {{$category->title}}</h2>

            @if(count($articles) > 0)
                @foreach($articles as $article)
                    <div class="row">
                        <div class="col-md-12">
                            @if(!empty($article->image))
                                <img src="{{ $article->image }}" class="pull-left img-responsive thumb margin10 img-thumbnail" width="200" />
                            @endif

                            <h4><a href="{{ url('article-details/' . $article->id) }}">{{ $article->title }}</a></h4>
                            <span class="label label-info">{{$article->category->title}}</span>

                            @if(!empty($article->excerpt))
                                <article>
                                    <p>{!! $article->excerpt !!}</p>
                                </article>
                            @endif

                            <em>Source: </em><a class="label label-danger" href="{{ $article->source_link }}" target="_blank">{{ $article->website->title }}</a>
                            <a class="btn btn-warning pull-right" href="{{ url('article-details/' . $article->id) }}">READ MORE</a>
                        </div>
                    </div>
                    <hr/>
                @endforeach

                @if(count($articles) > 0)
                    <div class="pagination">
                        <?php echo $articles->render(); ?>
                    </div>
                @endif
            @else
                <i>No articles found</i>
            @endif
        </div>
    </div>
@endsection
resources/views/details.blade.php
@extends('layout')

@section('content')
    <div class="row">
        <div class="col-md-12">
            <h2>{{ $article->title }}</h2>

            <div class="row">
                <div class="col-md-12">
                    <img src="{{ $article->image }}" class="pull-left img-responsive thumb margin10 img-thumbnail" />

                    <span class="label label-info"><a href="{{ url('category/'.$article->category_id) }}">{{$article->category->title}}</a></span>

                    <em>Source: </em><a class="label label-danger" href="{{ $article->source_link }}" target="_blank">{{ $article->website->title }}</a>

                    <article>
                        <p>{!! $article->content !!}</p>
                    </article>
                </div>
            </div>
        </div>
    </div>
@endsection
Download source
https://bitbucket.org/webmobtuts/web_scraper/src/master/
What’s Missing?
Now that you have tried scraping by clicking the link, you need a way to run it automatically. This needs to run in something like a cron job; I leave this part to you as an assignment.
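One possible way to approach this assignment is Laravel's task scheduler. The sketch below assumes you have wrapped the scraping logic in an Artisan command named `scrape:links`. That command name is hypothetical and not part of this tutorial's code, so substitute whatever you create. Only the `schedule()` method is shown; the rest of the generated Kernel class stays as it is.

```php
<?php

// app/Console/Kernel.php (sketch: only the schedule() method changes)

namespace App\Console;

use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    // ... keep the rest of the generated class (command registration, etc.) unchanged

    protected function schedule(Schedule $schedule)
    {
        // 'scrape:links' is a hypothetical command name used for illustration.
        // Run it every hour and skip a run if the previous one is still going.
        $schedule->command('scrape:links')
                 ->hourly()
                 ->withoutOverlapping();
    }
}
```

The scheduler itself still needs a single cron entry on the server that calls it every minute, for example `* * * * * cd /path-to-your-project && php artisan schedule:run >> /dev/null 2>&1`, where the project path is a placeholder.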
The post like very much like this was very good about this ,very nice.
I got undefined index title? what should I do?
Where exactly?
Clone this repo https://github.com/mnaderian/web-scraper
We have the same code and also we got the same error. The error appears above the table after a few seconds when we click scrape
Give an example of the website that you want to scrape
Hi bro thanks one of the best tuto in the web bro can you add new tutorial about task scheduller using laravel
Ok
I already entered the schema item but the scrape link section was never finished.
The scrap doesn’t stop twisting
what do you mean stop twisting
what do you mean with never finished?
i got error when i click scrape undefined index source_link
Why? what’s the url you want to scrape?
thank you very informative post , i have a issue with route when try to edit previous session (Missing required parameters for [Route: links.update] [URI: dashboard/links/{link}]. (View: C:\xampp\htdocs\laravel\webscraping\resources\views\dashboard\link\edit.blade.php)
This is a working version
https://bitbucket.org/webmobtuts/web_scraper/src/master/
This working version has exactly the same issue when you click on an edit button. I am using laravel 6.
It’s working with NY times site but not with my site. It shows Scraping done but actually no data fetched. Strange!
Remember that for scraping to work:
* Scraping works on server-rendered websites; it fails on sites that load their data through JavaScript or AJAX.
* Some websites work through a proxy, so you need to configure that first.
* Some websites use firewalls to prevent their data from being fetched.
* Even with all of that in place, adjust the schema: if the schema or selectors are not composed properly, scraping will fail.
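A quick way to tell the first and last points apart is to test your selector against the raw HTML the server returns, without a browser. The helper below is a minimal sketch using PHP's built-in DOM extension; the function name `countMatches` is made up for illustration, and it takes an XPath expression because the built-in extension has no CSS selector support, so translate your selector accordingly.

```php
<?php

/**
 * Count how many nodes in a raw HTML string match an XPath expression.
 * Hypothetical helper for debugging selectors; not part of the tutorial code.
 */
function countMatches(string $html, string $xpath): int
{
    $doc = new DOMDocument();

    // Real-world markup is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpath);

    // query() returns false when the expression is malformed.
    return $nodes === false ? 0 : $nodes->length;
}

// Inline sample standing in for a fetched page.
$sample = '<html><body>'
    . '<div class="post"><h2>One</h2></div>'
    . '<div class="post"><h2>Two</h2></div>'
    . '</body></html>';

echo countMatches($sample, '//div[@class="post"]//h2'), PHP_EOL;
```

If you pass the result of `file_get_contents($url)` instead of the sample and get 0 for a selector that clearly matches in the browser, the content is probably rendered by JavaScript or the request is being blocked.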
Ok, thank you.
How can i scrap data which has pagination ? I can only scrap first 10 data only. There are 100 data. total pages 10.
Can you please help with this ?
I think this is simple: you can tweak the code to fetch the pagination links, then make the same HTTP request for each page.
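The first half of that idea, collecting the pagination links from a listing page, could be sketched as below. This uses PHP's built-in DOM extension rather than the scraper library from the tutorial, and the function name `extractPageLinks` and the sample markup are assumptions made up for illustration.

```php
<?php

/**
 * Collect the unique href values of the links matched by an XPath expression.
 * Hypothetical helper; adapt the expression to the target site's pagination markup.
 */
function extractPageLinks(string $html, string $xpath): array
{
    $doc = new DOMDocument();

    // Collect parser warnings silently; scraped markup is often invalid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpath);
    if ($nodes === false) {
        return [];
    }

    $links = [];
    foreach ($nodes as $node) {
        if ($node instanceof DOMElement && $node->hasAttribute('href')) {
            $links[] = $node->getAttribute('href');
        }
    }

    // "Next" usually repeats one of the numbered links, so drop duplicates.
    return array_values(array_unique($links));
}

// Made-up pagination markup standing in for a fetched listing page.
$sample = '<ul class="pagination">'
    . '<li><a href="/news?page=2">2</a></li>'
    . '<li><a href="/news?page=3">3</a></li>'
    . '<li><a href="/news?page=2">Next</a></li>'
    . '</ul>';

print_r(extractPageLinks($sample, '//ul[@class="pagination"]//a'));
```

Each returned link would then be resolved against the site's base URL and passed to the same scraping routine that handles the first page.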
Problem with the SSL CA cert (path? access rights?) for “https://www.nytimes.com/section/politics
please help..
NYTimes works properly for me; it might be something in your operating system.
I have all the code just like yours, I tried to scrape the NY times website and when I click “scrape” the process never ends. I get this error:
Failed to load resource: the server responded with a status of 500 (Internal Server Error)
And the resource that can’t load is dashboard/links/scrape
"Failed to load resource": it seems that you entered an invalid link.
THere is no invalid link, i checked and i have the exact same code as you 🙁
This is the error i get:
jquery-1.12.4.min.js:formatted:4208 POST http://localhost:8000/dashboard/links/scrape 500 (Internal Server Error)
send @ jquery-1.12.4.min.js:formatted:4208
ajax @ jquery-1.12.4.min.js:formatted:3990
(anonymous) @ links:204
dispatch @ jquery-1.12.4.min.js:formatted:2117
r.handle @ jquery-1.12.4.min.js:formatted:1996
The line 204 in links is the $.ajax({}) function in scrape method
I’m trying to scrape the NY times website but it always says “something went wrong”, I copied all the html tags you used in the video example, except for one that has changed
Thanks, but when I click in scrape it take a long time loading and it does not end ever
what I should do, please?
Check whether the Laravel log reports any errors.
thanks for sharing this post is helpful to me
i want to do it
Thanks