Building a Simple Scraping Website with PHP Laravel, Part 3: Article Display

In this part of the tutorial (Building a simple scraping website), we finish up by displaying the articles we fetched in the previous part on the home page.

We will update our home controller and add the code required to display the latest articles and articles by a specific category.

Open app/Http/Controllers/HomeController.php and modify it like this:

<?php

namespace App\Http\Controllers;

use App\Article;
use App\Category;
use Illuminate\Http\Request;

class HomeController extends Controller
{
    // Home page: latest articles first, 20 per page.
    public function index()
    {
        $articles = Article::orderBy('id', 'DESC')->paginate(20);

        return view('home')->withArticles($articles);
    }

    // Single article page; responds with a 404 if the id does not exist.
    public function getArticleDetails($id)
    {
        $article = Article::findOrFail($id);

        return view('details')->withArticle($article);
    }

    // Articles of one category, latest first; 404 if the category does not exist.
    public function getCategory($id)
    {
        $category = Category::findOrFail($id);

        $articles = Article::where('category_id', $id)->orderBy('id', 'DESC')->paginate(20);

        return view('category')->withArticles($articles)->withCategory($category);
    }
}
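
The views in this part link to article-details/{id} and category/{id}, so the controller methods above need matching routes. This part does not show the route file, so the following is only a sketch. It assumes the routes live in routes/web.php, that the home page is served at /, and that the project uses the string-based controller syntax of Laravel 5 to 7.

```php
<?php

// routes/web.php (assumed location; adjust if your routes are defined elsewhere)

use Illuminate\Support\Facades\Route;

// Home page: paginated list of the latest articles (the "/" path is an assumption).
Route::get('/', 'HomeController@index');

// Paths below mirror the url('article-details/...') and url('category/...') calls in the views.
Route::get('article-details/{id}', 'HomeController@getArticleDetails');
Route::get('category/{id}', 'HomeController@getCategory');
```

If the earlier parts of the series already registered these routes, keep those definitions and skip this step.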

resources/views/home.blade.php

@extends('layout')

@section('content')

    <div class="row">
        <div class="col-md-12">
            <h2>Articles</h2>

                @if(count($articles) > 0)

                    @foreach($articles as $article)
                        <div class="row">
                            <div class="col-md-12">
                                @if(!empty($article->image))
                                    <img src="{{ $article->image  }}" class="pull-left img-responsive thumb margin10 img-thumbnail" width="200" />
                                @endif

                                <h4><a href="{{ url('article-details/' . $article->id) }}">{{ $article->title }}</a></h4>
                                <span class="label label-info"><a href="{{ url('category/'.$article->category_id) }}">{{$article->category->title}}</a></span>

                                @if(!empty($article->excerpt))
                                    <article>
                                        <p>{!! $article->excerpt !!}</p>
                                    </article>
                                @endif

                                <em>Source: </em><a class="label label-danger" href="{{ $article->source_link }}" target="_blank">{{ $article->website->title }}</a>
                                <a class="btn btn-warning pull-right" href="{{ url('article-details/' . $article->id) }}">READ MORE</a>
                            </div>
                        </div>
                        <hr/>
                    @endforeach

                    <div class="pagination">
                        {!! $articles->render() !!}
                    </div>

                @else
                        <i>No articles found</i>

                @endif
        </div>
    </div>

@endsection

resources/views/category.blade.php

@extends('layout')

@section('content')

    <div class="row">
        <div class="col-md-12">
            <h2>Category: {{$category->title}}</h2>

                @if(count($articles) > 0)

                    @foreach($articles as $article)
                        <div class="row">
                            <div class="col-md-12">

                                @if(!empty($article->image))
                                    <img src="{{ $article->image  }}" class="pull-left img-responsive thumb margin10 img-thumbnail" width="200" />
                                @endif

                                <h4><a href="{{ url('article-details/' . $article->id) }}">{{ $article->title }}</a></h4>
                                <span class="label label-info">{{$article->category->title}}</span>

                                @if(!empty($article->excerpt))
                                    <article>
                                        <p>{!! $article->excerpt !!}</p>
                                    </article>
                                @endif

                                <em>Source: </em><a class="label label-danger" href="{{ $article->source_link }}" target="_blank">{{ $article->website->title }}</a>
                                <a class="btn btn-warning pull-right" href="{{ url('article-details/' . $article->id) }}">READ MORE</a>
                            </div>
                        </div>
                        <hr/>
                    @endforeach

                    <div class="pagination">
                        {!! $articles->render() !!}
                    </div>

                @else
                        <i>No articles found</i>

                @endif
        </div>
    </div>

@endsection

resources/views/details.blade.php

@extends('layout')

@section('content')

    <div class="row">
        <div class="col-md-12">
            <h2>{{ $article->title }}</h2>
            <div class="row">
                <div class="col-md-12">
                    @if(!empty($article->image))
                        <img src="{{ $article->image  }}" class="pull-left img-responsive thumb margin10 img-thumbnail" />
                    @endif
                    <span class="label label-info"><a href="{{ url('category/'.$article->category_id) }}">{{$article->category->title}}</a></span>
                    <em>Source: </em><a class="label label-danger" href="{{ $article->source_link }}" target="_blank">{{ $article->website->title }}</a>
                    <article>
                        <p>{!! $article->content !!}</p>
                    </article>
                </div>
            </div>
        </div>
    </div>

@endsection

Download source

https://bitbucket.org/webmobtuts/web_scraper/src/master/

What’s Missing?

So far, scraping only runs when you click the scrape link. To run it automatically, you need something like a cron job. I leave this part to you as an exercise.
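
As a starting point for that exercise, here is a minimal sketch using Laravel's built-in task scheduler. It assumes the Laravel 5 to 7 project layout (app/Console/Kernel.php) and a hypothetical Artisan command named scrape:links that wraps the scraping logic; the series does not define such a command, so you would need to write it yourself.

```php
<?php

namespace App\Console;

use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    /**
     * Define the application's command schedule.
     */
    protected function schedule(Schedule $schedule)
    {
        // 'scrape:links' is a hypothetical command name, not part of the tutorial code.
        // Run it every hour and skip a run if the previous one is still in progress.
        $schedule->command('scrape:links')
                 ->hourly()
                 ->withoutOverlapping();
    }
}
```

The scheduler itself still has to be triggered once a minute by a single cron entry on the server, along the lines of: * * * * * cd /path-to-your-project && php artisan schedule:run >> /dev/null 2>&1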

29 Comments

  1. I liked this post very much. It was very good, very nice.

  2. I got an "undefined index: title" error. What should I do?

    1. Where exactly?

      1. Clone this repo https://github.com/mnaderian/web-scraper
        We have the same code and also we got the same error. The error appears above the table after a few seconds when we click scrape

        1. Give an example of the website that you want to scrape

  3. Hi, thanks, this is one of the best tutorials on the web. Can you add a new tutorial about the task scheduler in Laravel?

    1. Ok

  4. I already entered the schema item but the scrape link section was never finished.

    1. The scrap doesn’t stop twisting

      1. what do you mean stop twisting

    2. what do you mean with never finished?

  5. I got an error when I click Scrape: undefined index source_link

    1. Why? What's the URL you want to scrape?

  6. Thank you, very informative post. I have an issue with a route when I try to edit a previous session: Missing required parameters for [Route: links.update] [URI: dashboard/links/{link}]. (View: C:\xampp\htdocs\laravel\webscraping\resources\views\dashboard\link\edit.blade.php)

      1. This working version has exactly the same issue when you click on an edit button. I am using Laravel 6.

  7. It’s working with NY times site but not with my site. It shows Scraping done but actually no data fetched. Strange!

    1. Remember that for scraping to work:
      * The site must be server-rendered. Scraping fails on sites that fetch their data through JavaScript or AJAX.
      * Some websites work through a proxy, so you need to configure that first.
      * Some websites use firewalls to prevent their data from being fetched.
      * On top of all that, adjust the schema. If the schema or selectors are not composed properly, scraping will fail.

    2. Ok, thank you.

      How can I scrape data that is paginated? I can only scrape the first 10 items. There are 100 items across 10 pages.

      Can you please help with this?

      1. I think this is simple: you can tweak the code to fetch the pagination links, then make the same HTTP request for each page.
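
The idea in the reply above (collect the pagination links, then repeat the request for each page) can be sketched in plain PHP with the DOM extension. This is not the tutorial's scraper code: the function name extractPaginationLinks, the default XPath expression, and the naive URL joining are all assumptions for illustration, and the selector has to be adapted to the target site's markup.

```php
<?php

/**
 * Collect unique pagination URLs from a listing page's HTML.
 *
 * Hypothetical helper: the default XPath assumes Bootstrap-style
 * <ul class="pagination"> markup, which many sites do not use.
 */
function extractPaginationLinks(
    string $html,
    string $baseUrl,
    string $xpath = "//ul[contains(@class, 'pagination')]//a[@href]"
): array {
    if (trim($html) === '') {
        return [];
    }

    $dom = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ((new DOMXPath($dom))->query($xpath) as $anchor) {
        $href = trim($anchor->getAttribute('href'));

        // Skip empty links and in-page anchors.
        if ($href === '' || $href[0] === '#') {
            continue;
        }

        // Naive resolution: treat every non-absolute link as relative to the site root.
        if (!preg_match('#^https?://#i', $href)) {
            $href = rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
        }

        // Array keys de-duplicate links such as "2" and "Next" pointing to the same page.
        $links[$href] = true;
    }

    return array_keys($links);
}
```

Each returned URL can then go through the same request-and-parse step the scraper already runs for the first page.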

  8. Problem with the SSL CA cert (path? access rights?) for “https://www.nytimes.com/section/politics

    please help..

    1. NYTimes works properly for me. It might be something in your operating system.

  9. I have all the code just like yours, I tried to scrape the NY times website and when I click “scrape” the process never ends. I get this error:
    Failed to load resource: the server responded with a status of 500 (Internal Server Error)
    And the resource that can’t load is dashboard/links/scrape

    1. "Failed to load resource": it seems that you entered an invalid link.

      1. There is no invalid link, I checked and I have the exact same code as you 🙁

      2. This is the error i get:

        jquery-1.12.4.min.js:formatted:4208 POST http://localhost:8000/dashboard/links/scrape 500 (Internal Server Error)
        send @ jquery-1.12.4.min.js:formatted:4208
        ajax @ jquery-1.12.4.min.js:formatted:3990
        (anonymous) @ links:204
        dispatch @ jquery-1.12.4.min.js:formatted:2117
        r.handle @ jquery-1.12.4.min.js:formatted:1996

        The line 204 in links is the $.ajax({}) function in scrape method

  10. I’m trying to scrape the NY times website but it always says “something went wrong”, I copied all the html tags you used in the video example, except for one that has changed

  11. Thanks, but when I click Scrape it takes a long time loading and never finishes.
    What should I do, please?

    1. Check whether the Laravel log reports any errors.
