  • Universities of Applied Sciences
  • Metropolia Ammattikorkeakoulu (Metropolia University of Applied Sciences)
  • Theses

Building Application Powered by Web Scraping

Phan, Huy (2019)

 


All rights reserved. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-201904175517
Abstract
The ability to collect and process online content can help businesses make informed decisions. With the explosion of data available online, this process cannot practically be accomplished by manual browsing, but it can be done with web scraping, an automated technique that collects only the necessary data. This paper examines the use of web scrapers in building web applications in order to identify the major advantages and challenges of web scraping. Two applications based on web scrapers were built to study how a scraper can help developers retrieve and analyze data. One has a web scraper backend that fetches data from web stores on demand. The other scrapes and accumulates data over time.
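
To make the idea of collecting only the necessary data concrete, here is a minimal sketch of the extraction step using Python's standard library. It is not the implementation from the thesis: the sample page, the `price` class name, and the names `PriceParser` and `extract_prices` are assumptions made up for this illustration, and a real scraper would first download the page over HTTP.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collect the text of every element whose class is exactly 'price'.

    The markup convention (class="price") is hypothetical.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        # Any closing tag ends the current price element (flat markup assumed).
        self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def extract_prices(html: str) -> list[str]:
    """Return the price strings found in an HTML document."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


# Stand-in for a downloaded web store page.
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Keyboard</span> <span class="price">49.90</span></li>
  <li><span class="name">Mouse</span> <span class="price">19.50</span></li>
</ul>
"""

print(extract_prices(SAMPLE_PAGE))
```

Everything on the page except the price elements is ignored, which is the point of scraping as opposed to manual browsing.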

A good web scraper requires a robust, fault-tolerant, multi-component architecture. The retrieval logic can be complicated because the data can come in different formats. A typical application based on a web scraper requires regular maintenance in order to function smoothly. Site owners may not want a robot scraper to visit and extract data from their sites, so it is important to check a site's policy before trying to scrape its content.
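
One common way a site publishes its crawling policy is a robots.txt file. The sketch below shows how such a policy could be checked with Python's standard `urllib.robotparser` module. It is an illustration only: the robots.txt content, the `shop.example` URLs, the `example-scraper` user agent, and the `build_policy` helper are all assumptions, and a site's terms of service may impose further restrictions that robots.txt does not capture.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would download it from the site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 5
"""

USER_AGENT = "example-scraper"  # assumed name, not taken from the thesis


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object that can be queried."""
    policy = RobotFileParser()
    policy.parse(robots_txt.splitlines())
    return policy


policy = build_policy(SAMPLE_ROBOTS_TXT)

# Product pages are not disallowed, the checkout path is.
print(policy.can_fetch(USER_AGENT, "https://shop.example/products/1"))
print(policy.can_fetch(USER_AGENT, "https://shop.example/checkout/cart"))

# Requested pause between requests in seconds, or None if the file sets none.
print(policy.crawl_delay(USER_AGENT))
```

A scraper would run this check before each request and skip any URL for which `can_fetch` returns `False`.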

It would be beneficial to look into ways to optimize scraper traffic. The next step after data retrieval is a well-defined pipeline that processes the raw data and keeps only the meaningful data the developer intended to collect.
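
As a rough illustration of such a processing pipeline, the sketch below normalizes scraped records, discards unusable ones, and removes duplicates. The record layout (`name` and `price` keys), the price format, and the names `Product`, `parse_price`, and `clean` are assumptions chosen for this example and do not come from the thesis.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Product:
    name: str
    price: Decimal


def parse_price(raw: str) -> Optional[Decimal]:
    """Turn a scraped string such as ' 19,50 € ' into a Decimal, or None if unusable."""
    cleaned = raw.replace("€", "").replace("\xa0", " ").strip().replace(",", ".")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def clean(raw_records: Iterable[dict]) -> Iterator[Product]:
    """Yield each valid product once, with whitespace and price normalized."""
    seen = set()
    for record in raw_records:
        name = " ".join(record.get("name", "").split())  # collapse whitespace
        price = parse_price(record.get("price", ""))
        if not name or price is None:
            continue  # drop records missing a usable name or price
        product = Product(name, price)
        if product in seen:
            continue  # drop exact duplicates
        seen.add(product)
        yield product


# Hypothetical raw scraper output.
RAW = [
    {"name": "  Wireless   Mouse ", "price": " 19,50 € "},
    {"name": "Wireless Mouse", "price": "19.50"},
    {"name": "Keyboard", "price": "N/A"},
    {"name": "", "price": "5.00"},
]

for product in clean(RAW):
    print(product)
```

Keeping each stage a small function makes it easier to adjust one step, for example the price format, when a scraped site changes its layout.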