Basically, how do I do web scraping?
Suppose you are browsing a website with a great many pages. You like it, and now you want to keep all of that content/data for yourself.
Saving it manually would be very tedious, if not next to impossible.
You notice that the webpages are all similar, which suggests basic client-server-database interactions behind them.
Suppose the links to the webpages are in sequence,
e.g.:
www.examplewebsite.com/aaa
www.examplewebsite.com/aab
www.examplewebsite.com/aac
www.examplewebsite.com/aad
…
www.examplewebsite.com/zzx
www.examplewebsite.com/zzy
www.examplewebsite.com/zzz
Every webpage uses the same HTML ids/classes,
e.g.:
#name
#price
.photo
How can you, as a client, write a programme that saves the necessary data from each webpage?
That is, the programme should generate the links, send a request to the server, receive the response and data from the server, and save the relevant data (by extracting it) in client memory.
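The first steps listed above (generate the links, send the requests, receive the responses) could be sketched roughly as follows in Python, using only the standard library. This is a minimal sketch, not a definitive implementation: it assumes the pages really do run from aaa to zzz in plain alphabetical order, that the site is served over plain http, and that a one-second pause between requests is acceptable to the site. The names generate_urls, fetch and crawl are made up for illustration.

```python
import itertools
import string
import time
import urllib.request

# Placeholder host taken from the question; the http scheme is an assumption.
BASE_URL = "http://www.examplewebsite.com/"


def generate_urls(base=BASE_URL, length=3, alphabet=string.ascii_lowercase):
    """Yield base + 'aaa', base + 'aab', ..., base + 'zzz' in order."""
    for letters in itertools.product(alphabet, repeat=length):
        yield base + "".join(letters)


def fetch(url, timeout=10):
    """Download one page and return its HTML as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(urls, fetch_page=fetch, delay=1.0):
    """Fetch each URL in turn and yield (url, html), skipping pages that fail."""
    for url in urls:
        try:
            html = fetch_page(url)
        except OSError as error:  # URLError and timeouts are OSError subclasses
            print(f"skipping {url}: {error}")
            continue
        yield url, html
        time.sleep(delay)  # be polite to the server between requests
```

Looping over crawl(generate_urls()) would then hand back one (url, html) pair per page that loads. The fetch_page parameter is there so the loop can be tried out with a fake downloader before pointing it at a real site.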
You can use one of the many libraries available online. For example, try BeautifulSoup in Python. It has good documentation on how to go about scraping information from a particular webpage.
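As a rough illustration of the BeautifulSoup suggestion, the sketch below pulls the #name, #price and .photo elements from the question out of one page and writes the results to a CSV file. It assumes the beautifulsoup4 package is installed and that .photo is an img tag with a src attribute. The sample HTML and the function names extract_record and save_records are invented for the example, since the real page layout is not shown in the thread.

```python
import csv

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def extract_record(html):
    """Pull the name, price and photo URL out of one page's HTML."""
    soup = BeautifulSoup(html, "html.parser")
    name = soup.select_one("#name")
    price = soup.select_one("#price")
    photo = soup.select_one(".photo")
    return {
        "name": name.get_text(strip=True) if name else None,
        "price": price.get_text(strip=True) if price else None,
        # Assumption: .photo is an <img>; .get gives None if src is missing.
        "photo": photo.get("src") if photo else None,
    }


def save_records(records, path):
    """Write the extracted records to a CSV file, one row per page."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["name", "price", "photo"])
        writer.writeheader()
        writer.writerows(records)


# Invented sample page, standing in for one of the real webpages.
SAMPLE_HTML = """
<html><body>
  <h1 id="name">Blue Widget</h1>
  <span id="price"> 9.99 </span>
  <img class="photo" src="/img/blue-widget.jpg">
</body></html>
"""

record = extract_record(SAMPLE_HTML)
print(record)
```

Missing elements come back as None rather than raising an error, so one odd page does not stop a long run. Calling save_records on a list of such records would produce a CSV with one row per page.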
I’m using the PhantomJS driver for the very same thing, for statistics retrieval.
rb08
January 24, 2015, 6:04pm
4
Hope to get some help from your advice, guys. Thanks for the reply.
gkcs
January 28, 2015, 12:24am
8
Check the OP’s profile. You will know
@gkcs
You seem to be in seventh heaven, dude
c:
You are indeed delighted, I see.
Screw this forum, you know.
Like I care if they suspend me;
there is always a new registration.
Still, this behaviour doesn't affect the question that was asked.