Data from APIs

Sometimes the data for a web page is loaded dynamically, i.e. on demand. This is often done for performance reasons, to make the initial load faster. You can usually spot this pattern by spinners or loaders, or by content appearing incrementally.

In the browser, you can observe this in the Network tab of the developer tools. For example, in Chrome, open the menu “View > Developer > Developer Tools” and switch to the Network tab.

Having trouble finding the right request? Try searching the responses for a value that you can see on the page.
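If clicking through requests one by one gets tedious, one option is to export the Network log as a HAR file and search it with a script. The sketch below assumes an export that includes response bodies. The function name `find_requests_containing` and the sample data are hypothetical, made up for illustration.

```python
import base64


def find_requests_containing(har, needle):
    """Return URLs of HAR entries whose response body contains `needle`."""
    matches = []
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        text = content.get("text") or ""
        # HAR marks base64-encoded bodies with an "encoding" field; decode first.
        if content.get("encoding") == "base64":
            text = base64.b64decode(text).decode("utf-8", errors="replace")
        if needle in text:
            matches.append(entry["request"]["url"])
    return matches


# Tiny hand-made HAR-like structure (hypothetical data, not a real capture).
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/index.html"},
                "response": {"content": {"text": "<html>loading...</html>"}},
            },
            {
                "request": {"url": "https://example.com/catalog.php"},
                "response": {"content": {"text": '[{"id":"B0.1","size":"87,69"}]'}},
            },
        ]
    }
}

print(find_requests_containing(sample_har, "87,69"))
```

For a real export, the `har` argument would come from loading the saved file with `json.load`.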

Now you can find the API request that yields the required data:

curl https://gedimino37.lt/catalog.php

In this case, that would yield structured data:

[{"id":"B0.1","status":"sold","direction":"PR","floor":"0","size":"87,69","rooms":"6"}]
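All values in this response are strings, and the size uses a decimal comma, so some conversion is usually needed before analysis. A minimal sketch, assuming that `floor` and `rooms` are whole numbers and `size` is a decimal number (the field meanings are guessed from their names, and `normalize_record` is a hypothetical helper):

```python
import json

# The sample response from above, as a JSON string.
raw = '[{"id":"B0.1","status":"sold","direction":"PR","floor":"0","size":"87,69","rooms":"6"}]'


def normalize_record(record):
    """Convert the numeric-looking string fields of one record to numbers."""
    return {
        **record,
        "floor": int(record["floor"]),
        # Swap the decimal comma for a decimal point before converting.
        "size": float(record["size"].replace(",", ".")),
        "rooms": int(record["rooms"]),
    }


records = [normalize_record(r) for r in json.loads(raw)]
print(records[0]["size"], records[0]["rooms"])  # prints: 87.69 6
```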

The code snippet in Python to retrieve this kind of data:

import json
from dphelper import DPHelper

# Build request headers for the target site
helper = DPHelper()
headers = helper.create_headers(authority="gedimino37.lt")

# Fetch the response body and parse it as JSON
content = helper.from_url('https://gedimino37.lt/catalog.php', headers=headers)
data = json.loads(content)

Or, making the request itself with the widely used requests library instead of our helper (the helper is still used here to build the headers):

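import requests
from dphelper import DPHelper

# Build request headers for the target site
helper = DPHelper()
headers = helper.create_headers(authority="gedimino37.lt")

# Fetch the catalog, fail loudly on HTTP errors, and decode the JSON body
r = requests.get('https://gedimino37.lt/catalog.php', headers=headers)
r.raise_for_status()
data = r.json()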
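If you would rather not depend on the helper at all, the same request can be sketched with only the Python standard library. The header values below are assumptions chosen for illustration, not what `create_headers` produces, and `build_request` and `fetch_json` are hypothetical names.

```python
import json
import urllib.request

CATALOG_URL = "https://gedimino37.lt/catalog.php"


def build_request(url):
    """Create a GET request with hand-written headers (assumed values)."""
    headers = {
        "Accept": "application/json",
        "User-Agent": "Mozilla/5.0 (compatible; data-collection-sketch)",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_json(url, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a real network request, so it is left commented out):
# data = fetch_json(CATALOG_URL)
```

Whether the server accepts these headers is not guaranteed; if the response differs from what the browser receives, compare them with the request headers shown in the Network tab.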
