S5P-PAL Product Search

UPDATE 2024-03-18: A new version of the Data Portal API was released, containing some backwards-incompatible changes. Consult the Upgrade Guide for information on how to adapt your existing scripts.

Interactive download

S5P-PAL product files can be selected and downloaded using the SpatioTemporal Asset Catalog (STAC) browser interface built on top of our implementation of the STAC Collection Specification.

Because STAC is an open standard, you are not restricted to using this S5P-PAL-hosted GUI. If you have access to another STAC browser or viewer, you can point it directly towards the S5P-PAL 's5p-l2' catalogue at:

https://data-portal.s5p-pal.com/api/s5p-l2/collection.json

(or any of the "root collections" for individual product types you will see displayed on the screen when browsing) and it should work out of the box.

Programmatic access

Similarly, you can use a STAC library, such as the PySTAC Client for Python, to programmatically access the S5P-PAL product catalogue and, for example, obtain download links for product files that way. (There are similar packages and tools for other languages.)

S5P-PAL implements the STAC Item Search API, exposing a search endpoint for querying items in the catalogue based on geospatial or temporal criteria. This search endpoint is not available through the current version of the browser interface, so for now it can only be used programmatically.

Item Search Examples

In this section we give some basic examples to help you on your way, using the aforementioned PySTAC Client library for Python. PySTAC Client requires that you also have the lower-level PySTAC library installed.

The following is a simple script that will print some information about the most recently generated product in the catalogue, and then download it. (This script uses the Python Requests library to handle the actual download.)

from pystac import Collection
from pystac_client import ItemSearch
import requests
import hashlib

COLLECTION = "https://data-portal.s5p-pal.com/api/s5p-l2/collection.json"

def get_most_recent_product(collection_url):
    collection = Collection.from_file(collection_url)
    endpoint = collection.get_single_link("search").target

    items = ItemSearch(
        endpoint,
        sortby=[{"field": "properties.archive_date", "direction": "desc"}],
        max_items=1,
    ).items()
    item = list(items)[0]

    download_url = item.assets["download"].href
    product_filename = item.properties["physical_name"]
    product_hash = item.properties["hash"]

    print(f"Downloading {product_filename}...")
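    r = requests.get(download_url)
    r.raise_for_status()  # stop here if the server returned an error
    with open(product_filename, "wb") as product_file:
        product_file.write(r.content)
    # Re-read the file from disk so the check covers what was actually written
    with open(product_filename, "rb") as product_file:
        file_hash = "md5:" + hashlib.md5(product_file.read()).hexdigest()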
    print("Checking hash...")
    assert file_hash == product_hash
    print("Product was downloaded correctly")


if __name__ == "__main__":
    get_most_recent_product(COLLECTION)
Downloading S5P_PAL__L2__TCWV___20240127T031604_20240127T041428_32585_03_010500_20240130T075820.nc...
Checking hash...
Product was downloaded correctly

As you can see in the code, we configure an instance of the ItemSearch class to query the endpoint. The various filtering parameters you can supply are listed in the PySTAC API reference, but of particular interest are datetime and intersects. The former allows searching for products within a time interval; the latter allows searching for products whose footprint intersects a polygon. For example:

timefilter = "2018-05-01"
# items() returns a one-shot iterator, so keep the results in a list
items = list(ItemSearch(endpoint, datetime=timefilter).items())
print(f"found {len(items)} products for time period {timefilter}")

geofilter = {
    'type': 'Polygon',
    'coordinates': [[
        [6.42425537109375, 53.174765470134616],
        [7.344360351562499, 53.174765470134616],
        [7.344360351562499, 53.67393435835391],
        [6.42425537109375, 53.67393435835391],
        [6.42425537109375, 53.174765470134616],
    ]]
}
items = list(ItemSearch(endpoint, datetime=timefilter, intersects=geofilter).items())
print(f"found {len(items)} products for time period {timefilter} when geofiltered")
found 59 products for time period 2018-05-01
found 8 products for time period 2018-05-01 when geofiltered

The first ItemSearch gives us the 59 products generated for May 01, 2018. The second ItemSearch narrows the search down to those products whose footprint intersects with the specified region.
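
Both filter values are plain data: datetime takes a single date or, following the STAC API convention, a start/end interval, and intersects takes a GeoJSON geometry. As a minimal sketch, they could be built from a bounding box and a pair of dates as follows. Note that bbox_polygon and day_interval are hypothetical helper names used for illustration only; they are not part of PySTAC Client.

```python
from datetime import date, datetime, time, timezone


def bbox_polygon(west, south, east, north):
    """Return a GeoJSON Polygon covering a longitude/latitude bounding box."""
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # a GeoJSON ring must end on its starting point
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def day_interval(first_day, last_day):
    """Return an RFC 3339 'start/end' interval spanning whole UTC days."""
    start = datetime.combine(first_day, time.min, tzinfo=timezone.utc)
    end = datetime.combine(last_day, time(23, 59, 59), tzinfo=timezone.utc)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


# Same region as in the example above, now over a three-day period
geofilter = bbox_polygon(6.42425537109375, 53.174765470134616,
                         7.344360351562499, 53.67393435835391)
timefilter = day_interval(date(2018, 5, 1), date(2018, 5, 3))
```

The resulting values can be passed to ItemSearch as intersects=geofilter and datetime=timefilter in the same way as before.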

The call to items() returns objects of type pystac.item.Item, which can then be further examined to retrieve, for example, the product download links:

print("Download links:")
for item in list(items):
    url = item.assets['download'].href
    print(f"  {url}")
Download links:
  https://data-portal.s5p-pal.com/api/s5p-l2/download/06d39838-5903-4d4c-a39e-91506d7f3011
  https://data-portal.s5p-pal.com/api/s5p-l2/download/b57cfe5b-d6c6-4200-9f55-ee89658de8d2
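
Once a product has been downloaded from one of these links, it can be checked against the hash stored in the catalogue without loading the whole file into memory. The following is a minimal sketch, assuming the hash property keeps the "algorithm:hexdigest" form seen in the first example (such as "md5:..."). The name verify_product_hash is a hypothetical helper, not something provided by S5P-PAL or PySTAC.

```python
import hashlib


def verify_product_hash(path, expected, chunk_size=1024 * 1024):
    """Check a file against an 'algorithm:hexdigest' string such as 'md5:ab12...'."""
    algorithm, _, expected_digest = expected.partition(":")
    digest = hashlib.new(algorithm)  # raises ValueError for an unknown algorithm
    with open(path, "rb") as product_file:
        # Read fixed-size chunks so memory use does not grow with file size
        for chunk in iter(lambda: product_file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest.lower()
```

It would be called as verify_product_hash(product_filename, item.properties["hash"]) after the download has been written to disk.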

Item Search extensions

The S5P-PAL STAC search implementation also supports the STAC Sort and the STAC Filter fragments. These are OpenAPI extensions to the basic search query parameters accepted by the Search API.

The Sort fragment introduces the sortby parameter we already used in the 'get_most_recent_product' example above, allowing you to define fields by which to sort the query results (in that case: descending by archive_date).

The Filter fragment provides a mechanism for searching based on item attributes. For example:

items = list(ItemSearch(
    endpoint,
    filter="s5p:file_type='L2__TCWV__' and (s5p:orbit<2830 or s5p:orbit=12841)",
).items())

will select only those products matching that particular filter expression.

Finally, it is also possible for the filter to be specified as a JSON dictionary (as detailed in the Filter Fragment documentation) rather than as a text string. Our previous example would then become:

items = list(ItemSearch(
    endpoint,
    filter=
    {
        "op": "and", "args": [
            { "op": "=", "args": [ {"property": "s5p:file_type"}, "L2__TCWV__" ] },
            {
                "op": "or", "args": [
                    { "op": "<" , "args": [ {"property": "s5p:orbit"}, 2830 ] },
                    { "op": "=" , "args": [ {"property": "s5p:orbit"}, 12841 ] }
                ]
            }
        ]
    }
).items())

Note that in the text example the single quotes around string values such as the file type specification L2__TCWV__ are mandatory, whereas in the JSON version they should not be used. Similarly, in the text version, operators such as 'and' and 'or' are case-insensitive, whereas in the JSON version they must be lowercase.
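
When a filter has to be assembled from variables, building the JSON form in code sidesteps these quoting rules. The following small sketch reproduces the JSON dictionary shown above. The helper names (prop, compare, all_of, any_of) are hypothetical and not part of PySTAC Client.

```python
def prop(name):
    """Reference an item property by name."""
    return {"property": name}


def compare(op, name, value):
    """Build a comparison such as s5p:orbit < 2830."""
    return {"op": op, "args": [prop(name), value]}


def all_of(*conditions):
    """Combine conditions with a lowercase 'and', as the JSON form requires."""
    return {"op": "and", "args": list(conditions)}


def any_of(*conditions):
    """Combine conditions with a lowercase 'or'."""
    return {"op": "or", "args": list(conditions)}


tcwv_filter = all_of(
    compare("=", "s5p:file_type", "L2__TCWV__"),
    any_of(
        compare("<", "s5p:orbit", 2830),
        compare("=", "s5p:orbit", 12841),
    ),
)
```

The resulting tcwv_filter dictionary can then be passed to ItemSearch as filter=tcwv_filter.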

Browsing examples

It is also possible to use just the PySTAC library to programmatically query the Catalog endpoint. Typically this is a less interesting use case, because traversing the catalogue tree is more easily done with the interactive browser, but it is certainly possible.

For example, the TCWV Catalog can also be reached from the root catalog, and then descended into further to obtain the Items for a given day:

from pystac import Collection


def browse_to_tcwv_items():

    coll = Collection.from_file("https://data-portal.s5p-pal.com/api/catalog.json")
    tcwvcoll = coll.get_child(id="s5p-l2").get_child(id="L2__TCWV__")
    daycoll = tcwvcoll.get_child(id="2024").get_child(id="01").get_child(id="27")
    print("Collection ID:", daycoll.id)
    print("Collection description:", daycoll.description)
    print("Spatial extent:", daycoll.extent.spatial.bboxes)
    print("Temporal extent:", daycoll.extent.temporal.intervals)

    items = daycoll.items()
    print("Items in collection:")
    for index, item in enumerate(items, 1):
        product_filename = item.properties["physical_name"]
        download_url = item.assets["download"].href
        print(f"  product {index:2}: {product_filename}")
        print(f"         url: {download_url}")


if __name__ == "__main__":
    browse_to_tcwv_items()

This will output all the TCWV products for Jan 27, 2024:

Collection ID: 27
Collection description: Collection for product type L2__TCWV__ (2024-1-27)
Spatial extent: [[-180.0, -90.0, 180.0, 90.0]]
Temporal extent: [[datetime.datetime(2024, 1, 27, 0, 0, tzinfo=tzutc()), datetime.datetime(2024, 1, 28, 0, 0, tzinfo=tzutc())]]
Items in collection:
  product  1: S5P_PAL__L2__TCWV___20240127T013434_20240127T023257_32584_03_010500_20240128T174228.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/7e571431-f66f-498c-9fcb-cbc799dae649
  product  2: S5P_PAL__L2__TCWV___20240127T031604_20240127T041428_32585_03_010500_20240130T075820.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/087d06cc-6b1f-4e7a-8768-9e414ede6320
  product  3: S5P_PAL__L2__TCWV___20240127T045734_20240127T055558_32586_03_010500_20240201T085353.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/3b6ce09b-99c6-457c-8946-2fcda491e8b2
  product  4: S5P_PAL__L2__TCWV___20240127T063904_20240127T073728_32587_03_010500_20240129T121230.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/b2ba8538-c55f-4b56-ab38-100cf93d82eb
  product  5: S5P_PAL__L2__TCWV___20240127T082034_20240127T091858_32588_03_010500_20240129T115724.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/b281594e-923f-4981-8064-9207c2d9318b
  product  6: S5P_PAL__L2__TCWV___20240127T100205_20240127T110028_32589_03_010500_20240129T115830.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/ffb954ee-9cf8-47cb-ae3c-c80e12a6765d
  product  7: S5P_PAL__L2__TCWV___20240127T114335_20240127T124159_32590_03_010500_20240129T041716.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/e852fb89-66f1-4ab8-bdc9-fcbbb04bfbf5
  product  8: S5P_PAL__L2__TCWV___20240127T132505_20240127T142329_32591_03_010500_20240129T052727.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/048fb098-efa0-4052-8b89-1665b691f419
  product  9: S5P_PAL__L2__TCWV___20240127T150635_20240127T160459_32592_03_010500_20240129T072825.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/9f5f2705-3d5b-49b2-954d-b73b8092cb71
  product 10: S5P_PAL__L2__TCWV___20240127T164805_20240127T174022_32593_03_010500_20240129T091326.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/88d15e7b-d26a-4b7b-8e27-d9efd0e37f7e
  product 11: S5P_PAL__L2__TCWV___20240127T182936_20240127T192759_32594_03_010500_20240129T111927.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/7f4f0b15-ad14-4b22-ac7f-2ad401ad8e29
  product 12: S5P_PAL__L2__TCWV___20240127T201105_20240127T210929_32595_03_010500_20240129T122717.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/0729d256-e171-467b-9f3b-e6b9b4a3ee41
  product 13: S5P_PAL__L2__TCWV___20240127T215236_20240127T225100_32596_03_010500_20240129T151221.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/467208cb-f8c6-44b7-9c08-989dd500ec73
  product 14: S5P_PAL__L2__TCWV___20240127T233406_20240128T003230_32597_03_010500_20240129T172729.nc
         url: https://data-portal.s5p-pal.com/api/s5p-l2/download/7cf90188-d6bc-47cd-84a8-53f4a6b98de6
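
The product file names in this listing carry some metadata themselves. As a hedged sketch, assuming the names follow the usual Sentinel-5P pattern (mission, file class, product type, sensing start, sensing end, orbit, collection, processor version, production time), the fields could be separated as follows. The pattern and the parse_product_name helper are illustrative assumptions and are not taken from the S5P-PAL documentation.

```python
import re
from datetime import datetime

# Assumed layout: S5P_<class:4>_<type:10>_<start>_<end>_<orbit>_<coll>_<proc>_<prod>.nc
PRODUCT_NAME = re.compile(
    r"^(?P<mission>S5P)_(?P<file_class>.{4})_(?P<file_type>.{10})_"
    r"(?P<start>\d{8}T\d{6})_(?P<end>\d{8}T\d{6})_(?P<orbit>\d{5})_"
    r"(?P<collection>\d{2})_(?P<processor_version>\d{6})_"
    r"(?P<production_time>\d{8}T\d{6})\.nc$"
)


def parse_product_name(filename):
    """Split a product file name into a dictionary of its fields."""
    match = PRODUCT_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected product name: {filename}")
    fields = match.groupdict()
    # Convert the timestamps and the orbit number to native Python types
    for key in ("start", "end", "production_time"):
        fields[key] = datetime.strptime(fields[key], "%Y%m%dT%H%M%S")
    fields["orbit"] = int(fields["orbit"])
    return fields


info = parse_product_name(
    "S5P_PAL__L2__TCWV___20240127T013434_20240127T023257_32584_03_010500_20240128T174228.nc"
)
print(info["file_type"], info["orbit"], info["start"])
```

Where the catalogue exposes the same values as item properties, those remain the more reliable source; parsing the name is mainly useful for files already on disk.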

Limitations

The PySTAC Client package also contains a command-line utility, as well as some other programmatic classes and functions, that are not currently compatible with S5P-PAL's catalogues. This may change in the future, but for now the method described above, using ItemSearch with an explicitly created endpoint, is the way to approach searching. This caveat is of course specific to PySTAC and may not apply to other libraries and tools.

Support

Questions regarding this service can be sent to the ESA EO Support Helpdesk.

This service is provided as part of the Sentinel-5P Product Algorithm Laboratory (S5P-PAL) and contains modified Copernicus Sentinel data processed by S[&]T.