phidown.search
Classes
Module Contents
- class phidown.search.CopernicusDataSearcher
- query_by_filter(base_url: str = 'https://catalogue.dataspace.copernicus.eu/odata/v1/Products', collection_name: str | None = 'SENTINEL-1', product_type: str | None = None, orbit_direction: str | None = None, cloud_cover_threshold: float | None = None, attributes: Dict[str, str | int | float] | None = None, aoi_wkt: str | None = None, start_date: str | None = None, end_date: str | None = None, top: int = 1000, count: bool = False, order_by: str = 'ContentDate/Start desc', burst_mode: bool = False, burst_id: int | None = None, absolute_burst_id: int | None = None, swath_identifier: str | None = None, parent_product_name: str | None = None, parent_product_type: str | None = None, parent_product_id: str | None = None, datatake_id: int | None = None, relative_orbit_number: int | None = None, operational_mode: str | None = None, polarisation_channels: str | None = None, platform_serial_identifier: str | None = None) -> None
Set and validate search parameters for the Copernicus data query.
- Parameters:
base_url (str) – The base URL for the OData API.
collection_name (str, optional) – Name of the collection to search. Defaults to ‘SENTINEL-1’.
product_type (str, optional) – Type of product to filter. Defaults to None.
orbit_direction (str, optional) – Orbit direction to filter (e.g., ‘ASCENDING’, ‘DESCENDING’). Defaults to None.
cloud_cover_threshold (float, optional) – Maximum cloud cover percentage to filter, between 0 and 100. Defaults to None.
attributes (Dict[str, Union[str, int, float]], optional) – Additional attributes for filtering. Defaults to None.
aoi_wkt (str, optional) – Area of Interest in WKT format. Defaults to None.
start_date (str, optional) – Start date for filtering (ISO 8601 format). Defaults to None.
end_date (str, optional) – End date for filtering (ISO 8601 format). Defaults to None.
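top (int, optional) – Maximum number of results to retrieve per request. Must be between 1 and 1000. Defaults to 1000.
count (bool, optional) – Whether to request the total number of matching results. When True, execute_query() paginates if the total exceeds ‘top’. Defaults to False.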
order_by (str, optional) – Field and direction to order results by. Defaults to “ContentDate/Start desc”.
burst_mode (bool, optional) – Enable Sentinel-1 SLC Burst mode searching. Defaults to False.
burst_id (int, optional) – Burst ID to filter (burst mode only). Defaults to None.
absolute_burst_id (int, optional) – Absolute Burst ID to filter (burst mode only). Defaults to None.
swath_identifier (str, optional) – Swath identifier (e.g., ‘IW1’, ‘IW2’) (burst mode only). Defaults to None.
parent_product_name (str, optional) – Parent product name (burst mode only). Defaults to None.
parent_product_type (str, optional) – Parent product type (burst mode only). Defaults to None.
parent_product_id (str, optional) – Parent product ID (burst mode only). Defaults to None.
datatake_id (int, optional) – Datatake ID (burst mode only). Defaults to None.
relative_orbit_number (int, optional) – Relative orbit number (burst mode only). Defaults to None.
operational_mode (str, optional) – Operational mode (e.g., ‘IW’, ‘EW’) (burst mode only). Defaults to None.
polarisation_channels (str, optional) – Polarisation channels (e.g., ‘VV’, ‘VH’) (burst mode only). Defaults to None.
platform_serial_identifier (str, optional) – Platform serial identifier (e.g., ‘A’, ‘B’) (burst mode only). Defaults to None.
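The parameters above end up as conditions in an OData query. The sketch below shows how a few of them could be joined into a single $filter expression. It is only an illustration: build_filter is a hypothetical helper, not part of phidown, and the clause syntax is an assumption based on the OData conventions of the Copernicus Data Space catalogue. The filter string phidown actually builds may differ.

```python
from typing import Optional


def build_filter(
    collection_name: Optional[str] = "SENTINEL-1",
    product_type: Optional[str] = None,
    aoi_wkt: Optional[str] = None,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> str:
    """Join the supplied criteria into one OData $filter string (hypothetical helper)."""
    clauses = []
    if collection_name:
        clauses.append(f"Collection/Name eq '{collection_name}'")
    if product_type:
        # Assumed attribute name 'productType'; product types are string attributes.
        clauses.append(
            "Attributes/OData.CSC.StringAttribute/any("
            "att:att/Name eq 'productType' and "
            f"att/OData.CSC.StringAttribute/Value eq '{product_type}')"
        )
    if aoi_wkt:
        clauses.append(f"OData.CSC.Intersects(area=geography'SRID=4326;{aoi_wkt}')")
    if start_date:
        clauses.append(f"ContentDate/Start gt {start_date}")
    if end_date:
        clauses.append(f"ContentDate/Start lt {end_date}")
    # Criteria left as None are skipped, mirroring the optional parameters above.
    return " and ".join(clauses)


example_filter = build_filter(
    product_type="SLC",
    start_date="2024-01-01T00:00:00.000Z",
    end_date="2024-01-31T00:00:00.000Z",
)
print(example_filter)
```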
- _load_config(config_path=None)
Load the configuration file.
- Parameters:
config_path (str, optional) – Path to the configuration file. Defaults to None.
- Raises:
FileNotFoundError – If the configuration file is not found.
ValueError – If the configuration file is not a valid JSON file.
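The error contract described above (a missing file raises FileNotFoundError, malformed JSON raises ValueError) can be sketched as follows. The load_config function and the sample configuration layout are assumptions made for illustration. The sketch also requires an explicit path, because the default location used when config_path is None is not documented here.

```python
import json
import tempfile
from pathlib import Path


def load_config(config_path: str) -> dict:
    """Load a JSON configuration file (hypothetical stand-in for _load_config)."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Configuration file not found: {path}")
    try:
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    except json.JSONDecodeError as exc:
        # Re-raise with a message that names the offending file.
        raise ValueError(f"Configuration file is not valid JSON: {path}") from exc


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "config.json"
    # The layout of this sample configuration is invented for the example.
    good.write_text('{"SENTINEL-1": {"product_types": ["SLC", "GRD"]}}', encoding="utf-8")
    loaded = load_config(str(good))

    bad = Path(tmp) / "broken.json"
    bad.write_text("{not json", encoding="utf-8")
    try:
        load_config(str(bad))
    except ValueError as exc:
        bad_error = type(exc).__name__

    try:
        load_config(str(Path(tmp) / "missing.json"))
    except FileNotFoundError as exc:
        missing_error = type(exc).__name__
```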
- _validate_collection(collection_name)
Validate the collection name against the available collections in the configuration.
- _get_valid_product_types(collection_name)
Extracts and filters valid product types from a configuration dictionary based on the given collection name.
- _validate_product_type()
Validates the provided product type against a list of valid product types. If the product type is None, the validation is skipped.
- Raises:
ValueError – If the product type is not in the list of valid product types.
TypeError – If the product type is not a string.
- _validate_order_by()
Validate the ‘order_by’ parameter against valid fields and directions.
- Raises:
ValueError – If the ‘order_by’ parameter is invalid.
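A check of this kind can be sketched as a split into field and direction, each tested against an allowed set. The sets below are assumptions: the field names are the sortable fields commonly listed for the Copernicus Data Space catalogue, and phidown may accept a different list or be more lenient about letter case.

```python
# Assumed allowed values; phidown's own lists are not shown in this reference.
VALID_FIELDS = {"ContentDate/Start", "ContentDate/End", "PublicationDate", "ModificationDate"}
VALID_DIRECTIONS = {"asc", "desc"}


def validate_order_by(order_by: str) -> None:
    """Raise ValueError unless order_by looks like '<field> <direction>'."""
    parts = order_by.split()
    if len(parts) != 2:
        raise ValueError(f"order_by must be '<field> <direction>', got {order_by!r}")
    field, direction = parts
    if field not in VALID_FIELDS:
        raise ValueError(f"Unsupported order_by field: {field!r}")
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"Unsupported order_by direction: {direction!r}")


def is_valid_order_by(order_by: str) -> bool:
    """Boolean wrapper, convenient for quick checks."""
    try:
        validate_order_by(order_by)
    except ValueError:
        return False
    return True


print(is_valid_order_by("ContentDate/Start desc"))
print(is_valid_order_by("ContentDate/Start sideways"))
```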
- _validate_top()
Validate the ‘top’ parameter to ensure it is within the allowed range.
- Raises:
ValueError – If the ‘top’ parameter is not between 1 and 1000.
- _validate_cloud_cover_threshold()
Validate the ‘cloud_cover_threshold’ parameter to ensure it is between 0 and 100.
- Raises:
ValueError – If the ‘cloud_cover_threshold’ parameter is not between 0 and 100.
- _validate_orbit_direction()
Validate the ‘orbit_direction’ parameter to ensure it is one of the allowed values.
- Raises:
ValueError – If the ‘orbit_direction’ parameter is not ‘ASCENDING’, ‘DESCENDING’, or None.
- _validate_aoi_wkt() -> None
Validate and normalize the ‘aoi_wkt’ parameter to ensure it is a valid WKT polygon. Automatically fixes common issues like extra whitespace and missing closing coordinates.
- Raises:
ValueError – If the ‘aoi_wkt’ parameter is not a valid WKT polygon.
TypeError – If the ‘aoi_wkt’ parameter is not a string.
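The two automatic fixes mentioned above, trimming extra whitespace and closing an open ring, could look roughly like the sketch below. It is not phidown's implementation. normalize_polygon_wkt is a hypothetical function that handles only single-ring polygons and uses plain string processing instead of a geometry library.

```python
import re


def normalize_polygon_wkt(aoi_wkt: str) -> str:
    """Return a cleaned-up, closed single-ring WKT polygon (hypothetical helper)."""
    if not isinstance(aoi_wkt, str):
        raise TypeError("aoi_wkt must be a string")
    text = " ".join(aoi_wkt.split())  # collapse runs of whitespace, trim the ends
    match = re.fullmatch(r"POLYGON\s*\(\(\s*(.*?)\s*\)\)", text, flags=re.IGNORECASE)
    if not match:
        raise ValueError("aoi_wkt must be a WKT POLYGON with a single ring")

    points = []
    for pair in match.group(1).split(","):
        coords = pair.split()
        if len(coords) != 2:
            raise ValueError(f"Expected 'lon lat' pairs, got {pair.strip()!r}")
        float(coords[0])  # raises ValueError for non-numeric coordinates
        float(coords[1])
        points.append((coords[0], coords[1]))  # keep the original digits

    # Close the ring when the first and last coordinates differ.
    if tuple(map(float, points[0])) != tuple(map(float, points[-1])):
        points.append(points[0])
    if len(points) < 4:
        raise ValueError("A closed polygon ring needs at least four coordinate pairs")

    ring = ", ".join(f"{lon} {lat}" for lon, lat in points)
    return f"POLYGON(({ring}))"


print(normalize_polygon_wkt("  POLYGON (( 10 45,  11 45, 11 46, 10 46 ))  "))
```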
- _validate_time()
Validate the ‘start_date’ and ‘end_date’ parameters to ensure they are in ISO 8601 format and that the start date is earlier than the end date.
- Raises:
ValueError – If the dates are not in ISO 8601 format or if the start date is not earlier than the end date.
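As a rough illustration of this check, the sketch below parses both dates with the standard library and compares them. The function names are hypothetical. Treating timestamps without an offset as UTC is an assumption made here so that mixed inputs can be compared. It is not documented phidown behaviour.

```python
from datetime import datetime, timezone
from typing import Tuple


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO 8601 string; timestamps without an offset are assumed to be UTC."""
    text = value
    if text.endswith("Z"):
        # datetime.fromisoformat() rejects a trailing 'Z' before Python 3.11.
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError as exc:
        raise ValueError(f"Not an ISO 8601 date: {value!r}") from exc
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def validate_time(start_date: str, end_date: str) -> Tuple[datetime, datetime]:
    """Return the parsed dates, or raise ValueError if they are invalid or out of order."""
    start = parse_iso8601(start_date)
    end = parse_iso8601(end_date)
    if start >= end:
        raise ValueError("start_date must be earlier than end_date")
    return start, end


start, end = validate_time("2024-01-01T00:00:00.000Z", "2024-01-31")

try:
    validate_time("2024-02-01", "2024-01-01")
except ValueError as exc:
    order_message = str(exc)
```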
- _validate_attributes()
Validate the ‘attributes’ parameter to ensure it is a dictionary with valid key-value pairs.
- Raises:
TypeError – If ‘attributes’ is not a dictionary, or if its keys are not strings, or if its values are not strings, integers, or floats.
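The type rules above can be expressed as a short loop over the dictionary. This is a sketch under assumptions: validate_attributes and describe are hypothetical names, and the attribute names in the usage lines are invented examples.

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int, float]


def validate_attributes(attributes: Optional[Dict[str, AttributeValue]]) -> None:
    """Raise TypeError unless attributes is None or a dict of str -> str/int/float."""
    if attributes is None:
        return  # nothing to validate
    if not isinstance(attributes, dict):
        raise TypeError("attributes must be a dictionary")
    for key, value in attributes.items():
        if not isinstance(key, str):
            raise TypeError(f"Attribute name must be a string, got {type(key).__name__}")
        if not isinstance(value, (str, int, float)):
            raise TypeError(
                f"Attribute {key!r} must be a str, int, or float, got {type(value).__name__}"
            )


def describe(attributes) -> str:
    """Report the validation outcome as text, for demonstration purposes."""
    try:
        validate_attributes(attributes)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


# Attribute names below are illustrative only.
print(describe({"orbitNumber": 12345, "processingLevel": "LEVEL1"}))
print(describe({"orbitNumber": [12345]}))
```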
- _validate_burst_parameters()
Validate burst-specific parameters.
- Raises:
ValueError – If any burst parameter is invalid.
TypeError – If any burst parameter has the wrong type.
- _initialize_placeholders()
Initializes placeholder attributes for the class instance.
This method sets up several attributes with default values of None to serve as placeholders. These attributes include:
filter_condition (Optional[str]): A string representing a filter condition.
query (Optional[str]): A string representing the query.
url (Optional[str]): A string representing the URL.
response (Optional[requests.Response]): A requests.Response object for HTTP responses.
json_data (Optional[dict]): A dictionary to store JSON data from the response.
df (Optional[pd.DataFrame]): A pandas DataFrame to store tabular data.
- execute_query()
Execute the query and retrieve data.
If count=True and the total number of results exceeds the ‘top’ limit, this method will automatically paginate through all results using multiple requests with the $skip parameter, combining all results into a single DataFrame.
- Returns:
DataFrame containing all retrieved products.
- Return type:
pd.DataFrame
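To make the pagination concrete, the sketch below computes the query parameters for each page once the total count is known: every request keeps the same $top and advances $skip by that amount. page_params is a hypothetical helper, and the exact requests phidown sends are not shown in this reference.

```python
from typing import Dict, List


def page_params(total_count: int, top: int, odata_filter: str) -> List[Dict[str, str]]:
    """Return one set of OData query parameters per page (hypothetical helper)."""
    if not 1 <= top <= 1000:
        raise ValueError("top must be between 1 and 1000")
    return [
        {"$filter": odata_filter, "$top": str(top), "$skip": str(skip)}
        for skip in range(0, total_count, top)
    ]


# 2500 matching products with top=1000 need three requests.
pages = page_params(2500, 1000, "Collection/Name eq 'SENTINEL-1'")
skips = [page["$skip"] for page in pages]
print(skips)
```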
- _execute_paginated_query()
Execute paginated queries with asyncio when the number of results exceeds the ‘top’ limit.
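The general pattern of fetching pages concurrently with asyncio can be sketched as below. The fetch_page coroutine is a stand-in that fabricates records instead of calling the catalogue, so the example runs offline. How phidown issues its HTTP requests, and which client library it uses, is not covered here.

```python
import asyncio
from typing import Dict, List


async def fetch_page(skip: int, top: int, total: int) -> List[Dict[str, int]]:
    """Stand-in for one HTTP request; returns fake records for the requested page."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [{"index": i} for i in range(skip, min(skip + top, total))]


async def fetch_all(total: int, top: int) -> List[Dict[str, int]]:
    """Request every page concurrently and flatten the results in page order."""
    tasks = [fetch_page(skip, top, total) for skip in range(0, total, top)]
    pages = await asyncio.gather(*tasks)  # results come back in task order
    return [record for page in pages for record in page]


records = asyncio.run(fetch_all(total=2500, top=1000))
print(len(records))
```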
- query_by_name(product_name: str) -> pandas.DataFrame
Query Copernicus data by a specific product name. The resulting DataFrame is also stored in self.df.
- Parameters:
product_name (str) – The exact name of the product to search for.
- Returns:
A DataFrame containing the product details. Returns an empty DataFrame if the product is not found or an error occurs.
- Return type:
pd.DataFrame
- Raises:
ValueError – If product_name is empty or not a string.
- search_products_by_name_pattern(name_pattern: str, match_type: str, collection_name_filter: str | None = None, top: int | None = None, order_by: str | None = None) -> pandas.DataFrame
Searches for Copernicus products whose names match a pattern, using one of four match types: ‘exact’, ‘contains’, ‘startswith’, or ‘endswith’. The search can be restricted to a specific collection; if none is given, the instance’s current collection is used when set. The resulting DataFrame is also stored in self.df.
- Parameters:
name_pattern (str) – The pattern to search for in the product name.
match_type (str) – The type of match. Must be one of ‘exact’, ‘contains’, ‘startswith’, ‘endswith’.
collection_name_filter (str, optional) – Collection to restrict this search to. If None, self.collection_name (instance attribute) is used when set. If both are None, no collection filter is applied to this search.
top (int, optional) – Maximum number of results. If None, uses self.top (instance default). Must be between 1 and 1000.
order_by (str, optional) – Field and direction to order results (e.g., ‘ContentDate/Start desc’). If None, uses self.order_by (instance default).
- Returns:
DataFrame with product details. Empty if no match or error.
- Return type:
pd.DataFrame
- Raises:
ValueError – If name_pattern is empty, match_type is invalid, the effective ‘top’ is out of range, or ‘collection_name_filter’ is provided but invalid.
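Each of the four match types corresponds naturally to an OData string expression on the product name. The mapping below is a sketch of that idea. name_filter is a hypothetical helper, and the expressions phidown generates may be formatted differently.

```python
def name_filter(name_pattern: str, match_type: str) -> str:
    """Build an OData filter clause on the product Name (hypothetical helper)."""
    if not isinstance(name_pattern, str) or not name_pattern:
        raise ValueError("name_pattern must be a non-empty string")
    templates = {
        "exact": "Name eq '{p}'",
        "contains": "contains(Name,'{p}')",
        "startswith": "startswith(Name,'{p}')",
        "endswith": "endswith(Name,'{p}')",
    }
    if match_type not in templates:
        raise ValueError(f"match_type must be one of {sorted(templates)}")
    # OData string literals escape a single quote by doubling it.
    escaped = name_pattern.replace("'", "''")
    return templates[match_type].format(p=escaped)


print(name_filter("S1A_IW_SLC", "startswith"))
print(name_filter(".SAFE", "endswith"))
```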
- download_product(eo_product_name: str, output_dir: str, config_file='.s5cfg', verbose=True, show_progress=True)
Download the EO product using the downloader module.
- Parameters:
eo_product_name (str) – Name of the EO product to download.
output_dir (str) – Local output directory for downloaded files.
config_file – Path to the s5cmd configuration file. Defaults to ‘.s5cfg’.
verbose – Whether to print download information. Defaults to True.
show_progress – Whether to show a tqdm progress bar during download. Defaults to True.
- Returns:
True if the download was successful, False otherwise.
- Return type: