Learn about exporting and downloading Marqeta platform datasets using the DiVA API.
The DiVA API enables you to define a Marqeta platform dataset and export it as a compressed CSV file. You can choose between Zip or Gzip compression. After export, you use the API to download the compressed file.
You can export any dataset as a CSV file by sending a GET request to the appropriate endpoint. To construct your endpoint URL, start with the URL you would use to retrieve that same dataset in JSON format, for example:
/views/authorizations/month?program=my_program
Then insert the export_type path parameter (/csv) before the query string, for example:
/views/authorizations/month/csv?program=my_program
By default, the resulting dataset is compressed as a gz file. You can compress it as a zip file by including the compress query parameter, for example:
/views/authorizations/month/csv?compress=zip&program=my_program
Because the export operation is processed asynchronously, you should receive an immediate 202 Accepted response. The JSON-formatted response body contains a token field, whose value you use to download your dataset file.
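The URL construction described above can be sketched as a small helper. This is an illustration only, not part of the DiVA API: the function name build_export_url and its parameters are hypothetical, and it relies solely on the Python standard library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_export_url(json_url, export_type="csv", compress=None):
    """Turn a JSON dataset URL into its file-export equivalent (hypothetical helper).

    json_url    - path or full URL used to retrieve the dataset as JSON
    export_type - export_type path parameter ('csv' in this guide)
    compress    - optional value for the compress query parameter, e.g. 'zip'
    """
    parts = urlsplit(json_url)
    # Append the export type as the last path segment, before the query string
    path = parts.path.rstrip("/") + "/" + export_type
    params = parse_qsl(parts.query)
    if compress is not None:
        # Place compress first, mirroring the example URL in this guide
        params.insert(0, ("compress", compress))
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(params), parts.fragment))


print(build_export_url("/views/authorizations/month?program=my_program"))
print(build_export_url("/views/authorizations/month?program=my_program", compress="zip"))
```

The two calls should reproduce the /csv example URLs shown above, with and without the compress query parameter.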
Note
By default, the DiVA API returns up to 1,048,575 rows in a file export, and generating the file can take several minutes. You can increase the download limit up to 5,000,000 rows by including the max_count=5000000 parameter.
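As a rough sketch of how the max_count parameter might be added to an export URL, the helper below appends it to the query string and rejects values above the documented ceiling. The name with_max_count is hypothetical, and the lower bound of 1 is an assumption rather than something this guide states.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MAX_EXPORT_ROWS = 5_000_000  # upper limit stated in the note above


def with_max_count(export_url, max_count=MAX_EXPORT_ROWS):
    """Return export_url with the max_count query parameter set (hypothetical helper)."""
    # The lower bound of 1 is an assumption; only the upper limit is documented
    if not 1 <= max_count <= MAX_EXPORT_ROWS:
        raise ValueError("max_count must be between 1 and 5,000,000")
    parts = urlsplit(export_url)
    # Drop any existing max_count so the parameter appears only once
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "max_count"]
    params.append(("max_count", str(max_count)))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_max_count("/views/authorizations/month/csv?program=my_program"))
```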
To retrieve your file, send a GET request to the /download?token={my_download_token} endpoint, where {my_download_token} is the value of the token field that was returned in response to your export request, for example:
/download?token=db63c24d8307c24b7e17d33735114dc8f807838a.csv.gz
Note
The token value includes two filename extensions (for example, .csv.gz). You must include these extensions in your request URL.
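The following sketch shows one way to build the download URL while keeping the token's filename extensions intact. The helper names build_download_url and token_extensions are hypothetical, and the code assumes that the part of the token before its first dot contains no dots of its own.

```python
def token_extensions(token):
    """Return the filename extensions carried by the token, e.g. '.csv.gz'."""
    # Assumption: everything from the first '.' onward is the extensions
    dot = token.find(".")
    if dot == -1:
        raise ValueError("token is missing its filename extensions")
    return token[dot:]


def build_download_url(base_url, token):
    """Build the /download URL for a token returned by an export request."""
    token_extensions(token)  # raises if the extensions were stripped
    return base_url.rstrip("/") + "/download?token=" + token


token = "db63c24d8307c24b7e17d33735114dc8f807838a.csv.gz"
print(build_download_url("https://diva-api.marqeta.com/data/v2", token))
print("my_downloaded_file" + token_extensions(token))
```

The same extensions can be reused when naming the saved file, which keeps the local filename consistent with the compression format of the download.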
The API returns one of these responses:
If the job is not finished: The 202 "Accepted" HTTP response code and a plain-text body containing the word Pending.
If the job is finished: The 200 "OK" HTTP response code and the file as an application/octet-stream.
If the job has expired: The 410 "Gone" HTTP response code. Completed jobs expire after 60 minutes.
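The three responses listed above can be mapped to explicit states so that calling code can decide whether to keep waiting. This is a minimal sketch; the names DownloadState, classify_download_status, and should_keep_polling are hypothetical, and any status code not listed above is simply treated as unexpected.

```python
from enum import Enum


class DownloadState(Enum):
    PENDING = "pending"        # 202 Accepted: the job is not finished
    READY = "ready"            # 200 OK: the file is returned
    EXPIRED = "expired"        # 410 Gone: the completed job has expired
    UNEXPECTED = "unexpected"  # any other status code


_STATES = {
    202: DownloadState.PENDING,
    200: DownloadState.READY,
    410: DownloadState.EXPIRED,
}


def classify_download_status(status_code):
    """Map an HTTP status code from /download to a DownloadState."""
    return _STATES.get(status_code, DownloadState.UNEXPECTED)


def should_keep_polling(status_code):
    """Only a pending job is worth asking about again."""
    return classify_download_status(status_code) is DownloadState.PENDING


for code in (202, 200, 410, 401):
    print(code, classify_download_status(code).value, should_keep_polling(code))
```

A polling loop can use should_keep_polling to stop as soon as a job has expired, rather than retrying until its own timeout elapses.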
When saving your file, use the same filename extensions you used in your URL request, for example: my_downloaded_file.csv.gz
The following example of Python code illustrates how you can download an exported report file in CSV format:
Python
import time

import pandas as pd
import requests
from requests.auth import HTTPBasicAuth

# Constants for HTTP response codes
RC_SUCCESS = 200
RC_ACCEPTED = 202
RC_UNAUTHORIZED = 401

# Generate authentication string
username = 'APPLICATION_TOKEN'  # replace APPLICATION_TOKEN with your application token
password = 'ACCESS_TOKEN'  # replace ACCESS_TOKEN with your access token
basic_auth = HTTPBasicAuth(username, password)


# Download an exported file with the specified token
# Parameters:
#   file_token - token of the file to download
#   auth - authentication string
#   base_url - base api path for download url
#   retry_seconds - maximum time to retry, in seconds
# Returns a pandas DataFrame, or None if the file was not ready in time
def getCSV(file_token, auth, base_url, retry_seconds=300):
    # Set timeout to current time plus maximum time to retry
    timeout = time.time() + retry_seconds
    # Build URL to download exported file
    download_file_url = base_url + '/download?token=' + file_token
    # Check status whether the file is ready for download
    code = requests.head(download_file_url, auth=auth).status_code
    while code != RC_SUCCESS and time.time() < timeout:
        time.sleep(1)
        # Retry check status
        code = requests.head(download_file_url, auth=auth).status_code
    if code != RC_SUCCESS:
        return None  # check status timed out
    # Check status succeeded - the file is ready to download
    download_response = requests.get(download_file_url, auth=auth)
    # Save the response content into a temporary file
    with open('temp.csv.gz', 'wb') as file:
        file.write(download_response.content)
    # Read the CSV content from the gzipped file, skipping malformed lines
    # (on_bad_lines requires pandas 1.3 or later; it replaces error_bad_lines)
    return pd.read_csv('temp.csv.gz', compression='gzip', on_bad_lines='skip')


# Build URL to export dataset for resource of interest (e.g. cards)
# in desired file format (e.g. CSV)
api_base_path = 'https://diva-api.marqeta.com/data/v2'
resource_format_path = '/views/cards/detail/csv'
program_selector = '?program=MY_PROGRAM'  # replace MY_PROGRAM with the name of your program
export_dataset_url = api_base_path + resource_format_path + program_selector

# Invoke request to export the dataset
export_response = requests.get(export_dataset_url, auth=basic_auth)
if export_response.status_code == RC_ACCEPTED:
    # Export request succeeded; obtain the CSV file token from the response
    export_file_token = export_response.json().get('token')
    # Call the getCSV function to download the CSV file
    data = getCSV(file_token=export_file_token, auth=basic_auth, base_url=api_base_path)
    # Compare against None; comparing a DataFrame to a string raises an error
    if data is None:
        print('Failure: No timely response')
    else:
        print('Success: Dataset length = ' + str(len(data)))
elif export_response.status_code == RC_UNAUTHORIZED:
    print('Failure: Unauthorized access')  # authentication failed
else:
    print('Failure: Unknown error')  # export request failed