### In-situ Parquet


![In-situ architecture](./in-situ-architecture.png "in-situ-architecture")

### Today's demo:
![In-situ architecture Demo](./in-situ-architecture-demo.png "in-situ-architecture-demo")

### Import necessary libraries and check if the ingester pod is up

In [2]:
import requests
import json

In [3]:
a = requests.get(url='http://localhost:9801/1.0/doc')
a

<Response [200]>

#### Ingesting a new file from S3

##### Input Data Format
```
{
    "provider": "Saildrone",
    "project": "ATOMIC EUREC4A 2020",
    "observations": [
        {
            "time": "2020-01-18T00:00:00Z",
            "latitude": 10.5648704,
            "longitude": -55.9281152,
            "depth": -5.2,
            "platform": {
                "code": "3B",
                "id": "1026",
                "type": "saildrone"
            },
            "device": "101",
            "meta": "report_payload/payload/1026/2020/01/report_payload-payload-1578754800000-1580515140000.snappy.parquet",
            "wind_speed": 6.84,
            "wind_speed_quality": 2,
            "eastward_wind": -6.64,
            "northward_wind": -2.49,
            "wind_component_quality": 2,
            "wind_from_direction": 69.4
        },
        {
            "time": "2020-01-18T00:00:00Z",
            "latitude": 10.5648704,
            "longitude": -55.9281152,
            "depth": 0.5,
            "platform": {
                "code": "3B",
                "id": "1026",
                "type": "saildrone"
            },
            "device": "130",
            "meta": "report_payload/payload/1026/2020/01/report_payload-payload-1578754800000-1580515140000.snappy.parquet",
            "sea_water_temperature": 27.4648,
            "sea_water_temperature_quality": 2,
            "sea_water_salinity": 35.238,
            "sea_water_salinity_quality": 2
        }
   ]
}
```
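Before uploading a file in this format, it can help to check that every observation carries the fields the samples share. The sketch below is only an illustration: the `REQUIRED_KEYS` tuple and the `find_incomplete_observations` helper are hypothetical, and the list of required keys is an assumption based on the two sample observations above, not on the ingester's actual schema.

```python
# Assumed required keys: these appear in both sample observations above.
# The real ingester schema may require more or fewer fields.
REQUIRED_KEYS = ("time", "latitude", "longitude", "depth", "platform")


def find_incomplete_observations(document):
    """Return (index, missing_keys) for each observation lacking a required key."""
    problems = []
    for index, observation in enumerate(document.get("observations", [])):
        missing = [key for key in REQUIRED_KEYS if key not in observation]
        if missing:
            problems.append((index, missing))
    return problems


sample_document = {
    "provider": "Saildrone",
    "project": "ATOMIC EUREC4A 2020",
    "observations": [
        {
            "time": "2020-01-18T00:00:00Z",
            "latitude": 10.5648704,
            "longitude": -55.9281152,
            "depth": -5.2,
            "platform": {"code": "3B", "id": "1026", "type": "saildrone"},
            "wind_speed": 6.84,
        },
        # Deliberately incomplete record to show what the check reports.
        {
            "time": "2020-01-18T00:00:00Z",
            "latitude": 10.5648704,
            "longitude": -55.9281152,
        },
    ],
}

print(find_incomplete_observations(sample_document))
```

An empty list means every observation has the assumed required keys.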

##### In-situ incoming data S3 file Structure

![In-situ incoming data S3 file Structure](./in-situ-s3-input-data.png "in-situ-s3-input-data")

In [6]:
s3_url = 's3://cdms-dev-fsu-in-situ-stage/KTDQ_20180730v20001_str.json.gz'
a = requests.put(url='http://localhost:9801/1.0/ingest_json_s3', headers={'Content-Type':'application/json'}, data=json.dumps({"s3_url": s3_url}))
a.content

b'{"message": "Internal Server Error"}\n'

Sample response when the ingestion succeeds:
`{"message": "ingested, different sha512", "cause": "missing S3 sha512", "job_id": "6e0f3669-4bef-4917-9d40-03e342233ed4"}`
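The `job_id` in a response like the sample is the value the replace request takes later, so it is worth pulling it out of the response body rather than copying it by hand. A minimal sketch follows; the `extract_job_id` helper is hypothetical, and it assumes the body is UTF-8 JSON shaped like the two responses shown above.

```python
import json


def extract_job_id(raw_body):
    """Return the job_id from a response body, or None if the body has none."""
    payload = json.loads(raw_body.decode("utf-8"))
    return payload.get("job_id")


# The two bodies below are copied from the responses shown above.
success_body = (
    b'{"message": "ingested, different sha512", '
    b'"cause": "missing S3 sha512", '
    b'"job_id": "6e0f3669-4bef-4917-9d40-03e342233ed4"}'
)
error_body = b'{"message": "Internal Server Error"}\n'

print(extract_job_id(success_body))
print(extract_job_id(error_body))
```

With a live ingester, `a.content` from the cell above could be passed in place of the literal bytes.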

#### Replacing an existing file from S3

Using query endpoint similar to DOMS 

In [7]:
s3_url = 's3://cdms-dev-fsu-in-situ-stage/KTDQ_20180730v20001_str.json.gz'
job_id = '8fd1f707-c540-4152-88b8-ea5bc10d738c'
a = requests.put(url='http://localhost:9801/1.0/replace_json_s3', headers={'Content-Type':'application/json'}, data=json.dumps({"s3_url": s3_url, "job_id": job_id}))
a.content

b'{"message": "Internal Server Error"}\n'
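Because the replace request needs both an `s3_url` and a `job_id`, a quick client-side check of the two values can catch typos before the request is sent. This is a sketch under stated assumptions: the `build_replace_body` helper is hypothetical, and treating `job_id` as a UUID is inferred only from the two IDs that appear in this notebook, not from the ingester's documentation.

```python
import json
import uuid


def build_replace_body(s3_url, job_id):
    """Return the JSON body for the replace request after basic sanity checks."""
    if not s3_url.startswith("s3://"):
        raise ValueError(f"expected an s3:// URL, got {s3_url!r}")
    # Assumption: job IDs are UUID strings; uuid.UUID raises ValueError otherwise.
    uuid.UUID(job_id)
    return json.dumps({"s3_url": s3_url, "job_id": job_id})


body = build_replace_body(
    "s3://cdms-dev-fsu-in-situ-stage/KTDQ_20180730v20001_str.json.gz",
    "8fd1f707-c540-4152-88b8-ea5bc10d738c",
)
print(body)
```

The returned string can be passed as `data=` in the `requests.put` call shown above.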

#### Parquet Partitions
![In-situ Parquet Partition](./in-situ-parquet-partition.png "in-situ-parquet-partition.png")

### Data subsetting API imitating the DOMS API
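The query cell in this section assembles its URL by hand with f-strings, which makes it easy to repeat a parameter (`platform` appears twice in that URL). As an alternative, here is a minimal sketch that builds the same query string from a dictionary with the standard library. The `build_query_url` helper is hypothetical; the parameter names are copied from the cell below, and keeping `:` and `,` unescaped is a choice made only to match the URL style used in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9801/1.0/query_data_doms"


def build_query_url(base_url, params, bbox):
    """Build a query URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    query = dict(params)
    query["bbox"] = ",".join(str(value) for value in bbox)
    # safe=":," leaves timestamps and the bbox readable instead of percent-encoded.
    return f"{base_url}?{urlencode(query, safe=':,')}"


query_params = {
    "startIndex": 0,
    "itemsPerPage": 20,
    "provider": "Saildrone",
    "project": "atlantic_to_med_2019_to_2020",
    "platform": "3B",
    "minDepth": -5.2,
    "maxDepth": -5.1,
    "variable": "relative_humidity",
    "startTime": "2020-06-01T00:00:00Z",
    "endTime": "2020-06-03T00:00:00Z",
}
url = build_query_url(BASE_URL, query_params, (14, 38.03801, 14.04, 38.03802))
print(url)
```

Since each key can appear only once in the dictionary, the duplicated `platform` parameter cannot occur with this approach.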

In [27]:
start_index = 0
items_per_page = 20
provider = 'Saildrone'
project = 'atlantic_to_med_2019_to_2020'
platform = '3B'
min_depth = -5.2
max_depth = -5.1
variable = 'relative_humidity'
start_time = '2020-06-01T00:00:00Z'
end_time = '2020-06-03T00:00:00Z'
west_aka_min_lon = 14
south_aka_min_lat = 38.03801
east_aka_max_long = 14.04
north_aka_max_lat = 38.03802


requesting_url = f'http://localhost:9801/1.0/query_data_doms?startIndex={start_index}&itemsPerPage={items_per_page}&'\
                 f'provider={provider}&project={project}&platform={platform}&'\
                 f'minDepth={min_depth}&maxDepth={max_depth}&variable={variable}&'\
                 f'startTime={start_time}&endTime={end_time}&'\
                 f'platform=3B&bbox={west_aka_min_lon},{south_aka_min_lat},{east_aka_max_long},{north_aka_max_lat}'
print(requesting_url)
a = requests.get(url=requesting_url)
result = json.loads(a.content.decode())
print(json.dumps(result, indent=2))

http://localhost:9801/1.0/query_data_doms?startIndex=0&itemsPerPage=20&provider=Saildrone&project=atlantic_to_med_2019_to_2020&platform=3B&minDepth=-5.2&maxDepth=-5.1&variable=relative_humidity&startTime=2020-06-01T00:00:00Z&endTime=2020-06-03T00:00:00Z&platform=3B&bbox=14,38.03801,14.04,38.03802
{
  "total": 451,
  "results": [
    {
      "air_pressure": null,
      "air_pressure_quality": null,
      "air_temperature": null,
      "air_temperature_quality": null,
      "depth": -5.2,
      "eastward_wind": -0.96,
      "latitude": 38.0380192,
      "longitude": 14.0336048,
      "meta": null,
      "northward_wind": -0.55,
      "platform": {
        "type": "saildrone",
        "code": "3B",
        "id": "1053"
      },
      "relative_humidity": null,
      "relative_humidity_quality": null,
      "time": "2020-06-01T17:02:00Z",
      "wind_component_quality": 2,
      "wind_from_direction": 60.2,
      "wind_from_direction_quality": null,
      "wind_speed": 1.11,
      "wind_sp