GeoSpace: An Anaconda environment for Earth observation

  1. Introduction
  2. GeoSpace Anaconda environment
  3. My Original Python Tools/modules
  4. Tools divided into categories by application/use
    • GPS data Conversion/Map-Viewer
    • Download/Manipulation/Viewing of NetCDF files

1-Introduction

Since my student days I have focused on extracting information from Earth observation data. Over the last 10 years, alongside dedicated software such as Erdas, ArcGIS, QGIS, Surfer and AutoCAD, I started to dig into the programming world. The evolution of my technical skills can also be seen in the papers I have published over time. For my first published paper, I analysed and visualised the acquired data (it was a GPS survey) mainly with Excel, Surfer and QGIS. In contrast, for my most recent paper I did not use any ready-made tool or software: I developed the tools from scratch by coding. I would like to point out that this says nothing about the quality of a paper; I am just using it as an example to describe my personal growth in the programming world.

That said, over time I developed, for my own interest, many tools to interact with Earth observation data. I wrote them in Python because I consider it a very flexible programming language, although in the past I also mixed languages when really needed (clearly, every programming language has its own useful peculiarities).

However, being able to do everything with one language, when possible, has its benefits. One way to share a set of tools and modules for a particular field or process is “Anaconda”, a distribution of the Python programming language for scientific computing. It makes it very simple to share entire environments, which are natively, and in theory, cross-platform. I say in theory because some modules may require an external library to work properly, and that part needs to be addressed separately on each OS involved. These and other aspects have to be considered when creating an environment to share with the world. The process requires some thought, but the environment is easy to update and to improve/expand over time, as is the case here. I kindly suggest using a Debian-based Linux distribution (such as Ubuntu), which in my opinion is the best OS for this purpose and, more generally, for development. For your information, in terms of practicality I consider macOS a good second choice and Windows the last. With that in mind, I leave the choice to you, but don’t tell me that I did not warn you!

2-GeoSpace Anaconda environment

Below is a link to the GitHub repository of the “GeoSpace” Anaconda environment, where you can find an overview of it and the instructions for its installation.

In this article I would like to focus on my original developments for Earth observation that I embedded in the GeoSpace Anaconda environment.

3-My Original Python Tools/modules

They are listed below, each with a link to its GitHub repository, where you can find more specific information and the installation steps in case you would like to install them individually. (In time I hope to add other tools; when I do, I will update this article, so keep an eye on it!)

– Tool4NC : A Python module for netCDF file manipulation and conversion. [LINK]

– MerOC : A Python module (with a GUI) to download data and manipulate/convert netCDF files. [LINK]

– ads4MO : A Python module that adds new download services to the CMEMS portal (by Copernicus; registration on the portal is required to download the data). It is aimed mainly at big data requests, using just a CLI and HTTP data requests. [LINK]

– FTPsubsetMO : A Python module to download files from the CMEMS FTP servers (by Copernicus; registration is required to download the data) and automatically subset them using many selection criteria. It is aimed mainly at big requests for products with a high number of data requests, where the HTTP protocol fails. It has a very intuitive GUI. [LINK]

– GPSconverter : A Python application to manipulate and view/plot GPS data. It converts data generated by a GPS (mainly GPX files) into many different formats. In addition, it lets the user produce professional maps showing the acquired path or points. It can also generate a GPX file from a CSV file created locally or coming from other sources. [LINK]

As you saw in the short descriptions above, the first four tools in the list (Tool4NC, MerOC, ads4MO and FTPsubsetMO) are mainly used to download and manipulate data in the NetCDF format (Network Common Data Form), which, in a nutshell, is an array-oriented data format for sharing scientific information. This format is very common for storing satellite information, such as that coming from the Sentinel constellation. In particular, MerOC, ads4MO and FTPsubsetMO can download data from the Copernicus Marine Service web portal, CMEMS for short (in charge of distributing data acquired by the Sentinels), either through a simple GUI (FTPsubsetMO and MerOC) or by sending instructions in the terminal/command prompt (ads4MO). I would add that they could potentially be applied to other Copernicus services (there are 6 thematic streams in total; for more info click HERE). Tool4NC and part of MerOC can also convert/manipulate NetCDF data: the former through the CLI or as a Python module, the latter through a simple and intuitive GUI.

The GPSconverter tool is related to GPS navigation and its applications. With this tool I wanted to make it possible to convert data generated by a GPS (mainly GPX/CSV files) into many different formats. In addition, it lets the user produce professional maps showing the acquired path or points. You can generate a GPX file from a CSV file created locally or coming from other sources, convert the GPX file into many other formats, produce maps showing the CSV/GPX data, and much more.

In the next chapter I describe the tools above and, to make things clearer, I divide them into categories based on their application/use.

4-Tools divided into categories by application/use

The Python tools can be divided by type of use/application into the following categories:

Download/Manipulation/Viewing of NetCDF files

GPS data Conversion/Map-Viewer

——————————————

GPS data Conversion/Map-Viewer

——————————————

GPSconverter

A Python application to manipulate and view GPS data. I wanted a fast way to convert and plot all my GPS files automatically and with minimal effort. Below are screenshots of the tabs/functionalities available:

TAB-1 INPUT: Select the starting CSV/GPX file to work with. You can view it and modify it if you wish. Furthermore, you can export your edits as a txt file.

TAB-2 CSV-CONVERTER: If the input file is a CSV, you can convert it to GPX (this unlocks many other conversion possibilities, see TAB-3 below).
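To give an idea of what a CSV to GPX conversion involves, here is a minimal sketch using only the Python standard library. It is not the GPSconverter implementation: the function name `csv_to_gpx` and the column names `lat`, `lon` and `ele` are assumptions made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def csv_to_gpx(csv_text, lat_col="lat", lon_col="lon", ele_col="ele"):
    """Build a GPX 1.1 document (as a string) with one track from CSV rows.

    The column names are assumptions for this sketch, not GPSconverter's.
    """
    ET.register_namespace("", GPX_NS)  # serialise GPX as the default namespace
    gpx = ET.Element(f"{{{GPX_NS}}}gpx", version="1.1", creator="csv_to_gpx sketch")
    trk = ET.SubElement(gpx, f"{{{GPX_NS}}}trk")
    seg = ET.SubElement(trk, f"{{{GPX_NS}}}trkseg")

    for row in csv.DictReader(io.StringIO(csv_text)):
        # float() rejects malformed coordinates before they are written out
        lat, lon = float(row[lat_col]), float(row[lon_col])
        pt = ET.SubElement(seg, f"{{{GPX_NS}}}trkpt", lat=str(lat), lon=str(lon))
        if row.get(ele_col):
            # Elevation is optional in GPX, so it is added only when present
            ET.SubElement(pt, f"{{{GPX_NS}}}ele").text = row[ele_col]

    return ET.tostring(gpx, encoding="unicode")


sample_csv = "lat,lon,ele\n45.0,7.5,240\n45.1,7.6,255\n"
gpx_text = csv_to_gpx(sample_csv)
print(gpx_text)
```

Each CSV row becomes one track point, and all the points end up in a single track segment.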

TAB-3 GPX-CONVERTER: It allows you to convert GPX files. The available conversions are:

  • Convert from GPX to CSV
  • Convert from GPX to JSON
  • Convert from GPX to HTML
  • Convert from GPX to KML/KMZ
  • Convert from GPX to GeoJSON (LINE)
  • Convert from GPX to Shapefile (LINE)
  • Convert from GPX to GeoJSON (POINTS)
  • Convert from GPX to Shapefile (POINTS)
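As an illustration of one of the conversions listed above, the sketch below turns the track points of a GPX file into a GeoJSON LineString using only the standard library. It is an assumption-based example, not the code behind GPSconverter; the function name `gpx_to_geojson_line` is hypothetical.

```python
import json
import xml.etree.ElementTree as ET

NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def gpx_to_geojson_line(gpx_text):
    """Return a GeoJSON Feature (dict) with one LineString built from all track points."""
    root = ET.fromstring(gpx_text)
    # GeoJSON orders coordinates as [longitude, latitude]
    coords = [
        [float(pt.get("lon")), float(pt.get("lat"))]
        for pt in root.iterfind(".//gpx:trkpt", NS)
    ]
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": coords},
        "properties": {},
    }


sample_gpx = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="example">
  <trk><trkseg>
    <trkpt lat="45.0" lon="7.5"/>
    <trkpt lat="45.1" lon="7.6"/>
  </trkseg></trk>
</gpx>"""

feature = gpx_to_geojson_line(sample_gpx)
print(json.dumps(feature, indent=2))
```

For the POINTS variant, each track point would become its own Point feature instead of a vertex of the line.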

TAB-4 MAPS: It allows you to turn GPX/CSV data into a map. The available options are:

  • GPX to GMT-MAP
  • CSV to GMT-MAP
  • HTML to RASTER
  • HTML to FLASK-PROJECT

Below is a video clip of a small and simple example of GPSconverter usage. We start with a CSV file. I view it before converting it to GPX. I then produce a high-quality overview of the GPX data on a topographic map. Then I convert the GPX file to KML/KMZ so it can be easily visualised in Google Earth. I stop here, but much more can be done… It is very intuitive: just explore the tool, and if you want to know more, just contact me.

——————————————–

Download/Manipulation/Viewing of NetCDF files

——————————————–

tool4NC

Once you have created the environment or installed the tool, you can import all the functions of the Python module with:

from tool4NC import *   

For the list of functions available in the “tool4NC” module, please continue reading below.

I have many netCDF files and I would like to convert all of them to CSV:

import os
from tool4NC import nctocsv

Input_DIR = 'the/directory/you/want/to/use'
Out_DIR = 'the/directory/you/want/to/use'

# os.listdir returns bare file names, so run this from inside Input_DIR
for filename in os.listdir(Input_DIR):
    if filename.endswith(".nc"):
        nctocsv(filename, Out_DIR)

I want to overlay data from a variable contained in my netCDF file on my GIS project (as a shapefile):

import os
from tool4NC import nctoshape

Input_DIR = 'the/directory/you/want/to/use'
Out_DIR = 'the/directory/you/want/to/use'
Variable = 'Variable name'

# os.listdir returns bare file names, so run this from inside Input_DIR
for filename in os.listdir(Input_DIR):
    if filename.endswith(".nc"):
        nctoshape(filename, Out_DIR, Variable)

I have a folder with a month of data divided into daily files. These files were downloaded from the same dataset and I would like to concatenate all the daily files into a monthly one:

from tool4NC import concatnc

Input_DIR = 'the/directory/where/you/store/the/daily/file'

concatnc(Input_DIR)  # it will generate the concatenated.nc file

I have many netCDF files and I would like to convert all of them to GRD:

import os
from tool4NC import nctogrd

Input_DIR = 'the/directory/you/want/to/use'
Out_DIR = 'the/directory/you/want/to/use'

# os.listdir returns bare file names, so run this from inside Input_DIR
for filename in os.listdir(Input_DIR):
    if filename.endswith(".nc"):
        nctogrd(filename, Out_DIR)

I have a one-year file but I realised that it is better to have the data organised by month. I would also like to add a suffix to each file:

from tool4NC import splitnc

Input_file = 'my_input_file.nc'
Out_DIR = 'the/directory/to/output/the/results'
split_type = "MONTH"  # split the yearly file into monthly files
suffix = "text/to/append/at/each/file"

splitnc(Input_file, Out_DIR, split_type, suffix)
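The conversion loops above all follow the same pattern, so they could be wrapped in a small helper. The sketch below is only a suggestion: `convert_all` is a hypothetical name, and the demo uses a stand-in converter instead of the real tool4NC functions so that it runs without any netCDF file.

```python
import tempfile
from pathlib import Path


def convert_all(input_dir, out_dir, converter, *extra_args, pattern="*.nc"):
    """Call converter(file_path, out_dir, *extra_args) for each matching file.

    Returns the names of the files that were passed to the converter.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)  # create the output folder if missing
    converted = []
    for nc_file in sorted(Path(input_dir).glob(pattern)):
        converter(str(nc_file), str(out_path), *extra_args)
        converted.append(nc_file.name)
    return converted


def fake_converter(file_path, out_dir):
    # Stand-in for nctocsv/nctogrd: it only reports what it was asked to convert
    print("would convert", file_path, "->", out_dir)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.nc", "b.nc", "notes.txt"):
        (Path(tmp) / name).touch()
    done = convert_all(tmp, Path(tmp) / "out", fake_converter)

print(done)
```

With tool4NC installed, the call would look like convert_all(Input_DIR, Out_DIR, nctocsv), or convert_all(Input_DIR, Out_DIR, nctoshape, Variable). This assumes the tool4NC functions accept a full file path as their first argument, which is not confirmed here.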

MerOC

This tool brings the functions described previously (tool4NC) into a GUI and adds download functionalities to simplify downloading Copernicus marine data (Fig 1). These download functionalities are a totally new concept, never developed before: it is the first automated download tool of its type created inside the Copernicus framework. In more detail, the download tab (Fig 1, TAB-1) can recreate the “motuclient” HTTP download link, but it adds new functionalities that the free and open-source Copernicus community had long requested. Starting from a simple “motuclient” HTTP request (which can be downloaded as it is using the “single file” option), it makes it possible, for the first time, to automatically split the time window by days or months. Furthermore, it is also possible to download data by depth, by month and depth, and, if needed, by year.
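The idea of splitting a request's time window by days or months can be pictured with the short sketch below. It is not MerOC's code: the function names `split_by_month` and `split_by_day` are hypothetical, and each resulting window would then be used for one separate download request.

```python
from datetime import date, timedelta


def split_by_month(start, end):
    """Split the inclusive range [start, end] into per-month (first_day, last_day) windows."""
    windows = []
    current = start
    while current <= end:
        # First day of the month that follows the current one
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # A window stops at the end of its month or at the requested end date
        window_end = min(next_month - timedelta(days=1), end)
        windows.append((current, window_end))
        current = next_month
    return windows


def split_by_day(start, end):
    """Split the inclusive range [start, end] into one-day (day, day) windows."""
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return [(d, d) for d in days]


monthly = split_by_month(date(2020, 1, 15), date(2020, 3, 10))
daily = split_by_day(date(2020, 1, 30), date(2020, 2, 1))
for first, last in monthly:
    print(first.isoformat(), "->", last.isoformat())
```

Splitting this way keeps each request small, which is the point of the feature when a single large request would be too heavy for the server.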

Below is a more detailed image of “TAB 1: netCDF-Download” (the really new and original features, developed here for the first time). The different sections of the tab are highlighted in different colours.

ads4MO

This Python module brings what MerOC already offers for data download into a very intuitive scripting form. The core is the same, but the user interaction changes: it is now entirely a CLI experience. Below is an example:

FTPsubsetMO

Python software that downloads files over the FTP protocol and subsets the retrieved files by parameters such as time range, bounding box, variables and single/range depth levels. The tool is distributed together with a database that stores all the information needed to download the files from each dataset: the type of dataset (NRT/MY), the time step (DAILY/MONTHLY) and two other parameters needed to correctly identify and select the files before the download. (Be aware that over time the database needs to be updated to stay fully functional.) The key used to retrieve this information is the FTP URL of the targeted dataset. I designed and implemented it to automate downloads in cases where, as was happening, the motuclient HTTP download system failed (it was not able to handle heavy download traffic). However, I strongly encourage the producers to standardise all the data structures, file names and metadata (which would make creating and updating the database easier). Below is a screenshot of the stable master version:

However, I also created another version in which the variables can be selected manually instead of typed (the “FTPsubsetMO_VS” branch). It is displayed below and can be downloaded from HERE if you want to try it yourself:

I also made it possible to use the program in a pure scripting way, so that you are free to inspect/modify/customise the code. To do that, please follow the steps below:

  1. Open the terminal/command prompt in the location where you want to download the files or, in any case, keep the script
  2. Activate your Python environment and import the module:
from FTPsubsetMO import script
  3. Run the function “script” as follows:
script()

The function above adds, in the folder where you run the command, the files needed (CMEMS_Database.json and FTPsubsetMO.py) to run the subsetting process in a pure scripting way. “FTPsubsetMO.py” is the only file to modify based on your data request needs. The script’s inputs are highlighted with “”. More information can be found in the comments inside the FTPsubsetMO.py script.
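To picture how a database keyed by the FTP URL can drive the file selection before a download, here is a minimal sketch. Everything in it is an assumption made for illustration: the URL, the database keys, the file names and the idea that each daily file name contains a YYYYMMDD stamp. The real CMEMS_Database.json and FTPsubsetMO.py may work differently.

```python
from datetime import date, timedelta

# Hypothetical database entry: the real CMEMS_Database.json may use other keys
DATABASE = {
    "ftp://example.invalid/Core/PRODUCT/dataset-daily": {
        "type": "NRT",
        "time_step": "DAILY",
    }
}


def select_daily_files(file_names, start, end):
    """Keep the file names that contain a YYYYMMDD stamp within [start, end].

    Assumes each daily file embeds its date as YYYYMMDD in the name.
    """
    wanted = set()
    day = start
    while day <= end:
        wanted.add(day.strftime("%Y%m%d"))
        day += timedelta(days=1)
    return [name for name in file_names if any(stamp in name for stamp in wanted)]


url = "ftp://example.invalid/Core/PRODUCT/dataset-daily"
info = DATABASE[url]  # the FTP URL is the lookup key
listing = ["sst_20200101.nc", "sst_20200102.nc", "sst_20200103.nc", "readme.txt"]
selected = select_daily_files(listing, date(2020, 1, 2), date(2020, 1, 3))
print(info["time_step"], selected)
```

Only the selected files would then be downloaded and subset, which avoids transferring data outside the requested time range.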

Conclusion:

My Python tools included in this environment and listed above are the result of personal intellectual work and development, so I will not be held responsible for any use you make of them, nor for the results and conclusions you may reach using them. Also, although I have cross-checked the whole code, I cannot guarantee it is free of bugs.

Please also remember to cite them if they help your research or work activities, and let me know, of course!

Feedback is welcome.

Enjoy! 🙂