nimble.fetchFiles

nimble.fetchFiles(source, overwrite=False)

Get data files from the web or local storage.

Downloads new data files from the web and stores them in a “nimbleData” directory placed in a configurable location (see next paragraph). Once stored, any subsequent call to fetch the same data finds it locally and skips the download. For zip and tar files, extraction is attempted. If extraction succeeds, the returned list of paths includes the extracted files; otherwise, it includes the archive file.
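
The store-once and extract-if-possible behavior described above can be sketched with the standard library alone. This is a minimal illustration, not nimble's implementation: fetch_once, local_paths, the injected download callable, and the "_extracted" directory name are all hypothetical names chosen for the example.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import Callable, List


def local_paths(stored: Path) -> List[str]:
    """Hypothetical helper: list the usable paths for an already-stored file.

    Archives are extracted into a sibling directory (the "_extracted" suffix
    is an assumption of this sketch). If the file is not an archive, or
    extraction fails, the stored file's own path is returned instead.
    """
    target = stored.parent / (stored.name + "_extracted")
    try:
        if zipfile.is_zipfile(stored):
            with zipfile.ZipFile(stored) as archive:
                archive.extractall(target)
        elif tarfile.is_tarfile(stored):
            with tarfile.open(stored) as archive:
                archive.extractall(target)
        else:
            return [str(stored)]
    except (zipfile.BadZipFile, tarfile.TarError, OSError):
        return [str(stored)]
    return sorted(str(p) for p in target.rglob("*") if p.is_file())


def fetch_once(dest: Path, download: Callable[[], bytes],
               overwrite: bool = False) -> List[str]:
    """Hypothetical helper: call download() only if dest is missing or
    overwrite is True, then return the paths available locally."""
    if overwrite or not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(download())
    return local_paths(dest)
```

Passing the downloader in as a callable keeps the sketch free of network access; a second call with the same dest returns the same paths without invoking it again unless overwrite=True.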

The location of the “nimbleData” directory is configurable through nimble.settings by setting the “location” option in the “fetch” section. By default, the location is the home directory (pathlib.Path.home()). The file path within “nimbleData” matches the download url, except for files extracted from zip and tar files.
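
As a rough illustration of how a storage path can mirror a download url, the sketch below joins the url's host and path segments under a “nimbleData” directory. The function name mirrored_path and its location parameter are hypothetical; nimble's own path construction may differ in detail (for example, in how query strings are handled).

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlsplit


def mirrored_path(url: str, location: Optional[Path] = None) -> Path:
    """Hypothetical helper: build <location>/nimbleData/<host>/<url path>.

    location stands in for the configured "fetch" location and falls back
    to the home directory when it is not given.
    """
    root = (location or Path.home()) / "nimbleData"
    parts = urlsplit(url)
    # Drop the leading/trailing slashes, then add one directory per segment.
    segments = parts.path.strip("/").split("/")
    return root.joinpath(parts.netloc, *segments)
```

With the url 'https://openml.org/data/get_csv/16826755/phpMYEkMl', the result ends in nimbleData/openml.org/data/get_csv/16826755/phpMYEkMl under the chosen location.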

Special support for the UCI repository is included. The source can be ‘uci::<Name of Dataset>’ or the url to the main page for a specific dataset.
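
The three accepted forms of source can be told apart with a small check like the one below. This is only a sketch under stated assumptions: describe_source and its return labels are hypothetical, the prefix match is case-sensitive by assumption, and the host name is taken from the UCI url used in the examples on this page.

```python
from typing import Tuple
from urllib.parse import urlsplit

UCI_PREFIX = "uci::"
UCI_HOST = "archive.ics.uci.edu"  # host of the UCI url in the examples


def describe_source(source: str) -> Tuple[str, str]:
    """Hypothetical helper: classify a source string.

    Returns a (kind, value) pair where kind is 'uci-name', 'uci-page',
    or 'url'. For 'uci-name', value is the dataset name after the prefix.
    """
    if source.startswith(UCI_PREFIX):
        return ("uci-name", source[len(UCI_PREFIX):])
    if urlsplit(source).netloc == UCI_HOST:
        return ("uci-page", source)
    return ("url", source)
```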

Parameters:
  • source (str) – Downloadable url or a valid UCI repository string (see above).

  • overwrite (bool) – If True, will overwrite any files stored locally with the data currently available from the source.

Returns:

list – The paths to the available files.

See also

fetchFile, data

Examples

A single dataset from a downloadable url.

>>> url = 'https://openml.org/data/get_csv/16826755/phpMYEkMl'
>>> titanic = nimble.fetchFiles(url) 

On a Unix operating system, with the path to the root storage location replaced by an ellipsis, the value of titanic is ['.../nimbleData/openml.org/data/get_csv/16826755/phpMYEkMl']. Note how the directory structure mirrors the url.

For the UCI repository, two additional options are available: a string starting with ‘uci::’ followed by the name of a UCI dataset, or the url to the main page of the dataset.

>>> iris = nimble.fetchFiles('uci::Iris') 
>>> url = 'https://archive.ics.uci.edu/ml/datasets/Wine+Quality'
>>> wineQuality = nimble.fetchFiles(url)

Keywords: get, download, local, store, files, url, obtain, retrieve, open, create, folder