UrlDownload.jl
UrlDownload.jl is a small package that simplifies downloading and postprocessing data without storing intermediate files. It also provides a progress bar for large files with long download times.
Currently the following types of data are supported:
- PIC: image files, such as jpeg, png, bmp etc
- CSV: files with comma separated values
- FEATHER
- JSON
Unsupported file formats can be processed with the help of custom parsers.
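As a rough illustration of what such a custom parser might look like, the sketch below defines a function that takes the downloaded bytes as a Vector{UInt8} and returns a vector of lines. The names lines_parser and keepempty and the example.com URL are made up for this example, and the commented call assumes that extra keyword arguments are forwarded to the parser.

```julia
# Hypothetical custom parser: receives the raw bytes and returns the text lines.
# `keepempty` is an illustrative keyword argument, not part of UrlDownload.jl.
function lines_parser(data::Vector{UInt8}; keepempty::Bool = false)
    text = String(copy(data))   # copy, because String() takes ownership of the buffer
    return split(text, '\n'; keepempty = keepempty)
end

# Local check of the parser, no network access needed.
raw = Vector{UInt8}("alpha\nbeta\n\ngamma\n")
parsed = lines_parser(raw)

# With UrlDownload.jl installed, the same function would be passed as:
# using UrlDownload
# res = urldownload("https://example.com/data.txt", parser = lines_parser, keepempty = true)
```

Keeping the parser a plain function of the byte vector means it can be tried on local data before it is used with a real download.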
Installation
To install UrlDownload, either do
using Pkg
Pkg.add("UrlDownload")
or switch to Pkg mode with ] and issue
pkg> add UrlDownload
Note: this package uses several other packages for data processing, which should be installed separately. So, if you receive a message like ERROR: ArgumentError: Package CSV not found in current path, install CSV.jl manually and the error will go away. No additional work is needed, since UrlDownload.jl imports the necessary packages on its own.
Functions
UrlDownload.urldownload - Function
urldownload(url, progress = false;
parser = nothing, format = nothing, save_raw = nothing,
compress = :auto, multifiles = false, headers = HTTP.Header[],
httpkw = Pair[], update_period = 1, kw...)
Downloads the file at the given url into memory and processes it into the necessary data structure.
Arguments
- url: url of the download.
- progress: show ProgressMeter; by default it is not shown.
- parser: custom parser, a function that should accept one positional argument of type Vector{UInt8} and optional keyword arguments, and return the necessary data structure. If parser is set, it overrides all other settings, such as format. If parser is not set, internal parsers are used for data processing.
- format: one of the fixed formats (:CSV, :PIC, :FEATHER, :JSON); if set, it overrides the autodetection mechanism.
- save_raw: if set to a String or IO, the downloaded raw data is stored in the corresponding stream.
- compress: :auto by default; can be one of :none, :xz, :gzip, :bzip2, :lz4, :zstd, :zip. Determines whether the file is compressed and the compression type. Decompressed data is processed either by the custom parser or by an internal parser. By default, for any compression type except :zip the internal parser is CSV.File; for :zip the usual rules apply. If compress is :none, the custom parser should decompress the data on its own.
- multifiles: false by default; for :zip compressed data, defines whether to process only the first file inside the archive or to return an array of decompressed and processed objects.
- headers: HTTP.jl arguments that set the http headers of the request.
- httpkw: additional HTTP.jl keyword arguments that are passed to the GET function. Should be supplied as a vector of pairs.
- update_period: period of the ProgressMeter update, by default 1 sec.
- kw...: any keyword arguments that should be passed to the data parser.
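Since headers and httpkw are both given as collections of pairs, a call might be prepared along the following lines. This is only a sketch: the header values and the URL are placeholders, readtimeout and retries are HTTP.jl request keywords, and the use of Symbol keys in httpkw is an assumption about how the pairs are splatted into the GET call.

```julia
# Request headers as "name" => "value" pairs (placeholder values).
headers = ["User-Agent" => "my-downloader/0.1", "Accept" => "text/csv"]

# Extra HTTP.jl keyword arguments as a vector of pairs.
# Symbol keys are an assumption: keyword splatting in Julia requires them.
httpkw = [:readtimeout => 30, :retries => 2]

# With UrlDownload.jl and CSV.jl installed, the call would look like:
# using UrlDownload
# res = urldownload("https://example.com/data.csv"; format = :CSV,
#                   headers = headers, httpkw = httpkw)
```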
UrlDownload.@f_str - Macro
f_str(name)
Use this macro to state explicitly that the url is actually a local file resource. It is useful if resource type autodetection fails.
Example
using UrlDownload: urldownload, @f_str
url = f"/tmp/data"
res = urldownload(url)
# Alternatively
using UrlDownload: urldownload, File
url = File("/tmp/data")
res = urldownload(url)
UrlDownload.@u_str - Macro
u_str(name)
Use this macro to state explicitly that the url is a remote http resource. It is useful if resource type autodetection fails.
Example
using UrlDownload: urldownload, @u_str
url = u"https://example.com/data.csv"
res = urldownload(url)
# Alternatively
using UrlDownload: urldownload, URL
url = URL("https://example.com/data.csv")
res = urldownload(url)