Note: this command is very inefficient for large tables, and in
most cases cdsskymatch or tapskymatch provide better
alternatives.
coneskymatch is a utility which performs a cone search-like
query to a remote server for each row of an input table. Each of these
queries returns a table with one row for each item held by the server in the
region of sky represented by the input row. The results of all the queries
are then concatenated into one big output table which is the output of this
command.
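The overall flow described above can be sketched in a few lines of Python. This is only an illustration of the one-query-per-row pattern, not the STILTS implementation; the names cone_sky_match and fake_service, and the tiny in-memory catalogue, are hypothetical stand-ins for a real remote service.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, float]


def cone_sky_match(
    input_rows: Iterable[Row],
    query_service: Callable[[float, float, float], List[Row]],
    radius_deg: float,
) -> List[Row]:
    """Issue one cone query per input row and concatenate all the results."""
    output: List[Row] = []
    for row in input_rows:
        # One remote request per input row, so the total cost grows
        # linearly with the number of rows in the local table.
        output.extend(query_service(row["ra"], row["dec"], radius_deg))
    return output


def fake_service(ra: float, dec: float, sr: float) -> List[Row]:
    """Hypothetical stand-in for a remote service holding two sources."""
    catalogue = [{"ra": 10.0, "dec": 20.0}, {"ra": 10.5, "dec": 20.0}]
    # Crude box test, adequate only for this illustration.
    return [s for s in catalogue
            if abs(s["ra"] - ra) <= sr and abs(s["dec"] - dec) <= sr]


rows = [{"ra": 10.0, "dec": 20.0}, {"ra": 50.0, "dec": -5.0}]
result = cone_sky_match(rows, fake_service, 0.1)
```

Here the first input row picks up one remote source and the second picks up none, so the concatenated output has a single row.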
The type of virtual observatory service queried is determined by
the servicetype parameter. Typically it will be a Cone Search
service, which queries a remote catalogue for astronomical objects or
sources in a particular region. However, you can also query Simple Image
Access and Simple Spectral Access services in just the same way, to return
tables of available image and spectral resources in the relevant
regions.
The identity of the server to query is given by the
serviceurl parameter. Some advice about how to locate URLs for
suitable services is given in SUN/256.
The effect of this command is like doing a positional crossmatch
where one of the catalogues is local and the other is remote and exposes its
data via a cone search/SIA/SSA service. However, because of both the network
communication and the necessarily naive crossmatching algorithm (whose run time
scales linearly with the size of the local catalogue), it is only
suitable if the local catalogue has a reasonably small number of rows,
unless you are prepared to wait a long time.
The parallel parameter allows you to perform multiple cone
searches concurrently, so that instead of completing the first cone search,
then the second, then the third, the program can be executing a number of
them at once. This can speed up operation considerably, especially in the
face of network latency, but beware that submitting a very large number of
queries simultaneously to the same server may overload it, resulting in some
combination of failed queries, ultimately slower runtimes, and unpopularity
with server admins. It is best to start with low parallelism and increase it
cautiously, checking whether performance actually improves.
Note that when running, coneskymatch can generate a lot of
WARNING messages. Most of these are complaining about badly formed VOTables
being returned from the cone search services. STILTS does its best to work
out what the service responses mean in this case, and usually makes a good
enough job of it.
Note: this task was known as multicone in its experimental
form in STILTS v1.2 and v1.3.
- ifmt=<in-format>
Specifies the format of the input table as specified by parameter in.
The known formats are listed in SUN/256. This flag can be
used if you know what format your table is in. If it has the special value
(auto) (the default), then an attempt will be made to detect the format
of the table automatically. This cannot always be done correctly however, in
which case the program will exit with an error explaining which formats were
attempted. This parameter is ignored for scheme-specified tables.
- istream=true|false
If set true, the input table specified by the in
parameter will be read as a stream. It is necessary to give the ifmt
parameter in this case. Depending on the required operations and processing
mode, this may cause the read to fail (sometimes it is necessary to read the
table more than once). It is not normally necessary to set this flag; in most
cases the data will be streamed automatically if that is the best thing to do.
However it can sometimes result in less resource usage when processing large
files in certain formats (such as VOTable). This parameter is ignored for
scheme-specified tables.
- in=<table>
The location of the input table. This may take one of the
following forms:
- A filename.
- A URL.
- The special value "-", meaning standard input. In this
case the input format must be given explicitly using the ifmt
parameter. Note that not all formats can be streamed in this way.
- A scheme specification of the form
:<scheme-name>:<scheme-args>.
- A system command line with either a "<" character at
the start, or a "|" character at the end
("<syscmd" or "syscmd|"). This
executes the given pipeline and reads from its standard output. This will
probably only work on unix-like systems.
In any case, compressed data in one of the supported compression formats (gzip,
Unix compress or bzip2) will be decompressed transparently.
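As an illustration of what transparent decompression can look like, the sketch below sniffs the leading magic bytes of the data. It is not taken from STILTS; the function names are hypothetical, and Unix compress data is only detected here, not decoded, because the Python standard library has no decoder for it.

```python
import bz2
import gzip
from typing import Optional


def sniff_compression(head: bytes) -> Optional[str]:
    """Guess the compression format from the first bytes of a stream."""
    if head.startswith(b"\x1f\x8b"):
        return "gzip"
    if head.startswith(b"BZh"):
        return "bzip2"
    if head.startswith(b"\x1f\x9d"):
        return "compress"
    return None


def open_decompressed(data: bytes) -> bytes:
    """Return the table bytes, decompressing them first if necessary."""
    kind = sniff_compression(data[:3])
    if kind == "gzip":
        return gzip.decompress(data)
    if kind == "bzip2":
        return bz2.decompress(data)
    if kind == "compress":
        # Detected but not handled in this sketch.
        raise NotImplementedError("Unix compress decoding is not shown here")
    return data
```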
- icmd=<cmds>
Specifies processing to be performed on the input table
as specified by parameter in, before any other processing has taken
place. The value of this parameter is one or more of the filter commands
described in SUN/256. If more than one is given, they must be separated by
semicolon characters (";"). This parameter can be repeated multiple
times on the same command line to build up a list of processing steps. The
sequence of commands given in this way defines the processing pipeline which
is performed on the table.
Commands may alternatively be supplied in an external file, by
using the indirection character '@'. Thus a value of
"@filename" causes the file filename to be read for
a list of filter commands to execute. The commands in the file may be
separated by newline characters and/or semicolons, and lines which are blank
or which start with a '#' character are ignored.
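The rules for an indirection file can be sketched as follows. This is a simplified reading of the description above rather than the STILTS parser: the function name is hypothetical, leading whitespace before '#' is tolerated as an assumption, and semicolons inside quoted expressions are not treated specially.

```python
from typing import List


def parse_filter_commands(text: str) -> List[str]:
    """Split the contents of a command file into individual filter commands."""
    commands: List[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        # Blank lines and comment lines are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        # Commands on one line may be separated by semicolons.
        for part in stripped.split(";"):
            part = part.strip()
            if part:
                commands.append(part)
    return commands


example = """
# keep bright sources only
select 'mag < 18'; keepcols 'ra dec mag'

sort mag
"""
commands = parse_filter_commands(example)
```

The column name mag and the file contents are invented for the example; the resulting list holds the three commands in the order in which they would be applied.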
- ocmd=<cmds>
Specifies processing to be performed on the output table,
after all other processing has taken place. The value of this parameter is one
or more of the filter commands described in SUN/256. If more than one is
given, they must be separated by semicolon characters (";"). This
parameter can be repeated multiple times on the same command line to build up
a list of processing steps. The sequence of commands given in this way defines
the processing pipeline which is performed on the table.
Commands may alternatively be supplied in an external file, by
using the indirection character '@'. Thus a value of
"@filename" causes the file filename to be read for
a list of filter commands to execute. The commands in the file may be
separated by newline characters and/or semicolons, and lines which are blank
or which start with a '#' character are ignored.
- omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|tosql|gui
The mode in which the result table will be output. The
default mode is out, which means that the result will be written as a
new table to disk or elsewhere, as determined by the out and
ofmt parameters. However, there are other possibilities, which
correspond to uses to which a table can be put other than outputting it, such
as displaying metadata, calculating statistics, or populating a table in an
SQL database. For some values of this parameter, additional parameters
(<mode-args>) are required to determine the exact behaviour.
Possible values are
- out
- meta
- stats
- count
- checksum
- cgi
- discard
- topcat
- samp
- tosql
- gui
Use the help=omode flag or see SUN/256 for more information.
- out=<out-table>
The location of the output table. This is usually a
filename to write to. If it is equal to the special value "-" (the
default) the output table will be written to standard output.
This parameter may only be given if omode has its default
value of "out".
- ofmt=<out-format>
Specifies the format in which the output table will be
written (one of the ones in SUN/256 - matching is case-insensitive and you can
use just the first few letters). If it has the special value
"(auto)" (the default), then the output filename will be
examined to try to guess what sort of file is required, usually by looking at
the extension. If it's not obvious from the filename what output format is
intended, an error will result.
This parameter may only be given if omode has its default
value of "out".
- ra=<expr>
Right ascension in degrees in the ICRS coordinate system
for the position of each row of the input table. This may simply be a column
name, or it may be an algebraic expression calculated from columns as
explained in SUN/256. If left blank, an attempt is made to guess from UCDs,
column names and unit annotations what expression to use.
- dec=<expr>
Declination in degrees in the ICRS coordinate system for
the position of each row of the input table. This may simply be a column name,
or it may be an algebraic expression calculated from columns as explained in
SUN/256. If left blank, an attempt is made to guess from UCDs, column names
and unit annotations what expression to use.
- sr=<expr/deg>
Expression which evaluates to the search radius in
degrees for the request at each row of the input table. This will often be a
constant numerical value, but may be the name or ID of a column in the input
table, or a function involving one.
- find=best|all|each
Determines which matches are retained.
- best: Only the matching query table row closest to the input table
row will be output. Input table rows with no matches will be omitted.
(Note this corresponds to the best1 option in the pair matching
commands, and best1 is a permitted alias).
- all: All query table rows which match the input table row will be
output. Input table rows with no matches will be omitted.
- each: There will be one output table row for each input table row.
If matches are found, the closest one from the query table will be output,
and in the case of no matches, the query table columns will be blank.
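The three find options can be compared with a small sketch. It is illustrative only: the function apply_find and the (id, distance) representation of a match are assumptions made for the example, not part of STILTS.

```python
from typing import List, Optional, Tuple

# (remote row id, angular distance from the input position in degrees)
Match = Tuple[str, float]


def apply_find(mode: str, input_id: str,
               matches: List[Match]) -> List[Tuple[str, Optional[str]]]:
    """Return (input id, remote id) output rows for one input row."""
    if mode in ("best", "best1"):
        if not matches:
            return []  # unmatched input rows are omitted
        closest = min(matches, key=lambda m: m[1])
        return [(input_id, closest[0])]
    if mode == "all":
        return [(input_id, remote_id) for remote_id, _ in matches]
    if mode == "each":
        if not matches:
            return [(input_id, None)]  # remote columns left blank
        closest = min(matches, key=lambda m: m[1])
        return [(input_id, closest[0])]
    raise ValueError(f"unknown find value: {mode!r}")


found = [("r1", 0.02), ("r2", 0.005)]
```

With the two matches in found, best and each both keep only r2, while all keeps both; with no matches, only each still produces an output row.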
- usefoot=true|false
Determines whether an attempt will be made to restrict
searches in accordance with available footprint information. If this is set
true, then before any of the per-row queries are performed, an attempt may be
made to acquire footprint information about the service. If such information
can be obtained, then queries which fall outside the footprint, and hence
which are known to yield no results, are skipped. This can speed up the search
considerably.
Currently, the only footprints available are those provided by the
CDS MOC (Multi-Order Coverage map) service, which covers VizieR and a few
other cone search services.
- nside=<int-value>
Determines the HEALPix Nside parameter for use with the
MOC footprint service. This tuning parameter determines the resolution of the
footprint if available. Larger values give better resolution, hence a better
chance of avoiding unnecessary queries, but processing them takes longer and
retrieving and storing them is more expensive.
The value must be a power of 2, and at the time of writing, the
MOC service will not supply footprints at resolutions greater than
nside=512, so it should be <=512.
Only used if usefoot=true.
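To give a feel for the trade-off, the sketch below checks the constraints stated above and estimates the footprint resolution from the standard HEALPix relation of 12 * nside^2 equal-area pixels over the whole sky. The function names and the constant MAX_MOC_NSIDE are hypothetical; the limit of 512 is simply the figure quoted above.

```python
import math

MAX_MOC_NSIDE = 512  # limit quoted above for the MOC service


def check_nside(nside: int) -> int:
    """Validate an nside value against the constraints described above."""
    if nside < 1 or (nside & (nside - 1)) != 0:
        raise ValueError("nside must be a power of 2")
    if nside > MAX_MOC_NSIDE:
        raise ValueError(f"nside must be <= {MAX_MOC_NSIDE}")
    return nside


def pixel_size_deg(nside: int) -> float:
    """Approximate linear size in degrees of one HEALPix pixel."""
    npix = 12 * nside * nside
    sky_area_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2
    return math.sqrt(sky_area_deg2 / npix)
```

For nside=512 this gives a pixel size of roughly 0.11 degrees, so footprint filtering cannot distinguish positions much closer together than that.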
- copycols=<colid-list>
List of columns from the input table which are to be
copied to the output table. Each column identified here will be prepended to
the columns of the combined output table, and its value for each row taken
from the input table row which provided the parameters of the query which
produced it. See SUN/256 for list syntax. The default setting is
"*", which means that all columns from the input table are
included in the output.
- scorecol=<col-name>
Gives the name of a column in the output table to contain
the distance between the requested central position and the actual position of
the returned row. The distance returned is an angular distance in degrees. If
a null value is chosen, no distance column will appear in the output table.
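The kind of quantity stored in the score column can be computed with the haversine formula, as in the sketch below. This shows one standard way to obtain an angular separation in degrees; it is not claimed to be the exact formula used by STILTS, and the function name is hypothetical.

```python
import math


def angular_distance_deg(ra1: float, dec1: float,
                         ra2: float, dec2: float) -> float:
    """Angular separation in degrees between two sky positions in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to guard against rounding slightly above 1.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))
```

Note that a difference of one degree in right ascension corresponds to a smaller separation away from the equator, about half a degree at a declination of 60 degrees.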
- parallel=<n>
Allows multiple cone searches to be performed
concurrently. If set to the default value, 1, the cone query corresponding to
the first row of the input table will be dispatched, when that is completed
the query corresponding to the second row will be dispatched, and so on. If
set to <n>, then queries will be overlapped in such a way that up
to approximately <n> may be running at any one time.
Whether increasing <n> is a good idea, and what might
be a sensible maximum value, depends on the characteristics of the service
being queried. In particular, setting it to too large a number may overload
the service resulting in some combination of failed queries, ultimately
slower runtimes, and unpopularity with server admins.
The maximum value permitted for this parameter by default is 5.
This limit may be raised by use of the service.maxparallel system property,
but use that option with great care since you may overload services and make
yourself unpopular with data centre admins. As a rule, you should only
increase this value if you have obtained permission from the data centres
on whose services you will be using the increased parallelism.
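Bounded concurrency of this kind can be sketched with a fixed-size thread pool. The code below is an illustration under assumptions, not the STILTS scheduler: run_queries and DEFAULT_MAX_PARALLEL are hypothetical names, and the cap of 5 simply mirrors the default limit described above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")
R = TypeVar("R")

DEFAULT_MAX_PARALLEL = 5  # mirrors the default limit described above


def run_queries(positions: Sequence[P], query: Callable[[P], R],
                parallel: int = 1,
                max_parallel: int = DEFAULT_MAX_PARALLEL) -> List[R]:
    """Run one query per position with at most `parallel` in flight."""
    if parallel < 1 or parallel > max_parallel:
        raise ValueError(f"parallel must be between 1 and {max_parallel}")
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # map() yields results in input order, whatever order they finish in.
        return list(pool.map(query, positions))
```

With parallel=1 the queries run strictly one after another; larger values overlap them, which mainly helps when most of the time is spent waiting on the network.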
- erract=abort|ignore|retry|retry<n>
Determines what will happen if any of the individual cone
search requests fails. By default the task aborts. That may be the best thing
to do, but for unreliable or poorly implemented services you may find that
some searches fail and others succeed so it can be best to continue operation
in the face of a few failures. The options are:
- abort: Failure of any query terminates the task.
- ignore: Failure of a query is treated the same as a query which
returns no rows.
- retry: Failed queries are retried until they succeed; an increasing
delay is introduced for each failure. Use with care - if the failure is
for some good, or at least reproducible, reason, this could prevent the task
from ever completing.
- retry<n>: Failed queries are retried at most a fixed number
<n> of times; an increasing delay is introduced for each
failure. If failures persist the task terminates.
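The four behaviours can be sketched as a small wrapper around a single query. This is an illustration, not STILTS code: run_query is a hypothetical name, and the linearly growing delay (base_delay times the number of failures so far) is an assumption, since the text above says only that the delay increases.

```python
import time
from typing import Callable, List


def run_query(query: Callable[[], List], erract: str = "abort",
              base_delay: float = 1.0,
              sleep: Callable[[float], None] = time.sleep) -> List:
    """Run one query, handling failure according to an erract-style value."""
    if erract == "abort":
        return query()  # any exception propagates and ends the task
    if erract == "ignore":
        try:
            return query()
        except Exception:
            return []  # treated the same as a query returning no rows
    if erract.startswith("retry"):
        count = erract[len("retry"):]
        max_retries = int(count) if count else None  # None: retry for ever
        failures = 0
        while True:
            try:
                return query()
            except Exception:
                failures += 1
                if max_retries is not None and failures > max_retries:
                    raise  # retries exhausted, so the task terminates
                sleep(base_delay * failures)  # assumed delay schedule
    raise ValueError(f"unknown erract value: {erract!r}")
```

The sleep argument is injectable so that the behaviour can be exercised without actually waiting.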
- ostream=true|false
If set true, this will cause the operation to stream on
output, so that the output table is built up as the results are obtained from
the cone search service. The disadvantage of this is that some output modes
and formats need multiple passes through the data to work, so depending on the
output destination, the operation may fail if this is set. Use with care (or
be prepared for the operation to fail).
- fixcols=none|dups|all
Determines how input columns are renamed before use in
the output table. The choices are:
- none: columns are not renamed
- dups: columns which would otherwise have duplicate names in the
output will be renamed to indicate which table they came from
- all: all columns will be renamed to indicate which table they came
from
If columns are renamed, the new names are determined by the suffix*
parameters.
- suffix0=<label>
If the fixcols parameter is set so that input
columns are renamed for insertion into the output table, this parameter
determines how the renaming is done. It gives a suffix which is appended to
all renamed columns from the input table.
- suffix1=<label>
If the fixcols parameter is set so that input
columns are renamed for insertion into the output table, this parameter
determines how the renaming is done. It gives a suffix which is appended to
all renamed columns from the cone result table.
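How fixcols interacts with the two suffix parameters can be sketched as follows. This is illustrative only: fix_columns is a hypothetical function, the suffix values "_0" and "_1" are example values rather than documented defaults, and column names are compared case-sensitively as a simplifying assumption.

```python
from typing import List


def fix_columns(input_cols: List[str], result_cols: List[str],
                fixcols: str = "dups",
                suffix0: str = "_0", suffix1: str = "_1") -> List[str]:
    """Return output column names: input columns first, then result columns."""
    if fixcols not in ("none", "dups", "all"):
        raise ValueError(f"unknown fixcols value: {fixcols!r}")
    duplicated = set(input_cols) & set(result_cols)

    def rename(cols: List[str], suffix: str) -> List[str]:
        return [
            name + suffix
            if fixcols == "all" or (fixcols == "dups" and name in duplicated)
            else name
            for name in cols
        ]

    return rename(input_cols, suffix0) + rename(result_cols, suffix1)
```

For input columns id, ra, dec and result columns ra, dec, mag, the dups setting renames only ra and dec on each side, leaving id and mag untouched.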
- servicetype=cone|ssa|sia1|sia2|sia
Selects the type of data access service to contact. Most
commonly this will be the Cone Search service itself, but other
protocols are also supported:
- cone: Cone Search protocol - returns a table of objects found near
each location. See Cone Search standard.
- ssa: Simple Spectral Access protocol - returns a table of spectra
near each location. See SSA standard.
- sia1: Simple Image Access protocol version 1 - returns a table of
images near each location. See SIA 1.0 standard.
- sia2: Simple Image Access protocol version 2 - returns a table of
images near each location. See SIA 2.0 standard.
- sia: alias for sia1
- serviceurl=<url-value>
The base part of a URL which defines the queries to be
made. Additional parameters will be appended to this using CGI syntax
("name=value", separated by '&' characters). If this
value does not end in either a '?' or a '&', one will be added as
appropriate.
See SUN/256 for discussion of how to locate service URLs
corresponding to given datasets.
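For a Cone Search service, the way a per-row query URL is assembled can be sketched as below. RA, DEC, SR and VERB are the parameter names defined by the Cone Search standard; the function name, the example.org URLs, and the exact rule used here for choosing between '?' and '&' are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def cone_query_url(service_url: str, ra: float, dec: float, sr: float,
                   verb: Optional[int] = None) -> str:
    """Append Cone Search parameters to a base service URL."""
    if not service_url.endswith(("?", "&")):
        # Assumed reading of "as appropriate": use '&' when the URL
        # already has a query part, otherwise start one with '?'.
        service_url += "&" if "?" in service_url else "?"
    params = [("RA", ra), ("DEC", dec), ("SR", sr)]
    if verb is not None:
        params.append(("VERB", verb))
    return service_url + urlencode(params)


url = cone_query_url("http://example.org/scs", 10.5, -20.0, 0.1)
```

A base URL that already carries its own arguments, such as http://example.org/scs?cat=x, gets the new parameters joined on with '&' instead.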
- verb=1|2|3
Verbosity level of the tables returned by the query
service. A value of 1 indicates the bare minimum and 3 indicates all available
information.
- dataformat=<value>
Indicates the format of data objects described in the
returned table. The meaning of this is dependent on the value of the
servicetype parameter:
- servicetype=cone: not used
- servicetype=ssa: gives the MIME type of spectra referenced in the
output table; special values such as "votable",
"fits", "compliant",
"graphic" and "all" are also accepted (value of
the SSA FORMAT parameter).
- servicetype=sia1: gives the MIME type required for images/resources
referenced in the output table, corresponding to the SIA FORMAT parameter.
The special values "GRAPHIC" (all graphics formats) and
"ALL" (no restriction) as defined by SIAv1 are also
permissible. For SIA version 1 only, this defaults to
"image/fits".
- servicetype=sia2: gives the MIME type required for images/resources
referenced in the output table, corresponding to the SIA FORMAT parameter.
The special values "GRAPHIC" (all graphics formats) and
"ALL" (no restriction) as defined by SIAv1 are also
permissible.
- servicetype=sia: gives the MIME type required for images/resources
referenced in the output table, corresponding to the SIA FORMAT parameter.
The special values "GRAPHIC" (all graphics formats) and
"ALL" (no restriction) as defined by SIAv1 are also
permissible. For SIA version 1 only, this defaults to
"image/fits".
- emptyok=true|false
Whether the table metadata which is returned from a
search result with zero rows is to be believed. According to the spirit,
though not the letter, of the cone search standard, a cone search service
which returns no data ought nevertheless to return the correct column
headings. Unfortunately this is not always the case. If this parameter is set
true, it is assumed that the service behaves properly in this respect;
if it does not an error may result. In that case, set this parameter
false. A consequence of setting it false is that in the event of no
results being returned, the task will return no table at all, rather than an
empty one.
- compress=true|false
If true, the service is requested to provide HTTP-level
compression for the response stream (Accept-Encoding header is set to
"gzip", see RFC 2616). This does not guarantee that
compression will happen but if the service honours this request it may result
in a smaller amount of network traffic at the expense of more processing on
the server and client.
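The request side and the response side of this negotiation can be sketched with the Python standard library. This is not how STILTS itself is implemented; build_request and decode_body are hypothetical helpers, and the URL is a placeholder.

```python
import gzip
import urllib.request
from typing import Optional


def build_request(url: str, compress: bool = True) -> urllib.request.Request:
    """Create a request that optionally asks the server for gzip encoding."""
    headers = {"Accept-Encoding": "gzip"} if compress else {}
    return urllib.request.Request(url, headers=headers)


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo HTTP-level gzip compression if the server applied it."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(body)
    # The server is free to ignore the request and send plain data.
    return body


request = build_request("http://example.org/scs?RA=10.5&DEC=-20.0&SR=0.1")
```

The client has to inspect the Content-Encoding header of each response, since the server may or may not honour the request.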